PROVIDE Project Technical Paper 2005:1
February 2005
Once this is done, the household identifier (hhid) and the household-level
expenditure variables are saved as a new file named domworkerh.dta, the h suffix indicating
that this is now a household-level database.
3.2.2. Home production for home consumption (homegrown.do)
The next do-file (homegrown.do) starts with the original homegrown.dta database and makes
several modifications. Like domworker.dta, this file allows multiple observations with the
same hhid when a household produces multiple products or keeps more than one type of
livestock. Consequently, household-level income and expenditure variables must be created
before any of the information is exported to the household-level IES 2000 database. Before
these variables can be created, however, several problems in homegrown.dta need to be
addressed. Apart from missing values, the file contains numerous duplicate entries and
various types of inconsistencies. A further complication is that some commercial farmers
also reported under this section.25
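The reduction from product-level to household-level records described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the Stata code in homegrown.do. The field names (product, sales, consumption, inputs) and the sample values are hypothetical, and dropping exact duplicate rows is an assumption, since the text does not state how duplicate entries were resolved.

```python
from collections import defaultdict

# Hypothetical records: one row per (household, product).
# Field names and values are illustrative, not taken from homegrown.dta.
rows = [
    {"hhid": 1, "product": "maize", "sales": 100.0, "consumption": 50.0, "inputs": 20.0},
    {"hhid": 1, "product": "cattle", "sales": 500.0, "consumption": 0.0, "inputs": 100.0},
    # Exact duplicate of the previous row, standing in for a duplicate entry.
    {"hhid": 1, "product": "cattle", "sales": 500.0, "consumption": 0.0, "inputs": 100.0},
    {"hhid": 2, "product": "maize", "sales": 0.0, "consumption": 80.0, "inputs": 10.0},
]


def drop_exact_duplicates(records):
    """Keep the first occurrence of each fully identical record (an assumed rule)."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def collapse_to_household(records, value_fields):
    """Sum the value fields over all records that share the same hhid."""
    totals = defaultdict(lambda: dict.fromkeys(value_fields, 0.0))
    for rec in records:
        for field in value_fields:
            totals[rec["hhid"]][field] += rec[field]
    return dict(totals)


unique_rows = drop_exact_duplicates(rows)
household = collapse_to_household(unique_rows, ["sales", "consumption", "inputs"])
```

After this step each hhid appears once, which is the property a household-level file such as domworkerh.dta needs before it can be merged with the household-level IES 2000 database.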
A further problem, and perhaps a more serious error on the part of Statistics South Africa,
is the treatment of the value of sales of livestock and produce. In the IES 2000 this value
is regarded as an expense and is added to the expenditure side in the summary expenditure
tables. Arguably the value of sales could be interpreted as the input cost of produce sold,
but the database already includes a separate variable for the value of inputs. This has led
us to believe that the treatment of sales is incorrect. The correct procedure would have
been to add the value of sales to the income side. Furthermore, the value of consumption of
home-produced goods should be added to the expenditure side, yet Statistics South Africa
ignores it completely. The input cost of home production, by contrast, is correctly reported
on the expenditure side of the household account.26
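The difference between the two treatments can be made concrete with a small sketch. It is a Python illustration under stated assumptions, not the authors' code: the function names and the figures are hypothetical, and the second function encodes only what the text above attributes to the IES 2000 summary tables.

```python
def proposed_treatment(sales, home_consumption, input_cost):
    """Treatment argued for in the text.

    Sales go to the income side; home consumption and input costs
    go to the expenditure side. Returns (income, expenditure).
    """
    income = sales
    expenditure = home_consumption + input_cost
    return income, expenditure


def ies2000_treatment(sales, home_consumption, input_cost):
    """Treatment the text attributes to the IES 2000.

    Sales are added to the expenditure side, home consumption is
    ignored, and input costs are on the expenditure side.
    Returns (income, expenditure).
    """
    income = 0.0
    expenditure = sales + input_cost
    return income, expenditure


# Hypothetical household: sells produce worth 600, consumes home
# produce worth 50, and spends 120 on inputs.
proposed = proposed_treatment(600.0, 50.0, 120.0)
reported = ies2000_treatment(600.0, 50.0, 120.0)
```

For this hypothetical household the proposed treatment adds 600 to income and 170 to expenditure, whereas the treatment attributed to the IES 2000 adds nothing to income and 720 to expenditure.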
The do-file is divided into seven parts.27 After opening the homegrown.dta database and
identifying multi-product households (parts 1 and 2), the occurrence of missing values is
investigated in part 3. Missing values are usually problematic since they represent cases
where respondents failed to answer a question. However, in the case of the
homegrown.dta database (and the IES 2000 database in general) various variables contain
25 Commercial farmers typically have relatively large sales figures as they produce mainly for the market. This
skews the data, since the majority of the respondents in homegrown.dta are small subsistence farmers or
normal households producing goods for own consumption. The idea behind this section of the
questionnaire was to capture information about these particular households and not commercial farmers.
26 Statistics South Africa was unable to confirm or deny this author’s belief that the treatment of these expenses
and incomes is incorrect in the IES 2000.
27 This section provides a brief summary of the functions of the do-file. A more detailed discussion appears in
PROVIDE (____-b) (not published yet).
© PROVIDE Project