Creating a 2000 IES-LFS Database in Stata



PROVIDE Project Technical Paper 2005:1
in the database, provided that these levels seemed realistic.29 The entire part 6 is included as a
separate do-file named hphcdrop.do.

February 2005


Finally, in part 7, household-level variables were created for the value of produce and
livestock sold and consumed (valprodcons, valprodsale, vallivecons, valliveprod). These
values, together with the household-level input costs (P2205TOT), are saved as
homegrownh.dta, which is subsequently merged with the other household-level files.

3.2.3. Person-level data file (person.do)

The next do-file is person.do. This do-file opens person.dta, which contains all the
information about each individual in each household, such as employment data and general
demographic information. The variable race is slightly problematic because 159 individuals
report race as ‘unspecified’ (code 5 or 9). Since the SAM household and factor accounts are
all disaggregated along racial lines, information about race is important. One option is to have
a separate racial category labelled ‘undefined’, but this is not justifiable given that only 0.15%
of the 104,153 individuals in person.dta do not specify their race. Another option is to drop
observations with unspecified race from the sample, but this is also undesirable if it is
possible to work around the problem.

Closer inspection revealed that some of the ‘unspecified’ individuals live in households
where the head of the household did report his or her racial group. These individuals’ race
was changed to that of the head of the household. If the head of the household’s race was
unspecified, it was changed to that of the second household member (if available). After this
adjustment, 134 individuals remain unspecified. These people live in 39 households in which
all members are unspecified. Unfortunately, the whole process only ‘saves’ 25 individuals and
5 households.
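
The race adjustment described above can be sketched in Stata as follows. This is a minimal illustration on made-up data, not the actual code in person.do: the identifiers hhid and personno are hypothetical names, and the sketch assumes that the head of the household is the member with personno equal to 1. Only the variable race and the ‘unspecified’ codes 5 and 9 come from the text.

```stata
* Illustrative data only; hhid and personno are hypothetical variable names.
clear
input hhid personno race
1 1 1
1 2 9
2 1 5
2 2 3
3 1 9
3 2 5
end

* Flag individuals whose race is 'unspecified' (code 5 or 9).
gen byte unspec = inlist(race, 5, 9)

* Within each household, look up the race of the head (assumed to be
* the first member) and of the second member (missing if there is none).
bysort hhid (personno): gen headrace   = race[1]
bysort hhid (personno): gen secondrace = race[2]

* Step 1: unspecified members take the head's race if the head reported one.
replace race = headrace if unspec & !inlist(headrace, 5, 9)

* Step 2: an unspecified head takes the second member's race, if available.
replace race = secondrace if personno == 1 & unspec ///
    & !missing(secondrace) & !inlist(secondrace, 5, 9)

* Individuals who remain unspecified after both steps.
count if inlist(race, 5, 9)
```

In this toy example only household 3, in which every member is unspecified, is left unresolved, which mirrors the situation of the 39 households mentioned above.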

Next, the do-file adds labels to variables and creates a few new ones, such as variable
region, which maps the province variable (prov) to the four SAM regions. New variables are
also created for the number of children (variable K), the number of adults (variable A), the
total household size (variable H)30, and the adult equivalent household size (variable E)31.
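
A minimal Stata sketch of how K, A, H and E might be constructed is given below, using the adult equivalence formula E = (A + αK)^θ with α = 0.5 and θ = 0.9. The data are made up, and the identifier hhid, the variable age and the cutoff of 15 years for a child are assumptions made for illustration only; the text does not state the age definition actually used in person.do.

```stata
* Illustrative data only; hhid, age and the under-15 cutoff are assumptions.
clear
input hhid age
1 40
1 38
1 10
1 4
2 70
end

* Flag children (assumed here to be individuals younger than 15).
gen byte child = age < 15

* Count children (K) and adults (A) per household; H is the total size.
bysort hhid: egen K = total(child)
bysort hhid: egen A = total(!child)
gen H = K + A

* Adult equivalent household size: E = (A + alpha*K)^theta.
scalar alpha = 0.5
scalar theta = 0.9
gen double E = (A + alpha*K)^theta

list hhid K A H E
```

Because θ is less than one, E grows less than proportionally with household size, which allows for economies of scale within the household; a single adult household has E equal to 1.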

29 In some cases home per capita consumption levels were extremely high. One explanation for this is that own
produce (such as maize) is possibly used for livestock feed, in which case it should have been reported as
an input cost. Consumption levels were truncated at certain levels when they appeared unrealistically
high.

30 Although the original person.dta comes complete with a household size variable, this variable appears to be
incorrect. Consequently, it is re-calculated here.

31 The adult equivalence scale adjusts the actual household size to take into account differences in size and
structure of households. The adjusted household size variable E is constructed using the formula
E = (A + αK)^θ. May (1995, cited in Woolard and Leibbrandt, 2001) suggests that α = 0.5 and θ = 0.9 are
plausible values for South Africa. Some sensitivity analysis around these values will be done at a later
stage.


© PROVIDE Project


