Creating a 2000 IES-LFS Database in Stata



PROVIDE Project Technical Paper 2005:1

February 2005

workers and consequently did not answer this section. There were also 40 observations in
domworkerh.dta for which no match could be found in general.dta.


  general & |
 domworkerh |       Freq.      Percent         Cum.
------------+-----------------------------------
          1 |       24134        91.75        91.75
          2 |          40         0.15        91.90
          3 |        2131         8.10       100.00
------------+-----------------------------------
      Total |       26305       100.00

While 5 of these 40 observations report zero expenditure, the remaining 35 observations
report expenditure ranging from R1,020 to R48,600, with an average of R10,195. The
tabulation of merge1b shows 38 observations in general.dta not found in homegrownh.dta.
One can again safely assume that these households did not partake in any home production
for home consumption. However, 4 observations were found in homegrownh.dta that were
not in general.dta. These households report zero expenditure on inputs, zero sales and very
low consumption of own produce and livestock (output appears below).

  general & |
 homegrownh |       Freq.      Percent         Cum.
------------+-----------------------------------
          1 |          38         0.14         0.14
          2 |           4         0.02         0.16
          3 |       26267        99.84       100.00
------------+-----------------------------------
      Total |       26309       100.00

               hhid   v~inputs   v~prodsale   v~prodcons   v~livesale   v~livecons
 7353.    3.251e+12          0            0          248            0         3000
 7413.    4.061e+12          0            0            0            0            0
10924.    5.032e+12          0            0           45            0            0
11446.    5.072e+12          0            0           75            0            0
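
The merge diagnostics described above (tabulating the merge indicator, then inspecting the records found only in the using file) might be sketched as follows. This is a minimal illustration on toy data, not the project's actual do-file: the household identifiers, the province and prodcons variables and their values are invented, and the merge 1:1 syntax assumes Stata 11 or later, which postdates the paper.

```stata
* Toy data only: all hhid values and variables below are hypothetical.
clear
input long hhid double prodcons
101 0
102 248
999 45
end
tempfile homegrown
save `homegrown'

clear
input long hhid str2 province
101 "WC"
102 "GP"
103 "KZ"
end

* Assumed modern syntax; generate() names the merge indicator.
merge 1:1 hhid using `homegrown', generate(merge1b)

* 1 = master (general) only, 2 = using only, 3 = matched
tabulate merge1b

* Inspect the records that have no counterpart in the master data
list hhid prodcons if merge1b == 2
summarize prodcons if merge1b == 2
count if merge1b == 2 & prodcons == 0
```

Naming the indicator explicitly, rather than relying on the default _merge, keeps the result of each merge available for later inspection when several files are merged in sequence.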

Finally, the merge between general.dta and personh.dta revealed that 46 observations
were only found in general.dta. Whereas with the previous merges this was not a problem
(one could simply assume that the relevant expenditures were zero), it is more problematic
here since demographic information (race, gender, age, province) and employment data are
now missing for 46 observations. This renders these 46 observations virtually unusable. Many
of these ‘mismatched’ observations are dropped from the sample at a later stage.

  general & |
    personh |       Freq.      Percent         Cum.
------------+-----------------------------------
          1 |          46         0.17         0.17
          3 |       26263        99.83       100.00
------------+-----------------------------------
      Total |       26309       100.00
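
The two ways of treating master-only observations discussed above (assuming zero expenditure after an expenditure merge, but treating households without person-level data as unusable) could be handled along these lines. This is a hedged sketch on toy data: the names merge_exp, merge_person, domexp and nodemog are hypothetical and do not come from the project's do-files.

```stata
* Toy data only: all names and values are hypothetical.
clear
input long hhid byte merge_exp byte merge_person double domexp
101 3 3 1200
102 1 3 .
103 3 1 0
end

* Master-only in an expenditure merge: treat the missing expenditure as zero
replace domexp = 0 if merge_exp == 1 & missing(domexp)

* Master-only in the person merge: demographics are missing, so flag the household
generate byte nodemog = (merge_person == 1)
tabulate nodemog

* The actual drop could be deferred to a later cleaning step, for example:
* drop if nodemog == 1
```

Flagging rather than dropping immediately leaves the decision about these households to a later stage, in line with the text's remark that many of them are removed later.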

3.2.5. Cleaning the data (cleanup.do)

After merging the datasets, cleanup.do is run. As discussed in section 2.3, the IES 2000 dataset
is plagued by numerous data problems. Do-file cleanup.do aims to rectify some of the minor
ones, such as the simple adding-up problems. It also checks for consistency in the reported


© PROVIDE Project


