PROVIDE Project Technical Paper 2005:1
February 2005
3.1. Reading in the data (readin.do)
The raw IES 2000 data is supplied by Statistics South Africa in a series of ASCII text files.
These fixed-width files are read into Stata using dictionary files specifying the location
(column number) and length of each variable as it appears in each row of the ASCII files. The
do-file readin.do calls up all the dictionary files. The IES 2000 ASCII files are converted to
Stata files and saved as person.dta, personwgt.dta, generalorig.dta, generalwgt.dta,
domworker.dta and homegrown.dta. 22, 23 These files are merged at a later stage to form
person- and household-level IES 2000 files. The LFS 2000:2 ASCII files are also converted to
Stata files and saved as lfsperson.dta, which is merged with the data from worker.txt to form
lfs2000_2p.dta, and lfshouse.dta, which is merged with the data from stratum_psu.txt to form
lfs2000_2h.dta. Finally, lfs2000_2p.dta and lfs2000_2h.dta are merged to form a file called
lfs2000_2.dta, which contains person- and household-level LFS 2000:2 data.
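The dictionary-based read described above can be sketched as follows. This is an illustration only: the file names (demo_person.txt, demo_person.dct, demo_person.dta), the column positions and the variables personno and age are hypothetical and do not reproduce the actual IES 2000 layout or the contents of readin.do. The sketch writes a small fixed-width file and a matching dictionary, then reads the file with infile.

```stata
* Sketch only: file names, column positions and variable names below are
* hypothetical, not the actual IES 2000 layout.
clear all

* 1. Write a tiny fixed-width ASCII file to stand in for a raw data file.
tempname fh
file open `fh' using "demo_person.txt", write text replace
file write `fh' "1001000101 34" _n
file write `fh' "1001000102  7" _n
file write `fh' "1002000101 58" _n
file close `fh'

* 2. Write a dictionary giving the start column and width of each variable.
file open `fh' using "demo_person.dct", write text replace
file write `fh' "dictionary using demo_person.txt {" _n
file write `fh' "    _column(1)  str8 hhid     %8s" _n
file write `fh' "    _column(9)  byte personno %2f" _n
file write `fh' "    _column(11) int  age      %3f" _n
file write `fh' "}" _n
file close `fh'

* 3. Read the ASCII file through the dictionary and save it as a Stata file.
infile using "demo_person.dct", clear
save "demo_person.dta", replace
list
```

In practice the dictionary would be kept as a separate file alongside the raw data rather than written from within the do-file; it is generated here only so that the sketch is self-contained.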
3.2. Forming a household-level IES 2000 dataset (ies2000h.do)
The main aim of do-file ies2000h.do is to create the household-level file ies2000h.dta. It
starts by merging general.dta with domworkerh.dta, homegrownh.dta and personh.dta. Four
do-files are called up within ies2000h.do in order to create or prepare these data files for
merging.
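A household-level merge of this kind might look like the sketch below. It uses toy data and the merge 1:1 syntax of Stata 11 and later, which is an assumption and not necessarily the syntax used in ies2000h.do. The variables domexp and hhsize are hypothetical; only hhid is taken from the text.

```stata
* Sketch only: toy data; variable names other than hhid are hypothetical.
* Uses the merge 1:1 syntax of Stata 11 and later.
clear all

* Household-level domestic worker totals (one row per household).
input long hhid double domexp
1001 1200
1003  450
end
tempfile domh
save `domh'

* General household file (one row per household).
clear
input long hhid byte hhsize
1001 4
1002 2
1003 5
end

* One-to-one merge on the household identifier; merge stops with an error
* if hhid is duplicated in either file.
merge 1:1 hhid using `domh'
tabulate _merge

* Households without domestic workers have no matching row, so their
* expenditure is missing after the merge; setting it to zero is an assumption.
replace domexp = 0 if _merge == 1
drop _merge
list
```

Because a one-to-one merge requires a unique hhid in each file, any file with several rows per household has to be reduced to household level before it is merged.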
3.2.1. Domestic workers (domworker.do)
Unlike the other household-level data files, the original file domworker.dta does not
necessarily contain only one entry per household. If a household employs more than
one domestic worker, each additional worker appears as a new entry with the same household
identification number (variable hhid). It is therefore necessary to create a household-level version of
this file where each entry or observation reports the total expense for all domestic workers
employed by the household. This avoids double counting when merging files. The following
command adds up domestic worker expenses for observations with the same hhid number.24
for var P*: by hhid, sort: egen Xh = sum(X)
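In more recent Stata versions the for command has been superseded by foreach, and the egen function sum() has been renamed total(). A roughly equivalent loop is sketched below on toy data. The variables P101 and P102 are hypothetical stand-ins for the expenditure variables, and the final step, which keeps one row per household, is an assumption about how the household-level file is then formed.

```stata
* Sketch only: toy data; P101 and P102 are hypothetical expenditure variables.
clear all
input long hhid double P101 double P102
1001 500 20
1001 300 10
1002 400  0
end

* For each variable starting with P, create a household total named <var>h.
* The varlist P* is expanded once, before the loop starts, so the new
* totals are not themselves looped over.
foreach v of varlist P* {
    bysort hhid: egen double `v'h = total(`v')
}

* Assumed follow-up step: keep a single row per household and drop the
* worker-level variables, so the file can be merged one-to-one on hhid.
bysort hhid: keep if _n == 1
drop P101 P102
list
```

Here household 1001 has two domestic worker entries, which collapse into a single row carrying the summed expenses.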
22 To save computing time, this do-file can be skipped by placing an asterisk at the beginning of the command
line do readin.do, provided that the various *.dta files already exist in the relevant folder.
23 Originally only four ASCII files were supplied with the IES 2000 data. Two new files (personwgt.dta and
generalwgt.dta) were obtained from Ingrid Woolard (HSRC). These files contain newly released person-level
and household-level weights for the IES 2000. At present they are not yet ‘official’ and cannot be
used. Also note that general.txt is now read in and saved in Stata as generalorig.dta.
24 Note that P* refers to all the variables starting with P-, i.e. the expenditure variables in domworker.dta.
© PROVIDE Project