PROVIDE Project Technical Paper 2005:1
February 2005
industries or activities that are based on the same mapping of the commodity accounts
(variable activities).34
Unfortunately, the ISIC 93 codes used for variable stccode were not always disaggregated
enough to map factors to each of the 96 industry categories. In some cases, for example
food production, the activity disaggregation goes one step beyond the factor code
disaggregation in variable stccode (see activities.do). This problem cannot be fixed in
Stata. Instead, after the data had been extracted to a spreadsheet to form the
factor-activity sub-matrix, the Supply and Use Tables (SUT 2000) were used to find the
relative value-added payments from activities to factors for those industries that are not
disaggregated enough. The value-added payments were then allocated to the more
disaggregated activity accounts in the ratios calculated from the Supply and Use Tables.
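The allocation step described above can be illustrated with a minimal sketch. It is written in Python rather than in the spreadsheet actually used, and every name and number in it is hypothetical: the function allocate_by_sut_shares, the three food sub-activities and their value-added figures are assumptions made for illustration, not values from IES 2000 or SUT 2000.

```python
# Illustrative only: activity names and figures are hypothetical,
# not taken from IES 2000 or SUT 2000.

def allocate_by_sut_shares(aggregate_payment, sut_value_added):
    """Split one aggregated factor payment across sub-activities
    in proportion to each sub-activity's SUT value-added."""
    total = sum(sut_value_added.values())
    if total <= 0:
        raise ValueError("SUT value-added must sum to a positive number")
    return {
        activity: aggregate_payment * value_added / total
        for activity, value_added in sut_value_added.items()
    }

# Value-added paid to one factor by three sub-activities, as read from the SUT
sut_va = {"meat": 30.0, "dairy": 10.0, "grain_mill": 60.0}

# Payment to the same factor recorded in the survey for the aggregate food code
split = allocate_by_sut_shares(200.0, sut_va)
print(split)
```

The allocated amounts always sum to the original aggregate payment, so the control total taken from the survey is preserved while the split follows the SUT ratios.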
In order to obtain household-level labour income data, the person-level labour income data
has to be aggregated to household level. The following Stata statement is used to
achieve this.
for var P*: by hhid, sort: egen Xh = sum(X)
Only the observations relating to the head of the household are kept, creating a
household-level database that contains, among other things, total household-level income
from labour, the race and gender of the head of the household, and information relating to
the adult equivalence scales. This file is saved as personh.dta to distinguish it from the
person-level person.dta.
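The aggregation from person level to household level can be sketched as follows. This is a Python illustration of the logic, not the authors' Stata code. The variable names P_wage and P_bonus stand in for the P* income variables, and the boolean field head is an assumed marker for the head of the household; neither comes from the IES 2000 data.

```python
from collections import defaultdict

# Hypothetical person-level records; field names are assumptions.
persons = [
    {"hhid": 1, "head": True, "P_wage": 100.0, "P_bonus": 10.0},
    {"hhid": 1, "head": False, "P_wage": 50.0, "P_bonus": 0.0},
    {"hhid": 2, "head": True, "P_wage": 80.0, "P_bonus": 5.0},
]

def to_household_level(rows, income_vars):
    """Sum each income variable within a household, then keep only the
    head's record with the household totals attached as <var>h."""
    totals = defaultdict(lambda: dict.fromkeys(income_vars, 0.0))
    for row in rows:
        for var in income_vars:
            totals[row["hhid"]][var] += row[var]

    households = []
    for row in rows:
        if row["head"]:
            record = dict(row)  # keeps the head's own characteristics
            for var in income_vars:
                record[var + "h"] = totals[row["hhid"]][var]
            households.append(record)
    return households

households = to_household_level(persons, ["P_wage", "P_bonus"])
for household in households:
    print(household["hhid"], household["P_wageh"], household["P_bonush"])
```

Keeping the head's record rather than collapsing the data means that person-level attributes of the head, such as race and gender, are carried into the household-level file alongside the totals.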
3.2.4. General income and expenditure file (general.do)
Once domworker.do, homegrown.do and person.do have been run, the file generalorig.dta,
which contains the bulk of the household-level income and expenditure data, is opened and
merged with generalwgt.dta. The resulting file is saved as general.dta. The do-file
programme now returns to ies2000h.do and merges general.dta with the household-level files
domworkerh.dta, homegrownh.dta and personh.dta. The merge processes are done in
succession. Variables merge1a, merge1b and merge1c show the merge results. Tabulating
merge1a shows that there were 24,134 observations in general.dta not found in
domworkerh.dta. It can be safely assumed that these households did not employ domestic
34 The IES 2000 metadata file is somewhat confusing in this regard. It appears as if variables stccode and
jobcode were meant to be used jointly to form a single occupation code variable based on ISCO 88. This
is in fact how it was done in the LFS 2000:2 (see variable Q41Occup). In the LFS a second set of
questions was then asked relating to the type of goods produced at the workplace. This information was
then used to derive the activity code based on ISIC 93 (variable Q42Indus in LFS 2000:2). However, a
comparison of the IES and LFS suggests that variable stccode is the same as the LFS industry code
variable, while jobcode is the same as the LFS occupation code variable. It is therefore assumed that
stccode (activities) and jobcode (factors) are correctly defined and coded in IES 2000.
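The successive merges described in section 3.2.4 rely on a merge indicator such as merge1a to show which households were matched. The sketch below illustrates that idea in Python rather than Stata. It assumes the usual Stata coding (1 = in the master file only, 2 = in the using file only, 3 = in both) and assumes general.dta acts as the master file; the function name, field names and records are hypothetical.

```python
from collections import Counter

def merge_with_indicator(master, using, key="hhid"):
    """Match-merge two lists of records on `key` and add a 'merge' code:
    1 = master only, 2 = using only, 3 = found in both."""
    using_by_key = {row[key]: row for row in using}
    master_keys = set()
    merged = []
    for row in master:
        master_keys.add(row[key])
        match = using_by_key.get(row[key])
        code = 3 if match is not None else 1
        # Master values take precedence where both files share a field
        merged.append({**(match or {}), **row, "merge": code})
    for row in using:
        if row[key] not in master_keys:
            merged.append({**row, "merge": 2})
    return merged

# Hypothetical data: three households, one of which employs a domestic worker
general = [{"hhid": 1, "income": 500.0},
           {"hhid": 2, "income": 900.0},
           {"hhid": 3, "income": 300.0}]
domworkerh = [{"hhid": 2, "domwage": 40.0}]

merged = merge_with_indicator(general, domworkerh)
tabulation = Counter(row["merge"] for row in merged)
print(dict(tabulation))
```

Under these assumptions, households with code 1 appear only in the general file, which corresponds to the households the text treats as having no domestic worker record.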
© PROVIDE Project