Creating a 2000 IES-LFS Database in Stata



PROVIDE Project Technical Paper 2005:1

February 2005


Table 8 continued...

                                       Total                     Not
                            True   "missing"            missing, not   Total obs.
               Uncoded   missing     (A + B)  Miscoded      miscoded  (C + D + E)

Public transport
P1504Q01        25,259         9      25,268       113           882       26,263
P1504Q02        25,263         9      25,272       109           882       26,263
P1504Q03        25,263         9      25,272       109           882       26,263
P1504Q04        25,262         9      25,271       110           882       26,263
P1504Q05        25,263         9      25,272       109           882       26,263
P1504Q06        25,263        10      25,273       109           881       26,263
P1504Q07        25,263         9      25,272       109           882       26,263
P1504Q08        25,263         9      25,272       109           882       26,263
P1504total      25,258         9      25,267       114           882       26,263

Cost of other sport/recreation goods
P2003Q01        24,726         -      24,726       102         1,435       26,263
P2003Q02        24,736         1      24,737        92         1,434       26,263
P2003Q03        24,735         2      24,737        93         1,433       26,263
P2003Q0401      24,731         1      24,732        97         1,434       26,263
P2003Q05        24,734         -      24,734        94         1,435       26,263
P2003Q06        24,735         -      24,735        93         1,435       26,263
P2003Q07        24,735         1      24,736        93         1,434       26,263
P2003Total      21,405         -      21,405     3,423         1,435       26,263
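
The column headings imply two adding-up identities: uncoded plus true missing equals total
"missing", and total "missing" plus miscoded plus valid (not missing, not miscoded) equals
total observations. The Python sketch below checks both identities for the rows of this part
of Table 8. It is an illustration only and not part of the original Stata processing; a dash
in the table is assumed to mean zero.

```python
# Rows transcribed from this part of Table 8:
# (variable, uncoded, true missing, total "missing", miscoded,
#  not missing and not miscoded, total obs.). A dash is entered as 0 (assumption).
ROWS = [
    ("P1504Q01", 25259, 9, 25268, 113, 882, 26263),
    ("P1504Q02", 25263, 9, 25272, 109, 882, 26263),
    ("P1504Q03", 25263, 9, 25272, 109, 882, 26263),
    ("P1504Q04", 25262, 9, 25271, 110, 882, 26263),
    ("P1504Q05", 25263, 9, 25272, 109, 882, 26263),
    ("P1504Q06", 25263, 10, 25273, 109, 881, 26263),
    ("P1504Q07", 25263, 9, 25272, 109, 882, 26263),
    ("P1504Q08", 25263, 9, 25272, 109, 882, 26263),
    ("P1504total", 25258, 9, 25267, 114, 882, 26263),
    ("P2003Q01", 24726, 0, 24726, 102, 1435, 26263),
    ("P2003Q02", 24736, 1, 24737, 92, 1434, 26263),
    ("P2003Q03", 24735, 2, 24737, 93, 1433, 26263),
    ("P2003Q0401", 24731, 1, 24732, 97, 1434, 26263),
    ("P2003Q05", 24734, 0, 24734, 94, 1435, 26263),
    ("P2003Q06", 24735, 0, 24735, 93, 1435, 26263),
    ("P2003Q07", 24735, 1, 24736, 93, 1434, 26263),
    ("P2003Total", 21405, 0, 21405, 3423, 1435, 26263),
]


def check_row(row):
    """Return True if both adding-up identities hold for one table row."""
    _, uncoded, true_missing, total_missing, miscoded, valid, total_obs = row
    return (uncoded + true_missing == total_missing
            and total_missing + miscoded + valid == total_obs)


# Names of rows that violate either identity; an empty list means all rows add up.
failures = [row[0] for row in ROWS if not check_row(row)]
print(failures)
```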

From Table 8 it is clear that the numbers of true missing values are quite low. The only
sections that contain more than ten missing values in certain variables are the two sections on
monthly housing costs. Changing missing values to zeroes is justifiable given the small
number of true missing values. It is better to lose the information content of a few true
missing values by changing them to zero than to lose the entire observation because of the
adding-up restrictions in Stata. As explained previously, missing values in expenditure
categories from domworkerh.dta and homegrownh.dta were also changed to zero, given that
these were optional sections in the questionnaire.
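
The trade-off can be illustrated with a small Python sketch. It assumes that the adding-up
restriction refers to the way an arithmetic sum in Stata evaluates to missing as soon as any
component is missing; missing values are represented here by None, and the household values
are hypothetical.

```python
def strict_total(values):
    """Sum that is missing (None) as soon as any component is missing."""
    if any(v is None for v in values):
        return None
    return sum(values)


def zero_fill(values):
    """Replace missing (None) components with zero."""
    return [0 if v is None else v for v in values]


# Hypothetical household: three expenditure items, the second left blank.
items = [120, None, 35]

total_before = strict_total(items)            # missing: the observation would be lost
total_after = strict_total(zero_fill(items))  # the two reported items are retained
print(total_before, total_after)
```

With the blank item set to zero the household keeps a usable total, at the cost of treating
an unknown amount as no expenditure.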

The only other anomaly in Table 8 is the variable P2003Total. Answering this question was
not optional, and hence the large number of uncoded observations is strange. However, as
explained below, it was established that this was the result of a coding error. Since this is
a sub-total, it could simply be recalculated.
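
Recalculating a sub-total amounts to summing its component items and overwriting the stored
value. A minimal Python sketch follows; the component list and the record are hypothetical,
since the items that actually feed P2003Total are defined by the questionnaire and are not
listed here.

```python
# Assumed component list, for illustration only.
COMPONENTS = ["P2003Q01", "P2003Q02", "P2003Q03"]


def recalculate_subtotal(record, components=COMPONENTS):
    """Sum the component items of a record, counting missing (None) as zero."""
    return sum(record[name] or 0 for name in components)


# Hypothetical record whose stored sub-total was miscoded.
record = {"P2003Q01": 50, "P2003Q02": None, "P2003Q03": 20, "P2003Total": 9999}
record["P2003Total"] = recalculate_subtotal(record)
print(record["P2003Total"])
```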

Another concern relates to questions 3.1 and 3.2 in the housing section. The questionnaire
requests respondents to answer either the section labelled "monthly housing cost if rented" or
the section labelled "monthly housing cost if owned". In 21 cases households answered both
sections. It does not seem unlikely that some households own property (excluding holiday
homes) as well as rent property. The list below shows the reported values for each of the
questions in these two sections. There appears to be no duplication (apart from record number
26015, where the levy is reported twice). The error is small enough that it is unlikely to
affect the results to any great degree, and it is consequently ignored.
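
A check of this kind can be sketched in Python as follows. The field names and household
records are hypothetical, and "reported twice" is assumed to mean that the same positive levy
appears in both sections; the sketch is not the procedure used for the paper.

```python
# Hypothetical field names for the two housing-cost sections.
RENTED_ITEMS = ["rent", "levy_rented"]
OWNED_ITEMS = ["bond_payment", "levy_owned"]


def answered(record, items):
    """True if the household reported a positive amount for any listed item."""
    return any((record.get(item) or 0) > 0 for item in items)


def answered_both(record):
    """True if the household reported amounts in both sections."""
    return answered(record, RENTED_ITEMS) and answered(record, OWNED_ITEMS)


def levy_reported_twice(record):
    """True if the same positive levy appears in both sections (assumed meaning)."""
    rented = record.get("levy_rented") or 0
    owned = record.get("levy_owned") or 0
    return rented > 0 and rented == owned


# Hypothetical households, for illustration only.
households = [
    {"id": 1, "rent": 800},
    {"id": 2, "bond_payment": 1500, "levy_owned": 200},
    {"id": 3, "rent": 600, "bond_payment": 900},
    {"id": 4, "rent": 700, "levy_rented": 150, "levy_owned": 150},
]

both = [h["id"] for h in households if answered_both(h)]
duplicated = [h["id"] for h in households if levy_reported_twice(h)]
print(both, duplicated)
```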


© PROVIDE Project


