Creating a 2000 IES-LFS Database in Stata



PROVIDE Project Technical Paper 2005:1
to various reasons, ranging from poor data capturing, inconsistent or erroneous reporting and
deliberate misrepresentation by respondents. In the welfare literature total expenditure is often
used as a more accurate measure of welfare. However, in the case of the IES 2000 there is no
reason to believe that expenditure was captured or reported more accurately than income,
since, as we show below, there appears to be no consistency in the way in which income is
over- or underreported in the data.[40]

February 2005


On average, total income (totinc) and total expenditure (totexp) do not differ by much. The
(unweighted) average totinc is R34,470, compared with an average totexp of R32,759. If
households are grouped into those over-reporting income (income exceeds expenditure), those
where income equals expenditure and those underreporting income (expenditure exceeds
income), it can be seen that most households (16,590 out of 26,215) over-report income, with
income exceeding expenditure, on average, by 36.2%. This is an interesting result given
evidence that households usually tend to underreport income in these types of household
surveys. Only 17 households report exactly the same income and expenditure, while 9,608
households underreport income. For these households expenditure exceeds income by an
average of 33.6%.
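The two percentages quoted above can be checked against the group means in the conditional summaries reported in this section. The sketch below is only an arithmetic check: the scalar names are hypothetical, and the means are copied from the reported output rather than recomputed from the data. It is meant to be run as a do-file.

```stata
* Group means copied from the conditional summaries in this section
scalar inc_over  = 34906.31
scalar exp_over  = 25634.41
scalar inc_under = 33769.68
scalar exp_under = 45110.37

* Households with totinc > totexp: excess of income, relative to expenditure
scalar gap_over = 100 * (inc_over - exp_over) / exp_over

* Households with totinc < totexp: excess of expenditure, relative to income
scalar gap_under = 100 * (exp_under - inc_under) / inc_under

display %4.1f gap_over
display %4.1f gap_under
```

The two displayed values round to 36.2 and 33.6. This suggests, as an inference rather than something the paper states, that the quoted percentages are ratios of group means and not averages of household-level percentage gaps.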

. sum totinc totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |     26215    34470.39     92908.21         0    5602178
      totexp |     26215    32759.18     84078.72        12    7568643

. sum totinc totexp if totinc > totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |     16590    34906.31     106543.5       300    5602178
      totexp |     16590    25634.41     59721.15        42    3751763

. sum totinc totexp if totinc == totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |        17    5098.941     4518.258      1440      18864
      totexp |        17    5098.941     4518.258      1440      18864

. sum totinc totexp if totinc < totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |      9608    33769.68     62846.27         0    1713000
      totexp |      9608    45110.37     113529.9        12    7568643
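The three conditional summaries can also be produced in a single pass by coding the reporting category explicitly. The following is a minimal sketch, not the procedure used in the paper. The data are toy values for illustration only (not IES 2000 figures), and the variable name repcat and its value labels are hypothetical. The variable diff follows the definition used in the text (totinc minus totexp).

```stata
* Toy data for illustration only (NOT IES 2000 values)
clear
input totinc totexp
1200  900
 800  800
 500  750
3000 2000
   0   12
end

* Difference between reported income and reported expenditure
generate double diff = totinc - totexp

* Hypothetical category: 1 = over-report, 2 = equal, 3 = underreport.
* The if-condition leaves repcat missing when diff is missing.
generate byte repcat = cond(diff > 0, 1, cond(diff == 0, 2, 3)) if !missing(diff)
label define repcat 1 "income > expenditure" 2 "income = expenditure" 3 "income < expenditure"
label values repcat repcat

* One table with the same statistics that summarize reports, by category
tabstat totinc totexp diff, by(repcat) statistics(n mean sd min max)

* Percentiles, skewness and kurtosis of the difference
summarize diff, detail
```

The guard on missing values matters because Stata treats a missing value as larger than any number, so an unguarded comparison such as totinc > totexp evaluates to true when totinc is missing. To apply the sketch to real data, the toy input block would be replaced by loading the household file.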

Below are the detailed summary statistics of the variable diff, defined as totinc minus totexp.
The variable diff ranges from -R5.86 million to R5.50 million, with a mean value of R1,711
(that is, income is over-reported on average). Graphically, the distribution of diff looks fairly
symmetrical (see Figure 10), but bear in mind that the x-axis in the figure is truncated. In
reality the distribution is skewed to the right. These absolute differences are, however,

[40] The data reported here come from ies2000h_orig.dta after dropping mismatched observations from
the various merges [drop if merge1a == 2 | merge1b == 2 | merge1c == 1 | merge0b == 1].


© PROVIDE Project


