PROVIDE Project Technical Paper 2005:1
to various reasons, ranging from poor data capturing and inconsistent or erroneous reporting to
deliberate misrepresentation by respondents. In the welfare literature total expenditure is often
used as a more accurate measure of welfare. However, in the case of the IES 2000 there is no
reason to believe that expenditure was captured or reported more accurately than income,
since, as we show below, there appears to be no consistency in the way in which income is
over- or underreported in the data.40
February 2005
On average, total income (totinc) and total expenditure (totexp) differ little: the
(unweighted) average totinc is R34,470, compared to an average totexp of R32,759. If
households are grouped into those over-reporting income, those where income equals
expenditure and those underreporting income, it can be seen that most households (16,590
out of 26,215) over-report income, with income exceeding expenditure, on average, by
36.2%. This is an interesting result given evidence that households usually tend to
underreport income in these types of household surveys. Only 17 households report exactly
the same income and expenditure, while 9,608 households underreport income. For these
households expenditure exceeds income by an average of 33.6%.
. sum totinc totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |     26215    34470.39    92908.21          0    5602178
      totexp |     26215    32759.18    84078.72         12    7568643

. sum totinc totexp if totinc > totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |     16590    34906.31    106543.5        300    5602178
      totexp |     16590    25634.41    59721.15         42    3751763

. sum totinc totexp if totinc == totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |        17    5098.941    4518.258       1440      18864
      totexp |        17    5098.941    4518.258       1440      18864

. sum totinc totexp if totinc < totexp

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      totinc |      9608    33769.68    62846.27          0    1713000
      totexp |      9608    45110.37    113529.9         12    7568643

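The percentages quoted in the text can be checked against the group means in the Stata output above. The sketch below is illustrative only and is not part of the original analysis. It assumes that the 36.2% and 33.6% figures are ratios of group means, which the reported means are consistent with, rather than averages of household-level ratios. The names GROUPS and pct_gap are hypothetical.

```python
# Group sizes and (unweighted) means in Rand, copied from the Stata output above.
GROUPS = {
    "over":  {"n": 16590, "totinc": 34906.31, "totexp": 25634.41},
    "equal": {"n": 17,    "totinc": 5098.941, "totexp": 5098.941},
    "under": {"n": 9608,  "totinc": 33769.68, "totexp": 45110.37},
}


def pct_gap(larger, smaller):
    """Percentage by which one group mean exceeds another (assumed definition)."""
    return 100.0 * (larger / smaller - 1.0)


# The three groups should account for every household in the sample.
total_n = sum(g["n"] for g in GROUPS.values())

# Income above expenditure among households with totinc > totexp.
over_gap = pct_gap(GROUPS["over"]["totinc"], GROUPS["over"]["totexp"])

# Expenditure above income among households with totinc < totexp.
under_gap = pct_gap(GROUPS["under"]["totexp"], GROUPS["under"]["totinc"])

# Overall mean of (totinc - totexp), rebuilt as a size-weighted average
# of the three group-level differences in means.
mean_diff = sum(
    g["n"] * (g["totinc"] - g["totexp"]) for g in GROUPS.values()
) / total_n

print(total_n, round(over_gap, 1), round(under_gap, 1), round(mean_diff))
```

Under this assumption the group sizes sum to the full sample, and the size-weighted difference in group means matches the gap between the overall means of totinc and totexp up to rounding.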
Below are the detailed summary statistics of the variable diff, defined as totinc minus totexp.
The variable diff ranges from -R5.86 million to R5.50 million, with a mean value of R1,711
(i.e. income exceeds expenditure on average). Graphically the distribution of diff looks fairly
symmetrical (see Figure 10), but bear in mind that the x-axis in the figure is truncated. In
reality the distribution is skewed to the right. These absolute differences are, however,
40 The data reported here comes from ies2000h_orig.dta after dropping mismatched observations from the
various merges [drop if merge1a == 2 | merge1b == 2 | merge1c == 1 | merge0b == 1].
© PROVIDE Project