Stata Technical Bulletin
STB-48
related to factors such as age, retirement and differential pensions. The latter result is not surprising since, by design, the social
housing sector is mainly for “low income” people. Observe that inequality within tenure groups accounts for far more
of total inequality than inequality between tenure groups does.
Repeated application of these decomposition methods to data for several years can be used to account for trends over time in
income inequality; see Jenkins (1995) who used subgroup partitions defined by labor market status, age, household composition,
etc. to study trends during the 1970s and 1980s. In essence one examines whether trends in overall inequality are more closely
related to changes in subgroup inequalities, subgroup mean incomes, or subgroup population shares.
geivars: Generalized Entropy inequality indices, with sampling variances
geivars estimates members of the Generalized Entropy class GE(α) for α = -1, 0, 1, 2 (see above for definitions), together
with their asymptotic sampling variances. Unit record (micro level) data are required.
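As a rough cross-check on the point estimates (not on the variances), the four indices can be computed by hand. The sketch below is not the geivars code. It assumes the standard definitions GE(α) = [mean((y/m)^α) - 1]/(α² - α) for α other than 0 and 1, GE(0) = mean(ln(m/y)), and GE(1) = mean((y/m) ln(y/m)), where m is mean income. The toy data and the variable names y and w are hypothetical.

```stata
* Hand computation of GE(-1), GE(0), GE(1), GE(2) with frequency weights.
* Hypothetical toy data: y = household income, w = household size.
clear
input double y long w
 50 1
100 2
150 3
200 2
400 1
end

* Incomes must be strictly positive: ln(y) and 1/y are undefined at y = 0.
drop if y <= 0

* Person-weighted mean income.
quietly summarize y [fw=w]
scalar mu = r(mean)

* Income relative to the mean, and the term averaged for each index.
generate double rel = y/scalar(mu)
generate double tm1 = 1/rel
generate double t0  = -ln(rel)
generate double t1  = rel*ln(rel)
generate double t2  = rel^2

quietly summarize tm1 [fw=w]
scalar ge_m1 = (r(mean) - 1)/2
quietly summarize t0 [fw=w]
scalar ge_0 = r(mean)
quietly summarize t1 [fw=w]
scalar ge_1 = r(mean)
quietly summarize t2 [fw=w]
scalar ge_2 = (r(mean) - 1)/2

display "GE(-1) = " ge_m1 "  GE(0) = " ge_0 "  GE(1) = " ge_1 "  GE(2) = " ge_2
```

GE(2) computed this way equals half the squared coefficient of variation, which gives a quick sanity check.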
The formulas for the sampling variances are taken directly from Cowell (1989). His formulas were derived assuming that
the income receiving units (households) are treated as a random sample from a bivariate distribution of income and a household
weight variable (e.g., household size). It is the assumptions about, and treatment of, weights that complicate the
estimation of sampling variances. (The issues overlap with, but are not the same as, those addressed by Stata’s svy programs.)
We require estimates of income inequality among all persons in the household population. In effect there is a random sample
of households with “self weighting” by household size, where the weights are similar to Stata’s fweights. Thus the variance
formulas do not also adjust for the effects of complex survey design features (stratification and clustering); formulas for this case
are rather complicated and the subject of current research. These problems do not arise, of course, if the data are unweighted.
The derivation of the formulas for the asymptotic variances uses the result that the GE(α) indices can be written as functions
of sample moments. For further details, see Cowell (1989).
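To make "functions of sample moments" concrete, here is a sketch in notation assumed for illustration (it need not match Cowell's). Let y_i be income and w_i the weight of household i in a sample of n households, and let m_α denote the weighted sample moment of order α. The person-weighted mean of (y/ȳ)^α is then a ratio of such moments:

```latex
\[
m_\alpha = \frac{1}{n}\sum_{i=1}^{n} w_i\, y_i^{\alpha}, \qquad
\bar{w} = m_0, \qquad
\bar{y} = \frac{m_1}{m_0},
\]
\[
GE(\alpha) = \frac{1}{\alpha^{2}-\alpha}
\left[\frac{m_\alpha/m_0}{\left(m_1/m_0\right)^{\alpha}} - 1\right],
\qquad \alpha \neq 0,\, 1,
\]
\[
GE(0) = \ln\frac{m_1}{m_0} - \frac{1}{m_0}\cdot\frac{1}{n}\sum_{i=1}^{n} w_i \ln y_i,
\qquad
GE(1) = \frac{1}{m_1}\cdot\frac{1}{n}\sum_{i=1}^{n} w_i\, y_i \ln y_i - \ln\frac{m_1}{m_0}.
\]
```

Each index is a smooth function of a small number of sample averages, so its asymptotic variance can be obtained from the variances and covariances of those averages. A delta-method argument is the usual route and is assumed here; Cowell (1989) gives the actual derivation.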
geivars output includes the estimates of the four indices and three sets of variance estimates for each index, corresponding
to different informational assumptions. V0 is the variance in the case where both mean income and household size are known.
V1 (= V0 + Δ1) is the variance in the case where the former is not known, and V2 (= V1 + Δ2) is the variance in the case where
both are unknown and estimated from the sample. (Δ1 and Δ2 are contributions to the sampling variance arising from relaxing
the informational assumptions: see Cowell 1989.) In each case the asymptotic t ratio = GE(α)/√V(α) and associated p value
are also reported.
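The relationships among the reported quantities can be checked by hand. The sketch below takes the GE(-1) figures from the example output further down and recomputes the standard error, t ratio, and first adjusted variance. The scalar names are hypothetical. The p value uses a two-sided standard normal tail, which is an assumption: this section does not say which reference distribution geivars uses, so the last digits may differ.

```stata
* Figures for GE(-1) as printed in the example output.
scalar ge   = 2.83066
scalar var0 = 6.51258
scalar d1   = -0.00176

* Standard error and asymptotic t ratio under the first assumption.
scalar se0   = sqrt(var0)
scalar tstat = ge/se0

* Two-sided p value from the standard normal (assumed reference distribution).
scalar p0 = 2*(1 - normal(abs(tstat)))

* Variance and standard error once mean income is treated as unknown.
scalar var1 = var0 + d1
scalar se1  = sqrt(var1)

display "s.e.0 = " %7.5f se0 "  asym. t = " %7.5f tstat "  p = " %7.5f p0
display "Var1  = " %7.5f var1 "  s.e.1 = " %7.5f se1
```

Up to rounding of the printed inputs, s.e.0, asym. t, Var1, and s.e.1 should match the corresponding entries in the output.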
Syntax
geivars varname [weight] [if exp] [in range]
fweights are allowed.
Example
The specialist nature of the variance formulas led me to construct a slightly different version of the 1991 UK dataset in
order to match the assumptions. I use the same household income variable eybhc, but the data are now organized by household
rather than family (the household is the sampling unit in the original survey). The grossing-up weights have been neglected in
order to focus on the self-weighting aspect. As a result, the inequality estimates are not comparable with those shown earlier.
In this example, it turns out that the sampling variances of the four inequality indices are all quite small, regardless of which
informational assumption is made. This need not be the case in general, especially if the calculations are done for subgroups
with relatively few members.
. geivars eybhc [fw=number]
Warning: eybhc has 17 values = 0. Not used in calculations
Generalized entropy inequality measures, GE(a), with asym. s.e.s
   a |     GE(a)       Var0      s.e.0    asym. t    P > |t|
-----+-------------------------------------------------------
  -1 |   2.83066    6.51258    2.55198    1.10920    0.26739
   0 |   0.18896    0.00156    0.03949    4.78552    0.00000
   1 |   0.19095    0.00655    0.08094    2.35920    0.01835
   2 |   0.25465    0.00066    0.02562    9.93927    0.00000

   a |    delta1       Var1      s.e.1
-----+---------------------------------
  -1 |  -0.00176    6.51082    2.55163
   0 |  -0.00050    0.00106    0.03253
   1 |  -0.00645    0.00010    0.01011
   2 |  -0.00043    0.00023    0.01506