Stata Technical Bulletin


STB-48


     60%    223.64366    0.36338    Std. Dev.        188.66945
     70%    259.48672    0.46592
     75%    282.24383    0.52353    Variance       35596.15976
     80%    310.64428    0.58654    Half CV^2          0.32290
     90%    406.43660    0.73643    Gini coeff.        0.33721
     95%    520.03530    0.83332    p90/p10            4.50111
     99%    894.92777    0.94313    p75/p25            2.11675

The likelihood values and estimates of the percentiles, inequality indices and other distribution parameters are remarkably
similar for both models.

All the estimates are also very similar to their nonparametric counterparts. For example, the nonparametric estimate of the
Gini coefficient is 0.333 and of the GE(2) index (half the squared coefficient of variation), 0.362: see the output from ineqdeco
in Jenkins (1999). Other nonparametric statistics can be derived using summarize, detail:

. summarize eybhc [fw=wgt] if eybhc>0, detail

                    Equiv. net income BHC

      Percentiles      Smallest
 1%     41.10482       .0076653
 5%       79.116       1.938724
10%     92.79689       2.631398       Obs            55687900
25%     127.8417       2.808512       Sum of Wgt.    55687900

50%      195.036                      Mean           233.7762
                        Largest       Std. Dev.      198.8109
75%     287.5094       1846.438
90%      402.397       2013.499       Variance       39525.79
95%     504.1051       3024.663       Skewness       14.44232
99%      818.264       7740.044       Kurtosis       484.1126

The greatest differences between the parametric and nonparametric estimates are at the very bottom and, especially, the very
top of the distribution. The latter difference is almost certainly due to the presence of a single high-income outlier; note, for
example, how far the fitted value of the top-sensitive index GE(2), half the squared coefficient of variation, falls below its
nonparametric counterpart (0.323 versus 0.362). In some cases, one might argue that the parametric estimates are more reliable,
on the grounds that income data in the extreme tails of the distribution are not reliable.
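
The comparison can be checked by hand. The following is a minimal sketch, not part of smfit or dagumfit: it copies the mean, standard deviation, and percentiles from the summarize, detail output above into scalars (the scalar names are illustrative only), computes the nonparametric GE(2) index and percentile ratios, and displays them beside the fitted values reported in the parametric output. The GE(2) figure it produces is approximately 0.362, the nonparametric value quoted earlier.

```stata
* Nonparametric figures copied from the summarize, detail output above
scalar mean_np = 233.7762
scalar sd_np   = 198.8109
scalar p10_np  = 92.79689
scalar p25_np  = 127.8417
scalar p75_np  = 287.5094
scalar p90_np  = 402.397

* GE(2) is half the squared coefficient of variation
scalar ge2_np   = 0.5*(sd_np/mean_np)^2
scalar r9010_np = p90_np/p10_np
scalar r7525_np = p75_np/p25_np

* Fitted values are typed in from the parametric output shown earlier
display "GE(2):   nonparametric " %7.5f ge2_np   "   fitted 0.32290"
display "p90/p10: nonparametric " %7.5f r9010_np "   fitted 4.50111"
display "p75/p25: nonparametric " %7.5f r7525_np "   fitted 2.11675"
```

With the data in memory, the same quantities can be taken directly from the results saved by summarize, detail, namely r(mean), r(Var), r(p10), r(p25), r(p75), and r(p90), rather than typed in by hand.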

Goodness-of-fit may also be assessed graphically using probability plots. The psm, qsm, pdagum, and qdagum programs
written by Cox (1999) provide these using estimates produced by smfit and dagumfit.

The similarity of estimates in the example appears contrary to the claim sometimes made in the literature that the Dagum
distribution typically provides a better fit than the Singh-Maddala one. Results can perhaps be reconciled by observing that in
virtually all cases reported to date, estimates have been derived from grouped (banded) income data rather than unit record data
as here.

Other criteria besides goodness-of-fit may be relevant to a choice between smfit and dagumfit. The main difference I
have found is in convergence stability and time. In all the applications I have experimented with, smfit has converged quickly,
in only a few iterations from the default starting values. By contrast, dagumfit typically took many more iterations and
sometimes failed to converge using the default starting values (try fitting the Dagum distribution to the variable price in
auto.dta). In the illustration shown above, smfit took about a minute to converge using a Pentium P1/166 PC running Stata 5.0
for Windows 95, but dagumfit required almost 18 minutes. Part of the problem is that it is difficult to specify good default
starting values for dagumfit. In all the cases where the program did not converge, experimentation with a range of alternative
starting values led eventually to convergence. Use of the trace option is therefore recommended in all initial fits.

Acknowledgments

This work forms part of the scientific research program of the Institute for Social and Economic Research, and was supported
by core funding from the University of Essex and the UK Economic and Social Research Council. The programs are revisions and
extensions of some presented at the 4th UK Stata Users' Group meeting. Markus Jantti and Nick Cox made helpful comments on
earlier versions of the programs.

References

Cox, N. J. 1999. gr35: Diagnostic plots for assessing Singh-Maddala and Dagum distributions fitted by MLE. Stata Technical Bulletin 48: 2-4.

Dagum, C. 1977. A new model of personal income distribution: specification and estimation. Economie Appliquée 30: 413-437.

--. 1980. The generation and distribution of income, the Lorenz curve and the Gini ratio. Economie Appliquée 33: 327-367.

Jenkins, S. P. 1999. sg104: Analysis of income distributions. Stata Technical Bulletin 48: 4-18.


