Stata Technical Bulletin

snp15.1

Update to somersd

Roger Newson, Guy’s, King’s and St Thomas’ School of Medicine, London, UK, [email protected]

Abstract: somersd calculates confidence intervals for rank-order statistics. It has been improved, streamlined, debugged, and
intensively certified.

Keywords: Somers’ D, Kendall’s tau, rank correlation, confidence intervals, nonparametric methods.

Syntax

somersd [varlist] [weight] [if exp] [in range] [, cluster(varname) level(#) taua tdist

transf(transformation_name) cimatrix(new_matrix) ]

where transformation_name is one of

iden | z | asin | rho | zrho

fweights, iweights, and pweights are allowed.

New options

cimatrix(new_matrix) specifies an output matrix to be created, containing estimates and confidence limits for the untransformed
Somers’ D, Kendall’s τ_a or Greiner’s ρ parameters. If transf() is specified, then the confidence limits will be asymmetric
and based on symmetric confidence limits for the transformed parameters. This option (like level) may be used in replay
mode as well as in non-replay mode.
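The way symmetric limits on a transformed scale become asymmetric limits on the original scale can be sketched as follows. This is only an illustration in Python, not the code of somersd: it assumes that the z transformation denotes Fisher's z (the hyperbolic arctangent), that normal rather than t quantiles are wanted (that is, tdist is not specified), and that the standard error is carried to the transformed scale by the delta method. The function name fisher_z_ci and the numbers 0.60 and 0.10 are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_z_ci(estimate, std_err, level=95.0):
    """Back-transformed confidence limits for a statistic lying in (-1, 1).

    Assumption: the transformation is Fisher's z = atanh(estimate).
    """
    # Two-sided standard normal critical value for the requested level.
    z_crit = NormalDist().inv_cdf(0.5 + level / 200.0)
    zeta = math.atanh(estimate)
    # Delta method: the derivative of atanh(d) is 1 / (1 - d^2).
    zeta_se = std_err / (1.0 - estimate ** 2)
    # Limits are symmetric around zeta, then mapped back with tanh.
    lower = math.tanh(zeta - z_crit * zeta_se)
    upper = math.tanh(zeta + z_crit * zeta_se)
    return lower, upper


# Hypothetical estimate and jackknife standard error, for illustration only.
est, se = 0.60, 0.10
lower, upper = fisher_z_ci(est, se)
print(lower, est, upper)
```

For a positive estimate, the lower arm of the interval is longer than the upper arm, and both limits stay strictly inside the interval from -1 to 1, which symmetric limits on the untransformed scale do not guarantee.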

New saved results

somersd now additionally saves the name of the program called by predict in the macro e(predict).

Remarks

somersd was introduced in Newson (2000). The program calculates confidence intervals for the rank order statistics Somers’
D and Kendall’s τ_a for the first variable of varlist as a predictor of each of the other variables in varlist, with estimates and
jackknife covariances saved as estimation results. The new version contains the following improvements:

1. The new option cimatrix has been added (mostly for programmers).

2. The program somers_p has been added as the predict program for somersd, and it warns the user that predict should
not be used after somersd.

3. somersd has been streamlined. If cluster() is not specified, then processing time is now quadratically dependent on the
number of distinct value combinations in varlist, instead of being quadratically dependent on the number of observations
as before. This makes a vast difference to the time taken to process discrete variables in data sets with thousands of
observations.

4. A bug has been corrected, which formerly caused incorrect output when the taua option was used with unequal fweights.
(This bug was not present in the earlier version of somersd circulated via the Ideas list, and there was no excuse for me
to allow it to creep in when upgrading somersd for the STB.)

5. The certification script used to certify somersd is now much more comprehensive than before, ruling out the above bug and
a large range of others. (See the online help for cscript.) Amongst other checks, it compares the jackknife confidence intervals
for Kendall’s τ_a produced by somersd with those produced by ktau and jknife (Gould 1995). The latter programs produce the same confidence
limits as somersd, taua tdist in the simplest case, without weights, clustering or transformations. However, ktau
and jknife take much longer, requiring a time cubically dependent on the number of observations.
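The saving described in item 3 can be illustrated with a small Python sketch. It is not the algorithm used inside somersd, and it covers only the point estimate of Kendall's τ_a, not the jackknife variances. It assumes unweighted, unclustered data, and the function names taua_pairs and taua_collapsed and the example data are hypothetical. The idea shown is that observations sharing the same pair of values can be collapsed into one cell with a frequency, so that the double loop runs over distinct value combinations instead of over observations.

```python
from collections import Counter
from itertools import product


def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def taua_pairs(x, y):
    """Kendall's tau_a from all ordered pairs of observations: O(n^2)."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(n):
            # Pairs with i == j contribute zero, so they need no special case.
            total += sign(x[i] - x[j]) * sign(y[i] - y[j])
    return total / (n * (n - 1))


def taua_collapsed(x, y):
    """Same statistic from distinct (x, y) cells with frequencies: O(m^2)."""
    n = len(x)
    cells = Counter(zip(x, y))  # m distinct value combinations
    total = 0
    for ((x1, y1), f1), ((x2, y2), f2) in product(cells.items(), repeat=2):
        # Each pair of cells stands for f1 * f2 pairs of observations.
        total += f1 * f2 * sign(x1 - x2) * sign(y1 - y2)
    return total / (n * (n - 1))


# Hypothetical discrete data: 8 observations, 6 distinct value combinations.
x = [1, 1, 2, 2, 3, 3, 3, 1]
y = [0, 1, 0, 1, 1, 1, 0, 0]
print(taua_pairs(x, y), taua_collapsed(x, y))
```

Both functions return the same value, but the second loops over the square of the number of distinct cells. With discrete variables and thousands of observations, the number of cells is typically far smaller than the number of observations.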

Acknowledgments

I would like to thank Bill Gould of Stata Corporation for suggesting the somers_p program, and Bill Gould and Ken Higbee
of Stata Corporation for a great deal of very helpful advice on designing certification scripts.

References

Gould, W. 1995. sg34: Jackknife estimation. Stata Technical Bulletin 24: 25-29. Reprinted in Stata Technical Bulletin Reprints, vol. 4, pp. 165-170.

Newson, R. 2000. snp15: somersd—Confidence limits for nonparametric statistics and their differences. Stata Technical Bulletin 55: 47-55.
