Update to a program for saving a model fit as a dataset




Stata Technical Bulletin


STB-58


Description

This insert provides the program clad for estimating Powell’s (1984) censored least absolute deviations estimator (CLAD)
and for computing bootstrap estimates of its sampling variance. The CLAD estimator is a generalization of the least absolute deviations (LAD)
estimator, which is implemented in Stata in the command qreg. Unlike the standard estimators of the censored regression model
such as tobit or other maximum likelihood approaches, the CLAD estimator is robust to heteroscedasticity and is consistent and
asymptotically normal for a wide class of error distributions. See Arabmazar and Schmidt (1981) and Vijverberg (1987) for
empirical examples of the magnitude of the bias resulting from the tobit estimator in the presence of nonnormal error distributions.
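To make the estimator concrete, the criterion minimized under left censoring at a known point can be sketched as the sum of absolute deviations between the observed outcome and the censored linear index. The Python function below is only an illustration of that criterion, not the code used by clad; the name clad_objective and its arguments are hypothetical.

```python
def clad_objective(beta, rows, y, lower=0.0):
    """Sum of absolute deviations |y_i - max(lower, x_i'beta)|.

    beta  : list of coefficients
    rows  : list of regressor vectors x_i (include a 1.0 for the constant)
    y     : list of observed, left-censored outcomes
    lower : censoring point (illustrative default of zero)
    """
    total = 0.0
    for x_i, y_i in zip(rows, y):
        index = sum(b * x for b, x in zip(beta, x_i))  # linear index x_i'beta
        fitted = max(lower, index)                     # censor the prediction
        total += abs(y_i - fitted)
    return total
```

Observations whose linear index falls below the censoring point contribute only through the censoring point itself, so the criterion is not a standard LAD problem on the full sample.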

This program sidesteps the issue of programming analytical standard errors and provides instead bootstrapped estimates of
the sampling variance. Rogers (1993) shows that the standard errors reported by Stata for qreg are not robust to violations of
homoscedasticity or independence of the residuals and proposes a bootstrap alternative. We follow Rogers for the CLAD estimator
and propose two bootstrap estimates of the standard errors. The first is the standard bootstrap, which assumes that the sample
was selected using a simple random design. The second is a bootstrap estimate that assumes that the sample was selected in
two stages and replicates that design by bootstrapping in two stages.
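The difference between the two resampling schemes can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation in clad: the text does not spell out the exact scheme, so the sketch assumes that the first stage draws primary sampling units with replacement and the second stage draws observations with replacement within each drawn unit. All function names (one_stage_resample, two_stage_resample, bootstrap_se) are hypothetical.

```python
import random
from collections import defaultdict

def one_stage_resample(rows, rng):
    """Simple random draw, with replacement, of len(rows) observations."""
    n = len(rows)
    return [rows[rng.randrange(n)] for _ in range(n)]

def two_stage_resample(rows, psu_of, rng):
    """Assumed two-stage scheme: resample PSUs, then observations within PSUs."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[psu_of(row)].append(row)
    psu_ids = sorted(clusters)
    sample = []
    # Stage 1: draw as many PSUs as were observed, with replacement.
    for _ in range(len(psu_ids)):
        members = clusters[psu_ids[rng.randrange(len(psu_ids))]]
        # Stage 2: within the drawn PSU, draw as many observations
        # as it contains, with replacement.
        sample.extend(members[rng.randrange(len(members))] for _ in members)
    return sample

def bootstrap_se(rows, estimator, resample, reps=100, seed=1):
    """Standard deviation of a scalar estimator across bootstrap replications."""
    rng = random.Random(seed)
    draws = [estimator(resample(rows, rng)) for _ in range(reps)]
    mean = sum(draws) / reps
    return (sum((d - mean) ** 2 for d in draws) / (reps - 1)) ** 0.5
```

To use the two-stage scheme with bootstrap_se, wrap it so that it takes only the data and the random number generator, for example lambda r, g: two_stage_resample(r, lambda row: row[0], g) when the first element of each row identifies the primary sampling unit. When units differ in size, the two-stage sample size varies from one replication to the next.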

An advantage of the two-stage bootstrap estimates available in clad is that if the sample was collected using a two-stage
process, then the estimated standard errors will be robust to this design effect. Kish (1995) and Cochran (1997) show the
importance of correcting mean values for design effects. Scott and Holt (1982) show that the magnitude of the bias for the
estimated variance-covariance matrix for ordinary least squares estimates can be quite large when it is erroneously assumed that
the data were collected using a simple random sample when in fact a two-stage design was used.

Syntax

clad varlist [if exp] [in range] [, reps(#) psu(varname) ll[(#)] ul[(#)] dots saving(filename)
replace level(#) quantile(#) iterate(#) wlsiter(#) ]

Options

reps(#) specifies the number of bootstrap replications to be performed. The default value is 100.

psu(varname) specifies the variable identifying the primary sampling unit. If no variable is specified, then the bootstrap replication
is a single-stage, simple random draw on the sample.

ll[(#)] and ul[(#)] are as in Stata’s tobit command and indicate the censoring point. ll() indicates left censoring and
ul() indicates right censoring. If ll or ul is specified without a specific censoring value, then clad assumes that the
lower limit is the minimum observed in the data (if ll is specified) and the upper limit is the maximum (if ul is specified).
If nothing is specified for a lower or upper bound, clad assumes that the lower limit is zero. clad only functions with
lower or upper censoring; one cannot specify censoring at both the lower and upper bound.
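The rules for choosing the censoring point can be summarized in a short sketch. The Python function below only restates the rules described above and is not taken from clad; the name resolve_censoring, the AUTO marker (standing for an option given without a value), and the choice to raise an error when both limits are given are assumptions made for illustration.

```python
AUTO = object()  # hypothetical marker: option given without a value

def resolve_censoring(y, ll=None, ul=None):
    """Return (side, limit) following the rules described for ll() and ul().

    ll / ul : None if the option is omitted, AUTO if given without a value,
              or a number if a censoring value is supplied.
    """
    if ll is not None and ul is not None:
        # Censoring at both bounds is not supported.
        raise ValueError("specify either ll or ul, not both")
    if ul is not None:
        # Right censoring: default limit is the observed maximum.
        return ("right", max(y) if ul is AUTO else float(ul))
    if ll is None:
        # Nothing specified: left censoring at zero.
        return ("left", 0.0)
    # Left censoring: default limit is the observed minimum.
    return ("left", min(y) if ll is AUTO else float(ll))
```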

dots prints a dot to the screen for each bootstrap replication, thereby allowing the user to estimate, after a few replications, the
time to completion.

saving(filename) creates a Stata datafile (.dta file) containing the bootstrap sample of the parameter estimates.

replace overwrites the Stata datafile specified in saving(), if it already exists.

level(#) specifies the confidence level, in percent, for confidence intervals. The default is level(95) or as set by set level.

quantile(#) specifies the quantile to be estimated and should be a number between 0 and 1, exclusive. Numbers larger than
1 are interpreted as a percent. The default value of 0.5 corresponds to the median.

iterate(#) specifies the maximum number of iterations that will be allowed to find a solution. The default value is 16,000,
and the range is 1 to 16,000.

wlsiter(#) specifies the number of weighted least squares iterations that will be attempted before the linear programming iterations
are started. The default value is 1. If there are convergence problems (something we have never observed), increasing this
value should help.

Examples

To illustrate the use of clad, we use data from the 1988 Ghana Living Standard Survey (GLSS) and consider a somewhat
nonsensical regression. The sample considered is 1,581 households, and the dependent variable, loffinc, is the log of household
nonfarm income. Since some households are fully engaged in farming, this variable has 528 observations with zeros recorded.
This variable is regressed on the log of the size of the household, lsize, and two geographic dummy variables, urban and
coastal. When we issue clad we obtain the results below.


