Update to a program for saving a model fit as a dataset



Stata Technical Bulletin



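. clad loffinc lsize urban coastal, ll(0) reps(200)

Initial sample size = 1581
Final sample size = 1580
Pseudo R2 = .05048178

Bootstrap statistics

Variable |   Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
---------+----------------------------------------------------------------------
   lsize |    200   1.149846   .0554115   .2544479    .6480861   1.651606   (N)
         |                                            .7073701   1.689895   (P)
         |                                            .6859084   1.624102   (BC)
---------+----------------------------------------------------------------------
   urban |    200   2.375166   .0128999   .3375226    1.709586   3.040746   (N)
         |                                            1.642076   3.120919   (P)
         |                                            1.677854   3.184893   (BC)
---------+----------------------------------------------------------------------
 coastal |    200   1.287741  -.0094159   .2830439    .7295905   1.845891   (N)
         |                                            .7311435   1.863342   (P)
         |                                            .7339153    1.90661   (BC)
---------+----------------------------------------------------------------------
   const |    200   6.443694  -.0810437   .6198413    5.221394   7.665994   (N)
         |                                            4.956254   7.557803   (P)
         |                                            5.371459   7.730506   (BC)
---------+----------------------------------------------------------------------
                              N = normal, P = percentile, BC = bias-corrected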
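
As a rough check on how the (N) interval is built, the reported normal limits for lsize are consistent with the observed coefficient plus or minus a t critical value on reps - 1 = 199 degrees of freedom times the bootstrap standard error. This is inferred from the numbers in the table rather than stated by the authors, and the sketch assumes a Stata release that provides the invttail() function.

```stata
* Recompute the (N) limits for lsize from the reported table entries.
* 1.149846 is the observed coefficient and .2544479 its bootstrap std. err.
* invttail(199, 0.025) is the upper 2.5% point of a t distribution
* with reps - 1 = 199 degrees of freedom.
display 1.149846 - invttail(199, 0.025) * .2544479
display 1.149846 + invttail(199, 0.025) * .2544479
```

The two displayed values should agree closely with the reported limits of .6480861 and 1.651606, up to rounding in the displayed inputs.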

The first line of output tells us that the original sample size is 1,581, and the second line shows that the estimation
algorithm dropped one case from the sample. An important caveat about the pseudo R-squared reported on the third line is that
it is the statistic reported by the last iteration of the qreg command on the final sample. It is not the pseudo R-squared
for the original sample, but we have opted to report it to provide some indication of how the model is performing.

In the example above, no sample design information is passed to clad and the program calls Stata’s bsample utility to
resample the data 200 times. In order to maintain the same sample size in each bootstrap resample, clad ignores observations
where the dependent variable is missing. The results from bsample are then passed to the bstat command to generate the
standard Stata bootstrap output. For more information about the normal, percentile, and bias-corrected percentile confidence
intervals, see bstrap in the Stata manuals. For an introduction to the bootstrap principle, see Efron and Tibshirani (1993). In
order to reproduce results from clad, it is necessary first to set the random number seed; see generate in the Stata reference
manuals for more information.
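
A minimal sketch of the reproducibility point, assuming clad is installed and the estimation data are in memory: setting the seed immediately before the call fixes the sequence of bootstrap resamples, so rerunning both lines together should give the same bootstrap statistics. The seed value 1001 is arbitrary and used here only for illustration.

```stata
* Fix the random-number seed so that the 200 bootstrap resamples
* can be reproduced. The value 1001 is an arbitrary, illustrative choice.
set seed 1001
clad loffinc lsize urban coastal, ll(0) reps(200)
```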

The standard errors reported above are correct only if the sample is a simple random draw. This is not the case
with the GLSS data, which was collected using a two-stage design. clad can generate bootstrap estimates of the standard errors
that are robust to the two-stage design when the variable identifying the primary sampling unit (PSU) is passed to it through
the psu() option. In the example below, we correct the standard errors above for this aspect of the sample design.

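. clad loffinc lsize urban coastal, ll(0) reps(200) psu(clust)

Initial sample size = 1581
Final sample size = 1580
Pseudo R2 = .05048178

Bootstrap statistics

Variable |   Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
---------+----------------------------------------------------------------------
   lsize |    200   1.149846   .0916958    .395014    .3708959   1.928797   (N)
         |                                            .6573149   2.076703   (P)
         |                                            .6507832   2.053507   (BC)
---------+----------------------------------------------------------------------
   urban |    200   2.375166   .0562143   .6152112    1.161996   3.588336   (N)
         |                                            1.285434   3.658858   (P)
         |                                             1.12299   3.495041   (BC)
---------+----------------------------------------------------------------------
 coastal |    200   1.287741   .0386539   .5439033    .2151873   2.360294   (N)
         |                                            .2898641   2.466994   (P)
         |                                            .0728349   2.216781   (BC)
---------+----------------------------------------------------------------------
   const |    200   6.443694  -.1804084    1.04149    4.389922   8.497466   (N)
         |                                            3.942665   8.130428   (P)
         |                                            4.440762   8.347237   (BC)
---------+----------------------------------------------------------------------
                              N = normal, P = percentile, BC = bias-corrected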

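It is worth noting that introducing information about the sample design affects only the bootstrap statistics (bias, standard
errors, and confidence intervals); the observed coefficients, the sample sizes, and the pseudo R2 are identical in the two runs.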
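
To make the role of the PSU information concrete, the following sketch draws a single bootstrap resample of whole PSUs by hand. It is an illustration only and is not taken from the clad source code: it assumes a current Stata release in which bsample accepts the cluster() and idcluster() options, and the variable name bsclust is hypothetical.

```stata
* Illustrative sketch only; not the code used inside clad.
* Draw one bootstrap resample in which whole PSUs (clust) are sampled
* with replacement, so all households in a drawn PSU stay together.
set seed 1001
preserve
* bsclust is a hypothetical new variable giving each drawn PSU its own id,
* which distinguishes repeated draws of the same original PSU.
bsample, cluster(clust) idcluster(bsclust)
* Number of observations in this resample.
count
restore
```

Because households within a PSU are kept together, the resamples reflect the within-cluster dependence that a simple observation-level resample ignores, which is why the standard errors in the second run are larger.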


