Stata Technical Bulletin
. clad loffinc lsize urban coastal, ll(0) reps(200)
Initial sample size = 1581
Final sample size = 1580
Pseudo R2 = .05048178
Bootstrap statistics

Variable |   Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
---------+---------------------------------------------------------------------
   lsize |    200   1.149846   .0554115   .2544479    .6480861   1.651606  (N)
         |                                            .7073701   1.689895  (P)
         |                                            .6859084   1.624102  (BC)
---------+---------------------------------------------------------------------
   urban |    200   2.375166   .0128999   .3375226    1.709586   3.040746  (N)
         |                                            1.642076   3.120919  (P)
         |                                            1.677854   3.184893  (BC)
---------+---------------------------------------------------------------------
 coastal |    200   1.287741  -.0094159   .2830439    .7295905   1.845891  (N)
         |                                            .7311435   1.863342  (P)
         |                                            .7339153    1.90661  (BC)
---------+---------------------------------------------------------------------
   const |    200   6.443694  -.0810437   .6198413    5.221394   7.665994  (N)
         |                                            4.956254   7.557803  (P)
         |                                            5.371459   7.730506  (BC)
-------------------------------------------------------------------------------
                              N = normal, P = percentile, BC = bias-corrected
The first line of output reports that the original sample size is 1,581, and the second line shows that the estimation
algorithm dropped one case from the sample. An important caveat applies to the pseudo R-squared reported on the third line:
it is the statistic from the last iteration of the qreg command, computed on the final sample. It is not the pseudo R-squared
for the original sample, but we have opted to report it to give some indication of how the model is performing.
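The drop from 1,581 to 1,580 cases, and the fact that the pseudo R-squared comes from a last qreg iteration, both point to an iterative trimming scheme. The following is a minimal sketch of one such scheme, not clad's source code. It assumes a dependent variable y that is left-censored at 0 and two regressors x1 and x2; these names, the indicator insample, and the exact trimming rule are illustrative assumptions.

```stata
* Hypothetical sketch of an iterative trimmed median regression.
* y is assumed left-censored at 0; x1 and x2 are placeholder regressors.
generate byte insample = !missing(y, x1, x2)
quietly count if insample
display "Initial sample size = " r(N)

local dropped = 1
while `dropped' > 0 {
    * Median regression on the cases still in the sample
    quietly qreg y x1 x2 if insample
    tempvar xb
    quietly predict double `xb', xb
    * Count, then trim, cases whose fitted value lies below the limit
    quietly count if insample & `xb' < 0
    local dropped = r(N)
    quietly replace insample = 0 if `xb' < 0
    drop `xb'
}

quietly count if insample
display "Final sample size = " r(N)
* The pseudo R2 shown here refers to the final, trimmed sample only
qreg y x1 x2 if insample
```

Because the last qreg call sees only the trimmed sample, its pseudo R-squared describes the fit on those cases and not on the original data, which is the caveat raised above.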
In the example above, no sample design information is passed to clad, so the program calls Stata's bsample utility to
resample the data 200 times. To keep the sample size the same in every bootstrap resample, clad ignores observations
for which the dependent variable is missing. The estimates from these resamples are then passed to the bstat command, which
generates the standard Stata bootstrap output. For more information about the normal, percentile, and bias-corrected percentile
confidence intervals, see bstrap in the Stata manuals. For an introduction to the bootstrap principle, see Efron and Tibshirani
(1993). To reproduce results from clad, first set the random-number seed; see generate in the Stata reference
manuals for more information.
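As a brief illustration of the last two points, the sketch below fixes the seed before calling clad and then checks the normal-approximation (N) interval for lsize by hand. The seed value 1001 is arbitrary, and the use of a t distribution with reps minus 1 = 199 degrees of freedom is an inference from the printed numbers rather than something stated in the text.

```stata
* Arbitrary seed (an assumption) so that the 200 resamples can be reproduced
set seed 1001
clad loffinc lsize urban coastal, ll(0) reps(200)

* Normal-approximation (N) interval for lsize, rebuilt from the observed
* coefficient and the bootstrap standard error printed in the output above
display 1.149846 - invttail(199, 0.025)*.2544479
display 1.149846 + invttail(199, 0.025)*.2544479
```

Up to rounding of the displayed inputs, the two display commands should return approximately the (N) limits reported for lsize, .6480861 and 1.651606. The percentile and bias-corrected limits cannot be rebuilt this way, because they depend on the full set of bootstrap replicates.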
The standard errors reported above are correct only if the sample is a simple random draw. This is not the case
for the GLSS data, which were collected using a two-stage design. When information about the primary sampling unit (PSU) is
passed to clad through the psu() option, the program generates bootstrap estimates of the standard errors that are robust to
the two-stage design. The example below corrects the standard errors above for this aspect of the sample design.
. clad loffinc lsize urban coastal, ll(0) reps(200) psu(clust)
Initial sample size = 1581
Final sample size = 1580
Pseudo R2 = .05048178
Bootstrap statistics

Variable |   Reps   Observed       Bias  Std. Err.   [95% Conf. Interval]
---------+---------------------------------------------------------------------
   lsize |    200   1.149846   .0916958    .395014    .3708959   1.928797  (N)
         |                                            .6573149   2.076703  (P)
         |                                            .6507832   2.053507  (BC)
---------+---------------------------------------------------------------------
   urban |    200   2.375166   .0562143   .6152112    1.161996   3.588336  (N)
         |                                            1.285434   3.658858  (P)
         |                                             1.12299   3.495041  (BC)
---------+---------------------------------------------------------------------
 coastal |    200   1.287741   .0386539   .5439033    .2151873   2.360294  (N)
         |                                            .2898641   2.466994  (P)
         |                                            .0728349   2.216781  (BC)
---------+---------------------------------------------------------------------
   const |    200   6.443694  -.1804084    1.04149    4.389922   8.497466  (N)
         |                                            3.942665   8.130428  (P)
         |                                            4.440762   8.347237  (BC)
-------------------------------------------------------------------------------
                              N = normal, P = percentile, BC = bias-corrected
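It is worth noting that introducing information about the sample design affects only the bootstrap estimates of the standard
errors, and hence the confidence intervals. The observed coefficients are identical in the two sets of output.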
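The two sets of standard errors differ because of how each bootstrap resample is drawn. The sketch below draws one replicate each way using bsample directly. It assumes the GLSS data are in memory and that clust identifies the PSU, as in the example above. The variable name bsclust is hypothetical, and whether clad calls bsample in exactly this way is an assumption.

```stata
* Replicate drawn as in the first example: observations are resampled
* independently, ignoring the two-stage design.
preserve
bsample
count
restore

* Replicate drawn by PSU: whole clusters are sampled with replacement.
* bsclust (hypothetical name) relabels each drawn cluster, so a PSU that
* is selected twice is treated as two distinct clusters.
preserve
bsample, cluster(clust) idcluster(bsclust)
count
restore
```

Resampling whole PSUs carries the within-cluster correlation into every replicate, which is why the standard errors in the second set of output are larger while the observed coefficients do not change.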