Stata Technical Bulletin

STB-4

program define example3

* residual resampling regression bootstrap
* assumes variables "Y" and "X" in "source.dta"
set more 1
drop _all
set maxobs 2000
* If source.dta contains > 2,000 cases, set maxobs higher.
quietly use source.dta
quietly drop if Y==. | X==.
quietly regress Y X
capture predict Yhat
capture predict e, resid
quietly replace e=e/sqrt(1-((_result(3)+1)/_result(1)))
* Previous two commands obtain full-sample regression
* residuals, and "fatten" them, dividing by:
*       sqrt(1 - K/_N)
* where K is # of model parameters and _N is sample size.
macro define _coefX=_b[X]
quietly save, replace
capture erase bootdat3.log
log using bootdat3.log
log off
set seed 1111
macro define _bsample 1
while %_bsample<1001 {
    quietly use source.dta, clear
    quietly generate ee=e[int(_N*uniform())+1]
    quietly generate YY=Yhat+ee
    quietly regress YY X
    * We resample residuals only, then generate bootstrap
    * Y values (called YY) by adding bootstrap residuals (ee)
    * to predicted values from the original-sample
    * regression (Yhat). Finally, regress these bootstrap
    * YY values on original-sample X.
    macro define _bSE=_b[X]/sqrt(_result(6))
    log on
    display %_bsample
    display _b[_cons]
    display _b[X]
    display %_bSE
    display (_b[X]-%_coefX)/%_bSE
    display
    log off
    macro define _bsample=%_bsample+1
}
log close
drop _all
infile bsample bcons bcoefX bSE StucoefX using bootdat3.log
label variable bsample "bootstrap sample number"
label variable bcons "sample Y-intercept, b0"
label variable bcoefX "sample coefficient on X, b1"
label variable bSE "sample standard error of b1"
label variable StucoefX "studentized coefficient on X"
label data "regression boot/residual resampling"
save boot3.dta, replace
end
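
For readers without access to this Stata release, the steps in example3 can be sketched in Python. This is only an illustrative analogue, not a translation of the program above: the function names `ols` and `residual_bootstrap` are hypothetical, the sketch handles a single predictor only, and Python's random number generator differs from Stata's uniform(), so the same seed will not reproduce the bootstrap samples drawn by the listing.

```python
import math
import random


def ols(x, y):
    """Simple regression of y on x: return (intercept, slope, slope SE)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    rss = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance uses n - 2 df
    return b0, b1, se


def residual_bootstrap(x, y, reps=1000, seed=1111):
    """Residual-resampling bootstrap for a one-predictor regression.

    Returns one tuple per bootstrap sample:
    (intercept, slope, slope SE, studentized slope).
    """
    rng = random.Random(seed)
    n = len(x)
    b0, b1, _ = ols(x, y)
    yhat = [b0 + b1 * xi for xi in x]
    k = 2  # model parameters: intercept and slope
    scale = math.sqrt(1 - k / n)
    # "Fatten" the full-sample residuals by dividing by sqrt(1 - K/N).
    e = [(yi - fi) / scale for yi, fi in zip(y, yhat)]
    results = []
    for _ in range(reps):
        # Draw n residuals with replacement; X and Yhat stay fixed.
        ee = [e[rng.randrange(n)] for _ in range(n)]
        yy = [fi + ei for fi, ei in zip(yhat, ee)]
        c0, c1, cse = ols(x, yy)
        results.append((c0, c1, cse, (c1 - b1) / cse))
    return results
```

Calling `residual_bootstrap(x, y)` on two equal-length lists of numbers returns 1,000 tuples whose fields correspond to the bcons, bcoefX, bSE, and StucoefX variables that example3 writes to bootdat3.log.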

To summarize our results in the regression of New York air pollution (Y) on population density (X):

                                  slope           standard error

original sample                   5.67 × 10^-6    7.13 × 10^-7
bootstrap, data resampling        6.24 × 10^-6    21.0 × 10^-7
bootstrap, residual resampling    5.66 × 10^-6    7.89 × 10^-7

Because both assume fixed X and i.i.d. errors, residual resampling and the original-sample regression give similar results
(though the bootstrap standard error is about 10% higher). In contrast, data resampling obtains a standard error almost three
times the original-sample estimate, and a radically nonnormal distribution (skewness=3.6, kurtosis=18.3) centered to the right
of the original-sample regression slope. The differences in sampling distributions seen in Figure 2 dramatize how crucial the
fixed-X and i.i.d. errors assumptions are.
