12
Stata Technical Bulletin
STB-4
. reg mhi3mo imale iagecont married inonwht
(obs=3265)

  Source |       SS       df       MS         Number of obs =    3265
---------+------------------------------      F(  4,  3260) =  116.13
   Model |  182843.927     4  45710.9817      Prob > F      =  0.0000
Residual |  1283155.16  3260  393.605878      R-square      =  0.1247
---------+------------------------------      Adj R-square  =  0.1236
   Total |  1465999.09  3264  449.141878      Root MSE      =   19.84

Variable | Coefficient   Std. Error        t  Prob > |t|         Mean
---------+-----------------------------------------------------------
  mhi3mo |                                                   71.36854
---------+-----------------------------------------------------------
   imale |    4.519749     .7416599    6.094       0.000     .3840735
iagecont |    .4287276     .0224545   19.093       0.000     53.66473
 married |    2.806007     .7322122    3.832       0.000     .5911179
 inonwht |     3.25364     .8643187    3.764       0.000     .2073507
   _cons |    44.29177     1.332665   33.235       0.000            1
---------+-----------------------------------------------------------
We see that the effect of being married is still significant, but not as large as it appeared in the raw data. There are also
positive effects for being male and for being nonwhite. Age is a strong positive predictor: the effect is about 17 points for
40 years of age (0.43 points per year times 40 years), which is about 1 standard deviation for the mental health index. Because
age has such a dramatic effect, it is important to measure that effect accurately, and there is no reason to suspect that it is
purely linear. One solution might be to include age squared, but there is also no reason to suspect a quadratic effect and, with
this amount of data, it would be best to let the data “select” the functional form. One way is to decompose age into a set of splines:
. gen ages45 = max(0,iagecont-45)
. gen ages55 = max(0,iagecont-55)
. gen ages65 = max(0,iagecont-65)
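As a quick sanity check on the three hinge variables, one might confirm that each is zero at or below its hinge and counts
years beyond it. The sketch below assumes the data used in the regression above are in memory; the if conditions and the range
of rows listed are illustrative choices, not part of the original analysis.

```stata
* Each hinge variable should be identically zero at or below its hinge
summarize ages45 if iagecont <= 45
summarize ages55 if iagecont <= 55
summarize ages65 if iagecont <= 65

* Spot-check a few observations: a person aged 60 should show 15, 5, and 0
list iagecont ages45 ages55 ages65 in 1/10
```

In current versions of Stata, max() ignores missing arguments, so an observation with a missing iagecont would receive 0
rather than missing on each hinge variable; adding "if iagecont < ." to the gen commands guards against that. Because iagecont
itself enters the model, such observations are dropped from the regression either way.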
These variables, along with iagecont, allow us to fit a connected set of line segments with hinges at 45, 55, and 65 years of
age. If we include these four variables in a linear regression, the “age effect” is modeled as
$$E = \beta\,\texttt{iagecont} + \beta_{45}\,\texttt{ages45} + \beta_{55}\,\texttt{ages55} + \beta_{65}\,\texttt{ages65}$$
For persons younger than 45, the effect is simply $E = \beta\,\texttt{iagecont}$ because the other three variables are defined
as zero, and the slope is $\beta$.
At age 45, ages45 kicks in, taking on the value 0 (age 45), 1 (age 46), 2 (age 47), and so on. Thus, the age effect is
$E = \beta\,\texttt{iagecont} + \beta_{45}\,\texttt{ages45}$. The line joins with the iagecont < 45 line at age 45 because
ages45 is zero there, but the slope (the change in $E$ for a one-year change in age) is now $\beta + \beta_{45}$.
At age 55, the process repeats as ages55 kicks in, taking on the value 0 (age 55), 1 (age 56), and so on. The age effect is
$E = \beta\,\texttt{iagecont} + \beta_{45}\,\texttt{ages45} + \beta_{55}\,\texttt{ages55}$. Again the line joins at the age 55
hinge where ages55 is zero, but the slope is now $\beta + \beta_{45} + \beta_{55}$.
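Collecting the cases so far, and anticipating the final hinge at 65 (which works the same way), the change in the age effect
for a one-year increase in age can be written in one place. This only restates the argument of the text; the symbol $a$ for a
whole-number value of iagecont is introduced here for brevity and is not part of the original notation.

$$
E(a+1) - E(a) =
\begin{cases}
\beta, & a < 45 \\
\beta + \beta_{45}, & 45 \le a < 55 \\
\beta + \beta_{45} + \beta_{55}, & 55 \le a < 65 \\
\beta + \beta_{45} + \beta_{55} + \beta_{65}, & a \ge 65
\end{cases}
$$

Each coefficient on a hinge variable is therefore the change in slope at that hinge, not the slope of its segment.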
The process repeats once more at age 65. We now estimate our regression, obtaining estimates for $\beta$, $\beta_{45}$,
$\beta_{55}$, and $\beta_{65}$:
. reg mhi3mo imale iagecont married inonwht ages*
(obs=3265)

  Source |       SS       df       MS         Number of obs =    3265
---------+------------------------------      F(  7,  3257) =   67.63
   Model |  186044.231     7  26577.7473      Prob > F      =  0.0000
Residual |  1279954.86  3257  392.985833      R-square      =  0.1269
---------+------------------------------      Adj R-square  =  0.1250
   Total |  1465999.09  3264  449.141878      Root MSE      =  19.824

Variable | Coefficient   Std. Error        t  Prob > |t|         Mean
---------+-----------------------------------------------------------
  mhi3mo |                                                   71.36854
---------+-----------------------------------------------------------
   imale |    4.555014     .7413871    6.144       0.000     .3840735
iagecont |    .3357814     .0832666    4.033       0.000     53.66473
 married |    2.698273     .7539907    3.579       0.000     .5911179
 inonwht |    3.342804     .8681187    3.851       0.000     .2073507
  ages45 |    .1939794     .2070006    0.937       0.349     11.87662
  ages55 |     .092176     .2676469    0.344       0.731     5.902229
  ages65 |   -.4430367     .2194174   -2.019       0.044     1.962236
   _cons |    47.33282     3.030649   15.618       0.000            1
---------+-----------------------------------------------------------
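Because each hinge coefficient is a change in slope, the slope within each age segment is a running sum of the estimates. A
minimal sketch of that arithmetic, assuming the regression above has just been run so that its coefficients are available as
_b[varname]; the display labels are illustrative.

```stata
* Slope of the age effect (points per year of age) within each segment
display "under 45: " _b[iagecont]
display "45 to 55: " _b[iagecont] + _b[ages45]
display "55 to 65: " _b[iagecont] + _b[ages45] + _b[ages55]
display "over 65:  " _b[iagecont] + _b[ages45] + _b[ages55] + _b[ages65]
```

From the table, these sums work out to roughly .34, .53, .62, and .18 points per year, so the point estimates suggest an age
effect that steepens up to age 65 and flattens afterward. In recent versions of Stata, lincom (for example,
lincom iagecont + ages45) reports the same sums together with standard errors.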