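Suppose data are available on $y_{it}$ and $x_{it}$, and we focus on the regressions
$$
\begin{aligned}
y_{it} &= \theta x_{it} + u_{it} && \text{(RI)} \\
y_{it} &= b x_{it} + c w_t + v_{it} && \text{(RII)}
\end{aligned}
\tag{7}
$$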
which differ in that the latter is augmented by $w_t = \sum_{i=1}^{N} u_{it}/N$ to proxy the unobserved random factor $z_t$. It can be shown that, for our baseline DGP, this proxy is equal to the first principal component of $u$ up to a scaling factor. To study the properties of these regressions, we define the sequential probability limits (plim) for $T \to \infty$ and any fixed $N$
$$
S^N_{yy} = \operatorname*{plim}_{T\to\infty} T^{-1} \sum_t y_{it}^2; \qquad
S^N_{yx} = \operatorname*{plim}_{T\to\infty} T^{-1} \sum_t y_{it} x_{it}
\tag{8}
$$
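and likewise for the other variables in (6), whose (co)variances are:
$$
\begin{aligned}
E(x_{it}^2) &= \sigma_x^2 = \sigma_d^2 + \sigma_z^2 \\
E(d_{it}^2) &= \sigma_d^2 \\
E(\varepsilon_{it}^2) &= \sigma_\varepsilon^2 \\
E(x_{it} x_{jt}) &= \sigma_z^2 \\
E(x_{it} d_{it}) &= \sigma_d^2 \\
E(x_{it} z_t) &= \sigma_z^2 \\
E(z_t^2) &= \sigma_z^2 \\
E(y_{it}^2) &= \beta^2 \sigma_d^2 + (\beta + \gamma)^2 \sigma_z^2 + \sigma_\varepsilon^2 \\
E(x_{it} y_{it}) &= \beta \sigma_d^2 + (\beta + \gamma) \sigma_z^2 \\
E(y_{it} z_t) &= (\beta + \gamma) \sigma_z^2 \\
E(y_{it} y_{jt}) &= (\beta + \gamma)^2 \sigma_z^2 \\
E(z_t d_{it}) &= E(z_t \varepsilon_{it}) = E(d_{it} \varepsilon_{it}) = 0
\end{aligned}
$$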
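As a numerical sanity check on the moments listed above, the following sketch simulates one long panel and compares time averages with the stated expressions. It assumes a DGP of the form $x_{it} = d_{it} + z_t$, $y_{it} = \beta x_{it} + \gamma z_t + \varepsilon_{it}$ with independent zero-mean normal components; this form is inferred from the listed (co)variances rather than copied from (6). The function names and parameter values are illustrative, and cross-unit moments are computed for two distinct units.

```python
import numpy as np


def simulate_panel(N, T, beta, gamma, var_d, var_z, var_e, seed=0):
    """Draw one panel from the ASSUMED DGP (inferred from the listed moments):
        x_it = d_it + z_t,   y_it = beta * x_it + gamma * z_t + eps_it.
    Returns y and x with shape (N, T) and the common factor z with shape (T,).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, np.sqrt(var_z), size=T)         # common factor
    d = rng.normal(0.0, np.sqrt(var_d), size=(N, T))    # idiosyncratic part of x
    eps = rng.normal(0.0, np.sqrt(var_e), size=(N, T))  # idiosyncratic error
    x = d + z                                           # z broadcasts over units
    y = beta * x + gamma * z + eps
    return y, x, z


def sample_moments(y, x, z):
    """Time averages T^{-1} sum_t (.) using units i = 0 and j = 1."""
    return {
        "E(x^2)": np.mean(x[0] ** 2),
        "E(x_i x_j)": np.mean(x[0] * x[1]),
        "E(x z)": np.mean(x[0] * z),
        "E(y^2)": np.mean(y[0] ** 2),
        "E(x y)": np.mean(x[0] * y[0]),
        "E(y z)": np.mean(y[0] * z),
        "E(y_i y_j)": np.mean(y[0] * y[1]),
    }


def theoretical_moments(beta, gamma, var_d, var_z, var_e):
    """Population moments exactly as listed in the text."""
    return {
        "E(x^2)": var_d + var_z,
        "E(x_i x_j)": var_z,
        "E(x z)": var_z,
        "E(y^2)": beta**2 * var_d + (beta + gamma) ** 2 * var_z + var_e,
        "E(x y)": beta * var_d + (beta + gamma) * var_z,
        "E(y z)": (beta + gamma) * var_z,
        "E(y_i y_j)": (beta + gamma) ** 2 * var_z,
    }


if __name__ == "__main__":
    params = dict(beta=1.0, gamma=0.5, var_d=1.0, var_z=1.0, var_e=1.0)
    y, x, z = simulate_panel(N=2, T=200_000, seed=1, **params)
    sample = sample_moments(y, x, z)
    theory = theoretical_moments(**params)
    for name in theory:
        print(f"{name:12s} sample={sample[name]:.3f} theory={theory[name]:.3f}")
```

Because the limits are taken for $T \to \infty$ with $N$ fixed, the check uses a large $T$ and only two units; the sample values should agree with the formulas up to simulation noise.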
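The auxiliary regression $z_t = \delta x_{it} + \eta_{it}$ implies that $y_{it} = (\beta + \gamma\delta) x_{it} + \gamma \eta_{it} + \varepsilon_{it}$. Hence, the OLS estimator $\hat{\theta}$ measures $(\beta + \gamma\delta)$ and the residuals $\hat{u}_{it}$ estimate $u_{it} = \gamma \eta_{it} + \varepsilon_{it}$, with variance $\sigma_u^2 \equiv E(u_{it}^2) = \gamma^2 (1 - \delta)^2 \sigma_z^2 + \gamma^2 \delta^2 \sigma_d^2 + \sigma_\varepsilon^2$ and covariance $\sigma_{ij} \equiv E(u_{it} u_{jt}) = \gamma^2 (1 - \delta)^2 \sigma_z^2$. Then we have $w_t = \sum_{i=1}^{N} (\gamma \eta_{it} + \varepsilon_{it})/N = \gamma (1 - \delta) z_t - \gamma \delta \bar{d}_t + \bar{\varepsilon}_t$, where $\bar{d}_t = N^{-1} \sum_i d_{it}$ and $\bar{\varepsilon}_t$ is defined analogously, and it follows that
$$
\begin{aligned}
E(w_t^2) &= \gamma^2 (1 - \delta)^2 \sigma_z^2 + \gamma^2 \delta^2 N^{-1} \sigma_d^2 + N^{-1} \sigma_\varepsilon^2 \\
E(y_{it} w_t) &= (\beta + \gamma)(1 - \delta) \gamma \sigma_z^2 - \beta \gamma \delta N^{-1} \sigma_d^2 + N^{-1} \sigma_\varepsilon^2 \\
E(x_{it} w_t) &= \gamma (1 - \delta) \sigma_z^2 - \gamma \delta N^{-1} \sigma_d^2; \qquad E(z_t w_t) = \gamma (1 - \delta) \sigma_z^2
\end{aligned}
$$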
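The construction of $w_t$ and its moments can be checked by simulation. The sketch below is a minimal illustration under stated assumptions: it uses a DGP of the form $x_{it} = d_{it} + z_t$, $y_{it} = \beta x_{it} + \gamma z_t + \varepsilon_{it}$ (inferred from the moments, not quoted from (6)), estimates (RI) by pooled OLS without an intercept (a choice made here for illustration), and averages the residuals across units to obtain $w_t$. It also takes $\delta = \sigma_z^2/(\sigma_d^2 + \sigma_z^2)$, which is the projection coefficient $E(x_{it} z_t)/E(x_{it}^2)$ implied by the auxiliary regression. All function names are hypothetical.

```python
import numpy as np


def residual_average_moments(N, T, beta, gamma, var_d, var_z, var_e, seed=0):
    """Simulate the ASSUMED DGP, fit (RI) by pooled OLS, build w_t from the
    residuals and return sample moments involving w_t."""
    rng = np.random.default_rng(seed)
    z = rng.normal(0.0, np.sqrt(var_z), size=T)
    d = rng.normal(0.0, np.sqrt(var_d), size=(N, T))
    eps = rng.normal(0.0, np.sqrt(var_e), size=(N, T))
    x = d + z
    y = beta * x + gamma * z + eps

    theta_hat = np.sum(x * y) / np.sum(x * x)  # pooled OLS slope, no intercept
    u_hat = y - theta_hat * x                  # residuals of (RI), shape (N, T)
    w = u_hat.mean(axis=0)                     # cross-sectional average, shape (T,)

    return {
        "theta": theta_hat,
        "E(w^2)": np.mean(w**2),
        "E(y w)": np.mean(y * w),              # averaged over units and time
        "E(x w)": np.mean(x * w),
        "E(z w)": np.mean(z * w),
    }


def theoretical_w_moments(N, beta, gamma, var_d, var_z, var_e):
    """Population counterparts, using the expressions in the text."""
    delta = var_z / (var_d + var_z)            # assumed projection coefficient
    g1d = gamma * (1.0 - delta)
    return {
        "theta": beta + gamma * delta,
        "E(w^2)": g1d**2 * var_z + gamma**2 * delta**2 * var_d / N + var_e / N,
        "E(y w)": (beta + gamma) * g1d * var_z
        - beta * gamma * delta * var_d / N
        + var_e / N,
        "E(x w)": g1d * var_z - gamma * delta * var_d / N,
        "E(z w)": g1d * var_z,
    }


if __name__ == "__main__":
    params = dict(N=5, beta=1.0, gamma=0.5, var_d=1.0, var_z=1.0, var_e=1.0)
    sample = residual_average_moments(T=200_000, seed=2, **params)
    theory = theoretical_w_moments(**params)
    for name in theory:
        print(f"{name:8s} sample={sample[name]:.4f} theory={theory[name]:.4f}")
```

The terms of order $N^{-1}$ are visible only for small $N$, which is why the illustration uses $N = 5$ rather than a large cross-section.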
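For our baseline DGP we have
$$
\begin{aligned}
S^N_{yy} &= \beta^2 \sigma_d^2 + (\beta + \gamma)^2 \sigma_z^2 + \sigma_\varepsilon^2 \\
S^N_{ww} &= \gamma^2 (1 - \delta)^2 \sigma_z^2 + \gamma^2 \delta^2 N^{-1} \sigma_d^2 + N^{-1} \sigma_\varepsilon^2 \\
S^N_{xw} &= \gamma \left[ (1 - \delta) \sigma_z^2 - N^{-1} \delta \sigma_d^2 \right] \\
S^N_{yw} &= \gamma \left[ (\beta + \gamma)(1 - \delta) \sigma_z^2 - N^{-1} \beta \delta \sigma_d^2 \right] + N^{-1} \sigma_\varepsilon^2 \\
S^N_{xx} &= \sigma_d^2 + \sigma_z^2 \\
S^N_{xz} &= \sigma_z^2 \\
S^N_{yx} &= \beta \sigma_d^2 + (\beta + \gamma) \sigma_z^2 \\
S^N_{zw} &= \gamma (1 - \delta) \sigma_z^2
\end{aligned}
$$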
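To evaluate these closed-form limits for particular parameter values, the sketch below codes them as a function of $N$ and the variance parameters. As an illustration only, it also solves the standard two-regressor OLS normal equations to obtain the limit of the (RII) slope implied by these quantities; that step is textbook OLS algebra applied to the limits above, not a result quoted from the text. It assumes $\delta = \sigma_z^2/(\sigma_d^2 + \sigma_z^2)$, the projection coefficient of the auxiliary regression, and all names are hypothetical.

```python
import numpy as np


def sequential_limits(N, beta, gamma, var_d, var_z, var_e):
    """Closed-form limits S^N for T -> infinity and fixed N, as listed in the text."""
    delta = var_z / (var_d + var_z)  # assumed projection coefficient
    return {
        "Syy": beta**2 * var_d + (beta + gamma) ** 2 * var_z + var_e,
        "Sww": gamma**2 * (1 - delta) ** 2 * var_z
        + gamma**2 * delta**2 * var_d / N
        + var_e / N,
        "Sxw": gamma * ((1 - delta) * var_z - delta * var_d / N),
        "Syw": gamma * ((beta + gamma) * (1 - delta) * var_z - beta * delta * var_d / N)
        + var_e / N,
        "Sxx": var_d + var_z,
        "Sxz": var_z,
        "Syx": beta * var_d + (beta + gamma) * var_z,
        "Szw": gamma * (1 - delta) * var_z,
    }


def implied_plim_b(S):
    """Solve the 2x2 normal equations of (RII) in the limit:
        [Sxx Sxw] [b]   [Syx]
        [Sxw Sww] [c] = [Syw]
    Returns the implied limits of (b, c)."""
    A = np.array([[S["Sxx"], S["Sxw"]], [S["Sxw"], S["Sww"]]])
    rhs = np.array([S["Syx"], S["Syw"]])
    b, c = np.linalg.solve(A, rhs)
    return b, c


if __name__ == "__main__":
    for N in (5, 50, 5000):
        S = sequential_limits(N, beta=1.0, gamma=0.5, var_d=1.0, var_z=1.0, var_e=1.0)
        b, c = implied_plim_b(S)
        print(f"N={N:5d}  implied limit of b = {b:.4f}")
```

With the illustrative values $\beta = 1$, $\gamma = 0.5$ and unit variances, `implied_plim_b` returns a value for $b$ between $\beta$ and $\beta + \gamma\delta$ at $N = 5$, and a value close to $\beta$ when $N$ is very large.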
These results are used in the next sections to analyze how well the OLS estimator $\hat{b}$ (and, by comparison, $\hat{\theta}$) measures the true $\beta$.