Stata Technical Bulletin

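of $\mathrm{Var}(\bar{u}_j\sqrt{n_j})$. The aweighted regression reports $s^2$, an estimate of $\mathrm{Var}(\bar{u}_j\sqrt{n_j})\,N/\sum_k n_k$, where $N$ is the number of
observations. Thus,

$$s^2 = \frac{N}{\sum_k n_k}\,s_t^2 = \frac{s_t^2}{\bar{n}} \qquad (1)$$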

The logic for this adjustment is as follows: Consider the model:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$

Assume that, were this model estimated on individuals, $\mathrm{Var}(u) = \sigma^2$, a constant. Assume that individual data are not available;
what is available are averages $(\bar{y}_j, \bar{x}_{1j}, \bar{x}_{2j})$ for $j = 1, \ldots, N$, where each average is calculated over $n_j$ observations. Then
it is still true that

$$\bar{y}_j = \beta_0 + \beta_1 \bar{x}_{1j} + \beta_2 \bar{x}_{2j} + \bar{u}_j$$

where $\bar{u}_j$ is the average of $n_j$ mean-0, variance-$\sigma^2$ deviates and so itself has variance $\sigma_{\bar{u}}^2 = \sigma^2/n_j$. Thus, multiplying through
by $\sqrt{n_j}$ produces

$$\bar{y}_j\sqrt{n_j} = \beta_0\sqrt{n_j} + \beta_1 \bar{x}_{1j}\sqrt{n_j} + \beta_2 \bar{x}_{2j}\sqrt{n_j} + \bar{u}_j\sqrt{n_j}$$

and $\mathrm{Var}(\bar{u}_j\sqrt{n_j}) = \sigma^2$. The mean square error $s_t^2$ reported by estimating this transformed regression is an estimate of $\sigma^2$.
Alternatively, the coefficients and covariance matrix could be obtained by aweighted regress. The only difference would be
in the reported mean square error, which, per equation 1, is $\sigma^2/\bar{n}$. On average, each observation in the data reflects the averages
calculated over $\bar{n} = \sum_k n_k / N$ individuals, and thus this reported mean square error is the average variance of an observation
in the data set. One can retrieve the estimate of $\sigma^2$ by multiplying the reported mean square error by $\bar{n}$.
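
The equivalence described above can be checked directly. The following is a minimal sketch, not taken from the bulletin: it assumes a data set of group averages stored in the hypothetical variables ybar, x1bar, and x2bar, with the group sizes in n, and it uses current Stata syntax (e(rmse), r(mean)), which postdates this issue.

```stata
* Hypothetical variables: ybar x1bar x2bar (group averages), n (group size)
generate double rootn = sqrt(n)
generate double yt    = ybar*rootn
generate double x1t   = x1bar*rootn
generate double x2t   = x2bar*rootn

* Transformed regression: rootn takes the place of the constant term
regress yt rootn x1t x2t, noconstant
scalar s2_t = e(rmse)^2

* Equivalent aweighted regression on the untransformed averages
regress ybar x1bar x2bar [aweight=n]
scalar s2 = e(rmse)^2

* Equation 1: s2 = s2_t/nbar, so s2*nbar recovers the estimate of sigma^2
summarize n, meanonly
scalar nbar = r(mean)
display s2_t " " s2*nbar
```

Both regressions have N - 3 residual degrees of freedom, so the two numbers displayed on the last line should agree up to rounding, and the coefficient estimates and standard errors from the two regress commands should match.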

More generally, aweights are used to solve general heteroskedasticity problems. In these cases, one has the model

$$y_j = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + u_j$$

and the variance of $u_j$ is thought to be proportional to $a_j$. If the variance is proportional to $a_j$, it is also proportional to $\alpha a_j$,
where $\alpha$ is any positive constant. Not quite arbitrarily, but with no loss of generality, let us choose $\alpha = \sum_k (1/a_k)/N$, the
average value of the inverse of $a_j$. We can then write $\mathrm{Var}(u_j) = k\alpha a_j\sigma^2$, where $k$ is the constant of proportionality that is no
longer a function of the scale of the weights.

Dividing this regression through by $\sqrt{a_j}$,

$$y_j/\sqrt{a_j} = \beta_0/\sqrt{a_j} + \beta_1 x_{1j}/\sqrt{a_j} + \beta_2 x_{2j}/\sqrt{a_j} + u_j/\sqrt{a_j}$$

produces a model with $\mathrm{Var}(u_j/\sqrt{a_j}) = k\alpha\sigma^2$, which is the constant part of $\mathrm{Var}(u_j)$. Notice in particular that this variance is a
function of $\alpha$, the average of the reciprocal weights; if the weights are scaled arbitrarily, then so is this variance.

We can also estimate this model by typing

. regress y x1 x2 [aweight=1/a]

This will produce the same estimates of the coefficients and covariance matrix; the reported mean square error is, per equation
1, $[N/\sum_k (1/a_k)]\,k\alpha\sigma^2 = k\sigma^2$, because $N/\sum_k (1/a_k) = 1/\alpha$. Note that this variance is independent of the scale of $a_j$.
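
The scale invariance can be illustrated with a short sketch. It assumes the variables y, x1, x2, and a from the command above are in memory; the rescaling factor of 10 and the variable name a10 are arbitrary choices made for illustration, and e(rmse) is current Stata syntax rather than the syntax of this bulletin's era.

```stata
* Rescale the variance factor by an arbitrary positive constant
generate double a10 = 10*a

* Regression weighted by the reciprocal of the original factor
regress y x1 x2 [aweight=1/a]
scalar s2_a = e(rmse)^2

* Same regression with the rescaled factor
regress y x1 x2 [aweight=1/a10]
scalar s2_a10 = e(rmse)^2

* The two reported mean square errors should agree up to rounding
display s2_a " " s2_a10
```

The coefficients and standard errors should also be unchanged, since only the relative sizes of the aweights matter.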

dm19 Merging raw data and dictionary files

Jonathan Nash, CS First Boston, FAX 212-318-0748

I maintain several Stata data sets of selected financial market data. These data are updated regularly by tapping the massive
financial market data sets maintained by CS First Boston. The data from the First Boston data sets are converted to dictionary
files and then merged (by date) into my existing Stata data sets.

Stata’s merge command merges a .dta file into the current data set but it will not handle raw data or dictionary files.
My do-files for updating my data sets contained code for infiling my dictionary files, sorting them by date, saving them to
temporary data sets, using my existing data set, and—finally—merging the new data into the existing data.
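
The sequence of steps just described might look roughly like the following sketch. The file names newdata.dct and master.dta are hypothetical, date is assumed to identify observations uniquely in both files, and the merge syntax shown is that of current Stata rather than the syntax available when this insert was written.

```stata
* Read the raw data through its dictionary file (hypothetical name)
infile using newdata.dct, clear

* Sort by date and save to a temporary data set
sort date
tempfile new
save `new'

* Load the existing data set (hypothetical name) and merge in the new data
use master, clear
sort date
merge 1:1 date using `new'
```

After the merge, the _merge variable records which observations came from the existing data, the new data, or both.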


