technology equations and model investment behavior.
3.1 Identifying the Distribution of the Latent Variables
We use a general notation for all measurements to simplify the econometric analysis. Let
$Z_{a,k,t,j}$ be the $j$th measurement at time $t$ on measure of type $a$ for factor $k$. We have
measurements on test scores and parental and teacher assessments of skills ($a = 1$), on
investment ($a = 2$) and on parental endowments ($a = 3$). Each measurement has a cognitive
and noncognitive component so $k \in \{C, N\}$. We initially assume that measurements are
additively separable functions of the latent factors $\theta_{k,t}$ and $I_{k,t}$:
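$$Z_{1,k,t,j} = \mu_{1,k,t,j} + \alpha_{1,k,t,j}\,\theta_{k,t} + \varepsilon_{1,k,t,j} \tag{3.1}$$
$$Z_{2,k,t,j} = \mu_{2,k,t,j} + \alpha_{2,k,t,j}\,I_{k,t} + \varepsilon_{2,k,t,j}, \tag{3.2}$$
where $E(\varepsilon_{a,k,t,j}) = 0$, $j \in \{1, \ldots, M_{a,k,t}\}$, $t \in \{1, \ldots, T\}$, $k \in \{C, N\}$, $a \in \{1, 2\}$,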
and where εa,k,t,j are uncorrelated across the j.12 Assuming that parental endowments are
measured only once in period t = 1, we write
$$Z_{3,k,1,j} = \mu_{3,k,1,j} + \alpha_{3,k,1,j}\,\theta_{k,P} + \varepsilon_{3,k,1,j}, \tag{3.3}$$
$E(\varepsilon_{3,k,1,j}) = 0$, $j \in \{1, \ldots, M_{3,k,1}\}$, and $k \in \{C, N\}$.13,14
The $\alpha_{a,k,t,j}$ are factor loadings. The parameters and variables are defined conditional on
$X$. To reduce the notational burden we keep $X$ implicit. Following standard conventions
in factor analysis, we set the scale of the factors by assuming $\alpha_{a,k,t,1} = 1$ and normalize
$E(\theta_{k,t}) = 0$ and $E(I_{k,t}) = 0$ for all $k \in \{C, N\}$, $t = 1, \ldots, T$. Separability makes the
identification analysis transparent. We consider a more general nonseparable model below.
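To illustrate how the normalizations pin down means and loadings, the following is a minimal simulation sketch, not the estimation procedure used in this paper. It assumes a single latent factor with three measurements, normally distributed factor and errors, and a first loading fixed at one; the function names `simulate` and `recover` and all numerical values are hypothetical. Because the factor has mean zero, each intercept equals the mean of its measurement, and because the errors are uncorrelated across measurements, ratios of cross-measurement covariances recover the remaining loadings.

```python
import random


def simulate(n, mu, alpha, sigma_eps, seed=0):
    """Draw n rows of Z_j = mu_j + alpha_j * theta + eps_j with E(theta) = 0.

    Normal draws are an illustrative assumption, not a model requirement.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)  # latent factor, mean zero by normalization
        rows.append([m + a * theta + rng.gauss(0.0, s)
                     for m, a, s in zip(mu, alpha, sigma_eps)])
    return rows


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def recover(rows):
    """Return (mu_hat, alpha_hat) for three measurements, taking alpha_1 = 1."""
    z1, z2, z3 = (list(col) for col in zip(*rows))
    # E(theta) = 0 and E(eps) = 0 imply E(Z_j) = mu_j.
    mu_hat = [mean(z1), mean(z2), mean(z3)]
    # For j != l: Cov(Z_j, Z_l) = alpha_j * alpha_l * Var(theta),
    # so with alpha_1 = 1 the covariance ratios isolate alpha_2 and alpha_3.
    alpha2_hat = cov(z2, z3) / cov(z1, z3)
    alpha3_hat = cov(z2, z3) / cov(z1, z2)
    return mu_hat, [1.0, alpha2_hat, alpha3_hat]


# Hypothetical parameter values chosen only for illustration.
rows = simulate(n=20000, mu=[0.5, -1.0, 2.0], alpha=[1.0, 0.8, 1.5],
                sigma_eps=[0.5, 0.7, 0.6])
mu_hat, alpha_hat = recover(rows)
print("intercepts:", mu_hat)
print("loadings:", alpha_hat)
```

With a large sample the printed intercepts and loadings should lie close to the values passed to `simulate`. The covariance ratios require the loadings on the second and third measurements to be nonzero.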
Given measurements Za,k,t,j, we can identify the mean functions μa,k,t,j, a ∈ {1, 2, 3}, t ∈
12 An economic model that rationalizes the investment measurement equations in terms of family inputs is
presented in Web Appendix 2. See also Cunha and Heckman (2008).
13 This formulation assumes that measurements a ∈ {1, 2, 3} proxy only one factor. This is not strictly
required for identification. One can identify the correlated factor model if there is one measurement for
each factor that depends solely on the one factor and standard normalizations and rank conditions are
imposed. The other measurements can be generated by multiple factors. This follows from the analysis
of Anderson and Rubin (1956) who give precise conditions for identification in factor models. Carneiro,
Hansen, and Heckman (2003) consider alternative specifications. The key idea in classical factor approaches
is one normalization of the factor loading for each factor in one measurement equation to set the scale of the
factor and at least one measurement dedicated to each factor.
14 In our framework, parental skills are assumed to be constant over time as a practical matter because we
only observe parental skills once.