


Stata Technical Bulletin

The purpose of the PUMS file is to provide researchers with direct access to household-by-household and person-by-person
data. Individual household and person data are not available in Summary Tape Files, the other major Census product. Individual
Census responses are confidential; thus, these data have been statistically modified to protect the confidentiality of individuals.
They are designed, however, to provide unbiased estimates and to maintain the covariance structure among variables to the extent
possible.

The data are divided into two file types: household records and person records. Data in the person records include
demographic, socio-economic, family, education, and employment characteristics. Household records include such things as
mortgage or rent payment, size and type of dwelling, number and age of all residents in the household, location of the household,
and relationships among household members. The combined file for each state contains over 500 variables. For the state of
Oregon—to choose an example with which we are familiar—more than 140,000 people are represented.

The ability to read PUMS data into Stata opens up a wide range of potential uses. The data contain information
useful in almost any field, from advertising to demography. By selecting only the variables that are of interest, the researcher
can reduce memory use and eliminate superfluous information. Even so, because the file is large, we recommend
using Intercooled Stata whenever possible.

Included on the distribution diskette are two versions of a Stata dictionary we created to read 1990 PUMS data into Stata.
One of the dictionaries, pumshl.dct, is used to read household data into Stata. The other dictionary, pumspl.dct, is used
to read person data. In pumshl.dct, person-level variables are commented out in the dictionary header. The reverse is true in
pumspl.dct. On our system, the PUMS data file is stored in the subdirectory d:\pums. You will need to modify the top line
of each dictionary to provide the file location of your PUMS data file. PUMS data files have names of the form
pumsaxxx.txt, where the last two characters, xx, are the state initials. For example, the file for Oregon is pumsaxor.txt.

Because the file is divided into two record types and because each record is set up with a hierarchical structure in which
each person record is subordinate to the associated household record, it is necessary to read the household data in separately
from the person data. The file structure is

Record Type    Serial Number       Data

Household      HH serial number    Household characteristics
Person         HH serial number    Person 1’s characteristics
Person         HH serial number    Person 2’s characteristics

and so on for each household.
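Because household and person records are interleaved in one file, one way to handle the two record types is to read the file twice, once with each dictionary, and then attach the household characteristics to each person by serial number. The do-file below is only a sketch of that idea, not necessarily the procedure used to build the distributed dictionaries. It assumes that both dictionaries read rectype and SerialNo, that household records are coded "H" and person records "P" in the record-type field, and that a recent Stata release with the merge m:1 syntax is available. The dataset name pumsh is hypothetical.

```stata
* Pass 1: keep only household records, using the household dictionary.
* (Assumes rectype is "H" on household records.)
infile using pumshl.dct if rectype == "H", clear
sort SerialNo
save pumsh, replace

* Pass 2: keep only person records, using the person dictionary.
* (Assumes rectype is "P" on person records.)
infile using pumspl.dct if rectype == "P", clear

* Attach household characteristics to every person in the household.
* Many persons share one household serial number, hence m:1.
merge m:1 SerialNo using pumsh
```

In older Stata releases the last step would instead be written as merge SerialNo using pumsh, with both datasets sorted by SerialNo beforehand.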


We reproduce a portion of the dictionary here:


dictionary using d:\pums\pumsaxor.txt
*
* HOUSEHOLD RECORDS
* _column(1)   str1  rectype   %1s
* _column(2)   long  SerialNo  %7f
* _column(9)   byte  Sample    %1f
* _column(10)  byte  division  %1f
* _column(11)  byte  state     %2f
* _column(13)  long  puma      %5f
* _column(18)  byte  areatype  %2f
* _column(20)  int   msapmsa   %4f
* _column(24)  int   psa       %3f
* _column(27)  int   subsmpl   %2f
* _column(29)  int   houswgt   %4f
* _column(33)  byte  persons   %2f
* _column(35)  byte  gqtype    %1f
* _column(39)  byte  units1    %2f
* _column(41)  byte  husflag   %1f
* _column(42)  byte  pdsflag   %1f
* _column(43)  byte  rooms     %1f
* _column(44)  byte  tenure    %1f
* _column(45)  byte  acreage   %1f
* _column(46)  byte  commuse   %1f

Household variables continue

* _column(203) byte  amoblhme  %1f
* PERSON RECORDS
* _column(9)   byte  relat1    %2f
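To illustrate how selecting only a few variables keeps the resulting dataset small, here is a sketch of what a trimmed-down, active household dictionary might look like. It is a hypothetical example written in standard Stata dictionary syntax, with the variable list enclosed in braces, and is not a copy of pumshl.dct. The column positions, storage types, and formats are taken from the listing above; the choice of five variables is arbitrary.

```stata
dictionary using d:\pums\pumsaxor.txt {
* Hypothetical trimmed household dictionary: only five variables are read.
* Change the path on the first line to match the location of your file.
    _column(1)   str1  rectype   %1s
    _column(2)   long  SerialNo  %7f
    _column(11)  byte  state     %2f
    _column(33)  byte  persons   %2f
    _column(44)  byte  tenure    %1f
}
```

Columns that are not named in the dictionary are simply skipped when the file is read, so each omitted variable reduces the memory needed for the dataset.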


