Stata Technical Bulletin

The purpose of the PUMS file is to provide researchers with direct access to household-by-household and person-by-person
data. Individual household and person data are not available in Summary Tape Files, the other major Census product. Individual
Census responses are confidential; thus, these data have been statistically modified to protect the confidentiality of individuals.
They are designed, however, to provide unbiased estimates and to maintain the covariance structure among variables to the extent
possible.

The data are divided into two file types: household records and person records. Data in the person records include
demographic, socio-economic, family, education, and employment characteristics. Household records include such things as
mortgage or rent payment, size and type of dwelling, number and age of all residents in the household, location of the household,
and relationships among household members. The combined file for each state contains over 500 variables. For the state of
Oregon—to choose an example with which we are familiar—more than 140,000 people are represented.

Reading PUMS data into Stata opens up a wide range of potential uses. The data contain information
useful in almost any field, from advertising to demography. By selecting only the variables of interest, the researcher
can economize on memory and eliminate superfluous information. Because the file is large, however, we still recommend
using Intercooled Stata whenever possible.

Included on the distribution diskette are two versions of a Stata dictionary we created to read 1990 PUMS data into Stata.
One of the dictionaries, pumshl.dct, is used to read household data into Stata. The other dictionary, pumspl.dct, is used
to read person data. In pumshl.dct, person-level variables are commented out in the dictionary header. The reverse is true in
pumspl.dct. On our system, the PUMS data file is stored in the subdirectory d:\pums. You will need to modify the top line
of each dictionary to provide the file location of your PUMS data file. PUMS data files have names of the form pumsaxxx.txt,
where xx = the state initials. For example, the file for Oregon is pumsaxor.txt.

Because the file is divided into two record types and because each record is set up with a hierarchical structure in which
each person record is subordinate to the associated household record, it is necessary to read the household data in separately
from the person data. The file structure is

Record Type    Serial Number       Data

Household      HH serial number    Household characteristics
Person         HH serial number    Person 1’s characteristics
Person         HH serial number    Person 2’s characteristics

and so on for each household.

We reproduce a portion of the dictionary here:

dictionary using d:\pums\pumsaxor.txt
*
* HOUSEHOLD RECORDS
* _column(1)    str1  rectype   %1s
* _column(2)    long  SerialNo  %7f
* _column(9)    byte  Sample    %1f
* _column(10)   byte  division  %1f
* _column(11)   byte  state     %2f
* _column(13)   long  puma      %5f
* _column(18)   byte  areatype  %2f
* _column(20)   int   msapmsa   %4f
* _column(24)   int   psa       %3f
* _column(27)   int   subsmpl   %2f
* _column(29)   int   houswgt   %4f
* _column(33)   byte  persons   %2f
* _column(35)   byte  gqtype    %1f
* _column(39)   byte  units1    %2f
* _column(41)   byte  husflag   %1f
* _column(42)   byte  pdsflag   %1f
* _column(43)   byte  rooms     %1f
* _column(44)   byte  tenure    %1f
* _column(45)   byte  acreage   %1f
* _column(46)   byte  commuse   %1f

Household variables continue

* _column(203)  byte  amoblhme  %1f
* PERSON RECORDS
* _column(9)    byte  relat1    %2f
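
The two-pass read described above might be sketched as follows. This is only an illustration, not the procedure distributed on the diskette. It assumes that both dictionaries are in the current directory, that rectype and SerialNo are read (not commented out) by each dictionary, and that household and person records are coded "H" and "P" in rectype; check the PUMS technical documentation for the actual codes. The dataset name pumsh is hypothetical, and the merge line uses the syntax of recent Stata releases, so older releases will need the older form of the command.

```stata
* Pass 1: read the household records only and save them.
* Assumes rectype == "H" marks a household record (an assumption).
infile using pumshl if rectype == "H", clear
sort SerialNo
save pumsh, replace          // pumsh is a hypothetical dataset name

* Pass 2: read the person records only.
* Assumes rectype == "P" marks a person record (an assumption).
infile using pumspl if rectype == "P", clear
sort SerialNo

* Attach each household's characteristics to the persons living in it.
* Many person records match one household record on SerialNo.
merge m:1 SerialNo using pumsh
```

Keeping the household file separate and merging only when household characteristics are needed keeps the person file small, which matters given the size of the data.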
