Stata Technical Bulletin
STB-22
. list state pop in -5/l
           state        pop
 46.   N. Dakota     652717
 47.    Delaware     594338
 48.     Vermont     511456
 49.     Wyoming     469557
 50.      Alaska     401851
dm23 Saving a subset of the current data set
David Mabb, Health Services Advisory Group, FAX 602-241-0757
savin is a utility that extends the save command by allowing you to save a subset of the current data set. The syntax of
savin is
savin [ varlist ] [ if exp ] [ in range ] using filename [ , nolabel replace ]
Like Stata’s save command, savin leaves the current data set undisturbed. The nolabel and replace options also work just
as in Stata’s save command. Unlike the save command, savin requires you to type the using keyword before the filename.
Discussion
I frequently need to make different data subsets from a primary Stata data file. The process usually involves using the
primary file, keeping just the information I want, saving the subset with a new name, and then retrieving the primary file again.
Typically, this process is performed in do-files as follows:
. use main
(Primary data set)
. keep mpn age sex
. keep if sex==1
(27 observations deleted)
. save male
file male.dta saved
. use main
(Primary data set)
. keep mpn age sex
. keep if sex==2
(23 observations deleted)
. save female
file female.dta saved
. use main
(Primary data set)
With savin, this process is simplified to:
. use main
(Primary data set)
. savin mpn age sex if sex==1 using male
file male.dta saved
. savin mpn age sex if sex==2 using female
file female.dta saved
At the end of this sequence of commands, the primary data set (main.dta) is still the current data set.
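The behavior described above (save a subset, leave the data in memory untouched) can be sketched in a few lines of ado code. This is not the author's savin.ado, which is not reproduced here. It is a minimal illustration that assumes a later Stata release providing the syntax, preserve, and restore commands, and the program name savin_sketch is hypothetical.

```stata
* Hypothetical sketch of a savin-like command; NOT the distributed savin.ado.
* Assumes a Stata release that provides -syntax-, -preserve-, and -restore-.
program define savin_sketch
    version 8
    * varlist defaults to all variables; using/ puts the bare filename in `using'
    syntax [varlist] [if] [in] using/ [, noLabel REPLACE]

    * Snapshot the data in memory so it can be put back afterwards
    preserve

    * Drop observations outside the if/in restriction, if one was given
    if `"`if'`in'"' != "" {
        quietly keep `if' `in'
    }

    * Keep only the requested variables
    keep `varlist'

    * Pass nolabel and replace through to -save- when they were specified
    save `"`using'"', `label' `replace'

    * Bring back the original data set
    restore
end
```

With this sketch defined, typing savin_sketch mpn age sex if sex==1 using male would be expected to write male.dta and leave main.dta in memory, mirroring the savin session shown above.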
dt2 Reading public use microdata samples into Stata
Charles L. Sigmund, M.S., Oregon Employment Department, EMAIL [email protected]
D. H. Judson, Ph.D., University of Nevada, Reno, EMAIL [email protected]
The Public Use Microdata Sample-A (PUMS-A, hereafter simply PUMS) of the U.S. Census is a data set containing a 5 percent
sample of responses to the long-form census questionnaires for each state. These data are available for each state on CD-ROM
from the U.S. Census Bureau at minimal cost. (Call Customer Services, 301-763-4100, for more information on obtaining these
CDs.)