Stata Technical Bulletin
Stata would have problems, however, if the same data arrangement appeared in a dictionary file:
. type in1.dct
dictionary {
        int x1
        int x2
        int x3
}
11 12 13
21 22
23 31 32 33
. infile using in1
dictionary {
        int x1
        int x2
        int x3
}
(4 observations read)
. list
            x1        x2        x3
  1.        11        12        13
  2.        21        22         .
  3.         .         .         .
  4.        23        31        32
Stata’s dictionary files are the preferred form for storing and documenting raw data. The dictionary subcommands can handle
most kinds of formatted data including multi-line records and data sets without carriage returns ([5d] infile). Nonetheless, Murphy’s
law guarantees that you will occasionally confront data sets that confound Stata’s dictionary capabilities. More commonly, you
will have a data set that Stata’s dictionary features can handle but only with difficulty. Clearly, life would be simpler if all raw
data sets were rectangular, as in the first example.
I have written a C program called block that makes my life simpler. block takes an arbitrary ASCII file as input and
produces as output the same information arrayed rectangularly. The following example illustrates how to use block.
C:> type in1
11 12 13
21 22
23 31 32 33
C:> block
Name of the input file: in1
Name of the output file: out1
Number of columns: 3
Read in 9 fields from in1
Wrote out 3 rows of 3 columns to out1
C:> type out1
11 12 13
21 22 23
31 32 33
block handles non-rectangular data gracefully:
C:> type in2
11 12 13
21 22
23 31 32 33 44
C:> block
Name of the input file: in2
Name of the output file: out2
Number of columns: 3
Read in 10 fields from in2
Wrote out 3 rows of 3 columns to out2
WARNING: last row not complete.
C:> type out2
11 12 13
21 22 23
31 32 33
44
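The source code of block is not reproduced here. The following is only a minimal sketch in C of the reblocking idea shown in the two examples, assuming that fields are separated by whitespace and that no field is longer than 255 characters. The function name reblock and its signature are hypothetical and are not taken from the author's program.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the author's code): copy whitespace-delimited
 * fields from `in` to `out`, writing `ncols` fields per output line.
 * The number of fields read is stored in *nfields; the return value is
 * the number of complete rows written. A trailing incomplete row is
 * still written, on a line of its own.
 */
long reblock(FILE *in, FILE *out, int ncols, long *nfields)
{
    char field[256];   /* assumption: fields are at most 255 characters */
    long count = 0;

    assert(ncols > 0);

    /* %s skips any run of blanks, tabs, and newlines between fields,
       so the line structure of the input is ignored. */
    while (fscanf(in, "%255s", field) == 1) {
        if (count % ncols != 0)
            fputc(' ', out);        /* separator inside a row */
        fputs(field, out);
        count++;
        if (count % ncols == 0)
            fputc('\n', out);       /* row is complete */
    }

    if (count % ncols != 0)
        fputc('\n', out);           /* terminate the incomplete last row */

    *nfields = count;
    return count / ncols;
}
```

A caller would open the input and output files, call reblock with the requested number of columns, and issue a warning such as "last row not complete" whenever the number of fields read is not a multiple of the number of columns. Because the sketch treats every field as text, it makes no assumption about whether the data are numeric.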