Stata Technical Bulletin
STB-33
                               |  Regular use of library
Age left full-time education   |     No      Yes    Total
-------------------------------+-------------------------
Below 16 years                 |    124       21      145
16 years                       |     73       30      103
17-18 years                    |     55       29       84
19 years or older              |     27       41       68
-------------------------------+-------------------------
Total                          |    279      121      400
Source of data: Tivers (1985, 173)
We type in the data just as for tabi, with backslashes separating the rows of the table:
. tab2i 124 21 \ 73 30 \ 55 29 \ 27 41
                                            residuals
  row   col   observed   expected    Pearson   adjusted
    1     1        124    101.138      2.273      5.177
    1     2         21     43.862     -3.452     -5.177
    2     1         73     71.843      0.137      0.288
    2     2         30     31.157     -0.207     -0.288
    3     1         55     58.590     -0.469     -0.959
    3     2         29     25.410      0.712      0.959
    4     1         27     47.430     -2.966     -5.920
    4     2         41     20.570      4.505      5.920
Pearson chi2(3) = 46.9646 Pr = 0.000
The chi-squared statistic is overwhelmingly significant and the pattern of residuals, especially the adjusted residuals, clearly
shows a monotonic relationship. In fact, Tivers gave a result for Goodman-Kruskal gamma, which might be thought more
appropriate by some analysts than chi-squared for a relationship between variables on ordinal scales. (See the entry for tabulate
in the Stata Reference Manuals for an explanation of gamma.)
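For readers who want gamma alongside chi-squared for these data, one possibility is to pass the same cell frequencies to the built-in tabi command. This is only a sketch, assuming a Stata release in which tabi accepts the chi2 and gamma options of tabulate:

```stata
* Same 4 x 2 table, requesting Pearson chi-squared and Goodman-Kruskal gamma
tabi 124 21 \ 73 30 \ 55 29 \ 27 41, chi2 gamma
```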
tab2i has one option, replace, which indicates that the variables listed by the command are to be left as the current data,
replacing whatever data were in memory. These variables are row and column indices, observed and expected frequencies, and
Pearson and adjusted residuals.
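As a rough check on what these variables contain, the following sketch rebuilds them by hand for the library example. It is written in current Stata syntax (the total() function of egen) rather than that of 1996, and the names rowtot, coltot, n, pearson, and adjusted are our own choices, not necessarily those created by tab2i. The adjusted residual is assumed to be the Pearson residual divided by the square root of (1 - row total/n)(1 - column total/n), the usual form following Haberman (1973); this assumption reproduces the output shown above.

```stata
* Sketch only: rebuild the residual columns by hand (variable names are illustrative)
clear
input row col observed
1 1 124
1 2  21
2 1  73
2 2  30
3 1  55
3 2  29
4 1  27
4 2  41
end

* Row totals, column totals, and the grand total
egen rowtot = total(observed), by(row)
egen coltot = total(observed), by(col)
egen n      = total(observed)

* Expected frequency under independence of rows and columns
generate expected = rowtot * coltot / n

* Pearson residual: (observed - expected) / sqrt(expected)
generate pearson = (observed - expected) / sqrt(expected)

* Adjusted residual: Pearson residual rescaled by the row and column margins
generate adjusted = pearson / sqrt((1 - rowtot/n) * (1 - coltot/n))

format expected pearson adjusted %9.3f
list row col observed expected pearson adjusted, noobs sepby(row)
```

The sum of the squared Pearson residuals is the Pearson chi-squared statistic, so the two largest cells by absolute residual (rows 1 and 4) account for most of the 46.9646 reported above.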
Discussion
There are several other possible definitions of residuals in the literature. For more information on this or other points, see
a standard text on categorical data analysis. For example, Gilbert (1993) and Agresti (1996) assume a modest background in
statistics, whereas Bishop, Fienberg, and Holland (1975) and Agresti (1990) are more advanced. Haberman (1973) is a key paper
introducing adjusted residuals.
For more advanced work with two-way tables, use Judson’s loglinear analysis command loglin from STB-6 and STB-8
(Judson, 1992a, 1992b) or the even more general glm command. These allow many models other than that of independence to
be fitted and tested. On the other hand, students and others who may not be familiar with these methods might find tab2i more
accessible for its own elementary task.
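To illustrate the link with glm, the independence model for the library example can be fitted as a Poisson regression of the cell counts on row and column indicators. This is a sketch under stated assumptions: it uses the factor-variable notation (i.row, i.col) of current Stata releases, which was not available in 1996, and the variable names observed and fitted are our own.

```stata
* Sketch only: independence as a loglinear (Poisson) model fitted with glm
clear
input row col observed
1 1 124
1 2  21
2 1  73
2 2  30
3 1  55
3 2  29
4 1  27
4 2  41
end

* log(expected count) = constant + row effect + column effect
glm observed i.row i.col, family(poisson) link(log)

* Fitted means should agree with the expected frequencies under independence
predict fitted, mu
list row col observed fitted, noobs sepby(row)
```

Adding further terms to the right-hand side is how models other than independence would be specified.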
In short, tab2i is a minimal first look at a two-way table. Most of the code was gleefully cribbed from tabi. Such theft
followed the observation that if there are no data in memory when tabi is invoked, then the data supplied in the table are left
behind as three variables, row, col, and pop.
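The behavior just described can be explored directly. The sketch below assumes a current Stata release, in which tabi leaves its data behind only when the replace option is given; the variable names row, col, and pop are those stated in the text.

```stata
* Sketch only: keep the cell counts supplied to tabi as a dataset
clear
tabi 124 21 \ 73 30 \ 55 29 \ 27 41, replace

* One observation per cell: row index, column index, and cell frequency
list row col pop, noobs

* The table can be rebuilt from these variables with frequency weights
tabulate row col [fweight = pop], chi2 expected
```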
Acknowledgment
William Gould of Stata Corporation provided many useful suggestions for improvement of tab2i, but he is not responsible
for any of its deficiencies.
References
Agresti, A. 1990. Categorical Data Analysis. New York: John Wiley.
——. 1996. An Introduction to Categorical Data Analysis. New York: John Wiley.