Computing optimal sampling designs for two-stage studies



Stata Technical Bulletin

  4. family      float  %9.0g
  5. female      float  %9.0g
  6. marry       float  %9.0g
  7. age         float  %9.0g
  8. genotype    str2   %9s

Sorted by:

. list

         famid   id   degree   family   female   marry   age   genotype
    1.       1    1        1        1        0       1    85         aa
    2.       1    2        1        1        1       1    68         aa
    3.       1    3        1        2        0       1    83         Aa
    4.       1    4        1        2        1       1    65         Aa
    5.       1    5        2        1        1       0    51         aa
    6.       1    6        2        1        0       0    61         aa
    7.       1    7        2        1        1       0    43         aa
    8.       1    8        2        1        1       0    33         aa
    9.       1    9        2        1        1       1     4         aa
   10.       1   10        2        1        1       0    25         aa
 (output omitted )
29658.    2000   10        2        2        0       1    53         aa
29659.    2000   11        2        2        1       0    64         aa
29660.    2000   12        2        2        1       0    32         aa
29661.    2000   13        2        2        0       0    39         aa
29662.    2000   14        3        0        0       0    30         aa
29663.    2000   15        3        0        0       0    12         aa
29664.    2000   16        3        0        0       0    34         aa
29665.    2000   17        3        0        0       0     5         aa
29666.    2000   18        3        0        1       0    27         aa
29667.    2000   19        3        0        0       0    11         aa

In this example, we do not specify the output file, so the simulated family data are saved into temp.dta. Compared with
simuped2, simuped3 produces one extra variable, marry, which indicates a person's marital status.

References

Elandt-Johnson, R. 1971. Probability Models and Statistical Methods in Genetics. New York: John Wiley & Sons.

gr45 A turnip graph engine

Steven Woloshin, VA Outcomes Group, VA Medical Center, White River Junction, VT, [email protected]
Abstract: A new graphical description called a turnip graph for studying the distribution of a variable is introduced and illustrated.
Keywords: descriptive statistics, histogram, stem and leaf plot, turnip plot.

Syntax

turnip varname [if exp] [, resolution(#) truev(#) graph_options ]

Description

turnip creates a turnip-style graph for the variable varname. The range of the variable is divided into intervals which are
placed on the vertical axis of the plot. Symbols are plotted horizontally next to each interval reflecting the number of observations
in that interval. Thus one can use this plot in conjunction with histograms, boxplots, stem-and-leaf plots, and so on, to study the
distribution of a variable.

Options

resolution(#) specifies the resolution of the graph, that is, the width of the intervals to be used. The default value is 0.4s,
where s is the standard deviation of the variable. Since resolution rounds the data, the graph in essence displays the
frequency of observations falling within each resolution unit. The user can avoid any such rounding (that is, display the
frequency of each value in the data) by specifying the resolution width as any negative number.
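
The rounding that resolution() describes can be approximated by hand, which may help in anticipating what the graph will show. The do-file below is only a sketch, not the code turnip itself runs: it assumes that "rounding" means snapping each value to the nearest multiple of the resolution, and it uses the auto dataset and the variable mpg purely as stand-ins.

```stata
* Illustrative do-file; the auto dataset and mpg are stand-ins, not from the article.
sysuse auto, clear

* Default resolution: 0.4 times the standard deviation of the variable.
quietly summarize mpg
local res = 0.4 * r(sd)

* Assumption: rounding snaps each value to the nearest multiple of the resolution.
generate double mpg_bin = round(mpg, `res')

* One row per interval; each frequency is the number of symbols plotted on that row.
tabulate mpg_bin
```

With turnip installed, the syntax diagram suggests that turnip mpg would plot these intervals directly, and that a negative width such as resolution(-1) would plot every distinct value of mpg without rounding.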

truev(#) specifies a value that can be used to divide a variable into three parts: one exactly equal to truev, one that is greater
than truev, and one that is less than truev. For example, suppose you are trying to display change scores and want to
show which observations are above or below zero. Using truev(0) ensures that only zero values are graphed at zero;
other values that would round to zero are set at the next appropriate category of varname.
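
The effect of truev() on values near the chosen point can be sketched in the same way. The do-file below is a hedged illustration rather than turnip's actual implementation: the change score chg is simulated, and the rule that a nonzero value rounding to truev moves exactly one interval up or down is an assumption about what "the next appropriate category" means.

```stata
* Illustrative do-file; chg is a simulated change score, not data from the article.
clear
set seed 12345
set obs 200
generate double chg = round(rnormal(0, 1), 0.1)

quietly summarize chg
local res = 0.4 * r(sd)     // default resolution, as described under resolution()
local tv  = 0               // the value that would be passed to truev()

* Round to the nearest multiple of the resolution.
generate double chg_bin = round(chg, `res')

* Assumption: values that round to tv but differ from it move one interval away,
* so the tv row holds only observations exactly equal to tv.
replace chg_bin = `tv' + `res' if chg_bin == `tv' & chg > `tv' & !missing(chg)
replace chg_bin = `tv' - `res' if chg_bin == `tv' & chg < `tv'

tabulate chg_bin
```

After these steps, the count in the row for 0 equals the number of observations with chg exactly 0, which is the behavior the option description attributes to truev(0).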


