Stata Technical Bulletin
STB-58
Example 1
We simulate 1,000 two-generation families. Ages in the first generation have mean 70 and standard deviation 10; ages in the
second generation have mean 40 and standard deviation 10. The frequency of allele A is assumed to be 0.05, and the mean number
of siblings in the second generation is 5. The simulated family data are saved in the file output.dta.
. simuped2 70 10 40 10, reps(1000) sav(output) alle(0.05) sib(5)
. use output
. describe
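Contains data from output.dta
  obs:         6,818
 vars:             6                          23 Oct 2000 07:51
 size:       177,268 (83.0% of memory free)
-------------------------------------------------------------------------------
   1. famid     float  %9.0g
   2. id        float  %9.0g
   3. degree    float  %9.0g
   4. female    float  %9.0g
   5. age       float  %9.0g
   6. genotype  str2   %9s
-------------------------------------------------------------------------------
Sorted by: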
. list

         famid        id    degree    female       age  genotype
   1.        1         1         1         0        73        aa
   2.        1         2         1         1        76        aa
   3.        1         3         2         0        28        aa
   4.        1         4         2         1        43        aa
   5.        1         5         2         1        26        aa
   6.        2         1         1         0        64        aa
   7.        2         2         1         1        55        aa
   8.        2         3         2         0        38        aa
   9.        2         4         2         1        46        aa
  10.        2         5         2         0        51        aa
 (output omitted )
6809.      999         2         1         1        51        aa
6810.      999         3         2         0        50        aa
6811.      999         4         2         1        38        aa
6812.      999         5         2         0        37        aa
6813.      999         6         2         1        41        aa
6814.     1000         1         1         0        74        aa
6815.     1000         2         1         1        70        aa
6816.     1000         3         2         1        38        aa
6817.     1000         4         2         1        41        aa
6818.     1000         5         2         1        38        aa
A total of 6,818 individuals are generated in the 1,000 families. The variable famid identifies the simulated family, and id
identifies the person within each family. The variable degree gives the generation to which a person belongs, female is one
if the person is female and zero otherwise, age is the simulated age, and genotype is the person's genotype.
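The description above can be checked against the saved file. The commands below are a minimal sketch rather than part of
simuped2: they use only built-in Stata commands and the variable names shown in the listing, and the helper variables first
and nsib are hypothetical names introduced here for illustration.
. use output, clear
. * flag one record per family and count the families
. sort famid id
. by famid: generate byte first = (_n == 1)
. count if first
. * number of second-generation members in each family
. by famid: generate nsib = sum(degree == 2)
. by famid: replace nsib = nsib[_N]
. summarize nsib if first
. * age distribution within each generation
. sort degree
. by degree: summarize age
. * genotype counts
. tabulate genotype
If the file matches the description, count should report 1,000 families. Assuming that every family has exactly two
first-generation members, as in the families listed, the mean of nsib should be (6,818 - 2,000)/1,000 = 4.818.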
Example 2
We simulate 2,000 three-generation families. The mean ages of people in the first, second, and third generations are 80,
50, and 20, respectively, and the standard deviation of age is assumed to be 10 in every generation. The frequency of allele A
is assumed to be 0.1. The mean numbers of siblings in the second and third generations are 4 and 3.5, respectively.
. set memory 50m
. simuped3 80 10 50 10 20 10, reps(2000) alle(0.1) sib(4) si3(3.5)
. use temp
. describe
Contains data from temp.dta
obs: 29,667
vars: 8 23 Oct 2000 18:38
size: 1,008,678 (98.1% of memory free)
1. famid float %9.0g
2. id float %9.0g
3. degree float %9.0g