Stata Technical Bulletin


Methods and formulas

Here we give only the formulas for testing whether the cases are under Hardy-Weinberg equilibrium (HWE), given that the
controls are under HWE, because the methods for testing a single sample have been given by Cleves (1999). The standard error
of the disequilibrium coefficient (D) was not included in the command genhwi, but it is included in the new command genhwcci.
Details of the formula can be found in Weir (1990, 74).

The observed case-control data are shown in Table 1, where $n_i$ and $n'_i$ denote the numbers of cases and controls,
respectively, with genotype $i$, for $i = AA, AB, BB$, and $m_i = n_i + n'_i$ is the corresponding total. Let $\pi_i$ and
$\pi'_i$ denote the probability that a person has genotype $i$ among cases and controls, respectively. We have
$\sum_i \pi_i = 1$ and $\sum_i \pi'_i = 1$.

Table 1: Observed genotypic counts

Genotype    Case      Control    Total
AA          n_AA      n'_AA      m_AA
AB          n_AB      n'_AB      m_AB
BB          n_BB      n'_BB      m_BB
Total       n         n'         m

Suppose the controls are randomly selected from the population of interest, which is under HWE. Then the distribution of
genotypes in the population is given by

$$\pi'_{AA} = p^2, \qquad \pi'_{AB} = 2pq, \qquad \pi'_{BB} = q^2 \tag{1}$$

where $p$ is the allele frequency of A in the population, and $q = 1 - p$.
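
As a small illustration of (1), the following Python sketch computes the three genotype probabilities from an allele frequency. The function name `hwe_genotype_probs` and the value p = 0.3 are hypothetical choices for this example and do not come from the original article.

```python
def hwe_genotype_probs(p):
    """Genotype probabilities under HWE for allele frequency p of allele A.

    Hypothetical helper for illustration; implements equation (1).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency p must lie in [0, 1]")
    q = 1.0 - p
    return {"AA": p * p, "AB": 2.0 * p * q, "BB": q * q}


# Example with an assumed allele frequency of 0.3.
probs = hwe_genotype_probs(0.3)
print(probs)
print(sum(probs.values()))  # the three probabilities sum to 1 up to rounding
```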

Under the null hypothesis that the cases are under HWE, the genotype distribution of the cases is also given by (1), because
the controls are assumed to be under HWE. The log-likelihood function is then

$$L_0 = (n_{AA} + n'_{AA}) \log(p^2) + (n_{AB} + n'_{AB}) \log(2pq) + (n_{BB} + n'_{BB}) \log(q^2)$$

with one parameter p.
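
As a worked step, the maximizer of $L_0$ follows from its score equation. Writing $m_i = n_i + n'_i$ for the pooled counts in the Total column of Table 1 and using $q = 1 - p$, the derivation can be sketched as

$$
\frac{\partial L_0}{\partial p} = \frac{2m_{AA} + m_{AB}}{p} - \frac{m_{AB} + 2m_{BB}}{1 - p} = 0
\quad\Longrightarrow\quad
\hat{p} = \frac{2m_{AA} + m_{AB}}{2m}
$$

which is the proportion of A alleles among the $2m$ alleles carried by cases and controls combined.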

Under the alternative hypothesis that the cases are not under HWE, the genotype distribution of the cases is $\pi_{AA}$,
$\pi_{AB}$, and $\pi_{BB}$, and the log-likelihood function is

$$L_1 = n_{AA} \log \pi_{AA} + n_{AB} \log \pi_{AB} + n_{BB} \log \pi_{BB} + n'_{AA} \log(p^2) + n'_{AB} \log(2pq) + n'_{BB} \log(q^2)$$

with the three parameters $p$, $\pi_{AA}$, and $\pi_{AB}$, where the $\pi_i$ sum to one.
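
Because $L_1$ splits into a part involving only the $\pi_i$ (the cases) and a part involving only $p$ (the controls), the two parts can be maximized separately. A sketch for the case part, using a Lagrange multiplier $\lambda$ (notation introduced here for illustration, not in the original) for the constraint $\sum_i \pi_i = 1$:

$$
\frac{\partial}{\partial \pi_i}\Bigl[\sum_j n_j \log \pi_j - \lambda\Bigl(\sum_j \pi_j - 1\Bigr)\Bigr] = \frac{n_i}{\pi_i} - \lambda = 0
\quad\Longrightarrow\quad
\hat{\pi}_i = \frac{n_i}{n}, \qquad i = AA, AB, BB
$$

Here $\lambda = n$ follows from summing $n_i = \lambda \pi_i$ over $i$. The control part has the same form as a single-sample HWE log likelihood, so its maximizer is the sample frequency of allele A among the controls.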

To test the null hypothesis against the alternative hypothesis, we use the likelihood-ratio statistic $-2(L_0 - L_1)$, which
approximately follows a $\chi^2$ distribution with 2 degrees of freedom (three parameters under the alternative minus one
under the null). The maximum likelihood estimates of $p$ and $\pi_i$ can be obtained from the score functions of $L_0$ and
$L_1$, respectively. More specifically, under the null hypothesis, $\hat{p} = (2m_{AA} + m_{AB})/(2m)$, while under the
alternative hypothesis, $\hat{p} = (2n'_{AA} + n'_{AB})/(2n')$, $\hat{\pi}_{AA} = n_{AA}/n$, $\hat{\pi}_{AB} = n_{AB}/n$, and
$\hat{\pi}_{BB} = n_{BB}/n$. Then $-2(L_0 - L_1)$ is evaluated at these maximum likelihood estimates and compared with the
$\chi^2$ distribution with 2 degrees of freedom.
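
The test described above can be sketched in Python as follows. This is an illustrative sketch under the stated model, not the genhwcci implementation; the function names and the example counts are hypothetical. It relies on the fact that the upper-tail probability of a $\chi^2$ distribution with 2 degrees of freedom at x is exp(-x/2), so no statistics library is needed.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), using the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def _hwe_loglik(aa, ab, bb, p):
    """Log likelihood of genotype counts under HWE with allele frequency p."""
    q = 1.0 - p
    return _xlogy(aa, p * p) + _xlogy(ab, 2.0 * p * q) + _xlogy(bb, q * q)


def hwe_case_control_lrt(cases, controls):
    """Likelihood-ratio test of HWE in cases, assuming controls are in HWE.

    cases, controls: (AA, AB, BB) genotype counts, each group non-empty.
    Returns (statistic, p_value), with the p-value from chi-squared(2).
    """
    n_aa, n_ab, n_bb = cases
    c_aa, c_ab, c_bb = controls
    n = n_aa + n_ab + n_bb
    n_ctrl = c_aa + c_ab + c_bb
    m_aa, m_ab, m_bb = n_aa + c_aa, n_ab + c_ab, n_bb + c_bb
    m = n + n_ctrl

    # Null: one allele frequency, estimated from cases and controls pooled.
    p_null = (2 * m_aa + m_ab) / (2 * m)
    l0 = _hwe_loglik(m_aa, m_ab, m_bb, p_null)

    # Alternative: free genotype probabilities for cases, HWE for controls.
    p_alt = (2 * c_aa + c_ab) / (2 * n_ctrl)
    l1 = sum(_xlogy(k, k / n) for k in cases)
    l1 += _hwe_loglik(c_aa, c_ab, c_bb, p_alt)

    # Clamp at zero to guard against tiny negative values from rounding.
    stat = max(0.0, -2.0 * (l0 - l1))
    p_value = math.exp(-stat / 2.0)  # exact upper tail for 2 degrees of freedom
    return stat, p_value


# Hypothetical counts, for illustration only.
stat, p_value = hwe_case_control_lrt(cases=(30, 40, 30), controls=(25, 50, 25))
print(stat, p_value)
```

The helper `_xlogy` avoids evaluating log(0) when a genotype count is zero, which matters for small samples or rare alleles.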

Acknowledgment

I would like to thank Dr. Douglas Easton of the University of Cambridge for helpful discussions during his visit to Australia
in November of 1999.

References

Cleves, M. 1999. sg110: Hardy-Weinberg equilibrium test and allele frequency estimation. Stata Technical Bulletin 48: 34-37.
Reprinted in Stata Technical Bulletin Reprints, vol. 8, pp. 280-284.

Helzlsouer, K. J. et al. 1998. Association between CYP17 polymorphisms and the development of breast cancer. Cancer
Epidemiology, Biomarkers & Prevention 7: 945-949.

Weir, B. S. 1990. Genetic Data Analysis. Sunderland, Massachusetts: Sinauer Associates.


