Creating a 2000 IES-LFS Database in Stata



PROVIDE Project Technical Paper 2005:1
variance had the sample been a simple random one. Stratification typically reduces deff below
one, while clustering increases it above one. Deaton (1997:15) suggests that most surveys
have a deff of more than one, which indicates that “in survey design the practical convenience
and cost considerations of clustering usually predominate over the search for
variance-reduction”.
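
The effect of clustering on deff can be illustrated numerically. The following is a minimal sketch in Python, not part of the IES-LFS workflow: it assumes a one-stage cluster sample with equal-sized clusters drawn with equal probability, ignores finite-population corrections, and uses invented data. The function name design_effect is hypothetical.

```python
from statistics import mean, variance

def design_effect(clusters):
    """Approximate deff for the mean of a one-stage cluster sample.

    Assumes equal-sized clusters selected with equal probability and
    ignores finite-population corrections (illustrative assumptions).
    """
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    k = len(clusters)
    # Variance of the mean had the n values come from a simple random sample.
    var_srs = variance(values) / n
    # Variance of the mean under cluster sampling, from the spread of cluster means.
    cluster_means = [mean(c) for c in clusters]
    var_cluster = variance(cluster_means) / k
    return var_cluster / var_srs

# Invented data: households within a cluster resemble one another.
similar_within = [[10, 11, 12], [20, 21, 22], [30, 31, 32]]
# Invented data: each cluster is about as varied as the whole sample.
mixed_within = [[10, 21, 32], [11, 20, 30], [12, 22, 31]]

deff_similar = design_effect(similar_within)
deff_mixed = design_effect(mixed_within)
```

In this toy case deff exceeds one only when households within a cluster are alike, which is the usual situation in geographically clustered household surveys.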

February 2005


2.2.4. Unequal selection probabilities

Although surveys such as the IES 2000 are usually designed to be self-weighting, the
probabilities of inclusion differ between observations. The possibilities of non-cooperation
and non-contact cannot be taken into account when designing a survey. In some cases it also
costs more to sample certain households. In such instances, households that are costly to
interview may be excluded on purpose, which affects the probability of inclusion of those
observations. Since each sampled observation or household is representative of a number of
other non-sampled households, it is necessary to adjust the weight of each observation to
account for over- or under-representation of certain types of representative households.
Deaton (1997:15) explains as follows (see footnote 8):

“The rule here is to weight according to the reciprocals of sampling probabilities
because households with low (high) probabilities of selection stand proxy for
large (small) numbers of households in the population.”
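
The rule in the quotation can be sketched as follows. This is a minimal illustration in Python with invented incomes and selection probabilities; the function names are hypothetical and the figures are not taken from the IES 2000.

```python
def inverse_probability_weights(selection_probs):
    """Return one weight per household: the reciprocal of its selection probability."""
    if any(p <= 0 or p > 1 for p in selection_probs):
        raise ValueError("selection probabilities must lie in (0, 1]")
    return [1.0 / p for p in selection_probs]

def weighted_mean(values, weights):
    """Weighted mean: each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Invented sample of four households.
income = [100.0, 120.0, 400.0, 500.0]
selection_prob = [0.02, 0.02, 0.005, 0.005]

weights = inverse_probability_weights(selection_prob)
# A household sampled with probability 0.005 stands proxy for 200 households,
# one sampled with probability 0.02 for only 50.
estimate = weighted_mean(income, weights)
unweighted = sum(income) / len(income)
```

Because the two high-income households had the lowest probability of selection, the weighted estimate lies well above the unweighted sample mean.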

Differences in probabilities of selection are either a result of design (in the case of surveys
that were not designed to be self-weighting) or accidental (for example when households
refuse to cooperate). In the case of accidental differences in selection probabilities it is
necessary to add weights to the survey ex-post. However, as Deaton warns, it is very difficult
to find those factors or characteristics that sufficiently explain non-response. A good example
is the apparently low response rate for White households in the IES 2000. Whether race alone
explains this low response rate, or whether it results from a combination of factors such as
race, income and location, is impossible to say. The difficulty in explaining the source(s) of
over- or under-representation suggests that there is a real threat that the ex-post weighting
adjustments could sometimes be incorrect.
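
One common form of ex-post adjustment is a weighting-class correction, sketched below in Python. The text does not state which method was applied to the IES 2000, so the technique, the function name and the data are all assumptions made for illustration. Within each adjustment cell, the weights of responding households are scaled up so that they also represent the sampled households that did not respond.

```python
from collections import Counter

def nonresponse_adjusted_weights(base_weights, cells, responded):
    """Scale respondents' base weights so that, within each cell, respondents
    carry the total base weight of every household sampled in that cell."""
    sampled_total = Counter()
    respondent_total = Counter()
    for w, cell, r in zip(base_weights, cells, responded):
        sampled_total[cell] += w
        if r:
            respondent_total[cell] += w
    adjusted = []
    for w, cell, r in zip(base_weights, cells, responded):
        if not r:
            adjusted.append(0.0)  # non-respondents drop out of estimation
        else:
            adjusted.append(w * sampled_total[cell] / respondent_total[cell])
    return adjusted

# Invented sample: six households in two hypothetical cells, "A" and "B".
base_weights = [100, 100, 100, 100, 100, 100]
cells = ["A", "A", "A", "A", "B", "B"]
responded = [True, True, False, False, True, True]

adjusted = nonresponse_adjusted_weights(base_weights, cells, responded)
```

The correction is only as good as the choice of cells. If non-response is driven by a factor that the cells do not capture, the adjusted weights remain biased, which is the risk described above.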

2.2.5. Weights in Stata

When a weight option is specified in a Stata command, Stata attaches a weight to each
observation. This weight alters the ‘importance’ of each observation in the estimation of
the moments of a variable. The Stata reference manual (StataCorp, 2001) discusses four
types of weights that can be implemented in Stata:

8 See section 2.2.5 (inverse probability weights) for a discussion of the practical implementation in Stata.


© PROVIDE Project


