The LAD Data Set and the Analysis Sample
The master LAD file is a 10 per cent representative sample of all Canadian
tax-filers. In order to be as inclusive as possible, we look at total employment
income (henceforth “earnings”) consisting of all wage and salary income and
net self-employment income of all earners (men and women) aged 20 to 64
who were not identified as full-time students in the income year and who
received at least $1,000 in earnings (in 1996 constant dollars) as reported on
T-1 forms.¹ The intention is to approximate Statistics Canada’s concept of
“All Earners” while excluding those who have only a limited attachment to the
labour market. The resulting sample in 1996 is thus 1.218 million
observations or 56 per cent of the full LAD file of 2.167 million observations
that year. The biggest exclusions were for those over age 64 (17 per cent in
1996) and under the $1,000 earnings cut-off (20 per cent). The sample sizes
vary from 1.033 million observations in 1982 to 1.218 million observations in
1996.
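The selection rules above can be sketched in code. This is a minimal illustration only, not the procedure actually applied to the LAD: the record layout, the field names, and the `price_index` deflator are hypothetical, and the five records are invented. Only the age range, the student exclusion, the $1,000 cut-off, and the 1996 observation counts come from the text.

```python
from dataclasses import dataclass

# Thresholds taken from the text.
MIN_AGE, MAX_AGE = 20, 64
MIN_EARNINGS_1996 = 1000.0  # cut-off in 1996 constant dollars


@dataclass
class FilerRecord:
    # Hypothetical record layout; the real LAD variables are not shown here.
    age: int
    full_time_student: bool
    wages_salaries: float        # nominal dollars
    net_self_employment: float   # nominal dollars, may be negative
    price_index: float           # assumed deflator, scaled so that 1996 = 1.0


def real_earnings(rec: FilerRecord) -> float:
    """Total employment income expressed in 1996 constant dollars."""
    return (rec.wages_salaries + rec.net_self_employment) / rec.price_index


def in_analysis_sample(rec: FilerRecord) -> bool:
    """Apply the age, student and low-earnings exclusions."""
    return (
        MIN_AGE <= rec.age <= MAX_AGE
        and not rec.full_time_student
        and real_earnings(rec) >= MIN_EARNINGS_1996
    )


# Invented records, for illustration only.
records = [
    FilerRecord(35, False, 30000.0, 0.0, 1.0),   # kept
    FilerRecord(70, False, 30000.0, 0.0, 1.0),   # dropped: over age 64
    FilerRecord(22, True, 5000.0, 0.0, 1.0),     # dropped: full-time student
    FilerRecord(40, False, 800.0, 100.0, 1.0),   # dropped: under $1,000
    FilerRecord(64, False, 500.0, 500.0, 1.0),   # kept: exactly $1,000
]
sample = [r for r in records if in_analysis_sample(r)]

# Share of the full 1996 file retained, using the counts given in the text.
share_1996 = 1.218 / 2.167
print(len(sample), f"{share_1996:.0%}")
```

The cut-off is applied to the sum of wage and salary income and net self-employment income, so a filer with a self-employment loss can fall below $1,000 even with positive wages.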
The LAD’s coverage (and representativeness) of the adult population is
very good since the rate of tax filing is very high in Canada: high-income
individuals are required to file, while low-income individuals have incentives
to file in order to recover income tax and other payroll tax deductions made
throughout the year and, since 1986, to recover various tax credits. The full
set of tax files from which the LAD is constructed is estimated to cover
from 91 to 95 per cent of the target adult population (Finnie, 1997d). There
has been an increase in the proportion of individuals filing tax forms over time
due to the introduction of the federal sales tax credit in 1986, the goods and
services tax (GST) credit in 1990, and various other federal and provincial
benefits. While this improved coverage means that the LAD has become
increasingly representative of the underlying adult population, it also poses
potential problems for comparisons of earlier and later years, since the “new”
filers are more likely to have low earnings in any given year and hence bias
estimates towards slower average growth and stronger net downward
mobility. The comparison problem does not, however, appear to be very
great and is, in any case, attenuated by the age, student and low-earnings
exclusions imposed on the sample (Beach and Finnie, 2001).
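The direction of this composition effect can be shown with a small numerical sketch. All figures below are invented for illustration and are not drawn from the LAD; the example assumes that continuing filers all experience the same earnings growth and that the new filers appear only in the later year.

```python
def mean(values):
    return sum(values) / len(values)


CUTOFF = 1000.0  # low-earnings exclusion from the text (1996 constant dollars)

# Hypothetical continuing filers observed in both years.
early = [30000.0, 40000.0, 50000.0]
assumed_growth = 0.10
late_continuing = [x * (1 + assumed_growth) for x in early]

# Hypothetical "new" filers who appear only in the later year.
new_filers = [600.0, 8000.0]

# Growth among continuing filers only.
true_growth = mean(late_continuing) / mean(early) - 1

# Growth measured on all later-year filers, with no exclusions.
measured_all = mean(late_continuing + new_filers) / mean(early) - 1

# Growth measured after applying the low-earnings cut-off to the later year.
late_kept = [x for x in late_continuing + new_filers if x >= CUTOFF]
measured_cutoff = mean(late_kept) / mean(early) - 1

print(f"{true_growth:.3f} {measured_all:.3f} {measured_cutoff:.3f}")
```

In this toy case, measured growth falls below the growth of continuing filers once the new low earners are included, and the low-earnings cut-off narrows the gap without closing it.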
The estimation sample is divided into eight separate age/sex population
subgroups. Women and men are treated separately because of their different
¹ We have also put in place special procedures to deal with individuals who have
changed their SINs, who have multiple SINs, and other non-standard cases (see Finnie,
1997c), which account for on the order of 4 per cent of the file in any given year.
Designation as a full-time student is based on the tuition and education tax credit
responses on T-1 forms.
456 Charles M. Beach and Ross Finnie