which can include not known, information refused, information not yet sought, and
‘other’ non-completed, often covers a large proportion of the students. One example is
that, other than ‘white’, ‘missing’ is officially the largest ethnic group among students
in England. In fact, the unknown cases considerably outnumber all of the minority
ethnic groups combined. Some of the minority ethnic groups are quite small, leading
to the usual volatility of small numbers when analysing trends over time or
differences between groups. Consequently, the high proportion of missing cases in
any analysis using this variable could significantly bias the results being presented,
even where the overall response rate is high. This means that any differences over
time and place, or between social groups, need to be robust enough to overcome this
bias (among many others). The scale of a difference or change must be such that it
dwarfs the bias introduced by measurement errors, missing cases, and changes in data
collection methods over time. This difficulty is seldom acknowledged by
commentators.
Similarly, UCAS applicant figures, and HESA Individualised Student Records (ISRs),
have a large proportion of cases with no occupational category. In fact, when non-
responses are added to those cases otherwise unclassifiable by occupation (through
being economically inactive, for example) then having no occupational category
becomes the single largest classification. In 2002/2003, around 45% of first year
undergraduates were unclassifiable in terms of their occupational background,
according to HESA figures. How, then, could we possibly know whether any
occupational group was under-represented in HE? Any difference between groups in
HE and in the population is simply dwarfed by the missing data. For example, if students
from less-elevated occupational groups were less likely to respond to questions about
occupational background, so that a high proportion of the missing 45% of cases were
really from less-elevated occupational groups, then these groups might actually be
over-represented in HE, even though they may be under-represented among those
who answer the occupational question. We just do not know.
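The arithmetic behind this point can be sketched as a simple bounds calculation. The Python sketch below is illustrative only. The 45% missing rate is the HESA figure quoted above, while the 25% share among respondents and the 40% population share are hypothetical values chosen for the example, and the function name representation_bounds is our own.

```python
def representation_bounds(observed_share, missing_rate):
    """Return the (lowest, highest) possible true share of a group
    among all students, given its share among those who responded.

    observed_share: the group's share among students with a known category.
    missing_rate:   fraction of all students with no occupational category.
    """
    known = 1.0 - missing_rate
    # Lowest case: none of the missing students belong to the group.
    lowest = observed_share * known
    # Highest case: every missing student belongs to the group.
    highest = observed_share * known + missing_rate
    return lowest, highest


missing_rate = 0.45      # from the HESA figure for 2002/2003 quoted in the text
observed_share = 0.25    # HYPOTHETICAL share among students who responded
population_share = 0.40  # HYPOTHETICAL share of the group in the population

low, high = representation_bounds(observed_share, missing_rate)

print(f"Possible true share in HE: {low:.3f} to {high:.3f}")
if low < population_share < high:
    print("Cannot tell whether the group is under- or over-represented.")
```

Under these assumed figures the group's true share of students could lie anywhere from roughly 14% to 59%. That range includes the assumed population share, so neither under-representation nor over-representation can be ruled out.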
Analysis