This is what happens. The correlation between primary schools’ value-added score
and their KS2 results drops to +0.69 for the 353 schools with fewer than 50 pupils in the
cohort, and to +0.67 for the 255 schools with fewer than 32 pupils, for example. The
correlation rises to +0.76 for the 354 schools with more than 18 pupils, and to +0.78
for the 258 schools with more than 30 pupils, for example. All of the schools with 50
or more pupils in the cohort had value-added scores in the narrow range of 98 to 102
and, in general, the schools with the most extreme value-added scores had very few
pupils. All of this suggests that the school-level value-added scores can be explained
to a large extent by the actual level of attainment of pupils at KS2 (i.e. the raw
scores), and that the apparent differences (the width of the scatter in Figure 1) can be
explained by measurement error and the volatility of small numbers.1
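The role of cohort size here can be illustrated with a small simulation. This is a sketch, not the DfES calculation: the cohort sizes, the centring on 100 and the assumed pupil-level spread of 5 points are illustrative assumptions. It shows that even if every school had an identical ‘true’ value-added score of 100, small cohorts alone would produce extreme school-level averages of the kind described above.

```python
# Illustrative sketch only: cohort size and sampling noise, not the DfES method.
# Every simulated school has the same 'true' value-added of 100, so any spread
# in the school-level averages below is due purely to chance.
import numpy as np

rng = np.random.default_rng(0)
n_schools = 1000

for cohort_size in (10, 30, 50, 100):
    # Assumed pupil-level scores: mean 100, standard deviation 5 (an assumption).
    pupil_scores = rng.normal(loc=100.0, scale=5.0,
                              size=(n_schools, cohort_size))
    school_scores = pupil_scores.mean(axis=1)   # one value-added score per school
    print(f"cohort {cohort_size:>3}: school scores run from "
          f"{school_scores.min():.1f} to {school_scores.max():.1f} "
          f"(SD {school_scores.std():.2f})")
```

Under these assumed numbers, the simulated scores for cohorts of 50 or more fall almost entirely within the narrow 98 to 102 band noted above, while the smallest cohorts produce the most extreme values, purely through sampling noise.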
Some commentators might suggest that Figure 1 actually shows two different kinds of
regression. In addition to the bottom-left to top-right pattern discussed so far, there is
also a sequence of top-left to bottom-right ‘lines’. But there is no way of
distinguishing such a conceptual sequence from the scatter and volatility described
above. The appearance of the graph itself is affected by the scale chosen, and a visual
comparison of the two kinds of slopes is, therefore, not a reliable guide to the overall
pattern. If the pattern in Figure 1 had been close to a perfect diamond shape with
corners at (27, 103), (27, 97), (23, 100), and (31, 100) then the correlation would be
zero, or very close to zero. If, on the other hand, there had been an appreciable
negative slope then the correlation would have been negative overall. But +0.74 is a
very high correlation - considerably higher than is typical in the educational literature
- representing an ‘effect’ size of 55%. It is also a positive correlation, reflecting the
dominant bottom-left to top-right slope rather than the suggested top-left to bottom-right one.
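Both figures in this paragraph can be checked directly from the values given in the text: the correlation of the four diamond corners quoted above is exactly zero, and the 55% ‘effect’ size is simply the square of the +0.74 correlation (roughly 0.55). A minimal check:

```python
# Check of the two figures quoted above, using only values given in the text.
import numpy as np

# The four corners of the hypothetical diamond: (KS2 score, value-added score).
x = np.array([27, 27, 23, 31])
y = np.array([103, 97, 100, 100])
print(f"correlation of diamond corners: {np.corrcoef(x, y)[0, 1]:.2f}")  # 0.00

# The 'effect' size is the proportion of variance shared, i.e. r squared.
r = 0.74
print(f"effect size (r squared): {r * r:.2%}")  # about 55%
```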
Discussion
If accepted, the re-analysis above, coupled with the similar analysis of the results
for all secondary schools in England (Gorard 2006c), suggests two important kinds of
conclusion. The first kind of conclusion that can be drawn is methodological. Many
analysts agree that value-added comparisons of the kind conducted so far by DfES are
problematic (e.g. Tymms and Dean 2004, Schagen 2006). Their usual response is to
try to make this complex analysis even more complex. But without confirmatory
evidence of a different nature, and without sceptical consideration of the meaning of the
measures involved, there is a danger that large-scale complex analyses such as those
considered here are rhetorically misleading. In fact, the changes and differences
identified as school effects may be largely chance processes, with a greater random
element than traditional analysts allow (Pugh and Mangan 2003). How do we know
that the variation in value-added scores for different schools means anything at all?
There is no external standard or arbiter to which we can refer. The calculations look
plausible enough, but no one had predicted the level of correlation found between
raw- and value-added scores. In fact, on hearing of it, commentators at the DfES first
denied the correlation, and then attributed it to some peculiarity of schools in
Yorkshire, briefing the education minister in the House of Lords to state this in
1 In fact, since a value-added score is, in essence, the difference between prior and subsequent
attainment figures, one would expect around half of the variance in VA figures to be explained by
either of these raw scores, leading to two correlations of around 0.7 each.
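The expectation in this footnote can be made explicit with a short derivation, under simplifying assumptions that are not part of the original text: treat the prior score X and the subsequent score Y as standardized and uncorrelated, and the value-added score as their simple difference, VA = Y - X.

```latex
% Sketch under the simplifying assumptions stated above: X and Y have unit
% variance and zero correlation, and VA = Y - X.
\[
  \mathrm{Var}(\mathrm{VA}) = \mathrm{Var}(Y) + \mathrm{Var}(X) = 2,
  \qquad
  \mathrm{Cov}(\mathrm{VA}, Y) = \mathrm{Var}(Y) = 1,
\]
\[
  \mathrm{corr}(\mathrm{VA}, Y)
    = \frac{\mathrm{Cov}(\mathrm{VA}, Y)}{\sqrt{\mathrm{Var}(\mathrm{VA})\,\mathrm{Var}(Y)}}
    = \frac{1}{\sqrt{2}} \approx 0.71.
\]
```

Under the same simplification the correlation with the prior score has the same magnitude (with the opposite sign), so either raw score on its own accounts for about half of the variance in the value-added figure. Prior and subsequent attainment are in practice positively correlated and the published measure is not a simple difference, so this is a rough heuristic rather than a prediction, but it is consistent with the +0.74 reported above.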