$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where f_o is the actual (observed) count in each cell and f_e is the expected count. χ² is
distributed with degrees of freedom given by (r-1)(k-1), where r is the number of rows in the
table and k is the number of columns. Where the calculated value of χ² exceeds the critical
value, we conclude that the differences between observed and expected values in the table are
unlikely to have occurred by chance (less than 5 times in 100 for the particular critical
value used in this study). Because the numerator in the χ² formula is squared, the larger the
difference between observed and expected counts, the greater its influence on whether the
calculated χ² exceeds its critical value and leads us to reject the working hypothesis that
line of activity and a given factor are unconnected.
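The calculation described above can be sketched in a few lines of Python. This is only an illustration: the table of counts is invented, the function names are hypothetical, and the expected counts are assumed to be derived in the usual way under independence (row total times column total, divided by the number of cases), which the text does not spell out.

```python
# Hypothetical 2x3 table of observed counts (rows: two lines of activity,
# columns: three response categories). The numbers are invented for illustration.
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (r-1)(k-1)
    return stat, df


stat, df = chi_square(observed)
reject = stat > CRITICAL_5PCT[df]
print(f"chi2 = {stat:.3f}, df = {df}, reject at 5%: {reject}")
```

For this invented table the statistic is about 3.11 on 2 degrees of freedom, which is below the 5% critical value of 5.991, so the working hypothesis of no connection would not be rejected.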
A rule of thumb with contingency tables is that no more than 20% of all cells should
have an expected count of less than 5. To achieve this, categories are sometimes
aggregated. The χ² statistic is somewhat fragile because it is influenced by the number of
cases in the contingency table and is less likely to detect a true relationship between the
variables in the table when the number of cases is small (which is sometimes the case for the
current questionnaire survey). For that reason, additional measures of the degree of
association between the variables were used that are not sensitive to the number of cases in
the analysis. Cramér’s V is used where one or more of the variables is categorical rather than
representing a ranking.
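As a rough illustration of the rule of thumb and of Cramér’s V, the sketch below checks the share of cells with a small expected count, aggregates two categories, and then computes V. The counts and function names are invented for the example, and V is assumed to take its usual form, the square root of χ² divided by n times the smaller of (r-1) and (k-1); the text itself does not give the formula.

```python
import math

# Hypothetical 3x2 table of observed counts, invented for illustration.
observed = [
    [12, 8],
    [3, 7],
    [2, 8],
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def share_of_small_cells(table, threshold=5):
    """Fraction of cells whose expected count is below the threshold."""
    cells = [fe for row in expected_counts(table) for fe in row]
    return sum(fe < threshold for fe in cells) / len(cells)


def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * min(r-1, k-1)))."""
    expected = expected_counts(table)
    chi2 = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0])) - 1
    return math.sqrt(chi2 / (n * smaller_dim))


# Two of the six cells have an expected count below 5, breaching the 20% rule.
share_before = share_of_small_cells(observed)

# Aggregate the second and third row categories into one.
merged = [observed[0], [a + b for a, b in zip(observed[1], observed[2])]]
share_after = share_of_small_cells(merged)

v = cramers_v(merged)
print(f"small cells before: {share_before:.0%}, after: {share_after:.0%}, V = {v:.3f}")
```

Because V divides χ² by the number of cases, it stays on a 0 to 1 scale regardless of how many cases the table contains, which is why it can sit alongside the χ² test when samples are small.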
The second major means of analysis was to compare outcomes using tests based on
comparison of the median. Here the responses to each question were compared by size of
employment. The basis of the test is to rank all cases by size from the smallest (given a
rank of 1) to the largest. The sum of the ranks is calculated for each category of response
and divided by the number of cases in that category to compute the mean rank. Where there is
no difference between the categories in terms of size, the mean ranks will be approximately
the same. The appropriate test statistic, the Kruskal-Wallis, was computed, and again, where
the calculated statistic exceeds the critical value, we reject the working hypothesis that
firms giving each category of response had the same median size.
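The ranking procedure can be sketched as follows. The employment figures, category labels and function names are invented for illustration. The statistic is assumed to take the standard Kruskal-Wallis form, H = 12 / (N(N+1)) × Σ(R_i² / n_i) − 3(N+1), compared against a χ² critical value with (number of categories − 1) degrees of freedom. Tied values share an average rank, but the usual correction to H for ties is omitted here for brevity.

```python
# Hypothetical employment sizes of firms, grouped by their response to one question.
groups = {
    "yes": [5, 12, 20, 45],
    "no": [8, 30, 60, 150],
    "unsure": [3, 7, 10, 25],
}

# Standard 5% critical values of the chi-square distribution, by degrees of freedom.
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H statistic without tie correction, mean rank per category)."""
    labels = list(groups)
    pooled = [x for label in labels for x in groups[label]]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    mean_ranks = {}
    weighted = 0.0
    start = 0
    for label in labels:
        n = len(groups[label])
        rank_sum = sum(ranks[start:start + n])
        mean_ranks[label] = rank_sum / n
        weighted += rank_sum ** 2 / n
        start += n
    h = 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return h, mean_ranks


h, mean_ranks = kruskal_wallis(groups)
df = len(groups) - 1
reject = h > CRITICAL_5PCT[df]
print(f"mean ranks = {mean_ranks}, H = {h:.2f}, df = {df}, reject at 5%: {reject}")
```

In this invented example the mean ranks differ (6.25, 9.0 and 4.25), but H is 3.5 on 2 degrees of freedom, below the 5% critical value of 5.991, so the hypothesis of equal median size would not be rejected.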