diagonal, for which i = j, while all other cells refer to country-country collaborations, for
which i ≠ j. A co-occurrence of two addresses in different countries is attributed to both
cells that refer to that pair of countries, so we get a symmetric matrix (qij = qji).4 The share of each
country in the total number of collaborations is then given by:
q_{i.} = \sum_{j=1}^{15} q_{ij} \qquad (1)
and, because of symmetry in the matrix, q.j is equal to qi. for i=j. In other words, to derive the
marginal totals for each country one can either sum over the rows or over the columns.
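The equivalence of row and column totals can be checked with a minimal sketch. The 3×3 count matrix below is hypothetical and purely illustrative (the paper works with 15 countries), and dividing the counts by their grand total to obtain the shares qij is an assumption about the normalisation:

```python
import numpy as np

# Hypothetical symmetric co-occurrence counts for three countries.
# These numbers are illustrative only and are not taken from the paper.
counts = np.array([
    [10.0, 4.0, 2.0],
    [4.0, 6.0, 1.0],
    [2.0, 1.0, 8.0],
])

# Shares q_ij: each cell divided by the grand total (assumed normalisation),
# so that all cells together sum to 1.
q = counts / counts.sum()

row_totals = q.sum(axis=1)  # q_i. : sum over j for each row i, as in Eq. (1)
col_totals = q.sum(axis=0)  # q_.j : sum over i for each column j

# Because q is symmetric, both routes give the same marginal totals.
print(np.allclose(row_totals, col_totals))
```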
3.1 Mutual information
The degree of integration of country i with respect to country j is measured here as the
difference between the observed share of collaborations qij and what would be expected from
the product of the individual shares qi. and q.j. The difference between the observed share and
the expected share is measured by the natural logarithm of the ratio of qij to the product
of qi. and q.j:
T_{ij} = \ln \frac{q_{ij}}{q_{i.} \cdot q_{.j}} \qquad (2)
The Tij-value is a measure of bias. The value is positive when country i is collaborating with
country j more than what is expected from the product of the individual country shares in all
output. The Tij -measure takes on a negative sign when country i is collaborating with country
j less than what was expected from their shares. Put another way, a positive value indicates a
positive bias in the propensity of country i to collaborate with country j and vice versa while a
negative value indicates a negative bias in the propensity of country i to collaborate with
country j and vice versa.
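As an illustration of how the Tij-measure behaves, the following sketch computes it for a small, hypothetical 2×2 matrix of shares. The function name `integration_bias` and the example values are assumptions made for this illustration; the code also assumes that every country has a non-zero marginal share, and cells with qij = 0 come out as minus infinity:

```python
import numpy as np

def integration_bias(q):
    """Return T_ij = ln(q_ij / (q_i. * q_.j)) for a matrix of shares q.

    Hypothetical helper for illustration; assumes all marginals are non-zero.
    """
    q = np.asarray(q, dtype=float)
    # Expected share of each cell: product of row and column marginals.
    expected = np.outer(q.sum(axis=1), q.sum(axis=0))
    # Cells with q_ij = 0 yield -inf; suppress the divide-by-zero warning.
    with np.errstate(divide="ignore"):
        return np.log(q / expected)

# Illustrative shares: both countries have a marginal share of 0.5,
# so the expected share of every cell is 0.25.
q = np.array([
    [0.4, 0.1],
    [0.1, 0.4],
])

T = integration_bias(q)
print(T)
```

In this example the diagonal cells exceed their expected share of 0.25 and receive a positive value, whereas the off-diagonal cells fall below it and receive a negative value; the resulting matrix is symmetric because q is symmetric.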
The use of a logarithm renders this measure symmetric with respect to whether a country
collaborates x times more than expected or x times less than expected with another country.
For example, when two countries collaborate two times more than expected, the Tij -measure
4 Consequently, a co-occurrence of two addresses in the same country is counted twice. The complete
procedure is also illustrated in the example in Table 1.