The geography of collaborative knowledge production: entropy techniques and results for the European Union

equals ln 2 = 0.693, and when two countries collaborate half as often as expected, the
Tij-measure equals ln ½ = -0.693.

The degree of integration of the network of fifteen member states as a whole is measured by
T, which is the sum of the values for Tij, each weighted by its share qij in the total number of
collaborations. In information theory, the measure T is known as the “mutual information”
value, which measures dependence in a frequency matrix (Frenken, 2000, 2001; Langton,
1990; Leydesdorff, 1991; Theil, 1967, 1972):

$$T = \sum_{i=1}^{15} \sum_{j=1}^{15} q_{ij} \ln \frac{q_{ij}}{q_{i.} \cdot q_{.j}} \qquad (3)$$ ⁵

It has been shown that mutual information is non-negative for any frequency distribution
(Theil, 1972). If all pairs of countries collaborated exactly as much as expected
from their individual shares, we would have qij = qi. ∙ q.j. In this case, all pairwise bias values Tij
equal zero, and the T-value consequently equals zero too (total independence). In the context
of research collaboration, a zero T-value indicates perfect integration of all fifteen member
states within the European science system. When any bias exists in the propensity to
collaborate, mutual information will be positive. The higher the T-value, the less integrated
the countries are in the system (higher dependence).
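
As an illustration of equation (3), the following minimal Python sketch computes the pairwise bias values Tij and the overall T-value from a matrix of collaboration counts. The function names and the 2×2 example counts are hypothetical and chosen only for illustration; they are not data from this study. Cells with qij = 0 are skipped in the sum, following the convention that x ∙ ln x = 0 for x = 0.

```python
import math

def pairwise_bias(counts):
    """Return shares q[i][j] and bias values t[i][j] = ln(q_ij / (q_i. * q_.j)).

    Zero cells get a bias of -infinity (no observed collaboration at all).
    """
    total = sum(sum(row) for row in counts)
    q = [[c / total for c in row] for row in counts]
    row_share = [sum(row) for row in q]            # q_i.
    col_share = [sum(col) for col in zip(*q)]      # q_.j
    t = [
        [
            math.log(q[i][j] / (row_share[i] * col_share[j]))
            if q[i][j] > 0 else float("-inf")
            for j in range(len(q[i]))
        ]
        for i in range(len(q))
    ]
    return q, t

def mutual_information(counts):
    """T = sum over all cells of q_ij * T_ij, in nits (natural logarithm)."""
    q, t = pairwise_bias(counts)
    return sum(
        q[i][j] * t[i][j]
        for i in range(len(q))
        for j in range(len(q[i]))
        if q[i][j] > 0  # convention: x * ln x = 0 for x = 0
    )

# Hypothetical counts: diagonal pairs collaborate half as often as expected.
counts = [[10, 30],
          [30, 10]]
q, t = pairwise_bias(counts)
T_nits = mutual_information(counts)
T_bits = T_nits / math.log(2)  # same value expressed in bits

# A matrix whose cells equal the product of its marginal shares (T is zero
# up to floating-point rounding).
independent = [[1, 2],
               [2, 4]]
T_independent = mutual_information(independent)
```

In the hypothetical `counts` matrix, the diagonal cells have a bias of ln ½, so T is strictly positive, whereas the `independent` matrix yields a T-value of zero up to rounding.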

Theil (1967, chapter 9) initially used the mutual information measure to characterise the
amount of information contained in input-output tables. In this application, the values of qij
stand for the inter-industry flows as fractions of the aggregate output. Total independence of
the matrix (T=0) would mean that the input-output table would not contain any information at
all, since the inter-industry flows qij can readily be derived from the product of the marginal
totals qi. and q.j . Any other input-output table would yield a positive mutual information. The
higher the value of the mutual information, the more structure is present in the input-output
table, and the higher its information content.⁶ ⁷
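
The input-output reading can be sketched in the same way. The Python fragment below, a minimal sketch with hypothetical flow values and helper names, checks two properties mentioned in connection with Theil (1967): a table built purely from its marginal totals carries zero mutual information, and merging sectors does not increase the mutual information of a table.

```python
import math

def mutual_information(table):
    """Mutual information (in nits) of a table of non-negative flows."""
    total = sum(sum(row) for row in table)
    row_share = [sum(row) / total for row in table]
    col_share = [sum(col) / total for col in zip(*table)]
    t = 0.0
    for i, row in enumerate(table):
        for j, flow in enumerate(row):
            if flow > 0:  # zero cells contribute nothing
                q = flow / total
                t += q * math.log(q / (row_share[i] * col_share[j]))
    return t

def aggregate(table, groups):
    """Merge sectors: each group is a list of sector indices to be summed."""
    return [
        [sum(table[i][j] for i in gi for j in gj) for gj in groups]
        for gi in groups
    ]

# Hypothetical three-sector table with strong within-sector flows.
flows = [[8, 1, 1],
         [1, 8, 1],
         [1, 1, 8]]
merged = aggregate(flows, [[0, 1], [2]])  # merge sectors 0 and 1

t_full = mutual_information(flows)
t_merged = mutual_information(merged)

# Table derived entirely from marginal totals (row totals times column totals).
row_totals, col_totals = [10, 20, 30], [3, 2, 1]
no_structure = [[r * c for c in col_totals] for r in row_totals]
t_none = mutual_information(no_structure)
```

Here `t_merged` is smaller than `t_full`, and `t_none` is zero up to rounding, which matches the interpretation of T as the information content of the table.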

⁵ For x = 0 we have x ∙ ln x = 0. In information theory, one usually uses the base-two logarithm instead of
the natural logarithm to express the value of mutual information in bits. When the natural logarithm is
used, as in this study, one speaks of “nits” (Theil, 1972).

⁶ Theil (1967) also showed why the mutual information decreases when sectors in an input-output table
are aggregated. In this context, he showed that minimisation of input heterogeneity of sectors that are
aggregated minimises the loss of information due to aggregation. A similar aggregation procedure,
though not followed below, could be applied to the matrices of research collaborations.

⁷ More recent applications of mutual information in the social sciences can be divided into two groups:
applications to empirical data and applications to simulation data. Empirical applications include the
dependence between different donors and different recipients of grants in the United States (Theil,
1972), the dependence between journals as reflected in journal-journal citation matrices
