information, FAME also has information on the year of incorporation of the company,
postcodes, the 4-digit 2003 SIC industry code, and country of ownership. The
definitions of variables included in our econometric models are provided in Table A1
in the Data Appendix. Note that we use only data containing unconsolidated accounts, to
avoid double counting and within-firm transfer effects. Our final dataset used for
statistical analysis comprises an unbalanced panel containing 81,819 firms with
326,906 observations covering 1996-2004, where information on ‘entry and exits’
into export markets is also available.11
The FAME dataset is severely biased towards large enterprises, and thus is
unrepresentative of UK firms. To obtain a distribution representative of the population
of firms operating in the UK, we treat the firms in the FAME dataset as a sample of
the ARD population,12 and consequently weight the FAME data to produce a
representative database (by industry and firm size).13 In practice, we have obtained
aggregated turnover data from the ARD sub-divided into 5 size-bands (based on
turnover quintiles) and 3-digit industry SICs.14 We then aggregate the FAME data into
the same sub-groups, so that we can calculate weights using the total turnover data
from the ARD divided by the comparable data from FAME. In essence, the FAME
dataset is weighted so that its distribution of turnover matches that of the firms in the
ARD.15
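
The weighting step can be illustrated with a minimal sketch in Python. It is not the code used for this paper: the function name `cell_weights`, the field names `sic3`, `size_band` and `turnover`, and the figures are hypothetical, and the fallback weight of 1 for cells with no usable ARD total is an assumption made only for illustration.

```python
from collections import defaultdict


def cell_weights(ard_totals, fame_firms):
    """Return one weight per (3-digit SIC, size band) cell.

    ard_totals: dict mapping (sic3, size_band) -> total ARD turnover.
    fame_firms: list of dicts with keys 'sic3', 'size_band', 'turnover'.
    """
    # Aggregate FAME turnover into the same industry-by-size-band cells.
    fame_totals = defaultdict(float)
    for firm in fame_firms:
        fame_totals[(firm["sic3"], firm["size_band"])] += firm["turnover"]

    # Weight = total ARD turnover / total FAME turnover for the same cell.
    weights = {}
    for cell, fame_total in fame_totals.items():
        ard_total = ard_totals.get(cell)
        if ard_total is None or fame_total <= 0:
            # Assumption: cells without a usable ARD total stay unweighted.
            weights[cell] = 1.0
        else:
            weights[cell] = ard_total / fame_total
    return weights


# Hypothetical figures, for illustration only.
ard_totals = {("151", 1): 500.0, ("151", 5): 2000.0}
fame_firms = [
    {"sic3": "151", "size_band": 1, "turnover": 40.0},
    {"sic3": "151", "size_band": 1, "turnover": 60.0},
    {"sic3": "151", "size_band": 5, "turnover": 1600.0},
    {"sic3": "158", "size_band": 3, "turnover": 300.0},
]
weights = cell_weights(ard_totals, fame_firms)
```

In this toy example the smallest size band is under-represented in FAME (turnover of 100 against 500 in the ARD) and so receives a weight of 5, while the largest band receives a weight of 1.25. Applying each weight to the firms in its cell reproduces the ARD turnover total for that cell.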
11 Nearly 23% of firms are observed throughout the nine-year period; thus the majority of firms are
observed for only some of the years 1996-2004.
12 For a detailed description of the ARD (available at the ONS), see Oulton (1997), Griffith (1999) and
Harris (2005).
13 Efforts have also been made to merge FAME into the ARD; nevertheless, these have been largely
unsuccessful (see Harris and Li, 2007, Chapter 2, for more details).
14 Where there are fewer than 10 enterprises in any sub-group, these data are not used, so as to avoid
disclosure of confidential information in these ONS data. This results in a loss of some 4% of the total
turnover available in the ARD.
15 Note we do not weight the FAME data for 34 industries because the FAME data have better coverage
in terms of total turnover than the ARD. These 34 industries (out of 215 in total) account for just 2.9%