(i) In the case of babies wrongly bunched together in the same pregnancy, it is
possible in most cases to re-arrange the 29 ‘baby’ variables correctly, with no loss of
information. But the same does not apply to most of the 10 ‘pregnancy’ variables.
For instance, where three babies from distinct pregnancies have been bunched as
though they were a multiple birth, only one set of ‘pregnancy’ variables has been
completed. So once the three babies are correctly separated into three distinct
pregnancies, there is no way of telling which of the three the answers relate to (e.g.
“did you smoke during this pregnancy?”). The two exceptions are pregnum (“How
many babies did you carry in this pregnancy?”) and morepreg (“Have there been
other pregnancies before this one?”), which can be recalculated automatically from
the re-arranged data. A decision was therefore taken during this data cleaning exercise to copy
the same pregnancy details into each of the now-separated pregnancy slots, but care
should of course be taken when analysing these cases (see list at Appendix 1).
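The recalculation described in (i) can be sketched roughly as follows. This is an illustrative sketch only, not the procedure actually used in the cleaning exercise. It assumes that babies whose outcome dates lie within 3 days of each other belong to the same pregnancy, that slot 1 holds the most recent pregnancy, and that morepreg can be represented as a true/false value. The function names and the variable name "smoked" are hypothetical.

```python
from datetime import date

def regroup_pregnancies(baby_dates, max_gap_days=3):
    """Group babies' outcome dates into pregnancies, most recent first.

    ASSUMPTION: a baby whose date is within max_gap_days of the previous
    baby in the group belongs to the same pregnancy.
    """
    groups = []
    for d in sorted(baby_dates, reverse=True):
        if groups and (groups[-1][-1] - d).days <= max_gap_days:
            groups[-1].append(d)   # same pregnancy (e.g. twins)
        else:
            groups.append([d])     # start of an earlier, distinct pregnancy
    return groups

def recompute_counts(groups, shared_details):
    """Rebuild each pregnancy slot after separation.

    pregnum is the number of babies in the slot; morepreg is True when at
    least one earlier pregnancy exists. The single set of 'pregnancy'
    answers (shared_details) is copied unchanged into every slot.
    """
    slots = []
    for i, babies in enumerate(groups):
        slot = dict(shared_details)               # copied, not re-derived
        slot["pregnum"] = len(babies)
        slot["morepreg"] = i < len(groups) - 1    # any older pregnancy?
        slots.append(slot)
    return slots

# Three babies originally bunched as one "multiple birth"
bunched = [date(1987, 6, 15), date(1985, 3, 1), date(1990, 1, 20)]
slots = recompute_counts(regroup_pregnancies(bunched), {"smoked": 1})
```

Note that the copied details (here "smoked") are identical in all three slots, which is exactly why such cases need care in analysis.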
(ii) In the case of all pregnancies not resulting in a live birth, the date of
stillbirth/miscarriage/abortion is still required, but only the month and year were asked.
The ‘day’ variable (preged, preged2-40) therefore remains system-missing, and consequently
the composite date variable (prege, prege2-40) also remains system-missing.
(iii) In the case of respondents who were still pregnant at the interview date, no
outcome date could be entered, but one would obviously expect the data to be entered
in the first (i.e. most recent) birth slot. However, there are cases where the ongoing
pregnancy has been entered in birth slots other than the first, and indeed some where
the ongoing pregnancy is featured as an apparent multiple birth with another
completed pregnancy. See Appendix 5.
(iv) A number of outcome dates contain the missing values 99 or 9999 in one or more
of the ‘day’, ‘month’ or ‘year’ fields (preged, pregem, pregey, etc). In these cases,
the composite 8-digit date variable prege (or prege2-prege40) remains
system-missing. In most cases this seems to indicate that the respondent (often a single or separated
male) could not remember or did not know the full birth date, but could remember the
year and possibly the month (although there are odd cases such as NCDS 385037C
where the respondent could apparently remember the month of birth, but not the day
or year). If at least the year has been entered, we can in most cases sort out whether
the pregnancy outcome has been put in the right order. But of course if the year is
entered as ‘9999’, the best one can do is a manual scrutiny of any other dates to
hazard a guess as to whether it is likely to be in the right order. See Appendix 6 for
the results of this manual scrutiny.
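The behaviour described in (ii) and (iv) can be illustrated with a short sketch. It is a sketch under stated assumptions, not the code used to build the dataset: None stands in for system-missing, the composite date is returned as a Python date rather than in the dataset's 8-digit layout (which is not specified here), and slot 1 is taken to be the most recent outcome. All function names are hypothetical.

```python
from datetime import date

MISSING_DAY_OR_MONTH = 99   # missing-value code for 'day' and 'month'
MISSING_YEAR = 9999         # missing-value code for 'year'

def composite_date(day, month, year):
    """Return the full outcome date, or None if any component is unusable."""
    if day is None or month is None or year is None:
        return None          # e.g. 'day' never asked for non-live births
    if MISSING_DAY_OR_MONTH in (day, month) or year == MISSING_YEAR:
        return None          # respondent did not know part of the date
    try:
        return date(year, month, day)
    except ValueError:
        return None          # impossible date such as 31 February

def year_known(year):
    return year is not None and year != MISSING_YEAR

def check_order(years):
    """Check slot order using the year alone.

    Returns (in_order, needs_manual): in_order is True when the known years
    never increase from slot 1 onwards; needs_manual lists the 1-based slots
    whose year is unknown and so can only be checked by hand.
    """
    known = [y for y in years if year_known(y)]
    in_order = all(a >= b for a, b in zip(known, known[1:]))
    needs_manual = [i + 1 for i, y in enumerate(years) if not year_known(y)]
    return in_order, needs_manual
```

A year-only check of this kind cannot detect outcomes entered in the wrong order within the same calendar year, so it narrows down the cases for manual scrutiny rather than replacing it.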
(v) Some successive outcome dates seem questionable. There are many cases where
two dates are more than 3 days but less than nine months apart (see Appendix 7). In
the case of miscarriages or abortions this is understandable, although some
miscarriages from apparently different pregnancies are entered as being in successive
months. One or two live births are separated in time by over a month, but less than
eight months, which seems to indicate a data entry error. A surprisingly common
feature is a miscarriage followed a few months later by a live birth, which seems to
indicate twins where one miscarried, and the other went to full term. But where there
is a 7- or 8-month gap it is sometimes hard to tell whether the two outcomes were
from different pregnancies. All cases in Appendix 7 were subjected to detailed
scrutiny, looking sometimes at whether the live birth was logged as premature, or the