Published in Behavioral and Brain Sciences, 14 (1991), 136-137, as part of a special issue of
commentary around a lead article, “The reliability of peer review for manuscripts and grant
submissions: A cross-disciplinary investigation.” The section runs from pages 119-186.
Does the Need for Agreement Among Reviewers
Inhibit the Publication of Controversial Findings?
J. Scott Armstrong
Raymond Hubbard
As Cicchetti indicates, agreement among reviewers is not high. This conclusion is
empirically supported by Fiske and Fogg (1990), who reported that two independent reviews of
the same papers typically had no critical point in common. Does this imply that journal editors
should strive for a high level of reviewer consensus as a criterion for publication? Prior research
suggests that such a requirement would inhibit the publication of papers with controversial
findings. We summarize this research and report on a survey of editors.
Prior research. Horrobin (1990) suggests that the primary function of peer review
should be to identify new and useful findings, that is, to promote the publication of important
innovations. This function is typically subordinated to the quality control aspects of peer review,
however. The quality control approach looks for agreement among the reviewers. The result,
Horrobin claims, is that competent research yielding relatively unimportant findings is more
readily accepted for publication.¹ He provides numerous examples of harsh peer review given to
important research that presents controversial results.
The popular press often reports difficulties associated with the publication of important
research findings. The scanning tunneling microscope (STM) is a case in point. The STM is
capable of distinguishing individual atoms and has been hailed as one of the most important
inventions of this century. It earned a Nobel Prize in physics for its inventors. Nevertheless, the
first attempt to publish the results produced by the STM in 1981 failed because a journal referee
found the paper “not interesting enough” (Fisher 1989).
Armstrong (1982c) provides additional examples of lapses in the peer review system, along with
summaries of empirical evidence that disconfirming findings about important topics are difficult
to publish. Among these, the experimental studies by Goodstein and Brazis (1970) and Mahoney
(1977) are of particular interest. They found that reviewers were biased against negative
¹ It is not clear that the quality control function is performed well. About one-third of the papers
in biomedical journals were found to contain citation errors, and one-third also incorrectly
quoted findings from the literature (Evans et al. 1990). In addition, Hubbard and Armstrong
(1990) found that 60% of published replications with extensions in three leading marketing
science journals failed to support the original findings.