Database Search Strategies for Proteomic Data Sets Generated by Electron Capture Dissociation Mass Spectrometry



Provided by University of Birmingham Research Archive, E-prints Repository



Steve M. M. Sweet,†,# Andrew W. Jones, Debbie L. Cunningham,† John K. Heath,†
Andrew J. Creese, and Helen J. Cooper*

School of Biosciences, College of Life and Environmental Sciences, University of Birmingham,
Edgbaston, Birmingham B15 2TT, United Kingdom

Received May 28, 2009

Large data sets of electron capture dissociation (ECD) mass spectra from proteomic experiments are
rich in information; however, extracting that information in an optimal manner is not straightforward.
Protein database search engines currently available are designed for low-resolution CID data, from
which Fourier transform ion cyclotron resonance (FT-ICR) ECD data differs significantly: ECD mass
spectra contain both z-prime and z-dot fragment ions (and c-prime and c-dot); ECD mass spectra contain
abundant peaks derived from neutral losses from charge-reduced precursor ions; and FT-ICR ECD spectra
are acquired with a larger precursor m/z isolation window than their low-resolution CID counterparts.
Here, we consider three distinct stages of postacquisition analysis: (1) processing of ECD mass spectra
prior to the database search; (2) the database search step itself and (3) postsearch processing of results.
We demonstrate that each of these steps has an effect on the number of peptides identified, with the
postsearch processing of results having the largest effect. We compare two commonly used search
engines: Mascot and OMSSA. Using an ECD data set of modest size (3341 mass spectra) from a complex
sample (mouse whole cell lysate), we demonstrate that search results can be improved from 630
identifications (19% identification success rate) to 1643 identifications (49% identification success rate).
We focus in particular on improving identification rates for doubly charged precursors, which are
typically low for ECD fragmentation. We compare our presearch processing algorithm with a similar
algorithm recently developed for electron transfer dissociation (ETD) data.

Keywords: ECD; neutral loss; OMSSA; Mascot; identification; CID; mass spectrometry; FT-ICR; LTQ-FT

Introduction

Electron capture dissociation (ECD) is a radical-driven
fragmentation technique which provides an alternative to slow-heating
collision induced dissociation (CID).1 ECD has successfully
been applied to the small-scale detailed characterization
of various peptides, modified or otherwise.2,3 These
experiments are greatly facilitated by a prior knowledge of the
peptide sequence, allowing manual analysis of the ECD data.
In contrast, large-scale proteomic experiments utilizing ECD
rely on a database search step in order to identify the
fragmented peptide.4,5 The database search engines employed
were originally designed to accept low resolution CID data.
High resolution ECD data presents a significantly different
challenge. The characteristics of FT-ICR ECD data are sub-10
ppm mass accuracy, low noise levels, intense precursor and
charge-reduced precursor peaks, and strong neutral loss peaks
from the charge-reduced precursor.6,7 Furthermore, hydrogen
transfer can occur between ECD c-prime and z-dot fragments,
resulting in c-dot and z-prime products.8

* Address correspondence to: Helen J. Cooper, School of Biosciences,
College of Life and Environmental Sciences, University of Birmingham,
Edgbaston, Birmingham B15 2TT, UK. Telephone: +44 (0)121 414 7527.
Fax: +44 (0)121 414 5925. E-mail: [email protected].

† CRUK Growth Factor Group.

# Current address: Department of Chemistry, University of Illinois at
Urbana-Champaign, Urbana, IL 61801, USA.

10.1021/pr9008282 CCC: $40.75    © 2009 American Chemical Society
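To make the mass scales in the Introduction concrete, the following minimal Python sketch computes two quantities: the m/z shift produced when a fragment gains or loses one hydrogen atom (as when a c-prime ion passes a hydrogen to a z-dot ion, giving c-dot and z-prime), and the absolute width of a ppm tolerance at a given m/z. The function names are hypothetical and chosen only for illustration; the hydrogen mass is the standard monoisotopic value. This is not code from the authors or from any search engine.

```python
# Monoisotopic mass of a hydrogen atom in Da (standard value).
HYDROGEN_MASS = 1.007825


def after_hydrogen_transfer(mz: float, charge: int, gained: bool) -> float:
    """Return the m/z of a fragment after it gains or loses one hydrogen atom.

    gained=True  models z-dot  -> z-prime (fragment becomes heavier by H).
    gained=False models c-prime -> c-dot  (fragment becomes lighter by H).
    """
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    shift = HYDROGEN_MASS / charge
    return mz + shift if gained else mz - shift


def ppm_to_da(mz: float, ppm: float) -> float:
    """Convert a relative tolerance in ppm to an absolute width in Da at this m/z."""
    return mz * ppm * 1e-6


if __name__ == "__main__":
    # A singly charged z-dot ion at m/z 500 and its z-prime counterpart.
    print(after_hydrogen_transfer(500.0, 1, gained=True))
    # A 10 ppm tolerance expressed in Da at m/z 1000.
    print(ppm_to_da(1000.0, 10.0))
```

A fixed tolerance in Da is looser, in relative terms, for light fragments than for heavy ones, whereas a ppm tolerance scales with m/z; this is one way to read the remark that high mass accuracy is not fully exploited when the product ion tolerance cannot be given in ppm.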

The search engines that have been employed for large-scale
ECD data analysis are Mascot and OMSSA.4,5 These search
engines have certain limitations: for example, the product ion
tolerance cannot be specified in ppm, and the benefits of high
mass accuracy data are not fully realized. We have analyzed
large-scale ECD data sets both manually and using various
search engines. It is apparent from these analyses that certain
generic aspects of ECD mass spectra are likely to be detrimental
to their identification by database search engines. The most
obvious of these is the high-intensity precursor and charge-reduced
precursor peaks. Both search engines tested here
already anticipate these peaks, removing them from consideration.
For example, Mascot removes peaks within the fragment
ion tolerance window about each of the precursor isotope
peaks. However, the search engines do not consider coeluting
peaks in the precursor isolation window and are, in fact,
ignorant of the isolation window size. Another characteristic
of ECD is the generation of various neutral losses from the
charge-reduced precursors. These peaks are not utilized by
currently available search engines. In the case of ECD of doubly

Journal of Proteome Research 2009, 8, 5475-5484

Published on Web 10/13/2009
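As a rough illustration of the peak removal discussed in the Introduction, the sketch below drops every peak inside the precursor isolation window and every peak close to an isotope peak of a charge-reduced precursor. It is an assumption-laden example, not Mascot's or OMSSA's implementation and not the authors' presearch algorithm: the function names, the isolation half-width, the tolerance, and the number of isotope peaks are all hypothetical placeholders.

```python
from typing import List, Tuple

ELECTRON_MASS = 0.000549    # Da
ISOTOPE_SPACING = 1.003355  # Da, mass difference between 13C and 12C

Peak = Tuple[float, float]  # (m/z, intensity)


def charge_reduced_mz(precursor_mz: float, charge: int, electrons: int = 1) -> float:
    """m/z of the precursor after capturing `electrons` electrons without dissociating."""
    if not 0 < electrons < charge:
        raise ValueError("electrons captured must be between 1 and charge - 1")
    ion_mass = precursor_mz * charge + electrons * ELECTRON_MASS
    return ion_mass / (charge - electrons)


def isotope_mzs(mono_mz: float, charge: int, n_isotopes: int) -> List[float]:
    """m/z values of the first `n_isotopes` isotope peaks of an ion."""
    return [mono_mz + i * ISOTOPE_SPACING / charge for i in range(n_isotopes)]


def filter_spectrum(
    peaks: List[Peak],
    precursor_mz: float,
    charge: int,
    isolation_half_width: float = 2.5,  # hypothetical value, in m/z units
    tol_da: float = 0.02,               # hypothetical value, in Da
    n_isotopes: int = 4,                # hypothetical value
) -> List[Peak]:
    """Remove isolation-window peaks and charge-reduced precursor isotope peaks."""
    low = precursor_mz - isolation_half_width
    high = precursor_mz + isolation_half_width

    # Collect isotope peaks of every charge-reduced species (1 .. charge-1 electrons).
    excluded: List[float] = []
    for k in range(1, charge):
        reduced = charge_reduced_mz(precursor_mz, charge, k)
        excluded.extend(isotope_mzs(reduced, charge - k, n_isotopes))

    kept: List[Peak] = []
    for mz, intensity in peaks:
        if low <= mz <= high:
            continue  # precursor or coeluting species inside the isolation window
        if any(abs(mz - x) <= tol_da for x in excluded):
            continue  # charge-reduced precursor isotope peak
        kept.append((mz, intensity))
    return kept


if __name__ == "__main__":
    spectrum = [(300.1, 10.0), (500.5, 100.0), (750.3, 5.0),
                (1000.0, 50.0), (1001.004, 20.0)]
    print(filter_spectrum(spectrum, precursor_mz=500.0, charge=2))
```

Removing the whole isolation window is a deliberately blunt choice: it also discards any genuine fragment ions that happen to fall there, so a real pipeline would need to weigh that loss against the benefit of excluding coeluting species.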
