
Mass Spectrometry for Protein Identification

Tandem mass spectrometry has emerged as a key technique for identifying proteins in complex biological samples.

In the "shotgun proteomics" technique, proteins are digested into peptides, the peptides are identified from their fragmentation spectra, and the peptide identifications are then integrated back into protein identifications. Because high-throughput proteomics laboratories can produce millions of mass spectra a week, automated data analysis is critical.

Working with top proteomics laboratories, PARC researchers have developed new algorithms and software for efficiently identifying peptides and proteins – with greater sensitivity and accuracy than standard tools such as SEQUEST, Mascot, and ProteinProphet. Our collaborators are using PARC's software to address difficult proteomics problems such as biomarker discovery and oxidative footprinting.

PARC’s peptide identification program – ByOnic – takes a hybrid approach that uses both de novo sequencing and database search techniques.

  • ByOnic first employs de novo sequencing to identify a small number of "lookup peaks", likely b- and y-ion masses. Then the database is searched for peptides that match a given number of lookup peaks, for example, 1 match for fully tryptic peptides, 2 for semi-tryptic peptides, and 3 for non-tryptic peptides. Qualifying peptides are then scored in great detail, taking into account predicted and observed peak intensities and mass measurement errors.
  • Lookup peaks function similarly to 3-letter "sequence tags", but are more efficient because 2 lookup peaks filter the database about 5 times more effectively than a 3-letter tag.
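As a rough illustration of the lookup-peak filter described above, the following Python sketch counts how many lookup peaks are explained by a candidate peptide's theoretical b- and y-ion masses, then applies the example thresholds of 1, 2, or 3 matches. It is not ByOnic's implementation. The function names, the 0.5 Da tolerance, the "-" marker for a protein terminus, and the simplified tryptic test (cleavage after K or R, ignoring the proline rule) are all assumptions made for illustration, and the detailed scoring step is omitted.

```python
from itertools import accumulate

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565

# Example thresholds from the text, keyed by the number of tryptic termini:
# 2 = fully tryptic, 1 = semi-tryptic, 0 = non-tryptic.
REQUIRED_MATCHES = {2: 1, 1: 2, 0: 3}


def fragment_masses(peptide):
    """Singly charged b- and y-ion m/z values for an unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    prefixes = list(accumulate(masses))[:-1]            # b1 .. b(n-1)
    suffixes = list(accumulate(reversed(masses)))[:-1]  # y1 .. y(n-1)
    b_ions = [m + PROTON for m in prefixes]
    y_ions = [m + WATER + PROTON for m in suffixes]
    return b_ions + y_ions


def count_lookup_matches(peptide, lookup_peaks, tol=0.5):
    """Number of lookup peaks within `tol` Da of some b- or y-ion."""
    frags = fragment_masses(peptide)
    return sum(any(abs(f - p) <= tol for f in frags) for p in lookup_peaks)


def tryptic_ends(peptide, prev_aa, next_aa):
    """Count termini consistent with trypsin cleavage (hypothetical helper).

    prev_aa / next_aa are the flanking residues in the protein, with "-"
    standing for a protein terminus. The proline rule is ignored here.
    """
    n_ok = prev_aa in ("K", "R", "-")
    c_ok = peptide[-1] in ("K", "R") or next_aa == "-"
    return int(n_ok) + int(c_ok)


def passes_filter(peptide, prev_aa, next_aa, lookup_peaks, tol=0.5):
    """True if the peptide matches enough lookup peaks to be scored in detail."""
    needed = REQUIRED_MATCHES[tryptic_ends(peptide, prev_aa, next_aa)]
    return count_lookup_matches(peptide, lookup_peaks, tol) >= needed
```

In this sketch a fully tryptic candidate needs only one matching lookup peak, because the enzyme constraint already narrows the search, while a non-tryptic candidate must match three.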

PARC’s protein identification program ComByne integrates ByOnic's peptide identifications into a list of protein identifications, ranked by confidence.

  • To compile its list, ComByne uses the number of peptide identifications, along with their lengths and scores, and then corrects for the lengths and redundancies of proteins. On a complex, high-dynamic-range sample like blood plasma, ByOnic typically makes 50% to 100% more spectrum identifications than Mascot or SEQUEST at the same false discovery rate (empirically measured using reversed protein sequences). This improvement at the spectrum level typically translates into 30% to 70% more identifications at the protein level.
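The empirical false discovery rate mentioned above can be sketched as follows. The source does not state which estimator was used. This example assumes a common one, the ratio of reversed (decoy) hits to forward (target) hits above a score threshold, and the function names and input layout are hypothetical.

```python
def fdr_curve(identifications):
    """Walk identifications from best to worst score.

    `identifications` is a list of (score, is_decoy) pairs. Returns a list
    of (score, n_targets, estimated_fdr) tuples, one per identification,
    where estimated_fdr = decoys / targets so far (an assumed estimator).
    """
    ranked = sorted(identifications, key=lambda item: item[0], reverse=True)
    curve = []
    targets = decoys = 0
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdr = decoys / targets if targets else 1.0
        curve.append((score, targets, fdr))
    return curve


def targets_at_fdr(identifications, max_fdr=0.01):
    """Largest number of target identifications with estimated FDR <= max_fdr."""
    best = 0
    for _, targets, fdr in fdr_curve(identifications):
        if fdr <= max_fdr:
            best = max(best, targets)
    return best
```

Comparing two search tools "at the same false discovery rate" then amounts to calling `targets_at_fdr` on each tool's output with the same `max_fdr` and comparing the counts.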

The results below compare ByOnic/ComByne, Mascot/ProteinProphet, and X!Tandem (using the product of E-values for protein ranking) on a sample of mouse blood plasma. All three tools were run on the same 50,000-protein database, which included reversed proteins (deliberate decoys) for an empirical estimate of the false discovery rate.

  • All three tools agreed on the first 69 proteins in the mouse plasma sample.
  • For Mascot, reversed proteins started showing up at rank 70, and reached about 50% of all identifications by rank 90.
  • For X!Tandem, reversed proteins started showing up at rank 105, and reached about 50% of all identifications by rank 120.
  • For ByOnic, reversed proteins started showing up at rank 148, and reached about 50% of all identifications by rank 160.
  • The sample included 13 soluble human proteins spiked into the mouse plasma at a concentration of 10 micrograms per milliliter.
  • Mascot found only 2 of the spikes, but ByOnic found 10.
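The rank statistics quoted above can be computed from a ranked protein list in which each entry is flagged as a reversed (decoy) protein or not. The sketch below is one possible reading, not the procedure used for the comparison. It assumes that "about 50% of all identifications" refers to identifications near a given rank, measured over a trailing window, since a cumulative decoy fraction could not reach 50% by rank 90 if the first decoy appeared at rank 70. The window size of 10 and the function names are hypothetical.

```python
def first_decoy_rank(is_decoy):
    """1-based rank of the first decoy in a ranked list, or None if absent."""
    for rank, flag in enumerate(is_decoy, start=1):
        if flag:
            return rank
    return None


def rank_where_decoys_reach(is_decoy, fraction=0.5, window=10):
    """1-based rank at which decoys first make up at least `fraction`
    of the trailing `window` entries, or None if that never happens."""
    for end in range(window, len(is_decoy) + 1):
        recent = is_decoy[end - window:end]
        if sum(recent) / window >= fraction:
            return end
    return None
```

A later first-decoy rank, and a later rank at which decoys become common, both indicate that a tool keeps true proteins ahead of false ones further down its list.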

BUSINESS CONTACT
Richard Bruce
Manager, Biomedical Systems
650-812-4447
KEYWORDS
de novo sequencing ∙ peptide identification ∙ proteomics ∙ tandem mass spectrometry
DOWNLOADS

PARC's ByOnic & ComByne software [submit spectra & compare results]

RELATED WEBPAGES

Mass Spectrometry for Glycomics
 

PUBLICATIONS

Lookup peaks: a hybrid of de novo sequencing and database search for protein identification by tandem mass spectrometry

Improved ranking functions for protein and modification-site identifications

   


Copyright © 2002-2007 Palo Alto Research Center Incorporated. All Rights Reserved.
PARC, the PARC Logo, AspectJ, DataGlyph, Obje, Silx, StressedMetal, and ClawConnect
are trademarks or registered trademarks of Palo Alto Research Center Incorporated.