eagle-i The University of Pennsylvania

Empirical Bayes Protein identifier

eagle-i ID


Resource Type

  1. Algorithmic software suite


  1. Resource Description
    "EBP calculates protein expression probabilities based on peptide sequence identifications from search algorithms such as Mascot and Sequest. Protein lists can be generated by choosing proteins whose expression probabilities exceed a threshold value. Varying the probability threshold allows the sensitivity of protein identification to be balanced against the false positive error rate. The statistical model assumes that every peptide sequence that could theoretically result from enzymatic digestion of a protein in the search database has a chance of being identified in the search results, whether correctly or incorrectly. The probabilities of correct identification are combined across multiple peptide searches using a function that returns the maximum probability from consensus identifications, and penalizes non-consensual identifications. Both correct and incorrect peptide sequence identifications are assumed to occur at random in this 'space' of peptides, at rates that are governed by model parameters including protein length, estimated protein abundance, the size of the search database, and the number of peptide sequence identifications in the dataset. Degenerate peptides whose sequence matches multiple proteins are treated using 'Occam's Razor', a principle by which the smallest set of probable proteins is chosen that is sufficient to explain the peptide sequence identifications. For each protein in the database, a likelihood ratio is calculated for the possibility that the peptide identifications whose sequence matches the protein are all incorrect. These likelihood ratios are used to estimate the expression probabilities, from which updated parameter estimates are obtained. The procedure is iterated until the algorithm converges at the maximum likelihood estimates. Replicated datasets can be analyzed by estimating multiple sets of model parameters simultaneously. In this way, hypotheses about protein expression can be tested using the results of replicate experiments."
    "EBP can be run as an alternative to ProteinProphet as part of the Sashimi Trans-Proteomic Pipeline."
  2. Additional Name
  3. Used by
    Institute for Translational Medicine and Therapeutics
  4. Data Input
    Peptide identification data (SBCtools pepXML)
  5. Data Output
    Protein identifications with expression probabilities (SBCtools protXML)
  6. Software purpose
    Mass spectrometry data analysis objective
  7. Related Publication or Documentation
    EBP, a program for protein identification using multiple tandem mass spectrometry datasets
  8. Related Publication or Documentation
    The Post-Synaptic Density of Human Postmortem Brain Tissues: An Experimental Study Paradigm for Neuropsychiatric Illnesses
  9. Website(s)
  10. Related Technique
    Protein expression profiling
  11. Developed by
    Price, Thomas, PhD
  12. Software license
    Creative Commons
  13. Algorithm used
    Bayesian Model
  14. Coded in
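Illustration for the Resource Description above: two of the steps it describes, choosing proteins whose expression probability exceeds a threshold and applying "Occam's Razor" to degenerate peptides, can be sketched in a few lines. This is a minimal sketch in Python and not EBP's own code; it says nothing about the language EBP is coded in. The function names `select_proteins` and `minimal_protein_set`, the example identifiers, and the probability values are all hypothetical. The greedy set-cover heuristic is a swapped-in approximation of "smallest sufficient set", and unlike EBP it ignores expression probabilities when choosing proteins.

```python
from typing import Dict, List, Set


def select_proteins(expression_prob: Dict[str, float], threshold: float) -> List[str]:
    """Return, sorted by name, the proteins whose probability exceeds the threshold."""
    return sorted(p for p, prob in expression_prob.items() if prob > threshold)


def minimal_protein_set(peptide_to_proteins: Dict[str, Set[str]]) -> Set[str]:
    """Greedy approximation of the smallest protein set that explains every peptide.

    Assumption: a protein "explains" a peptide if the peptide sequence matches it.
    """
    # Peptides with no matching protein cannot be explained and are skipped.
    unexplained = {pep for pep, prots in peptide_to_proteins.items() if prots}
    chosen: Set[str] = set()
    while unexplained:
        # Count how many still-unexplained peptides each candidate protein matches.
        coverage: Dict[str, int] = {}
        for pep in unexplained:
            for prot in peptide_to_proteins[pep]:
                coverage[prot] = coverage.get(prot, 0) + 1
        # Take the protein matching the most peptides; break ties by name.
        best = min(coverage, key=lambda prot: (-coverage[prot], prot))
        chosen.add(best)
        unexplained = {
            pep for pep in unexplained if best not in peptide_to_proteins[pep]
        }
    return chosen


# Hypothetical expression probabilities for three proteins.
probs = {"P1": 0.99, "P2": 0.60, "P3": 0.05}
strict_list = select_proteins(probs, 0.9)    # fewer proteins, fewer false positives
lenient_list = select_proteins(probs, 0.5)   # more proteins, higher sensitivity

# Hypothetical peptide matches; pepB and pepC are degenerate (match two proteins).
matches = {"pepA": {"P1"}, "pepB": {"P1", "P2"}, "pepC": {"P2", "P3"}}
explaining_set = minimal_protein_set(matches)
```

Raising the threshold from 0.5 to 0.9 shortens the protein list, which is the sensitivity versus false-positive trade-off the description mentions. In the second example two proteins are enough to explain all three peptides, so the third is left out.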
Provenance Metadata About This Resource Record
  1. workflow state
  2. contributor
  3. created
  4. creator
  5. modified
Copyright © 2016 by the President and Fellows of Harvard College
The eagle-i Consortium is supported by NIH Grant #5U24RR029825-02 / Copyright 2016