| US 7,555,393 B2 | ||
| Evaluating the probability that MS/MS spectral data matches candidate sequence data | ||
| Rovshan Goumbatoglu Sadygov, San Jose, Calif. (US); and Andreas Huhmer, Mountain View, Calif. (US) | ||
| Assigned to Thermo Finnigan LLC, San Jose, Calif. (US) | ||
| Filed on Jun. 01, 2007, as Appl. No. 11/809,703. | ||
| Prior Publication US 2008/0300795 A1, Dec. 04, 2008 | ||
| Int. Cl. G01N 33/48 (2006.01) | ||
| U.S. Cl. 702—19 | 20 Claims |

| 1. A method for generating a compound probability that product ion spectral data matches a candidate sequence in a sequence
database at random, the product ion spectral data having been generated by a non-ergodic process as implemented in a mass
spectrometer instrument, comprising:
(a) generating a first product ion spectral data from one or more biological samples using the mass spectrometer instrument;
wherein the following steps are performed by a computer;
(b) preprocessing the first product ion spectral data;
(c) determining product ion abundance values and product ion mass-to-charge ratio values for each of a plurality of peaks
from the pre-processed first product ion spectral data;
(d) utilizing the product ion abundance values to determine an intensity probability distribution, the intensity probability
distribution representing a first probability that the product ion spectral data was generated at random;
(e) utilizing the mass-to-charge ratio values of the product ions to determine a fragment probability distribution, the fragment
probability distribution representing a second probability that the product ion spectral data was generated at random;
(f) determining the compound probability based on the intensity probability distribution and the fragment probability distribution,
the compound probability representing a probability that the generated product ion spectral data matches a candidate sequence in the
sequence database at random; and
(g) displaying to a user, the candidate sequence which best matches the generated first product ion spectral data based on
the determined compound probability.
|
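Steps (c) through (f) can be illustrated with a minimal Python sketch. The claim does not state which statistical models underlie the two distributions, so everything specific here is an assumption made for illustration only: the fragment probability is modeled as a hypergeometric tail over fixed-width m/z bins, the intensity probability as a binomial tail on how many matched peaks exceed the median abundance, and the compound probability as their product under an independence assumption. The function names, the `bin_width` and `mz_range` parameters, and the returned tuple are all hypothetical and are not taken from the patent.

```python
from math import comb
from statistics import median


def hypergeom_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(population N, successes K, draws n)."""
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / comb(N, n)


def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def compound_probability(peaks, theoretical_mz, bin_width=1.0, mz_range=(100.0, 2000.0)):
    """Return (p_intensity, p_fragment, p_compound) for one candidate sequence.

    peaks          : list of (m/z, abundance) pairs from the preprocessed spectrum
    theoretical_mz : predicted fragment m/z values for the candidate
    All modeling choices below are illustrative assumptions, not the patented method.
    """
    lo, hi = mz_range
    n_bins = int((hi - lo) / bin_width)

    def to_bin(mz):
        return int((mz - lo) / bin_width)

    # Step (c): keep the largest abundance observed in each m/z bin.
    observed = {}
    for mz, abundance in peaks:
        if lo <= mz < hi:
            b = to_bin(mz)
            observed[b] = max(abundance, observed.get(b, 0.0))
    predicted = {to_bin(mz) for mz in theoretical_mz if lo <= mz < hi}
    matched = predicted & observed.keys()

    # Step (e), assumed model: chance of at least this many bin matches.
    p_fragment = hypergeom_tail(n_bins, len(predicted), len(observed), len(matched))

    # Step (d), assumed model: chance that at least this many matched peaks
    # lie above the median abundance if matches were random.
    cutoff = median(observed.values()) if observed else 0.0
    strong = sum(1 for b in matched if observed[b] > cutoff)
    p_intensity = binom_tail(len(matched), strong)

    # Step (f), assumed combination: product under independence.
    return p_intensity, p_fragment, p_intensity * p_fragment


if __name__ == "__main__":
    spectrum = [(150.2, 10.0), (300.4, 80.0), (450.1, 5.0), (600.3, 90.0)]
    candidate = [300.5, 600.2, 900.0]
    print(compound_probability(spectrum, candidate, mz_range=(100.0, 1100.0)))
```

Under these assumptions, a smaller compound value means the match is less likely to have arisen at random, so candidates would be ranked by ascending compound probability before the best one is displayed in step (g).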