| US 7,617,163 B2 | ||
| Kernels and kernel methods for spectral data | ||
| Asa Ben-Hur, Seattle, Wash. (US); André Elisseeff, Thalwil (Switzerland); Olivier Chapelle, Tuebingen (Germany); and Jason Aaron Edward Weston, New York, N.Y. (US) | ||
| Assigned to Health Discovery Corporation, Savannah, Ga. (US) | ||
| Filed on Oct. 09, 2002, as Appl. No. 10/267,977. | ||
| Application 10/267977 is a continuation in part of application No. 10/477078, granted, now 7,353,215, previously published as PCT/US02/14311, filed on May 07, 2002. | ||
| Application 10/267977 is a continuation in part of application No. 10/267977. | ||
| Application 10/267977 is a continuation in part of application No. 10/057849, filed on Jan. 24, 2002, granted, now 7,117,188. | ||
| Application 10/057849 is a continuation in part of application No. 09/633410, filed on Aug. 07, 2000, granted, now 6,882,990. | ||
| Claims priority of provisional application 60/289163, filed on May 07, 2001. | ||
| Claims priority of provisional application 60/329874, filed on Oct. 17, 2001. | ||
| Claims priority of provisional application 60/328309, filed on Oct. 09, 2001. | ||
| Prior Publication US 2005/0228591 A1, Oct. 13, 2005 | ||
| Int. Cl. G06F 15/18 (2006.01) | ||
| U.S. Cl. 706/12 [706/45; 706/20; 706/25] | 25 Claims |

| 1. A method for analysis of data contained in a plurality of spectra generated from mass spectrographic measurement of protein
samples corresponding to different biological conditions, wherein the different biological conditions have associated different
levels of protein expression, the method comprising:
downloading the plurality of spectra into a computer system comprising a processor and a storage device, wherein the processor
is programmed to execute at least one support vector machine and performs the steps of:
aligning the plurality of spectra, comprising:
selecting a first example spectrum as a baseline example;
sliding each spectral peak of a second example spectrum one at a time along a plurality of peaks within the baseline example;
applying a scoring function to obtain a similarity score between each spectral peak of the second example spectrum and the
peaks within the baseline example, the similarity score being determined according to the relationship
[equation image not reproduced in source]
offsetting the second example spectrum relative to the baseline example according to the similarity score achieved for the
second example spectrum;
repeating the step of aligning the spectra for at least one additional example spectrum to create a set of aligned spectra;
applying a feature selection algorithm to the set of aligned spectra to select a subset of spectral peaks that discriminate
between the different biological conditions;
training and testing the at least one support vector machine using the set of aligned spectra to provide a trained at least
one support vector machine for processing of live data;
inputting live spectral data from live protein samples to be analyzed, selecting the subset of spectral peaks from the live
spectral data, aligning the live spectral data and processing the live spectral data using the trained at least one support
vector machine; and
displaying a listing comprising two or more subsets of live protein samples wherein each subset is identified as belonging
to a group possessing one of the different biological conditions.
|
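
The aligning step of claim 1 can be illustrated with a minimal Python sketch. The claimed scoring relationship is not reproduced in this listing, so the `similarity` function below is a hypothetical stand-in (a sum of products of overlapping intensities), not the patented formula. The sketch also simplifies the claim by shifting each whole spectrum by an integer number of bins instead of sliding individual peaks one at a time. The names `similarity`, `best_offset`, `shift`, `align_spectra`, and the `max_shift` parameter are assumptions introduced for illustration only.

```python
from typing import List, Sequence


def similarity(baseline: Sequence[float], spectrum: Sequence[float], offset: int) -> float:
    """Hypothetical score: sum of products of overlapping intensities
    when `spectrum` is shifted by `offset` bins (NOT the claimed relationship)."""
    total = 0.0
    for i, value in enumerate(spectrum):
        j = i + offset
        if 0 <= j < len(baseline):
            total += baseline[j] * value
    return total


def best_offset(baseline: Sequence[float], spectrum: Sequence[float], max_shift: int) -> int:
    """Try every shift in [-max_shift, max_shift]; keep the highest-scoring one.
    Ties are resolved in favor of no shift, then of the earliest shift tried."""
    best, best_score = 0, similarity(baseline, spectrum, 0)
    for offset in range(-max_shift, max_shift + 1):
        score = similarity(baseline, spectrum, offset)
        if score > best_score:
            best, best_score = offset, score
    return best


def shift(spectrum: Sequence[float], offset: int) -> List[float]:
    """Return a copy of `spectrum` moved by `offset` bins, zero-padded at the ends."""
    out = [0.0] * len(spectrum)
    for i, value in enumerate(spectrum):
        j = i + offset
        if 0 <= j < len(out):
            out[j] = value
    return out


def align_spectra(spectra: Sequence[Sequence[float]], max_shift: int = 3) -> List[List[float]]:
    """Use the first spectrum as the baseline example and offset every other
    spectrum by its best-scoring shift."""
    baseline = spectra[0]
    aligned = [list(baseline)]
    for spectrum in spectra[1:]:
        aligned.append(shift(spectrum, best_offset(baseline, spectrum, max_shift)))
    return aligned


if __name__ == "__main__":
    baseline = [0, 0, 5, 0, 0, 3, 0, 0]
    second = [0, 5, 0, 0, 3, 0, 0, 0]   # same peaks, one bin too far left
    third = [0, 0, 0, 5, 0, 0, 3, 0]    # same peaks, one bin too far right
    for row in align_spectra([baseline, second, third]):
        print(row)
```

Under these assumptions, the aligned set returned by `align_spectra` would then feed the feature selection and support vector machine training steps recited in the claim; those steps are not sketched here.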