| US 7,457,745 B2 | ||
| Method and apparatus for fast on-line automatic speaker/environment adaptation for speech/speaker recognition in the presence of changing environments | ||
| Shubha Kadambe, Thousand Oaks, Calif. (US); Ron Burns, Oceanside, Calif. (US); and Markus Iseli, Los Angeles, Calif. (US) | ||
| Assigned to HRL Laboratories, LLC, Malibu, Calif. (US) | ||
| Filed on Dec. 03, 2003, as Appl. No. 10/728,106. | ||
| Claims priority of provisional application 60/430,788, filed on Dec. 03, 2002. | ||
| Prior Publication US 2004/0230420 A1, Nov. 18, 2004 | ||
| Int. Cl. G10L 19/00 (2006.01) | ||
| U.S. Cl. 704—216 [704/243; 704/244] | 123 Claims |

| 1. A method for fast on-line automatic speaker/environment adaptation suitable for speech/speaker recognition in the presence of changing environmental conditions, the method comprising acts of:
performing front-end processing on an acoustic input signal, wherein the front-end processing generates MEL frequency cepstral features representative of the acoustic input signal;
performing recognition and adaptation by:
providing the MEL frequency cepstral features to a speech recognizer, wherein the speech recognizer utilizes the MEL frequency cepstral features and a current list of acoustic training models to determine at least one best hypothesis;
receiving, from the speech recognizer, at least one best hypothesis, associated acoustic training models, and associated probabilities;
computing a pre-adaptation acoustic score by recognizing an utterance using the associated acoustic training models;
choosing acoustic training models from the associated acoustic training models;
performing adaptation on the chosen associated acoustic training models;
computing a post-adaptation acoustic score by recognizing the utterance using the adapted acoustic training models;
comparing the pre-adaptation acoustic score with the post-adaptation acoustic score to check for improvement;
modifying the current list of acoustic training models to include the adapted acoustic training models, if the acoustic score improved after performing adaptation; and
performing recognition and adaptation iteratively until the acoustic score ceases to improve;
choosing the best hypothesis as recognized words once the acoustic score ceases to improve; and
outputting the recognized words.
|
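
The control flow of the recognition-and-adaptation loop recited in claim 1 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: the claim does not say how models are represented, scored, or adapted, so the sketch assumes each acoustic model is a unit-variance Gaussian mean over already-extracted feature frames, uses a log-likelihood (up to a constant) as the acoustic score, and substitutes a simple mean interpolation for the adaptation step. All names (`Model`, `acoustic_score`, `recognize`, `adapt`, `recognize_with_adaptation`, `weight`, `tol`, `max_iters`) are hypothetical, and front-end feature extraction is omitted.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Frame = List[float]  # one feature vector (stand-in for cepstral features)


@dataclass(frozen=True)
class Model:
    """Toy acoustic model (assumption): unit-variance Gaussian given by its mean."""
    mean: Tuple[float, ...]


def acoustic_score(frames: List[Frame], model: Model) -> float:
    """Log-likelihood up to an additive constant, summed over all frames."""
    return -0.5 * sum((x - m) ** 2 for f in frames for x, m in zip(f, model.mean))


def recognize(frames: List[Frame], models: Dict[str, Model]) -> Tuple[str, float]:
    """Return the best hypothesis label and its acoustic score."""
    scored = ((word, acoustic_score(frames, m)) for word, m in models.items())
    return max(scored, key=lambda pair: pair[1])


def adapt(frames: List[Frame], model: Model, weight: float = 0.5) -> Model:
    """Stand-in adaptation (assumption): move the mean toward the utterance mean."""
    n = len(frames)
    utt_mean = [sum(f[d] for f in frames) / n for d in range(len(model.mean))]
    return Model(tuple((1 - weight) * m + weight * u
                       for m, u in zip(model.mean, utt_mean)))


def recognize_with_adaptation(frames: List[Frame], models: Dict[str, Model],
                              max_iters: int = 20, tol: float = 1e-6):
    """Iterate recognition and adaptation until the score stops improving."""
    current = dict(models)                       # current list of models (copied)
    word, pre = recognize(frames, current)       # pre-adaptation score
    for _ in range(max_iters):
        # Choose the model behind the best hypothesis and adapt it.
        candidate = {**current, word: adapt(frames, current[word])}
        new_word, post = recognize(frames, candidate)  # post-adaptation score
        if post <= pre + tol:                    # no improvement: stop iterating
            break
        current = candidate                      # keep the adapted model
        word, pre = new_word, post
    return word, pre, current                    # best hypothesis, score, models


if __name__ == "__main__":
    utterance = [[1.0, 2.0], [1.2, 1.8]]
    initial = {"yes": Model((0.0, 0.0)), "no": Model((5.0, 5.0))}
    print(recognize_with_adaptation(utterance, initial))
```

The adapted model is accepted only when the post-adaptation score beats the pre-adaptation score, which mirrors the conditional update of the current model list in the claim; the `max_iters` cap is an added safeguard that the claim does not mention.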