US 9,812,123 B1
Background audio identification for speech disambiguation
Jason Sanders, New York, NY (US); Gabriel Taubman, Brooklyn, NY (US); and John J. Lee, Long Island City, NY (US)
Assigned to Google Inc., Mountain View, CA (US)
Filed by Google Inc., Mountain View, CA (US)
Filed on Aug. 13, 2015, as Appl. No. 14/825,648.
Application 14/825,648 is a continuation of application No. 13/804,986, filed on Mar. 14, 2013, granted, now 9,123,338.
Claims priority of provisional application 61/654,387, filed on Jun. 1, 2012.
Claims priority of provisional application 61/654,518, filed on Jun. 1, 2012.
Claims priority of provisional application 61/654,407, filed on Jun. 1, 2012.
Claims priority of provisional application 61/778,570, filed on Mar. 13, 2013.
Int. Cl. G10L 15/20 (2006.01); G10L 15/08 (2006.01); G06F 17/30 (2006.01); H04M 3/493 (2006.01); G10L 15/26 (2006.01); G10L 21/0208 (2013.01)
CPC G10L 15/08 (2013.01) [G06F 17/30746 (2013.01); G10L 15/265 (2013.01); H04M 3/4936 (2013.01); G10L 21/0208 (2013.01)]
20 Claims
1. A computer-implemented method comprising:
receiving, by an application server of an automated speech recognition system that includes (a) the application server, (b)
a background audio recognizer, (c) a conceptual expander component, (d) an automated speech recognizer, and (e) a speech recognition
language model, an audio stream at a computing device, the audio stream comprising user speech data;
identifying, by the background audio recognizer of the automated speech recognition system, concepts from audio features of
the audio stream;
generating, by the conceptual expander component of the automated speech recognition system, a set of terms related to the
identified concepts;
influencing the automated speech recognizer of the automated speech recognition system based on at least one of the terms
related to the identified concepts, comprising adjusting one or more probabilities or relevance scores of the speech recognition
language model, wherein each of the one or more probabilities or relevance scores corresponds to a term related to the identified
concepts;
providing, by the application server of the automated speech recognition system, the audio stream to the influenced automated
speech recognizer;
generating, by the automated speech recognizer of the automated speech recognition system, a recognized version of the user
speech data; and
providing, by the application server of the automated speech recognition system, the recognized version of the user speech
data, for output.
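
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the lookup tables, the function names (identify_concepts, expand_concepts, bias_language_model, recognize), the unigram language model, the boost factor, and all numeric scores are assumptions chosen only to show how adjusting the probabilities of concept-related terms can change which hypothesis a recognizer selects.

```python
from typing import Dict, List, Set

# Hypothetical lookup tables standing in for the background audio recognizer
# and the conceptual expander; a real system would use trained components.
CONCEPTS_BY_FEATURE: Dict[str, List[str]] = {
    "theme_song_fingerprint": ["The Simpsons"],
}
TERMS_BY_CONCEPT: Dict[str, List[str]] = {
    "The Simpsons": ["simpsons", "homer", "bart"],
}


def identify_concepts(audio_features: List[str]) -> List[str]:
    """Map background-audio features to concepts (assumed table lookup)."""
    concepts: List[str] = []
    for feature in audio_features:
        concepts.extend(CONCEPTS_BY_FEATURE.get(feature, []))
    return concepts


def expand_concepts(concepts: List[str]) -> Set[str]:
    """Generate the set of terms related to the identified concepts."""
    terms: Set[str] = set()
    for concept in concepts:
        terms.update(TERMS_BY_CONCEPT.get(concept, []))
    return terms


def bias_language_model(
    lm: Dict[str, float], terms: Set[str], boost: float = 10.0
) -> Dict[str, float]:
    """Scale up the probability of each related term, then renormalize."""
    boosted = {w: p * (boost if w in terms else 1.0) for w, p in lm.items()}
    total = sum(boosted.values())
    return {w: p / total for w, p in boosted.items()}


def recognize(acoustic_scores: Dict[str, float], lm: Dict[str, float]) -> str:
    """Pick the hypothesis with the highest acoustic score times LM probability."""
    return max(acoustic_scores, key=lambda w: acoustic_scores[w] * lm.get(w, 1e-9))


# Illustrative data (assumed): two acoustically similar word hypotheses.
base_lm = {"part": 0.6, "bart": 0.1, "homer": 0.05, "the": 0.25}
acoustic_scores = {"part": 0.55, "bart": 0.45}

related_terms = expand_concepts(identify_concepts(["theme_song_fingerprint"]))
biased_lm = bias_language_model(base_lm, related_terms)

unbiased_result = recognize(acoustic_scores, base_lm)
biased_result = recognize(acoustic_scores, biased_lm)
```

With these assumed numbers, the unbiased model prefers "part", while the model adjusted for the background-audio concept prefers "bart". Renormalizing after the boost keeps the adjusted values a valid probability distribution; a relevance-score variant could skip that step.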