| US 7,516,067 B2 | ||
| Method and apparatus using harmonic-model-based front end for robust speech recognition | ||
| Michael Seltzer, Pittsburgh, Pa. (US); James Droppo, Duvall, Wash. (US); and Alejandro Acero, Bellevue, Wash. (US) | ||
| Assigned to Microsoft Corporation, Redmond, Wash. (US) | ||
| Filed on Aug. 25, 2003, as Appl. No. 10/647,586. | ||
| Prior Publication US 2005/0049857 A1, Mar. 03, 2005 | ||
| Int. Cl. G10L 21/02 (2006.01); G10L 15/00 (2006.01); G10L 15/20 (2006.01) | ||
| U.S. Cl. 704/226 [704/227; 704/228; 704/231; 704/233] | 18 Claims |

| 1. A method of identifying an estimate for a noise-reduced value representing a portion of a noise-reduced speech signal, the method comprising:
decomposing each frame of a noisy speech signal into a harmonic component for the frame and a random component for the frame;
for each frame, determining a separate scaling parameter for the frame for at least the harmonic component wherein determining a scaling parameter for each frame of the harmonic component comprises determining a ratio of an energy of the harmonic component in the frame without the random component of the frame to an energy of the frame of the noisy speech signal;
for each frame, multiplying the harmonic component of the frame by the scaling parameter of the frame for the harmonic component to form a scaled harmonic component for the frame;
for each frame, multiplying the random component of the frame by a fixed scaling parameter for the random component, wherein the fixed scaling parameter is the same for all frames and is less than one to form a scaled random component for the frame; and
for each frame, summing the scaled harmonic component for the frame and the scaled random component for the frame to form the noise-reduced value representing a frame of a noise-reduced speech signal wherein the frame of the noise-reduced speech signal has reduced noise relative to the frame of the noisy speech signal.
|
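The per-frame arithmetic recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the harmonic component of each frame has already been estimated by some decomposition step and takes the random component to be the residual (noisy frame minus harmonic component). The names `energy`, `noise_reduce_frame`, and `alpha_r`, and the default value 0.1, are hypothetical choices made for illustration; the claim only requires that the fixed scaling parameter be the same for all frames and less than one.

```python
def energy(samples):
    """Return the sum of squared sample values of one frame."""
    return sum(s * s for s in samples)


def noise_reduce_frame(noisy, harmonic, alpha_r=0.1):
    """Combine scaled harmonic and random components of a single frame.

    noisy    -- samples of the noisy speech frame
    harmonic -- harmonic component of the same frame (assumed to be
                supplied by a separate decomposition step)
    alpha_r  -- fixed scaling parameter for the random component;
                hypothetical default, must be less than one
    """
    if alpha_r >= 1.0:
        raise ValueError("alpha_r must be less than one")
    if len(noisy) != len(harmonic):
        raise ValueError("frame and harmonic component differ in length")

    # Assumption: the random component is the residual of the frame.
    random_part = [y - h for y, h in zip(noisy, harmonic)]

    # Per-frame harmonic scaling: harmonic energy over noisy-frame energy.
    total = energy(noisy)
    alpha_h = energy(harmonic) / total if total > 0 else 0.0

    # Sum of the scaled harmonic and scaled random components.
    return [alpha_h * h + alpha_r * r for h, r in zip(harmonic, random_part)]


def noise_reduce(frames, harmonics, alpha_r=0.1):
    """Apply noise_reduce_frame to every frame with the same alpha_r."""
    return [noise_reduce_frame(y, h, alpha_r) for y, h in zip(frames, harmonics)]
```

With these assumptions, a frame that is purely harmonic gets a harmonic scaling parameter of one and passes through unchanged, while a frame with little harmonic energy is attenuated toward `alpha_r` times its residual.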