US 7,516,067 B2
Method and apparatus using harmonic-model-based front end for robust speech recognition
Michael Seltzer, Pittsburgh, Pa. (US); James Droppo, Duvall, Wash. (US); and Alejandro Acero, Bellevue, Wash. (US)
Assigned to Microsoft Corporation, Redmond, Wash. (US)
Filed on Aug. 25, 2003, as Appl. No. 10/647,586.
Prior Publication US 2005/0049857 A1, Mar. 03, 2005
Int. Cl. G10L 21/02 (2006.01); G10L 15/00 (2006.01); G10L 15/20 (2006.01)
U.S. Cl. 704—226  [704/227; 704/228; 704/231; 704/233] 18 Claims
 
1. A method of identifying an estimate for a noise-reduced value representing a portion of a noise-reduced speech signal, the method comprising:
decomposing each frame of a noisy speech signal into a harmonic component for the frame and a random component for the frame;
for each frame, determining a separate scaling parameter for the frame for at least the harmonic component, wherein determining a scaling parameter for each frame of the harmonic component comprises determining a ratio of an energy of the harmonic component in the frame without the random component of the frame to an energy of the frame of the noisy speech signal;
for each frame, multiplying the harmonic component of the frame by the scaling parameter of the frame for the harmonic component to form a scaled harmonic component for the frame;
for each frame, multiplying the random component of the frame by a fixed scaling parameter for the random component, wherein the fixed scaling parameter is the same for all frames and is less than one, to form a scaled random component for the frame; and
for each frame, summing the scaled harmonic component for the frame and the scaled random component for the frame to form the noise-reduced value representing a frame of a noise-reduced speech signal, wherein the frame of the noise-reduced speech signal has reduced noise relative to the frame of the noisy speech signal.
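
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not state how a frame is decomposed, so the sketch assumes a least-squares fit of sinusoids at multiples of a known pitch frequency, with the residual taken as the random component. The function names, the f0_hz and num_harmonics parameters, and the value alpha_r=0.1 are all hypothetical choices made for illustration; the claim requires only that the fixed random-component scale be the same for all frames and less than one.

```python
import numpy as np


def decompose_frame(frame, f0_hz, sample_rate, num_harmonics):
    """Split one frame into harmonic and random components.

    Assumption: the harmonic component is the least-squares projection of the
    frame onto cosines and sines at integer multiples of f0_hz. The claim does
    not specify the decomposition method.
    """
    t = np.arange(len(frame)) / sample_rate
    k = np.arange(1, num_harmonics + 1)
    phase = 2.0 * np.pi * f0_hz * np.outer(t, k)
    basis = np.hstack([np.cos(phase), np.sin(phase)])
    coeffs, *_ = np.linalg.lstsq(basis, frame, rcond=None)
    harmonic = basis @ coeffs
    random_part = frame - harmonic  # whatever the harmonic model leaves over
    return harmonic, random_part


def noise_reduce_frame(frame, f0_hz, sample_rate, num_harmonics=5, alpha_r=0.1):
    """Return (noise-reduced frame, per-frame harmonic scale)."""
    if alpha_r >= 1.0:
        raise ValueError("the fixed random-component scale must be less than one")
    frame = np.asarray(frame, dtype=float)
    harmonic, random_part = decompose_frame(frame, f0_hz, sample_rate, num_harmonics)

    # Per-frame scale: energy of the harmonic component alone divided by the
    # energy of the whole noisy frame.
    frame_energy = np.sum(frame ** 2)
    alpha_h = np.sum(harmonic ** 2) / frame_energy if frame_energy > 0.0 else 0.0

    # Scale each component, then sum to form the noise-reduced frame.
    return alpha_h * harmonic + alpha_r * random_part, alpha_h


if __name__ == "__main__":
    sample_rate, n = 8000, 256
    t = np.arange(n) / sample_rate
    rng = np.random.default_rng(0)
    noisy = np.sin(2.0 * np.pi * 125.0 * t) + 0.5 * rng.standard_normal(n)
    reduced, alpha_h = noise_reduce_frame(noisy, 125.0, sample_rate)
    print("harmonic scale for this frame:", alpha_h)
```

Under this assumed decomposition the harmonic component is an orthogonal projection of the frame, so the per-frame scale lies between zero and one. A frame that the harmonic model fits well keeps most of its harmonic energy, while a frame dominated by noise is attenuated.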