US 9,812,147 B2
System and method for generating an audio signal representing the speech of a user
Patrick Kechichian, Eindhoven (NL); and Wilhelmus Andreas Martinus Arnoldus Maria Van Den Dungen, Boxtel (NL)
Assigned to KONINKLIJKE PHILIPS N.V., Eindhoven (NL)
Appl. No. 13/988,142
Filed by Patrick Kechichian, Eindhoven (NL); and Wilhelmus Andreas Martinus Arnoldus Maria Van Den Dungen, Boxtel (NL)
PCT Filed Nov. 17, 2011, PCT No. PCT/IB2011/055149
§ 371(c)(1), (2), (4) Date May 17, 2013,
PCT Pub. No. WO2012/069966, PCT Pub. Date May 31, 2012.
Claims priority of application No. 10192409 (EP), filed on Nov. 24, 2010.
Prior Publication US 2013/0246059 A1, Sep. 19, 2013
Int. Cl. G10L 21/0208 (2013.01)
CPC G10L 21/0208 (2013.01)
15 Claims
 
1. A method of generating a signal representing the speech of a user, the method comprising:
obtaining a first audio signal representing the speech of the user using a sensor in contact with the user;
obtaining a second audio signal using an air conduction sensor, the second audio signal representing the speech of the user and including noise from the environment around the user;
detecting periods of speech in the first audio signal;
applying a speech enhancement algorithm to the second audio signal to reduce the noise in the second audio signal, the speech enhancement algorithm using the detected periods of speech in the first audio signal;
equalizing the first audio signal using the noise-reduced second audio signal to produce an output audio signal representing the speech of the user, the equalizing includes performing linear prediction analysis on both the first audio signal and the noise-reduced second audio signal to construct an equalization filter, wherein the performing linear prediction analysis further includes:
(i) estimating linear prediction coefficients for both the first audio signal and the noise-reduced second audio signal;
(ii) using the linear prediction coefficients for the first audio signal to produce an excitation signal for the first audio signal;
(iii) using the linear prediction coefficients for the noise-reduced second audio signal to construct a frequency domain envelope; and
(iv) equalizing the excitation signal for the first audio signal using the frequency domain envelope.
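
The linear prediction steps (i) to (iv) of claim 1 can be illustrated with a minimal single-frame sketch. It is not the patented implementation. The function names `lpc` and `equalize_frame`, the prediction order of 12, the Hamming window, the autocorrelation method, and the FFT length are all assumptions chosen for illustration. The frequency domain envelope is taken here, also as an assumption, to be the frequency response 1/A(e^jw) of the all-pole model fitted to the noise-reduced air conduction frame, whose magnitude is the spectral envelope. Speech detection and noise reduction are assumed to have been applied already.

```python
import numpy as np


def lpc(frame, order):
    """Estimate LP coefficients [1, a1, ..., ap] by the autocorrelation method."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    w = frame * np.hamming(n)
    # Autocorrelation at lags 0..order of the windowed frame.
    r = np.array([np.dot(w[: n - k], w[k:]) for k in range(order + 1)])
    # Small regularisation so that a silent frame does not give a singular system.
    r[0] = r[0] * (1.0 + 1e-9) + 1e-12
    # Normal equations R a = -r[1:], with R a symmetric Toeplitz matrix.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.concatenate(([1.0], np.linalg.solve(R, -r[1:])))


def equalize_frame(bc_frame, ac_frame, order=12):
    """Equalize one contact-sensor (BC) frame using the noise-reduced AC frame."""
    bc_frame = np.asarray(bc_frame, dtype=float)
    n = len(bc_frame)
    nfft = 4 * n  # assumed zero-padding to limit circular wrap-around

    # (i) LP coefficients for both signals.
    a_bc = lpc(bc_frame, order)
    a_ac = lpc(ac_frame, order)

    # (ii) Excitation (prediction residual) of the BC frame: filter by A_bc(z).
    excitation = np.convolve(bc_frame, a_bc)[:n]

    # (iii) Frequency domain envelope from the AC coefficients: 1 / A_ac(e^jw).
    envelope = 1.0 / np.fft.rfft(a_ac, nfft)

    # (iv) Shape the BC excitation with the AC envelope and return to time domain.
    shaped = np.fft.rfft(excitation, nfft) * envelope
    return np.fft.irfft(shaped, nfft)[:n]
```

A real system would presumably process overlapping frames and match gains between the two signals, neither of which this sketch attempts. As a sanity check on the sketch itself, passing the same frame as both inputs should return that frame almost unchanged, since the excitation is then re-shaped by the envelope it was derived from.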