US 9,812,122 B2
Speech recognition model construction method, speech recognition method, computer system, speech recognition apparatus, program, and recording medium
Gakuto Kurata, Tokyo (JP); Toru Nagano, Tokyo (JP); Masayuki Suzuki, Tokyo (JP); and Ryuki Tachibana, Tokyo (JP)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed on Sep. 23, 2015, as Appl. No. 14/863,124.
Claims priority of application No. 2014-193564 (JP), filed on Sep. 24, 2014.
Prior Publication US 2016/0086599 A1, Mar. 24, 2016
Int. Cl. G10L 15/00 (2013.01); G10L 15/06 (2013.01); G10L 15/187 (2013.01); G10L 15/19 (2013.01); G10L 13/00 (2006.01)
CPC G10L 15/063 (2013.01) [G10L 15/187 (2013.01); G10L 15/19 (2013.01); G10L 13/00 (2013.01)]
19 Claims
14. A computer system for constructing a speech recognition model, comprising:
a processor and a memory coupled to the processor;
an alignment acquisition section adapted to acquire alignment between speech of each of a plurality of speakers and a transcript of the speaker;
a speech synthesizing section adapted to combine the speech of the speakers and thereby generate speech of mixed speakers;
a transcription section adapted to automatically join the transcripts of the respective ones of the plurality of speakers along a time axis, create a transcript of speech of the mixed speakers, and replace predetermined transcribed portions overlapping among plural speakers on the time axis with a unit which represents a simultaneous speech segment using the processor and the memory;
a language model construction section adapted to construct a language model making up a speech recognition model; and
an acoustic model construction section adapted to construct an acoustic model making up the speech recognition model, based on the speech of the mixed speakers as well as on the transcript of the speech of the mixed speakers.
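
The data-preparation steps recited in claim 14 (combining per-speaker speech, joining the per-speaker transcripts along the time axis, and replacing overlapping portions with a unit representing simultaneous speech) can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation. The names `AlignedWord`, `OVERLAP_UNIT`, `mix_speech`, and `merge_transcripts` are hypothetical. The claim does not state how speech is combined or which overlapping portions are "predetermined", so the sketch assumes sample-wise summation and treats any word whose time span intersects a word from another speaker as overlapping.

```python
from typing import List, NamedTuple


class AlignedWord(NamedTuple):
    """One transcript word with its time alignment (hypothetical structure)."""
    speaker: str
    word: str
    start: float  # seconds
    end: float    # seconds


# Hypothetical token standing in for the "unit which represents a
# simultaneous speech segment"; the claim does not name it.
OVERLAP_UNIT = "<OVERLAP>"


def mix_speech(signals: List[List[float]]) -> List[float]:
    """Combine per-speaker signals by sample-wise summation (an assumption).

    Shorter signals are treated as silent past their end.
    """
    length = max((len(s) for s in signals), default=0)
    return [sum(s[i] for s in signals if i < len(s)) for i in range(length)]


def _overlaps(a: AlignedWord, b: AlignedWord) -> bool:
    """True if two words from different speakers intersect in time."""
    return a.speaker != b.speaker and a.start < b.end and b.start < a.end


def merge_transcripts(alignments: List[List[AlignedWord]]) -> List[str]:
    """Join per-speaker transcripts on the time axis and mark overlaps."""
    words = sorted(
        (w for speaker_words in alignments for w in speaker_words),
        key=lambda w: (w.start, w.end),
    )
    merged: List[str] = []
    for w in words:
        in_overlap = any(_overlaps(w, other) for other in words)
        token = OVERLAP_UNIT if in_overlap else w.word
        # Collapse runs of overlapping words into a single unit. This also
        # merges back-to-back overlap segments, a simplification of the sketch.
        if token == OVERLAP_UNIT and merged and merged[-1] == OVERLAP_UNIT:
            continue
        merged.append(token)
    return merged


speaker_a = [
    AlignedWord("A", "so", 0.0, 0.3),
    AlignedWord("A", "what", 0.3, 0.6),
    AlignedWord("A", "happened", 0.6, 1.2),
]
speaker_b = [
    AlignedWord("B", "uh-huh", 0.7, 1.0),
    AlignedWord("B", "nothing", 1.5, 2.0),
]
mixed_transcript = merge_transcripts([speaker_a, speaker_b])
print(mixed_transcript)
```

In this example, "happened" and "uh-huh" intersect in time, so both are replaced by a single overlap unit in the mixed transcript. The mixed speech and the mixed transcript together would then serve as training material for the acoustic model construction section, which the sketch does not cover.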