14. A computer system for constructing a speech recognition model, comprising:
a processor and a memory coupled to the processor;
an alignment acquisition section adapted to acquire, for each of a plurality of speakers, an alignment between the speech of the speaker and a transcript of that speech;
a speech synthesizing section adapted to combine the speech of the speakers and thereby generate speech of mixed speakers;
a transcription section adapted to use the processor and the memory to automatically join the transcripts of the respective ones of the plurality of speakers along a time axis, create a transcript of the speech of the mixed speakers, and replace predetermined transcribed portions that overlap among plural speakers on the time axis with a unit which represents a simultaneous speech segment;
a language model construction section adapted to construct a language model making up a speech recognition model; and
an acoustic model construction section adapted to construct an acoustic model making up the speech recognition model, based on the speech of the mixed speakers as well as on the transcript of the speech of the mixed speakers.
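
By way of illustration only, the operation of the transcription section recited above may be sketched as follows. The sketch assumes that the acquired alignment yields word-level start and end times per speaker, and that every word overlapping a word of a different speaker counts as a "predetermined transcribed portion"; the claim does not fix either point. The names `AlignedWord`, `build_mixed_transcript`, and the token `<overlap>` are hypothetical and do not appear in the claim.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical unit standing for a simultaneous speech segment.
OVERLAP_UNIT = "<overlap>"


@dataclass(frozen=True)
class AlignedWord:
    """One transcribed word with its time span (assumed alignment output)."""
    speaker: str
    text: str
    start: float  # seconds on the shared time axis
    end: float


def overlaps_other_speaker(word: AlignedWord, words: List[AlignedWord]) -> bool:
    """True if the word's time span intersects a word of a different speaker."""
    return any(
        other.speaker != word.speaker
        and other.start < word.end
        and word.start < other.end
        for other in words
    )


def build_mixed_transcript(words: List[AlignedWord]) -> List[str]:
    """Join per-speaker transcripts along the time axis and collapse each
    run of overlapping words into a single OVERLAP_UNIT token."""
    ordered = sorted(words, key=lambda w: (w.start, w.end))
    tokens: List[str] = []
    for word in ordered:
        if overlaps_other_speaker(word, ordered):
            # Emit the unit once per contiguous overlapping run.
            if not tokens or tokens[-1] != OVERLAP_UNIT:
                tokens.append(OVERLAP_UNIT)
        else:
            tokens.append(word.text)
    return tokens


example_words = [
    AlignedWord("A", "how", 0.0, 0.4),
    AlignedWord("A", "are", 0.4, 0.7),
    AlignedWord("A", "you", 0.7, 1.0),
    AlignedWord("B", "fine", 0.8, 1.2),
    AlignedWord("B", "thanks", 1.2, 1.6),
]
mixed_transcript = build_mixed_transcript(example_words)
```

In this example, "you" (speaker A) and "fine" (speaker B) overlap on the time axis, so the two words are replaced by one unit and the mixed transcript becomes `how are <overlap> thanks`. The pairwise overlap check is quadratic in the number of words, which keeps the sketch short; a sweep over sorted intervals would be one alternative for long recordings.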