US 7,552,053 B2
Techniques for aiding speech-to-speech translation
Yuqing Gao, Mount Kisco, N.Y. (US); Hong-Kwang Jeff Kuo, Pleasantville, N.Y. (US); and Bowen Zhou, Ossining, N.Y. (US)
Assigned to International Business Machines Corporation, Armonk, N.Y. (US)
Filed on Aug. 22, 2005, as Appl. No. 11/208,989.
Prior Publication US 2007/0043567 A1, Feb. 22, 2007
Int. Cl. G10L 13/00 (2006.01)
U.S. Cl. 704/258
1 Claim
 
1. A computer-implemented method for translation of a source language utterance of a first user, said source language utterance being in a source language, said first user being a speaker of said source language, said method comprising the steps of:
performing automatic speech recognition on said source language utterance to obtain a speech recognition result corresponding to said source language utterance;
performing machine translation on said speech recognition result to obtain a translation result in a target language;
performing information retrieval on a supplemental database, based on said speech recognition result, to obtain multiple retrieved word strings, in said source language, related to said source language utterance, said retrieved word strings having word string translations in said target language associated therewith, said supplemental database comprising a parallel corpus with indexed training data including training word strings in said source language and corresponding translated training word strings in said target language, said step of performing said machine translation being carried out by a translation system trained on said training data of said parallel corpus, said performing of said information retrieval comprising:
taking said speech recognition result as a query in a weighted combination; and
searching said indexed training data based on said query;
formatting said speech recognition result and said retrieved word strings in said source language for display to facilitate an appropriate translation selection;
displaying said formatted speech recognition result and said retrieved word strings together with associated translation confidence scores;
obtaining said translation selection, from said first user, among said speech recognition result and said retrieved word strings;
in case said translation selection comprises said speech recognition result:
displaying said translation result in said target language, and
performing text-to-speech synthesis on said translation result to sound out said translation result in said target language;
in case said translation selection comprises one of said retrieved word strings:
displaying a corresponding one of said word string translations, and
performing text-to-speech synthesis on said corresponding one of said word string translations to sound out said corresponding one of said word string translations in said target language;
predicting:
at least one future time source language word string likely to be useful for future translation from said source language in a current conversation, and
at least one future time target language word string likely to be useful for future translation from said target language in said current conversation,
based on a dialog model of past dialog history and sentence correlation from previous conversations of other persons;
displaying said at least one future time source language word string in a format to facilitate selection, by said first user, of said at least one future time source language word string for translation, said at least one future time source language word string being displayed so as to guide said first user to employ a desirable lexicon in subsequent dialog; and
displaying said at least one future time target language word string in a format to facilitate selection, by a second user, of said at least one future time target language word string for translation, said at least one future time target language word string being displayed so as to guide said second user to employ a desirable lexicon in subsequent dialog, said second user being a speaker of said target language.
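
Illustrative sketch (not part of the claim). The information-retrieval and selection steps recited above can be pictured with a small Python example. Everything in it is an assumption made for illustration: the toy English-German corpus, the function names (build_index, idf_weights, retrieve, resolve_selection), and in particular the reading of "weighted combination" as a sum of per-word inverse-document-frequency weights. The claim does not specify a weighting scheme, and the score computed here is only a retrieval score, not the translation confidence score the claim displays.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy parallel corpus: (source string, stored target translation).
PARALLEL_CORPUS = [
    ("where is the hospital", "wo ist das krankenhaus"),
    ("where is the train station", "wo ist der bahnhof"),
    ("i need a doctor", "ich brauche einen arzt"),
    ("how much does this cost", "wie viel kostet das"),
]


def build_index(corpus):
    """Inverted index: source word -> ids of the source strings containing it."""
    index = defaultdict(set)
    for doc_id, (source, _target) in enumerate(corpus):
        for word in source.split():
            index[word].add(doc_id)
    return index


def idf_weights(index, n_docs):
    """Assumed weighting: smoothed inverse document frequency per word."""
    return {
        word: math.log((1 + n_docs) / (1 + len(ids))) + 1.0
        for word, ids in index.items()
    }


def retrieve(asr_result, corpus, index, weights, top_k=2):
    """Use the recognition result as a weighted bag-of-words query.

    Returns up to top_k tuples (source string, stored translation, score),
    best score first. Query words absent from the index are ignored.
    """
    query = Counter(asr_result.lower().split())
    scores = defaultdict(float)
    for word, count in query.items():
        for doc_id in index.get(word, ()):
            scores[doc_id] += count * weights[word]
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return [(corpus[i][0], corpus[i][1], score) for i, score in ranked]


def resolve_selection(choice, mt_result, retrieved):
    """Map the user's selection to the text that is displayed and synthesized.

    choice == 0 selects the recognition result, so the machine translation
    output is used; choice == k (k >= 1) selects the k-th retrieved string,
    so its stored translation is used instead.
    """
    if choice == 0:
        return mt_result
    return retrieved[choice - 1][1]


if __name__ == "__main__":
    index = build_index(PARALLEL_CORPUS)
    weights = idf_weights(index, len(PARALLEL_CORPUS))
    hits = retrieve("where is the hospital please", PARALLEL_CORPUS, index, weights)
    for source, target, score in hits:
        print(f"{score:.2f}  {source}  ->  {target}")
```

In this sketch, a user who picks a retrieved string receives its stored corpus translation rather than fresh machine translation output, which mirrors the two "in case" branches of the claim. Speech recognition, machine translation, text-to-speech synthesis, and the dialog-model prediction steps are outside its scope.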