US 7,580,839 B2
Apparatus and method for voice conversion using attribute information
Masatsune Tamura, Kanagawa (Japan); and Takehiko Kagoshima, Kanagawa (Japan)
Assigned to Kabushiki Kaisha Toshiba, Tokyo (Japan)
Filed on Sep. 19, 2006, as Appl. No. 11/533,122.
Claims priority of application No. 2006-011653 (JP), filed on Jan. 19, 2006.
Prior Publication US 2007/0168189 A1, Jul. 19, 2007
Int. Cl. G10L 13/00 (2006.01)
U.S. Cl. 704/258 [704/254; 704/257; 704/246; 379/88.02]
13 Claims
 
1. A speech processing apparatus comprising:
a speech storage configured to store a plurality of speech units of a conversion-source speaker and source-speaker attribute information corresponding to the speech units;
a speech-unit extractor configured to divide the speech of a conversion-target speaker into a predetermined type of a speech unit to form target-speaker speech units;
an attribute-information generator configured to generate target-speaker attribute information corresponding to the target-speaker speech units from the speech of the conversion-target speaker or linguistic information of the speech;
a speech-unit selector configured to calculate costs on the target-speaker attribute information and the source-speaker attribute information using cost functions, and to select one or a plurality of speech units with the same phoneme from the speech storage according to the costs to form a source-speaker speech unit; and
a voice-conversion-rule generator configured to generate speech conversion functions for converting the one or the plurality of source-speaker speech units to the target-speaker speech units based on the target-speaker speech units and the one or the plurality of source-speaker speech units.
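
The selection and rule-generation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The attributes (fundamental frequency and duration), the cost weights, the scaling of the frequency sub-cost, the single scalar `feature` standing in for a spectral parameter, and the affine form of the conversion function are all assumptions made for illustration; every name below (`SpeechUnit`, `cost`, `select_units`, `fit_conversion`) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SpeechUnit:
    phoneme: str
    f0: float        # fundamental frequency in Hz (assumed attribute)
    duration: float  # duration in seconds (assumed attribute)
    feature: float   # 1-D stand-in for a spectral parameter (assumption)


def cost(target, source, w_f0=1.0, w_dur=1.0):
    """Weighted sum of sub-costs on attribute information (weights are assumed)."""
    f0_cost = abs(target.f0 - source.f0) / 100.0  # scaling chosen arbitrarily
    dur_cost = abs(target.duration - source.duration)
    return w_f0 * f0_cost + w_dur * dur_cost


def select_units(target, storage, n=1):
    """Return the n lowest-cost source units that share the target's phoneme."""
    candidates = [u for u in storage if u.phoneme == target.phoneme]
    return sorted(candidates, key=lambda u: cost(target, u))[:n]


def fit_conversion(pairs):
    """Least-squares fit of y = a*x + b over (source, target) feature pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    if var_x == 0:
        # Degenerate case: fall back to a pure offset.
        return 1.0, mean_y - mean_x
    a = sum((x - mean_x) * (y - mean_y) for x, y in pairs) / var_x
    return a, mean_y - a * mean_x


# Source-speaker storage and target-speaker units (made-up values).
storage = [
    SpeechUnit("a", 120.0, 0.10, 1.0),
    SpeechUnit("a", 200.0, 0.30, 2.0),
    SpeechUnit("i", 125.0, 0.10, 3.0),
    SpeechUnit("i", 220.0, 0.25, 4.0),
]
targets = [
    SpeechUnit("a", 125.0, 0.12, 3.0),
    SpeechUnit("i", 130.0, 0.11, 7.0),
]

# Pair each target unit with its best-matching source unit, then fit.
pairs = [(select_units(t, storage)[0].feature, t.feature) for t in targets]
a, b = fit_conversion(pairs)
```

With these made-up values, each target unit is paired with the same-phoneme source unit whose attributes are closest, and the fit yields one conversion function mapping source features to target features. A practical system would presumably operate on multi-dimensional spectral parameters rather than a single scalar.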