| US 7,526,361 B2 | ||
| Robotics visual and auditory system | ||
| Kazuhiro Nakadai, Wako (Japan); Hiroshi Okuno, Wako (Japan); and Hiroaki Kitano, Kawagoe (Japan) | ||
| Assigned to Honda Motor Co., Ltd., Tokyo (Japan) | ||
| Appl. No. 10/506,167 PCT Filed Aug. 30, 2002, PCT No. PCT/JP02/08827 § 371(c)(1), (2), (4) Date Feb. 09, 2006. | ||
| Claims priority of application No. 2002-056670 (JP), filed on Mar. 01, 2002. | ||
| Prior Publication US 2006/0241808 A1, Oct. 26, 2006 | ||
| Int. Cl. G06F 19/00 (2006.01) | ||
| U.S. Cl. 700—245 [700/258; 700/259; 318/568.1; 901/1] | 5 Claims |

| 1. Robotics visual and auditory system comprising:
an audition module including at least a pair of microphones for collecting external sounds;
a face module including a camera for taking images in front of a robot;
a stereo module for extracting a matter by a stereo camera;
a motor control module including a drive motor for rotating the robot in the horizontal direction;
an association module for generating streams by associating events from said audition, said face, said stereo, and said motor
control modules; and
an attention control module for conducting attention control based on the stream generated by said association module; characterized
in that:
said audition module determines at least one speaker's direction from the sound source separation and localization by grouping
based on pitch extraction and harmonic wave structure, based on sound signal from the microphones, and extracts an auditory
event;
said face module identifies each speaker from each speaker's face recognition and localization based on the image taken by
the camera, and extracts a face event;
said stereo module extracts a stereo event by extraction and localization of a longitudinally long matter based on a disparity
extracted from the image taken by the stereo camera;
said motor control module extracts a motor event based on the rotational position of the drive motor; and thereby
said association module determines each speaker's direction based on directional information of sound source localization
by the auditory event, face localization by the face event, and matter localization by the stereo event from the auditory,
face, stereo, and motor events, generates an auditory, a face, and a stereo stream by connecting the events in the temporal
direction using a Kalman filter, and further generates an association stream by associating these;
said attention control module conducts attention control based on said streams, and drive-control of the motor based on a
result of planning for the action accompanying those; and
said audition module collects sub-bands having interaural phase difference (IPD) or interaural intensity difference (IID)
within a predetermined range by an active direction pass filter having a pass range which, according to auditory characteristics,
becomes minimum in the frontal direction, and larger as the angle becomes wider to the left and right, based on accurate
sound source directional information from the association module, and conducts sound source separation by restructuring a
wave shape of a sound source.
|
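The final limitation of claim 1 (sub-band collection by an active direction pass filter) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration and is not taken from the claim: the microphone spacing `MIC_DISTANCE`, the linear widening of the pass range from 10 to 30 degrees, and the free-field model used for the expected IPD. The function names `pass_range_deg`, `expected_ipd`, and `select_subbands` are hypothetical. The sketch uses IPD only, ignores phase wrapping, and does not cover the IID alternative or the waveform reconstruction step.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, room-temperature air
MIC_DISTANCE = 0.18     # m, hypothetical spacing between the two microphones


def pass_range_deg(theta_deg, min_width=10.0, max_width=30.0):
    """Half-width of the pass range in degrees.

    Narrowest in the frontal direction (0 degrees) and wider toward
    +/-90 degrees. The linear shape and both widths are assumptions.
    """
    frac = min(abs(theta_deg), 90.0) / 90.0
    return min_width + (max_width - min_width) * frac


def expected_ipd(freq_hz, theta_deg):
    """Free-field IPD in radians for a source at theta_deg (assumed model)."""
    path_diff = MIC_DISTANCE * math.sin(math.radians(theta_deg))
    return 2.0 * math.pi * freq_hz * path_diff / SPEED_OF_SOUND


def select_subbands(subbands, theta_deg):
    """Keep sub-bands whose measured IPD falls inside the pass range.

    subbands: list of (freq_hz, measured_ipd_rad) pairs.
    theta_deg: source direction, e.g. as supplied by the association module.
    """
    delta = pass_range_deg(theta_deg)
    lo_angle = max(theta_deg - delta, -90.0)
    hi_angle = min(theta_deg + delta, 90.0)
    selected = []
    for freq_hz, ipd in subbands:
        # sin() is monotonic on [-90, 90], so lo <= hi for every frequency.
        lo = expected_ipd(freq_hz, lo_angle)
        hi = expected_ipd(freq_hz, hi_angle)
        if lo <= ipd <= hi:
            selected.append((freq_hz, ipd))
    return selected
```

With these assumed values, a 500 Hz sub-band with a measured IPD of 0.1 rad passes when the filter is steered to 0 degrees, while one with an IPD of 1.0 rad passes only when the filter is steered toward roughly 40 degrees. The collected sub-bands would then feed whatever resynthesis step reconstructs the separated waveform.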