CPC G06T 7/11 (2017.01) [G06T 7/50 (2017.01); G06T 19/20 (2013.01); G06V 10/764 (2022.01); G06V 20/00 (2022.01); G06T 2207/10024 (2013.01); G06T 2207/10028 (2013.01); G06T 2207/20084 (2013.01)] | 29 Claims |
1. A computer-implemented method, the method comprising:
maintaining object data specifying objects that have been recognized in a scene in an environment;
receiving a stream of input images of the scene;
for each of a plurality of input images in the stream of input images:
providing the input image as input to an object recognition system;
receiving, as output from the object recognition system, a recognition output that identifies a respective bounding box in the input image for each of one or more objects that have been recognized in the input image;
providing data identifying the bounding boxes as input to a three-dimensional (3-D) bounding box generation system that determines, from the object data and the bounding boxes, a respective 3-D bounding box for each of one or more of the objects that have been recognized in the input image,
wherein the 3-D bounding box generation system performs operations comprising
generating a current 3-D object mask of a first object that has been recognized in the input image, and
performing fusion between the current 3-D object mask of the first object and respective object data specifying a previously recognized object associated with the first object; and
receiving, as output from the 3-D bounding box generation system, data specifying one or more 3-D bounding boxes for one or more of the objects recognized in the input image based on the fusion performed by the 3-D bounding box generation system; and
providing, as output, data specifying the one or more 3-D bounding boxes.
|
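The per-image loop recited in claim 1 can be illustrated with a minimal sketch. The claim does not say how the 3-D object mask is generated, how a recognized object is associated with a previously recognized one, or how fusion is performed, so everything below is an assumption made for illustration: masks are sets of integer voxel coordinates, association is by label, fusion is a set union, and the 3-D bounding box is the axis-aligned extent of the fused mask. The object recognition system is stubbed out; each frame is taken to be its recognition output already, with a hypothetical per-detection depth value. All names (`ObjectData`, `current_mask`, `fuse`, `bounding_box_3d`, `process_stream`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Iterator, List, Set, Tuple

Voxel = Tuple[int, int, int]
Box2D = Tuple[int, int, int, int]   # x_min, y_min, x_max, y_max (pixels)
Box3D = Tuple[Voxel, Voxel]         # (min corner, max corner)
Detection = Tuple[str, Box2D, int]  # label, 2-D box, assumed depth


@dataclass
class ObjectData:
    """Maintained object data: one fused voxel mask per recognized object."""
    masks: Dict[str, Set[Voxel]] = field(default_factory=dict)


def current_mask(box: Box2D, depth: int) -> Set[Voxel]:
    """Assumed mask generation: lift the 2-D box to voxels at a single depth."""
    x0, y0, x1, y1 = box
    return {(x, y, depth) for x in range(x0, x1 + 1) for y in range(y0, y1 + 1)}


def fuse(object_data: ObjectData, label: str, mask: Set[Voxel]) -> Set[Voxel]:
    """Assumed fusion: union of the current mask with the stored mask, if any."""
    fused = object_data.masks.get(label, set()) | mask
    object_data.masks[label] = fused  # update the maintained object data
    return fused


def bounding_box_3d(mask: Set[Voxel]) -> Box3D:
    """Axis-aligned 3-D bounding box enclosing every voxel in the mask."""
    xs, ys, zs = zip(*mask)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def process_stream(
    frames: Iterable[List[Detection]], object_data: ObjectData
) -> Iterator[Dict[str, Box3D]]:
    """For each frame's recognition output, yield one 3-D box per object."""
    for detections in frames:
        yield {
            label: bounding_box_3d(fuse(object_data, label, current_mask(box, depth)))
            for label, box, depth in detections
        }


if __name__ == "__main__":
    frames = [
        [("chair", (0, 0, 1, 1), 2)],  # first view of the chair
        [("chair", (1, 1, 3, 2), 4)],  # later view, fused with the first
    ]
    for boxes in process_stream(frames, ObjectData()):
        print(boxes)
```

In this sketch the box reported for the second frame encloses voxels from both views, because the output is computed from the fused mask and not from the current mask alone. A real system would replace the label-based association and the set union with whatever association and fusion the specification describes.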