US 9,811,756 B2
Method for labeling images of street scenes
Ming-Yu Liu, Revere, MA (US); Srikumar Ramalingam, Cambridge, MA (US); and Oncel Tuzel, Winchester, MA (US)
Assigned to Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed by Mitsubishi Electric Research Laboratories, Inc., Cambridge, MA (US)
Filed on Feb. 23, 2015, as Appl. No. 14/628,808.
Prior Publication US 2016/0247290 A1, Aug. 25, 2016
Int. Cl. G06K 9/00 (2006.01); G06K 9/46 (2006.01)
CPC G06K 9/4604 (2013.01) [G06K 9/00791 (2013.01); G06K 2009/00953 (2013.01)]
13 Claims
 
1. A method for labeling an image of a street view, wherein the image includes a set of columns of pixels, comprising the steps of:
employing at least one processor executing computer executable instructions stored on at least one computer readable memory to facilitate performing the steps of:
receiving the image having image pixels including respective two-dimensional features;
receiving image data points corresponding to the image, the image data points including respective three-dimensional image features;
extracting, for each pixel, an appearance feature from the image pixels, wherein the appearance features are determined using a deep neural network learned from a labeled dataset;
extracting, for each pixel, a depth feature from the image data points; and
applying a column-wise labeling procedure to jointly determine a semantic label and a depth label for each pixel for each column of pixels from the set of columns of pixels of the image using both the appearance features and the depth features from their respective column of pixels, and wherein the column-wise labeling procedure is according to a model of the street view, wherein each column of pixels of the set of columns of pixels includes at most four ordered layers from a top layer or a fourth layer to a first layer or a bottom layer, such that the at most four ordered layers are obtained using an inference procedure that jointly estimates the semantic labels and the depth labels for each image column;
processing the column-wise labeling procedure subject to the at most four ordered layers to produce the labeled image, and the steps are performed in the processor.
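
The ordered-layer constraint recited in claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the claim does not specify the inference procedure, so a simple dynamic program stands in for it; the function names `combine_costs` and `label_column`, the cost tables, and the `depth_weight` parameter are hypothetical; and the depth label is simplified to a per-pixel cost term rather than estimated as a separate output. Layer index 0 denotes the top layer and index 3 the bottom layer, and a valid column labeling never moves back to an earlier layer when read from top to bottom.

```python
from typing import List


def combine_costs(appearance: List[List[float]],
                  depth: List[List[float]],
                  depth_weight: float = 1.0) -> List[List[float]]:
    """Merge hypothetical per-pixel appearance and depth costs into one table.

    Both inputs are indexed [pixel][layer]; the weighting is an assumption.
    """
    return [[a + depth_weight * d for a, d in zip(row_a, row_d)]
            for row_a, row_d in zip(appearance, depth)]


def label_column(cost: List[List[float]]) -> List[int]:
    """Label one pixel column with ordered layers by dynamic programming.

    cost[i][k] is the cost of assigning pixel i (0 = top of the column)
    to layer k (0 = top layer). The returned layer indices never decrease
    from top to bottom, and any layer may be absent, so a column uses
    at most len(cost[0]) layers.
    """
    n = len(cost)
    if n == 0:
        return []
    num_layers = len(cost[0])

    best = [list(cost[0])]        # best[i][k]: min cost of pixels 0..i ending in layer k
    back = [[0] * num_layers]     # back[i][k]: layer chosen for pixel i - 1

    for i in range(1, n):
        row, ptr = [], []
        run_min, run_arg = float("inf"), 0
        for k in range(num_layers):
            # Running minimum over predecessor layers j <= k enforces the ordering.
            if best[i - 1][k] < run_min:
                run_min, run_arg = best[i - 1][k], k
            row.append(cost[i][k] + run_min)
            ptr.append(run_arg)
        best.append(row)
        back.append(ptr)

    # Backtrack from the cheapest final layer.
    k = min(range(num_layers), key=lambda j: best[-1][j])
    labels = [0] * n
    for i in range(n - 1, -1, -1):
        labels[i] = k
        k = back[i][k]
    return labels


# Made-up costs for a 5-pixel column and 4 layers.
example_cost = [
    [0, 5, 5, 5],
    [0, 5, 5, 5],
    [5, 5, 0, 5],
    [5, 1, 2, 5],   # locally prefers layer 1, which the ordering rules out here
    [5, 5, 5, 0],
]
example_labels = label_column(example_cost)
print(example_labels)  # [0, 0, 2, 2, 3]
```

In this example the fourth pixel would pick layer 1 if labeled independently, but that would place layer 1 below layer 2, so the ordered solution keeps it in layer 2 and the column ends up using only three of the four layers.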