US 11,816,188 B2
Weakly supervised one-shot image segmentation
Moin Nabi, Berlin (DE); Tassilo Klein, Berlin (DE); Hasnain Raza, Berlin (DE); and Sayyed Mahdyar Ravanbakhsh, Berlin (DE)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on Aug. 31, 2020, as Appl. No. 17/008,615.
Prior Publication US 2022/0067455 A1, Mar. 3, 2022
Int. Cl. G06F 18/2413 (2023.01); G06T 7/11 (2017.01); G06N 20/00 (2019.01); G06N 5/04 (2023.01); G06F 18/214 (2023.01); G06F 18/21 (2023.01)
CPC G06F 18/24143 (2023.01) [G06F 18/214 (2023.01); G06F 18/217 (2023.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01); G06T 7/11 (2017.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)]
17 Claims
 
1. A system, comprising:
at least one processor; and
at least one memory including program code which when executed by the at least one processor provides operations comprising:
training, in a supervised manner, a machine learning model to learn a plurality of base class prototypes corresponding to a plurality of base objects, each of the plurality of base class prototypes corresponding to a segmentation of a class of one or more similar base objects, the machine learning model being trained based on a plurality of training images, each training image of the plurality of training images depicting a base object of the plurality of base objects, and each training image of the plurality of training images being associated with a plurality of pixel-wise labels corresponding to semantic classes indicative of a ground-truth segmentation of the base object depicted therein;
training, based at least on a support image depicting a novel object, the machine learning model to learn a novel class prototype corresponding to the novel object, the support image being associated with an image-level label identifying the novel object depicted therein instead of a plurality of pixel-wise labels corresponding to a ground-truth segmentation of the novel object, the machine learning model being trained to learn the novel class prototype based at least on one of the plurality of base class prototypes identified as being similar to the support image, and the novel object being a different object than the plurality of base objects preserving clusters associated with the semantic classes depicted in the plurality of base objects; and
applying the trained machine learning model to segment a query image.
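
The three operations recited in claim 1 can be illustrated with a minimal sketch, offered only as one possible reading and not as the patented implementation. It assumes that per-pixel feature vectors have already been extracted by some network, that a class prototype is the mean feature vector of that class, and that similarity is cosine similarity. The function names, the pseudo-mask heuristic, and the `background` and `threshold` parameters are hypothetical and do not come from the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def learn_base_prototypes(training_set):
    """Supervised step: average the features of each pixel-wise labelled class.

    training_set is a list of (features, labels) pairs, where features is a
    list of per-pixel vectors and labels is the matching list of class ids.
    """
    per_class = {}
    for features, labels in training_set:
        for vector, cls in zip(features, labels):
            per_class.setdefault(cls, []).append(vector)
    return {cls: mean_vector(vectors) for cls, vectors in per_class.items()}

def learn_novel_prototype(support_features, base_prototypes,
                          background=0, threshold=0.5):
    """Weakly supervised step: the support image has no pixel-wise labels.

    Assumed heuristic (not from the patent): pick the non-background base
    prototype closest to the image's mean feature, pseudo-label the pixels
    that resemble it, and average those pixels into the novel prototype.
    """
    image_feature = mean_vector(support_features)
    seed = max(
        (p for cls, p in base_prototypes.items() if cls != background),
        key=lambda p: cosine(image_feature, p),
    )
    pseudo_mask = [v for v in support_features if cosine(v, seed) >= threshold]
    return mean_vector(pseudo_mask) if pseudo_mask else list(seed)

def segment(query_features, prototypes):
    """Assign each query pixel the class whose prototype is most similar."""
    return [
        max(prototypes, key=lambda cls: cosine(vector, prototypes[cls]))
        for vector in query_features
    ]

# Toy data: class 0 is background, classes 1 and 2 are base objects.
base = learn_base_prototypes([([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [0, 1, 2])])
support = [[1, 0, 0], [0, 0.6, 0.8], [0, 0.6, 0.8], [0, 0.6, 0.8]]
prototypes = {**base, 3: learn_novel_prototype(support, base)}  # 3 = novel class
print(segment([[1, 0, 0], [0, 0.6, 0.8], [0, 1, 0], [0, 0, 1]], prototypes))
# prints [0, 3, 1, 2]
```

In this toy run the support image is closest to base class 2, whose prototype seeds the pseudo-mask. The resulting novel prototype then labels the matching query pixel as class 3 while leaving the base-class pixels assigned to their original classes.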