US 11,809,828 B2
Systems and methods of data augmentation for pre-trained embeddings
Keld Lundgaard, Cambridge, MA (US); and Cameron Wolfe, Austin, TX (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by Salesforce, Inc., San Francisco, CA (US)
Filed on Aug. 30, 2022, as Appl. No. 17/898,780.
Application 17/898,780 is a continuation of application No. 16/827,830, filed on Mar. 24, 2020, granted, now 11,461,537.
Claims priority of provisional application 62/967,137, filed on Jan. 29, 2020.
Claims priority of provisional application 62/934,714, filed on Nov. 13, 2019.
Prior Publication US 2023/0039734 A1, Feb. 9, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 40/00 (2020.01); G06F 40/30 (2020.01); G06F 40/151 (2020.01); G06F 17/18 (2006.01); G06N 3/08 (2023.01); G06N 20/10 (2019.01); G06N 3/04 (2023.01); G06F 18/214 (2023.01); G06F 18/25 (2023.01); G06F 18/2431 (2023.01); G06V 10/764 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01); G06V 10/40 (2022.01); G06N 20/00 (2019.01)
CPC G06F 40/30 (2020.01) [G06F 17/18 (2013.01); G06F 18/214 (2023.01); G06F 18/2431 (2023.01); G06F 18/251 (2023.01); G06F 40/151 (2020.01); G06N 3/04 (2013.01); G06N 3/08 (2013.01); G06N 20/10 (2019.01); G06V 10/40 (2022.01); G06V 10/764 (2022.01); G06V 10/803 (2022.01); G06V 10/82 (2022.01); G06N 20/00 (2019.01)] 28 Claims
1. A method comprising:
generating, by a server, textual embeddings by tokenizing text data and generating vectors to be provided to a transformer system, wherein the textual embeddings are vector representations of semantic meanings of text that is part of the text data;
averaging, by the server, the vectors for every token of the generated textual embeddings and concatenating average output activations of two layers of the transformer system;
generating, by the server, image embeddings from image data, wherein the image embeddings are vector representations of the images that are part of the image data;
combining, by the server, the textual embeddings and image embeddings to form combined embeddings to be provided to the transformer system; and
transmitting, by the server, the combined embeddings.
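
The pooling and combining steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes the per-token output activations of two transformer layers are already available as lists of vectors, and it assumes "combining" means simple concatenation, which the claim does not specify. The names mean_pool, text_embedding, and combine are hypothetical and chosen only for illustration.

```python
from typing import List

Vector = List[float]


def mean_pool(token_vectors: List[Vector]) -> Vector:
    """Average the per-token vectors of one layer into a single vector."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / n for i in range(dim)]


def text_embedding(layer_a: List[Vector], layer_b: List[Vector]) -> Vector:
    """Concatenate the average output activations of two layers."""
    return mean_pool(layer_a) + mean_pool(layer_b)


def combine(text_vec: Vector, image_vec: Vector) -> Vector:
    """Assumption: combine the two embeddings by concatenation."""
    return text_vec + image_vec


# Toy activations for a two-token input (hypothetical values).
layer_a = [[1.0, 3.0], [3.0, 5.0]]
layer_b = [[0.0, 2.0], [2.0, 4.0]]
image_vec = [0.5, 0.25]

text_vec = text_embedding(layer_a, layer_b)
combined = combine(text_vec, image_vec)
```

With these toy inputs, each layer contributes a 2-dimensional average, so the text embedding has 4 dimensions and the combined embedding has 6. Concatenation keeps the text and image components separable; other ways of combining (for example, summation after projection to a common size) would also be consistent with the claim wording.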