CPC G06F 16/93 (2019.01) [G06F 9/30036 (2013.01); G06F 16/335 (2019.01); G06F 18/23 (2023.01); G06F 18/232 (2023.01); G06F 18/2413 (2023.01); G06F 18/24765 (2023.01); G06F 40/279 (2020.01); G06N 3/08 (2013.01); G06V 10/763 (2022.01); G06V 10/764 (2022.01); G06V 10/765 (2022.01); G06V 10/82 (2022.01); G06V 30/224 (2022.01); G06V 30/412 (2022.01); G06V 30/10 (2022.01)] | 20 Claims |
1. A method, comprising:
obtaining a layout of a document, the document having a plurality of fields;
identifying the document, based on the layout, as belonging to a first type of documents of a plurality of identified types of documents;
identifying a plurality of symbol sequences of the document;
processing, by a processing device, the plurality of symbol sequences of the document using a first neural network associated with the first type of documents to generate a plurality of feature vectors;
using the plurality of feature vectors to form one or more association hypotheses, wherein each of the one or more association hypotheses associates one of the plurality of fields of the document with at least one of the plurality of feature vectors;
determining, using the one or more association hypotheses, an association of a first field of the plurality of fields with a first set of one or more symbol sequences of the plurality of symbol sequences of the document; and
causing a representation of the first set of the one or more symbol sequences to be stored in a computer memory in association with a profile of the document.
|
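The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Every name in it (`Document`, `DOC_TYPES`, `identify_type`, `raw_features`, `feature_vector`, `form_hypotheses`, `extract`) is hypothetical. The layout signature is assumed to be a small tuple of grid-occupancy flags. The per-type "neural network" is replaced by a single hand-set linear layer with a tanh activation, whose rows are assumed to follow the order of the document's fields. A real system would use a trained network and a richer layout representation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Document:
    layout: tuple      # hypothetical layout signature (coarse grid occupancy)
    fields: list       # names of the fields the document contains
    sequences: list    # symbol sequences already recognized on the page
    profile: dict = field(default_factory=dict)


# Hypothetical per-type registry. Each "weights" matrix (one row per field,
# one column per input feature) stands in for a trained per-type network.
DOC_TYPES = {
    "invoice": {"layout": (1, 0, 1), "weights": [[1.0, -1.0], [-1.0, 1.0]]},
    "receipt": {"layout": (0, 1, 0), "weights": [[1.0, 0.0], [0.0, 1.0]]},
}


def identify_type(layout):
    """Pick the known type whose reference layout differs in the fewest cells."""
    def distance(name):
        ref = DOC_TYPES[name]["layout"]
        return sum(a != b for a, b in zip(layout, ref))
    return min(DOC_TYPES, key=distance)


def raw_features(seq):
    """Toy input encoding: fraction of digits and fraction of letters."""
    n = max(len(seq), 1)
    return [sum(c.isdigit() for c in seq) / n, sum(c.isalpha() for c in seq) / n]


def feature_vector(seq, weights):
    """One linear layer plus tanh; component i scores the sequence for field i."""
    x = raw_features(seq)
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def form_hypotheses(fields, vectors, threshold=0.0):
    """Each hypothesis ties a field to one feature vector (by sequence index)."""
    return [
        (name, idx, vec[i])
        for i, name in enumerate(fields)
        for idx, vec in enumerate(vectors)
        if vec[i] > threshold
    ]


def extract(doc):
    """Run the claimed steps in order and fill in the document's profile."""
    doc_type = identify_type(doc.layout)
    weights = DOC_TYPES[doc_type]["weights"]
    vectors = [feature_vector(s, weights) for s in doc.sequences]
    hypotheses = form_hypotheses(doc.fields, vectors)
    for name in doc.fields:
        candidates = [h for h in hypotheses if h[0] == name]
        if candidates:
            # Keep the highest-scoring hypothesis for this field.
            best = max(candidates, key=lambda h: h[2])
            # Store the chosen sequence(s) in association with the profile.
            doc.profile[name] = [doc.sequences[best[1]]]
    return doc_type, doc.profile


doc = Document(layout=(1, 0, 1), fields=["total", "vendor"], sequences=["ACME", "42.50"])
doc_type, profile = extract(doc)
```

With these toy weights, the example document is identified as an `invoice`, and the profile associates `total` with `42.50` and `vendor` with `ACME`. Selecting only the single best hypothesis per field is a simplification; the claim allows a field to be associated with a set of one or more symbol sequences.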