US 11,809,460 B1
Systems, methods, and graphical user interfaces for taxonomy-based classification of unlabeled structured datasets
Nancy Anne Rausch, Apex, NC (US); Ruth Oluwadamilola Akintunde, Raleigh, NC (US); and Brant Nathan Kay, Pittsboro, NC (US)
Assigned to SAS Institute, Inc., Cary, NC (US)
Filed by SAS INSTITUTE INC., Cary, NC (US)
Filed on Jul. 13, 2023, as Appl. No. 18/221,695.
Claims priority of provisional application 63/398,827, filed on Aug. 17, 2022.
Claims priority of provisional application 63/391,772, filed on Jul. 24, 2022.
Int. Cl. G06F 16/28 (2019.01)
CPC G06F 16/287 (2019.01)
27 Claims
1. A computer-program product embodied in a non-transitory machine-readable storage medium storing computer instructions that, when executed by one or more processors, perform operations comprising:
obtaining, via a graphical user interface (GUI) of a computer system, a GUI request for one or more taxonomy-labeled structured datasets;
extracting, from the GUI request, one or more taxonomy tokens;
defining a taxonomy token-informed search operation based on the one or more taxonomy tokens;
executing, by one or more processors based on a token vectorization model, the taxonomy token-informed search operation for searching a database comprising a plurality of distinct corpora of labeled structured datasets, wherein each distinct corpus of the plurality of distinct corpora of labeled structured datasets includes a grouping of taxonomy-labeled structured datasets having an attribution of distinct hierarchical taxonomy metadata, and wherein the executing the taxonomy token-informed search operation includes:
evaluating the one or more taxonomy tokens of the GUI request against a distinct dataset of taxonomy tokens attributed to said each distinct corpus of the plurality of distinct corpora of labeled structured datasets, and
identifying taxonomy token matches including one or more exact matches or one or more semantic matches between the evaluated one or more taxonomy tokens of the GUI request and the distinct dataset of taxonomy tokens attributed to said each distinct corpus of the plurality of distinct corpora of labeled structured datasets;
identifying a target corpus of taxonomy-labeled structured datasets of the plurality of distinct corpora of labeled structured datasets based on the execution of the taxonomy token-informed search operation and selecting a target corpus of taxonomy-labeled structured datasets having a subject distinct set of taxonomy tokens contributing to the taxonomy token matches;
computing contemporaneously with the execution of the taxonomy token-informed search operation, by the one or more processors, a summary artifact that is derived based on the distinct hierarchical taxonomy metadata attributed to the identified target corpus of taxonomy-labeled structured datasets, wherein the summary artifact includes slots of a precis template and a computer-generated precis derived based on content included in the identified target corpus of the taxonomy-labeled structured datasets; and
returning, via a display of the GUI, a response to the GUI request that includes a plurality of taxonomy-labeled structured datasets from the target corpus of taxonomy-labeled structured datasets and the summary artifact based on the identification of the identified target corpus of taxonomy-labeled structured datasets and the computing of the summary artifact.
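
The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (Corpus, extract_tokens, embed, match_tokens, search, PRECIS_TEMPLATE) is hypothetical, as are the 0.5 similarity threshold and the rule of choosing the corpus with the most token matches. The claim does not specify the token vectorization model, so a character-trigram count vector with cosine similarity stands in for it here; that stand-in captures only lexical closeness, not true semantic similarity.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical precis template; its slots are filled from corpus metadata.
PRECIS_TEMPLATE = "{count} dataset(s) classified under {taxonomy}."

@dataclass
class Corpus:
    """One corpus of taxonomy-labeled structured datasets (hypothetical shape)."""
    taxonomy_path: tuple[str, ...]     # hierarchical taxonomy metadata
    taxonomy_tokens: frozenset[str]    # taxonomy tokens attributed to the corpus
    datasets: tuple[str, ...]          # dataset identifiers

def extract_tokens(request: str) -> list[str]:
    """Extract lowercase alphanumeric tokens from the request text."""
    return re.findall(r"[a-z0-9]+", request.lower())

def embed(token: str) -> Counter:
    """Stand-in vectorization: counts of character trigrams in the token."""
    padded = f"#{token}#"
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def match_tokens(tokens, corpus, threshold=0.5):
    """Return (request_token, corpus_token, kind) for exact or near matches."""
    matches = []
    for tok in tokens:
        if tok in corpus.taxonomy_tokens:
            matches.append((tok, tok, "exact"))
            continue
        scored = [(cosine(embed(tok), embed(c)), c) for c in corpus.taxonomy_tokens]
        if scored:
            score, best = max(scored)
            if score >= threshold:  # assumed cutoff for a "semantic" match
                matches.append((tok, best, "semantic"))
    return matches

def search(request: str, corpora: list[Corpus]):
    """Pick the corpus with the most token matches and build its summary."""
    tokens = extract_tokens(request)
    scored = [(match_tokens(tokens, c), c) for c in corpora]
    if not scored:
        return None
    matches, target = max(scored, key=lambda pair: len(pair[0]))
    if not matches:
        return None
    slots = {"count": len(target.datasets),
             "taxonomy": " > ".join(target.taxonomy_path)}
    summary = {"slots": slots, "precis": PRECIS_TEMPLATE.format(**slots)}
    return {"datasets": list(target.datasets), "matches": matches,
            "summary": summary}

corpora = [
    Corpus(("Business", "Finance"),
           frozenset({"finance", "customer", "billing"}),
           ("invoices_2022", "accounts")),
    Corpus(("Health", "Clinical"),
           frozenset({"patient", "clinical"}),
           ("admissions",)),
]
result = search("Customer finances", corpora)
print(result["summary"]["precis"])
```

In this sketch the summary is built after the target corpus is chosen, whereas the claim recites computing it contemporaneously with the search; a closer approximation would precompute or concurrently compute the summary for each candidate corpus.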