US 11,704,497 B2
Generating and using a sentence model for answer generation
Kyle Croutwater, Chapel Hill, NC (US); Zhe Zhang, Cary, NC (US); Vikrant Verma, Raleigh, NC (US); and Le Zhang, Cary, NC (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Sep. 9, 2020, as Appl. No. 17/15,663.
Prior Publication US 2022/0075951 A1, Mar. 10, 2022
Int. Cl. G06F 40/30 (2020.01); G06F 16/901 (2019.01)
CPC G06F 40/30 (2020.01) [G06F 16/9024 (2019.01)]
20 Claims
1. A computer-implemented method comprising:
ingesting, by one or more computer processors, a first corpus of a plurality of text sentences;
converting, by one or more computer processors, the plurality of text sentences into a plurality of sentence vectors, wherein a sentence vector is a numerical coordinate representation of a sentence in an x-y plane;
grouping, by one or more computer processors, the plurality of sentence vectors into a plurality of sentence clusters, wherein a sentence cluster is composed of sentence vectors that are semantically similar;
receiving, by one or more computer processors, a second corpus of a plurality of text sentences;
determining, by one or more computer processors, a meaning of each sentence of the second corpus;
based on the determined meaning, assigning, by one or more computer processors, each sentence of the second corpus to a sentence cluster of the plurality of sentence clusters;
determining, by one or more computer processors, for each sentence cluster of the plurality of sentence clusters, a frequency each sentence cluster appears in the second corpus;
based on the determined frequency, calculating, by one or more computer processors, a probability associated with each sentence cluster that appears in the second corpus, wherein the probability is a total number of sentence clusters in the second corpus divided by the determined frequency; and
based on the calculated probabilities, generating, by one or more computer processors, a first sentence model.
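
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything named in it is an assumption made for illustration: the two topic vocabularies, the keyword-count embedding that stands in for a real sentence encoder, the leader-style clustering with a fixed radius, and the function names. The "total number of sentence clusters in the second corpus" is interpreted here as the total count of cluster assignments, that is, one per sentence. The sketch reports the ratio as worded in the claim (total divided by frequency) and, for comparison, the conventional relative frequency (frequency divided by total).

```python
import math
from collections import Counter

# Hypothetical topic vocabularies, used only to build a toy 2-D embedding.
WEATHER = {"rain", "sun", "storm", "cloud"}
FOOD = {"pizza", "bread", "soup", "eat"}


def embed(sentence):
    """Map a sentence to an (x, y) point: counts of weather and food words."""
    words = sentence.lower().split()
    return (sum(w in WEATHER for w in words), sum(w in FOOD for w in words))


def build_clusters(vectors, radius=1.5):
    """Leader clustering: a vector farther than `radius` from every
    existing leader starts a new cluster; otherwise it joins one."""
    leaders = []
    for v in vectors:
        if all(math.dist(v, leader) > radius for leader in leaders):
            leaders.append(v)
    return leaders


def assign(vector, leaders):
    """Return the index of the nearest cluster leader."""
    return min(range(len(leaders)), key=lambda i: math.dist(vector, leaders[i]))


def build_sentence_model(first_corpus, second_corpus, radius=1.5):
    """Cluster the first corpus, then score each cluster by how often
    sentences of the second corpus are assigned to it."""
    leaders = build_clusters([embed(s) for s in first_corpus], radius)
    counts = Counter(assign(embed(s), leaders) for s in second_corpus)
    total = sum(counts.values())  # one cluster assignment per sentence
    return {
        cluster_id: {
            "frequency": n,
            "total_over_frequency": total / n,  # ratio as worded in the claim
            "relative_frequency": n / total,    # conventional probability
        }
        for cluster_id, n in counts.items()
    }


first = ["rain and storm today", "the sun is out", "eat pizza", "warm bread"]
second = ["storm cloud ahead", "eat soup", "more rain", "sun after rain"]
model = build_sentence_model(first, second)
for cluster_id in sorted(model):
    print(cluster_id, model[cluster_id])
```

With these toy corpora the first corpus yields two clusters (weather and food); three of the four sentences in the second corpus fall in the weather cluster and one in the food cluster. A production system would replace `embed` with a learned sentence encoder and the clustering step with a method suited to higher-dimensional vectors.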