US 11,755,885 B2
Joint learning of local and global features for entity linking via neural networks
Nicolas R. Fauceglia, Buenos Aires (AR); Alfio M. Gliozzo, Brooklyn, NY (US); Oktie Hassanzadeh, Briarcliff Manor, NY (US); Thien H. Nguyen, Brooklyn, NY (US); Mariano Rodriguez Muro, New York, NY (US); and Mohammad Sadoghi Hamedani, West Lafayette, IN (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 6, 2020, as Appl. No. 16/840,846.
Application 16/840,846 is a continuation of application No. 15/351,897, filed on Nov. 15, 2016, granted, now 10,643,120.
Prior Publication US 2020/0234102 A1, Jul. 23, 2020
Int. Cl. G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/044 (2023.01); G06N 3/084 (2023.01)
CPC G06N 3/045 (2023.01) [G06N 3/044 (2023.01); G06N 3/084 (2013.01)]
14 Claims
 
1. A computer-implemented method comprising:
identifying, using at least one processor, a set of one or more entity mentions in an electronic document, an entity mention to be linked to a page of a plurality of candidate pages in a knowledge base;
representing each entity mention as a plurality of word sequences capturing a context or topic of the entity mention at multiple granularities in the electronic document;
for each entity mention in the electronic document, identifying a set of target candidate pages in the knowledge base that potentially refer to the entity mention in the document;
applying a scoring function to obtain a relevance score for each said target candidate page of the corpus for each mention, said applying a scoring function comprising:
running a convolutional neural network (CNN) model using the plurality of word sequences of the entity mention and a candidate target page of the knowledge base to compute a first score representing a local similarity score between each entity mention and candidate target page, said running a CNN model further comprising forward linking of each entity mention to identified target candidate pages, and ranking forward links based on the first score; and
running a recurrent neural network (RNN) model that simultaneously models an interdependence among the other entity mentions in the document and other candidate pages to compute a second score, said running an RNN model comprising a backward linking of the entity mentions to identified target candidate pages by traversing, using RNN model operations, the entity mentions from the end to the beginning of the electronic document, wherein second scores are computed for all the target candidate pages of all the entity mentions in each document simultaneously, while preserving the order of the entity mentions from the beginning to the end of an input document;
creating a combined linking score by adding the first computed score of a forward linked target candidate page for an entity mention and the second computed score of the backward linked target candidate page for that entity mention;
ranking said target candidate pages based on their combined linking score for the entity mention; and
providing a link for linking the entity mention to the target candidate page of the knowledge base based on a highest combined linking score for the entity mention.
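
Illustrative note (not part of the claim): the score-combination and ranking steps recited above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the patented implementation. The function names `context_windows`, `rank_candidates`, and `link_mentions`, the `window` parameter, the three granularity labels, and all numeric scores are hypothetical. The local and global scores are plain numbers standing in for the outputs of the CNN and RNN models, which are not modeled here.

```python
from typing import Dict, List, Sequence, Tuple

Scores = Dict[str, float]  # candidate page title -> score


def context_windows(tokens: Sequence[str], start: int, end: int,
                    window: int = 2) -> Dict[str, List[str]]:
    """Represent a mention as word sequences at several granularities.

    tokens[start:end] is the mention itself. The window size and the
    three granularity names are assumptions made for illustration.
    """
    lo = max(0, start - window)
    hi = min(len(tokens), end + window)
    return {
        "surface": list(tokens[start:end]),   # the mention text only
        "context": list(tokens[lo:hi]),       # mention plus nearby words
        "document": list(tokens),             # the whole document
    }


def rank_candidates(local: Scores, global_: Scores) -> List[Tuple[str, float]]:
    """Add the first (local) and second (global) score for each candidate
    page and return the candidates sorted by combined score, best first."""
    combined = {page: local[page] + global_[page] for page in local}
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


def link_mentions(local: Dict[str, Scores],
                  global_: Dict[str, Scores]) -> Dict[str, str]:
    """Link each mention to its candidate with the highest combined score."""
    return {
        mention: rank_candidates(local[mention], global_[mention])[0][0]
        for mention in local
    }


if __name__ == "__main__":
    # Placeholder scores: the local score alone prefers "Paris_Hilton",
    # but the global score shifts the combined ranking.
    local_scores = {"Paris": {"Paris_(France)": 0.5, "Paris_Hilton": 0.75}}
    global_scores = {"Paris": {"Paris_(France)": 0.5, "Paris_Hilton": 0.0}}
    print(link_mentions(local_scores, global_scores))
```

In this toy input the local score alone would rank the second candidate first. Adding the global score reverses the order, which is the effect the combined linking score is meant to capture.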