| US 7,548,929 B2 | ||
| System and method for determining semantically related terms | ||
| Robert J. Collins, Carlsbad, Calif. (US); Graham Harris, Altadena, Calif. (US); Jesse Harris, Van Nuys, Calif. (US); Grant Kushida, Los Angeles, Calif. (US); Lance Riedel, Altadena, Calif. (US); Mohammad Sabah, North Hollywood, Calif. (US); Shaji Sebastian, Pasadena, Calif. (US); Jeff Yuan, Pasadena, Calif. (US); and Yiping Zhou, Sunnyvale, Calif. (US) | ||
| Assigned to Yahoo! Inc., Sunnyvale, Calif. (US) | ||
| Filed on May 11, 2006, as Appl. No. 11/432,266. | ||
| Claims priority of provisional application 60/703904, filed on Jul. 29, 2005. | ||
| Prior Publication US 2007/0027864 A1, Feb. 01, 2007 | ||
| Int. Cl. G06F 17/30 (2006.01) | ||
| U.S. Cl. 707—101 | 23 Claims |

| 1. A method for determining semantically related terms, comprising:
receiving one or more seed terms;
searching a first index to determine a plurality of webpages associated with the seed terms, the first index comprising a
plurality of terms and for each term of the plurality of terms, an association between one or more webpages and the term;
searching a second index to determine a plurality of potential terms associated with the plurality of webpages associated
with the seed terms, the second index comprising a plurality of identifiers for webpages and for each webpage of the plurality
of identifiers for webpages, an association between one or more terms and the webpage;
sending at least one term of the plurality of potential terms to a user to suggest the at least one term of the plurality
of potential terms to the user;
receiving an indication of relevance of at least one suggested term to the user;
modifying with a processor the terms which comprise the seed terms based at least in part on the received indication of relevance;
receiving an indication that a first term is relevant to the user; and
modifying with a processor the seed terms to comprise the first term as a positive seed term;
wherein receiving one or more seed terms comprises:
receiving a location of a webpage;
retrieving with a processor the content of the webpage from the location of the webpage;
stripping code from the content of the webpage with a processor;
pulling one or more terms from the content of the webpage; and
weighing each term of the one or more terms pulled from the content of the webpage with a processor based on a location of
where the term was located on the webpage.
|
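
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the first index is a term-to-webpages map and the second a webpage-to-terms map, that candidate terms are ranked by how many matched webpages contain them, and that page content is supplied as an HTML string rather than fetched from its location. The names `SeedExtractor`, `build_indexes`, `suggest_terms`, and `apply_feedback`, and the values in `LOCATION_WEIGHTS`, are hypothetical; the claim does not specify any of them.

```python
import re
from collections import Counter, defaultdict
from html.parser import HTMLParser

# Hypothetical weights by page location; the claim gives no concrete values.
LOCATION_WEIGHTS = {"title": 3.0, "h1": 2.0}
SKIPPED_TAGS = {"script", "style"}  # code to strip from the page content


class SeedExtractor(HTMLParser):
    """Strips markup and weighs each term by the tag it appeared in."""

    def __init__(self):
        super().__init__()
        self.stack = []                    # currently open tags
        self.weights = defaultdict(float)  # term -> accumulated weight

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop back to (and including) the matching open tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if any(t in SKIPPED_TAGS for t in self.stack):
            return
        weight = max([LOCATION_WEIGHTS.get(t, 1.0) for t in self.stack],
                     default=1.0)
        for term in re.findall(r"[a-z0-9]+", data.lower()):
            self.weights[term] += weight


def build_indexes(pages):
    """pages: {page_id: list of terms} -> (term_to_pages, page_to_terms)."""
    term_to_pages = defaultdict(set)   # assumed shape of the first index
    page_to_terms = {}                 # assumed shape of the second index
    for page_id, terms in pages.items():
        page_to_terms[page_id] = set(terms)
        for term in terms:
            term_to_pages[term].add(page_id)
    return term_to_pages, page_to_terms


def suggest_terms(seeds, term_to_pages, page_to_terms):
    """Look up pages for the seeds, then the terms on those pages."""
    pages = set()
    for seed in seeds:
        pages |= term_to_pages.get(seed, set())
    counts = Counter()
    for page_id in pages:
        counts.update(page_to_terms[page_id])
    # Rank by number of matched pages (an assumption), ties alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked if term not in seeds]


def apply_feedback(seeds, term, relevant):
    """Add a term the user marked relevant as a positive seed term."""
    return seeds | {term} if relevant else seeds
```

A short usage example, with made-up page data:

```python
pages = {
    "p1": ["camera", "lens", "tripod"],
    "p2": ["camera", "lens", "flash"],
    "p3": ["guitar", "amp"],
}
term_to_pages, page_to_terms = build_indexes(pages)
seeds = {"camera"}
print(suggest_terms(seeds, term_to_pages, page_to_terms))
seeds = apply_feedback(seeds, "lens", relevant=True)
print(suggest_terms(seeds, term_to_pages, page_to_terms))
```

Keeping both maps lets each lookup direction run without scanning the whole collection, which is the apparent reason the claim recites two separate indexes.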