| US 7,580,926 B2 | ||
| Method and apparatus for representing text using search engine, document collection, and hierarchal taxonomy | ||
| Shyam Kapur, Sunnyvale, Calif. (US); Ayman O. Farahat, San Francisco, Calif. (US); and Richard E. Chatwin, Sunnyvale, Calif. (US) | ||
| Assigned to Adchemy, Inc., Redwood City, Calif. (US) | ||
| Filed on Dec. 01, 2006, as Appl. No. 11/607,191. | ||
| Claims priority of provisional application 60/742,023, filed on Dec. 01, 2005. | ||
| Prior Publication US 2007/0136256 A1, Jun. 14, 2007 | ||
| Int. Cl. G06F 17/30 (2006.01); G06F 7/00 (2006.01) | ||
| U.S. Cl. 707—3 | 36 Claims |

| 1. A method comprising:
accessing a target comprising a target alphanumerical string that comprises one or more words;
determining one or more tokens from one or more documents, each token comprising a token alphanumerical string that comprises
one or more words and appears at least once in the documents;
determining a token order among the tokens;
communicating the target as an original query to a search engine;
accessing one or more original search results returned by the search engine in response to the original query;
calculating an original value for each token based at least in part on a first number of appearances of the token in the documents
and a second number of appearances of the token in the original search results;
generating a target vector representing the target, the target vector comprising the original values calculated for the tokens
ordered according to the token order;
for each of a plurality of candidates comprising a candidate alphanumerical string that comprises one or more words:
generating a candidate vector representing the candidate; and
calculating a distance between the candidate vector and the target vector; and
selecting a candidate from the plurality of candidates that is related to the target based on a comparison among the distances
between the candidate vectors and the target vector.
|
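The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not fix a formula for the token value, a distance measure, or a search engine interface, so the choices below are assumptions made only for illustration: tokens are single lowercase words, the token order is alphabetical, the value is the ratio of appearances in the search results to appearances in the documents, the distance is Euclidean, and candidate vectors are built the same way as the target vector. The names `determine_tokens`, `build_vector`, `select_related`, and `fake_search` are hypothetical, and `fake_search` is a canned stand-in for a real search engine.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Sequence

SearchFn = Callable[[str], List[str]]  # query -> list of result texts


def words(text: str) -> List[str]:
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def determine_tokens(documents: Sequence[str]) -> List[str]:
    """Tokens are the distinct words in the documents; order is alphabetical (assumed)."""
    return sorted({w for doc in documents for w in words(doc)})


def build_vector(query: str, tokens: Sequence[str],
                 documents: Sequence[str], search: SearchFn) -> List[float]:
    """Send the query to the search engine and compute one value per token."""
    doc_counts = Counter(w for doc in documents for w in words(doc))
    result_counts = Counter(w for res in search(query) for w in words(res))
    # Assumed value: appearances in results / appearances in documents.
    # Tokens come from the documents, so doc_counts[t] is at least 1.
    return [result_counts[t] / doc_counts[t] for t in tokens]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, one possible choice of distance (assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_related(target: str, candidates: Sequence[str], tokens: Sequence[str],
                   documents: Sequence[str], search: SearchFn) -> str:
    """Return the candidate whose vector is closest to the target vector."""
    target_vec = build_vector(target, tokens, documents, search)
    return min(
        candidates,
        key=lambda c: euclidean(build_vector(c, tokens, documents, search), target_vec),
    )


# Toy data: three documents and a canned search engine (both hypothetical).
documents = ["red car fast car", "blue boat slow boat", "red boat"]
canned = {
    "sports car": ["fast car red car"],
    "race car": ["fast car"],
    "sail boat": ["slow boat blue boat"],
}


def fake_search(query: str) -> List[str]:
    return canned.get(query, [])


tokens = determine_tokens(documents)
best = select_related("sports car", ["race car", "sail boat"],
                      tokens, documents, fake_search)
print(tokens)
print(best)
```

In this toy run the search results for "race car" share the tokens "fast" and "car" with those for "sports car", so its vector lies nearer the target vector than the vector for "sail boat". Any other value formula or distance measure consistent with the claim language could be substituted without changing the overall flow.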