US 7,580,931 B2
Topic distillation via subsite retrieval
Tie-Yan Liu, Beijing (China); Tao Qin, Beijing (China); and Wei-Ying Ma, Beijing (China)
Assigned to Microsoft Corporation, Redmond, Wash. (US)
Filed on Mar. 13, 2006, as Appl. No. 11/375,612.
Prior Publication US 2007/0214116 A1, Sep. 13, 2007
Int. Cl. G06F 17/30 (2006.01)
U.S. Cl. 707—6  [707/5] 15 Claims
 
1. A system with a central processing unit and a memory for calculating subtree features for subtrees having root documents, a subtree being a hierarchical organization of documents in which documents have ancestor/descendant relationships, comprising:
a calculate feature component that calculates a feature for each document within a subtree; and
a calculate subtree feature component that calculates a subtree feature for the subtree wherein a contribution of a descendant document of the root document decreases as an ancestral distance between the descendant document and the root document increases and as a number of sibling documents of the descendant document increases;
wherein the subtree feature of a subtree represents an aggregate feature for the root document of the subtree,
wherein the contribution is decreased as represented by the following:

[Equation presented as a drawing in the original publication; not reproduced here.]
where h(p_s) represents height of a subtree with the root document of p_s, R(p_s) represents the child document of p_s, ∥a∥ represents the number of elements of a, and f(p_{iu}) represents the feature of document p_{iu}, and
wherein the components are implemented as computer-executable instructions stored in memory for execution by the central processing unit.
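
The aggregation recited in claim 1 can be illustrated with a short sketch. It is not the claimed equation, which is not reproduced in this text. It assumes a hypothetical recursive weighting in which each document's subtree feature is its own feature plus a decayed average of its children's subtree features. The names `Doc`, `subtree_feature`, and `decay` are illustrative only. Under this assumption a descendant's contribution shrinks by the factor `decay` at each level of ancestral distance and is divided by the number of documents that share its parent.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """A document in a subtree (hypothetical structure for illustration)."""
    feature: float                       # per-document feature, f(p)
    children: List["Doc"] = field(default_factory=list)


def subtree_feature(root: Doc, decay: float = 0.5) -> float:
    """Aggregate feature for the subtree rooted at `root`.

    Assumed weighting, not the patent's equation: the root keeps its own
    feature and adds `decay` times the mean of its children's subtree
    features. Averaging divides each child's share by the number of
    siblings; the repeated `decay` factor shrinks it with depth.
    """
    if not root.children:
        return root.feature
    child_mean = sum(subtree_feature(c, decay) for c in root.children) / len(root.children)
    return root.feature + decay * child_mean


# A chain root -> child -> grandchild, each with feature 1.0.
chain = Doc(1.0, [Doc(1.0, [Doc(1.0)])])
chain_value = subtree_feature(chain)

# The same descendant feature counts for less when it has more siblings.
few_siblings = subtree_feature(Doc(0.0, [Doc(4.0), Doc(0.0)]))
many_siblings = subtree_feature(Doc(0.0, [Doc(4.0), Doc(0.0), Doc(0.0), Doc(0.0)]))
```

In the chain, the grandchild's feature reaches the root weighted by `decay` squared and the child's by `decay`, so deeper documents contribute less. In the two sibling examples, the document with feature 4.0 contributes half as much once its parent has four children instead of two.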