US 7,613,787 B2
Efficient algorithm for finding candidate objects for remote differential compression
Mark S. Manasse, San Francisco, Calif. (US); Dan Teodosiu, Bellevue, Wash. (US); and Akhil Wable, Pittsburgh, Pa. (US)
Assigned to Microsoft Corporation, Redmond, Wash. (US)
Filed on Sep. 24, 2004, as Appl. No. 10/948,980.
Prior Publication US 2006/0085561 A1, Apr. 20, 2006
Int. Cl. G06F 15/16 (2006.01)
U.S. Cl. 709—217  [709/247; 709/238] 17 Claims
 
1. A method for identifying objects for use in remote differential compression, the method comprising:
calculating traits for an object, wherein the calculating traits comprises:
partitioning the object into chunks;
computing signatures for each of the object chunks;
grouping the signatures into shingles;
computing at least one shingle signature for each of the shingles;
mapping the shingle signatures into image sets;
calculating pre-traits from the image sets; and
computing the traits using the pre-traits, wherein the traits are smaller in size as compared to the pre-traits;
using the traits to identify candidate objects that are similar to the object; and
selecting final objects from the identified candidate objects.
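
The steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation; it is one possible reading under stated assumptions. Fixed-size chunking, the BLAKE2b hash, taking the minimum of each image set as the pre-trait (a min-hash style choice), keeping the low bits of each pre-trait as the trait, and ranking by the count of matching traits are all assumptions made for illustration. The names compute_traits, count_matches, find_candidates, select_final and the parameters chunk_size, q, t, b are hypothetical.

```python
import hashlib


def _hash64(data: bytes, salt: int = 0) -> int:
    """64-bit hash; `salt` selects one member of a family of hash functions."""
    salt_bytes = salt.to_bytes(8, "little")
    digest = hashlib.blake2b(data, digest_size=8, salt=salt_bytes).digest()
    return int.from_bytes(digest, "big")


def compute_traits(data: bytes, chunk_size: int = 64, q: int = 3,
                   t: int = 16, b: int = 6) -> list:
    # Partition the object into chunks (fixed-size here: an assumption).
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    # Compute a signature for each chunk.
    sigs = [_hash64(c) for c in chunks]
    # Group the signatures into shingles of q consecutive signatures.
    n_shingles = max(1, len(sigs) - q + 1)
    shingles = [sigs[i:i + q] for i in range(n_shingles)]
    # Compute one signature per shingle.
    shingle_sigs = [
        _hash64(b"".join(s.to_bytes(8, "big") for s in sh)) for sh in shingles
    ]
    # Map the shingle signatures into t image sets, one per hash function.
    image_sets = [
        [_hash64(s.to_bytes(8, "big"), salt=k + 1) for s in shingle_sigs]
        for k in range(t)
    ]
    # Pre-traits: the minimum element of each image set (assumed choice).
    pre_traits = [min(img) for img in image_sets]
    # Traits: keep only b bits of each pre-trait, so traits are smaller.
    mask = (1 << b) - 1
    return [p & mask for p in pre_traits]


def count_matches(traits_a: list, traits_b: list) -> int:
    """Number of positions at which two trait lists agree."""
    return sum(1 for x, y in zip(traits_a, traits_b) if x == y)


def find_candidates(target_traits: list, known: dict, min_matches: int = 1) -> list:
    """Return (matches, name) pairs with at least min_matches, best first."""
    scored = [(count_matches(target_traits, tr), name) for name, tr in known.items()]
    kept = [(m, name) for m, name in scored if m >= min_matches]
    return sorted(kept, key=lambda pair: (-pair[0], pair[1]))


def select_final(candidates: list, n: int = 1) -> list:
    """Select the n best-scoring candidate names as the final objects."""
    return [name for _, name in candidates[:n]]
```

In this sketch, a caller would compute traits once per stored object, keep them in a dictionary keyed by object name, and pass the traits of a new object to find_candidates followed by select_final. Because each trait is only b bits wide, unrelated objects can match on a few traits by chance, which is why candidates are ranked by match count rather than accepted on a single match.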