| US 7,596,544 B2 | ||
| Tracking set-expression cardinalities over continuous update streams | ||
| Sumit Ganguly, Bhopal (India); Minos Garofalakis, Morristown, N.J. (US); and Rajeev Rastogi, New Providence, N.J. (US) | ||
| Assigned to Alcatel-Lucent USA Inc., Murray Hill, N.J. (US) | ||
| Filed on Dec. 29, 2004, as Appl. No. 11/025,355. | ||
| Prior Publication US 2006/0143218 A1, Jun. 29, 2006 | ||
| Int. Cl. G06F 7/00 (2006.01) | ||
| U.S. Cl. 707—2 | 17 Claims |

| 1. A method of obtaining an estimate of a set-expression cardinality relating to at least a first and second data-stream,
the method comprising the steps of:
using a database management system comprising a computer for:
creating a first hash-sketch synopsis for the first data stream and a second hash-sketch synopsis for the second data stream,
each hash-sketch synopsis comprising a random hash-table and a 2-level hash sketch for each hash-bucket of said random hash-tables,
said 2-level hash sketch comprising a first-level hash-table and a counter array for each hash-bucket of said first-level hash-table;
pre-hashing said first and second data-streams into said first and second random hash tables, respectively;
hashing individual buckets of said random hash tables to the corresponding 2-level hash sketch for each of those buckets;
maintaining said first and said second hash-sketch synopsis using one or more data elements from said first and second data-streams
respectively;
obtaining a set-expression singleton count over said first and second hash-sketch; and
estimating said set-expression cardinality estimate using said set-expression singleton count.
|
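
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. All class and function names (`TwoLevelSketch`, `HashSketchSynopsis`, `singleton_counts`, `ratio_estimate`), the constants `LEVELS` and `BITS`, and the use of BLAKE2b as the hash function are hypothetical choices. The final ratio step is a simplified stand-in for the claimed estimator and assumes the caller supplies an estimate of the union cardinality. Net element counts are assumed to stay non-negative.

```python
import hashlib

LEVELS = 32   # assumed number of first-level buckets per 2-level sketch
BITS = 32     # assumed number of second-level binary hash functions


def _hash64(x, salt):
    """Deterministic 64-bit hash of x under a string salt (BLAKE2b, an assumption)."""
    digest = hashlib.blake2b(f"{salt}:{x}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class TwoLevelSketch:
    """First-level hash table with a counter array per first-level bucket."""

    def __init__(self, seed):
        self.seed = seed
        # counters[level][j][bit]: net count of items whose j-th binary hash == bit
        self.counters = [[[0, 0] for _ in range(BITS)] for _ in range(LEVELS)]

    def update(self, x, delta):
        v = _hash64(x, f"{self.seed}/level")
        # The trailing-zero count of the hash gives a geometrically distributed level.
        level = min((v & -v).bit_length() - 1, LEVELS - 1) if v else LEVELS - 1
        sig = _hash64(x, f"{self.seed}/bits")
        for j in range(BITS):
            self.counters[level][j][(sig >> j) & 1] += delta


class HashSketchSynopsis:
    """Random hash table whose buckets each hold a 2-level hash sketch."""

    def __init__(self, buckets=16, seed="demo"):
        self.buckets, self.seed = buckets, seed
        self.table = [TwoLevelSketch(seed) for _ in range(buckets)]

    def update(self, x, delta=1):
        # Pre-hash the element into a bucket; delta=-1 models a deletion.
        b = _hash64(x, f"{self.seed}/pre") % self.buckets
        self.table[b].update(x, delta)


def is_empty(cell):
    return cell[0][0] == 0 and cell[0][1] == 0


def is_singleton(cell):
    # Each binary hash must see all of the count on exactly one side.
    return all((c0 == 0) != (c1 == 0) for c0, c1 in cell)


def singleton_counts(a, b):
    """Return singleton counts for (A union B, A minus B, A intersect B)."""
    union = diff = inter = 0
    for sa, sb in zip(a.table, b.table):
        for ca, cb in zip(sa.counters, sb.counters):
            merged = [[p[0] + q[0], p[1] + q[1]] for p, q in zip(ca, cb)]
            if not is_singleton(merged):
                continue
            union += 1
            if is_singleton(ca) and is_empty(cb):
                diff += 1          # witness for A minus B
            elif is_singleton(ca) and is_singleton(cb):
                inter += 1         # witness for A intersect B
    return union, diff, inter


def ratio_estimate(union_estimate, witnesses, union_singletons):
    """Simplified estimate: scale the witness fraction by a supplied |A union B|."""
    if union_singletons == 0:
        return 0.0
    return union_estimate * witnesses / union_singletons
```

Both synopses must be built with the same `seed` and `buckets` so that an element lands in the same position in each; otherwise the position-by-position comparison in `singleton_counts` is meaningless. Because updates only add or subtract from counters, insertions and deletions are handled by the same code path.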