US 11,809,384 B2
Optimized data storage for fast retrieval
Michael Feldman, Pardesiya (IL); Nir Nice, Kfar Veradim (IL); Nimrod Ben Simhon, Netanya (IL); and Ayelet Kroskin, Ra'Anana (IL)
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC, Redmond, WA (US)
Filed by MICROSOFT TECHNOLOGY LICENSING, LLC, Redmond, WA (US)
Filed on Mar. 6, 2017, as Appl. No. 15/450,782.
Prior Publication US 2018/0253449 A1, Sep. 6, 2018
Int. Cl. G06F 16/21 (2019.01)
CPC G06F 16/21 (2019.01)
20 Claims
1. A computer-implemented method for creating an optimized data structure, the computer-implemented method comprising:
receiving a data file comprising a dataset, wherein the dataset consists of a plurality of records having key-value pairs;
setting at least one bucketing limit, wherein bucketing limits control a resulting data structure;
calculating a value size for each value in the dataset;
creating buckets based at least in part on the value size and the bucketing limit, wherein the buckets comprise a plurality of slots, wherein each slot in a single bucket is equal in size, and wherein different buckets comprise slots of different sizes;
dividing each value into at least one of the created buckets based on a size range, wherein each value is bucketed using a hash function;
adjusting each bucket by removing duplicative values across all buckets to create a structured data file based at least in part on the bucketing limit; and
storing the structured data file along with metadata.
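
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, only one possible reading under stated assumptions: the bucketing limit is taken to be a maximum number of buckets, slot sizes are assumed to be power-of-two size classes, SHA-256 is assumed as the hash function, and values are padded with zero bytes to fill a slot. The names build_structured_file, lookup, and max_buckets are hypothetical and do not appear in the patent.

```python
import hashlib


def next_pow2(n):
    """Smallest power of two >= n (at least 1). Assumed slot-size rule."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()


def build_structured_file(records, max_buckets=4):
    """Bucket the values of `records` (dict of key -> bytes) into fixed-size slots.

    `max_buckets` is a hypothetical stand-in for the claim's bucketing limit.
    """
    if max_buckets < 1:
        raise ValueError("max_buckets must be at least 1")

    # Calculate a value size for each value in the dataset.
    sizes = {key: len(value) for key, value in records.items()}

    # Create buckets: one per size class, keeping only the largest
    # `max_buckets` classes so the limit controls the resulting structure.
    slot_sizes = sorted({next_pow2(s) for s in sizes.values()})[-max_buckets:]
    buckets = {slot: {} for slot in slot_sizes}  # slot size -> {digest: padded value}

    index = {}  # key -> (slot size, digest, original length)
    seen = {}   # digest -> slot size; used to drop duplicate values
    for key, value in records.items():
        digest = hashlib.sha256(value).hexdigest()  # assumed hash function
        if digest not in seen:
            # Smallest remaining slot size that can hold the value.
            slot = next(s for s in slot_sizes if s >= len(value))
            buckets[slot][digest] = value.ljust(slot, b"\0")
            seen[digest] = slot
        index[key] = (seen[digest], digest, len(value))

    # Metadata stored alongside the structured data.
    metadata = {
        "slot_sizes": slot_sizes,
        "max_buckets": max_buckets,
        "record_count": len(records),
    }
    return buckets, index, metadata


def lookup(buckets, index, key):
    """Retrieve the original value for `key`, stripping the slot padding."""
    slot, digest, length = index[key]
    return buckets[slot][digest][:length]
```

In this sketch every slot within a bucket has the same size, different buckets have different slot sizes, and a duplicated value is stored once while each key that referenced it still resolves through the index. Fixed-size slots are what would allow an on-disk layout to locate a value by offset arithmetic; the sketch keeps everything in dictionaries for brevity.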