US 11,816,157 B2
Efficient storage and query of schemaless data
Luis Alonso, Mountain View, CA (US); Vladislav Grachev, Bothell, WA (US); Hossein Ahmadi, Seattle, WA (US); Srinagesh Susarla, Saratoga, CA (US); Francis Lan, San Francisco, CA (US); Srinidhi Raghavan, Mountain View, CA (US); Vinay Balasubramaniam, Sammamish, WA (US); and Oleksandr Blyzniuchenko, Redmond, WA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 5, 2021, as Appl. No. 17/308,986.
Prior Publication US 2022/0358160 A1, Nov. 10, 2022
Int. Cl. G06F 16/84 (2019.01); G06F 16/2452 (2019.01); G06F 16/22 (2019.01); G06F 16/25 (2019.01)
CPC G06F 16/86 (2019.01) [G06F 16/2282 (2019.01); G06F 16/2452 (2019.01); G06F 16/258 (2019.01)]
18 Claims
1. A computer-implemented method when executed by data processing hardware causes the data processing hardware to perform operations comprising:
receiving user data from a user of a query system, the user data comprising semi-structured user data, the query system in communication with a database, the database comprising a table comprising semi-structured data and structured data;
receiving an indication that the semi-structured user data fails to comprise a fixed schema;
in response to the indication that the semi-structured user data fails to comprise the fixed schema, extracting, without user input, a schema for the semi-structured user data by:
parsing the semi-structured user data into a plurality of data paths; and
extracting a data type associated with each respective data path of the plurality of data paths;
storing, according to the extracted schema, the semi-structured user data as a row entry in the table of the database in communication with the query system, wherein each column value of the extracted schema associated with the row entry corresponds to a respective one of the plurality of data paths and the data type associated with the respective data path;
receiving, from the user of the query system, a query for data associated with the database; and
in response to the query:
determining, based on the extracted schema, a respective data path of the stored semi-structured data that is responsive to the query;
generating, using the extracted schema, a query response comprising a respective column value of the row entry corresponding to the respective data path of the stored semi-structured data responsive to the query; and
transmitting the query response to the user.
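
For illustration only, the operations recited in claim 1 might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: it assumes the semi-structured user data arrives as JSON, uses an in-memory list in place of the database table, and introduces the hypothetical names extract_schema, Table, insert, and query, none of which appear in the patent.

```python
import json


def extract_schema(value, prefix=""):
    """Flatten a parsed JSON value into {data_path: (type_name, leaf_value)}.

    Hypothetical helper: each leaf becomes one data path with its data type.
    """
    columns = {}
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            columns.update(extract_schema(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            columns.update(extract_schema(child, f"{prefix}[{index}]"))
    else:
        # Leaf value: record the Python type name as the extracted data type.
        columns[prefix] = (type(value).__name__, value)
    return columns


class Table:
    """Hypothetical in-memory stand-in for the database table."""

    def __init__(self):
        self.schema = {}  # data path -> set of type names seen for that path
        self.rows = []    # one dict per row entry: data path -> column value

    def insert(self, raw_json):
        """Extract a schema without user input and store the data as a row."""
        columns = extract_schema(json.loads(raw_json))
        for path, (type_name, _) in columns.items():
            self.schema.setdefault(path, set()).add(type_name)
        self.rows.append({path: val for path, (_, val) in columns.items()})

    def query(self, path):
        """Return the column values stored under a data path, if any."""
        if path not in self.schema:
            return []
        return [row[path] for row in self.rows if path in row]


table = Table()
table.insert('{"user": {"name": "Ada", "age": 36}, "tags": ["a", "b"]}')
table.insert('{"user": {"name": "Lin"}}')

print(table.query("user.name"))  # column values for the path user.name
print(table.schema["user.age"])  # data types recorded for the path user.age
```

In this sketch each leaf of the JSON document becomes one column keyed by its data path, so a query for a path is answered by a schema lookup followed by a read of the matching column values, without re-parsing the stored document. Rows that lack a path are simply skipped.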