US 11,704,331 B2
Dynamic generation of data catalogs for accessing data
Andrew Edward Caldwell, Santa Clara, CA (US); Anurag Windlass Gupta, Atherton, CA (US); Mehul A. Shah, Saratoga, CA (US); Prajakta Datta Damle, San Jose, CA (US); and George Steven McPherson, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jul. 10, 2020, as Appl. No. 16/926,537.
Application 16/926,537 is a continuation of application No. 15/199,505, filed on Jun. 30, 2016, granted, now 10,713,272.
Prior Publication US 2020/0409967 A1, Dec. 31, 2020
Int. Cl. G06F 16/00 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 16/951 (2019.01); G06F 16/23 (2019.01)
CPC G06F 16/254 (2019.01) [G06F 16/2358 (2019.01); G06F 16/283 (2019.01); G06F 16/951 (2019.01)]
20 Claims
1. A system, comprising:
a plurality of computing devices, respectively comprising at least one processor and a memory, configured to implement a data catalog service as part of a provider network, wherein the data catalog service is configured to:
identify a plurality of data sets maintained in different storage locations;
receive respective selections of one or more recognizers of a plurality of different recognizers corresponding to a plurality of different file formats supported by the data catalog service to apply to data scanned from the plurality of data sets;
add respective structural data for the plurality of data sets to a data catalog that provides a centralized location for searching for desired data sets, wherein the respective structural data makes a client capable of interpreting between different items within the plurality of data sets when the client connects to data sources for individual ones of the plurality of data sets, and wherein the adding comprises:
access the different storage locations to apply the selected one or more recognizers to the data scanned from the plurality of data sets to determine the respective structural data; and
store the respective structural data as part of the data catalog; and
provide access to the respective structural data in the data catalog in response to one or more requests to access the data catalog.
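The flow recited in claim 1 (identify data sets, apply selected recognizers to scanned data, store the resulting structural data, and serve it on request) can be illustrated with a minimal sketch. This is not the patented implementation: the names `DataCatalog`, `crawl`, `lookup`, `json_lines_recognizer`, and `csv_recognizer` are hypothetical, in-memory strings stand in for data scanned from separate storage locations, and "structural data" is assumed to be a simple column-to-type mapping.

```python
import csv
import io
import json
from typing import Callable, Dict, List, Optional

# A recognizer inspects scanned data and returns structural data (assumed here
# to be a column-name -> type-name mapping), or None if the format does not match.
Recognizer = Callable[[str], Optional[Dict[str, str]]]


def _guess_type(value: str) -> str:
    """Guess a simple type name for a CSV field (illustrative heuristic)."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            caster(value)
            return name
        except ValueError:
            continue
    return "str"


def json_lines_recognizer(sample: str) -> Optional[Dict[str, str]]:
    """Hypothetical recognizer for newline-delimited JSON objects."""
    lines = [ln for ln in sample.splitlines() if ln.strip()]
    if not lines:
        return None
    try:
        record = json.loads(lines[0])
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    return {key: type(value).__name__ for key, value in record.items()}


def csv_recognizer(sample: str) -> Optional[Dict[str, str]]:
    """Hypothetical recognizer for CSV with a header row."""
    rows = list(csv.reader(io.StringIO(sample)))
    if len(rows) < 2 or len(rows[0]) < 2 or len(rows[0]) != len(rows[1]):
        return None
    return {name: _guess_type(value) for name, value in zip(rows[0], rows[1])}


class DataCatalog:
    """Toy stand-in for the centralized catalog described in the claim."""

    def __init__(self, recognizers: Dict[str, Recognizer]) -> None:
        self._recognizers = recognizers
        self._entries: Dict[str, dict] = {}

    def crawl(self, data_sets: Dict[str, str], selected: List[str]) -> None:
        # Apply the selected recognizers in order; the first match wins.
        for location, scanned in data_sets.items():
            for name in selected:
                structure = self._recognizers[name](scanned)
                if structure is not None:
                    self._entries[location] = {"format": name, "schema": structure}
                    break

    def lookup(self, location: str) -> Optional[dict]:
        # Serve stored structural data in response to a catalog request.
        return self._entries.get(location)


# Usage: two data sets "maintained in different storage locations".
data_sets = {
    "store-a/users.csv": "id,name\n1,alice\n",
    "store-b/events.json": '{"event": "click", "count": 3}\n',
}
catalog = DataCatalog({"json": json_lines_recognizer, "csv": csv_recognizer})
catalog.crawl(data_sets, selected=["json", "csv"])
print(catalog.lookup("store-a/users.csv"))
print(catalog.lookup("store-b/events.json"))
```

In this sketch the order of the selected recognizers matters, since the first recognizer that returns structural data determines the catalog entry; a real service would presumably use a more robust matching strategy.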