CPC G06F 16/254 (2019.01) [G06F 16/2358 (2019.01); G06F 16/283 (2019.01); G06F 16/951 (2019.01)] | 20 Claims |
1. A system, comprising:
a plurality of computing devices, respectively comprising at least one processor and a memory, configured to implement a data catalog service as part of a provider network, wherein the data catalog service is configured to:
identify a plurality of data sets maintained in different storage locations;
receive respective selections of one or more recognizers of a plurality of different recognizers corresponding to a plurality of different file formats supported by the data catalog service to apply to data scanned from the plurality of data sets;
add respective structural data for the plurality of data sets to a data catalog that provides a centralized location for searching for desired data sets, wherein the respective structural data makes a client capable of interpreting between different items within the plurality of data sets when the client connects to data sources for individual ones of the plurality of data sets, and wherein the adding comprises:
access the different storage locations to apply the selected one or more recognizers to the data scanned from the plurality of data sets to determine the respective structural data; and
store the respective structural data as part of the data catalog; and
provide access to the respective structural data in the data catalog in response to one or more requests to access the data catalog.
|
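The flow recited in claim 1 (select recognizers, scan data sets in different storage locations, store the resulting structural data, serve it on request) can be illustrated with a minimal sketch. This is not the patented implementation. Every name here (`DataCatalog`, `add_data_set`, `get_structure`, `json_recognizer`, `csv_recognizer`, and the storage location strings) is hypothetical, and an in-memory dictionary stands in for the different storage locations.

```python
import csv
import io
import json
from typing import Callable, Dict, List, Optional

# A recognizer inspects scanned bytes and returns structural data (a schema),
# or None when the data is not in the format it understands.
Recognizer = Callable[[bytes], Optional[dict]]


def json_recognizer(sample: bytes) -> Optional[dict]:
    """Hypothetical recognizer for newline-delimited JSON objects."""
    try:
        first_line = sample.decode("utf-8").splitlines()[0]
        record = json.loads(first_line)
    except (UnicodeDecodeError, IndexError, json.JSONDecodeError):
        return None
    if not isinstance(record, dict):
        return None
    columns = {key: type(value).__name__ for key, value in record.items()}
    return {"format": "json", "columns": columns}


def csv_recognizer(sample: bytes) -> Optional[dict]:
    """Hypothetical recognizer for comma-separated text with a header row."""
    try:
        text = sample.decode("utf-8")
    except UnicodeDecodeError:
        return None
    rows = list(csv.reader(io.StringIO(text)))
    if len(rows) < 2 or len(rows[0]) < 2:
        return None
    return {"format": "csv", "columns": {name: "str" for name in rows[0]}}


class DataCatalog:
    """Toy catalog: a central place to look up structural data per data set."""

    def __init__(self, recognizers: Dict[str, Recognizer],
                 storage: Dict[str, bytes]) -> None:
        self._recognizers = recognizers
        self._storage = storage  # stand-in for the different storage locations
        self._entries: Dict[str, dict] = {}

    def add_data_set(self, location: str, selected: List[str]) -> Optional[dict]:
        """Apply the selected recognizers in order; store the first match."""
        sample = self._storage[location]  # "scan" the data set
        for name in selected:
            structure = self._recognizers[name](sample)
            if structure is not None:
                self._entries[location] = structure
                return structure
        return None

    def get_structure(self, location: str) -> Optional[dict]:
        """Serve stored structural data in response to a catalog request."""
        return self._entries.get(location)


# Usage: two data sets in two (simulated) storage locations.
storage = {
    "bucket-a/events.json": b'{"id": 1, "name": "a"}\n',
    "warehouse/users.csv": b"id,name\n1,a\n",
}
catalog = DataCatalog({"json": json_recognizer, "csv": csv_recognizer}, storage)
for location in storage:
    catalog.add_data_set(location, selected=["json", "csv"])
```

In this sketch the selected recognizers are tried in the order given and the first match wins. That ordering is a design assumption of the example, not something the claim specifies. A client that later connects to either data source could call `get_structure` to learn the column layout before reading the data.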