US 11,816,077 B2
Measuring data quality in a structured database through SQL
Paul Lecaillon, Dhahran (SA); Rami Majed Aljawad, Al Qatif (SA); Ning Li, Dhahran (SA); and Mohammed Jebreel Hakami, Dhahran (SA)
Assigned to Saudi Arabian Oil Company, Dhahran (SA)
Filed by Saudi Arabian Oil Company, Dhahran (SA)
Filed on Mar. 2, 2021, as Appl. No. 17/189,946.
Prior Publication US 2022/0283996 A1, Sep. 8, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/20 (2019.01); G06F 16/215 (2019.01); G06F 16/2455 (2019.01); G06F 16/28 (2019.01); G06F 11/34 (2006.01); G06F 11/30 (2006.01)
CPC G06F 16/215 (2019.01) [G06F 11/3072 (2013.01); G06F 11/3428 (2013.01); G06F 16/24564 (2019.01); G06F 16/287 (2019.01)]
20 Claims
 
1. A computer-implemented method, comprising:
launching one or more scripts in a batch job on a hardware processor, wherein the one or more scripts describe a plurality of tables that store
(i) metadata that characterize a hierarchy of exploration data assets, wherein the hierarchy of exploration data assets comprises databases holding data records obtained from a geophysical exploration when core samples are extracted from a plurality of wells drilled during the geophysical exploration for measurements,
(ii) metadata that characterize a set of data quality rules, wherein the set of data quality rules specify at least one physical relationship between core samples whose measurements, as obtained from the geophysical exploration, are captured in at least two data records of the databases, and
(iii) metadata that characterize defects identifiable as defective data records in the databases of the hierarchy of exploration data assets, wherein defective data records fail to comply with the set of data quality rules;
wherein when the batch job is executed, the hardware processor performs operations of:
querying the hierarchy of exploration data assets according to one or more data quality rules from the set of data quality rules;
identifying instances of defective data records that fail to meet the one or more data quality rules;
based on analyzing the instances of defective data records, calculating one or more data quality metrics for the hierarchy of exploration data assets, wherein the one or more data quality metrics comprise an aggregated statistic measure of defective data records in the databases of the hierarchy of exploration data assets;
continuously monitoring the one or more data quality metrics for the hierarchy of exploration data assets; and
in response to the one or more data quality metrics exceeding a threshold, alerting an operator.
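
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It uses Python's standard `sqlite3` module with an in-memory database, and every table name, column name, and helper function (`core_sample`, `dq_asset`, `dq_rule`, `dq_defect`, `build_demo_db`, `run_batch`) is hypothetical. The example rule, that two core samples from the same well must not cover overlapping depth intervals, is an assumed instance of a "physical relationship between core samples" captured in at least two data records. The metric is assumed to be a simple defect rate, and the alert is reduced to a returned list of asset names.

```python
import sqlite3

# Hypothetical rule: a query that returns the ids of defective records.
# Two cores from the same well are flagged if their depth intervals overlap.
OVERLAP_RULE_SQL = """
SELECT DISTINCT a.sample_id
FROM core_sample AS a
JOIN core_sample AS b
  ON a.well = b.well
 AND a.sample_id <> b.sample_id
 AND a.top_depth_m < b.bottom_depth_m
 AND b.top_depth_m < a.bottom_depth_m
"""


def build_demo_db():
    """Create the data table and the three metadata tables (all names assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE core_sample (
            sample_id INTEGER PRIMARY KEY, well TEXT,
            top_depth_m REAL, bottom_depth_m REAL);
        -- (i) asset hierarchy, (ii) rules, (iii) defects
        CREATE TABLE dq_asset (
            asset_id INTEGER PRIMARY KEY, parent_id INTEGER,
            name TEXT, table_name TEXT);
        CREATE TABLE dq_rule (
            rule_id INTEGER PRIMARY KEY, asset_id INTEGER,
            description TEXT, defect_sql TEXT);
        CREATE TABLE dq_defect (
            rule_id INTEGER, asset_id INTEGER, record_id INTEGER);
    """)
    conn.executemany(
        "INSERT INTO core_sample VALUES (?, ?, ?, ?)",
        [(1, "W-1", 1000.0, 1009.0),
         (2, "W-1", 1005.0, 1014.0),   # overlaps sample 1
         (3, "W-1", 1020.0, 1029.0),
         (4, "W-2", 500.0, 509.0)],
    )
    conn.execute("INSERT INTO dq_asset VALUES (1, NULL, 'exploration', NULL)")
    conn.execute(
        "INSERT INTO dq_asset VALUES (2, 1, 'core samples', 'core_sample')")
    conn.execute(
        "INSERT INTO dq_rule VALUES (1, 2, 'no overlapping cores', ?)",
        (OVERLAP_RULE_SQL,),
    )
    return conn


def run_batch(conn, threshold):
    """One pass: query by rule, record defects, compute rates, flag alerts."""
    conn.execute("DELETE FROM dq_defect")
    rules = conn.execute(
        "SELECT rule_id, asset_id, defect_sql FROM dq_rule").fetchall()
    for rule_id, asset_id, defect_sql in rules:
        defects = conn.execute(defect_sql).fetchall()
        conn.executemany(
            "INSERT INTO dq_defect VALUES (?, ?, ?)",
            [(rule_id, asset_id, rec_id) for (rec_id,) in defects],
        )

    metrics = {}
    assets = conn.execute(
        "SELECT asset_id, name, table_name FROM dq_asset "
        "WHERE table_name IS NOT NULL").fetchall()
    for asset_id, name, table_name in assets:
        # table_name comes from our own metadata table, not from user input.
        total = conn.execute(
            f'SELECT COUNT(*) FROM "{table_name}"').fetchone()[0]
        bad = conn.execute(
            "SELECT COUNT(DISTINCT record_id) FROM dq_defect "
            "WHERE asset_id = ?", (asset_id,)).fetchone()[0]
        metrics[name] = bad / total if total else 0.0

    # Assets whose defect rate exceeds the threshold would trigger an alert.
    alerts = [name for name, rate in metrics.items() if rate > threshold]
    return metrics, alerts


conn = build_demo_db()
metrics, alerts = run_batch(conn, threshold=0.25)
print(metrics, alerts)
```

In this sketch `run_batch` performs a single pass. The continuous monitoring recited in the claim would correspond to re-running it on a schedule, for example from a batch scheduler, and routing the returned alerts to an operator.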