US 9,811,557 B2
Optimizing query statements in relational databases
DongJie Wei, Beijing (CN); Ke W. Wei, Beijing (CN); Xin Ying Yang, Beijing (CN); and Miao Zheng, Beijing (CN)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 11, 2014, as Appl. No. 14/250,505.
Claims priority of application No. 2013 1 0210410 (CN), filed on May 30, 2013.
Prior Publication US 2014/0358894 A1, Dec. 4, 2014
Int. Cl. G06F 7/00 (2006.01); G06F 17/30 (2006.01)
CPC G06F 17/30442 (2013.01) [G06F 17/30451 (2013.01)]
15 Claims
 
1. An optimizing method for a query statement in a relational database, comprising:
determining a filtering performance for each of at least two complex predicates in a predetermined layer in a set of predetermined layers of the query statement based on a result of a query performed on predetermined data records by using only the at least two complex predicates in the predetermined layer;
re-ordering the complex predicates in the predetermined layer based on the filtering performance so as to rank a complex predicate with higher filtering performance before a complex predicate with a lower filtering performance, wherein said re-ordering the complex predicates in the predetermined layer comprises:
determining a number of elements having the value of 1 for each N-variable row vector;
re-ordering the complex predicates in the predetermined layer based on the number of the elements having the value of 1 in respective N-variable row vectors, according to the predicate conjunction between the complex predicates in the predetermined layer; and
wherein the filtering performance is ranked based upon a predicate conjunction; and
determining the filtering performance for a compound predicate comprising the respective predicates in the predetermined layer and conjunctions joining the respective predicates;
wherein said determining filtering performance for each of the complex predicates comprises:
determining sequence numbers of the data records obtained by performing a query using the complex predicate;
creating an N-variable row vector corresponding to the complex predicate to represent the filtering performance of the complex predicate, wherein N is the number of the predetermined data records, elements in the row vector at positions corresponding to the sequence numbers of the data records obtained by performing the query have a value of 1, and elements at other positions have a value of 0;
creating corresponding N-variable row vectors for respective predicates in the predetermined layer for which the filtering performance has not been determined; and
performing the logical operation represented by the predicate conjunction on the corresponding N-variable row vectors of the predicates in the predetermined layer, element by element, so as to obtain an intermediate N-variable row vector representing the filtering performance of the compound predicate.
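
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names (row_vector, combine, reorder), the sample records, and the sample predicates are all hypothetical. The claim does not state the sort direction, so the sketch assumes that under AND the predicate with the fewest 1s is ranked first, and under OR the predicate with the most 1s is ranked first, so that short-circuit evaluation stops as early as possible.

```python
from typing import Callable, Dict, List, Sequence

Record = Dict[str, int]
Predicate = Callable[[Record], bool]


def row_vector(pred: Predicate, records: Sequence[Record]) -> List[int]:
    """N-variable row vector: 1 where the record satisfies pred, else 0."""
    return [1 if pred(r) else 0 for r in records]


def combine(vectors: Sequence[List[int]], conjunction: str) -> List[int]:
    """Apply the conjunction element by element across the row vectors."""
    if conjunction not in ("AND", "OR"):
        raise ValueError("conjunction must be 'AND' or 'OR'")
    op = all if conjunction == "AND" else any
    return [1 if op(column) else 0 for column in zip(*vectors)]


def reorder(vectors: Dict[str, List[int]], conjunction: str) -> List[str]:
    """Rank predicate names by their count of 1s.

    Assumption (not stated in the claim): AND puts the fewest 1s first,
    OR puts the most 1s first.
    """
    if conjunction not in ("AND", "OR"):
        raise ValueError("conjunction must be 'AND' or 'OR'")
    return sorted(vectors, key=lambda name: sum(vectors[name]),
                  reverse=(conjunction == "OR"))


# Hypothetical sample of N = 4 predetermined data records.
records = [
    {"a": 1, "b": 10},
    {"a": 5, "b": 20},
    {"a": 7, "b": 30},
    {"a": 9, "b": 40},
]

# Hypothetical predicates in one layer of a query statement.
predicates: Dict[str, Predicate] = {
    "a > 3": lambda r: r["a"] > 3,
    "b < 25": lambda r: r["b"] < 25,
    "a + b > 40": lambda r: r["a"] + r["b"] > 40,
}

vectors = {name: row_vector(p, records) for name, p in predicates.items()}
and_order = reorder(vectors, "AND")
or_order = reorder(vectors, "OR")
# Intermediate row vector for the compound predicate "a > 3 AND b < 25".
compound = combine([vectors["a > 3"], vectors["b < 25"]], "AND")
```

The intermediate vector returned by combine has the same form as a single predicate's row vector, so a compound predicate from one layer can be ranked against other predicates in an enclosing layer by the same count of 1s.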