US 9,811,287 B2
High-performance hash joins using memory with extensive internal parallelism
Jeffrey H. Derby, Chapel Hill, NC (US); Charles L. Johnson, Fort Meyers, FL (US); Robert K. Montoye, New York, NY (US); Dheeraj Sreedhar, Bangalore (IN); and Steven P. VanderWiel, Rosemount, MN (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jun. 25, 2015, as Appl. No. 14/749,730.
Application 14/749,730 is a continuation of application No. 14/585,239, filed on Dec. 30, 2014.
Claims priority of provisional application 62/082,157, filed on Nov. 20, 2014.
Prior Publication US 2016/0147451 A1, May 26, 2016
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 12/00 (2006.01); G06F 3/06 (2006.01); G06F 17/30 (2006.01); G06F 13/16 (2006.01)
CPC G06F 3/0673 (2013.01) [G06F 3/0611 (2013.01); G06F 3/0659 (2013.01); G06F 17/30498 (2013.01); G06F 13/1615 (2013.01)]
6 Claims
 
1. A computer-implemented method, comprising:
issuing, to a dynamic random access memory with extensive internal parallelism (DRAM with EIP), a first group of two or more load requests to load data from a hash table comprising one or more hash buckets, wherein the hash table is constructed from hashed join-key values of a dimension table for a hash-join procedure, and wherein each load request in the first group corresponds to an entry in a fact table of the hash-join procedure and seeks a hash bucket matching a hashed join-key value for the corresponding entry in the fact table;
issuing, to the DRAM with EIP, a second group of two or more load requests to load data from the hash table;
receiving, from the DRAM with EIP, first response data that is responsive to the first group of load requests, wherein the first response data comprises one or more hash buckets from the hash table; and
processing, by a computer processor, the first response data while awaiting second response data that is responsive to the second group of load requests, wherein processing the first response data comprises:
identifying matches between the join-key values corresponding to entries in the two or more load requests of the first group and the one or more hash buckets in the first response data;
wherein the size of the second group of two or more load requests is selected such that a time for processing the first response data is based on the latency in receiving the second response data.
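The overlap recited in claim 1 can be illustrated with a minimal software sketch. It is not the patented implementation: a single worker thread stands in for the DRAM with EIP, and all names (build_hash_table, load_group, match_group, pipelined_hash_join, group_size, num_buckets) are hypothetical. The sketch issues the next group of bucket loads before collecting the current group's response, so that matching on one group proceeds while the following group is still outstanding.

```python
from concurrent.futures import ThreadPoolExecutor


def build_hash_table(dimension_rows, num_buckets):
    """Build hash buckets of (join_key, payload) from the dimension table."""
    buckets = [[] for _ in range(num_buckets)]
    for key, payload in dimension_rows:
        buckets[hash(key) % num_buckets].append((key, payload))
    return buckets


def load_group(buckets, keys):
    """Stand-in for one group of load requests: one bucket per fact-table key."""
    return [buckets[hash(k) % len(buckets)] for k in keys]


def match_group(fact_rows, fetched_buckets):
    """Identify matches between fact join keys and entries in the fetched buckets."""
    matches = []
    for (key, fact_payload), bucket in zip(fact_rows, fetched_buckets):
        for dim_key, dim_payload in bucket:
            if dim_key == key:
                matches.append((key, fact_payload, dim_payload))
    return matches


def pipelined_hash_join(dimension_rows, fact_rows, group_size=4, num_buckets=8):
    """Probe in groups, processing group i while group i+1 is in flight.

    group_size is an assumed tuning knob; the claim only says the group size
    is selected in relation to processing time and load latency.
    """
    buckets = build_hash_table(dimension_rows, num_buckets)
    groups = [fact_rows[i:i + group_size]
              for i in range(0, len(fact_rows), group_size)]
    results = []
    if not groups:
        return results
    # One worker thread plays the role of the memory servicing load requests.
    with ThreadPoolExecutor(max_workers=1) as memory:
        current = memory.submit(load_group, buckets, [k for k, _ in groups[0]])
        for i, group in enumerate(groups):
            # Issue the next group before collecting the current response.
            nxt = None
            if i + 1 < len(groups):
                next_keys = [k for k, _ in groups[i + 1]]
                nxt = memory.submit(load_group, buckets, next_keys)
            fetched = current.result()  # response data for the current group
            results.extend(match_group(group, fetched))  # overlaps with nxt
            current = nxt
    return results
```

For example, calling pipelined_hash_join with a three-row dimension table, a five-row fact table, and group_size=2 splits the probes into three groups and returns only the fact rows whose join key appears in the dimension table, in fact-table order.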