US 9,811,316 B2
Parallel, low-latency method for high-performance speculative globally-large element extraction from distributed, sorted arrays
Charles J. Archer, Rochester, MN (US); Michael A. Blocksome, Rochester, MN (US); Joseph D. Ratterman, Rochester, MN (US); and Brian Smith, Rochester, MN (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by Charles J. Archer, Rochester, MN (US); Michael A. Blocksome, Rochester, MN (US); Joseph D. Ratterman, Rochester, MN (US); and Brian Smith, Rochester, MN (US)
Filed on Jun. 6, 2007, as Appl. No. 11/758,703.
Prior Publication US 2008/0307195 A1, Dec. 11, 2008
Int. Cl. G06F 7/36 (2006.01); G06F 9/38 (2006.01); G06F 15/80 (2006.01); G06F 19/16 (2011.01); G06F 9/30 (2006.01); G06F 19/28 (2011.01)
CPC G06F 7/36 (2013.01) [G06F 9/30021 (2013.01); G06F 9/30032 (2013.01); G06F 9/3885 (2013.01); G06F 15/80 (2013.01); G06F 19/28 (2013.01); G06F 19/16 (2013.01)]
10 Claims
1. A computer system comprising:
a plurality of processors having a local processor;
a memory operatively coupled to the local processor;
a module residing in the memory that determines a globally largest element and a globally smallest element across a set of multi-element inputs, the set of multi-element inputs comprising no duplicate elements and comprising a first multi-element input corresponding to the local processor, the first multi-element input comprising a plurality of local elements;
an assignment module residing in the memory that assigns the globally largest element to a first variable and the globally smallest element to a second variable;
a set of modules configured to perform an iterative process to determine a set of largest elements from the set of multi-element inputs, the set of modules comprising:
a size partition element generation module residing in the memory that, during a first step in the iterative process, generates a partition value from the first variable and the second variable;
a module residing in the memory that, during a second step in the iterative process, counts a number of the plurality of local elements greater than the partition value to generate a local count;
a module residing in the memory that, during a third step in the iterative process, sums the local count with one or more other local counts of elements from one or more other inputs in the set of multi-element inputs to determine a global count; and
a comparison module residing in the memory that, during a fourth step in the iterative process, determines whether the global count is greater than, less than, or equal to a size of one of the multi-element inputs,
wherein responsive to a determination that the global count is greater than the size, the assignment module assigns the partition value to the second variable and a subsequent iteration of the iterative process is performed,
wherein responsive to a determination that the global count is less than the size, the assignment module assigns the partition value to the first variable and the subsequent iteration of the iterative process is performed, or
wherein responsive to a determination that the global count equals the size, the iterative process ends;
the computer system further comprising:
a module residing in the memory that populates a distributed result array with each largest element in the set of largest elements after the iterative process ends, wherein the set of largest elements includes each element in each multi-element input that is greater than the partition value corresponding to a final iteration of the iterative process, and wherein a first element and a second element among the set of multi-element inputs that are closest to the partition value corresponding to a threshold iteration of the iterative process are not identical if a total number of elements greater than the partition value corresponding to the threshold iteration among the set of multi-element inputs is less than the size of the one of the multi-element inputs.
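The iterative process recited in claim 1 can be illustrated with a minimal single-process sketch. It is not the patented implementation: the function name `select_largest`, the use of the midpoint as the partition value, the `max_iters` guard, and the plain `sum` standing in for a cross-processor reduction are all assumptions made for illustration. Each inner list plays the role of one processor's sorted, duplicate-free multi-element input.

```python
from bisect import bisect_right


def select_largest(inputs, size, max_iters=200):
    """Simulate the claimed iteration over sorted, duplicate-free lists.

    inputs: one sorted list per (simulated) processor.
    size:   target number of largest elements, e.g. the length of one input.
    """
    # Globally largest element -> first variable; globally smallest -> second.
    first = max(arr[-1] for arr in inputs if arr)
    second = min(arr[0] for arr in inputs if arr)

    for _ in range(max_iters):
        # Step 1: generate a partition value (midpoint is an assumption here).
        partition = (first + second) / 2

        # Step 2: each processor counts its local elements above the partition.
        # The lists are sorted, so a binary search gives the count directly.
        local_counts = [len(arr) - bisect_right(arr, partition) for arr in inputs]

        # Step 3: sum the local counts (stand-in for a global reduction).
        global_count = sum(local_counts)

        # Step 4: compare the global count with the target size.
        if global_count > size:
            second = partition   # too many elements above: raise the lower bound
        elif global_count < size:
            first = partition    # too few elements above: lower the upper bound
        else:
            # Populate the (simulated) distributed result array.
            result = [[x for x in arr if x > partition] for arr in inputs]
            return partition, result

    raise RuntimeError("no partition value found within max_iters iterations")
```

For example, calling `select_largest([[1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15]], 4)` visits partition values 8, 11.5, 9.75 and 10.625, and stops at 10.625 with the per-processor results `[[13], [14], [11, 15]]`. The iteration guard is needed in this sketch because a target size equal to the total number of elements can never be matched by a partition value lying between the global extremes.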