US 9,811,384 B2
Dynamic data partitioning for optimal resource utilization in a parallel data processing system
Brian K. Caufield, Livermore, CA (US); Fan Ding, San Jose, CA (US); Mi Wan Shum, San Jose, CA (US); Dong Jie Wei, Beijing (CN); and Samuel H K Wong, Santa Clara, CA (US)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed by Brian K. Caufield, Livermore, CA (US); Fan Ding, San Jose, CA (US); Mi Wan Shum, San Jose, CA (US); Dong Jie Wei, Beijing (CN); and Samuel H K Wong, Santa Clara, CA (US)
Filed on Jun. 27, 2012, as Appl. No. 13/534,478.
Application 13/534,478 is a continuation of application No. 13/094,074, filed on Apr. 26, 2011.
Prior Publication US 2012/0278587 A1, Nov. 1, 2012
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/30 (2006.01); G06F 12/08 (2016.01); G06F 12/0808 (2016.01); G06F 12/0815 (2016.01); G06F 9/50 (2006.01)
CPC G06F 9/505 (2013.01)
9 Claims
 
1. A computer-implemented method for dynamically distributing data in a parallel processing system, comprising:
allocating data buffers to respective data partitions defined in the parallel processing system, each data buffer having a predefined size in which data are stored for processing by the corresponding data partition;
distributing data records of a common data structure across the data buffers for parallel processing by the respective data partitions, wherein a number of data records for each of the data partitions is selected independently of the number of the data records selected for all other of the data partitions;
determining data buffer usage at each of the data partitions by comparing a size of a free portion of the corresponding data buffer to a size criterion of the data buffer, wherein the size criterion of the data buffer is greater than zero and less than the data buffer size; and
distributing remaining data records of the common data structure across the data buffers based on the respective data buffer usage thereof, wherein the number of remaining data records is selected independently of the number of remaining data records selected for all other of the remaining data partitions.
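
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `PartitionBuffer`, `free_threshold`, `has_room`, `consume`, and `distribute` are hypothetical. Buffer sizes are counted in records rather than bytes, and the claimed comparison is assumed to mean "a buffer receives more records when its free portion is at least the size criterion."

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class PartitionBuffer:
    """Fixed-size buffer feeding one data partition (hypothetical model)."""
    capacity: int        # predefined buffer size, counted in records (assumption)
    free_threshold: int  # size criterion: 0 < free_threshold < capacity
    records: deque = field(default_factory=deque)

    def __post_init__(self):
        if not 0 < self.free_threshold < self.capacity:
            raise ValueError("size criterion must be > 0 and < buffer size")

    @property
    def free(self) -> int:
        """Size of the free portion of the buffer."""
        return self.capacity - len(self.records)

    def has_room(self) -> bool:
        """Usage check: compare the free portion to the size criterion."""
        return self.free >= self.free_threshold

    def consume(self, n: int) -> None:
        """Simulate the partition processing up to n buffered records."""
        for _ in range(min(n, len(self.records))):
            self.records.popleft()


def distribute(records, buffers):
    """Send records to every buffer that passes the usage check.

    Returns (records still unsent, number of records sent to each buffer).
    """
    pending = deque(records)
    sent = [0] * len(buffers)
    for i, buf in enumerate(buffers):
        if not buf.has_room():
            continue  # this partition is busy; skip it in this round
        # The count is set by this buffer's own free space and by what is
        # left to send, not by a fixed share common to all partitions.
        take = min(buf.free, len(pending))
        for _ in range(take):
            buf.records.append(pending.popleft())
        sent[i] = take
    return list(pending), sent


# Usage: two partitions that drain their buffers at different rates.
buffers = [PartitionBuffer(capacity=4, free_threshold=2),
           PartitionBuffer(capacity=4, free_threshold=2)]
remaining, first_round = distribute(range(10), buffers)
buffers[0].consume(3)  # fast partition frees three slots
buffers[1].consume(1)  # slow partition frees only one slot
remaining, second_round = distribute(remaining, buffers)
```

In this sketch the second round sends records only to the faster partition, because the slower partition's free portion is still below the size criterion. That is the load-balancing behavior the claim describes.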