US 7,516,306 B2
Computer program instruction architecture, system and process using partial ordering for adaptive response to memory latencies
David F. Bacon, Sleepy Hollow, N.Y. (US); and Xiaowei Shen, Hopewell Junction, N.Y. (US)
Assigned to International Business Machines Corporation, Armonk, N.Y. (US)
Filed on Oct. 05, 2004, as Appl. No. 10/959,609.
Prior Publication US 2006/0101249 A1, May 11, 2006
Int. Cl. G06F 9/312 (2006.01)
U.S. Cl. 712—219  [712/225] 3 Claims
 
1. A computer system for reducing a program latency comprising:
a processor implemented in hardware;
a memory; and
a prediction mechanism which provides an estimate of a latency of a memory access operation, said prediction mechanism being further adapted to:
use a computer instruction set architecture comprising braids and fibers to execute an inquiry from said processor, wherein the program is divided into at least one braid, wherein the at least one braid is a collection of fibers within a scope of execution and all of the fibers must terminate for the at least one braid to terminate, wherein a fiber comprises a section of sequential code that can be interleaved in a partial order with respect to other fibers and executed sequentially within the at least one braid, wherein because the fibers are executed sequentially a break statement cannot asynchronously interrupt another fiber within the at least one braid; and
provide the estimate, without executing the memory access operation, based upon the inquiry, the inquiry being made to a prediction table containing a predicted latency based on an on-chip cache directory, wherein a register is assigned to hold an outcome of the inquiry in register fields, the register fields comprising:
an available field, wherein the available field indicates whether data is available on chip;
a total latency field, wherein the total latency field provides an estimate of the total latency of the memory access operation; and
an overflow field, wherein the overflow field indicates if hardware resources are available for split phase memory operations;
wherein, based upon the inquiry, the estimate is reported as a low latency if data is available on-chip;
wherein a called fiber within a braid either runs immediately at a call point for the called fiber if the called fiber has a low latency, or the called fiber is deferred, wherein if the called fiber is deferred, other previously deferred fibers may run so that any fiber within the braid may run at any fiber call point, or at the end of the braid, but nowhere else, wherein the fibers execute atomically until termination or until another fiber call; and wherein each fiber runs with its own stack.
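
The run-or-defer behavior recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware or instruction set: the names `InquiryResult`, `PredictionTable`, `Braid`, `call_fiber` and `run_ready` are hypothetical, the latency constants are invented, the polarity of the overflow field (set when no split-phase resources remain) is assumed, and the policy of running a previously deferred fiber once its data is predicted to be on chip is one possible reading of "other previously deferred fibers may run". Fibers are modeled as plain callables that run to completion; per-fiber stacks and nested fiber calls are not modeled.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, Dict, Tuple

LOW_LATENCY = 4      # hypothetical on-chip latency, in cycles
HIGH_LATENCY = 200   # hypothetical off-chip latency, in cycles


@dataclass
class InquiryResult:
    """Software stand-in for the register fields recited in the claim."""
    available: bool      # whether data is available on chip
    total_latency: int   # estimate of the total latency of the access
    overflow: bool       # assumed polarity: True when no split-phase resources remain


class PredictionTable:
    """Hypothetical table of predicted latencies for on-chip addresses."""

    def __init__(self, on_chip: Dict[int, int], split_phase_slots: int = 2):
        self.on_chip = dict(on_chip)
        self.split_phase_slots = split_phase_slots

    def inquire(self, address: int) -> InquiryResult:
        # The inquiry only consults the table; no memory access is performed.
        available = address in self.on_chip
        latency = self.on_chip[address] if available else HIGH_LATENCY
        return InquiryResult(available, latency, self.split_phase_slots == 0)


class Braid:
    """A scope in which fibers run only at fiber call points or at the end."""

    def __init__(self, table: PredictionTable):
        self.table = table
        self.deferred: Deque[Tuple[int, Callable[[], None]]] = deque()

    def call_fiber(self, address: int, fiber: Callable[[], None]) -> None:
        if self.table.inquire(address).available:
            fiber()              # low latency: run immediately at the call point
        else:
            self.run_ready()     # previously deferred fibers may run here
            self.deferred.append((address, fiber))

    def run_ready(self) -> None:
        # Assumed policy: run deferred fibers whose data is now predicted on chip.
        waiting: Deque[Tuple[int, Callable[[], None]]] = deque()
        while self.deferred:
            address, fiber = self.deferred.popleft()
            if self.table.inquire(address).available:
                fiber()
            else:
                waiting.append((address, fiber))
        self.deferred = waiting

    def end(self) -> None:
        # All fibers must terminate before the braid terminates.
        while self.deferred:
            _, fiber = self.deferred.popleft()
            fiber()


table = PredictionTable({0x10: LOW_LATENCY})
braid = Braid(table)
trace = []
braid.call_fiber(0x10, lambda: trace.append("A"))  # on chip: runs at once
braid.call_fiber(0x20, lambda: trace.append("B"))  # off chip: deferred
braid.call_fiber(0x10, lambda: trace.append("C"))  # on chip: runs at once
table.on_chip[0x20] = LOW_LATENCY                  # simulate the data arriving
braid.call_fiber(0x30, lambda: trace.append("D"))  # deferred; B runs at this call point
braid.end()                                        # D runs at the end of the braid
print(trace)  # ['A', 'C', 'B', 'D']
```

In this model, fiber B is overtaken by fiber C because its data is predicted to be off chip, and it runs only at a later fiber call point. Because each callable runs to completion before another starts, no fiber is interrupted asynchronously by another fiber in the same braid.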