| US 7,484,075 B2 | ||
| Method and apparatus for providing fast remote register access in a clustered VLIW processor using partitioned register files | ||
| Krishnan K. Kailas, Ossining, N.Y. (US) | ||
| Assigned to International Business Machines Corporation, Armonk, N.Y. (US) | ||
| Filed on Dec. 16, 2002, as Appl. No. 10/320,150. | ||
| Prior Publication US 2004/0117597 A1, Jun. 17, 2004 | ||
| Int. Cl. G06F 9/30 (2006.01) | ||
| U.S. Cl. 712—24 | 20 Claims |

| 1. A computer system, comprising:
a plurality of clustered processing cores for processing VLIW (Very Long Instruction Word) operations, wherein each processing core comprises:
a local partitioned register file having a subset of an architected name space;
an instruction decoder to decode a VLIW for execution;
an inter-cluster communication bus enabling communication between the processing cores;
a processor pipeline including a plurality of stages for operating on the VLIW; and
a hardware register pre-fetch unit comprising an instruction pre-fetch buffer to store the VLIW to await decoding by the instruction decoder,
wherein the hardware register pre-fetch unit (i) pre-decodes a name of a register specified in the VLIW in advance of decoding by the instruction decoder to determine if a remote register is needed to execute the VLIW, and (ii) generates a control signal to pre-fetch data, from the specified remote register in a remote processing core or from a remote bypass network, for an instruction along one execution path in a program, in advance of decoding of the VLIW by the instruction decoder for execution, based on a compiler analysis of the program that schedules instructions that are data dependent by taking into account a latency of the inter-cluster communication bus, a size of the instruction pre-fetch buffer, and a depth of the processor pipeline.
|
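The pre-decode step recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the mapping of register names to clusters (`REGS_PER_CLUSTER`, `home_cluster`), the `Op` record, and the `PrefetchUnit` class are all hypothetical names introduced here for illustration. The model shows a VLIW entering the instruction pre-fetch buffer, its source register names being inspected before decode, and a pre-fetch request being recorded for each register that lives in another cluster's partition.

```python
from collections import deque
from dataclasses import dataclass

# Assumption: the architected name space is split into equal contiguous
# ranges, one per cluster. The claim does not specify the mapping.
REGS_PER_CLUSTER = 16


def home_cluster(reg: int) -> int:
    """Return the cluster whose local register file holds `reg`."""
    return reg // REGS_PER_CLUSTER


@dataclass(frozen=True)
class Op:
    """One operation slot of a VLIW (hypothetical encoding)."""
    dest: int
    srcs: tuple


def predecode(vliw, local_cluster: int):
    """Pre-decode register names only; return the remote source registers."""
    remote = set()
    for op in vliw:
        for reg in op.srcs:
            if home_cluster(reg) != local_cluster:
                remote.add(reg)
    return sorted(remote)


class PrefetchUnit:
    """Toy model of a register pre-fetch unit with an instruction buffer."""

    def __init__(self, local_cluster: int, buffer_size: int):
        self.local_cluster = local_cluster
        self.buffer_size = buffer_size
        self.buffer = deque()   # VLIWs awaiting the instruction decoder
        self.requests = []      # (remote cluster, register) pre-fetch requests

    def enqueue(self, vliw) -> None:
        """Buffer a VLIW and issue pre-fetch requests ahead of decode."""
        if len(self.buffer) >= self.buffer_size:
            raise OverflowError("instruction pre-fetch buffer is full")
        for reg in predecode(vliw, self.local_cluster):
            self.requests.append((home_cluster(reg), reg))
        self.buffer.append(vliw)

    def next_for_decode(self):
        """Hand the oldest buffered VLIW to the instruction decoder."""
        return self.buffer.popleft()


unit = PrefetchUnit(local_cluster=0, buffer_size=4)
unit.enqueue([Op(dest=1, srcs=(2, 17)), Op(dest=3, srcs=(33, 4))])
```

In this model the requests are generated when the VLIW enters the buffer, so the time it waits there (together with the pipeline depth) is the window available to hide the inter-cluster bus latency. The claim attributes the scheduling that makes this window sufficient to the compiler analysis; the sketch does not model that scheduling.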