US 11,720,332 B2
Compiling a program from a graph
David Lacey, Cheltenham (GB); and Godfrey Da Costa, Bristol (GB)
Assigned to GRAPHCORE LIMITED, Bristol (GB)
Filed by Graphcore Limited, Bristol (GB)
Filed on Jul. 31, 2019, as Appl. No. 16/527,410.
Claims priority of application No. 1904637 (GB), filed on Apr. 2, 2019.
Prior Publication US 2020/0319861 A1, Oct. 8, 2020
Int. Cl. G06F 8/30 (2018.01); G06F 8/41 (2018.01)
CPC G06F 8/41 (2013.01)
21 Claims
 
1. A computer-implemented method for generating an executable program to run on a system of one or more processor chips each comprising one or more processor modules, each processor module comprising an execution unit and memory; the method comprising:
receiving a graph comprising a plurality of data nodes, a plurality of compute vertices and a plurality of directional edges, each data node representing a data element, each edge representing an input to a compute vertex from a data node or an output from a compute vertex input to a data node or another compute vertex, and each compute vertex representing one or more computations to perform on its input or inputs in order to produce the output or outputs from that compute vertex;
compiling the graph into said executable program, the executable program comprising a plurality of machine code instructions, including one or more types of multi-access instruction each of which performs at least two load operations, at least two store operations, or at least one load and one store operation in a single instruction;
wherein the memory on each of the processor modules comprises a respective plurality of memory banks; and
the compilation comprises assigning instances of said multi-access instructions to implement at least some of said edges, analyzing the graph, and in dependence upon a result of analyzing the graph, allocating the data elements to memory addresses within different ones of the banks, wherein the allocating applies one or more constraints including at least a first constraint that no two edges can access the same memory bank at the same time;
wherein allocating the data elements comprises:
grouping the data elements into equivalence classes, wherein a first equivalence class includes a first set of the data elements that interfere with a second set of other ones of the data elements of a second equivalence class;
generating ordered equivalence classes at least in part by ordering the equivalence classes according to a metric; and
stepping through the data elements within the ordered equivalence classes to allocate each of the data elements in turn, wherein the metric comprises a first metric, the method further comprising: ordering the data elements within each equivalence class according to a second metric that is different from the first metric; and
for each equivalence class, stepping through the ordered data elements within a respective equivalence class to allocate each of the data elements in turn.
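
The allocation steps recited in claim 1 can be illustrated with a short Python sketch. It is not the patented implementation, and everything the claim leaves open is an assumption made here for illustration: two data elements are treated as interfering when the same multi-access instruction accesses both, an equivalence class is taken to be the set of elements sharing an identical interference set, the first metric is the number of interfering elements (most constrained class first), and the second metric is element size (largest first). The function name allocate_to_banks and its parameters are hypothetical.

```python
from collections import defaultdict


def allocate_to_banks(sizes, interfering_pairs, num_banks, bank_size):
    """Return {element: (bank, offset)}; interfering elements never share a bank.

    sizes: dict mapping element name -> size in bytes (hypothetical input format).
    interfering_pairs: pairs of elements assumed to be accessed by the same
    multi-access instruction, so they must sit in different banks.
    """
    # Build a symmetric interference set for every data element.
    neighbours = {e: set() for e in sizes}
    for a, b in interfering_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Assumption: elements with an identical interference set form one class.
    classes = defaultdict(list)
    for e in sizes:
        classes[frozenset(neighbours[e])].append(e)

    # Assumed first metric: size of the interference set, largest first.
    # The smallest member name breaks ties so the order is deterministic.
    ordered = sorted(classes.items(), key=lambda kv: (-len(kv[0]), min(kv[1])))

    next_free = [0] * num_banks  # next unused offset in each bank
    placement = {}
    for _, members in ordered:
        # Assumed second metric: element size, largest first (name breaks ties).
        for e in sorted(members, key=lambda m: (-sizes[m], m)):
            # Banks already holding an interfering element are off limits.
            blocked = {placement[n][0] for n in neighbours[e] if n in placement}
            for bank in range(num_banks):
                if bank not in blocked and next_free[bank] + sizes[e] <= bank_size:
                    placement[e] = (bank, next_free[bank])
                    next_free[bank] += sizes[e]
                    break
            else:
                raise ValueError(f"no conflict-free bank with room for {e!r}")
    return placement


# Example: "a" is co-accessed with "b" and with "c"; "d" is unconstrained.
sizes = {"a": 8, "b": 4, "c": 4, "d": 2}
pairs = [("a", "b"), ("a", "c")]
placement = allocate_to_banks(sizes, pairs, num_banks=2, bank_size=16)
```

In this example "a" forms the most constrained class and is placed first, "b" and "c" share a class and are forced into the other bank, and "d" fills the remaining space. A first-fit search over banks is used only to keep the sketch short; the claim does not prescribe how a bank is chosen once the constraints are satisfied.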