CPC G06N 3/063 (2013.01) [G06F 15/7892 (2013.01); G06F 16/904 (2019.01)] | 18 Claims |
1. A computer-implemented method of efficiently executing an operation unit graph on a reconfigurable data processor with a target architecture that includes physical compute units and/or physical memory units, the method including:
reducing a number of the physical compute units and/or physical memory units of the reconfigurable data processor required to execute the operation unit graph by
receiving, from a user, architectural hints that are specific to the target architecture of the reconfigurable data processor,
wherein the architectural hints
call for fusing first operation units when executing a pattern of the first operation units on the physical compute units and/or physical memory units of the reconfigurable data processor,
specify the first operation units in the pattern as first nodes,
specify first dataflows among the first operation units in the pattern as first edges, and
direct fusion among the first operation units in the pattern;
scanning the operation unit graph to detect an instance of the pattern of the first operation units specified by the architectural hints, including
matching second nodes and second edges in the operation unit graph with the first nodes and the first edges in the architectural hints, and detecting a pattern match;
fusing operation units of the second nodes and the second edges in the operation unit graph into a consolidated operation units block, thereby producing a fused operation unit graph;
allocating a set of physical compute units and/or physical memory units of the physical compute units and/or physical memory units of the reconfigurable data processor to the fused operation unit graph; and
executing the fused operation unit graph on the reconfigurable data processor based on the allocation.
|
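For illustration only, the steps recited in claim 1 (hint specification, pattern scanning, fusion, allocation) can be sketched as follows. This is a minimal sketch under stated assumptions and is not the claimed implementation: the names `Pattern`, `find_match`, `fuse`, and `units_required`, the operation names (`conv2d`, `batchnorm`, `relu`, `maxpool`), and the rule that each graph node occupies one physical unit are all hypothetical and do not come from the claim. The matcher is a simple backtracking search that does not check whether intermediate results of the matched nodes are consumed outside the pattern.

```python
from dataclasses import dataclass


@dataclass
class Pattern:
    """Hypothetical architectural hint: operation units as nodes, dataflows as edges."""
    name: str
    nodes: dict  # pattern node id -> operation type (the "first nodes")
    edges: list  # (source pattern id, destination pattern id) (the "first edges")


def find_match(nodes, edges, pattern):
    """Return a pattern-id -> graph-id mapping for the first match, or None."""
    p_ids = list(pattern.nodes)

    def extend(mapping):
        if len(mapping) == len(p_ids):
            return dict(mapping)
        p = p_ids[len(mapping)]
        for g, op in nodes.items():
            # Candidate must have the right operation type and be unused so far.
            if op != pattern.nodes[p] or g in mapping.values():
                continue
            mapping[p] = g
            # Every pattern edge whose endpoints are both mapped must exist in the graph.
            ok = all((mapping[a], mapping[b]) in edges
                     for a, b in pattern.edges
                     if a in mapping and b in mapping)
            if ok:
                result = extend(mapping)
                if result is not None:
                    return result
            del mapping[p]  # backtrack
        return None

    return extend({})


def fuse(nodes, edges, match, fused_name):
    """Collapse the matched nodes into one consolidated block; rewire external edges."""
    matched = set(match.values())
    new_nodes = {n: op for n, op in nodes.items() if n not in matched}
    new_nodes[fused_name] = "fused"
    new_edges = set()
    for src, dst in edges:
        s = fused_name if src in matched else src
        d = fused_name if dst in matched else dst
        if s != d:  # drop edges that became internal to the fused block
            new_edges.add((s, d))
    return new_nodes, new_edges


def units_required(nodes):
    """Assumption for this sketch only: one physical unit per graph node."""
    return len(nodes)


# Operation unit graph: conv2d -> batchnorm -> relu -> maxpool
nodes = {"c1": "conv2d", "b1": "batchnorm", "r1": "relu", "p1": "maxpool"}
edges = {("c1", "b1"), ("b1", "r1"), ("r1", "p1")}

# Hint: fuse conv2d -> batchnorm -> relu into a single block
hint = Pattern(
    name="conv_bn_relu",
    nodes={"a": "conv2d", "b": "batchnorm", "c": "relu"},
    edges=[("a", "b"), ("b", "c")],
)

match = find_match(nodes, edges, hint)
fused_nodes, fused_edges = fuse(nodes, edges, match, hint.name)
print(units_required(nodes), "->", units_required(fused_nodes))
```

Under the one-unit-per-node assumption, the four-node graph collapses to two nodes after fusion, which corresponds to the reduction in required physical units that the claim recites. A real allocator for a reconfigurable data processor would depend on the target architecture and is outside the scope of this sketch.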