US 11,706,163 B2
Accelerating distributed reinforcement learning with in-switch computing
Jian Huang, Champaign, IL (US); Deming Chen, Champaign, IL (US); Alexander Gerhard Schwing, Champaign, IL (US); and Youjie Li, Changsha (CN)
Assigned to The Board of Trustees of the University of Illinois, Urbana, IL (US)
Filed by Board of Trustees of the University of Illinois, Urbana, IL (US)
Filed on Dec. 17, 2020, as Appl. No. 17/247,611.
Claims priority of provisional application 62/951,761, filed on Dec. 20, 2019.
Prior Publication US 2021/0194831 A1, Jun. 24, 2021
Int. Cl. H04L 12/851 (2013.01); H04L 12/931 (2013.01); G06F 15/10 (2006.01); G06N 20/00 (2019.01); H04L 49/90 (2022.01); H04L 69/22 (2022.01); G06N 3/08 (2023.01)
CPC H04L 49/90 (2013.01) [G06N 3/08 (2013.01); H04L 69/22 (2013.01)]
22 Claims
 
1. A programmable switch comprising:
an input arbiter to analyze packet headers of incoming packets and determine which of the incoming packets are part of gradient vectors received from worker computing devices that are performing reinforcement learning as a group; and
an accelerator coupled to the input arbiter, the accelerator comprising a segment counter and configured to:
receive the incoming packets from the input arbiter;
assign a segment number to a gradient segment of the gradient vectors, wherein the gradient segment is a corresponding sub-portion of each respective gradient vector of the gradient vectors;
aggregate and buffer gradient data of the incoming packets that is associated with the segment number;
track buffering of the aggregated gradient data according to the segment number by incrementing the segment counter upon buffering gradient data from each incoming packet associated with a corresponding worker computing device of the worker computing devices;
responsive to the segment counter reaching a threshold number, generate an aggregated data packet containing the aggregated gradient data associated with the gradient segment of the gradient vectors; and
transfer the aggregated data packet to the input arbiter to be transmitted to the worker computing devices.
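
The accelerator steps recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware design: the names GradientPacket, SegmentState and Accelerator are hypothetical, the segment number is assumed to arrive in the packet header, aggregation is assumed to be an element-wise sum, and the threshold is assumed to equal the number of workers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class GradientPacket:
    """Hypothetical packet: one segment of one worker's gradient vector."""
    worker_id: int
    segment_number: int
    values: List[float]


@dataclass
class SegmentState:
    """Per-segment buffer plus the segment counter from the claim."""
    buffer: List[float]
    counter: int = 0


class Accelerator:
    """Software model of the claimed aggregate/buffer/count/emit loop."""

    def __init__(self, threshold: int) -> None:
        # Assumption: threshold = number of workers contributing per segment.
        self.threshold = threshold
        self.segments: Dict[int, SegmentState] = {}

    def receive(self, pkt: GradientPacket) -> Optional[GradientPacket]:
        """Fold one packet into its segment; return the aggregate when full."""
        state = self.segments.get(pkt.segment_number)
        if state is None:
            # First packet for this segment: start from a zeroed buffer.
            state = SegmentState(buffer=[0.0] * len(pkt.values))
            self.segments[pkt.segment_number] = state

        # Aggregate (assumed: element-wise sum) into the buffered data.
        for i, value in enumerate(pkt.values):
            state.buffer[i] += value

        # Track buffering by incrementing the segment counter once per packet.
        state.counter += 1
        if state.counter < self.threshold:
            return None

        # Threshold reached: emit the aggregated packet and free the buffer.
        del self.segments[pkt.segment_number]
        return GradientPacket(
            worker_id=-1,  # hypothetical marker meaning "from the switch"
            segment_number=pkt.segment_number,
            values=state.buffer,
        )


if __name__ == "__main__":
    switch = Accelerator(threshold=3)
    print(switch.receive(GradientPacket(0, 0, [1.0, 2.0])))
    print(switch.receive(GradientPacket(1, 0, [0.5, 0.5])))
    print(switch.receive(GradientPacket(2, 0, [2.0, -1.0])))
```

Keying the state by segment number lets packets belonging to different segments interleave freely, since each segment is emitted as soon as its own counter reaches the threshold rather than after whole gradient vectors have arrived.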