CPC G06N 3/084 (2013.01) [G06F 18/24 (2023.01); G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06N 7/01 (2023.01); G06V 10/764 (2022.01); G06V 10/7788 (2022.01); G06V 10/82 (2022.01)] | 20 Claims |
1. A computer implemented method for training performance of a deep neural network adapted to attain a model f_T: X → Δ^C that maps data in an input space X to a C-dimensional probability simplex Δ^C, the model reducing catastrophic forgetting on the first T data sets after training on T sequential tasks, D_T representing the currently available data set, and D_t, t ≤ T, representing the additional data sets and the currently available data set, the computer implemented method comprising:
storing, in non-transitory computer readable memory, logits of a set of samples from a previous set of tasks, D_1, the storing establishing a memory cost m ≪ n_1;
maintaining classification information from the previous set of tasks by utilizing the logits for matching during training on a new set of tasks, D_2, the logits being selected to reduce a dependency on a representation of D_1; and
training the deep neural network on D_2, and applying a penalty on the deep neural network for prediction deviation, the penalty adapted to sample a memory x_i^(1), i = 1, . . . , m, from D_1 and to match outputs of f_1* when training f_2; wherein m is a memory size of the memory x_i^(1), n_1 is a number of samples in D_1, and f_1* is the model f trained on the first data set, D_1.
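The claimed steps, store logits of m memory samples after training on D_1, then penalize deviation from those stored logits while training on D_2, can be sketched as follows. This is a minimal illustrative sketch, not the claimed implementation: the toy linear model, the memory size m = 5, and the squared-error form of the deviation penalty are all assumptions, since the claim does not fix a particular network or deviation measure.

```python
# Minimal sketch of the claimed logit-matching penalty (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

def f(theta, x):
    """Toy linear stand-in for the network f_t: logits = x @ theta."""
    return x @ theta

# After training f_1* on D_1: store m memory samples x_i^(1) and their logits.
theta1 = rng.normal(size=(4, 3))   # stands in for f_1* (assumed already trained)
x_mem = rng.normal(size=(5, 4))    # m = 5 memory samples from D_1
z_mem = f(theta1, x_mem)           # stored logits; memory cost m << n_1

def penalty(theta, lam=1.0):
    """lam times mean squared deviation of current logits from stored logits."""
    diff = f(theta, x_mem) - z_mem
    return lam * np.mean(diff ** 2)

# The penalty vanishes while f_2 still matches f_1* on the memory samples,
# and grows as training on D_2 drifts the predictions away.
assert penalty(theta1) == 0.0
theta2 = theta1 + 0.1 * rng.normal(size=theta1.shape)
assert penalty(theta2) > 0.0
```

In training on D_2, this penalty would be added to the ordinary task loss, so only the stored logits and memory samples, not the full D_1, are needed to preserve classification information.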