CPC G06N 3/084 (2013.01) [G06F 18/24 (2023.01); G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06N 7/01 (2023.01); G06V 10/764 (2022.01); G06V 10/7788 (2022.01); G06V 10/82 (2022.01)] | 20 Claims |
1. A computer implemented method for training performance of a deep neural network adapted to attain a model f_T: X → Δ^C that maps data in an input space X to a C-dimensional probability simplex Δ^C, the model reducing catastrophic forgetting on the first T data sets after training on T sequential tasks, D_T representing the currently available data set, and D_t, t ≤ T, representing the additional data sets and the currently available data set, the computer implemented method comprising:
storing, in non-transitory computer readable memory, logits of a set of samples from a previous set of tasks, D_1, the storing establishing a memory cost m ≪ n_1;
maintaining classification information from the previous set of tasks by utilizing the logits for matching during training on a new set of tasks, D_2, the logits being selected to reduce a dependency on a representation of D_1; and
training the deep neural network on D_2, and applying a penalty on the deep neural network for prediction deviation, the penalty adapted to sample a memory x_i^(1), i = 1, . . . , m, from D_1 and to match outputs of f_1* when training f_2; wherein m is a memory size of the memory x_i^(1), n_1 is a number of samples in D_1, and f_1* is the model f trained on the first data set, D_1.
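The claimed steps, store logits of m memory samples after training on D_1, then penalize deviation from those stored logits while training on D_2, can be sketched as follows. This is a minimal illustrative sketch, not the claimed implementation: the toy linear model, the memory size m = 5, and the squared-error form of the deviation penalty are all assumptions, since the claim does not fix a particular network or deviation measure.

```python
# Minimal sketch of the claimed logit-matching penalty (assumptions noted above).
import numpy as np

rng = np.random.default_rng(0)

def f(theta, x):
    """Toy linear stand-in for the network f_t: logits = x @ theta."""
    return x @ theta

# After training f_1* on D_1: store m memory samples x_i^(1) and their logits.
theta1 = rng.normal(size=(4, 3))   # stands in for f_1* (assumed already trained)
x_mem = rng.normal(size=(5, 4))    # m = 5 memory samples from D_1
z_mem = f(theta1, x_mem)           # stored logits; memory cost m << n_1

def penalty(theta, lam=1.0):
    """lam times mean squared deviation of current logits from stored logits."""
    diff = f(theta, x_mem) - z_mem
    return lam * np.mean(diff ** 2)

# The penalty vanishes while f_2 still matches f_1* on the memory samples,
# and grows as training on D_2 drifts the predictions away.
assert penalty(theta1) == 0.0
theta2 = theta1 + 0.1 * rng.normal(size=theta1.shape)
assert penalty(theta2) > 0.0
```

In training on D_2, this penalty would be added to the ordinary task loss, so only the stored logits and memory samples, not the full D_1, are needed to preserve classification information.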