US 11,755,911 B2
Method and apparatus for training neural network and computer server
Zehao Huang, Beijing (CN); and Naiyan Wang, Beijing (CN)
Assigned to BEIJING TUSEN ZHITU TECHNOLOGY CO., LTD., Beijing (CN)
Filed by BEIJING TUSEN ZHITU TECHNOLOGY CO., LTD., Beijing (CN)
Filed on May 23, 2019, as Appl. No. 16/421,259.
Claims priority of application No. 201810498650.1 (CN), filed on May 23, 2018.
Prior Publication US 2019/0385059 A1, Dec. 19, 2019
Int. Cl. G06N 3/08 (2023.01); G06N 3/082 (2023.01); G06F 18/214 (2023.01); G06N 3/048 (2023.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01)
CPC G06N 3/082 (2013.01) [G06F 18/214 (2023.01); G06N 3/048 (2023.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01)] 18 Claims
OG exemplary drawing
 
1. A method for training a neural network, comprising the following process performed at a predetermined time period:
selecting input data automatically to obtain a set of data to be annotated, wherein a video containing a plurality of sequences of frames is inputted to the neural network to obtain a target detection result for each frame of image, wherein the target detection results for all frames of images in the video are inputted to a target tracking model to obtain a tracking result for each frame of image, and wherein a frame of image is determined as the input data in response to the target detection result and the tracking result for that frame of image being inconsistent with each other;
annotating the set of data to be annotated to obtain a new set of annotated data;
acquiring a set of newly added annotated data containing the new set of annotated data;
determining a union of the set of newly added annotated data and a set of training sample data for training the neural network in a previous period as a set of training sample data for a current period;
training the neural network iteratively, based on the set of training sample data for the current period, to obtain a neural network trained in the current period, wherein the neural network has a plurality of particular structures each provided with a corresponding sparse scaling operator for scaling an output from the particular structure, wherein sparsity constraints having different weights are applied to the particular structures based on different computational complexities of the particular structures; and
wherein training the neural network iteratively using sample data in the set of training sample data for the current period comprises training the neural network in a number of training iterations by:
optimizing a target function using a first optimization algorithm, with sparse scaling operators obtained from a previous training iteration being constants of the target function and the weights being variables of the target function, to obtain weights of a current training iteration;
optimizing the target function using a second optimization algorithm, with the weights of the current training iteration being constants of the target function and sparse scaling operators being variables of the target function, to obtain sparse scaling operators of the current training iteration; and
performing a next training iteration based on the weights and sparse scaling operators of the current training iteration.
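
The sketches below illustrate the claimed steps; every function name, signature, and algorithm choice in them is an assumption made for illustration, not the patent's implementation. The first sketch covers the automatic data-selection step: frames whose detection result disagrees with the tracking result become the data to be annotated. The helpers `detect`, `track`, and the IoU-based `consistent` test are hypothetical; the claim only requires that inconsistency between the two results triggers selection.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass
class Box:
    x1: float
    y1: float
    x2: float
    y2: float

def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a.x1, b.x1), max(a.y1, b.y1)
    ix2, iy2 = min(a.x2, b.x2), min(a.y2, b.y2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union if union > 0 else 0.0

def consistent(dets: Sequence[Box], tracks: Sequence[Box], thr: float = 0.5) -> bool:
    """One plausible consistency test (an assumption): same object count, and
    every box on one side is matched by a box on the other with IoU >= thr."""
    if len(dets) != len(tracks):
        return False
    matched = lambda box, pool: any(iou(box, other) >= thr for other in pool)
    return all(matched(d, tracks) for d in dets) and all(matched(t, dets) for t in tracks)

def select_data_to_annotate(frames, detect, track):
    """detect(frame) -> boxes for one frame; track(all_detections) -> boxes per frame."""
    detections = [detect(f) for f in frames]   # target detection result per frame
    tracking = track(detections)               # tracking result per frame
    # A frame is selected as input data when the two results are inconsistent.
    return [f for f, d, t in zip(frames, detections, tracking) if not consistent(d, t)]
```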
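The growth of the training set across periods is a plain set union, as the claim states. A minimal sketch, assuming each sample is represented by a hashable identifier:

```python
def training_set_for_current_period(newly_added_annotated: set, previous_set: set) -> set:
    # Union of the set of newly added annotated data and the set of
    # training sample data used in the previous period.
    return previous_set | newly_added_annotated
```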
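The next sketch shows one way to realize a "particular structure" equipped with a sparse scaling operator, and a sparsity constraint whose weight differs per structure. Taking the penalty weight proportional to a structure's FLOPs is an assumption; the claim says only that the weights are based on the structures' different computational complexities.

```python
import torch
import torch.nn as nn

class ScaledStructure(nn.Module):
    """A 'particular structure' whose output is scaled by a learnable
    sparse scaling operator; an operator driven to zero prunes the structure."""
    def __init__(self, structure: nn.Module, flops: float):
        super().__init__()
        self.structure = structure
        self.flops = flops                        # computational complexity of this structure
        self.scale = nn.Parameter(torch.ones(1))  # sparse scaling operator

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.scale * self.structure(x)

def sparsity_penalty(structures, base: float = 1e-4) -> torch.Tensor:
    """L1 sparsity constraint with a per-structure weight; weighting by FLOPs
    is an illustrative assumption."""
    return sum(base * s.flops * s.scale.abs().sum() for s in structures)
```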
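Finally, a sketch of one training iteration of the claimed alternating scheme: the first optimization algorithm updates the network weights with the scaling operators held constant, then the second updates the scaling operators with the weights held constant. The claim does not name either algorithm; plain SGD for the weights and a gradient step followed by L1 soft-thresholding (a proximal step) for the operators are assumptions here, as are all argument names.

```python
import torch

def training_iteration(model, weight_params, scale_params, batch, loss_fn,
                       weight_opt, scale_opt, gamma: float = 1e-3):
    x, y = batch

    # First optimization algorithm: scaling operators from the previous
    # iteration are constants; the weights are the variables.
    for p in scale_params:
        p.requires_grad_(False)
    for p in weight_params:
        p.requires_grad_(True)
    weight_opt.zero_grad()
    loss_fn(model(x), y).backward()
    weight_opt.step()

    # Second optimization algorithm: the just-updated weights are constants;
    # the scaling operators are the variables.
    for p in weight_params:
        p.requires_grad_(False)
    for p in scale_params:
        p.requires_grad_(True)
    scale_opt.zero_grad()
    loss_fn(model(x), y).backward()
    scale_opt.step()
    # Soft-thresholding enforces the sparsity constraint on the operators;
    # gamma stands in for the complexity-weighted penalty strength.
    with torch.no_grad():
        for p in scale_params:
            p.copy_(p.sign() * (p.abs() - gamma).clamp(min=0.0))
```

The iteration is then repeated with the resulting weights and scaling operators; for instance, `weight_opt = torch.optim.SGD(weight_params, lr=0.1)` and `scale_opt = torch.optim.SGD(scale_params, lr=0.01)` would serve as the two optimizers in this sketch.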