US 11,809,955 B2
Processing images using deep neural networks
Christian Szegedy, Mountain View, CA (US); and Vincent O. Vanhoucke, San Francisco, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Sep. 28, 2022, as Appl. No. 17/936,299.
Application 17/936,299 is a continuation of application No. 17/199,978, filed on Mar. 12, 2021, granted, now 11,462,035.
Application 17/199,978 is a continuation of application No. 16/846,924, filed on Apr. 13, 2020, granted, now 10,977,529, issued on Apr. 13, 2021.
Application 16/846,924 is a continuation of application No. 15/868,587, filed on Jan. 11, 2018, granted, now 10,650,289, issued on May 12, 2020.
Application 15/868,587 is a continuation of application No. 15/649,947, filed on Jul. 14, 2017, granted, now 9,904,875, issued on Feb. 27, 2018.
Application 15/649,947 is a continuation of application No. 14/839,452, filed on Aug. 28, 2015, granted, now 9,715,642, issued on Jul. 25, 2017.
Claims priority of provisional application 62/043,865, filed on Aug. 29, 2014.
Prior Publication US 2023/0014634 A1, Jan. 19, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06K 9/46 (2006.01); G06N 3/084 (2023.01); G06N 3/063 (2023.01); G06N 3/045 (2023.01); G06V 30/194 (2022.01)

CPC G06N 3/084 (2013.01) [G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06V 30/194 (2022.01)]

15 Claims

1. A device comprising at least one processor and at least one storage device storing instructions that, when executed by the at least one processor, cause the device to implement:

a neural network configured to perform an object detection task by processing data characterizing an input image to generate an alternative representation of the input image, the neural network comprising:

a plurality of subnetworks arranged in a sequence from lowest to highest, the plurality of subnetworks configured to process the data according to the sequence, the plurality of subnetworks comprising a plurality of module subnetworks, each of the module subnetworks comprising:

a plurality of groups of neural network layers configured to process a preceding output representation generated by a preceding subnetwork in the sequence and to generate a respective group output for each of the plurality of groups, wherein each group of the plurality of groups includes at least two successive convolutional layers comprising at least one 1×1 convolutional layer followed by one of (i) a 3×3 convolutional layer or (ii) a 5×5 convolutional layer; and

an output layer configured to process the alternative representation of the input image to generate an output for the object detection task from the input image.
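
For illustration only, the module subnetwork recited in claim 1 can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the function names (conv2d_same, make_group, run_module), the channel counts, the random weights, the zero padding, and the channel-wise concatenation of the group outputs are all hypothetical choices that the claim text above does not specify. Activation functions and the output layer are omitted.

```python
import random

def conv2d_same(x, weights):
    """Stride-1 convolution with zero padding (assumed).
    x: [C_in][H][W] nested lists; weights: [C_out][C_in][k][k]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    k = len(weights[0][0])
    pad = k // 2
    out = []
    for filt in weights:
        plane = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for c in range(c_in):
                    for di in range(k):
                        for dj in range(k):
                            ii, jj = i + di - pad, j + dj - pad
                            if 0 <= ii < h and 0 <= jj < w:
                                acc += filt[c][di][dj] * x[c][ii][jj]
                plane[i][j] = acc
        out.append(plane)
    return out

def random_weights(c_out, c_in, k, rng):
    """Hypothetical helper: untrained random k x k filters."""
    return [[[[rng.uniform(-1.0, 1.0) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]

def make_group(c_in, c_mid, c_out, k, rng):
    """One group: a 1x1 convolutional layer followed by a k x k layer (k is 3 or 5)."""
    assert k in (3, 5)
    return [random_weights(c_mid, c_in, 1, rng), random_weights(c_out, c_mid, k, rng)]

def run_group(x, group):
    """Apply the group's layers in succession to produce the group output."""
    for layer_weights in group:
        x = conv2d_same(x, layer_weights)
    return x

def run_module(x, groups):
    """Every group processes the same preceding output representation.
    Concatenating group outputs along channels is an assumption."""
    group_outputs = [run_group(x, g) for g in groups]
    return [plane for out in group_outputs for plane in out]

rng = random.Random(0)
# Hypothetical 3-channel 6x6 input standing in for data characterizing an image.
image = [[[rng.random() for _ in range(6)] for _ in range(6)] for _ in range(3)]

# Two module subnetworks in sequence, each with a 1x1->3x3 group and a 1x1->5x5 group.
module_1 = [make_group(3, 2, 4, 3, rng), make_group(3, 2, 2, 5, rng)]
module_2 = [make_group(6, 3, 4, 3, rng), make_group(6, 3, 2, 5, rng)]

rep_1 = run_module(image, module_1)   # 4 + 2 = 6 channels, spatial size unchanged
rep_2 = run_module(rep_1, module_2)   # consumes the preceding subnetwork's output
```

In this sketch the 1×1 layer reduces the channel count before the larger 3×3 or 5×5 filter is applied, which is one plausible reading of why the claim places the 1×1 layer first; the claim itself does not state a rationale.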