US 11,816,557 B2
Method and apparatus with neural network parameter quantization
Sangwon Ha, Seongnam-si (KR); Gunhee Kim, Suwon-si (KR); and Donghyun Lee, Seongnam-si (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Sep. 22, 2022, as Appl. No. 17/950,342.
Application 17/950,342 is a continuation of application No. 16/909,095, filed on Jun. 23, 2020, granted, now 11,481,608, issued on Oct. 25, 2022.
Claims priority of application No. 10-2019-0176734 (KR), filed on Dec. 27, 2019.
Prior Publication US 2023/0017432 A1, Jan. 19, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/00 (2023.01); G06N 3/063 (2023.01); G06F 17/18 (2006.01); G06F 18/10 (2023.01); G06F 18/2321 (2023.01); G06N 3/047 (2023.01); G06V 10/764 (2022.01)

CPC G06N 3/063 (2013.01) [G06F 17/18 (2013.01); G06F 18/10 (2023.01); G06F 18/2321 (2023.01); G06N 3/047 (2023.01); G06V 10/764 (2022.01)]

13 Claims

1. A processor-implemented neural network method, the method comprising:

determining a respective probability density function (PDF) of normalizing a statistical distribution of parameter values, for each channel of each of a plurality of feature maps of a pre-trained neural network;

determining, for each channel, a corresponding quantization range for performing quantization of corresponding parameter values, based on the respective determined PDF;

correcting, for each channel, the corresponding quantization range based on a signal-to-quantization noise ratio (SQNR) of the respective determined PDF; and

generating a quantized neural network, based on the corrected quantization range corresponding for each channel.
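
The four steps of claim 1 can be illustrated with a minimal sketch. The claim text above does not specify the PDF family, the rule for the initial range, or the form of the SQNR-based correction, so everything concrete below is an assumption made for illustration only: a Gaussian fit per channel, an initial range of the mean plus or minus three standard deviations, and a correction that rescales that range to the candidate width giving the highest SQNR on samples drawn from the fitted PDF. The function names (`quantize`, `sqnr_db`, `corrected_range`, `quantize_channels`) are hypothetical and do not come from the patent.

```python
import math
import random
import statistics


def quantize(values, lo, hi, bits):
    """Clip values to [lo, hi] and round them onto a uniform grid of 2**bits levels."""
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((min(max(v, lo), hi) - lo) / step) * step for v in values]


def sqnr_db(reference, quantized):
    """Signal-to-quantization-noise ratio in decibels."""
    signal = sum(r * r for r in reference)
    noise = sum((r - q) ** 2 for r, q in zip(reference, quantized))
    return 10.0 * math.log10(signal / max(noise, 1e-30))


def corrected_range(channel, bits, scales=(0.5, 0.75, 1.0, 1.25, 1.5, 2.0),
                    n_samples=2000, seed=0):
    """Return a (lo, hi) quantization range for one channel."""
    # Step 1 (assumed form): model the channel's values with a Gaussian PDF.
    mu = statistics.fmean(channel)
    sigma = max(statistics.pstdev(channel), 1e-12)

    # Step 2 (assumed rule): initial range is mu +/- 3 sigma.
    half_width = 3.0 * sigma

    # Step 3 (assumed rule): rescale the range, scoring each candidate by the
    # SQNR it achieves on samples drawn from the fitted PDF.
    rng = random.Random(seed)
    samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]

    def score(scale):
        lo, hi = mu - scale * half_width, mu + scale * half_width
        return sqnr_db(samples, quantize(samples, lo, hi, bits))

    best = max(scales, key=score)
    return mu - best * half_width, mu + best * half_width


def quantize_channels(channels, bits=8):
    """Step 4: quantize every channel with its own corrected range."""
    ranges = [corrected_range(ch, bits) for ch in channels]
    quantized = [quantize(ch, lo, hi, bits) for ch, (lo, hi) in zip(channels, ranges)]
    return quantized, ranges
```

Here each channel is a flat list of floats, and a feature map is a list of such channels; calling `quantize_channels(channels, bits=4)` returns the quantized values together with the per-channel `(lo, hi)` ranges. Scoring candidates on samples from the fitted PDF, rather than on the raw values, is one reading of "SQNR of the respective determined PDF" and is not confirmed by the claim text.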