US 11,756,576 B2
Classification of audio signal as speech or music based on energy fluctuation of frequency spectrum
Zhe Wang, Beijing (CN)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by Huawei Technologies Co., Ltd., Shenzhen (CN)
Filed on Mar. 11, 2022, as Appl. No. 17/692,640.
Application 17/692,640 is a continuation of application No. 16/723,584, filed on Dec. 20, 2019, granted, now 11,289,113.
Application 16/723,584 is a continuation of application No. 16/108,668, filed on Aug. 22, 2018, granted, now 10,529,361, issued on Jan. 7, 2020.
Application 16/108,668 is a continuation of application No. 15/017,075, filed on Feb. 5, 2016, granted, now 10,090,003, issued on Oct. 2, 2018.
Application 15/017,075 is a continuation of application No. PCT/CN2013/084252, filed on Sep. 26, 2013.
Claims priority of application No. 201310339218.5 (CN), filed on Aug. 6, 2013.
Prior Publication US 2022/0199111 A1, Jun. 23, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 25/81 (2013.01); G10L 25/78 (2013.01); G10L 25/18 (2013.01); G10L 19/06 (2013.01); G10L 19/12 (2013.01)
CPC G10L 25/81 (2013.01) [G10L 19/06 (2013.01); G10L 19/12 (2013.01); G10L 25/18 (2013.01); G10L 25/78 (2013.01); G10L 2025/783 (2013.01)] 20 Claims
 
1. An audio signal classification method comprising:
storing, based on at least one condition being met, data of a frequency spectrum fluctuation parameter of a current audio frame of an audio signal into a memory where data of frequency spectrum fluctuation parameters of a plurality of audio frames are stored, wherein the at least one condition comprises the current audio frame being an active frame, and wherein the frequency spectrum fluctuation parameter denotes an energy fluctuation of a frequency spectrum of the audio signal;
modifying data of frequency spectrum fluctuation parameters of audio frames preceding the current audio frame stored in the memory into ineffective data when the current audio frame is the active frame and a last audio frame preceding the current audio frame is an inactive frame;
modifying effective data stored in the memory into a first value when a current signal is percussive music, wherein the current signal comprises the current audio frame and a plurality of audio frames preceding the current audio frame;
obtaining a first group of effective data comprising data of the frequency spectrum fluctuation parameter of the current audio frame and one or more effective data of frequency spectrum fluctuation parameter of one or more audio frames continuously prior to the current audio frame;
obtaining a first average value of the first group of effective data; and
classifying the current audio frame as a music frame based on first conditions being met,
wherein the first conditions at least comprise the first average value being less than a first threshold, and
wherein the first value is less than the first threshold.
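
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name FluctuationBuffer, the INEFFECTIVE marker, the buffer capacity, the group length, and the numeric threshold values are all hypothetical choices made for illustration, and the computation of the frequency spectrum fluctuation parameter, the activity decision, and the percussive-music decision are assumed to be supplied by the caller. The sketch also assumes that being an active frame is the only storing condition, whereas the claim allows further conditions.

```python
from statistics import fmean

INEFFECTIVE = None  # hypothetical marker for entries modified into ineffective data


class FluctuationBuffer:
    """Illustrative memory of per-frame spectrum fluctuation values (assumed design)."""

    def __init__(self, first_threshold=5.0, first_value=3.0, capacity=60, group_len=30):
        # The claim requires the first value to be less than the first threshold.
        if not first_value < first_threshold:
            raise ValueError("first_value must be less than first_threshold")
        self.first_threshold = first_threshold
        self.first_value = first_value
        self.capacity = capacity    # assumed maximum number of stored frames
        self.group_len = group_len  # assumed maximum size of the first group
        self.data = []
        self.prev_active = False

    def process(self, flux, active, percussive=False):
        """Return True (music), False (first conditions not met), or None (inactive frame)."""
        if not active:
            # Inactive frames are not stored; the claim is silent on classifying them.
            self.prev_active = False
            return None
        if not self.prev_active:
            # Active frame right after an inactive one: invalidate the stored history.
            self.data = [INEFFECTIVE] * len(self.data)
        self.prev_active = True
        # Store the current frame's parameter and trim to the assumed capacity.
        self.data.append(flux)
        del self.data[:-self.capacity]
        if percussive:
            # Percussive music: overwrite every effective entry with the first value.
            self.data = [v if v is INEFFECTIVE else self.first_value for v in self.data]
        # First group: the current frame plus contiguous effective entries before it.
        group = []
        for v in reversed(self.data):
            if v is INEFFECTIVE or len(group) == self.group_len:
                break
            group.append(v)
        # The only first condition modeled here: first average below the first threshold.
        return fmean(group) < self.first_threshold


clf = FluctuationBuffer(first_threshold=5.0, first_value=3.0)
print(clf.process(2.0, active=True))   # True: average 2.0 is below 5.0
print(clf.process(10.0, active=True))  # False: average of 10.0 and 2.0 is 6.0
print(clf.process(0.0, active=False))  # None: inactive frame, nothing stored
print(clf.process(1.0, active=True))   # True: history invalidated, average is 1.0
```

Because the first value is below the first threshold, overwriting the effective data during percussive music drives the first average under the threshold, so in this sketch a percussive signal is classified as music by the averaging rule alone.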