US 9,811,726 B2
Chinese, Japanese, or Korean language detection
Mikhail Yurievich Atroshchenko, Moscow (RU); Dmitry Georgievich Deryagin, Moscow (RU); and Yuri Georgievich Chulinin, Moscow (RU)
Assigned to ABBYY DEVELOPMENT LLC, Moscow (RU)
Filed by ABBYY Development LLC, Moscow (RU)
Filed on Jun. 26, 2016, as Appl. No. 15/193,058.
Application 15/193,058 is a continuation of application No. 14/561,851, filed on Dec. 5, 2014, now Pat. No. 9,378,414, issued on Jun. 28, 2016.
Prior Publication US 2016/0307033 A1, Oct. 20, 2016
This patent is subject to a terminal disclaimer.
Int. Cl. G06K 9/00 (2006.01); G06F 17/27 (2006.01); G06K 9/32 (2006.01); G06F 17/22 (2006.01); G06K 9/68 (2006.01); G06K 9/18 (2006.01)
CPC G06K 9/00456 (2013.01) [G06F 17/2223 (2013.01); G06F 17/275 (2013.01); G06F 17/2775 (2013.01); G06K 9/18 (2013.01); G06K 9/3208 (2013.01); G06K 9/6821 (2013.01); G06K 2209/011 (2013.01)]
22 Claims
1. A method comprising:
determining a language hypothesis for each text fragment in a plurality of text fragments identified from connected components in a document image;
selecting a first subset of text fragments from the plurality of text fragments based on ratings for the language hypothesis of each text fragment in the plurality of text fragments;
verifying, by a processor, the language hypothesis of one or more text fragments in the first subset of text fragments based on optical character recognition of the one or more text fragments; and
determining, by the processor, that Chinese, Japanese, or Korean (CJK) characters are present in the document image based on the verification of the language hypothesis of the one or more text fragments.
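The flow of claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The `Fragment` type, the `"cjk"`/`"non-cjk"` hypothesis labels, the top-N selection rule, the `ocr` callable, and the `min_share` and `min_verified` thresholds are all assumptions introduced for illustration. Verification is approximated here by checking whether the recognized text falls mostly within well-known CJK Unicode blocks.

```python
from dataclasses import dataclass
from typing import Callable, List

# Unicode blocks used as a stand-in verification criterion (an assumption,
# not something the claim specifies).
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3040, 0x309F),  # Hiragana
    (0x30A0, 0x30FF),  # Katakana
    (0xAC00, 0xD7A3),  # Hangul Syllables
)


@dataclass
class Fragment:
    """Hypothetical text fragment built from connected components."""
    image_id: str     # handle the OCR callable uses to find the pixels
    hypothesis: str   # assumed labels: "cjk" or "non-cjk"
    rating: float     # confidence in the hypothesis, higher is better


def is_cjk_char(ch: str) -> bool:
    """Return True if the character lies in one of the CJK_RANGES blocks."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in CJK_RANGES)


def select_subset(fragments: List[Fragment], top_n: int) -> List[Fragment]:
    """Pick the top_n highest-rated fragments hypothesized to be CJK."""
    candidates = [f for f in fragments if f.hypothesis == "cjk"]
    return sorted(candidates, key=lambda f: f.rating, reverse=True)[:top_n]


def verify(fragment: Fragment, ocr: Callable[[Fragment], str],
           min_share: float = 0.5) -> bool:
    """Run OCR on the fragment and confirm the CJK hypothesis if at least
    min_share of the non-space recognized characters are CJK."""
    chars = [c for c in ocr(fragment) if not c.isspace()]
    if not chars:
        return False
    share = sum(is_cjk_char(c) for c in chars) / len(chars)
    return share >= min_share


def document_has_cjk(fragments: List[Fragment],
                     ocr: Callable[[Fragment], str],
                     top_n: int = 3, min_verified: int = 1) -> bool:
    """Decide that CJK text is present if enough selected fragments verify."""
    subset = select_subset(fragments, top_n)
    verified = sum(verify(f, ocr) for f in subset)
    return verified >= min_verified


# Usage with a fake OCR engine backed by a lookup table.
fake_texts = {"a": "日本語の文書", "b": "hello", "c": "Invoice"}
fragments = [
    Fragment("a", "cjk", 0.9),
    Fragment("b", "cjk", 0.4),
    Fragment("c", "non-cjk", 0.95),
]
print(document_has_cjk(fragments, lambda f: fake_texts[f.image_id]))
```

Running OCR only on the highest-rated subset keeps the expensive recognition step off most fragments, which is the apparent motivation for ordering the steps as the claim does. In this sketch the document-level decision depends only on the verified subset.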