US 7,519,221 B1
Reconstructing high-fidelity electronic documents from images via generation of synthetic fonts
Dennis G. Nicholson, Atherton, Calif. (US)
Assigned to Adobe Systems Incorporated, San Jose, Calif. (US)
Filed on Feb. 28, 2005, as Appl. No. 11/69,510.
Int. Cl. G06K 9/00 (2006.01); G06K 9/34 (2006.01); G06K 9/18 (2006.01); G06K 9/62 (2006.01)
U.S. Cl. 382—181  [382/180; 382/186; 382/209] 19 Claims
 
13. A method for generating a synthetic font, comprising:
using a computer to perform:
receiving a set of scanned character images;
producing glyphs from the set of scanned character images, wherein producing glyphs from the set of scanned character images involves grouping similar character images into clusters, and iteratively:
registering scanned character images in each cluster with sub-pixel accuracy,
extracting a high-resolution, noise-reduced prototype from the registered character images for each cluster,
measuring a distance from each registered character image to its associated prototype, and
using the measured distances to purify each cluster via histogram analysis of inter-cluster and intra-cluster distances;
obtaining character labels for the glyphs; and
using the glyphs and associated character labels to form the synthetic font, whereby the synthetic font can represent both a logical content and a visual appearance of characters in a document, wherein the visual appearance of characters in the synthetic font are faithful replicas of corresponding characters on printed pages from which the images were generated.
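
The steps recited in claim 13 can be illustrated with a small sketch. This is not the patented implementation. It is a toy reading of the claim under several assumptions: images are small 2D lists of floats; "sub-pixel" registration is approximated by integer shifts on a grid upsampled by a hypothetical factor SCALE; the prototype is a plain pixel-wise mean; the distance is mean squared difference; and the "histogram analysis" is reduced to picking the bin edge that best separates intra-cluster from inter-cluster distances. All function names (upsample, register, pick_cutoff, refine, build_font) and the labels passed in are illustrative only.

```python
from statistics import mean

SCALE = 2  # assumed upsampling factor; a 1-pixel shift here is 1/SCALE of a scan pixel


def upsample(img, s=SCALE):
    """Nearest-neighbour upsample of a 2D list by an integer factor."""
    return [[v for v in row for _ in range(s)] for row in img for _ in range(s)]


def shift(img, dy, dx):
    """Move image content by (dy, dx) pixels, filling exposed cells with 0."""
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else 0.0
             for x in range(w)] for y in range(h)]


def distance(a, b):
    """Mean squared pixel difference between two equally sized images."""
    n = len(a) * len(a[0])
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n


def register(img, ref, max_shift=2):
    """Exhaustive search for the shifted copy of img closest to ref."""
    span = range(-max_shift, max_shift + 1)
    return min((shift(img, dy, dx) for dy in span for dx in span),
               key=lambda c: distance(c, ref))


def prototype(images):
    """Pixel-wise mean of registered images; averaging suppresses noise."""
    h, w = len(images[0]), len(images[0][0])
    return [[mean(im[y][x] for im in images) for x in range(w)] for y in range(h)]


def cluster(images, threshold):
    """Greedy grouping: join the first cluster whose seed image is close enough."""
    clusters = []
    for im in images:
        for c in clusters:
            if distance(register(im, c[0]), c[0]) < threshold:
                c.append(im)
                break
        else:
            clusters.append([im])
    return clusters


def pick_cutoff(intra, inter, bins=20):
    """Scan histogram bin edges; return the edge misclassifying the fewest distances."""
    top = max(intra + inter)
    edges = [top * (i + 1) / bins for i in range(bins)]
    return min(edges, key=lambda e: sum(d > e for d in intra) + sum(d <= e for d in inter))


def refine(clusters, rounds=3):
    """Iterate: register, extract prototypes, measure distances, purify clusters."""
    clusters = [[upsample(im) for im in c] for c in clusters]
    protos = [c[0] for c in clusters]  # start from each cluster's first member
    for _ in range(rounds):
        clusters = [[register(im, p) for im in c] for c, p in zip(clusters, protos)]
        protos = [prototype(c) for c in clusters]
        intra = [[distance(im, p) for im in c] for c, p in zip(clusters, protos)]
        inter = [distance(im, q) for i, c in enumerate(clusters) for im in c
                 for j, q in enumerate(protos) if j != i]
        cutoff = pick_cutoff([d for ds in intra for d in ds], inter)
        # Purify: drop members that sit farther from their prototype than the cutoff.
        clusters = [[im for im, d in zip(c, ds) if d <= cutoff]
                    for c, ds in zip(clusters, intra)]
        clusters = [c for c in clusters if c]
        if not clusters:
            break
        protos = [prototype(c) for c in clusters]
    return clusters, protos


def build_font(protos, labels):
    """Pair each glyph prototype with its character label."""
    return dict(zip(labels, protos))


def demo():
    bar = [[1.0 if x == 3 else 0.0 for x in range(7)] for y in range(7)]   # vertical stroke
    dash = [[1.0 if y == 3 else 0.0 for x in range(7)] for y in range(7)]  # horizontal stroke
    scans = [bar, shift(bar, 0, 1), dash, shift(dash, 1, 0), shift(bar, 0, -1)]
    clusters, protos = refine(cluster(scans, threshold=0.1))
    # Labels are supplied by hand here; the claim leaves their source open.
    return build_font(protos, labels=["l", "-"]), [len(c) for c in clusters]


if __name__ == "__main__":
    font, sizes = demo()
    print(sizes, sorted(font))
```

In this sketch the glyph stored for each label is the cluster prototype at SCALE times the scan resolution, which loosely mirrors the claim's "high-resolution, noise-reduced prototype". A practical system would likely need interpolation-based registration and a more robust similarity measure than mean squared difference, since zero-filled shifts can bias the search when prototypes are blurry.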