US 9,811,664 B1
Methods and systems for detecting unwanted web contents
Cheng-Hsin Hsu, Taipei (TW); Peng-Shih Pu, Taipei (TW); Chih-Chia Chen, Taipei (TW); and Shr-An Su, Taipei (TW)
Assigned to Trend Micro Incorporated, Tokyo (JP)
Filed by Cheng-Hsin Hsu, Taipei (TW); Peng-Shih Pu, Taipei (TW); Chih-Chia Chen, Taipei (TW); and Shr-An Su, Taipei (TW)
Filed on Aug. 15, 2011, as Appl. No. 13/209,807.
Int. Cl. G06F 21/00 (2013.01); G06F 21/56 (2013.01); H04L 29/06 (2006.01)
CPC G06F 21/563 (2013.01) [G06F 21/564 (2013.01); H04L 63/1416 (2013.01)]
13 Claims
 
1. A method of detecting unwanted web contents, the method to be performed by a first computer and a second computer that each comprises a processor and a memory, the method comprising:
the first computer receiving a first web page from a first website;
the first computer extracting a plurality of hypertext markup language (HTML) tags from the first web page;
the first computer generating page structure traits of the first web page by forming the plurality of HTML tags together into a pattern that comprises the plurality of HTML tags;
the first computer comparing the page structure traits of the first web page to page structure traits of a normal web page;
to prevent false positives, the first computer removing from the page structures of the first web page a feature that makes the page structure traits of the normal web page match the page structure traits of the first web page;
the second computer receiving the page structure traits of the first web page after the feature has been removed from the page structure traits of the first web page; and
the second computer detecting unwanted web content in a second web page received from a second website by comparing page structure traits of the second web page against the page structure traits of the first web page.
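The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not say how tags are formed into a pattern or how traits are compared, so the tag n-gram representation, the overlap threshold, and all function names below (`extract_tags`, `page_structure_traits`, `remove_normal_features`, `looks_unwanted`) are assumptions made for illustration only.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def extract_tags(html):
    """Extract the sequence of HTML tags from a page."""
    collector = TagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


def page_structure_traits(html, n=3):
    """Assumed trait format: the set of patterns of n consecutive tags."""
    tags = extract_tags(html)
    return {">".join(tags[i:i + n]) for i in range(len(tags) - n + 1)}


def remove_normal_features(traits, normal_traits):
    """Drop features that also match a normal page, to limit false positives."""
    return traits - normal_traits


def looks_unwanted(candidate_traits, reference_traits, threshold=0.5):
    """Assumed comparison: fraction of reference traits found in the candidate."""
    if not reference_traits:
        return False
    overlap = len(candidate_traits & reference_traits) / len(reference_traits)
    return overlap >= threshold


# "First computer": build traits from a known unwanted page and filter them.
first_page = ("<html><body><div><iframe></iframe><script></script>"
              "<a></a></div></body></html>")
normal_page = "<html><body><div><p></p></div></body></html>"
filtered_traits = remove_normal_features(
    page_structure_traits(first_page), page_structure_traits(normal_page)
)

# "Second computer": compare a newly received page against the filtered traits.
second_page = ("<html><body><div><iframe></iframe><script></script>"
               "<a></a><p></p></div></body></html>")
second_is_unwanted = looks_unwanted(
    page_structure_traits(second_page), filtered_traits
)
normal_is_unwanted = looks_unwanted(
    page_structure_traits(normal_page), filtered_traits
)
```

In this toy example the pattern `html>body>div` appears in both the first page and the normal page, so it is removed before the traits are handed to the second computer. The second page still shares the remaining patterns and is flagged, while the normal page is not.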