US 7,546,370 B1
Search engine with multiple crawlers sharing cookies
Anurag Acharya, Campbell, Calif. (US); Michal Louz-On, Philadelphia, Pa. (US); and Alexander C. Roetter, Palo Alto, Calif. (US)
Assigned to Google Inc., Mountain View, Calif. (US)
Filed on Aug. 18, 2004, as Appl. No. 10/921,378.
Int. Cl. G06F 15/173 (2006.01)
U.S. Cl. 709—227  [709/223; 715/745] 11 Claims
 
1. A web crawler system, comprising:
a plurality of network crawlers each including one or more processors and memory storing one or more modules to be executed by the one or more processors, the one or more modules having instructions for fetching documents from hosts on a network; and
a cookie database shared by the plurality of network crawlers, the cookie database storing cookies and associated information for use by the plurality of network crawlers;
wherein each network crawler of the plurality of network crawlers further includes instructions for retrieving one or more cookies from the cookie database so as to enable access to documents on at least one of the hosts on the network and each of the network crawlers includes instructions for detecting any of a plurality of predefined cookie errors associated with fetching a document by comparing a fetched document with a plurality of predefined cookie error patterns; and
wherein the cookie database includes cookie acquisition information corresponding to each of at least a plurality of the cookies in the cookie database; the cookie acquisition information for a respective cookie enabling a respective network crawler to acquire the cookie from an acquisition URL specified by the cookie acquisition information; wherein the acquisition URL is distinct from a target URL to be accessed using the respective cookie.
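
The arrangement recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (CookieRecord, CookieDatabase, Crawler, has_cookie_error, fake_transport), the two error patterns, and the example.com URLs are hypothetical and chosen only for illustration. The network is replaced by an in-memory stand-in so the sketch runs without any connection.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical patterns; the claim only says the patterns are "predefined".
COOKIE_ERROR_PATTERNS = [
    re.compile(r"please enable cookies", re.IGNORECASE),
    re.compile(r"session (has )?expired", re.IGNORECASE),
]


@dataclass
class CookieRecord:
    host: str
    acquisition_url: str          # where the cookie is obtained
    value: Optional[str] = None   # None until some crawler acquires it


class CookieDatabase:
    """Cookie store shared by every crawler (assumed keyed by host)."""

    def __init__(self) -> None:
        self._records: Dict[str, CookieRecord] = {}

    def put(self, record: CookieRecord) -> None:
        self._records[record.host] = record

    def get(self, host: str) -> Optional[CookieRecord]:
        return self._records.get(host)


def has_cookie_error(document: str) -> bool:
    """Compare a fetched document against the predefined error patterns."""
    return any(p.search(document) for p in COOKIE_ERROR_PATTERNS)


# A transport takes (url, cookie) and returns the response body.
Transport = Callable[[str, Optional[str]], str]


class Crawler:
    def __init__(self, db: CookieDatabase, transport: Transport) -> None:
        self.db = db
        self.transport = transport

    def fetch(self, host: str, target_url: str) -> str:
        record = self.db.get(host)
        cookie = record.value if record else None
        document = self.transport(target_url, cookie)
        if record is not None and has_cookie_error(document):
            # Acquire the cookie from the acquisition URL, which is
            # distinct from the target URL, then retry the target.
            record.value = self.transport(record.acquisition_url, None)
            document = self.transport(target_url, record.value)
        return document


requests_seen: List[str] = []


def fake_transport(url: str, cookie: Optional[str]) -> str:
    """In-memory stand-in for HTTP; records each requested URL."""
    requests_seen.append(url)
    if url == "http://example.com/login":
        return "session=abc"
    if cookie == "session=abc":
        return "Article body"
    return "Please enable cookies to view this page."


db = CookieDatabase()
db.put(CookieRecord("example.com", "http://example.com/login"))
first = Crawler(db, fake_transport)
second = Crawler(db, fake_transport)
doc_a = first.fetch("example.com", "http://example.com/article")
doc_b = second.fetch("example.com", "http://example.com/article")
```

In this sketch the first crawler hits the cookie error, acquires the cookie from the acquisition URL, and stores it in the shared database. The second crawler then reuses that stored cookie and never contacts the acquisition URL itself.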