| US 7,599,920 B1 | ||
| System and method for enabling website owners to manage crawl rate in a website indexing system | ||
| Vanessa Fox, Redmond, Wash. (US); Amanda Ann Camp, Kirkland, Wash. (US); Maximilian Ibel, Pfaeffikon (Switzerland); Patrik Rene Celeste Reali, Zurich (Switzerland); Jeremy J. Lilley, Mountain View, Calif. (US); Katherine Jane Lai, Cambridge, Mass. (US); Ted J. Bonkenburg, Mountain View, Calif. (US); and Neal Douglas Cardwell, San Francisco, Calif. (US) | ||
| Assigned to Google Inc., Mountain View, Calif. (US) | ||
| Filed on Oct. 12, 2006, as Appl. No. 11/549,075. | ||
| Int. Cl. G06F 17/30 (2006.01) | ||
| U.S. Cl. 707/3 [709/224] | 36 Claims |

| 1. A computer-implemented method of indexing documents in websites, the method comprising:
on a server system having one or more processors and memory storing programs to be executed by the one or more processors:
for each website of a multiplicity of websites, each website having a corresponding current crawl rate limit:
crawling the respective website, in accordance with the current crawl rate limit corresponding to the respective website, to download documents from the respective website for inclusion in a database;
storing crawl data associated with the crawling of the respective website;
providing, for display, a crawl rate control mechanism to a respective owner of the respective website, including providing for display to the respective owner at least a portion of the crawl data, and enabling selection of a new crawl rate limit corresponding to the respective website by the respective owner;
comparing a maximum crawl rate for the respective website over a defined period of time with the current crawl rate limit for crawling the respective website to determine if the current crawl rate limit is a limiting factor in crawling the respective website; and
in response to a request to increase a current crawl rate for crawling the respective website, increasing the current crawl rate limit only when the current crawl rate limit is a limiting factor in crawling the respective website.
|
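
The last two steps of claim 1 (the comparison and the conditional increase) can be illustrated with a minimal sketch. It is not the patented implementation. The names `SiteCrawlState`, `limit_is_limiting_factor`, and `request_increase` are hypothetical, as is the choice of requests per second as the unit. The sketch also assumes that the limit counts as a "limiting factor" when the maximum observed crawl rate over the period reached the limit; the claim does not specify the exact test.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SiteCrawlState:
    """Hypothetical per-website record; field names are illustrative only."""
    crawl_rate_limit: float  # current crawl rate limit (assumed unit: requests/second)
    # Crawl rates observed over the defined period (part of the stored crawl data).
    observed_rates: List[float] = field(default_factory=list)


def limit_is_limiting_factor(site: SiteCrawlState) -> bool:
    """Compare the maximum crawl rate over the period with the current limit.

    Assumption: the limit is treated as limiting when the crawler's peak
    rate reached it. With no crawl data, the limit is not considered limiting.
    """
    if not site.observed_rates:
        return False
    return max(site.observed_rates) >= site.crawl_rate_limit


def request_increase(site: SiteCrawlState, new_limit: float) -> bool:
    """Handle an owner's request to raise the crawl rate.

    The limit is raised only when it is currently a limiting factor.
    Returns True if the limit was changed.
    """
    if new_limit <= site.crawl_rate_limit:
        return False  # not an increase; handled elsewhere in a real system
    if not limit_is_limiting_factor(site):
        return False  # crawler never reached the limit, so raising it has no effect
    site.crawl_rate_limit = new_limit
    return True


# The crawler hit the 2.0 limit during the period, so the increase is granted.
busy = SiteCrawlState(crawl_rate_limit=2.0, observed_rates=[1.2, 2.0, 1.9])
busy_granted = request_increase(busy, 4.0)

# The crawler stayed well below the limit, so the request is declined.
idle = SiteCrawlState(crawl_rate_limit=2.0, observed_rates=[0.3, 0.5])
idle_granted = request_increase(idle, 4.0)
```

In this sketch, declining the request for the idle site reflects the claim's condition: raising a limit the crawler never reaches would not change how the site is crawled.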