What is a Googlebot?
January 28th, 2008
Definitions of Googlebot on the Web:
* The crawler Google uses on a daily basis to find and index new web pages.
* Google’s main spider which scours the web for pages
* Google’s search spider.
* Google’s Web spider.
* This is Google's spider, or crawler, which crawls web sites monthly. Googlebot also visits and indexes some pages on a daily basis and marks those pages in its search results as being fresh.
* Google’s robot scanning the Internet is called GoogleBot. The technology employed in GoogleBots is varied. They all share a cache between them. The different spiders include vertical search spiders and spiders associated with ad targeting.
* Name of the indexing robot of Google, which scans the Web from link to link looking for new pages. You may know if googlebot came to visit your site by looking at the log files of your server.
* The agent name of Google’s search engine spider which crawls the web to create its searchable index.
* Google’s main robot. Known as a “spider”, it crawls the web and records information about websites as it visits them.
* Name of the search robot used by Google.
* A Googlebot is a search bot used by Google. It collects documents from the web to build a searchable index for the Google search engine.
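One of the definitions above mentions checking your server's log files to see whether Googlebot has visited. Here is a minimal sketch of that idea in Python. It assumes the server writes the Apache "combined" log format. The sample lines, IP addresses and paths are made up for illustration, and the function name googlebot_hits is hypothetical. Keep in mind that a user-agent string can be faked, so a match only suggests a Googlebot visit.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields
# of a line in Apache "combined" log format (an assumption about the server).
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Made-up sample lines; the IP addresses come from a documentation-only range.
SAMPLE_LOG = [
    '192.0.2.10 - - [28/Jan/2008:17:05:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.20 - - [28/Jan/2008:17:06:00 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" '
    '"Mozilla/5.0 (X11; Linux i686) Firefox/2.0"',
    '192.0.2.10 - - [28/Jan/2008:17:07:00 +0000] "GET /about.html HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def googlebot_hits(log_lines):
    """Count requests per path whose user-agent field mentions Googlebot."""
    hits = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match and "googlebot" in match.group("agent").lower():
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    for path, count in googlebot_hits(SAMPLE_LOG).items():
        print(path, count)
```

In practice you would pass the lines of your own access log instead of SAMPLE_LOG.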
If a webmaster wishes to restrict the information on their site available to a Googlebot, or another well-behaved spider, they can do so with the appropriate directives in a robots.txt file, or by adding the meta tag <META NAME="Googlebot" CONTENT="nofollow"> to the webpage. Googlebot requests to web servers can be identified by the string ‘Googlebot’ in their user-agent header.
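To see how robots.txt directives affect a spider, you can test them with Python's standard urllib.robotparser module. The following is only a sketch: the robots.txt content, the /private/ path and the "OtherBot" name are made-up examples, and Google's own parser may interpret some rules differently from this module.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of /private/, allow everything else.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("Googlebot", "/private/report.html"),
        ("Googlebot", "/index.html"),
        ("OtherBot", "/private/report.html"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Note that robots.txt is only a request. Well-behaved spiders honour it, but it does not protect pages from crawlers that choose to ignore it.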
Googlebot has two versions, deepbot and freshbot. Deepbot, the deep crawler, tries to follow every link on the web and download as many pages as it can to the Google indexers. It completes this process about once a month. Freshbot crawls the web looking for fresh content. It visits websites that change frequently, according to how frequently they change. Currently Googlebot only follows HREF links and SRC links.
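As an illustration of what "following HREF links and SRC links" means, here is a small Python sketch that pulls the values of href and src attributes out of a page using the standard html.parser module. This is not Googlebot's actual code. The class name LinkExtractor, the function extract_links and the sample HTML are made up for the example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the values of href and src attributes, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so HREF and href both match.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = (
        '<a HREF="/about.html">About</a>'
        '<img src="/logo.png">'
        '<form action="/search"></form>'
    )
    print(extract_links(page))
```

The form's action attribute is skipped because it is neither an HREF nor a SRC link.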
Googlebot discovers pages by harvesting all of the links on every page it finds. It then follows these links to other web pages. New web pages must be linked to from another known page on the web in order to be crawled and indexed.
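The discovery process described above can be sketched as a breadth-first walk over links. This is a simplified model, not how Google actually implements its crawler. The link_graph dictionary stands in for the web, and the page names and the discover function are hypothetical.

```python
from collections import deque


def discover(seed, link_graph):
    """Return pages reachable from seed by following links, in visit order."""
    seen = {seed}
    queue = deque([seed])
    order = []
    while queue:
        page = queue.popleft()
        order.append(page)
        # Harvest every link on the page and queue the ones not seen before.
        for target in link_graph.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


if __name__ == "__main__":
    # Made-up site: "/orphan" links out, but no known page links to it.
    link_graph = {
        "/": ["/a", "/b"],
        "/a": ["/b", "/c"],
        "/orphan": ["/"],
    }
    print(discover("/", link_graph))
```

The page "/orphan" never shows up in the result because nothing links to it, which is why a new page needs a link from an already known page before it can be crawled.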
A problem which webmasters have often noted with the Googlebot is that it takes up an enormous amount of bandwidth. This can cause websites to exceed their bandwidth limit and be taken down temporarily. This is especially troublesome for mirror sites which host many gigabytes of data. Google provides “Webmaster Tools” that allow website owners to throttle the crawl rate.
You may want to visit Google’s support pages about Googlebot to learn more about:
* Why doesn’t Google index all of the pages of my site?
* Google 101: How Google crawls, indexes, and serves the web
* How often will Googlebot access my web pages?
* Why is Googlebot downloading information from our “secret” web server?
* Can you tell me the IP addresses from which Googlebot crawls so that I can filter my logs?
* How do I prevent Googlebot from following links on my pages?
* What is Feedfetcher, and why is it ignoring my robots.txt file?
* Why is Feedfetcher downloading the same page on my site multiple times?
* What kinds of links does Feedfetcher follow?
What is the Google Dance?
Once a month, and totally unannounced, Google has a major shift in its rankings. This is when Google “tweaks” its algorithm, and when it updates each site’s PageRank and Back Links.
During the month there will be minor changes in rankings. This is called ‘Everflux’. But only about once per month does Google Dance, updating the back links and the PageRank. The dance usually lasts about 3-5 days. During these days the Google Results will vary widely.
The Google Spider is called Googlebot. Most sites are revisited by Googlebot only around the Google Dance time.
