Googlebot






From Wikipedia, the free encyclopedia
 



Googlebot

Original author(s): Google
Type: Web crawler
Website: Googlebot FAQ

Googlebot is the web crawler software used by Google that collects documents from the web to build a searchable index for the Google Search engine. The name refers to two different types of web crawler: a desktop crawler (which simulates a desktop user) and a mobile crawler (which simulates a mobile user).[1]

Behavior

A website will probably be crawled by both Googlebot Desktop and Googlebot Mobile. However, starting in September 2020, all sites were switched to mobile-first indexing, meaning that Google crawls the web using a smartphone Googlebot.[2] The subtype of Googlebot can be identified by looking at the user agent string in the request. However, both crawler types obey the same product token (user agent token) in robots.txt, so a developer cannot selectively target either Googlebot Mobile or Googlebot Desktop using robots.txt.
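The distinction between subtype detection and robots.txt matching can be illustrated with a minimal Python sketch. The two user agent strings are illustrative examples rather than strings quoted in this article, the "Mobile" substring test is an assumed heuristic, and `googlebot_subtype`, `allowed` and the `/private/` path are hypothetical names chosen for the example. The robots.txt check uses the standard library's `urllib.robotparser`, queried with the shared product token.

```python
from urllib.robotparser import RobotFileParser

# Illustrative user agent strings (assumed examples; W.X.Y.Z stands in for a Chrome version).
DESKTOP_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
MOBILE_UA = (
    "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
    "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
)


def googlebot_subtype(user_agent):
    """Return 'mobile', 'desktop', or None if the string does not mention Googlebot."""
    if "Googlebot" not in user_agent:
        return None
    # Assumed heuristic: the smartphone crawler's string contains "Mobile".
    return "mobile" if "Mobile" in user_agent else "desktop"


# A hypothetical robots.txt: one group keyed on the shared product token.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url):
    """Check a URL against the rules for the 'Googlebot' product token."""
    # There is no separate token per subtype, so the same answer
    # applies to the desktop and the mobile crawler.
    return parser.can_fetch("Googlebot", url)
```

In this sketch the subtype is visible only in the request's user agent string, while the robots.txt group applies to both subtypes at once.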

Google provides various methods that enable website owners to manage the content displayed in Google's search results. If a webmaster chooses to restrict the information on their site that is available to Googlebot or another spider, they can do so with the appropriate directives in a robots.txt file,[3] or by adding the meta tag <meta name="Googlebot" content="nofollow" /> to the web page.[4] Googlebot requests to Web servers are identifiable by a user-agent string containing "Googlebot" and a host address containing "googlebot.com".[5]
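The two identifying signals described above can be combined in a short Python sketch. The function name `looks_like_googlebot` is hypothetical, and the hostname is assumed to have been obtained separately, for example through a reverse DNS lookup of the client address. The sketch requires the hostname to end in "googlebot.com" rather than merely contain it; that stricter test is a design choice of this example, made so that a name such as "googlebot.com.evil.example" is not accepted.

```python
def looks_like_googlebot(user_agent, hostname):
    """Return True if both the user agent and the host name point to Googlebot.

    Assumption: `hostname` was resolved elsewhere (e.g. by reverse DNS);
    this function only inspects the two strings.
    """
    ua_ok = "Googlebot" in user_agent

    # Normalise case and drop a trailing dot from a fully qualified name.
    host = hostname.lower().rstrip(".")
    # Suffix match instead of substring match, to reject look-alike domains.
    host_ok = host == "googlebot.com" or host.endswith(".googlebot.com")

    return ua_ok and host_ok
```

A user agent string alone is easy to forge, which is why the sketch treats the host name as a second, independent condition.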

Currently, Googlebot follows HREF links and SRC links.[3] There is increasing evidence that Googlebot can execute JavaScript and parse content generated by Ajax calls as well.[6] There are many theories regarding how advanced Googlebot's ability to process JavaScript is, with some opinions holding that it has only a minimal ability derived from custom interpreters.[7] Currently, Googlebot uses a web rendering service (WRS) that is based on the Chromium rendering engine (version 74 as of 7 May 2019).[8] Googlebot discovers pages by harvesting every link on every page that it can find. Unless prohibited by a nofollow tag, it then follows these links to other web pages. New web pages must be linked to from other known pages on the web in order to be crawled and indexed, or be manually submitted by the webmaster.
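Link harvesting of this kind can be sketched in Python with the standard library's `html.parser`. This is a simplified illustration, not Googlebot's actual implementation: the class `LinkHarvester` and the function `harvest_links` are hypothetical names, only `rel="nofollow"` on individual anchors is honoured (a page-level nofollow meta tag is ignored), and no JavaScript is executed.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkHarvester(HTMLParser):
    """Collect absolute URLs from href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # rel may hold several space-separated values, or be absent.
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "a" and "nofollow" in rel:
            return  # the page asks crawlers not to follow this link
        for name in ("href", "src"):
            value = attrs.get(name)
            if value:
                # Resolve relative references against the page URL.
                self.links.append(urljoin(self.base_url, value))


def harvest_links(html, base_url):
    """Return the followable links found in an HTML document, in order."""
    harvester = LinkHarvester(base_url)
    harvester.feed(html)
    return harvester.links


PAGE = (
    '<a href="/about">About</a>'
    '<a rel="nofollow" href="/ads">Ads</a>'
    '<img src="logo.png">'
)
links = harvest_links(PAGE, "https://example.com/")
```

For the sample page, `links` contains the "/about" and "logo.png" targets resolved against the base URL, while the nofollow anchor is skipped. A crawler would then add each harvested URL to its queue of pages to visit.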

A problem that webmasters with low-bandwidth Web hosting plans[citation needed] have often noted with Googlebot is that it takes up an enormous amount of bandwidth.[citation needed] This can cause websites to exceed their bandwidth limit and be taken down temporarily. This is especially troublesome for mirror sites, which host many gigabytes of data. Google provides Search Console, which allows website owners to throttle the crawl rate.[9]

How often Googlebot will crawl a site depends on the crawl budget. Crawl budget is an estimate of how frequently a website is updated.[citation needed] Technically, Googlebot's development team (the Crawling and Indexing team) uses several defined terms internally to cover what "crawl budget" stands for.[10] Since May 2019, Googlebot has used the latest Chromium rendering engine, which supports ECMAScript 6 features. This makes the bot more "evergreen" and ensures that it does not rely on a rendering engine that is outdated compared to browser capabilities.[8]

Mediabot

Mediabot is the web crawler that Google uses for analyzing the content so Google AdSense can serve contextually relevant advertising to a web page. Mediabot identifies itself with the user agent string "Mediapartners-Google/2.1".

Unlike other crawlers, Mediabot does not follow links to discover new crawlable URLs, instead visiting only URLs that include the AdSense code.[11] Where that content resides behind a login, the crawler can be given login credentials so that it is able to crawl the protected content.[12]

Inspection Tool Crawlers

InspectionTool is the crawler used by Search testing tools such as the Rich Result Test and URL inspection in Google Search Console. Apart from the user agent and user agent token, it mimics Googlebot.[13]

A guide to the crawlers was independently published.[14] It details four distinct crawler agents based on Web server directory index data: one non-Chrome crawler and three Chrome crawlers.

References

  1. "Googlebot". Google. 2019-03-11. Retrieved 2019-03-11.
  2. "Announcing mobile first indexing for the whole web". Google Developers. Retrieved 2021-03-17.
  3. "Google Search Console". Google.com.
  4. "Google Search Console". search.google.com. Retrieved 2019-03-11.
  5. "What is Googlebot | Google Search Central | Documentation". May 2022.
  6. "Understand the JavaScript SEO basics | Search for Developers". Google Developers. Retrieved 2020-07-26.
  7. Splitt, Martin. "How Google Search indexes JavaScript sites - JavaScript SEO". YouTube. Archived from the original on 2021-12-12.
  8. "The new evergreen Googlebot". Official Google Webmaster Central Blog. Retrieved 2019-06-07.
  9. "Google - Webmasters". Retrieved 2012-12-15.
  10. "What Crawl Budget Means for Googlebot". Official Google Webmaster Central Blog. Retrieved 2018-07-04.
  11. "About the AdSense Crawler".
  12. "Display ads on login-protected pages".
  13. "Google Crawler (User Agent) Overview".
  14. "The Ultimate Guide to the New InspectionTool Crawlers".
Retrieved from "https://en.wikipedia.org/w/index.php?title=Googlebot&oldid=1229344262"

    Categories: 
    Google software
    Web crawlers
    Internet bots
    Google Search
    This page was last edited on 16 June 2024, at 08:25 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.


