noindex






From Wikipedia, the free encyclopedia
 


The noindex value of an HTML robots meta tag requests that automated Internet bots avoid indexing a web page.[1][2] Reasons to use this meta tag include advising robots not to index a very large database, web pages that are very transitory, web pages that are under development, web pages that one wishes to keep slightly more private, or the printer- and mobile-friendly versions of pages. Because honoring a website's noindex tag is up to the author of the search robot, these tags are sometimes ignored. The interpretation of the noindex tag also differs slightly from one search engine company to the next.

Noindexing entire pages

<html>
<head>
  <meta name="robots" content="noindex">
  <title>Don't index this page</title>
</head>

Possible values for the meta tag content are: "none", "all", "index", "noindex", "nofollow", and "follow". A combination of the values is also possible,[1] for example:

<meta name="robots" content="noindex, follow">
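
How a crawler might read these values can be sketched in Python. This is a minimal illustration rather than any search engine's actual logic: the names `RobotsMetaParser` and `may_index` are hypothetical, and the sketch assumes that "none" is treated as shorthand for "noindex, nofollow".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of values.
        for value in (attr.get("content") or "").split(","):
            value = value.strip().lower()
            if value:
                self.directives.add(value)


def may_index(html):
    """Return False when the page asks not to be indexed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # Assumption: "none" is shorthand for "noindex, nofollow".
    return not ({"noindex", "none"} & parser.directives)
```

With this sketch, a page carrying `content="noindex, follow"` is reported as not indexable, while a page with no robots meta tag is indexable by default.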

Bot-specific directives

The noindex directive can be restricted to specific bots by specifying a different "name" value in the meta tag. For example, to block only Google's bot,[3] specify:

<meta name="googlebot" content="noindex">

Or, to block Bing's bot, specify:

<meta name="bingbot" content="noindex">

Or, to block Baidu's bot, specify:

<meta name="baiduspider" content="noindex">
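
The lookup a bot might perform can be sketched as follows. This is an assumption-laden illustration: the names `MetaDirectives` and `blocked_for` are hypothetical, and the sketch assumes a bot honors both the generic "robots" tag and the tag carrying its own name.

```python
from html.parser import HTMLParser


class MetaDirectives(HTMLParser):
    """Map each meta name (lowercased) to its set of directives (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.by_name = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").strip().lower()
        if not name:
            return
        content = attr.get("content") or ""
        values = {v.strip().lower() for v in content.split(",") if v.strip()}
        self.by_name.setdefault(name, set()).update(values)


def blocked_for(html, bot_name):
    """Return True when the page asks the named bot not to index it."""
    parser = MetaDirectives()
    parser.feed(html)
    # Assumption: the generic tag and the bot-specific tag both apply.
    generic = parser.by_name.get("robots", set())
    specific = parser.by_name.get(bot_name.lower(), set())
    return bool({"noindex", "none"} & (generic | specific))
```

For a page containing only the `googlebot` tag above, `blocked_for(page, "googlebot")` is true and `blocked_for(page, "bingbot")` is false.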

robots.txt file

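A robots.txt file can be used to block crawling. This differs from noindexing: a crawler that is barred from fetching a page cannot read a noindex directive on that page.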
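
A crawl check against robots.txt rules can be sketched with Python's standard `urllib.robotparser` module. The rules, the `/drafts/` path, the `ExampleBot` user agent, and the `can_crawl` helper are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every bot is asked to stay out of /drafts/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""


def can_crawl(url, user_agent="ExampleBot"):
    """Return True when the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)
```

Under these rules, `can_crawl("https://example.com/drafts/page.html")` is false and `can_crawl("https://example.com/index.html")` is true.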

Noindexing part of a page

It is also possible to exclude part of a Web page, for example navigation text, from being indexed rather than the whole page. There are various techniques for doing this; it is possible to use several in combination. Google's main indexing spider, Googlebot, is not known to recognize any of these techniques.

<noindex> tag

The Russian search engine Yandex introduced a <noindex> tag, which prevents indexing of the content between the opening and closing tags. Because <noindex> is not a standard HTML element, the comment form <!--noindex--> can be used instead to keep the source code valid:[4]

<p>
Do index this text.
<noindex>Don't index this text.</noindex>
<!--noindex-->Don't index this text.<!--/noindex-->
</p>

Other indexing spiders also recognize the <noindex> tag, including Atomz.[5]
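
An indexer that honors both forms might drop the marked spans before extracting text. The following is a rough sketch, not Yandex's or Atomz's implementation; `strip_noindex` is a hypothetical name, and the regular expression assumes the markers are not nested.

```python
import re

# Matches either the element form or the comment form, non-greedily.
NOINDEX_SPAN = re.compile(
    r"<noindex>.*?</noindex>|<!--noindex-->.*?<!--/noindex-->",
    re.IGNORECASE | re.DOTALL,
)


def strip_noindex(html):
    """Remove every span marked as noindex, including the markers themselves."""
    return NOINDEX_SPAN.sub("", html)
```

A regular expression is enough for flat markers like these; nested or malformed markup would call for a real HTML parser.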

Microformat

There is a 2005 draft microformats specification with the same functionality. The Robot Exclusion Profile looks for the attribute and value class="robots-noindex" in HTML tags:[6]

<p>Do index this text.</p>
<div class="robots-noindex">Don't index this text.</div>
<span class="robots-noindex">Don't index this text.</span>
<p class="robots-noindex">Don't index this text.</p>

A combination of values is also possible,[6] for example:

<div class="robots-noindex robots-follow">Text.</div>
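
Class-based exclusion of this kind can be sketched with Python's standard `html.parser` module. The names `IndexableText` and `indexable_text` are hypothetical, and the sketch assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag and so must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class IndexableText(HTMLParser):
    """Collect text that lies outside elements carrying the excluded class."""

    def __init__(self, excluded_class="robots-noindex"):
        super().__init__()
        self.excluded_class = excluded_class
        self.depth = 0  # nesting depth inside an excluded element; 0 means indexable
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.excluded_class in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if not self.depth:
            self.chunks.append(data)


def indexable_text(html, excluded_class="robots-noindex"):
    """Return the indexable text of html with whitespace normalized."""
    parser = IndexableText(excluded_class)
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

The class attribute is split on whitespace, so an element with `class="robots-noindex robots-follow"` is excluded as well. The class name is a parameter, so the same sketch can be tried with other class values.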

Yahoo!

In 2007, Yahoo! added functionality similar to the microformat to its spider. However, Yahoo!'s spider is incompatible with the microformat: it looks for the value class="robots-nocontent" and only this value:[7]

<p>Do index this text.</p>
<div class="robots-nocontent">Don't index this text.</div>
<span class="robots-nocontent">Don't index this text.</span>
<p class="robots-nocontent">Don't index this text.</p>

SharePoint

SharePoint 2010’s iFilter excludes content inside a <div> tag with the attribute and value class="noindex". Nested <div>s were initially not excluded, but this may have changed. It is also unknown whether the attribute can be applied to tags other than <div>.[8]

<p>Do index this text.</p>
<div class="noindex">Don't index this text.</div>

Structured comments

Google Search Appliance

The Google Search Appliance uses structured comments:[9]

<p>
Do index this text.
<!--googleoff: all-->
Don't index this text.
<!--googleon: all-->
</p>

Other indexing spiders also use their own structured comments.
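
Handling the comment pair shown above can be sketched as follows. This is an illustrative sketch only, not the Google Search Appliance's implementation; `strip_googleoff` is a hypothetical name, and only the "all" form from the example is handled.

```python
import re

# Matches a googleoff/googleon pair and everything between them.
GOOGLEOFF_SPAN = re.compile(
    r"<!--\s*googleoff:\s*all\s*-->.*?<!--\s*googleon:\s*all\s*-->",
    re.DOTALL,
)


def strip_googleoff(html):
    """Remove the text enclosed by googleoff/googleon comments."""
    return GOOGLEOFF_SPAN.sub("", html)
```

Text before the opening comment and after the closing comment is left untouched.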

References

  • "Using meta tags to block access to your site". Google Webmasters Tools Help.
  • "Using HTML tags". webmaster → help. Yandex. Section: <noindex> tag. Retrieved March 25, 2013.
  • "General Search FAQ". Help. Atomz. 2013. Section: How do I exclude parts of my site from being searched?. Archived from the original on December 8, 2021. Retrieved March 23, 2013. Need to prevent parts of individual pages from being searched? If you want to exclude portions of a page from indexing, surround the text with <noindex> and </noindex> tags. This is useful, for example, if you want to exclude navigation text from searches. (registration required)
  • Janes, Peter (June 18, 2005). "Robot Exclusion Profile". Microformats. Retrieved March 24, 2013.
  • Garg, Priyank (May 2, 2007). "Introducing Robots-Nocontent for Page Sections". Yahoo! Search Blog. Yahoo!. Archived from the original on August 20, 2014. Retrieved March 23, 2013.
  • "Control Search Indexing (Crawling) Within a Page with Noindex". Microsoft Developer. Microsoft. June 7, 2010. Archived from the original on November 4, 2017. Retrieved November 4, 2017.
  • "Administering Crawl: Preparing for a Crawl". Google Search Appliance. Google Inc. August 23, 2012. Section: Excluding Unwanted Text from the Index. Archived from the original on November 23, 2012. Retrieved March 23, 2013.

    This page was last edited on 9 July 2024, at 04:43 (UTC).

    Text is available under the Creative Commons Attribution-ShareAlike License 4.0; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.


