Robot Exclusion Profile
Draft Specification 2005-06-18
Authors
Copyright
This specification is © 2004-2005 by the author. However, the author intends to submit this specification to a standards body with a liberal copyright/licensing policy such as the GMPG. See the GMPG Principles for more details. Anyone wishing to contribute to this effort MUST read those principles, especially those regarding copyright and licensing, and agree to them before contributing.
Patents
The author neither holds nor intends to apply for any patents on anything required to implement this specification.
Abstract
The Robot Exclusion Profile is a reworking of the Robots META tag (and less-standard extensions) as a microformat.
Introduction
The Robots META tag provides page-specific directions to web crawlers. Although useful in many cases, it applies to the whole page, so it cannot be used to exclude only certain sections of a document from indexing. Several attempts have been made to create more granular solutions, but each has perceived shortcomings that limit its use. The Robot Exclusion Profile instead defines a microformat that can be applied to any element or set of elements in a page.
Like other microformats such as hCalendar, the Robot Exclusion Profile defines a set of class names that may be applied to (X)HTML elements. The class attribute can be applied to almost every (X)HTML element, which means that authors may be as specific or general as they wish in their application. This differs from the similarly-purposed rel="nofollow" attribute, which may only be applied to (and does not refer to the content of) a specific inline link. (It is interesting to note that this behaviour is entirely encompassed by the use of class="robots-nofollow" on the same element.) Classes are also additive, so multiple values can be specified at once, e.g. class="robots-nofollow robots-noindex". For robot exclusion in particular, this allows authors to specify multiple rules for an element without adding unnecessary extra markup.
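As a rough illustration of how a consumer might read these classes, the Python sketch below collects every class token beginning with robots- from each element in a fragment. The robots- prefix test, the RobotsClassCollector class, and the collect_robots_rules helper are assumptions made for this example only; the draft does not prescribe a parsing algorithm, and only the two class names quoted above are taken from it.

```python
from html.parser import HTMLParser

# Assumption: profile class names share the "robots-" prefix, as in the two
# examples given in the text (robots-nofollow, robots-noindex).
ROBOTS_PREFIX = "robots-"


class RobotsClassCollector(HTMLParser):
    """Record (tag, rules) for each start tag that carries robots-* classes."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        class_value = dict(attrs).get("class") or ""
        # class holds whitespace-separated tokens, which is why rules are additive.
        rules = [t for t in class_value.split() if t.startswith(ROBOTS_PREFIX)]
        if rules:
            self.found.append((tag, rules))


def collect_robots_rules(markup):
    """Return a list of (tag, [robots-* tokens]) found in the markup string."""
    parser = RobotsClassCollector()
    parser.feed(markup)
    parser.close()
    return parser.found


sample = (
    '<div class="comments robots-nofollow robots-noindex">'
    '<a href="http://example.org/">a link</a>'
    '</div>'
    '<p class="intro">indexed as usual</p>'
)
print(collect_robots_rules(sample))
# [('div', ['robots-nofollow', 'robots-noindex'])]
```

Non-profile tokens such as comments and intro are ignored, so existing class names used for styling can sit alongside the exclusion rules on the same element.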
Format
Profile URI
http://example.org/xmdp/robots-profile# (obviously preliminary)
The classes defined by the Robot Exclusion Profile should be considered meaningless when the profile URI is not present in the document