microformats2 parsing specification

Revision as of 05:14, 27 November 2015

microformats2 is a simple, open format for marking up data in HTML. The microformats2 parsing specification describes how to implement a microformats2 parser, independent of any specific vocabularies.

This is a Living Specification with several interoperable implementations
Wiki (Questions, Open issues)
IRC: #microformats on Freenode
Editor: Tantek Çelik
Per CC0, to the extent possible under law, the editors have waived all copyright and related or neighboring rights to this work. In addition, as of 2017-11-18, the editors have made this specification available under the Open Web Foundation Agreement Version 1.0.



parse a document for microformats

To parse a document for microformats, follow the HTML parsing rules and do the following:

{
 "items": [],
 "rels": {},
 "rel-urls": {}
}

Parsers may simultaneously parse the document for both class and rel microformats (e.g. in a single tree traversal).
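
As a rough illustration of the single-traversal idea, the following Python sketch starts from the empty result shown above and visits each start tag once, noting both h-* root class names and rel values. It is a minimal sketch under stated assumptions, not the specification's algorithm: the names `SinglePassSketch` and `parse_sketch` are hypothetical, nested microformats are flattened into the top-level "items" array, no properties are parsed, and "rel-urls" is left empty for brevity.

```python
from html.parser import HTMLParser


class SinglePassSketch(HTMLParser):
    """Hypothetical helper: notes h-* roots and rel values in one pass."""

    def __init__(self):
        super().__init__()
        # The empty result shape that parsing starts from.
        self.result = {"items": [], "rels": {}, "rel-urls": {}}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Class microformats: record any h-* root class names on this element.
        # Simplification: every root is appended at the top level.
        classes = (attrs.get("class") or "").split()
        types = sorted(c for c in classes if c.startswith("h-"))
        if types:
            self.result["items"].append({"type": types, "properties": {}})
        # Rel microformats: record rel values of a and link elements with an href.
        href = attrs.get("href")
        if tag in ("a", "link") and href is not None:
            for rel in (attrs.get("rel") or "").split():
                urls = self.result["rels"].setdefault(rel, [])
                if href not in urls:
                    urls.append(href)


def parse_sketch(html):
    """Feed the markup through the sketch parser and return the result dict."""
    parser = SinglePassSketch()
    parser.feed(html)
    parser.close()
    return parser.result
```

For example, `parse_sketch('<div class="h-card"><a rel="me" href="http://example.com/">x</a></div>')` records one item of type h-card and one "me" rel from the same traversal.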

parse an element for class microformats

To parse an element for class microformats:

parse an element for properties

parsing a p- property

To parse an element for a p-x property value, whether explicit "p-*" or backcompat equivalent:

parsing a u- property

To parse an element for a u-x property value, whether explicit "u-*" or backcompat equivalent:

parsing a dt- property

To parse an element for a dt-x property value, whether explicit "dt-*" or backcompat equivalent:

parsing an e- property

To parse an element for an e-x property value, whether explicit "e-*" or backcompat equivalent:
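
The prefixes named in this section (h- for roots, and p-, u-, dt-, e- for properties) can be told apart by a simple class-name check. The sketch below is illustrative only: the function name `classify_class_names` is hypothetical, and the pattern (a prefix followed by lowercase hyphen-separated words) is an assumption made for this example rather than the exact rule of the algorithm.

```python
import re

# Assumed pattern: a known prefix, then lowercase words joined by hyphens.
_MF2_CLASS = re.compile(r"^(h|p|u|dt|e)-([a-z]+(?:-[a-z]+)*)$")


def classify_class_names(class_attr):
    """Group the class names of one element by microformats2 prefix."""
    groups = {"h": [], "p": [], "u": [], "dt": [], "e": []}
    for name in class_attr.split():
        match = _MF2_CLASS.match(name)
        if match is None:
            continue  # not an explicit microformats2 class name
        prefix, rest = match.groups()
        # Roots keep the full name (e.g. "h-card"); properties drop the prefix.
        groups[prefix].append(name if prefix == "h" else rest)
    return groups
```

Each group would then be handed to the matching parsing step (p-, u-, dt-, or e-) described in this section.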

parsing for implied properties

Imply properties only on explicit h-x class name root microformat elements (no backcompat roots)

Note: The same markup for a property should not cause that property to occur both in a microformat and in one embedded inside it; such a property should show up on only one of them. The parsing algorithm has details to prevent that, such as the :not[.h-*] tests above.

parse a hyperlink element for rel microformats

To parse a hyperlink element (e.g. a or link) for rel microformats, use the following algorithm or an algorithm that produces equivalent results:

rel parse examples

Here are some examples to show how parsed rels may be reflected into the JSON (empty items key).

E.g. parsing this markup:

<a rel="author" href="http://example.com/a">author a</a>
<a rel="author" href="http://example.com/b">author b</a>
<a rel="in-reply-to" href="http://example.com/1">post 1</a>
<a rel="in-reply-to" href="http://example.com/2">post 2</a>
<a rel="alternate home"
   href="http://example.com/fr"
   media="handheld"
   hreflang="fr">French mobile homepage</a>

Would generate this JSON:

{
  "items": [],
  "rels": {
    "author": [ "http://example.com/a", "http://example.com/b" ],
    "in-reply-to": [ "http://example.com/1", "http://example.com/2" ],
    "alternate": [ "http://example.com/fr" ],
    "home": [ "http://example.com/fr" ]
  },
  "rel-urls": {
    "http://example.com/a": {
      "rels": ["author"],
      "text": "author a"
    },
    "http://example.com/b": {
      "rels": ["author"],
      "text": "author b"
    },
    "http://example.com/1": {
      "rels": ["in-reply-to"],
      "text": "post 1"
    },
    "http://example.com/2": {
      "rels": ["in-reply-to"],
      "text": "post 2"
    },
    "http://example.com/fr": {
      "rels": ["alternate", "home"],
      "media": "handheld",
      "hreflang": "fr",
      "text": "French mobile homepage"
    }
  }
}
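
One way to arrive at output of this shape is sketched below in Python, using only the standard library. This is an illustration under assumptions, not the normative algorithm: the names `RelSketch` and `parse_rels` are hypothetical, the base URL is passed in by the caller, the set of copied attributes (hreflang, media, title, type) is an assumption, and only the first a element seen for a URL contributes its text.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelSketch(HTMLParser):
    """Hypothetical helper: builds "rels" and "rel-urls" from a and link."""

    COPIED_ATTRS = ("hreflang", "media", "title", "type")  # assumed set

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.result = {"items": [], "rels": {}, "rel-urls": {}}
        self._text_target = None  # rel-urls entry currently collecting text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        values = (attrs.get("rel") or "").split()
        href = attrs.get("href")
        if tag not in ("a", "link") or not values or href is None:
            return
        # Resolve the href against the base URL supplied by the caller.
        url = urljoin(self.base_url, href)
        entry = self.result["rel-urls"].setdefault(url, {"rels": []})
        for value in values:
            urls = self.result["rels"].setdefault(value, [])
            if url not in urls:
                urls.append(url)
            if value not in entry["rels"]:
                entry["rels"].append(value)
        for name in self.COPIED_ATTRS:
            # Keep the first value seen for each attribute.
            if attrs.get(name) and name not in entry:
                entry[name] = attrs[name]
        if tag == "a" and "text" not in entry:
            entry["text"] = ""
            self._text_target = entry

    def handle_data(self, data):
        if self._text_target is not None:
            self._text_target["text"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._text_target = None


def parse_rels(html, base_url):
    """Return a result dict with "rels" and "rel-urls" filled in."""
    parser = RelSketch(base_url)
    parser.feed(html)
    parser.close()
    return parser.result
```

Calling `parse_rels(markup, "http://example.com/")` on the example markup, then serializing with `json.dumps`, yields the same keys and values as the JSON above, though key order within each object may differ.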

what do the CSS selector expressions mean

This section is non-normative.

Use SelectORacle to expand any of the above CSS selector expressions into longform English prose.


note HTML parsing rules

This section is non-normative.

microformats2 parsers are expected to follow HTML parsing rules, which includes for example:

note backward compatibility details

The parsing algorithm and details refer to "backcompat root classes" (backcompat roots for short) and "backcompat properties". These conditions and steps in the algorithm document how to parse pre-microformats2 microformats which all defined their own specific root class names and explicit sets of properties.
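
To make the idea of backcompat roots concrete, the Python sketch below maps a few classic root class names to microformats2 root types. It is only an illustration: the table is a partial subset, the function name `root_types` is hypothetical, and the rule that backcompat roots are consulted only when no explicit h-* root is present is stated here as an assumption about the algorithm rather than quoted from it.

```python
# Illustrative, partial table of backcompat root class names.
BACKCOMPAT_ROOTS = {
    "vcard": "h-card",
    "vevent": "h-event",
    "hentry": "h-entry",
    "hfeed": "h-feed",
    "adr": "h-adr",
    "geo": "h-geo",
}


def root_types(class_attr):
    """Return (types, is_backcompat) for the class names of one element."""
    names = class_attr.split()
    explicit = sorted({n for n in names if n.startswith("h-")})
    if explicit:
        # Assumption: explicit h-* roots take precedence over backcompat roots.
        return explicit, False
    backcompat = sorted({BACKCOMPAT_ROOTS[n] for n in names if n in BACKCOMPAT_ROOTS})
    return backcompat, bool(backcompat)
```

The returned flag lets a caller decide whether to look for explicit prefixed properties or for the fixed property set that the older format defined.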

Some details to be aware of (these are explicit in the algorithm; this is just an informal summary):


See the FAQ:


See the issues page:


Main article: microformats2#Implementations

There are open source microformats2 parsers available for JavaScript, Node.js, PHP, Ruby and Python.

test suite


Ports to/for other languages encouraged.

see also

