microformats2-parsing
Revision as of 01:35, 16 October 2012 by Tantek (talk | contribs) (→see also: clarify brainstorming)
<entry-title>microformats2 parsing</entry-title>
One of the goals of microformats2 is to greatly simplify the parsing of microformats, in particular by making parsing independent of any one vocabulary.
parsing algorithm
To parse an element for microformats:
- parse element class for root class name(s) "h-*" (and backcompat)
- if found, start parsing a new microformat
- parse contained elements for properties (depth first, doc order)
- parse an element for microformats (recurse)
- imply properties (see below)
- parse contained elements for properties (depth first, doc order)
- if found, start parsing a new microformat
- parse element class for property class names (prefixes p-, dt-, u-, e-)
- add properties found (with any nested microformats) to current microformat
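The steps above can be sketched roughly as follows. This is a minimal illustration, not a reference parser. It assumes well-formed markup that Python's xml.etree.ElementTree can read, and the function names and the output shape (a dict with "type" and "properties" keys, each property holding a list of values) are hypothetical choices made for this example. It skips backcompat class names and implied properties, treats every property value as plain text content regardless of prefix, and ignores nested microformats that carry no property class name.

```python
import xml.etree.ElementTree as ET

# Property class name prefixes listed in the algorithm above.
PROPERTY_PREFIXES = ("p-", "dt-", "u-", "e-")


def class_names(element):
    """Return the element's class attribute split into individual names."""
    return element.get("class", "").split()


def parse_properties(parent, properties):
    """Walk contained elements depth first, in document order."""
    for child in parent:
        classes = class_names(child)
        roots = [c for c in classes if c.startswith("h-")]
        props = [c for c in classes if c.startswith(PROPERTY_PREFIXES)]
        if roots:
            # Nested microformat: parse it recursively as its own item.
            value = parse_microformat(child, roots)
        else:
            # Simplification: use the text content for every prefix.
            value = "".join(child.itertext()).strip()
        for prop in props:
            # "p-name" -> "name", "dt-start" -> "start"
            properties.setdefault(prop.split("-", 1)[1], []).append(value)
        if not roots:
            # Keep descending; a nested microformat owns its descendants.
            parse_properties(child, properties)


def parse_microformat(element, roots):
    """Start a new microformat and collect its properties."""
    item = {"type": roots, "properties": {}}
    parse_properties(element, item["properties"])
    return item


def parse(element):
    """Return the top-level microformats found at or below element."""
    roots = [c for c in class_names(element) if c.startswith("h-")]
    if roots:
        return [parse_microformat(element, roots)]
    found = []
    for child in element:
        found.extend(parse(child))
    return found


# Hypothetical sample markup: an h-card with a nested h-card as its org.
doc = ET.fromstring(
    '<div class="h-card">'
    '<span class="p-name">Example Person</span>'
    '<div class="p-org h-card"><span class="p-name">Example Org</span></div>'
    '</div>'
)
items = parse(doc)
```

Note that nothing in the sketch refers to a specific vocabulary: h-card is handled purely through the "h-" prefix, which is the vocabulary independence described at the top of this page.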
parsing for implied properties
To imply properties (where h-* is the root microformat element being parsed):
- if no explicit "name" property, then imply by:
  - if img.h-* then use its alt attribute for name
  - else if .h-*>img:only-node then use that img alt for name
  - else if .h-*>:only-node>img:only-node then use that img alt for name
  - else use the innertext of the .h-* for name
  - drop leading and trailing whitespace from name, including nbsp
- if no explicit "photo" property, then imply by:
  - if img.h-*[src] then use src for photo
  - else if .h-*>img[src]:only-of-type then use that img src for photo
  - else if .h-*>:only-child>img[src]:only-of-type then use that img src for photo
- if no explicit "url" property, then imply by:
  - if a.h-*[href] then use href for url
  - else if .h-*>a[href]:only-of-type then use that a href for url
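The implied property rules above could be sketched along these lines. This is an illustration under stated assumptions rather than a definitive implementation. It assumes well-formed markup read with Python's xml.etree.ElementTree, a hypothetical properties dict mapping property names to lists of values, and it reads ":only-node" (which is not a standard CSS selector) as "the only child element, with no non-whitespace text beside it". The helper names are made up for this example.

```python
import xml.etree.ElementTree as ET


def only_node(parent):
    """Return the sole child element if parent has no other child nodes.

    Assumption: ":only-node" means one child element and no stray text.
    """
    children = list(parent)
    if len(children) != 1:
        return None
    child = children[0]
    stray_text = (parent.text or "") + (child.tail or "")
    return child if not stray_text.strip() else None


def only_of_type(parent, tag, attr):
    """Return the child with this tag if it is the only one and has attr."""
    matches = [c for c in parent if c.tag == tag]
    if len(matches) == 1 and matches[0].get(attr) is not None:
        return matches[0]
    return None


def imply_properties(root, properties):
    """Add name, photo, and url to properties when they are not explicit."""
    if "name" not in properties:
        img = None
        if root.tag == "img":
            img = root
        else:
            child = only_node(root)
            if child is not None and child.tag == "img":
                img = child
            elif child is not None:
                grandchild = only_node(child)
                if grandchild is not None and grandchild.tag == "img":
                    img = grandchild
        if img is not None:
            name = img.get("alt", "")
        else:
            name = "".join(root.itertext())
        # str.strip() with no arguments also removes U+00A0 (nbsp).
        properties["name"] = [name.strip()]

    if "photo" not in properties:
        if root.tag == "img":
            img = root if root.get("src") is not None else None
        else:
            img = only_of_type(root, "img", "src")
            if img is None and len(root) == 1:
                # .h-*>:only-child>img[src]:only-of-type
                img = only_of_type(root[0], "img", "src")
        if img is not None:
            properties["photo"] = [img.get("src")]

    if "url" not in properties:
        link = root if root.tag == "a" else only_of_type(root, "a", "href")
        if link is not None and link.get("href") is not None:
            properties["url"] = [link.get("href")]
    return properties


# Hypothetical sample: a linked image with no explicit properties.
card = ET.fromstring(
    '<a class="h-card" href="http://example.com/">'
    '<img src="me.png" alt=" Example Person "/></a>'
)
implied = imply_properties(card, {})
```

Because each rule only fires when the property is missing, an explicit name, photo, or url found during parsing always takes precedence over an implied one.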
see also
- microformats2
- microformats2-implied-properties
- microformats2-parsing-brainstorming - for background, thinking, exploring possibilities