microformats2-parsing

<entry-title>microformats2 parsing</entry-title>

One of the goals of microformats2 is to greatly simplify parsing of microformats, in particular by making parsing independent of any one vocabulary.

parsing algorithm

parse a document for microformats

To parse a document for microformats:

start with an empty JSON items array
parse the root element for microformats

parse an element for microformats

To parse an element for microformats:

parse element class for root class name(s) "h-x" (and backcompat)
- if found, start parsing a new microformat
  - parse contained elements for properties (depth first, document order)
    - parse an element for microformats (recurse)
  - imply properties (see below)
parse element class for properties (p-,dt-,u-,e-)
add properties found (with any nested microformats) to current microformat
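
The document-level and element-level steps above can be sketched as a small recursive walk. This is only an illustrative sketch, not a reference parser: it assumes well-formed markup that Python's xml.etree.ElementTree can read, treats every property value as plain text, and skips backcompat root names, implied properties, and nested microformats that carry no property class. All function names (parse_document, find_items, parse_microformat, collect_properties) are hypothetical.

```python
import xml.etree.ElementTree as ET

PROPERTY_PREFIXES = ("p-", "u-", "dt-", "e-")


def class_names(el):
    return el.get("class", "").split()


def root_types(el):
    # Root class names such as "h-card"; backcompat names are not handled here.
    return [c for c in class_names(el) if c.startswith("h-")]


def property_names(el):
    return [c for c in class_names(el) if c.startswith(PROPERTY_PREFIXES)]


def parse_microformat(el):
    # Start a new microformat and gather properties from contained elements.
    item = {"type": root_types(el), "properties": {}}
    for child in el:
        collect_properties(child, item)
    return item


def collect_properties(el, item):
    # Depth first, document order. A nested microformat becomes the property value.
    nested = parse_microformat(el) if root_types(el) else None
    for name in property_names(el):
        key = name.split("-", 1)[1]
        # Simplification: every non-nested value is the element's text.
        value = nested if nested is not None else "".join(el.itertext()).strip()
        item["properties"].setdefault(key, []).append(value)
    if nested is None:
        for child in el:
            collect_properties(child, item)


def find_items(el, items):
    if root_types(el):
        items.append(parse_microformat(el))
    else:
        for child in el:
            find_items(child, items)


def parse_document(markup):
    items = []  # start with an empty items array
    find_items(ET.fromstring(markup), items)
    return {"items": items}
```

A fuller parser would dispatch on the property prefix instead of using the element text for every value.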

parsing a p- property

To parse an element for a p-x property value:

parse the element for the value-class-pattern; if a value is found, return it.
if abbr.p-x[title], then return the title attribute
else if data.p-x[value], then return the value attribute
else if br.p-x or hr.p-x, then return "" (empty string)
else if img.p-x[alt] or area.p-x[alt], then return the alt attribute
else return the innertext of the element.
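
As a rough illustration of the p- rules above, the following sketch applies the attribute fallbacks in order. It assumes the element is an xml.etree.ElementTree element parsed from well-formed markup, and it leaves out the value-class-pattern step entirely. The function name parse_p_value is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_p_value(el):
    # Note: the value-class-pattern step is omitted from this sketch.
    tag = el.tag.lower()
    if tag == "abbr" and el.get("title") is not None:
        return el.get("title")
    if tag == "data" and el.get("value") is not None:
        return el.get("value")
    if tag in ("br", "hr"):
        return ""
    if tag in ("img", "area") and el.get("alt") is not None:
        return el.get("alt")
    # Fall back to the element's text content (innertext).
    return "".join(el.itertext())
```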

parsing a u- property

To parse an element for a u-x property value:

parse the element for the value-class-pattern; if a value is found, return it.
if a.u-x[href] or area.u-x[href], then get the href attribute
else if img.u-x[src], then get the src attribute
else if object.u-x[data], then get the data attribute
if a value was obtained from one of the attributes above, return it as a normalized absolute URL, following the containing document's language's rules for resolving relative URLs.
else if abbr.u-x[title], then return the title attribute
else if data.u-x[value], then return the value attribute
else return the innertext of the element.
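
The u- rules above might be sketched as follows. This is an assumption-laden illustration: it uses Python's urllib.parse.urljoin as a stand-in for the document language's URL resolution rules, takes the base URL as an explicit base_url parameter (a base element in the markup is not consulted), and omits the value-class-pattern step. The names parse_u_value and URL_ATTRIBUTES are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Element name -> attribute that carries the URL for a u- property.
URL_ATTRIBUTES = {"a": "href", "area": "href", "img": "src", "object": "data"}


def parse_u_value(el, base_url):
    # Note: the value-class-pattern step is omitted from this sketch.
    tag = el.tag.lower()
    attr = URL_ATTRIBUTES.get(tag)
    if attr is not None and el.get(attr) is not None:
        # Resolve the attribute value against the base URL.
        return urljoin(base_url, el.get(attr))
    if tag == "abbr" and el.get("title") is not None:
        return el.get("title")
    if tag == "data" and el.get("value") is not None:
        return el.get("value")
    return "".join(el.itertext())
```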

parsing a dt- property

To parse an element for a dt-x property value:

parse the element for the value-class-pattern, including the date and time parsing rules; if a value is found, return it.
if time.dt-x[datetime] or ins.dt-x[datetime] or del.dt-x[datetime], then return the datetime attribute
else if abbr.dt-x[title], then return the title attribute
else if data.dt-x[value], then return the value attribute
else return the innertext of the element.
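
Because each dt- rule above maps one element name to one attribute, the fallbacks can be sketched with a lookup table. As before, this assumes xml.etree.ElementTree elements, omits the value-class-pattern and its date and time parsing rules, and returns the raw string without validating or normalizing it. The names parse_dt_value and DT_ATTRIBUTES are hypothetical.

```python
import xml.etree.ElementTree as ET

# Element name -> attribute consulted for a dt- property.
DT_ATTRIBUTES = {
    "time": "datetime",
    "ins": "datetime",
    "del": "datetime",
    "abbr": "title",
    "data": "value",
}


def parse_dt_value(el):
    # Note: value-class-pattern date and time parsing is omitted from this sketch.
    attr = DT_ATTRIBUTES.get(el.tag.lower())
    if attr is not None and el.get(attr) is not None:
        return el.get(attr)
    # Fall back to the element's text content (innertext).
    return "".join(el.itertext())
```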

parsing an e- property

To parse an element for an e-x property value:

return the innerHTML of the element, serialized using the HTML spec's "Serializing HTML Fragments" algorithm.
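
To illustrate the idea of an e- value, the sketch below approximates innerHTML with xml.etree.ElementTree. It is not an implementation of the HTML spec's fragment serialization algorithm; it only concatenates the element's leading text with the serialized child elements, and it assumes well-formed input. The function name parse_e_value is hypothetical.

```python
import html
import xml.etree.ElementTree as ET


def parse_e_value(el):
    # Leading text is re-escaped because ElementTree stores it unescaped.
    parts = [html.escape(el.text or "", quote=False)]
    for child in el:
        # tostring() includes the child's tail text after its closing tag.
        parts.append(ET.tostring(child, encoding="unicode", method="html"))
    return "".join(parts)
```

A parser built on a real HTML DOM would typically use that library's own innerHTML serialization instead.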

parsing for implied properties

To imply properties: (where h-x is the root microformat element being parsed)

if no explicit "name" property,
then imply by:
- if img.h-x then use its alt attribute for name
- else if .h-x>img:only-node then use that img alt for name
- else if .h-x>:only-node>img:only-node use that img alt for name
- else use the innertext of the .h-x for name
- drop leading & trailing white-space from name, including nbsp
if no explicit "photo" property,
then imply by:
- if img.h-x[src] then use src for photo
- else if .h-x>img[src]:only-of-type then use that img src for photo
- else if .h-x>:only-child>img[src]:only-of-type then use that img src for photo
if no explicit "url" property,
then imply by:
- if a.h-x[href] then use href for url
- else if .h-x>a[href]:only-of-type then use that a[href] for url
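
The implied-property rules above can be sketched as three small helpers. Several points are assumptions rather than part of the rules: ":only-node" is read as "the sole child element, with no non-whitespace text beside it", a missing alt attribute is treated as an empty name, URLs are returned unresolved, and elements are xml.etree.ElementTree elements. All function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def only_node(parent):
    # Assumed reading of :only-node: one child element and no other text.
    children = list(parent)
    if len(children) != 1:
        return None
    child = children[0]
    if (parent.text or "").strip() or (child.tail or "").strip():
        return None
    return child


def only_of_type(parent, tag, attr):
    # The single child with this tag name, provided it carries the attribute.
    matches = [c for c in parent if c.tag == tag]
    if len(matches) == 1 and matches[0].get(attr) is not None:
        return matches[0]
    return None


def imply_name(root):
    if root.tag == "img":
        return (root.get("alt") or "").strip()
    child = only_node(root)
    if child is not None and child.tag != "img":
        child = only_node(child)  # .h-x>:only-node>img:only-node
    if child is not None and child.tag == "img":
        return (child.get("alt") or "").strip()
    # str.strip() also drops no-break spaces (U+00A0).
    return "".join(root.itertext()).strip()


def imply_photo(root):
    if root.tag == "img" and root.get("src") is not None:
        return root.get("src")
    img = only_of_type(root, "img", "src")
    if img is None and len(root) == 1:
        img = only_of_type(root[0], "img", "src")  # .h-x>:only-child>img
    return img.get("src") if img is not None else None


def imply_url(root):
    if root.tag == "a" and root.get("href") is not None:
        return root.get("href")
    a = only_of_type(root, "a", "href")
    return a.get("href") if a is not None else None


def imply_properties(root, properties):
    # Only fill in properties that were not given explicitly.
    implied = {"name": imply_name(root), "photo": imply_photo(root), "url": imply_url(root)}
    for key, value in implied.items():
        if key not in properties and value is not None:
            properties[key] = [value]
    return properties
```

For example, a linked image with alt text yields all three implied properties, while an explicit name is left untouched.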

what do the CSS selector expressions mean

Use SelectORacle to expand any of the above CSS selector expressions into longform English prose.
