[uf-discuss] re: HTML5 support

Toby Inkster mail at tobyinkster.co.uk
Wed Jul 21 02:09:22 PDT 2010

On Tue, 20 Jul 2010 08:29:48 -0400
Stephen Paul Weber <singpolyma at singpolyma.net> wrote:

> Having written significant code both in-browser and out to parse
> microformats, I find the claim that parsing them using the DOM is
> "not practical" shocking.  What would you prefer?

Parsing microformats via the DOM is not practical. Parsing them any
other way is even worse though.

While writing DOM code to parse a particular site's implementation of,
say, hCard is pretty trivial, generalising that to support all the
variations of how hCard is marked up in the wild is a lot of work.
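
As a rough illustration of the kind of variation involved, here is a
simplified Python sketch (not taken from my Perl code) of how the value
of a single hCard property depends on the element carrying it. The
names property_value, URL_PROPERTIES and URL_ATTRIBUTE are made up for
this example, and only a handful of the real parsing rules are covered.

```python
# Hypothetical, simplified rules; a real hCard parser handles many more cases.
URL_PROPERTIES = {"url", "photo", "logo"}
URL_ATTRIBUTE = {"a": "href", "area": "href", "img": "src", "object": "data"}


def property_value(prop, tag, attrs, text):
    """Return the value of one property found on one element.

    prop  -- the class name that matched, e.g. "fn" or "url"
    tag   -- the element name, e.g. "span" or "abbr"
    attrs -- dict of the element's attributes
    text  -- the element's text content
    """
    # URL-valued properties read a link attribute when the element has one.
    if prop in URL_PROPERTIES:
        attr = URL_ATTRIBUTE.get(tag)
        if attr and attrs.get(attr):
            return attrs[attr]
    # The abbr pattern: the machine-readable value lives in title.
    if tag == "abbr" and attrs.get("title"):
        return attrs["title"]
    # Images used for text properties contribute their alt text.
    if tag == "img" and attrs.get("alt"):
        return attrs["alt"]
    # Otherwise fall back to the element's text.
    return text.strip()
```

Even this small function already branches on both the property and the
element, and it says nothing yet about nesting or implied properties.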

As a comparison, I have written Perl parsers[*] for microformats, RDFa
and Microdata. Here are the lines-of-code counts for each, excluding
documentation, comments and blank lines:

Microdata      :  945
RDFa 1.0       : 1265
RDFa 1.1 [**]  : 2611
microformats   : 9455

*  = See <http://search.cpan.org/~tobyink/>.
** = This code actually handles both RDFa 1.0 and 1.1. What's more, it
     can handle them embedded in Atom, SVG and OpenDocument Format; not
     just (X)HTML. A pure RDFa-1.1-in-(X)HTML parser could probably be
     written in under 1000 lines of Perl.

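The amount of code needed to parse microformats is clearly of a
different order: about ten times that of the Microdata parser, and
more than three and a half times that of the RDFa 1.1 parser.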

Another difference is that the Microdata and RDFa 1.0 implementations
can be considered more-or-less complete. (The RDFa 1.1 working drafts
are still somewhat in flux, so the implementation no doubt still needs
changes.) If somebody comes up tomorrow with a new RDFa or Microdata
vocabulary for describing cows, or bread makers, or train timetables,
it will work out of the box. For microformats, that's not the case -
new parsing code needs to be written for each new format.
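
A minimal sketch of that difference, in Python rather than the Perl of
the actual parsers. All names here (FlatCollector, generic_names,
hcard_names, HCARD_PROPS, collect) are invented for the example. It
only handles flat properties whose content is plain text, so it is far
from a conforming parser for either syntax.

```python
from html.parser import HTMLParser


class FlatCollector(HTMLParser):
    """Collect (property, text) pairs for flat, text-only properties."""

    def __init__(self, names_for):
        super().__init__()
        self.names_for = names_for  # callback: attribute dict -> property names
        self.pairs = []
        self._names = []            # properties of the element being read
        self._text = []

    def handle_starttag(self, tag, attrs):
        names = self.names_for(dict(attrs))
        if names:
            self._names, self._text = names, []

    def handle_data(self, data):
        if self._names:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._names:
            value = "".join(self._text).strip()
            self.pairs += [(name, value) for name in self._names]
            self._names = []


def generic_names(attrs):
    # Microdata-style: the property name is in the markup itself,
    # so no vocabulary knowledge is needed.
    return (attrs.get("itemprop") or "").split()


HCARD_PROPS = {"fn", "tel", "url"}  # tiny, illustrative subset of hCard


def hcard_names(attrs):
    # Microformats-style: only class names the parser already knows count.
    return [c for c in (attrs.get("class") or "").split() if c in HCARD_PROPS]


def collect(html, names_for):
    parser = FlatCollector(names_for)
    parser.feed(html)
    parser.close()
    return parser.pairs
```

Fed a made-up "cow" vocabulary, collect() with generic_names returns
the breed property straight away, while collect() with hcard_names
returns nothing until somebody adds a table of cow class names.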

So you end up with a chicken-and-egg situation with nobody implementing
tools for a new draft microformat because it's not used in the wild;
nobody using it in the wild because of a lack of tool support; and the
microformat never progressing beyond draft status because of lack of
implementation experience, and uncertainty about how it might work in
the wild. That's why we haven't had any of the draft microformats on
the wiki move out of draft status in the last four years or so; or at
least it's a major contributory factor.

Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
