[uf-discuss] HTML::Microformats and XML::Atom::Microformats stable releases

Toby Inkster mail at tobyinkster.co.uk
Tue Dec 21 04:28:32 PST 2010


I released the first stable version of HTML::Microformats a
few days ago, but was waiting for the first stable release of
XML::Atom::Microformats before I announced it here.

HTML::Microformats is a Perl module to parse microformats
embedded in HTML or XHTML, outputting them as RDF, JSON or
native Perl objects.
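
For readers new to the idea, here is a minimal sketch of what "parsing
microformats out of HTML" means. It is written in Python rather than Perl
and is not how HTML::Microformats is implemented or called. The class
TinyHCardParser, the helper parse_hcards, the three-property subset of hCard
and the sample markup are all assumptions made for illustration. Real
parsing follows many more rules than this.

```python
import json
from html.parser import HTMLParser

# Hypothetical, deliberately tiny subset of hCard property class names.
PROPERTIES = {"fn", "org", "url"}
# Elements that never get an end tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TinyHCardParser(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.cards = []    # finished cards, one dict per vcard root
        self._stack = []   # one (is_root, [property names]) per open element
        self._card = None  # the card currently being filled, if any

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        is_root = self._card is None and "vcard" in classes
        if is_root:
            self._card = {}
        props = []
        if self._card is not None:
            for name in sorted(classes & PROPERTIES):
                if name == "url" and attrs.get("href"):
                    # A url property on a link takes its value from href.
                    self._card.setdefault("url", attrs["href"])
                elif name not in self._card:
                    # Other properties take the element's text content.
                    self._card[name] = ""
                    props.append(name)
        self._stack.append((is_root, props))

    def handle_data(self, data):
        if self._card is None:
            return
        # Text belongs to every property element that is still open.
        for _, props in self._stack:
            for name in props:
                self._card[name] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._stack:
            return
        is_root, _ = self._stack.pop()
        if is_root:
            self.cards.append({k: v.strip() for k, v in self._card.items()})
            self._card = None


def parse_hcards(html):
    """Return a list of dicts, one per class="vcard" element found."""
    parser = TinyHCardParser()
    parser.feed(html)
    parser.close()
    return parser.cards


SAMPLE = """
<div class="vcard">
  <a class="fn url" href="http://example.com/">Alice Example</a>
  <span class="org">Example Org</span>
</div>
"""

print(json.dumps(parse_hcards(SAMPLE), indent=2))
```

The sketch assumes well-nested markup and keeps only the first value of
each property.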

It supports every microformat specification except rel-nofollow,
most of the drafts, and all of the design patterns listed on the
front page of the microformats.org wiki, plus a few extras.

See also:
http://microformats.org/wiki/parsers#HTML::Microformats
http://search.cpan.org/dist/HTML-Microformats-0.100/

(There's a known bug surrounding RDF output of rel=me.)

XML::Atom::Microformats brings the same functionality to
HTML and XHTML content found in <content> elements within
Atom feeds.
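
As a rough illustration of the first step such a tool has to take, the
sketch below pulls the HTML and XHTML payloads out of Atom <content>
elements so that they could be handed to a microformat parser. It is
Python, not Perl, and does not reflect how XML::Atom::Microformats works
internally. The function name html_fragments and the sample feed are
assumptions made for the example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Serialize XHTML elements without a generated "ns0:" prefix.
ET.register_namespace("", XHTML_NS)


def html_fragments(feed_xml):
    """Yield (entry id, markup) for each html/xhtml <content> in an Atom feed."""
    root = ET.fromstring(feed_xml)
    for entry in root.iter(f"{{{ATOM_NS}}}entry"):
        entry_id = entry.findtext(f"{{{ATOM_NS}}}id", default="")
        for content in entry.findall(f"{{{ATOM_NS}}}content"):
            kind = content.get("type", "text")
            if kind == "html":
                # Escaped markup: the XML parser has already unescaped it.
                yield entry_id, content.text or ""
            elif kind == "xhtml":
                # Inline XHTML: serialize the wrapper <div> child.
                div = content.find(f"{{{XHTML_NS}}}div")
                if div is not None:
                    yield entry_id, ET.tostring(div, encoding="unicode")
            # type="text" and other media types carry no HTML to parse.


SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>urn:example:1</id>
    <content type="html">&lt;span class="vcard"&gt;&lt;span class="fn"&gt;Bob&lt;/span&gt;&lt;/span&gt;</content>
  </entry>
  <entry>
    <id>urn:example:2</id>
    <content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml"><span class="vcard"><span class="fn">Carol</span></span></div></content>
  </entry>
</feed>"""

for entry_id, markup in html_fragments(SAMPLE_FEED):
    print(entry_id, markup)
```

Each fragment is paired with its entry id so that any data extracted from
it can be attributed to the right entry rather than to the feed as a whole.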

See also:
http://microformats.org/wiki/parsers#XML::Atom::Microformats
http://search.cpan.org/dist/XML-Atom-Microformats-0.001/

Also perhaps worth mentioning is HTML::Data::Parser, which wraps
HTML::Microformats alongside an RDFa parser, a microdata parser,
and other modules that extract data from HTML, exposing the result
as a single RDF model suitable for querying with SPARQL.

http://search.cpan.org/dist/HTML-Data-Parser-0.003/

There's a little demo of the combined parser here:

http://srv.buzzword.org.uk/HTML-Data-Parser.pl?format=html&url=http://tantek.com/

It's not intended for production use and has some Unicode issues at the
moment.

-- 
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>


