[uf-discuss] hKit parsing library for PHP5
lists at allinthehead.com
Mon Jun 19 15:10:54 PDT 2006
On returning from @media and lots of associated interesting hallway
conversations, I sat down to hack on a microformat-related idea. It
quickly became apparent that what should've been a couple of hours'
coding was going to take significantly longer because I had no
toolkits to back me up. So I put my idea on hold, and thought I'd
better get hacking on a parsing toolkit instead.
I poked around looking at stuff that's already out there, including
Microformats Base, but I couldn't find anything that fitted the model
I was after - namely chuck in a string or URL, and get out an array
structure of, say, hCards.
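
To make the "string in, array out" model concrete, here is a minimal
sketch. It is not hKit's code or its API: the function name
sketch_extract_hcards(), the short property list and the sample markup
are all made up for illustration, and it assumes the input is already
well-formed XHTML, so no Tidy pass is shown.

```php
<?php
// Hypothetical helper for illustration only; NOT hKit's actual API.
// Takes a well-formed XHTML string, returns an array of hCards, each
// of which is a property => value array.
function sketch_extract_hcards($xhtml)
{
    libxml_use_internal_errors(true);      // keep parse warnings quiet
    $doc = simplexml_load_string($xhtml);
    if ($doc === false) {
        return array();                    // not well-formed: give up
    }

    // XPath test: does the space-separated class list contain $name?
    $hasClass = function ($name) {
        return "contains(concat(' ', normalize-space(@class), ' '), ' $name ')";
    };

    $nodes = $doc->xpath('//*[' . $hasClass('vcard') . ']');
    if (!$nodes) {
        return array();
    }

    $cards = array();
    foreach ($nodes as $node) {
        $card = array();
        // A deliberately tiny, assumed subset of hCard properties.
        foreach (array('fn', 'org', 'url', 'email') as $prop) {
            $hits = $node->xpath('.//*[' . $hasClass($prop) . ']');
            if (!$hits) {
                continue;
            }
            $el = $hits[0];                // first match only
            // For url, prefer the href attribute; otherwise use the text.
            $card[$prop] = ($prop === 'url' && isset($el['href']))
                ? (string) $el['href']
                : trim(dom_import_simplexml($el)->textContent);
        }
        $cards[] = $card;
    }
    return $cards;
}

// Sample markup (invented for this example).
$html = '<div><div class="vcard">'
      . '<a class="url fn" href="http://example.com/">Jane Doe</a> '
      . '<span class="org">Example Co</span>'
      . '</div></div>';

$cards = sketch_extract_hcards($html);
print_r($cards);
```

The sketch ignores most of the hcard-parsing rules (abbr, nested
properties, multiple values and so on); it only shows the shape of the
input and the output.
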
So, on the principle of release early, release often, here's what I'm
calling hKit for PHP5 version 0.1.
It depends on SimpleXML in PHP5, and really needs either the PHP Tidy
functions or tidy on the local system (a configurable setting),
otherwise you're depending on the page being valid. It uses a
pluggable system of 'profiles' for each supported µF - the only one
of which is hCard at the moment. The GetByURL() method supports URL
This really is way too early to be releasing, but if I don't now I
probably never will. This really is a first pass, and bits of it are
a bit hacky. Known limitations (ha!) are:
* Doesn't fully enforce all the parsing rules in hcard-parsing
* Doesn't support include pattern
* Doesn't resolve paths to absolute URLs on, say, images
* No architecture to post-process values yet - e.g. email begins with
All of which makes me ask, well what does it do? :) In practice,
point it at any random hCard-enabled page and it returns a pretty
good set of results. May even be useable for basic applications at
this point. Knock up a quick profile and it'll probably handle
Most important of all, I'm licensing it under an LGPL license with the
hope that others might like to contribute at some point along the
way. If we have a reasonable set of open tools out there, it really
lowers the point of entry for others to hack together quick
applications - as X2V has already proved.
So any quick testing or feedback would be very much appreciated.