[uf-discuss] Parsing XFN in PHP

Mark Ng mark at markng.me.uk
Thu Apr 10 09:38:30 PDT 2008


All of this being true, on the simpler subject of just grokking XFN
we can let tidy do all the heavy lifting (returning XML-serialised
HTML from standard HTML) and not worry about the complexities of
anything else.  Other microformats, of course, are a completely
different matter (and, as you suggest, *MUCH* harder to deal with
reliably).

Or am I really being naive here?
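
A minimal sketch of the tidy-then-XML idea described above, written in
Python rather than PHP purely for illustration. It assumes tidy has
already turned the page into well-formed XHTML (the tidy step itself is
not shown), and it only looks at rel values on <a> elements. The names
extract_xfn, XFN_VALUES and SAMPLE are hypothetical, not part of any
existing library.

```python
import xml.etree.ElementTree as ET

# Relationship values defined by XFN 1.1.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


def extract_xfn(xhtml):
    """Return a list of (href, [xfn values]) for links carrying XFN rel values.

    Assumes `xhtml` is well-formed XML, e.g. the output of tidy.
    """
    root = ET.fromstring(xhtml)
    results = []
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue
        # Drop any "{namespace}" prefix so namespaced XHTML also matches.
        local_name = el.tag.rsplit("}", 1)[-1]
        if local_name != "a":
            continue
        href = el.get("href")
        # rel is a space-separated list; keep only the XFN values.
        rels = [r for r in el.get("rel", "").lower().split() if r in XFN_VALUES]
        if href and rels:
            results.append((href, rels))
    return results


# Hypothetical sample standing in for tidy's XHTML output.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p><a href="http://example.com/alice" rel="friend met">Alice</a>
<a href="http://example.com/terms" rel="nofollow">Terms</a>
<a href="http://example.com/me" rel="ME">Me</a></p>
</body></html>"""

if __name__ == "__main__":
    for href, rels in extract_xfn(SAMPLE):
        print(href, " ".join(rels))
```

The XML parser will still reject anything tidy fails to repair (or named
entities it leaves in place), so this covers only the simple XFN case,
not the harder microformats discussed in the quoted message.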

On 10/04/2008, Ryan Parman <ryan.lists.warpshare at gmail.com> wrote:
> As someone with a background in parsing RSS/Atom, I can say from years of
> experience that RSS is only occasionally XML and that you typically find far
> more HTML in a feed than XML. And parsing HTML can be a bitch.
>
>  1) Parsing HTML is hard -- especially when the only tools available are for
> another language (XML). If you need to screw something in, but screwdrivers
> don't exist, do you use a hammer? An elegantly folded paperclip? A
> combination of both?
>
>  2) *Reliably* parsing microformats out of *most* (X)HTML with
> object-oriented PHP 5.x is going to be a big project. If you're diligent
> about commenting your code so that others can understand what's going on,
> I'd expect a PHP5 library to be at least 1 megabyte. You'll need to account
> for an unprecedented number of completely idiotic markup faults.

