[uf-discuss] generic microformat parsing heuristics?

David Janes -- BlogMatrix davidjanes at blogmatrix.com
Mon Nov 7 11:52:30 PST 2005


Phil Dawes wrote:
> Out of interest, do you think that a generic microformats parser _can_
> be written?
> (e.g. something that could parse hcard, hcal et al out of xhtml without
> prior knowledge of their precise schemas?)

I have a theory that a pretty good library can be put together that will 
do much of the heavy lifting and am considering tackling the task using 
Python (and maybe Amara [1]).

That said, if the data within a hypothetical microformat U can be 
represented correctly as a dictionary/map, microformat V as a list of 
(key, value) pairs, and W as some sort of hierarchy, I'd probably want 
fairly different APIs for representing the data and for composing it 
during parsing.
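
To make the three shapes concrete, here is a minimal sketch using only 
Python's standard html.parser module (not Amara). It assumes well-formed 
XHTML and treats the first class name on an element as a property name. 
The names PropertyTree, parse, leaf_pairs and SAMPLE are hypothetical, 
the sample markup is made up, and this is not a conforming hCard parser.

```python
from html.parser import HTMLParser


class PropertyTree(HTMLParser):
    """Build a hierarchy of nodes from elements that carry a class attribute.

    Assumption: the input is well-formed XHTML, so every start tag has a
    matching end tag (or is self-closed, e.g. <br />).
    """

    def __init__(self):
        super().__init__()
        self.root = {"name": None, "text": "", "children": []}
        self._open = [self.root]  # one entry per currently open element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if classes:
            # Simplification: only the first class name is used as the key.
            node = {"name": classes[0], "text": "", "children": []}
            self._open[-1]["children"].append(node)
        else:
            # No class: reuse the enclosing node so end tags still pop evenly.
            node = self._open[-1]
        self._open.append(node)

    def handle_data(self, data):
        # Text is credited to the nearest enclosing node that has a class.
        self._open[-1]["text"] += data

    def handle_endtag(self, tag):
        if len(self._open) > 1:
            self._open.pop()


def parse(xhtml):
    """Return the root node of the hierarchy (the 'W' shape)."""
    parser = PropertyTree()
    parser.feed(xhtml)
    parser.close()
    return parser.root


def leaf_pairs(node):
    """Flatten to (key, value) pairs in document order (the 'V' shape)."""
    pairs = []
    for child in node["children"]:
        if child["children"]:
            pairs.extend(leaf_pairs(child))
        else:
            pairs.append((child["name"], child["text"].strip()))
    return pairs


SAMPLE = """
<div class="vcard">
  <span class="fn">David Janes</span>
  <span class="org">BlogMatrix</span>
  <span class="category">python</span>
  <span class="category">xml</span>
</div>
"""

tree = parse(SAMPLE)          # hierarchy
pairs = leaf_pairs(tree)      # list of (key, value) pairs
mapping = dict(pairs)         # dictionary/map (the 'U' shape)

print(pairs)
print(mapping)
```

Note that the dictionary keeps only the last "category" value while the 
list of pairs keeps both, which is one reason a single API may not suit 
every microformat.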

Regards, etc...
David
http://www.blogmatrix.com

[1] http://uche.ogbuji.net/uche.ogbuji.net/tech/4suite/amara/

