[uf-discuss] generic microformat parsing heuristics?

Ryan King ryan at technorati.com
Tue Nov 8 13:53:16 PST 2005

On Nov 7, 2005, at 1:11 PM, Phil Dawes wrote:

> Hi Mark,
> Mark Pilgrim wrote:
>> On 11/7/05, Phil Dawes <phil at phildawes.net> wrote:
>>> Out of interest, do you think that a generic microformats parser
>>> _can_ be written?
>>> (e.g. something that could parse hcard, hcal et al out of xhtml
>>> without prior knowledge of their precise schemas?)
>> No, nor should any effort be expended in such a pursuit.  c.f.
>> http://microformats.org/discuss/mail/microformats-discuss/2005-October/001175.html
>>  "We don't care about the general case."  This is just the general
>> case rearing its ugly head on the parsing side, instead of the
>> production side.
> Blimey - there's obviously a bit of painful history here!
> Ok cool. So I'm not advocating pursuing a standard generic model
> for semantic xhtml (I'm *obviously* at the wrong party for that!),
> just wondering if there are some shortcuts I'm missing. Given that
> there's a set of 'Semantic XHTML Design Principles' underpinning
> each format, I suspect there's some middle ground here that can be
> exploited for a bit of genericity.
> Maybe a table driven parser? ;-)

That's what these would be:
* http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/MicroParserPHP
* http://trac.labnotes.org/cgi-bin/trac.cgi/wiki/MicroParserRuby
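
For anyone wondering what "table driven" could look like in practice, here
is a minimal sketch in Python. It is not taken from either MicroParser
project; the FORMATS table, the TableDrivenParser class and the parse()
helper are all hypothetical names, and the table covers only a small
illustrative subset of hCard and hCalendar. The walker knows nothing about
any particular format: it looks up root class names and property class
names in the table, and the table also says whether a property's value
comes from the element text or from an attribute. It assumes well-formed
XHTML and does not handle nested microformats.

```python
from html.parser import HTMLParser

# Hypothetical table: root class -> {property class: attribute to read}.
# None means "use the element's text". Illustrative subset only.
FORMATS = {
    "vcard": {"fn": None, "org": None, "url": "href"},
    "vevent": {"summary": None, "dtstart": "title", "location": None},
}

# Elements that never get an end tag, so they are kept off the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TableDrivenParser(HTMLParser):
    def __init__(self, formats=FORMATS):
        super().__init__()
        self.formats = formats
        self.records = []    # finished (format, properties) pairs
        self.current = None  # (format, properties) being built, if any
        self.capturing = []  # properties currently collecting text
        self.stack = []      # per open element: what it started

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        opened = []
        if self.current is None:
            # Outside any record: does this element start one?
            for name in classes:
                if name in self.formats:
                    self.current = (name, {})
                    opened.append(("root", name))
                    break
        if self.current is not None:
            fmt, props = self.current
            for name in classes:
                if name not in self.formats[fmt]:
                    continue
                source = self.formats[fmt][name]
                if source is not None:
                    # Value lives in an attribute (e.g. href, title).
                    props[name] = attrs.get(source) or ""
                else:
                    # Value is the text inside this element.
                    props.setdefault(name, "")
                    self.capturing.append(name)
                    opened.append(("prop", name))
        self.stack.append(opened)

    def handle_data(self, data):
        if self.current is not None:
            for name in self.capturing:
                self.current[1][name] += data

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        # Undo whatever the matching start tag began, innermost first.
        for kind, name in reversed(self.stack.pop()):
            if kind == "prop":
                self.capturing.remove(name)
            else:
                fmt, props = self.current
                cleaned = {k: v.strip() for k, v in props.items()}
                self.records.append((fmt, cleaned))
                self.current = None


def parse(markup, formats=FORMATS):
    parser = TableDrivenParser(formats)
    parser.feed(markup)
    parser.close()
    return parser.records
```

Supporting another format in this sketch means adding a row to FORMATS
rather than writing new parsing code, which is about as much genericity as
the table approach buys you. Anything a real format needs beyond "class
name plus text or attribute" would still have to be handled per format.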


Ryan King
ryan at technorati.com
