[uf-discuss] generic microformat parsing heuristics?
Phil Dawes
phil at phildawes.net
Mon Nov 7 13:11:31 PST 2005
Hi Mark,
Mark Pilgrim wrote:
> On 11/7/05, Phil Dawes <phil at phildawes.net> wrote:
>
>>Out of interest, do you think that a generic microformats parser _can_
>>be written?
>>(e.g. something that could parse hcard, hcal et al out of xhtml without
>>prior knowledge of their precise schemas?)
>
>
> No, nor should any effort be expended in such a pursuit. c.f.
> http://microformats.org/discuss/mail/microformats-discuss/2005-October/001175.html
> "We don't care about the general case." This is just the general
> case rearing its ugly head on the parsing side, instead of the
> production side.
Blimey - there's obviously a bit of painful history here!
Ok cool. So I'm not advocating pursuing a standard generic model for
semantic xhtml (I'm *obviously* at the wrong party for that!), just
wondering if there are some shortcuts I'm missing. Given that there's a
set of 'Semantic XHTML Design Principles' underpinning each format, I
suspect there's some middle ground here that can be exploited for a bit
of genericity.
Maybe a table driven parser? ;-)
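To make the table-driven idea a bit more concrete, here is a rough sketch in
Python. Everything in it is an assumption for illustration: the FORMATS table,
the TableDrivenParser class and the parse_microformats function are
hypothetical names, the property lists are small subsets rather than the real
hCard/hCalendar schemas, and it only collects element text (real formats also
take some values from attributes, e.g. href or title, which this ignores). It
also assumes well-formed markup and does not handle nested formats.

```python
from html.parser import HTMLParser

# Hypothetical table: root class name -> property class names to collect.
# Illustrative subsets only, not the full hCard/hCalendar schemas.
FORMATS = {
    "vcard": {"fn", "org", "tel"},
    "vevent": {"summary", "location", "description"},
}

# Elements that never get a closing tag, so they are skipped entirely.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class TableDrivenParser(HTMLParser):
    """Collects (format, {property: text}) records, driven only by a table."""

    def __init__(self, formats):
        super().__init__()
        self.formats = formats
        self.records = []    # finished (root_name, properties) pairs
        self.stack = []      # role of each open element: "root", "prop" or None
        self.current = None  # (root_name, properties) currently being filled
        self.prop = None     # (property_names, text_parts) currently being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        role = None
        if self.current is None:
            # Outside any record: look for a root class listed in the table.
            root = next((c for c in classes if c in self.formats), None)
            if root is not None:
                self.current = (root, {})
                role = "root"
        elif self.prop is None:
            # Inside a record: look for property classes of that format.
            names = [c for c in classes if c in self.formats[self.current[0]]]
            if names:
                self.prop = (names, [])
                role = "prop"
        self.stack.append(role)

    def handle_data(self, data):
        if self.prop is not None:
            self.prop[1].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.stack:
            return
        role = self.stack.pop()
        if role == "prop":
            names, parts = self.prop
            text = " ".join("".join(parts).split())  # normalise whitespace
            for name in names:
                self.current[1][name] = text
            self.prop = None
        elif role == "root":
            self.records.append(self.current)
            self.current = None


def parse_microformats(html, formats=FORMATS):
    """Return a list of (format_name, {property: text}) found in html."""
    parser = TableDrivenParser(formats)
    parser.feed(html)
    parser.close()
    return parser.records
```

With something like this, supporting another format would mean adding a row to
the table rather than writing new parsing code, at the cost of ignoring
whatever format-specific rules the table cannot express.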
Cheers,
Phil
P.S. A big thanks for feedparser, BTW - it has saved me weeks of coding time
at work in the last year.