[uf-discuss] xdmp profiles not enough for parsing?

Phil Dawes phil at phildawes.net
Wed Nov 16 11:26:25 PST 2005


Tantek Çelik wrote:

[.. a load of sage stuff about XMDP, which I agree with BTW.. - I'm not 
trying to change XMDP]

..and then wrote:
>
> Worse than that, you WILL make mistakes in terms of thinking, oh you would
> ONLY want to embed A in B, until someone figures out that oops, in
> *practice* you actually *do* want to embed <ol> or <ul> inside a <p> for
> example (from HTML4 DTD).
> 

That's cool, but microformats are already constrained by an informal 
spec and a pre-existing schema. adr elements *do* only contain 
(post-office-box, extended-address, street-address, locality, region, 
postal-code, country-name, type) because the spec says so.

In fact things are made worse by a lack of mechanically processable 
structure info: if the above rules change, each parser needs to be 
rewritten with the new hardcoded rules, and each software installation 
needs to upgrade to the new libraries.
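
To illustrate the hardcoding problem, here is a rough sketch of what such a 
rule looks like inside a parser. The table is copied from the adr property 
list above; the names ADR_PROPERTIES and unknown_adr_properties are made up 
for this example and do not come from any real library.

```python
# Hypothetical hardcoded rule table, copied from the adr property list above.
# If the spec adds or drops a property, this source file has to be edited
# and every installation has to pick up the new release.
ADR_PROPERTIES = frozenset({
    "post-office-box", "extended-address", "street-address", "locality",
    "region", "postal-code", "country-name", "type",
})


def unknown_adr_properties(class_names):
    """Return, sorted, the class names that are not in the hardcoded table."""
    return sorted(set(class_names) - ADR_PROPERTIES)
```

A parser built this way would report any class name missing from the table 
as unknown until the table itself is edited and redeployed.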

Having said that, I don't think schemas are the way to go. Personally 
I'd prefer a reasonable set of heuristics to deduce structure from the 
compound format xhtml rather than mess about with yet-another-schema 
language. I don't think it would be hard either - something like "don't 
put text between the parent element and its children" would probably do 
it (I haven't found an example of a microformat that violates this 
anyway).
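
A minimal sketch of that kind of heuristic, under some loud assumptions: the 
markup is well-formed XHTML, properties are marked by class attributes on 
direct child elements, and the rule is read strictly, so any non-whitespace 
text sitting directly inside a compound element counts as a violation 
(whether separators such as commas should be tolerated is left open). The 
function names and the sample markup are invented for illustration.

```python
import xml.etree.ElementTree as ET


def deduce(elem):
    """Deduce structure from class names alone, with no schema.

    An element with no classed children is a leaf and yields its text.
    Otherwise it is compound and yields {class name: [values, ...]}.
    Assumption: only direct children with a class attribute are considered.
    """
    children = [c for c in elem if c.get("class")]
    if not children:
        return "".join(elem.itertext()).strip()
    result = {}
    for child in children:
        for name in child.get("class").split():
            result.setdefault(name, []).append(deduce(child))
    return result


def stray_text(elem):
    """Text that sits directly inside elem, outside all of its children."""
    parts = [elem.text or ""] + [c.tail or "" for c in elem]
    return "".join(parts).strip()


def violations(root):
    """Class attributes of compound elements that carry stray text."""
    return [
        e.get("class")
        for e in root.iter()
        if any(c.get("class") for c in e) and stray_text(e)
    ]


good = ('<div class="adr"><span class="locality">Bristol</span>'
        '<span class="country-name">UK</span></div>')
bad = '<div class="adr">Lives in <span class="locality">Bristol</span></div>'

good_structure = deduce(ET.fromstring(good))
good_violations = violations(ET.fromstring(good))
bad_violations = violations(ET.fromstring(bad))
```

With the rule in place the parser never needs to know that adr contains 
locality; it reads the nesting straight off the markup, and the "bad" 
sample is flagged because "Lives in" belongs to no property.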

Cheers,

Phil

