[uf-discuss] xdmp profiles not enough for parsing?
phil at phildawes.net
Wed Nov 16 11:26:25 PST 2005
Tantek Çelik wrote:
[.. a load of sage stuff about xdmp, which I agree with BTW.. - I'm not
trying to change XDMP]
..and then wrote:
> Worse than that, you WILL make mistakes in terms of thinking, oh you would
> ONLY want to embed A in B, until someone figures out that oops, in
> *practice* you actual *do* want to embed <ol> or <ul> inside a <p> for
> example (from HTML4 DTD).
That's cool, but microformats are already constrained by an informal
spec and a pre-existing schema. adr elements *do* only contain
(post-office-box, extended-address, street-address, locality, region,
postal-code, country-name, type) because the spec says so.
In fact things are made worse by the lack of mechanically processable
structure info: if the above rules change, each parser needs to be
rewritten with the new hardcoded rules, and each software installation
needs to upgrade to the new libraries.
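To illustrate the maintenance problem, here is a minimal sketch of what "hardcoded rules" tend to look like inside a parser. The names ADR_CHILDREN and is_adr_property are hypothetical, not taken from any real microformats library; the set itself is just the adr list quoted above.

```python
# Hypothetical sketch: the adr child rules baked directly into parser code.
# If the spec adds or renames a property, this source has to be edited and
# every installation has to pick up the new release.
ADR_CHILDREN = {
    "post-office-box",
    "extended-address",
    "street-address",
    "locality",
    "region",
    "postal-code",
    "country-name",
    "type",
}


def is_adr_property(class_name):
    """Return True if class_name is one of the hardcoded adr sub-properties."""
    return class_name in ADR_CHILDREN
```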
Having said that, I don't think schemas are the way to go. Personally
I'd prefer a reasonable set of heuristics to deduce structure from the
compound format xhtml rather than mess about with yet-another-schema
language. I don't think it would be hard either - something like "don't
put text between the parent element and its children" would probably do
it (I haven't found an example of a microformat that violates this
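
A rough sketch of how such a heuristic could be checked mechanically, assuming one reading of the rule: an element that has child elements should carry no non-whitespace text of its own, either before its first child or after any child. The function names has_mixed_content and violations are made up for illustration, and the input is assumed to be well-formed XHTML without namespaces.

```python
import xml.etree.ElementTree as ET


def has_mixed_content(elem):
    """True if elem has child elements and also non-whitespace text of its own."""
    if len(elem) == 0:
        # Leaf element: its text is its value, so nothing to check.
        return False
    if elem.text and elem.text.strip():
        # Text between the parent's start tag and its first child.
        return True
    # Text following any child, still inside the parent.
    return any(child.tail and child.tail.strip() for child in elem)


def violations(xhtml):
    """Return the class (or tag name) of every element that breaks the rule."""
    root = ET.fromstring(xhtml)
    return [e.get("class") or e.tag for e in root.iter() if has_mixed_content(e)]


ok = ('<div class="adr"><span class="locality">Bristol</span> '
      '<span class="region">Avon</span></div>')
bad = '<div class="adr">Lives in <span class="locality">Bristol</span></div>'

print(violations(ok))
print(violations(bad))
```

Whitespace between siblings is ignored here on purpose; whether punctuation between child elements should count as a violation is an open question that this sketch does not settle.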