[uf-dev] Defining and Extending Value Excerpting

Ben Ward lists at ben-ward.co.uk
Sun May 18 05:09:23 PDT 2008


Hey Toby,

On 17 May 2008, at 23:08, Toby A Inkster wrote:
> Although this sounds like a nice idea, I've previously been informed  
> that requiring empty inline elements is a non-starter, as many HTML  
> processors (including "tidy" with its default settings) strip these  
> out.
>
> Preliminary testing with tidy (version: 1 September 2005) shows this  
> to be true. Some parsers, including X2V IIRC, pre-process non-XHTML  
> HTML by running it through tidy to get it into well-formed XML.  
> Skimming through the tidy documentation, I can't see a way of  
> disabling this empty inline element stripping behaviour.

hKit runs its input through tidy too (via the W3C-hosted version,
although there was some talk of switching to PHP's native HTML DOM
parser instead). Looking over the HTML Tidy bug tracker, it does seem
to be an open issue, but there's one bug (http://is.gd/i8E) proposing
that tidy not drop empty elements that have class attributes, and it
includes a simple fix. Applying that fix would resolve this.
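
To make the difference concrete, here's a rough Python sketch of the two
behaviours. It is not tidy's code and not the patch from that bug; the
regex, the function name strip_empty_inline, the keep_classed flag and
the sample markup are all made up for illustration, and a real fix
would live in tidy's own clean-up pass rather than in a regex.

```python
import re

# Hypothetical stand-in for tidy's clean-up: matches an empty inline
# element such as <span ...></span> (whitespace-only content counts as empty).
EMPTY_INLINE = re.compile(
    r"<(?P<tag>span|abbr|em|strong|b|i)(?P<attrs>\s[^<>]*)?>\s*</(?P=tag)>",
    re.IGNORECASE,
)
# Crude check for a class attribute inside the matched attribute string.
HAS_CLASS = re.compile(r"\sclass\s*=", re.IGNORECASE)


def strip_empty_inline(html, keep_classed=False):
    """Drop empty inline elements; optionally keep those with a class."""
    def replace(match):
        attrs = match.group("attrs") or ""
        if keep_classed and HAS_CLASS.search(attrs):
            return match.group(0)  # proposed behaviour: leave it alone
        return ""                  # stripping behaviour: element vanishes
    return EMPTY_INLINE.sub(replace, html)


# Illustrative markup only: an empty span carrying data in its attributes.
sample = ('<span class="dtstart">'
          '<span class="value" title="2008-05-18"></span>Sunday</span>')

stripped = strip_empty_inline(sample)
kept = strip_empty_inline(sample, keep_classed=True)
```

With stripping on, the inner span (and the data in its title) is gone
before a microformat parser ever sees it; with the class check, the
markup passes through untouched, while empty elements without a class
are still cleaned up.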

> If people *want* to publish data that uses empty inline elements,  
> then that's fair enough, but with the current state of HTML  
> processors, it's probably unwise to publish a pattern that  
> *requires* the use of empty inline elements.

I'm not entirely comfortable with a broken part of the parser stack  
being a blocker for a mark-up level pattern.

Of course, if we can't work out a fix, then you're absolutely right
that we can't go requiring something that's too expensive to parse
(especially given that parsing expense is the whole reason for having
specified data formats within microformats in the first place!). But
if it's feasible to fix tidy for microformat parsers, then I'd be in
favour of doing so.

B

