[uf-discuss] Human and machine readable data format

Fri Jul 11 01:17:54 PDT 2008

Martin McEvoy wrote:

> <div class="item updated">
> 	<p>Date <span class="value 2008-07-11T00:01+0100">Friday, July the  
> 11th 2008</span></p>
> </div>

There are a couple of problems with this:

Firstly, the class attribute may contain more than two classes - e.g.
it may contain some others that have been added for styling or
JavaScript purposes. When there are more than two classes, parsers
will need to have some kind of heuristic to figure out which one to
parse as the value. This may be pretty easy for dates, but if someone
wanted to use the pattern for one of the other problematic properties
that have been identified (e.g. "type" in hCard tel, or in hReview),
this would become harder.
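
To make the ambiguity concrete, here is a rough sketch in Python. The function name and the extra class names ("highlight", "js-toggle") are made up for illustration and do not come from any real parser:

```python
def candidate_values(class_attr, reserved=("value",)):
    """Return every class token that could plausibly be the embedded value.

    Hypothetical helper: a parser only knows that "value" is a marker,
    so every other token is a candidate.
    """
    return [tok for tok in class_attr.split() if tok not in reserved]


two_classes = "value 2008-07-11T00:01+0100"
many_classes = "value 2008-07-11T00:01+0100 highlight js-toggle"

# With exactly two classes there is a single candidate.
print(candidate_values(two_classes))
# With styling/scripting classes added, the parser has to guess.
print(candidate_values(many_classes))
```

For a date, a parser could try each candidate against a date pattern, but for a free-form property such as "type" there is no obvious test to apply.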

Secondly, and more importantly, it breaks the existing interpretation  
of class="value", which on <span> elements is currently used to mean  
that the textual content of the element should be used as the value.  
Faced with a "value" class, how should parsers know whether to parse
by the old method (take the value from the element contents) or the new
method (from the class attribute)? And yes, they will need to
continue to support the old method because of the existing corpus of  
published data out there.

Frances' proposal with the "data-" prefix can suffer from the first  
problem (if there are two classes with a "data-" prefix), but that is  
easily spec'ed around by saying that in those situations, the longest  
such value is to be used. And it doesn't suffer from the second  
problem at all - the existence of a class with a "data-" prefix is a
clear heuristic for parsers to determine whether to use the old  
method or new method.
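
A minimal sketch of that rule in Python, assuming the prefixed class looks like "data-2008-07-11T00:01+0100" (the exact syntax is my assumption here, and extract_value is a hypothetical name, not part of any existing parser):

```python
DATA_PREFIX = "data-"  # assumed spelling of the prefix


def extract_value(class_attr, text_content):
    """Pick the machine-readable value for one element.

    New method: if any class starts with "data-", use the longest such
    class, minus the prefix. Old method: otherwise fall back to the
    element's textual content.
    """
    data_tokens = [t for t in class_attr.split() if t.startswith(DATA_PREFIX)]
    if data_tokens:
        longest = max(data_tokens, key=len)
        return longest[len(DATA_PREFIX):]
    return text_content.strip()


print(extract_value("updated data-2008-07-11T00:01+0100",
                    "Friday, July the 11th 2008"))
print(extract_value("value", "2008-07-11"))
```

The presence of the prefix is the only thing the parser has to check, so existing class="value" markup keeps its current meaning.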

-- 
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>