[uf-discuss] More responses to slashdot comments

Fri Jul 14 01:38:23 PDT 2006

On 7/13/06, Sho Kuwamoto <skuwamot at adobe.com> wrote:
> Exactly my point. There are two competing schema living in the same
> document: the world of HTML (semantically poor and unextensible), and
> the world of microformats. While this works out OK usually, I believe
> there are cases where the two worlds combine in uncomfortable ways.

There have to be two types of schema because there are, as you rightly
said, two orthogonal sets of semantics in HTML.

The first is the tag/attribute based semantics, which are very
strictly defined by the W3C spec.  These are mainly to do with
document structure and so on, and everyone understands what they mean.

The second set of semantics is class/id based, and is completely
'unregulated', that is to say the specific meanings aren't specified
by the HTML spec.  If I want @id="shopping-list" then there's no
reason I shouldn't mark my pages up that way, and there's some
semantic value in doing so over something like @id="centre-column".

Microformats form conventions for how *both* sets of semantics
should be used.  Microformats will, by preference, use the first set
as far as possible (e.g. using ADDRESS in hCard) and then define
sensible semantic ids/classes for stuff that isn't covered.

Microformats differ from schemas like the W3C's HTML spec, because
pages don't have a mechanism for declaring that they conform to a
specific microformat.  I don't think this is so much a weakness as a
strength!

Take @rel="tag" for instance.  The microformat for this declares very
specifically what semantics we can read from the relationship between
the current page and the URL being linked. However, there's nothing to
stop someone who's never heard of microformats deciding to use
@rel="tag" on one of their pages, because it seems a sensible value to
use, and you can't tell by looking at a page whether the author had
the microformat in mind or not.

I believe that the strength of microformats is that they are always
sensible markup, so the markup still makes sense to someone whether
or not they know about the microformat being used.  If I'm looking at
a link and see @rel="tag" in there then that's not cryptic - I can
understand what the link is saying even if I haven't heard of the
microformat.

The converse of this is that if I build a parser that understands
@rel="tag" into my search engine, then I have a spec that tells me a
sensible way to parse and understand the semantics of the link.  When
my search engine finds the hypothetical page above, whose author used
@rel="tag" without knowing the microformat, then because the spec
defines a sensible way of parsing it, my search engine will have a
good chance of correctly understanding what the link relationship
means.
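
To make that concrete, here is a rough Python sketch of the kind of
parser I mean.  It is only an illustration, not a reference
implementation: the names RelTagParser and extract_tags are made up
for this example, it only looks at A elements, and it assumes the
rel-tag convention that the tag is the last segment of the URL path.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, unquote


class RelTagParser(HTMLParser):
    """Collects tags from links marked up with rel="tag" (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rel is a space-separated list, so "nofollow tag" also counts.
        rel_values = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "tag" not in rel_values or not href:
            return
        # Assumption: the tag is the last path segment of the linked URL.
        path = urlparse(href).path.rstrip("/")
        last_segment = path.rsplit("/", 1)[-1]
        if last_segment:
            self.tags.append(unquote(last_segment))


def extract_tags(html):
    """Return the tags found in rel="tag" links, in document order."""
    parser = RelTagParser()
    parser.feed(html)
    return parser.tags
```

The parser never asks whether the author had the microformat in mind.
Any link carrying @rel="tag" is read the same way, which is exactly
why the hypothetical page above still gets understood.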

-Ciaran