validating microformats (was Re: [uf-discuss] Google Gdata new
syndication protocol!)
Scott Reynen
scott at randomchaos.com
Fri Apr 21 08:34:11 PDT 2006
On Apr 21, 2006, at 10:03 AM, Benjamin Carlyle wrote:
> So what does validation mean for a microformat? I think the only
> criterion for success that we can meaningfully apply is that the data
> we put into the document came back out again through a
> machine-operated process. We already have the machine-operated
> processes for various microformats (x2v, hAtom2Atom.xsl, etc.), but a
> human must still be in the loop to determine whether all of their data
> got through or not. Unfortunately, that's another "by definition"
> problem. If the data isn't machine-readable in the first place, a
> machine won't know it's missing.
I imagine a microformat validator would be relatively short on errors
and long on warnings or "tips". Each class could have a list of
potential sub-classes, and when those don't turn up, I think a
message like this would help: "Tip: vcards can have telephone numbers.
Did you mean to include a telephone number? If so, you need to use the
following syntax:" In addition to catching actual oversights, such
messages would encourage more complete descriptions, putting more
microformatted data on the web.
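A minimal sketch of such a tip generator, to make the idea concrete. The OPTIONAL_PROPERTIES table, the ClassCollector class, and the tips() function are hypothetical names invented for this illustration, the sub-class list is abbreviated, and the markup is assumed to be well-formed with explicit closing tags (a bare `<br>` would confuse the simple element stack).

```python
from html.parser import HTMLParser

# Hypothetical, abbreviated table: root class -> sub-classes worth a tip.
OPTIONAL_PROPERTIES = {"vcard": ["tel", "email", "adr", "url"]}


class ClassCollector(HTMLParser):
    """Collect the class names that appear inside each root element."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one entry per open element: (root, seen) or None
        self.found = []  # every (root, seen) pair, in document order

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        # Credit these class names to every enclosing root element.
        for entry in self.stack:
            if entry is not None:
                entry[1].update(classes)
        root = next((c for c in classes if c in OPTIONAL_PROPERTIES), None)
        entry = (root, set()) if root else None
        if entry:
            self.found.append(entry)
        self.stack.append(entry)

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()


def tips(html):
    """Return one tip per optional sub-class missing from each root."""
    parser = ClassCollector()
    parser.feed(html)
    return [
        f'Tip: {root} can have "{prop}". Did you mean to include one?'
        for root, seen in parser.found
        for prop in OPTIONAL_PROPERTIES[root]
        if prop not in seen
    ]
```

Calling tips() on a vcard that only contains "fn" and "email" would yield tips for "tel", "adr", and "url". None of these are errors, since the sub-classes are optional, which is why they are reported as tips rather than failures.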
On the other end, any node found with no recognizable class name
could be checked against recognizable content patterns. If there's
an unmarked node within "tel" with a bunch of numbers, I'd like a
validator to suggest that I might want to put class="value" around
it, because it looks like it might be the value of my telephone number.
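The content-pattern check could look roughly like the sketch below. The PHONE_LIKE pattern, the seven-digit threshold, and the TelHeuristic and suggest_values names are assumptions made for this illustration, not part of any existing validator, and the markup is again assumed to be well-formed with explicit closing tags.

```python
import re
from html.parser import HTMLParser

# Assumed pattern: optional "+", then digits and common phone punctuation.
PHONE_LIKE = re.compile(r"^\+?[\d\s().-]{7,}$")


class TelHeuristic(HTMLParser):
    """Flag unclassed nodes inside "tel" whose text looks like a number."""

    def __init__(self):
        super().__init__()
        self.stack = []        # class-name list for each open element
        self.suggestions = []

    def handle_starttag(self, tag, attrs):
        self.stack.append((dict(attrs).get("class") or "").split())

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if len(self.stack) < 2 or not text:
            return
        parent, ancestors = self.stack[-1], self.stack[:-1]
        inside_tel = any("tel" in classes for classes in ancestors)
        digits = sum(ch.isdigit() for ch in text)
        # Only unmarked nodes (no class at all) are candidates.
        if inside_tel and not parent and digits >= 7 and PHONE_LIKE.match(text):
            self.suggestions.append(
                f'Tip: "{text}" looks like a telephone number. '
                'Did you mean to put class="value" around it?'
            )


def suggest_values(html):
    """Return suggestions for phone-like text in unmarked nodes under "tel"."""
    parser = TelHeuristic()
    parser.feed(html)
    return parser.suggestions
```

A node that already carries class="value" or class="type" is skipped, so the check only speaks up where the author left something unmarked.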
> We can try and do
> heuristic validation ("this class name you used looks like one that
> could mean something if it were written in a different way"), but the
> heuristics would have to be borne out of implementation experience
> with "common errors" for particular microformats.
I can't think of a better way to discover those common errors than a
validator. I think most of the formatting errors we see on this list
could be recognized by a machine, which would save everyone time and
make authors feel more sure about whether or not they are doing
hwhatever correctly.
Peace,
Scott