validating microformats (was Re: [uf-discuss] Google Gdata new syndication protocol!)

Mark Pilgrim pilgrim at gmail.com
Fri Apr 21 12:29:24 PDT 2006


On 4/21/06, Mark Pilgrim <pilgrim at gmail.com> wrote:
> Perhaps the only way to convince you that a microformats validator would
> be useful is to build the validator I'm imagining and show you.

On the road to working code, here are some notes I threw together on
the sorts of things I would expect a microformats validator to catch. 
I concentrated on hCard since it's complex and I'm familiar with it,
but many of the rules would apply to any microformat.



Errors (the W3C's XHTML validator won't catch any of these):

- Properties that are supposed to have a URI value but do not conform
to RFC 3986

<address class="vcard"><span class="photo">This is a photo of my
dog</span></address>

- URI property with data: URI value that does not conform to RFC 2397

- email property value with a mailto: URI that does not conform to RFC 2368

- date property with non-date value (or illegal date, or date in wrong format)

<address class="vcard"><span class="bday">April 21th, 2006</span></address>
<address class="vcard"><span class="bday">1993-06-31</span></address>
<address class="vcard"><span class="rev">1993-10-15T25:01:00Z</span></address>
<address class="vcard"><abbr class="rev" title="Fri, 21 Apr 2006
04:39:10 +0000">April 21th, 2006</abbr></address>
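
A minimal sketch of the date rule in Python, to make the four cases
above concrete. The function name is hypothetical, and it assumes only
two layouts are acceptable (YYYY-MM-DD and YYYY-MM-DDThh:mm:ssZ), which
is narrower than everything ISO 8601 allows. A real validator would
need to settle the accepted profile first.

```python
import re
from datetime import datetime

# Assumption: only these two layouts count as valid date values.
_DATE_SHAPE = re.compile(
    r"[0-9]{4}-[0-9]{2}-[0-9]{2}(T[0-9]{2}:[0-9]{2}:[0-9]{2}Z)?"
)


def is_valid_date(value):
    """Return True if value has an accepted layout and names a real date."""
    value = value.strip()
    # First check the shape, so free text and RFC 822 dates are rejected.
    if not _DATE_SHAPE.fullmatch(value):
        return False
    fmt = "%Y-%m-%dT%H:%M:%SZ" if "T" in value else "%Y-%m-%d"
    # Then let strptime reject impossible values (June 31, hour 25).
    try:
        datetime.strptime(value, fmt)
    except ValueError:
        return False
    return True
```

All four example values above fail this check: the first and last on
shape, the second on the day of the month, the third on the hour. For
an <abbr>, the value to test would be the title attribute, not the
element text.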

- geo/latitude and geo/longitude properties with improper values

<address class="vcard"><span class="geo"><span class="latitude">North
America</span></span></address>
<address class="vcard"><span class="geo"><span
class="latitude">370</span></span></address>
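
A rough sketch of the range check in Python. The function name and the
limit parameter are hypothetical; it assumes values are decimal degrees,
with latitude limited to +/-90 and longitude to +/-180.

```python
def is_valid_coordinate(value, limit):
    """Return True if value parses as a number within [-limit, +limit]."""
    try:
        number = float(value.strip())
    except ValueError:
        # Not a number at all, e.g. "North America".
        return False
    # NaN and infinities fail this comparison and are rejected too.
    return -limit <= number <= limit
```

With limit=90, both example latitudes above are rejected, while
"37.386013" passes. Note that float() is more lenient than a strict
decimal grammar (it accepts exponents, for instance), so a real
validator may want a tighter pattern.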

- Section 2.4.3 of RFC 2426 says phone numbers are supposed to conform
to [CCITT E.163] and [CCITT X.121] (haven't looked into those yet)

- tz property value that does not conform to section 2.4.4 of RFC 2426

- agent property value that is not an hCard

- required FN property missing (see section 1 of RFC 2426)

- <abbr> without title attribute (see
http://microformats.org/wiki/hcard#Human_vs._Machine_readable )

- <img> without an alt attribute (only an error if it has an hCard
class that is not a URI property, see
http://microformats.org/wiki/hcard#Human_vs._Machine_readable )

- illegal property present (NAME, PROFILE, SOURCE, PRODID, VERSION,
see http://microformats.org/wiki/hcard#Property_Exceptions )

<address class="vcard"><span class="prodid">-//ONLINE
DIRECTORY//NONSGML Version 1//EN</span></address>

- more than one sort-string property present

<address class="vcard">
<span class="fn n">
 <span class="additional-names">Robert</span>
 <span class="family-name sort-string">Pau</span>
 <span class="given-name sort-string">Shou Chang</span>
</span>
</address>
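
One way to detect this, sketched in Python with the standard library's
html.parser. The class and function names are hypothetical, and the
sketch counts class names across the whole input rather than per hCard,
so it assumes the markup passed in is a single vcard.

```python
from collections import Counter
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count how often each class name appears on any start tag."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute is a whitespace-separated token list.
                self.counts.update(value.split())


def count_classes(markup):
    """Return a Counter of class names found in the markup."""
    parser = ClassCounter()
    parser.feed(markup)
    parser.close()
    return parser.counts
```

For the example above, count_classes(...)["sort-string"] is 2, which
would trigger the error. The same counter could flag the illegal
properties (name, profile, source, prodid, version) by checking for a
non-zero count.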

- N property present and non-empty when FN and ORG properties are also
present and have the same value as each other (see
http://microformats.org/wiki/hcard#Organization_Contact_Info )

- N property missing and FN property is not exactly two words
separated by whitespace (see
http://microformats.org/wiki/hcard#Implied_.22n.22_Optimization )
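
The last rule is simple enough to sketch directly in Python. The
function name and its parameters are hypothetical; it assumes the FN
value has already been extracted as text and that the caller knows
whether an N property is present. It does not handle the organization
case from the previous rule.

```python
def check_implied_n(fn_value, has_n):
    """Return an error message, or None if the implied-n rule is satisfied."""
    if has_n:
        # An explicit N property means nothing needs to be implied.
        return None
    # split() with no arguments splits on any run of whitespace.
    if len(fn_value.split()) == 2:
        return None
    return "N is missing and FN is not exactly two words"
```

So check_implied_n("Mark Pilgrim", False) passes, while a four-word FN
with no N property produces the error.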



Warnings (may indicate publishing errors and/or cause interoperability
problems for consumers):

- no profile URI present (can only be a warning once we, you know,
define a profile URI)

- property with empty value (that does not match one of the
error-producing rules above)

<address class="vcard"><span class="honorific-suffix"></span></address>

- subproperty out of its proper container

<address class="vcard"><span class="given-name">Mark</span></address>

- type with no value (this may be an error, I can't tell)

<address class="vcard"><span class="tel"><span
class="type">home</span>: +1.415.555.1212</span></address>

- type with a value not defined in RFC 2426 (it is not clear from
section 3.2.1 whether this is an error)

<address class="vcard"><div class="label"><abbr class="type"
title="foo">my new adr type</abbr></div></address>

- geo information with latitude but not longitude (or vice-versa)

<address class="vcard"><span class="geo"><span
class="latitude">37.386013</span></span></address>
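
A small Python sketch of this last warning. The function name is
hypothetical; it assumes the latitude and longitude values have already
been extracted, with None standing for a missing subproperty.

```python
def geo_pair_warning(latitude, longitude):
    """Return a warning message if only one half of the geo pair is present."""
    # Treat None and whitespace-only strings as missing.
    has_lat = bool(latitude and latitude.strip())
    has_lon = bool(longitude and longitude.strip())
    if has_lat and not has_lon:
        return "geo has latitude but no longitude"
    if has_lon and not has_lat:
        return "geo has longitude but no latitude"
    return None
```

For the example above, geo_pair_warning("37.386013", None) returns the
first message. When both values are missing it returns None, on the
assumption that the empty-value warning covers that case.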

--
Cheers,
-Mark

