Misc (was: [uf-discuss] Disambiguation Conventions? (was Comments from IBM/Lotus rep about Microformats))

Joe Andrieu joe at andrieu.net
Thu Dec 14 02:45:45 PST 2006

Andy Mabbett wrote:
> Joe Andrieu wrote:
> >For example, since it was initially stabilized hCard has been changed
> >to include "place" in its semantics, yet we have no way to let parsers
> >know that the "new" hCards may not be people, companies, or
> >organizations, but instead may also be places.
> This interests me. It's not apparent from the 'wiki' that that's a
> more recent development. Please can you cite links for the relevant
> discussion?

Well, it is actually more convoluted than that.

The hcard profile[1], which one might expect to be the most definitive
reference of what an hCard is, says this:
  All values are defined according to the semantics defined in the hCard
specification and thus in RFC 2426.

Semantics by RFC2426. If you track RFC 2426[2] down, it says this:
This memo defines the profile of the MIME Content-Type [MIME-DIR] for
   directory information for a white-pages person object, based on a
   vCard electronic business card.

A person object, based on vCard. Tracking down the vCard 2.0[3]
specification finds this:
The schema is based on the attributes for the person object defined in
the X.520 and X.521 directory services recommendations.

X.521 says this:
person OBJECT-CLASS ::= {
  SUBCLASS OF   {top}
  MUST CONTAIN  {commonName | surname}
  MAY CONTAIN   {description | telephoneNumber | userPassword | seeAlso}
  ID            id-oc-person }

However, this is perhaps better clarified in RFC4519[5], Lightweight
Directory Access Protocol (LDAP): Schema for User Applications:
   The 'person' object class is the basis of an entry that represents a
   human being.
   (Source: X.521 [X.521])

      ( NAME 'person'
         SUP top
         MUST ( sn $
               cn )
         MAY ( userPassword $
               telephoneNumber $
               seeAlso $ description ) )
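
To make concrete what that definition demands, here is a minimal sketch
in Python.  It is not taken from X.521 or RFC 4519: the function name and
the idea of holding an entry as a plain dict are assumptions for
illustration, and it ignores details such as case-insensitive attribute
names and entries that carry additional object classes.  Only the
attribute names come from the definition quoted above.

```python
# Attribute names copied from the 'person' definition quoted above.
PERSON_MUST = {"sn", "cn"}
PERSON_MAY = {"userPassword", "telephoneNumber", "seeAlso", "description"}


def check_person_entry(entry):
    """Return a list of problems found in a dict-shaped 'person' entry.

    Both the dict representation and this helper are hypothetical; a real
    directory server does considerably more than this.
    """
    problems = []
    present = set(entry)
    # Every MUST attribute has to be present.
    for name in sorted(PERSON_MUST - present):
        problems.append("missing required attribute: " + name)
    # Anything outside MUST and MAY is not covered by this object class.
    for name in sorted(present - PERSON_MUST - PERSON_MAY):
        problems.append("attribute not allowed by 'person': " + name)
    return problems
```

With this sketch, check_person_entry({"cn": "Joe Andrieu", "sn":
"Andrieu"}) returns an empty list, while an entry holding only
{"cn": "Santa Barbara"} is reported as missing "sn".  A surname is
required, which is a fair hint that the class was designed with human
beings in mind.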

So, by following the trail of the hcard-profile, we see that hCards are
for People.

However, the description of hCard on the wiki[6] today (2006/12/14) says:
hCard is a simple, open, distributed format for representing people,
companies, organizations, and places, using a 1:1 representation of the
properties and values of the vCard standard (RFC2426
(http://www.ietf.org/rfc/rfc2426.txt)) in semantic XHTML. 

It used to say (2005/6/19):
hCard is a simple, open, distributed contact information format for
people companies and organizations which is suitable for embedding in
(X)HTML, Atom, RSS, and arbitrary XML.

The change to add places was made by Tantek on Aug 3, 2006 after a
fairly brief discussion on the list.  I happened to be one of the voices
opposing the change[7], although I spoke up a bit late in the
conversation, after the powers that be were convinced the change was a
good thing.  What stands out is that microformats is the only entity in
the entire chain of authorship, from CCITT to Versit to IETF, to change
the spec from being about people to being about /more/ than people.
Other changes were about adding features to extend what you could say
about people.  We just up and redefined the core semantic.

First, we included companies and organizations in the very beginning.
Then, a year later, we added places.  And yet, our own documentation
contradicts itself: the profile specifies hCards for PEOPLE only, via
RFC 2426, but the description of hCard on the wiki now extends that to
companies, organizations, and even locations.

This is a pretty significant loss of semantic specificity. While a human
can disambiguate between hCards for places and hCards for people, a
/machine/ would have a very hard time of it.  The entire point of the
semantic web is to make it easy^H^H^H^H /possible/ for machines to make
sense of the information that's out there.  Now, when a spider finds an
hCard, it can't tell if it is a person, company, organization, or place.
That sucks.  It would be much more useful if hCards could actually be
expected to be people!  Imagine that. Then machines might be able to do
something useful with this class of entities they discovered while
cruising around.  But that route is lost to us.
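
To illustrate the spider's problem, here is a minimal sketch in Python.
The dict representation of a parsed hCard, the function name, and the
sample cards are all made up for the example; they are not the output of
any real parser.  The sketch assumes the one convention I understand the
hCard page to describe, namely that a card whose "fn" and "org" values
are identical represents an organization.  Beyond that rule it has
nothing to go on.

```python
def guess_hcard_kind(card):
    """Guess what a parsed hCard describes.

    `card` is a hypothetical dict mapping property names to string
    values, as some parser might produce.
    """
    fn = card.get("fn")
    if fn is not None and fn == card.get("org"):
        # Assumed convention: identical "fn" and "org" marks an organization.
        return "organization"
    # Nothing in the remaining properties separates a person from a place.
    return "person or place"


# Made-up sample data: one card meant as a person, one meant as a place.
person_card = {"fn": "Sandy Point", "locality": "Santa Barbara", "region": "CA"}
place_card = {"fn": "Sandy Point", "locality": "Santa Barbara", "region": "CA"}

print(guess_hcard_kind(person_card))
print(guess_hcard_kind(place_card))
```

Both calls give the same answer, because the two cards carry exactly the
same properties.  Whatever the publisher had in mind, the markup gives
the machine no way to recover it.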

Standards aren't meant to "evolve".  They are revised. Updated. Changed.
Explicitly. Intentionally. And with clear versioning. The nature of a
standard is to be /standard/ across contexts, especially time.

With the current ad-hoc approval and revision process, I have serious
concerns about whether or not we are capable of maintaining the kind of
rigor that would protect semantic meaning over time and in fact allow a
real standard to exist in any meaningful way.  Especially as our
virtually non-existent versioning precludes the ability for anyone to
meaningfully track such significant changes as the addition of "places"
in the semantics of hCard.  It's frustrating. 

I don't intend this email as an attack on anyone. I do mean it as a flag
for problems that many in the community have dismissed as unimportant.

My $.02


[1] http://microformats.org/wiki/hcard-profile
[2] http://www.ietf.org/rfc/rfc2426.txt
[3] http://www.imc.org/pdi/vcard-21.txt
[5] http://rfc.net/rfc4519.html
[6] http://microformats.org/wiki/hcard

Joe Andrieu
joe at andrieu.net
+1 (805) 705-8651
