"uid" microformats? (was Re: [uf-discuss] ISBN mark-up)

Tantek Çelik tantek at cs.stanford.edu
Tue Apr 25 12:29:32 PDT 2006

On 4/25/06 12:01 PM, "Xiaoming Liu" <liu_x at lanl.gov> wrote:

> On Tue, 25 Apr 2006, Tantek Çelik wrote:
>> Please list the specific problems you've found with UID or URL, so we can
>> make sure they are documented and properly explored/resolved.
> Thanks for taking into consideration.
> First hopefully we all agree on the problem to be addressed here, I think
> it is "a microformat for indicating something *is* an identifier", and I
> will presume there are three possible solutions: URL, UID, URI.
> URL is good and I agree we should choose to resolvable URLs if possible.
> However URL identifies a resource via a network location, thus limits its
> scope,

This isn't a limitation, this is a deliberate preference.  We *want*
resources that can be identified by network location and thus a system that
shows a bias *for* that is a *good* thing.

In short, it's a feature, not a bug. ;)

> many well-established identifiers are not based on URL. e.g. In a
> typical library application, we really want to identify the books in
> Amazon and local catalog are referencing same thing,

Ah, you just introduced a new requirement, and perhaps that is where the
disconnect is.

You are assuming that we need to design into the format a way to identify
that two *different* references to a book are the *same* thing.

We don't actually need to solve that problem in the scope of the microformat
design. That is something that we may leave up to implementations instead.
In other words, I see no reason to solve this problem at the format level.

The requirement that we are looking at is: globally unique, that is, two
*different* events/contacts don't end up using the same UID.  That's it.  If
the same event in two places uses different UIDs, that is actually ok.
That's something that implementations can actually deal with.
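
To make the split concrete, here is a minimal sketch in Python. The function names and the sample data are made up for illustration and are not part of hCard or hCalendar. The first function checks the only thing the format needs (no two different items share a UID); the second shows the kind of "are these the same thing?" matching that an implementation could layer on top, using whatever comparison it likes.

```python
from collections import defaultdict

def uid_collisions(items):
    """Return UIDs claimed by more than one distinct item.

    `items` is a list of (uid, summary) pairs. A non-empty result is a
    violation of the "globally unique" requirement.
    """
    seen = defaultdict(set)
    for uid, summary in items:
        seen[uid].add(summary)
    return {uid for uid, summaries in seen.items() if len(summaries) > 1}

def group_equivalent(items, key):
    """Group UIDs of items that an implementation considers the same.

    `key` is the implementation's own notion of equivalence (hypothetical
    here); the format does not define it.
    """
    groups = defaultdict(list)
    for uid, summary in items:
        groups[key(summary)].append(uid)
    return dict(groups)

# The same event published in two places under two different UIDs.
events = [
    ("http://example.com/events/1", "WWW2006 Edinburgh"),
    ("http://example.org/cal/42", "www2006 edinburgh"),
]

collisions = uid_collisions(events)          # nothing shares a UID
same = group_equivalent(events, str.lower)   # but a consumer can still match them
```

Here the two UIDs differ and that is fine: the uniqueness check passes, and the matching is done by the consumer rather than by the format.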

> UID might be good in their original scope (vcard and iCalendar), but I am
> afraid it is not sufficient for a wider scope, both semantically and
> syntactically.

The original scope was the global exchange of contact and event information.
The scope for hCard and hCalendar is the same.  The scope has not changed.

<snip rfc2426 quotation for the 3rd time on the list ;) >

> In rfc 2445 (iCalendar), UID is defined as:
> Property Name: UID
> Purpose: This property defines the persistent, globally unique
>   identifier for the calendar component.
> Value Type: TEXT
> Semantically, while the definition is perfectly OK in original context, it
> may deserve consideration in a wider scope of defining an identifier
> microformat, "globally unique" or "persistent" are not necessarily
> applying to many established identifiers,

That shows more of a problem with those so-called "established" identifiers
than with UID semantics.

And actually, I really like the "persistent" detail there as well, it
essentially introduces "permalink" semantics when you combine UID+URL, where
the UID brings the "perma" and the URL brings the "link".

> are we going to exclude these
> "non-persistent"

No, the market will select against "non-persistent" identifiers.  Just as
bloggers tend to link less often to news sites whose article URLs disappear
after weeks vs. those whose article URLs remain valid for years.

There is no reason for us to go out of our way to support things which the
market is selecting against.  In fact, when the market ignores something
like that, it is a good sign of something to NOT spend any time/energy
supporting (nor even discussing at some point).

> or "non-globally unique" identifiers?

No-one is suggesting non-globally unique identifiers in any use case at all.
So yes, by design we are excluding that.

> Syntactically, I think microformats might want to encourage to use both
> standard classes and values, e.g. in hCalendar "dtstart" and ISO8601 format
> is suggested, such as:  <abbr class="dtstart" title="2005-10-05">October
> 5</abbr>,


> similarly, in an "identifier microformat" you may also want to
> encourage standard value to be used to allow interoperability. While UID
> allows *any* text value, there are good practices of URI schemes and
> specifications.

Hence we are suggesting that a good UID is actually a URL, which is an even
stronger statement than just URI.
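
As a rough illustration of that preference, a consumer could test whether a UID value is also a resolvable-looking URL. This is only a sketch: the function name is hypothetical, the sample values are illustrative, and treating only http/https as "URL" is my simplifying assumption, not something any spec says.

```python
from urllib.parse import urlsplit

def uid_is_url(uid):
    """Return True if the UID text looks like an http(s) URL with a host.

    Assumption: only http and https count as "URL" for this check.
    """
    parts = urlsplit(uid.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)

checks = {
    uid: uid_is_url(uid)
    for uid in (
        "http://example.com/contacts/tantek",  # URL, so UID+URL in one value
        "urn:isbn:0451450523",                 # a URI, but not a network location
        "0451450523",                          # plain text, still a legal UID
    )
}
```

All three values are acceptable UIDs as far as the format is concerned; the check only identifies the ones that also work as links.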

> I believe the syntax issue is rather important, because without clear
> specification, all kinds of things can be stuffed in, therefore defeating
> the purpose of interoperability, as illustrated by DOI/ISBN examples in
> [1].

It does not defeat the purpose, it merely allows the market to select the
better scheme.  Those that use poor identifiers are more often ignored by
the market.  We don't need to solve this problem at the format level.

> URI is well defined, its semantic and syntax are easily accessible.

As is URL.

> And it is the very foundation of semantic web and RDF.
> I think introduction of URI fit nicely with other parts of semantic web.

Please read http://microformats.org/wiki/microformats

Neither the Uppercase Semantic Web, nor RDF is a required design center for
microformats.  The fact that we have figured out ways to make microformats
work with an RDF model is a nice feature, but not a requirement.

Therefore, RDF/SemWeb needs cannot be used to require new
features/functionality/requirements in microformats,

BUT we should continue implementing in a compatible manner where possible.

> Although URI may add a
> new term to microformat, IMHO the benefit outweighs this drawback.

On the Web, in practice URI has become effectively unnecessary outside of
URL.  Thus we will continue to use URL until strong real world use cases can
be shown that require a URI.  And this does not prevent folks from using
URIs as UIDs.  We simply have no need to require that.


