uid-brainstorming: Difference between revisions

From Microformats Wiki
m (Reverted edits by AllieLtrle (Talk) to last version by TobyInk)
 
== Authors ==  


* Tantek Çelik
* Ed Summers


== Experience ==
* A microformat for indicating that something *is* an identifier, rather than for the already solved problem of providing a microformat *for* identifiers ([http://www.ietf.org/rfc/rfc2396.txt RFC 2396]).
* Tantek has had conversations with the LiveClipboard folks and upcoming.org, who had questions about how to do UID properly in hCard and hCalendar. A separate microformat that those two could call out to would serve a real need.
* It would be useful for autodiscovery purposes to be able to follow a network-resolvable UID and extract more metadata from the referenced resource. Having a well-defined UID pattern would prevent an explosion of rel values: rel-vcard, rel-vevent, etc.
** [http://www.google.com/search?q=rel+meta rel=meta] can be used to suggest that additional relevant metadata can be found by following a link. This is often used to link to RDF files, but rel values aren't restricted to any particular MIME type; an HTML document could acceptably be linked to.
* Efforts such as [http://unapi.info unAPI] have a real need for marking up identifiers so that they can be used for retrieving the identified objects.
* Greasemonkey and other browser-based scripts could make real use of URIs found in pages, as opposed to resorting to elaborate regular expressions. See Jon Udell's [http://weblog.infoworld.com/udell/stories/2002/12/11/librarylookup.html LibraryLookup] project.
* Google Scholar embeds identifiers in its citations; a way to mark them up would make it much easier to move citations from the browser to a citation manager.


== Goals and Requirements ==

* A method of publishing an asserted globally unique identifier for a piece of content or a referenced item.
 
== Thoughts ==
 
=== UID + SOURCE -> permalink? ===
URL is used in hCalendar examples to point not to the permalink of the vevent, but to the permalink of the event home page. If the event home page contains the authoritative hCalendar entry, that's fine. Alternatively, the source attribute could be borrowed from vCard to mean "microformat permalink" instead of "permalink of the thing the microformat is about".

There seems to be a conflict between the iCalendar and vCard RFCs as to what a URL is. iCalendar says it is the permalink of the iCalendar object. vCard says it is an identifying URL (e.g. home page) of the person or object the vCard is about. vCard uses "source" for what iCalendar calls "url".
 
=== UIDs that are URLs ===
 
It seems like in the 80% case (perhaps 99.99% case on the Web), a UID is going to be a URL, thus a common pattern will likely be things like:
 
<pre><nowiki>
<a class="url uid" href="http://example.com/contentspace/somenumber">the item</a>
</nowiki></pre>
 
 
=== SHOULD rather than MUST ===
 
A UID SHOULD be a URL rather than MUST. The UID microformat will ordinarily be a URL, but it should be flexible enough to allow it to contain non-network resolvable URIs.
 
=== UID + URL -&gt; permalink? ===
 
Can you infer that if something is a URL and a UID that it is also a permalink?  It seems so.  I can't think of any semantic of "permalink" that isn't covered by the union of the semantics of URL and UID.
 
=== abbr pattern ===
 
Use the [[abbr-design-pattern]] to allow identifiers to be more fully described.
<pre>
<abbr class="uid" title="urn:isbn:0950788120">0 9507881-2-0</abbr>
</pre>
 
=== HTML ID attribute ===
 
How does/doesn't the ID attribute from HTML fit into all this? Can it be repurposed to help here?
 
* Yes. Here's what I ([[User:TobyInk|TobyInk]]) use for hCard and hCalendar:
*# If the object has an element with class="uid" and that element links somewhere (e.g. &lt;a href>, &lt;img src>) then the linked URL is the UID;
*# Otherwise, if the element with class="uid" has an ID attribute set, then the page's URL followed by "#", followed by the ID attribute is the UID;
*# Otherwise, use the content of the element with class="uid", respecting the ABBR pattern.
*# If there is no element with class="uid", then if the root element of the microformatted object (e.g. &lt;div class="vcard">) has an ID attribute set, then the page's URL followed by "#", followed by the ID attribute is the UID;
*# Otherwise, there is no UID for this object. The processor {{may}} generate its own UID if the presence of a UID is necessary for further processing.
 
== Proposals ==
 
=== Just use UID from hCard ===
 
* Tantek proposed that we see if we can reuse uid from [[hcard|hCard]], similar to how we have reused [[geo]] and [[adr]] from [[hcard|hCard]].
* In particular we should define per the RFC2426 and RFC2445 definitions of UID, and also state that UID identifies the singular thing which this microformat is about (primary reference). (Thanks Joe Andrieu for this wording).
* Since microformats are about things published on the *Web*, we can say:
** UIDs SHOULD be URLs and if you cannot use a URL (for whatever reason), then you SHOULD at least use a URI, thereby indicating our preference for UIDs which can be resolved to a network location, and barring that, UIDs which follow the URN registry.
 
=== Use rel-bookmark from hAtom ===
hAtom currently uses the HTML standard rel-bookmark to identify an entry's permalink. This appears to be precisely equivalent to the class combination "uid url". It does not appear to permit non-URL UIDs, but if we are identifying the authoritative ID of a piece of microformatted data that we already have in hand, a non-URL UID may never be required. Non-URL UIDs (e.g. ISBN) can still appear in citation formats without affecting the development of this "identify myself" microformat.
 
If rel-bookmark is not appropriate for this format, we should consider retrofitting this format back to hAtom. It may be appropriate to replace rel-bookmark with the separate url and uid classes, or to allow either form to be used.
 
=== Create a URI microformat ===
 
* Xiaoming proposed leaving UID intact in hCalendar and hCard, because whatever is written in RFC 2426/RFC 2445 and their examples cannot easily be changed, and they seem to work well with hCalendar/hCard. Instead, a new "URI" microformat should be established for the purpose of indicating that something *is* an identifier in general. In this case you can simply reference the URI RFC, and no further elaboration about persistence, resolvability or uniqueness will be necessary because these issues are addressed by the various URI specifications.
** The problem with "just use URI" is that URI (or URL for that matter) is merely a *type* of data. What that data *means* to the microformat still needs additional semantics, and that's why we need a property name like UID (even if it is defined to be of type URI or URL) which specifies this particular semantic. Thanks to Joe Andrieu for asking the questions which led to this clarification. - Tantek
 
== References ==
* [[adr]]
* [[geo]]
* [[hatom|hAtom]]
* [[hcard|hCard]]
* [[hcalendar|hCalendar]]
* [[abbr-design-pattern]]


== See Also ==
* [[isbn]] (and [[issn]])
* [http://www.iana.org/assignments/urn-namespaces IANA URN Namespaces - RFC2141, RFC3406]
* [http://www.iana.org/assignments/uri-schemes.html IANA Uniform Resource Identifier (URI) Schemes]
* [http://ocoins.info COinS] for attaching OpenURL context objects to a span.
* [http://unapi.info unAPI] has a technique for embedding IDs in html so that they can be retrieved from an unAPI service.
* [http://www.taguri.org/ Tag URI] an algorithm that lets people mint identifiers that no one else using the same algorithm could ever mint.


== Related Discussion ==
* [http://microformats.org/discuss/mail/microformats-discuss/2006-April/003726.html UID, URL, live microformats]
* [http://microformats.org/discuss/mail/microformats-discuss/2005-November/002046.html format for identifiers]

Latest revision as of 21:55, 20 December 2008

UID Brainstorming

This page is for brainstorming about ideas, proposals, constraints, requirements for a UID microformat.

Authors

  • Tantek Çelik
  • Ed Summers

Experience

  • A microformat for indicating that something *is* an identifier, rather than for the already solved problem of providing a microformat *for* identifiers (RFC 2396).
  • Tantek has had conversations with the LiveClipboard folks and upcoming.org, who had questions about how to do UID properly in hCard and hCalendar. A separate microformat that those two could call out to would serve a real need.
  • It would be useful for autodiscovery purposes to be able to follow a network-resolvable UID and extract more metadata from the referenced resource. Having a well-defined UID pattern would prevent an explosion of rel values: rel-vcard, rel-vevent, etc.
    • rel=meta can be used to suggest that additional relevant metadata can be found by following a link. This is often used to link to RDF files, but rel values aren't restricted to any particular MIME type; an HTML document could acceptably be linked to.
  • Efforts such as unAPI have a real need for marking up identifiers so that they can be used for retrieving the identified objects.
  • Greasemonkey and other browser-based scripts could make real use of URIs found in pages, as opposed to resorting to elaborate regular expressions. See Jon Udell's LibraryLookup project.
  • Google Scholar embeds identifiers in its citations; a way to mark them up would make it much easier to move citations from the browser to a citation manager.

Goals and Requirements

  • a method of publishing an asserted globally unique identifier for a piece of content or a referenced item

Thoughts

UID + SOURCE -> permalink?

URL is used in hCalendar examples to point not to the permalink of the vevent, but to the permalink of the event home page. If the event home page contains the authoritative hCalendar entry, that's fine. Alternatively, the source attribute could be borrowed from vCard to mean "microformat permalink" instead of "permalink of the thing the microformat is about".

There seems to be a conflict between the iCalendar and vCard RFCs as to what a URL is. iCalendar says it is the permalink of the iCalendar object. vCard says it is an identifying URL (e.g. home page) of the person or object the vCard is about. vCard uses "source" for what iCalendar calls "url".

UIDs that are URLs

It seems like in the 80% case (perhaps 99.99% case on the Web), a UID is going to be a URL, thus a common pattern will likely be things like:

<a class="url uid" href="http://example.com/contentspace/somenumber">the item</a>
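
A consumer could pick such links out of a page with very little code. The following is only a minimal sketch using Python's standard html.parser module; the names UidLinkFinder and find_uid_links are hypothetical, and the sketch assumes that only a elements carrying both the url and uid class tokens are of interest.

```python
from html.parser import HTMLParser


class UidLinkFinder(HTMLParser):
    """Collect href values of <a> elements classed with both 'url' and 'uid'."""

    def __init__(self):
        super().__init__()
        self.uids = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # The class attribute is a space-separated token list; order is irrelevant.
        classes = (attr.get("class") or "").split()
        if "url" in classes and "uid" in classes and attr.get("href"):
            self.uids.append(attr["href"])


def find_uid_links(html):
    """Return every 'url uid' link target found in the given HTML string."""
    finder = UidLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.uids
```

Calling find_uid_links on the markup above would yield a one-item list holding the href value.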


SHOULD rather than MUST

A UID SHOULD be a URL rather than MUST. The UID microformat will ordinarily be a URL, but it should be flexible enough to allow it to contain non-network resolvable URIs.

UID + URL -> permalink?

Can you infer that if something is a URL and a UID that it is also a permalink? It seems so. I can't think of any semantic of "permalink" that isn't covered by the union of the semantics of URL and UID.

abbr pattern

Use the abbr-design-pattern to allow identifiers to be more fully described.

<abbr class="uid" title="urn:isbn:0950788120">0 9507881-2-0</abbr>

HTML ID attribute

How does/doesn't the ID attribute from HTML fit into all this? Can it be repurposed to help here?

  • Yes. Here's what I (TobyInk) use for hCard and hCalendar:
    1. If the object has an element with class="uid" and that element links somewhere (e.g. <a href>, <img src>) then the linked URL is the UID;
    2. Otherwise, if the element with class="uid" has an ID attribute set, then the page's URL followed by "#", followed by the ID attribute is the UID;
    3. Otherwise, use the content of the element with class="uid", respecting the ABBR pattern.
    4. If there is no element with class="uid", then if the root element of the microformatted object (e.g. <div class="vcard">) has an ID attribute set, then the page's URL followed by "#", followed by the ID attribute is the UID;
    5. Otherwise, there is no UID for this object. The processor MAY generate its own UID if the presence of a UID is necessary for further processing.
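
The five steps above can be sketched in code. This is not TobyInk's actual implementation, only an illustration under several assumptions: the markup is well-formed XHTML that xml.etree.ElementTree can parse, only a/href and img/src count as linking elements, relative link targets are resolved against the page URL (the steps do not say so), and the function name resolve_uid is hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Assumption: only the two linking elements named in step 1 are handled.
LINK_ATTRS = {"a": "href", "img": "src"}


def has_class(el, name):
    """True if `name` is one of the element's space-separated class tokens."""
    return name in (el.get("class") or "").split()


def resolve_uid(root, page_url):
    """Return the UID of the microformat rooted at `root`, or None (step 5)."""
    uid_el = next((el for el in root.iter() if has_class(el, "uid")), None)
    if uid_el is not None:
        # Step 1: a linking element's target is the UID.
        link_attr = LINK_ATTRS.get(uid_el.tag)
        if link_attr and uid_el.get(link_attr):
            # Assumption: relative targets are made absolute.
            return urljoin(page_url, uid_el.get(link_attr))
        # Step 2: page URL + "#" + the uid element's ID attribute.
        if uid_el.get("id"):
            return page_url + "#" + uid_el.get("id")
        # Step 3: element content, respecting the ABBR pattern.
        if uid_el.tag == "abbr" and uid_el.get("title"):
            return uid_el.get("title")
        return "".join(uid_el.itertext()).strip()
    # Step 4: fall back to the root element's ID attribute.
    if root.get("id"):
        return page_url + "#" + root.get("id")
    # Step 5: no UID; the caller may generate one if it needs to.
    return None
```

For instance, a vcard whose only identifying feature is id="me" on its root element, found on http://example.com/page, would resolve to http://example.com/page#me.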

Proposals

Just use UID from hCard

  • Tantek proposed that we see if we can reuse uid from hCard, similar to how we have reused geo and adr from hCard.
  • In particular we should define per the RFC2426 and RFC2445 definitions of UID, and also state that UID identifies the singular thing which this microformat is about (primary reference). (Thanks Joe Andrieu for this wording).
  • Since microformats are about things published on the *Web*, we can say:
    • UIDs SHOULD be URLs and if you cannot use a URL (for whatever reason), then you SHOULD at least use a URI, thereby indicating our preference for UIDs which can be resolved to a network location, and barring that, UIDs which follow the URN registry.
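
A validator or parser might want to report how well a given UID matches this preference order. The sketch below is only one possible reading: the function name uid_kind, the three labels, and the set of schemes treated as network-resolvable are all assumptions, not part of any proposal on this page.

```python
from urllib.parse import urlsplit

# Assumption: these schemes are treated as "resolvable to a network location".
NETWORK_SCHEMES = {"http", "https", "ftp"}


def uid_kind(uid):
    """Classify a UID string as 'url' (preferred), 'uri' (acceptable) or 'opaque'."""
    parts = urlsplit(uid.strip())
    if parts.scheme in NETWORK_SCHEMES and parts.netloc:
        return "url"
    if parts.scheme:
        # Any other scheme, e.g. urn: or tag:, is still a URI.
        return "uri"
    return "opaque"
```

Under these assumptions, an http URL is reported as "url", urn:isbn:0950788120 as "uri", and a bare string such as 0 9507881-2-0 as "opaque".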

Use rel-bookmark from hAtom

hAtom currently uses the HTML standard rel-bookmark to identify an entry's permalink. This appears to be precisely equivalent to the class combination "uid url". It does not appear to permit non-URL UIDs, but if we are identifying the authoritative ID of a piece of microformatted data that we already have in hand, a non-URL UID may never be required. Non-URL UIDs (e.g. ISBN) can still appear in citation formats without affecting the development of this "identify myself" microformat.

If rel-bookmark is not appropriate for this format, we should consider retrofitting this format back to hAtom. It may be appropriate to replace rel-bookmark with the separate url and uid classes, or to allow either form to be used.

Create a URI microformat

  • Xiaoming proposed leaving UID intact in hCalendar and hCard, because whatever is written in RFC 2426/RFC 2445 and their examples cannot easily be changed, and they seem to work well with hCalendar/hCard. Instead, a new "URI" microformat should be established for the purpose of indicating that something *is* an identifier in general. In this case you can simply reference the URI RFC, and no further elaboration about persistence, resolvability or uniqueness will be necessary because these issues are addressed by the various URI specifications.
    • The problem with "just use URI" is that URI (or URL for that matter) is merely a *type* of data. What that data *means* to the microformat still needs additional semantics, and that's why we need a property name like UID (even if it is defined to be of type URI or URL) which specifies this particular semantic. Thanks to Joe Andrieu for asking the questions which led to this clarification. - Tantek

References

See Also

Related Discussion