[microformats-discuss] URIs please!
Ryan King
ryan at technorati.com
Thu Jul 14 10:14:53 PDT 2005
On Jul 14, 2005, at 3:14 AM, Danny Ayers wrote:
> A little plea.
>
> I just noticed that in the example for Bud's XFolk [1] that the blocks
> containing microformats are demarcated using class attributes, i.e.
>
> <div class="xfolkentry">
>
> As it stands, the only way of discovering whether such a document
> contains microformat data is to scrape it and look for that value.
> Consider the following scenario:
>
> You have a subscription to certain del.icio.us RSS feeds; checking the
> linked pages to see if they contain microcontent markup, if they do
> extracting the data and putting it into a queryable store. All
> automatic.
>
> Now you are likely to get a lot more docs which don't contain
> microformat data than those which do. Ok, so "xfolkentry" is unlikely
> to be misinterpreted. But what if the documents contain e.g.
>
> <div class="name">
>
> Is this microformat data? Which microformat?
It depends on the context. We have "description" in several
microformats, and each use can be disambiguated by the element that
contains it. For example, this:
<div class="hreview">
<p class="description">
..
</p>
</div>
and this:
<div class="vcalendar">
<span class="description">...</span>
</div>
are not ambiguous.
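To make the point concrete, here is a rough sketch of how a parser might resolve "description" by looking at the nearest enclosing root element. It uses only Python's standard library. The names ROOTS, ContextFinder and find_descriptions are made up for illustration, the set of root classes is deliberately incomplete, and the sketch assumes every non-void element has an explicit closing tag.

```python
from html.parser import HTMLParser

# Hypothetical, incomplete set of microformat root class names.
ROOTS = {"hreview", "vcalendar"}
# Elements that never have a closing tag, so they are not tracked.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class ContextFinder(HTMLParser):
    """Record which microformat root encloses each class="description"."""

    def __init__(self):
        super().__init__()
        self.stack = []  # one entry per open element: a root name or None
        self.found = []  # (enclosing root or None, tag name) pairs

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "description" in classes:
            # Walk outwards through the open elements to the nearest root.
            enclosing = next((r for r in reversed(self.stack) if r), None)
            self.found.append((enclosing, tag))
        if tag not in VOID:
            root = next((c for c in classes if c in ROOTS), None)
            self.stack.append(root)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()


def find_descriptions(html):
    """Return (enclosing root, tag) for every "description" element."""
    finder = ContextFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

A "description" outside any known root comes back with None as its context, which is exactly the case where a consumer cannot tell which microformat, if any, is meant.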
> There is a mechanism for recognising microformats in the docs - use a
> profile, e.g.
>
> <head profile="http://example.org/some/microformat/schema">
Yes, and it's likely that this will be expanded to at least:
<link rel="profile" href="..." />
if not also:
<a rel="profile" href="...">...</a>
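As a sketch of what discovery could look like for a consumer, the following collects profile URIs from all three places: the profile attribute on <head> (a space-separated list of URIs in HTML 4), plus the <link rel="profile"> and <a rel="profile"> forms. The latter two are only the possible extensions mentioned above, not something already agreed, and ProfileFinder and find_profiles are hypothetical names.

```python
from html.parser import HTMLParser


class ProfileFinder(HTMLParser):
    """Collect profile URIs declared in a document, in document order."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head" and attrs.get("profile"):
            # HTML 4 allows several URIs separated by white space.
            self.profiles.extend(attrs["profile"].split())
        elif tag in ("link", "a") and attrs.get("href"):
            # Assumed extension: rel="profile" on <link> or <a>.
            rels = (attrs.get("rel") or "").lower().split()
            if "profile" in rels:
                self.profiles.append(attrs["href"])


def find_profiles(html):
    """Return every profile URI found in the given markup."""
    finder = ProfileFinder()
    finder.feed(html)
    finder.close()
    return finder.profiles
```

An empty result means the document declares no profile at all, so a consumer would have to fall back to looking at class names.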
> The microformats docs do cover a simple schema language XDMP, but how
> the schema is done in this context is less significant than it having
> a URI. Having it in the <head> is good too, it isn't necessary to
> parse the whole doc looking for any "known" attributes. It also makes
> it possible to offer support for microformats unknown to the system at
> design time (for RDF-based apps this is straightforward using GRDDL).
>
> Sure, there's an advantage in having well-known semantic markup terms,
> the vocabularies defined in microformats. But for automatic discovery
> and processing it's also hugely beneficial to be able to recognise
> microformat data unambiguously. The doc can be processed in a way
> appropriate for the microformat. The profile URI provides this
> disambiguation and allows deterministic processing. This doesn't in
> any way compromise the "simplicity" aim of microformats, in fact the
> net effect is overall simplification. Hunting for arbitrary strings in
> attributes is hard work!
I'm not sure what point you're trying to make here. I don't think
anyone's arguing against profile URLs.
> I understand there's ongoing discussion about declaring that a
> microformat is in use in doc fragments (where the <head> is
> unavailable). I don't know whether the use of an <a> hyperlink is the
> best mechanism or not (a possible alternative might be to use a URI
> for the outermost microformat term, e.g. <div
> class="http://example.org/some/microformat/schema/xfolkentry">).
How is this different from
<div class="xfolkentry">
?
> But however it's done, identification of the microformat used within
> the doc by means of a URI (the GUID of the Web) is essential IMHO to
> make the difference between making quality, globally unambiguous data
> available and something barely less fragile than screenscraping as it
> stands.
I don't think anyone is disagreeing with you.
-ryan