Disambiguation [was RE: "aid" microformats? (was Re:
[uf-discuss]ISBN mark-up)]
Scott Reynen
scott at randomchaos.com
Mon May 1 05:47:49 PDT 2006
On May 1, 2006, at 2:29 AM, Joe Andrieu wrote:
>>> You have to at least start parsing the html document in order to
>>> know which profiles are used.
>>
>> Agreed.
>
> The presumption here is that processing is cheap and undirected.
There's no way to download only the DOCTYPE or the <head> of a
document, and processing is cheaper than bandwidth. Once you've
already downloaded a whole document, you might as well parse it all
because the <head> might be wrong about what's in the <body>.
> See Kaboodle[1] or Backpack[2] or Scrapbook[3] for examples where
> realtime, directed parsing is useful.
>
> [1] http://www.kaboodle.com
> [2] http://www.backpackit.com
> [3] http://amb.vis.ne.jp/mozilla/scrapbook/
>
> Basically, all of these could be seen as variants on Live Clipboard.
Right, but these all work client-side, where the document is already
completely loaded. Many microformat parsers work server-side.
> If there are only a handful of Microformats and they are all well-
> known,
> (and we have effectively hijacked the "class" default namespace),
> then the
> processing should be manageable.
It is manageable. It's just not worth doing because:
1) the whole document is already downloaded, which is the largest burden
2) <head>s lie.
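
To illustrate both points, here is a minimal sketch in Python. It is not taken from any actual microformat parser. The KNOWN_ROOTS table, the MicroformatScan class, and the scan() helper are hypothetical names made up for this example, and the example.org profile URI is a placeholder. The sketch reads the whole document once, records whatever <head profile="..."> declares, and separately records which root class names actually appear.

```python
from html.parser import HTMLParser

# Hypothetical lookup table of root class names; a real parser would cover more.
KNOWN_ROOTS = {"vcard": "hCard", "vevent": "hCalendar", "hreview": "hReview"}


class MicroformatScan(HTMLParser):
    """Collects declared profile URIs and the root class names actually used."""

    def __init__(self):
        super().__init__()
        self.profiles = []   # URIs listed in <head profile="...">
        self.found = set()   # microformat names detected via class attributes

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            # The profile attribute is a space-separated list of URIs.
            self.profiles.extend((attrs.get("profile") or "").split())
        # Check every element's class names, whatever the <head> declared.
        for name in (attrs.get("class") or "").split():
            if name in KNOWN_ROOTS:
                self.found.add(KNOWN_ROOTS[name])


def scan(html):
    """Return (declared profile URIs, microformats actually found)."""
    parser = MicroformatScan()
    parser.feed(html)
    return parser.profiles, parser.found


# A document that declares nothing but contains an hCard.
undeclared = '<html><head></head><body><div class="vcard">x</div></body></html>'
# A document that declares a (placeholder) profile but contains no hCard.
declared = ('<html><head profile="http://example.org/hcard-profile"></head>'
            '<body><p class="note">x</p></body></html>')

print(scan(undeclared))
print(scan(declared))
```

In both sample documents the declaration and the content disagree, so a parser that trusted the <head> alone would get the wrong answer either way.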
> But if there are thousands or tens of thousands of Microformats--
> and yes, I
> know this presumption is at odds with some of the expectations
> behind a
> socially moderated namespace--in that scenario, it is easy to
> calculate the
> difference of running a single attribute check for "microformat"
> instead of
> checking against the entire Microformats space.
>
> This was what I meant when I asked "How do Microformats scale?"
Microformats scale by re-use. Thousands or tens of thousands of
microformats is an anti-goal.
> I don't believe we are in the latter situation where we need tight
> coordination as in a protocol.
We need tight coordination as in a dictionary. A formal definition
of a shared lexicon is what allows us to communicate with new
symbols. You can use whatever class names you want, but if I don't
know what they mean, I can't parse them, and a profile doesn't tell
me what they mean, it just tells me whether they follow a certain
syntactic structure. Profiles are like a grammar check, but we still
need a dictionary.
> Instead, what we need is a simple way for
> human authors to say "This is what I mean".
Profiles don't do that. No technology does that. That's an
incredibly complex problem that no one has yet solved. When they do,
we'll have usable machine translation, artificial intelligence, and
microformats will be the least of our worries.
> There is value in forging a tight class of well understood, easily
> human
> authored, semantic tags. However, allowing rich variation on the
> existing
> classes doesn't "split" the community--the community is the social
> network,
> not the semantic space.
In practice, social networks require shared understanding of what
things mean. The lack of this shared understanding leads to civil
war in the real world, and unused specs in the tech world.
> Instead, it allows exploration and differentiation,
> which ultimately can be incorporated back into the foundation
> classes. More
> importantly, it allows user-driven innovation.
You can already explore and develop your own specs. But if you want
someone else to understand them, you have to explain to them what the
spec means in clear human language. Machines don't understand.
People understand. Clear human language is easier to accomplish in
community than in isolation.
> I think it is hubris to expect that the first adopted version of a
> microformat is the orthodox way to do it and that variations are
> heresy.
It would be if microformats, or dictionaries, did much more than
document and formalize existing use. Do you find hubris in
dictionaries as well? Who is this Webster to tell us what "dog"
means? He's someone who documented how a lot of people use the word
"dog" and wrote it down in a dictionary, just like we're documenting
how a bunch of people mark up citations, and writing it down in a wiki.
> If our mantra includes basing our developments on real-world
> examples, then how does the spec evolve if we don't have real-world
> examples
> of derivative implementations?
We have a web full of real-world data publishing examples.
> Without variations, we risk stagnation.
No one is preventing anyone from using whatever class names they
want. No one is preventing anyone from telling others what their
class names mean. But no one has invented a technology to automate
shared meaning. We can't use it because it doesn't exist.
> I think the type of disambiguation I am talking about can be
> addressed with
> a simple microformat="profile" attribute.
Have you looked at profile URIs?
http://microformats.org/wiki/profile-uris
That accomplishes exactly what your microformat="profile" would,
except it's valid XHTML. But neither accomplishes shared meaning,
which despite great effort, is still a human problem that requires
human solutions.
Peace,
Scott