focus of microformats (was: Re: [uf-discuss] haudio contributor)
guillaume at lebleu.org
Tue Feb 5 12:02:07 PST 2008
Andy Mabbett wrote:
> Everything is an edge case, depending on which point you're looking from.
I concede that I'm looking at these natural-language examples from a
particular perspective, the economic one, to decide what is or isn't an
edge case, and that I'm simply assuming this economic perspective is the
most important one to have when deciding where to focus the
In particular, I'm looking at the following costs:
- costs for the community to define a microformat that supports these
natural language, unstructured cases (this encompasses the include
- costs for a human editor to understand the microformat and implement it.
- costs for a software developer to implement a microformat parser.
From what I have seen, these costs are high for the natural language
examples in general, whether it's for hCard or hAudio. For other
examples (structured content) these costs are low. The value is the same
in both cases. Note that because these costs relate to unstructured
content, they only apply to compound microformats which by nature imply
structure, not elementary microformats. For instance, a microformat for
a person's name (fn) is relatively easy to define, use and implement.
Same thing for a money amount. A microformat for a person's complete
contact information that works in all cases, not just the seminal blog's
about box, has in comparison much higher costs of definition, use and
implementation. A microformat for a resume, to the extent that a resume
is highly structured, is possible to define, use and implement. In
comparison, a bio with the same content would most likely not be as easy.
> how many of us have access to "a well-trained semantic software
> extractor", and what "music metadata" is widely used?
The fact that I don't have something doesn't mean that others don't have
it, or that I should ignore in my decisions the fact that they do.
Obviously, semantic extraction from the Web is of utmost importance for
a lot of organizations, some with lots of resources, some with less, but
all operating under the same economic rule: how can we lower the costs
of understanding what people put on the Web, with no or minimal cost or
change of habits on their end? We all compete, and hopefully no single one
will win but some will be more successful for some classes of problems
than others. Knowing what class of problems microformats are most likely
to compete on is, to me, very important for maximizing the returns on
our time spent here.
> By your argument, we wouldn't need microformats at all.
No. As I mentioned already, the costs listed above are the smallest in
the case of structured content with little context. "Guillaume Lebleu.
T(W): 415 408 5856" has less context and more structure than "My name is
Guillaume Lebleu. My phone number at the office is 408-5856. My area
code is 415.". The first example is a charm to microformat; the second
one is less so. A semantic extractor will most likely do a poor job on
the first example and a better job at understanding the second one.
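To make the contrast concrete, here is a minimal sketch in Python. It is an illustration under assumptions, not a real hCard parser: the markup for both examples is my own guess at how an editor might annotate them, the names HCardSketch and extract are hypothetical, and only the "fn" and "tel" class names are looked at (no type/value subproperties, no void elements).

```python
from html.parser import HTMLParser

# Assumed markup for the structured example.
STRUCTURED = (
    '<p class="vcard"><span class="fn">Guillaume Lebleu</span>. '
    'T(W): <span class="tel">415 408 5856</span></p>'
)

# Assumed markup for the natural-language example.
PROSE = (
    '<p class="vcard">My name is <span class="fn">Guillaume Lebleu</span>. '
    'My phone number at the office is <span class="tel">408-5856</span>. '
    'My area code is 415.</p>'
)


class HCardSketch(HTMLParser):
    """Collect the text inside elements carrying a wanted class name."""

    WANTED = ("fn", "tel")

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open element: wanted class or None
        self.found = {}   # class name -> collected text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        hit = next((c for c in classes if c in self.WANTED), None)
        self.stack.append(hit)
        if hit:
            self.found.setdefault(hit, "")

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Credit the text to every wanted class currently open.
        for name in set(self.stack):
            if name:
                self.found[name] += data


def extract(html):
    """Return the fn/tel text found in a fragment of well-formed markup."""
    parser = HCardSketch()
    parser.feed(html)
    parser.close()
    return parser.found


if __name__ == "__main__":
    print(extract(STRUCTURED))
    print(extract(PROSE))
```

With the structured markup, the number comes out whole. With the prose markup, the area code sits in a separate sentence, so this naive class-based extraction only recovers the local number; getting the full number would take extra markup from the editor or extra logic in the parser, which is the kind of cost I am talking about.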
As a result, it is my opinion that if microformats were officially
focused on structured content publishing (best known as blogging/social
networking), we would have fewer discussions and probably more microformats.
> If that's where you want to concentrate your use of microformats,
> that's fine, but that's not how I see them, and I see nothing in any
> of the specs or other defining documentation which restricts them in
> that way.
There are no written rules as of today, and in theory we don't need
any. But I've seen a lot of discussion and time spent on cases that
don't make economic sense. The "it's on the Web, so it's relevant for
microformats" principle was excellent for avoiding the known pitfalls of
standards organizations ("what if?") but, in my opinion, it also opened
a can of worms.
It should in my opinion be: "it's elementary pieces of data or
structured content on the Web, so it's relevant for microformats", or
> I think that's an opinion - a restrictive one at that - not shared by
> everyone here, certainly not by me, and not supported by past
> experience of developing and using microformats.
Restriction is the negative name for focus. Focus is key to success. I've
yet to see someone successful at "boiling the ocean".
But I indeed don't know to what extent this opinion, or similar, is
shared in this community. I only know you disagree with it. I'd be glad
to see it put to a vote.