focus of microformats (was: Re: [uf-discuss] haudio contributor)

Guillaume Lebleu guillaume at lebleu.org
Tue Feb 5 12:02:07 PST 2008


Andy Mabbett wrote:
>
> Everything is an edge case, depending on which point you're looking from.

I'm conceding that I'm looking at these natural language examples from a 
particular perspective, the economic one, to decide what is an edge case 
or not, and that I'm just assuming that this economic perspective is the 
most important perspective to have when deciding where to focus the 
discussions.

In particular, I'm looking at the following costs:
- costs for the community to define a microformat that supports these 
natural-language, unstructured cases (this encompasses the include 
discussion).
- costs for a human editor to understand the microformat and implement it.
- costs for a software developer to implement a microformat parser.

From what I have seen, these costs are high for the natural language 
examples in general, whether it's for hCard or hAudio. For other 
examples (structured content) these costs are low. The value is the same 
in both cases. Note that because these costs relate to unstructured 
content, they only apply to compound microformats which by nature imply 
structure, not elementary microformats. For instance, a microformat for 
a person's name (fn) is relatively easy to define, use and implement. 
Same thing for a money amount. A microformat for a person's complete 
contact information that works in all cases, not just the seminal blog's 
about box, has in comparison much higher costs of definition, use and 
implementation. A microformat for a resume, to the extent that a resume 
is highly structured, is possible to define, use and implement. In 
comparison, a bio with the same content would most likely not be as easy.
>
> how many of us have access to "a well-trained semantic software 
> extractor", and what "music metadata" is widely used?

Just because I don't have something doesn't mean that others don't have 
it, or that I should ignore the fact that they have it when making my 
decisions.

Obviously, semantic extraction from the Web is of utmost importance for 
a lot of organizations, some with lots of resources, some with less, but 
all operating under the same economic rule: how can we lower the cost 
of understanding what people put on the Web, with minimal or no cost or 
change of habits on their end? We all compete, and hopefully no single 
one will win, but some will be more successful for some classes of 
problems than others. Knowing which class of problems microformats are 
most likely to compete on is, to me, very important for maximizing the 
returns on our time spent here.

>
> By your argument, we wouldn't need microformats at all.

No. As I mentioned already, the costs listed above are smallest in 
the case of structured content with little context. "Guillaume Lebleu. 
T(W): 415 408 5856" has less context and more structure than "My name is 
Guillaume Lebleu. My phone number at the office is 408-5856. My area 
code is 415.". The first example is a charm to microformat, the second 
one is less so. A semantic extractor will most likely do a poor job on 
the first example and a better job at understanding the second example.
As a result, it is my opinion that if microformats were officially 
focusing on structured content publishing (best known as blogging/social 
networking), we would have fewer discussions and probably more microformats.
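
To make the parser-cost point concrete, here is a minimal sketch in 
Python. It is not a conforming hCard parser. The class names (vcard, fn, 
tel, type, value) are hCard's, but the markup string and the helper 
names (ClassTextCollector, extract) are my own assumptions for 
illustration. It only collects the text found inside elements carrying a 
wanted class name, and it does not handle void elements such as <br>.

```python
from html.parser import HTMLParser

# Assumed markup for the structured example; only the class names are hCard's.
STRUCTURED = (
    '<div class="vcard">'
    '<span class="fn">Guillaume Lebleu</span>. '
    '<span class="tel">T(<abbr class="type" title="work">W</abbr>): '
    '<span class="value">415 408 5856</span></span>'
    '</div>'
)


class ClassTextCollector(HTMLParser):
    """Collect the text inside elements that carry a wanted class name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # one set of class names per currently open element
        self.found = {}   # class name -> concatenated text

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class") or ""
        self.stack.append(set(classes.split()))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Attribute the text to every wanted class on any open ancestor.
        for classes in self.stack:
            for name in classes & self.wanted:
                self.found[name] = self.found.get(name, "") + data


def extract(markup, wanted):
    """Hypothetical helper: return {class name: text} for the wanted classes."""
    parser = ClassTextCollector(wanted)
    parser.feed(markup)
    parser.close()
    return parser.found


print(extract(STRUCTURED, {"fn", "value"}))
```

In the structured case the whole phone number sits in one span, so a few 
dozen lines are enough. In the natural-language case the number and the 
area code sit in two different sentences, so neither the markup nor a 
parser like this one can reassemble "415 408 5856" without extra rules, 
which is where the definition and implementation costs come from.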

>
> If that's where you want to concentrate your use of microformats, 
> that's fine, but that's not how I see them, and I see nothing in any 
> of the specs or other defining documentation which restricts them in 
> that way.

There are no written rules as of today, and in theory we don't need 
any. But I've seen a lot of discussions and time spent on cases that 
don't make economic sense. The "it's on the Web, so it's relevant for 
microformats" rule was excellent for avoiding the known pitfall of 
standards organizations ("what if?"), but in my opinion it also opened a 
can of worms. I think it should be: "it's elementary pieces of data or 
structured content on the Web, so it's relevant for microformats", or 
something similar.

> I think that's an opinion - a restrictive one at that - not shared by 
> everyone here, certainly not by me, and not supported by past 
> experience of developing and using microformats.
>
Restriction is the negative name for focus. Focus is key to success. I've 
yet to see someone successful at "boiling the ocean".
But I indeed don't know to what extent this opinion, or a similar one, is 
shared in this community. I only know you disagree with it. I'd be glad 
to see it put to a vote.

Guillaume

