[uf-discuss] haudio contributor

Guillaume Lebleu guillaume at lebleu.org
Mon Feb 4 15:48:39 PST 2008

Andy Mabbett wrote:
> In message <47A73D8E.30406 at digitalbazaar.com>, Manu Sporny 
> <msporny at digitalbazaar.com> writes
>> If you really want to make the distinction between a publisher, a 
>> drummer, a singer, a technician, and someone else, you can always use 
>> an hCard and utilize the "role" property
> That presumes that the roles are exposed in the page; they may be for, 
> say, a producer, but often using the verb ("produced by..."), and 
> frequently are not. We don't need to say that Beethoven is a composer, 
> when saying "Beethoven's fifth". That's clear to a human (well, most 
> humans of any western education!) from context; but not to a machine.
> Before anyone cries "hidden metadata", how often do we explicitly say 
> that "Mabbett" is my family name, or that "21 High street" is a 
> street address?
I agree with others that these are edge cases for microformats.

I don't think you are correct when you say that only a human can infer 
Beethoven--(composerOf)-->fifth from "Beethoven's fifth". From what 
I've seen in domains more lucrative than music, a well-trained 
semantic extractor working from sufficient content, plain old 
grammatically correct English, and music metadata would do that job with 
less sweat than an editor would spend writing the content and marking it 
up in hAudio or something else (not to mention coming up with markup that 
works in these edge cases in the first place). Grammatically correct 
English IS semantic markup, in a way.
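
To make the inference above concrete, here is a deliberately crude sketch in Python. It is nowhere near the trained extractor I have in mind; it assumes a single hand-written pattern ("<Name>'s <ordinal>") and the names ORDINALS, PATTERN and extract_composer_of are made up for illustration only.

```python
import re

# Hypothetical toy pattern: "<Name>'s <ordinal>" is read as a composer/work pair.
# A real extractor would rely on training data and music metadata instead.
ORDINALS = r"(?:first|second|third|fourth|fifth|sixth|seventh|eighth|ninth)"
PATTERN = re.compile(r"\b([A-Z][a-z]+)'s (" + ORDINALS + r")\b")


def extract_composer_of(text):
    """Return (name, 'composerOf', work) triples matched by the toy pattern."""
    return [(m.group(1), "composerOf", m.group(2)) for m in PATTERN.finditer(text)]


print(extract_composer_of("We listened to Beethoven's fifth last night."))
```

The pattern knows nothing about music, so it would mislabel any capitalized possessive followed by an ordinal; that gap is what the metadata and the surrounding context are supposed to close.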

I think microformats' sweet spot is easing semantic extraction in cases 
where the level of structure is high and the plain English context is 
low. The back of an album that lists tracks is such a case; its entry in 
Gracenote, a list of friends, and electronic business cards are good 
examples as well. A plain English critic's review of an album, on the 
other hand, with lots of context but little structure, is a case that is 
economically much better handled using semantic analysis than with "$1M 

I'm not saying that microformats should not try to make formats that 
work with plain old English or natural language (I've been trying 
myself). I'm just saying that we should consider that the ROI will most 
likely be low and that other technologies will compete better there. So 
we might focus our time first on where we have the best chance of 
straightforward adoption, and only then look at solving the plain English 
cases, instead of trying to solve everything at once.
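
For contrast, the high-structure case needs almost no intelligence on the consumer side. A minimal sketch, assuming an hCard fragment that exposes the "fn" and "role" properties as class names; the sample markup and the names HCardRoleParser and parse_hcard are invented for illustration, and a real parser would handle nesting and many more properties.

```python
from html.parser import HTMLParser


class HCardRoleParser(HTMLParser):
    """Collect the text of elements whose class includes 'fn' or 'role'."""

    WANTED = {"fn", "role"}

    def __init__(self):
        super().__init__()
        self.current = None  # property whose text we are waiting for, if any
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        for name in classes:
            if name in self.WANTED:
                self.current = name

    def handle_data(self, data):
        # Store the first text chunk that follows a wanted start tag.
        if self.current:
            self.found[self.current] = data.strip()
            self.current = None


def parse_hcard(html):
    parser = HCardRoleParser()
    parser.feed(html)
    return parser.found


# Hypothetical sample markup, not taken from any real page.
SAMPLE = (
    '<span class="vcard"><span class="fn">Ludwig van Beethoven</span>, '
    '<span class="role">composer</span></span>'
)
print(parse_hcard(SAMPLE))
```

No grammar and no training set are involved: the class names carry the semantics, which is why I think this is where the effort pays off first.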

