[uf-discuss] haudio contributor
Guillaume Lebleu
guillaume at lebleu.org
Mon Feb 4 15:48:39 PST 2008
Andy Mabbett wrote:
> In message <47A73D8E.30406 at digitalbazaar.com>, Manu Sporny
> <msporny at digitalbazaar.com> writes
>
>> If you really want to make the distinction between a publisher, a
>> drummer, a singer, a technician, and someone else, you can always use
>> an hCard and utilize the "role" property
>
> That presumes that the roles are exposed in the page; they may be for,
> say, a producer, but often using the verb ("produced by..."), and
> frequently are not. We don't need to say that Beethoven is a composer,
> when saying "Beethoven's fifth". That's clear to a human (well, most
> humans of any western education!) from context; but not to a machine.
>
> Before anyone cries "hidden metadata", how often do we explicitly say
> that "Mabbett" is my family name, or that "21 High street" is a
> street address?
>
I agree with others that these are edge cases for microformats.

I don't think you are correct when you say that only a human can infer
Beethoven--(composerOf)-->fifth from "Beethoven's fifth". As far as
I've seen in other, more lucrative domains than music, a well-trained
semantic extractor working from sufficient content, plain old
grammatically correct English, and music metadata would do that job with
less sweat than an editor would spend writing the content and marking it
up in hAudio or something else (not to mention coming up with markup
that works in these edge cases in the first place). Grammatically
correct English IS semantic markup, in a way.

I think microformats' sweet spot is easing semantic extraction in cases
where the level of structure is high and the plain-English context is
low. The back of an album that lists tracks is such a case; its entry in
Gracenote, a list of friends, and electronic business cards are good
examples as well. A plain-English critic's review of an album, on the
other hand, with lots of context but little structure, is a case that is
economically much better handled by semantic analysis than by "$1M
markup".

I'm not saying that microformats should not try to make formats that
work with plain old English or natural language (I've been trying
myself). I'm just saying that we should consider that the ROI will most
likely be low and that other technologies will compete better there. So
we might focus our time first on where we have the best chance of
straightforward adoption, and only then look at solving the
plain-English cases, instead of trying to solve everything at once.
Guillaume