[uf-new] hAudio Issue Duration
martin at weborganics.co.uk
Mon Aug 4 08:40:33 PDT 2008
Brian Suda wrote:
> 2008/8/4, Martin McEvoy <martin at weborganics.co.uk>:
>> There are (in my view) only three real ways to resolve this issue
>> 3. Support NLP (Natural Language Processing): <span class="duration">3
>> minutes 23 seconds</span>.
>> I personally am in favour of number 3 as I believe it is not too difficult
>> to build a parser that will process just durations (hours minutes seconds)
>> as long as there is an agreed format.
> --- any sort of NLP is much harder than you think! If we are back to
> codifying something, either we build it in English (which people would
> disagree with) or we keep a list of all known ways to spell, decline
> and abbreviate hours in all known human languages. It is very much a
> boil-the-ocean solution.
I disagree (slightly); consider this:
<span class="duration">1 hour 3 minutes 23 seconds</span>
<span class="duration">1 heure 3 minutes 23 secondes</span>
The parser already knows that this is a duration and that its contents
are a numerical value, so the text (the words) is stripped: it has
nothing to do with the value and is only there for a human to
understand. That would leave us with...
<span class="duration">1 3 23</span>
As long as we know what format this is supposed to represent (the
first number is hours, the second minutes, and the third seconds) and
this is documented as the agreed format, it would be fairly
straightforward after that to output any format you like.
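
To make the idea concrete, here is a minimal Python sketch of the
stripping approach. It assumes the agreed format is always exactly three
numbers in the order hours, minutes, seconds. The function names
parse_duration and to_iso8601 are hypothetical, and ISO 8601 is only one
possible output format, not something the list has decided on.

```python
import re


def parse_duration(text):
    """Strip the words and read the remaining numbers as h, m, s.

    Assumes the (hypothetical) agreed format: exactly three numbers,
    in the order hours, minutes, seconds.
    """
    # Keep only the runs of digits; the words are ignored entirely.
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    if len(numbers) != 3:
        raise ValueError("expected hours, minutes and seconds")
    hours, minutes, seconds = numbers
    return hours * 3600 + minutes * 60 + seconds


def to_iso8601(total_seconds):
    """Format a number of seconds as an ISO 8601 duration string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"PT{hours}H{minutes}M{seconds}S"


english = parse_duration("1 hour 3 minutes 23 seconds")
french = parse_duration("1 heures 3 minutes 23 secondes")
print(english, french, to_iso8601(english))
```

Both spans give the same value because the words never reach the
parser. Note that under this assumption a value such as "3 minutes 23
seconds" would be rejected, since it has only two numbers; the agreed
format would have to say whether a leading zero hour is required.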