[uf-discuss] Human and machine readable data format
Dan Brickley
danbri at danbri.org
Fri Jul 11 01:47:35 PDT 2008
Toby A Inkster wrote:
> Paul Wilkins wrote:
>
>> We should leverage the computer's ability to do the hard work for us.
>> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
>
> As I've said before, although my parser does support dates in this
> format, I strongly recommend *not* allowing these per spec, as it will
> lead to unpredictable and inconsistent results.
>
> Yes, many programming languages do have libraries to do natural language
> parsing of dates, but these all differ subtly in what formats they
> support, how they interpret certain ambiguous dates, and how well they
> internationalise. e.g. I know that Perl's DateTime::Format::Natural,
> while it can perform very sophisticated parsing ("Saturday evening 3
> months ago" => 2008-05-12T19:00:00, "thursday morning last week" =>
> 2008-07-03T09:00:00) only includes English in the distributed module
> (though it has hooks allowing support for other languages). PHP's
> strtotime function is English only too, and there are differences in how
> it interprets some natural language dates, not just with Perl, but
> between different versions of PHP.
>
> Natural language parsing is really not the way to go, nor is a limited
> range of date formats that *look* like NLP, because publishers will
> believe them to *be* NLP and start publishing in any old date format.
> ISO8601 is what we must stick with - we just must agree a better way of
> embedding it than <abbr>.
Thank you for spelling this out so clearly. Please let's not slip into
treating the non-English-speaking Web as a corner case. ISO8601's the
thing. And it won't always be what the party reading the page expects
(either in terms of language, script or even calendar).
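
To make the contrast concrete, here is a minimal sketch in Python (my choice of language for illustration; nothing in this thread prescribes it). The helper name parse_iso_date is hypothetical. It accepts an ISO 8601 calendar date and rejects everything else, so the result does not depend on the parser's language, locale or version-specific guessing.

```python
from datetime import date
from typing import Optional


def parse_iso_date(text: str) -> Optional[date]:
    """Return a date for an ISO 8601 calendar date string, else None.

    Hypothetical helper for illustration only. It relies on the standard
    library's date.fromisoformat (Python 3.7+), which does no
    natural-language guessing.
    """
    try:
        return date.fromisoformat(text.strip())
    except ValueError:
        # Anything that is not an ISO 8601 date is rejected, not guessed at.
        return None


# The ISO form parses the same way everywhere.
print(parse_iso_date("2008-07-11"))
# The natural-language form from the quoted markup is refused outright.
print(parse_iso_date("Friday, July the 11th 2008"))
```

The human-readable text can then be written in whatever language, script or calendar suits the page, while the machine-readable value stays unambiguous.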
cheers,
Dan
--
http://danbri.org/