[uf-discuss] Human and machine readable data format
Toby A Inkster
mail at tobyinkster.co.uk
Fri Jul 11 01:38:13 PDT 2008
Paul Wilkins wrote:
> We should leverage the computer's ability to do the hard work for us.
> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
As I've said before, although my parser does support dates in this
format, I strongly recommend *not* allowing these per spec, as it
will lead to unpredictable and inconsistent results.
Yes, many programming languages do have libraries to do natural
language parsing of dates, but these all differ subtly in what
formats they support, how they interpret certain ambiguous dates, and
how well they internationalise. For example, Perl's
DateTime::Format::Natural can perform very sophisticated parsing
("Saturday evening 3 months ago" => 2008-05-12T19:00:00,
"thursday morning last week" => 2008-07-03T09:00:00), but the
distributed module only includes English (though it has hooks allowing
support for other languages). PHP's strtotime function is English-only
too, and it interprets some natural language dates differently, not
just from Perl, but from one version of PHP to another.
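
To make the ambiguity concrete, here is a rough sketch in Python (standard
library only). The sample string and the function names are made up for
illustration and do not come from any of the parsers mentioned above. The
same text yields two different dates depending on which convention the
parser assumes, while the ISO 8601 form has only one reading.

```python
from datetime import date, datetime

# Hypothetical input: a numeric date as a publisher might write it.
AMBIGUOUS = "03/04/2008"


def parse_day_first(text: str) -> date:
    """Read the string as DD/MM/YYYY (the usual UK convention)."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def parse_month_first(text: str) -> date:
    """Read the same string as MM/DD/YYYY (the usual US convention)."""
    return datetime.strptime(text, "%m/%d/%Y").date()


def parse_iso(text: str) -> date:
    """Read an ISO 8601 calendar date (YYYY-MM-DD)."""
    return date.fromisoformat(text)


if __name__ == "__main__":
    print(parse_day_first(AMBIGUOUS))    # 2008-04-03
    print(parse_month_first(AMBIGUOUS))  # 2008-03-04
    print(parse_iso("2008-07-11"))       # 2008-07-11
```

Both of the first two results are "correct" for some consumer, which is
exactly the inconsistency a spec should avoid.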
Natural language parsing is really not the way to go, nor is a
limited range of date formats that *look* like NLP, because
publishers will believe them to *be* NLP and start publishing in any
old date format. ISO 8601 is what we must stick with - we just need to
agree a better way of embedding it than <abbr>.
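
For comparison, whatever its other faults, the existing <abbr> embedding
needs no guessing on the consumer side. A minimal sketch in Python,
assuming the ISO 8601 date sits in the title attribute; the class name
AbbrDateExtractor and the helper extract_dates are hypothetical, and a
real parser would also check the element's class names.

```python
from datetime import date
from html.parser import HTMLParser


class AbbrDateExtractor(HTMLParser):
    """Collect ISO 8601 dates found in the title attribute of <abbr>."""

    def __init__(self):
        super().__init__()
        self.dates = []

    def handle_starttag(self, tag, attrs):
        if tag != "abbr":
            return
        title = dict(attrs).get("title")
        if title is None:
            return
        try:
            # Strict parse: anything that is not an ISO 8601 date is skipped.
            self.dates.append(date.fromisoformat(title))
        except ValueError:
            pass


def extract_dates(html: str) -> list:
    """Return every date carried by an <abbr title="..."> in the markup."""
    parser = AbbrDateExtractor()
    parser.feed(html)
    parser.close()
    return parser.dates


if __name__ == "__main__":
    sample = ('<p>Date <abbr class="date" title="2008-07-11">'
              'Friday, July the 11th 2008</abbr></p>')
    print(extract_dates(sample))
```

The human-readable text can be in any language or style; the machine
only ever reads the ISO 8601 value.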
--
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>