datetime-design-pattern: Difference between revisions
m (→Current uses: Changed to YYYY-MM-DDTHH:MM:SS+ZZ:ZZ)
DavidJanes (talk | contribs) (Add RE for parsing datetime)
Revision as of 14:05, 6 December 2005
This page is a draft.
Datetime Design Pattern
This is a page for exploring a datetime design pattern.
Purpose
- Use the datetime-design-pattern to make datetimes that are human readable also formally machine readable.
Practical Need
- This design pattern arose as a result of solving the practical need for human readable dates for hCalendar.
How to use it
- enclose the human-friendly datetime that you want to make machine readable with <abbr>
- as per the class-design-pattern, add the appropriate class attribute to the abbr element
- add a title attribute to the abbr element with the machine readable ISO8601 datetime or date as the value
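The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the helper name datetime_abbr, the class name "dtstart", and the sample date are chosen for the example and are not prescribed by this page.

```python
from datetime import datetime, timedelta, timezone
from html import escape


def datetime_abbr(class_name: str, when: datetime, human_text: str) -> str:
    """Wrap human_text in an <abbr> whose title is the ISO 8601 form of `when`.

    Hypothetical helper for illustration; not part of any microformats library.
    """
    # For a timezone-aware datetime, isoformat() yields YYYY-MM-DDTHH:MM:SS+ZZ:ZZ.
    machine = when.isoformat(timespec="seconds")
    return '<abbr class="{}" title="{}">{}</abbr>'.format(
        escape(class_name, quote=True),
        escape(machine, quote=True),
        escape(human_text),
    )


# Assumed sample values: a fixed -07:00 offset and an arbitrary human-friendly label.
minus_seven = timezone(timedelta(hours=-7))
start = datetime(2005, 8, 18, 15, 16, 14, tzinfo=minus_seven)
print(datetime_abbr("dtstart", start, "August 18th, 3:16pm"))
```

The human-friendly text is supplied by the author rather than derived from the datetime, since the point of the pattern is that the visible text can be whatever reads best.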
Current uses
The pattern which is now used in hCalendar and hReview is something like this:
<abbr class="foo" title="YYYY-MM-DDTHH:MM:SS+ZZ:ZZ">Date Time</abbr>
where foo is the semantic classname which is being applied to this date/time, the title of the <abbr> is an ISO 8601 date/time and "Date Time" is a human-friendly representation of the same date/time.
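A consumer of this markup reads the machine-readable value from the title attribute and ignores the visible text. The following is a minimal sketch using Python's standard html.parser module; the class AbbrDatetimeParser and the sample markup are assumptions made for the example, and a real parser would also need to handle nested elements and filter on the class names of a specific microformat.

```python
from html.parser import HTMLParser


class AbbrDatetimeParser(HTMLParser):
    """Collect (class, title, visible text) for every <abbr> that has a title.

    Illustrative only: it does not check which microformat the class belongs to.
    """

    def __init__(self):
        super().__init__()
        self.found = []
        self._current = None  # [class, title, text so far] while inside an <abbr>

    def handle_starttag(self, tag, attrs):
        if tag == "abbr":
            attr_map = dict(attrs)
            if attr_map.get("title"):
                self._current = [attr_map.get("class", ""), attr_map["title"], ""]

    def handle_data(self, data):
        if self._current is not None:
            self._current[2] += data

    def handle_endtag(self, tag):
        if tag == "abbr" and self._current is not None:
            self.found.append(tuple(self._current))
            self._current = None


parser = AbbrDatetimeParser()
parser.feed('<abbr class="dtstart" title="2005-08-18T15:16:14-07:00">August 18th</abbr>')
parser.close()
print(parser.found)
```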
Profile of ISO8601
We recommend that any microformat using the datetime-design-pattern use a profile of ISO8601. There are currently two widely used profiles which SHOULD be reused.
- RFC 3339
- W3C Note on Datetimes
Discussion
This pattern is likely to be highly reusable.
Can this not be viewed as a microformat in itself?
It could, but inventing a microformat for the sake of inventing a microformat is against the microformat principles. If there is a specific real world problem (and use cases) that such an elemental microformat would solve, then it would be worth considering.
Until then it is best to keep the <abbr> datetime concept merely as a microformat design pattern, to be used in _actual_ microformats that have a demonstrated practical need.
-- Tantek
Excerpt from #microformats Aug 18th. Please edit!
Aug 18 15:16:14 <Tantek> DanC, what do you think of RFC3339?
Aug 18 15:17:14 <Tantek> ISO8601 subset
Aug 18 15:17:19 <DanC> Date and Time on the Internet: Timestamps http://www.ietf.org/rfc/rfc3339.txt
Aug 18 15:17:30 <DanC> Klyne is a good guy. I wonder if I talked with him about this.
Aug 18 15:17:32 <Tantek> compat with W3C-NOTE-DATETIME
Aug 18 15:17:50 <Tantek> compat with xsd:dateTime
Aug 18 15:17:57 <Tantek> it's a strict intersection subset
Aug 18 15:17:59 <DanC> I consider W3C-NOTE-DATETIME obsoleted by XML Schema datatype-- yeah.. xsd:dateTime
Aug 18 15:18:32 <Tantek> compare/contrast normatively using xsd:dateTime vs. RFC3339
Aug 18 15:18:41 <Tantek> note: Atom 1.0 chose RFC3339
Aug 18 15:18:50 <Tantek> i would like input from the microformats community on this
Aug 18 15:19:27 <DanC> in what context are you evaluating RFC 3339?
Aug 18 15:19:28 <jcgregorio> http://bitworking.org/news/Date_Constructs_in_the_Atom_Syndication_Format
Aug 18 15:21:24 <DanC> which microformat is the question coming from, Tantek ?
Aug 18 15:23:31 <DanC> " The grammar element time-second may have the value "60" at the end of
Aug 18 15:23:31 <DanC> months in which a leap second occurs" The XML Schema WG is in the 27th level of leap-second-hell for the past few months, I gather.
Aug 18 15:24:21 <DanC> yeah... here's the scary bit: " Leap seconds cannot be predicted far into the future. The
Aug 18 15:24:21 <DanC> International Earth Rotation Service publishes bulletins [IERS] that
Aug 18 15:24:21 <DanC> announce leap seconds with a few weeks' warning."
Aug 18 15:26:03 <Tantek> DanC, which microformats? any/all that use datetime fields.
Aug 18 15:26:36 <DanC> hard to give useful advice, then.
Aug 18 15:26:58 <DanC> I expect they'll use datetime fields for different things that have different cost/benefit trade-offs
Aug 18 15:27:26 <DanC> do you know of any particular differences that matter to anybody?
Aug 18 15:56:43 <KragenSitaker> RFC3339 suggests -07:00, which seems like an improvement over -0700 anyway
Aug 18 15:56:49 <Tantek> Kragen, agreed
Aug 18 15:57:01 <Tantek> RFC3339 is certainly preferable to the ISO8601 subset in iCalendar
Aug 18 16:05:57 <DanC> Tantek's right, Kragen; iCalendar looks like it solves the local timezone problem but doesn't.
Aug 18 16:06:14 <DanC> and it's true that there's no standard solution to the local timezone problem
Aug 18 16:06:39 <Tantek> so instead of appearing to solve the problem but not solving it, we chose to provide the ability to *approximate* the local timezone using e.g. "-07:00"
Aug 18 16:06:49 <DanC> the simplest thing is to have people use Z time in hCalendar. But I gather that's unacceptably unusable?
Aug 18 16:07:35 <Tantek> DanC, yes, the simplest thing is to have everyone use UTC Z
Aug 18 16:07:38 <Tantek> However
Aug 18 16:07:50 <Tantek> it is not *nearly* as usuable/verifiable
Aug 18 16:07:55 <Tantek> as -07:00 etc.
Aug 18 16:08:02 <Tantek> hence the decision to go with the latter
Aug 18 16:08:12 <Tantek> some degree of human verifiability is important here
Aug 18 16:14:21 <Tantek> DanC, my perception is that RFC3339 is a subset
Aug 18 16:17:00 <DanC> time-numoffset = ("+" / "-") time-hour ":" time-minute
Aug 18 16:17:34 <DanC> ok, then I can't see any differences. (modulo recent leap seconds issues that may affect xsd:dateTime )
Aug 18 16:18:07 <Tantek> would be interesting to know why Atom 1.0 chose RFC3339 over xsd:dateTime
Aug 18 16:18:21 <Tantek> if there was a "real" reason or if it was arbitrary / coin-flip.
Here's an exhaustive comparison from ndw. I think xsd:dateTime also allows unqualified local times, while RFC3339 allows only UTC with no known timezone (-00:00). In the end, Atompub followed the advice of Sam Ruby and Scott Hollenbeck, our area director. Atom dates make some additional restrictions on RFC3339, such as uppercase T and Z characters for compatibility with xsd:dateTime, RFC3339, W3C-DTF, and ISO8601. --Robert Sayre
Aug 18 16:18:43 <KragenSitaker> rfc3339 is pretty short.
Aug 18 16:19:36 <Tantek> DanC, BTW, which came first? REC for xsd:dateTime or RFC3339?
Aug 18 16:19:50 <DanC> RFC3339 is dated July 2002 ...
Aug 18 16:19:54 <KragenSitaker> Right --- and you might be able to understand xsd:dateTime without reading all of xml schema, you wouldn't be confident of it
Aug 18 16:20:25 <DanC> W3C Recommendation 28 October 2004 ... but that's 2nd ed...
Aug 18 16:20:47 <DanC> W3C Recommendation 02 May 2001
Aug 18 16:22:10 <DanC> I don't see a BNF in http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dateTime ...
Aug 18 16:22:43 <KragenSitaker> yeah, appendix D of the current xml schema datatypes document seems a little scanty, actually
Aug 18 16:23:28 <DanC> ah... 2nd ed of http://www.w3.org/TR/xmlschema-2/#date is much more explicit about syntax.
Aug 18 16:23:30 <KragenSitaker> it's 1100 words but still doesn't give any examples
Aug 18 16:23:35 <DanC> still, it's given in prose and not BNF
Aug 18 16:24:17 <KragenSitaker> sections 3.2.9 through 3.2.14 seem to be the relevant ones around #date
Aug 18 16:24:29 <KragenSitaker> which is another 2200 words
Aug 18 16:24:42 <DanC> wow... they changed the canonical form of date from always-Z to timezone-allowed between 1st edition and 2nd edition
Aug 18 16:25:01 <Tantek> Kragen, DanC, these are very good analyses
Aug 18 16:25:21 <Tantek> could I ask you to summarize the pros/cons for each in a new section at end of http://microformats.org/wiki/datetime-design-pattern
Aug 18 16:25:22 <Tantek> ?
Aug 18 16:25:58 <KragenSitaker> rfc 3339 is 4000 words, excluding the last two pages of boilerplate.
Aug 18 16:26:31 <KragenSitaker> so it's actually longer than the datetime-relevant parts of XSD but it seems much more rigorous and clear
Aug 18 16:28:37 <DanC> my advice is: normatively cite both, and claim they specify the same syntax, and let anybody who discovers otherwise send you a bug report with a test case
Aug 18 16:29:12 <KragenSitaker> danc: nice hack
RFC3339 has a mandatory TIME portion in the DATE-TIME. Some vCard/iCalendar DATE-TIME stamps can omit the TIME. For instance, with DTSTART, if it is a full day event, then you can omit the time. BDAY in vCard can be represented by only a DATE. I like the idea of restricting the possible date formats, but I think that TIME should be optional, which it isn't in RFC3339. - brian suda
RFC 3339 allows lowercase 't' and 'z' while XSD doesn't. Specifying RFC 3339 plus a requirement that 'T' and 'Z' MUST be uppercase will make them the same. - Joe Gregorio
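As a rough sketch of what that extra restriction means in practice, the Python below detects and normalizes the lowercase separators. Both function names are hypothetical, and the code assumes its input is already a well-formed RFC 3339 timestamp, in which 't' and 'z' are the only letters that can appear.

```python
def has_lowercase_separators(timestamp: str) -> bool:
    """True if the timestamp uses the lowercase 't' or 'z' that RFC 3339 permits."""
    return "t" in timestamp or "z" in timestamp


def to_uppercase_profile(timestamp: str) -> str:
    """Rewrite 't' and 'z' as 'T' and 'Z'.

    Assumes the input is otherwise a well-formed RFC 3339 timestamp,
    so no other letters are present.
    """
    return timestamp.replace("t", "T").replace("z", "Z")


print(has_lowercase_separators("2005-08-18t15:16:14z"))
print(to_uppercase_profile("2005-08-18t15:16:14z"))
```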
Code
The following regular expression (parsed VERBOSE) should break apart a datetime and cover many lightly broken cases seen in the wild. This has been tested under Python.
^
(?P<year>\d\d\d\d)
([-])?(?P<month>\d\d)
([-])?(?P<day>\d\d)
(
  (T|\s+)
  (?P<hour>\d\d)
  (
    ([:])?(?P<minute>\d\d)
    (
      ([:])?(?P<second>\d\d)
      (
        ([.])?(?P<fraction>\d+)
      )?
    )?
  )?
)?
(
  (?P<tzzulu>Z)
  |
  (?P<tzoffset>[-+])
  (?P<tzhour>\d\d)
  ([:])?(?P<tzminute>\d\d)
)?
$
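One way the expression above might be used from Python is sketched here. The names DATETIME_RE and split_datetime, and the sample inputs, are assumptions made for the example. Only the pattern itself comes from this page.

```python
import re

# The pattern from this page, compiled with re.VERBOSE so that the
# whitespace and line breaks inside it are ignored.
DATETIME_RE = re.compile(
    r"""
    ^
    (?P<year>\d\d\d\d)
    ([-])?(?P<month>\d\d)
    ([-])?(?P<day>\d\d)
    (
      (T|\s+)
      (?P<hour>\d\d)
      (
        ([:])?(?P<minute>\d\d)
        (
          ([:])?(?P<second>\d\d)
          (
            ([.])?(?P<fraction>\d+)
          )?
        )?
      )?
    )?
    (
      (?P<tzzulu>Z)
      |
      (?P<tzoffset>[-+])
      (?P<tzhour>\d\d)
      ([:])?(?P<tzminute>\d\d)
    )?
    $
    """,
    re.VERBOSE,
)


def split_datetime(text):
    """Return the named parts that matched as a dict, or None if there is no match."""
    match = DATETIME_RE.match(text)
    if match is None:
        return None
    # Drop groups that did not take part in the match.
    return {name: value for name, value in match.groupdict().items() if value is not None}


print(split_datetime("2005-12-06T14:05:00-07:00"))  # strict form
print(split_datetime("20051206 1405Z"))             # lightly broken form
print(split_datetime("6 December 2005"))            # not a match
```

Note that the group named tzoffset captures only the sign of the offset; the hours and minutes are in tzhour and tzminute. The parts are returned as strings, so a caller still has to convert them to integers and range-check them.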
See Also
- All microformat design patterns
- abbr-design-pattern is used by datetime-design-pattern
- HTML 4.01 definition of <abbr> element
- RFC 3339: Date and Time on the Internet: Timestamps
- W3C: Note on Datetimes
- Markus Kuhn: A summary of the international standard date and time notation
- Wikipedia: ISO 8601