datetime-design-pattern: Difference between revisions
|  (→Machine-data in class:  already discussed/rejected *many times*. violates existing modern web designer use of class names. totally opposed.) |  (drafted proposal/brainstorm "date and time separation using value excerption" summary and simple example. more to come, documenting from IRC logs (link noted inline).) | ||
Revision as of 18:45, 24 June 2008
Datetime Design Pattern
This is a page for exploring a datetime design pattern.
Purpose
- Use the datetime-design-pattern to make datetimes that are human readable also formally machine readable.
Practical Need
- This design pattern arose as a result of solving the practical need for human readable dates for hCalendar.
How to use it
- enclose the human-friendly datetime that you want to make machine readable with <abbr>
- as per the class-design-pattern, add the appropriate class attribute to the abbr element
- add a title attribute to the abbr element with the machine readable ISO8601 datetime or date as the value
Current uses
The pattern, which is now available as part of hAtom, hCalendar, hCard and hReview, is:
- <abbr class="foo" title="YYYY-MM-DDTHH:MM:SS+ZZ:ZZ">Date Time</abbr> 
where foo is the semantic classname which is being applied to this date/time, the title of the <abbr> is an ISO 8601 date/time, with an appropriate level of specificity, and "Date Time" is a human-friendly representation of the same date/time.
An alternative, if you are using UTC-based timestamps, would be:
- <abbr class="foo" title="YYYY-MM-DDTHH:MM:SSZ">Date Time</abbr> 
with a single "Z" as per ISO 8601
Ruby: An easy way to get this format from a DateTime is this:
- DateTime.now.to_s 
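Python: a rough equivalent can be sketched as follows. The helper name `datetime_abbr` and the choice to reject naive (timezone-less) values are assumptions made for this illustration, not part of the pattern itself.

```python
from datetime import datetime, timedelta, timezone
from html import escape


def datetime_abbr(class_name, moment, human_text):
    """Return an <abbr> element carrying an ISO 8601 datetime in its title."""
    if moment.tzinfo is None:
        # Assumption of this sketch: only timezone-aware values are accepted,
        # so the title always carries an explicit offset or "Z".
        raise ValueError("moment must be timezone-aware")
    machine = moment.isoformat(timespec="seconds")
    if machine.endswith("+00:00"):
        # Use the single "Z" form for UTC-based timestamps.
        machine = machine[:-len("+00:00")] + "Z"
    return '<abbr class="%s" title="%s">%s</abbr>' % (
        escape(class_name), machine, escape(human_text))


pacific = timezone(timedelta(hours=-7))
print(datetime_abbr("dtstart", datetime(2008, 6, 24, 18, 30, tzinfo=pacific), "6:30pm"))
```

The human-friendly text is passed in separately because the pattern leaves its wording entirely to the author.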
Profile of ISO8601
Any microformat using the datetime-design-pattern SHOULD use a profile of ISO8601. There are currently two widely used profiles which SHOULD be reused.
- RFC 3339
- W3C Note on Datetimes
Accessibility issues
Note: Some accessibility issues have been raised ([1]) with the Datetime Design Pattern, along with concerns that its use could breach WCAG accessibility guidelines; these are being addressed as part of the abbr-design-pattern-issues discussion. Change recommendations may follow once the accessibility testing is complete. The accessibility concerns are considerably lessened, or even eliminated, when using the date-design-pattern, a subset of the datetime-design-pattern.
Discussion
This pattern is likely to be highly reusable.
Can this not be viewed as a microformat in itself?
It could, but inventing a microformat for the sake of inventing a microformat is against the microformat principles. If there is a specific real world problem (and use cases) that such an elemental microformat would solve, then it would be worth considering.
Until then it is best to keep the <abbr> datetime concept merely as a microformat design pattern, to be used in _actual_ microformats that have a demonstrated practical need.
-- Tantek
Excerpt from #microformats Aug 18th. Please edit!
Aug 18 15:16:14 <Tantek>	DanC, what do you think of RFC3339?
Aug 18 15:17:14 <Tantek>	ISO8601 subset
Aug 18 15:17:19 <DanC>	        Date and Time on the Internet: Timestamps http://www.ietf.org/rfc/rfc3339.txt
Aug 18 15:17:30 <DanC>	        Klyne is a good guy. I wonder if I talked with him about this.
Aug 18 15:17:32 <Tantek>	compat with W3C-NOTE-DATETIME
Aug 18 15:17:50 <Tantek>	compat with xsd:dateTime
Aug 18 15:17:57 <Tantek>	it's a strict intersection subset
Aug 18 15:17:59 <DanC>	        I consider W3C-NOTE-DATETIME obsoleted by XML Schema datatype-- yeah.. xsd:dateTime
Aug 18 15:18:32 <Tantek>	compare/contrast normatively using xsd:dateTime vs. RFC3339
Aug 18 15:18:41 <Tantek>	note: Atom 1.0 chose RFC3339
Aug 18 15:18:50 <Tantek>	i would like input from the microformats community on this
Aug 18 15:19:27 <DanC>	        in what context are you evaluating RFC 3339?
Aug 18 15:19:28 <jcgregorio>	http://bitworking.org/news/Date_Constructs_in_the_Atom_Syndication_Format
Aug 18 15:21:24 <DanC>	        which microformat is the question coming from, Tantek ?
Aug 18 15:23:31 <DanC>	        "   The grammar element time-second may have the value "60" at the end of
Aug 18 15:23:31 <DanC>	        months in which a leap second occurs" The XML Schema WG is in the 27th level of
                                leap-second-hell for the past few months, I gather.
Aug 18 15:24:21 <DanC>	        yeah... here's the scary bit: "   Leap seconds cannot be predicted far into the
                                future.  The
Aug 18 15:24:21 <DanC>	        International Earth Rotation Service publishes bulletins [IERS] that
Aug 18 15:24:21 <DanC>	        announce leap seconds with a few weeks' warning."
Aug 18 15:26:03 <Tantek>	DanC, which microformats? any/all that use datetime fields.
Aug 18 15:26:36 <DanC>	        hard to give useful advice, then.
Aug 18 15:26:58 <DanC>	        I expect they'll use datetime fields for different things that have different
                                cost/benefit trade-offs
Aug 18 15:27:26 <DanC>	        do you know of any particular differences that matter to anybody?
Aug 18 15:56:43 <KragenSitaker>	RFC3339 suggests -07:00, which seems like an improvement over -0700 anyway
Aug 18 15:56:49 <Tantek>	Kragen, agreed
Aug 18 15:57:01 <Tantek>	RFC3339 is certainly preferable to the ISO8601 subset in iCalendar
Aug 18 16:05:57 <DanC>	        Tantek's right, Kragen; iCalendar looks like it solves the local timezone
                                problem but doesn't.
Aug 18 16:06:14 <DanC>	        and it's true that there's no standard solution to the local timezone problem
Aug 18 16:06:39 <Tantek>	so instead of appearing to solve the problem but not solving it, we chose to
                                provide the ability to *approximate* the local timezone using e.g. "-07:00"
Aug 18 16:06:49 <DanC>	        the simplest thing is to have people use Z time in hCalendar. But I gather
                                that's unacceptably unusable?
Aug 18 16:07:35 <Tantek>	DanC, yes, the simplest thing is to have everyone use UTC Z
Aug 18 16:07:38 <Tantek>	However
Aug 18 16:07:50 <Tantek>	it is not *nearly* as usuable/verifiable
Aug 18 16:07:55 <Tantek>	as -07:00 etc.
Aug 18 16:08:02 <Tantek>	hence the decision to go with the latter
Aug 18 16:08:12 <Tantek>	some degree of human verifiability is important here
Aug 18 16:14:21 <Tantek>	DanC, my perception is that RFC3339 is a subset
Aug 18 16:17:00 <DanC>	        time-numoffset  = ("+" / "-") time-hour ":" time-minute
Aug 18 16:17:34 <DanC>	        ok, then I can't see any differences. (modulo recent leap seconds issues that
                                may affect xsd:dateTime )
Aug 18 16:18:07 <Tantek>	would be interesting to know why Atom 1.0 chose RFC3339 over xsd:dateTime
Aug 18 16:18:21 <Tantek>	if there was a "real" reason or if it was arbitrary / coin-flip.
Here's an exhaustive comparison from ndw. I think xsd:dateTime also allows unqualified local times, while RFC3339 allows only UTC with no known timezone (-00:00). In the end, Atompub followed the advice of Sam Ruby and Scott Hollenbeck, our area director. Atom dates make some additional restrictions on RFC3339, such as uppercase T and Z characters for compatibility with xsd:dateTime, RFC3339, W3C-DTF, and ISO8601. --Robert Sayre
Aug 18 16:18:43 <KragenSitaker>	rfc3339 is pretty short.
Aug 18 16:19:36 <Tantek>	DanC, BTW, which came first? REC for xsd:dateTime or RFC3339?
Aug 18 16:19:50 <DanC>	        RFC3339 is dated July 2002 ...
Aug 18 16:19:54 <KragenSitaker>	Right --- and you might be able to understand xsd:dateTime without
                                reading all of xml schema, you wouldn't be confident of it
Aug 18 16:20:25 <DanC>	        W3C Recommendation 28 October 2004 ... but that's 2nd ed...
Aug 18 16:20:47 <DanC>	        W3C Recommendation 02 May 2001
Aug 18 16:22:10 <DanC>	        I don't see a BNF in http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/#dateTime ...
Aug 18 16:22:43 <KragenSitaker>	yeah, appendix D of the current xml schema datatypes document seems
                                a little scanty, actually
Aug 18 16:23:28 <DanC>	        ah... 2nd ed of http://www.w3.org/TR/xmlschema-2/#date is much more
                                explicit about syntax.
Aug 18 16:23:30 <KragenSitaker>	it's 1100 words but still doesn't give any examples
Aug 18 16:23:35 <DanC>	        still, it's given in prose and not BNF
Aug 18 16:24:17 <KragenSitaker>	sections 3.2.9 through 3.2.14 seem to be the relevant ones around #date
Aug 18 16:24:29 <KragenSitaker>	which is another 2200 words
Aug 18 16:24:42 <DanC>	        wow... they changed the canonical form of date from always-Z to
                                timezone-allowed between 1st edition and 2nd edition
Aug 18 16:25:01 <Tantek>	Kragen, DanC, these are very good analyses
Aug 18 16:25:21 <Tantek>	could I ask you to summarize the pros/cons for each in a new section at
                                end of http://microformats.org/wiki/datetime-design-pattern
Aug 18 16:25:22 <Tantek>	?
Aug 18 16:25:58 <KragenSitaker>	rfc 3339 is 4000 words, excluding the last two pages of boilerplate.
Aug 18 16:26:31 <KragenSitaker>	so it's actually longer than the datetime-relevant parts of XSD but it
                                seems much more rigorous and clear
Aug 18 16:28:37 <DanC>	        my advice is: normatively cite both, and claim they specify the same
                                syntax, and let anybody who discovers otherwise send you a bug report
                                with a test case
Aug 18 16:29:12 <KragenSitaker>	danc: nice hack
RFC3339 makes the TIME portion of a DATE-TIME mandatory. Some vCard/iCalendar DATE-TIME stamps can omit the TIME. For instance, DTSTART can omit the time for a full-day event, and BDAY in vCard can be represented by only a DATE. I like the idea of restricting the possible date formats, but I think that TIME should be optional, which it isn't in RFC3339. - brian suda
RFC 3339 allows lowercase 't' and 'z' while XSD doesn't. Specifying RFC 3339 plus 'T' and 'Z' MUST be caps will make them the same. - Joe Gregorio
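The two comments above (mandatory time, uppercase 'T' and 'Z') can be illustrated with a small Python check. This is only a sketch: the names `STRICT_RFC3339` and `is_strict_rfc3339` are made up for the example, and the pattern checks shape only, not field ranges (so month 13 or hour 25 would pass).

```python
import re

# RFC 3339 date-time shape with 'T' and 'Z' required to be uppercase.
# Assumption of this sketch: digits are not range-checked.
STRICT_RFC3339 = re.compile(
    r"\d{4}-\d{2}-\d{2}"          # full-date, hyphen separated
    r"T\d{2}:\d{2}:\d{2}"         # mandatory time, uppercase T
    r"(\.\d+)?"                   # optional fractional seconds
    r"(Z|[+-]\d{2}:\d{2})"        # uppercase Z or numeric offset with colon
)


def is_strict_rfc3339(text):
    """True if text has the RFC 3339 date-time shape with uppercase T and Z."""
    return STRICT_RFC3339.fullmatch(text) is not None


print(is_strict_rfc3339("2005-08-18T15:16:14-07:00"))  # offset form
print(is_strict_rfc3339("2005-08-18t15:16:14z"))       # lowercase t and z
print(is_strict_rfc3339("2005-08-18"))                 # date only, no time
```

A date-only value fails this check, which is exactly the restriction brian suda objects to for full-day events and birthdays.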
---
A few questions: asked by CharlesBelov 16:57, 24 Apr 2007 (PDT), answered by JamesCraig on 15:58, 5 Jul 2007 (PDT).
- Would it make more sense for documenting the alternative codings pitting the abbr tag vs. other tags to be on this page? Answer: That documentation should go on the assistive-technology-abbr-results page.
- Would using the title attribute of the abbr tag to encode the machine-readable date in fact cause a failure of WCAG 2.0 Accessibility? What about USA Section 508? It does appear to violate Technique for WCAG 2.0 H28: Providing definitions for abbreviations by using the abbr and acronym elements, although that is a supporting document and does not have the force of a guideline. Answer: Yes, it appears that is in violation of WCAG, 508, et al, so alternatives are being discussed on the assistive-technology-abbr-results page.
- In order to maintain accessibility, would it make sense to enclose the machine-readable date in a span with a style of "display:none" instead of using the abbr tag? Answer: please refer to and add any suggestions to assistive-technology-abbr-results.
- For that matter, wouldn't you want to style such an abbr tag with text-decoration:none to hide that an abbr tag was used? Otherwise, visitors might cursor over the time, see the machine time, and be annoyed that their time was wasted or else be confused. And I don't think you can suppress the title from coming up if the human-readable time was inadvertently hovered. Answer: Microformats should not rely on CSS in order to work properly, but again, that discussion can be found here: assistive-technology-abbr-results.
Code
The following regular expression (to be compiled in verbose mode, e.g. with Python's re.VERBOSE flag) should break apart a datetime and cover many lightly broken cases seen in the wild. It has been tested under Python.
 ^
 (?P<year>\d\d\d\d)
 ([-])?(?P<month>\d\d)
 ([-])?(?P<day>\d\d)
 (
  (T|\s+)
  (?P<hour>\d\d)
  (
   ([:])?(?P<minute>\d\d)
   (
    ([:])?(?P<second>\d\d)
    (
     ([.])?(?P<fraction>\d+)
    )?
   )?
  )?
 )?
 (
  (?P<tzzulu>Z)
  |
  (?P<tzoffset>[-+])
  (?P<tzhour>\d\d)
  ([:])?(?P<tzminute>\d\d)
 )?
 $
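As a usage illustration, the expression above might be compiled and applied like this. The names `DATETIME_RE` and `split_datetime` are hypothetical, chosen for this sketch; the pattern itself is copied unchanged.

```python
import re

# The expression from above; re.VERBOSE makes the layout whitespace insignificant.
DATETIME_RE = re.compile(r"""
 ^
 (?P<year>\d\d\d\d)
 ([-])?(?P<month>\d\d)
 ([-])?(?P<day>\d\d)
 (
  (T|\s+)
  (?P<hour>\d\d)
  (
   ([:])?(?P<minute>\d\d)
   (
    ([:])?(?P<second>\d\d)
    (
     ([.])?(?P<fraction>\d+)
    )?
   )?
  )?
 )?
 (
  (?P<tzzulu>Z)
  |
  (?P<tzoffset>[-+])
  (?P<tzhour>\d\d)
  ([:])?(?P<tzminute>\d\d)
 )?
 $
""", re.VERBOSE)


def split_datetime(text):
    """Return the named parts that matched, or None if text is not a datetime."""
    match = DATETIME_RE.match(text.strip())
    if match is None:
        return None
    # Drop the named groups that did not take part in the match.
    return {name: value for name, value in match.groupdict().items()
            if value is not None}


print(split_datetime("2005-10-10T10:10:10-0100"))
print(split_datetime("20080624"))
```

Note that the parts come back as strings; converting them to integers and range-checking them is left to the caller.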
Other Proposals
strtime instructions as class names
Proposal by DavidLaban (alsuren on freenode) on 8 Jun 2008. It might be possible to have a slightly more readable/extensible/elegant format:
<span class="strtime format:_%d_%B_%Y_" > 16 March 1987 </span>
Notes:
- Underscores are used to replace whitespace, because otherwise the formatting string will be split into an unordered set of class attributes by many parsers (thanks go to bogdanlazarsb and gsnedders on irc for explaining this to me).
- Some subset of the placeholders should be chosen from those which are supported by both python http://docs.python.org/lib/module-time.html and php http://uk3.php.net/manual/en/function.strftime.php
- A name for the class should be decided upon. strtime might not be the best name.
- Measures should be taken to avoid the format string accidentally conflicting with other valid classes (In the above example, I have prefixed it with the string "format:")
- It might be sensible (when parsing) to strip excess whitespace from the format string and contents. This is not done in this example.
- Example python code follows.
import time

# 16 March 1987 as a time tuple; zero is accepted for the unused fields.
birthday = (1987, 3, 16, 0, 0, 0, 0, 0, 0)
format = " %d %B %Y "
# To encode:
classes = ["strtime"]
# Underscores stand in for spaces so the format survives as a single class name.
encoded_format = "format:" + format.replace(' ', '_')
classes.append(encoded_format)
content = time.strftime(format, birthday)
# ... dump classes and content into your document however you want
# To decode (assuming that you have managed to extract classes and content from the document already):
if "strtime" in classes:
    possible_formats = [ item for item in classes if item.startswith('format:') ]
    assert len(possible_formats) == 1
    # Slice off the prefix; strip('format:') would remove any of those characters from both ends.
    format = possible_formats[0][len('format:'):].replace('_', ' ')
    date = time.strptime(content, format)
problems with strtime proposal
- Possible abuse of the class attribute. microformats limit the use of the class attribute to marking up additional semantics about the data, not for (potentially) arbitrary processing/programming instructions
- Requires authors to think like programmers. The larger problem is that the proposal asks web authors to think like programmers, which severely limits the number of web authors who will be able to use the technique, since the vast majority of web authors are not programmers and have never heard of "strtime", whereas most authors (even people) on the web have seen dates like 2005-06-20 and easily understand what they mean.
In general, any publishing method that requires the author to think like a programmer is a non-starter. It is much more of a barrier than simply using ISO8601/RFC3339, and that barrier is a far worse tradeoff than the duplication / DRY violation compromise. Tantek 09:52, 8 Jun 2008 (PDT)
- Another problem: if %A/%a/%B/%b are allowed, this raises potential problems with internationalisation. Will parsers be required to understand the names and abbreviations for days and months in potentially hundreds of different languages? TobyInk 14:09, 8 Jun 2008 (PDT)
Machine-data in class
The BBC (uf-dev archive, 20/06/08, "Using class for non-human data") has proposed as an alternative to the empty span and title solution to use the class name in the following way:
<span class="dtstart data-20051010T10:10:10-0100">10 o'clock on the 10th</span>
Pros:
- Allows data to be represented in a "non-harmful" way. Will not be read aloud by screenreaders or seen as tooltips.
- Minimises mark-up used.
- Arguably more semantic than use of "title" attribute for non-human data.
Cons:
- data in the class attribute has already been discussed numerous times in the mailing list over the years and rejected and documented as an anti-pattern - captured on the wiki this past January 2008.
- Possible misuse of class attribute, although as noted previously, the HTML spec states "for general purpose processing by user agents".
- The class attribute has been adopted by the broader web design community to "subclass" element semantics, and to layer additional semantics. To date, microformats has followed this existing practice developed by modern web designers ("paving the cow-paths"). This use of class for data is outside all current practices.
Discussion:
- This proposal smells icky, but I can't quite put my finger on why. Considered objectively, it does seem to be the least harmful solution proposed so far. TobyInk 06:06, 21 Jun 2008 (PDT)
- I really like it, especially given the HTML4 spec gives this as an IMO perfectly valid use (on both id and class, with the following examples given in the id section: "identifying fields when extracting data from HTML pages into a database, translating HTML documents into other formats, etc."). Clean and simple. Dracos 03:53, 23 Jun 2008 (PDT)
- I suggest dropping the redundant 'data-' prefix, unless someone can suggest a feasible case with two time-stamps requiring different prefixes.  The proposal then becomes one I've made before AshSearle
- Valid class names cannot begin with a number, so a date needs some sort of letter prefix.  It's sensible to make this prefix meaningful and reusable in some way. Phae
- They can't in CSS, but they can in (X)HTML. http://examples.tobyinkster.co.uk/numeric-classes.html TobyInk 06:53, 24 Jun 2008 (PDT)
 
- Not to advocate too strongly for designing for parsers (generally a bad idea), *but* having a 'data-' prefix on a class name would make identifying data orders of magnitude easier for parsing. Otherwise, how do we know what's data and what's just another class name for some other purpose? Drew
- A 'data-' prefix would help authors tasked with maintaining or reviewing a page to understand the purpose of a class name that may have been applied by another author. The data prefix communicates very simply that the class name is precisely that, data. Therefore the value is less likely to be accidentally removed or changed, making for a more robust design. Drew
 
- -1 Tantek. I'm vehemently opposed to putting data in the class attribute. We *must* find better alternatives. We must not go down the path of invisible (dark) (meta)data - IMHO that principle is inviolable for microformats.
Experimental Parser Support
Cognition 0.1 alpha 10 will include experimental support for this pattern, and the Cognition web service already does. Notes:
- Support is opt-in. Publishers must explicitly request support for the pattern, by including a profile URI of http://purl.org/uF/pattern-data-class/1 in their document head.
- Support is not limited to date-time properties, but any microformat properties.
- data-X classes must use percent-encoding to encode spaces and other characters not allowed in class names.
- The data-X class must be found on the same element as the microformat property class. That is, you cannot use: <span class="dtstart"><span class="data-20051010T10:10:10-0100">10 o'clock</span></span> 
- Multiple data-X classes may occur on the same element. When these are found, the longest string is used. This allows for: <span class="dtstart data-2005 data-200510 data-20051010">The 10th</span> which may be useful for styling or other non-microformat purposes.
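The rules in these notes can be sketched in Python as follows. This is not Cognition's code: the function name `data_class_value` is hypothetical, and comparing lengths before percent-decoding is an assumption, since the notes do not say which form is measured.

```python
from urllib.parse import unquote

DATA_PREFIX = "data-"


def data_class_value(class_attribute, property_name):
    """Return the decoded data-X value for a property, or None if absent.

    class_attribute is the raw class attribute of ONE element, so data-X
    classes on nested elements are never considered.
    """
    classes = class_attribute.split()
    if property_name not in classes:
        return None
    candidates = [name[len(DATA_PREFIX):] for name in classes
                  if name.startswith(DATA_PREFIX)]
    if not candidates:
        return None
    # The longest string wins (assumption: measured before decoding).
    longest = max(candidates, key=len)
    # Percent-decoding restores spaces and other characters not allowed in class names.
    return unquote(longest)


print(data_class_value("dtstart data-2005 data-200510 data-20051010", "dtstart"))
```

Because the function only sees one element's class attribute, the nested-span markup shown above as not allowed simply yields no value for dtstart.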
date and time separation using value excerption
In short, by specifying a more precise parsing of the use of "value" excerption inside all datetime properties (e.g. dtstart, dtend, published, updated etc.), dates and times can be marked up separately, thus reducing/minimizing (and potentially eliminating) the readability issues that come with compound ISO8601 datetimes.
Example:
The sentence:
The weekly dinner is tonight at 6:30pm.
would be marked up as:
The weekly dinner is <span class="dtstart"><abbr class="value" title="2008-06-24">tonight</abbr> at <abbr class="value" title="18:30">6:30pm</abbr></span>.
(more to come, documenting from IRC logs)
Some requirements which enhance both human readability, and machine parsability (best of both) :
- date value excerpts MUST use hyphen separators. E.g. 2008-06-24. Not OK: 20080624.
- time value excerpts MUST use colon separators. E.g. 18:30:00. Not OK: 183000.
- timezone value excerpts MUST use a leading plus or minus and a colon separator. E.g. -07:00. Not OK: -0700.
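One way a parser might apply these requirements is sketched below in Python. The proposal is still a draft, so this is only an illustration: the function name `combine_value_excerpts` is hypothetical, and joining the parts with "T" into a single ISO 8601 string is an assumption about the intended result.

```python
import re

# The separators required above make each kind of excerpt unambiguous.
DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
TIME_RE = re.compile(r"\d{2}:\d{2}(:\d{2})?")
ZONE_RE = re.compile(r"[+-]\d{2}:\d{2}")


def combine_value_excerpts(values):
    """Join separately marked-up date, time and timezone values into one datetime."""
    date = time = zone = None
    for value in values:
        if DATE_RE.fullmatch(value):
            date = value
        elif TIME_RE.fullmatch(value):
            time = value
        elif ZONE_RE.fullmatch(value):
            zone = value
        else:
            raise ValueError("not a valid value excerpt: %r" % value)
    if date is None:
        raise ValueError("a date value excerpt is required")
    result = date
    if time is not None:
        result += "T" + time
        if zone is not None:
            result += zone
    return result


# The title attributes from the "weekly dinner" example above.
print(combine_value_excerpts(["2008-06-24", "18:30"]))
```

Because the hyphen, colon and sign requirements keep the three shapes distinct, the excerpts can be recognised without relying on the order in which they appear in the markup.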
See Also
- All microformat design patterns
- abbr-design-pattern is used by datetime-design-pattern
- date-design-pattern is a subset of datetime-design-pattern
- HTML 4.01 definition of <abbr> element
- RFC 3339: Date and Time on the Internet: Timestamps
- W3C: Note on Datetimes
- Markus Kuhn: A summary of the international standard date and time notation
- Wikipedia: ISO 8601