[microformats-discuss] Microformat for timestamp of updated content

Robert Bachmann rbach at rbach.priv.at
Mon Aug 22 18:41:12 PDT 2005


Mark Rickerby wrote:
>>If some people would use 'class="last-modified"' to mark up the last
>>modification of a page, and other people would use
>>'class="last-modified"' to mark up the last modification of an
>>item/entry how could parsers recognize if the whole page or a specific
>>item is meant?
>>

> It depends what problem the parser is trying to solve (hopefully not
> some mythic general case, because that would be a nightmare). 
The hypothetical parser is solely trying to extract the last
modification 'datetime' of a page.

> The nesting of container elements should go some way into making such a
> distinction relatively recognizable.
> 
> For example...
> 
> <div id="content">
>   <div class="article">
>      <abbr class="last-modified" title="[ISO8601datetime]">22 Aug</abbr>
>   </div>
>   <abbr class="last-modified" title="[ISO8601datetime]">23 Aug</abbr>
> </div>
> 
> Individual article dates can be recognized here because of the parent
> article elements, thus the later date would clearly belong to the page
> content itself.

That would work for your example.

So our hypothetical parser does the following:
 1. Find all <abbr> elements with class="last-modified"
    whose title represents a 'datetime'.
 2. Store the nesting levels and the dates in some structure.
 3. After the whole document is parsed, pick the one with the lowest
    nesting level.
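
A rough Python sketch of those three steps, using only the standard
library. The names (LastModifiedCollector, page_last_modified) are
made up for illustration, and it assumes the title holds a value that
datetime.fromisoformat() accepts, which is only a subset of ISO 8601
on older Python versions.

```python
from datetime import datetime
from html.parser import HTMLParser

# Elements that never get a closing tag; they must not change the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LastModifiedCollector(HTMLParser):
    """Steps 1 and 2: record (nesting_level, datetime) per matching <abbr>."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        self.depth += 1
        if tag != "abbr":
            return
        attrs = dict(attrs)
        if "last-modified" not in (attrs.get("class") or "").split():
            return
        try:
            stamp = datetime.fromisoformat(attrs.get("title") or "")
        except ValueError:
            return  # title is not a datetime this sketch can parse
        self.found.append((self.depth, stamp))

    def handle_endtag(self, tag):
        if tag not in VOID:
            self.depth -= 1


def page_last_modified(html):
    """Step 3: return the datetime with the lowest nesting level, or None."""
    collector = LastModifiedCollector()
    collector.feed(html)
    collector.close()
    if not collector.found:
        return None
    # On a tie, min() keeps the first candidate in document order.
    return min(collector.found, key=lambda item: item[0])[1]
```

Note that on a tie this sketch silently returns whichever element
comes first in the document, which is an arbitrary choice.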

But what about these examples?

<div id="content">
 <div class="article">
  <abbr class="last-modified" title="[ISO8601datetime]">22 Aug</abbr>
 </div>
 <div class="footer">
  <abbr class="last-modified" title="[ISO8601datetime]">23 Aug</abbr>
 </div>
</div>

 ---

<div id="content">
 <div class="article">
  <abbr class="last-modified" title="[ISO8601datetime]">22 Aug</abbr>
 </div>
 <div class="footer">
  <p>This page was written by John Doe.</p>
  <p>This page was last updated on <abbr class="last-modified"
title="[ISO8601datetime]">23 Aug</abbr></p>
 </div>
</div>

In the first case we have two elements with the same nesting level - so
which one is the right one?

In the second case the "page date element" has a greater nesting
level than the "article/item date element".

So our hypothetical parser would get both cases wrong.

(dirty) solution:

Let's suppose that the last modification date of a page is the
last modification date of the <html> element.

Since <html> is an ancestor of every other HTML element and common
sense tells us that an element can't have been modified more recently
than the element that contains it, we could simply use the datetime
with the highest value (the latest date).
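
A minimal Python sketch of this "latest date wins" rule. The names
(TimestampCollector, latest_last_modified) are hypothetical, and it
assumes every title is parseable by datetime.fromisoformat() and that
the timestamps are either all with or all without a UTC offset, since
Python refuses to compare the two kinds.

```python
from datetime import datetime
from html.parser import HTMLParser


class TimestampCollector(HTMLParser):
    """Collects every parseable class="last-modified" datetime."""

    def __init__(self):
        super().__init__()
        self.stamps = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag != "abbr" or "last-modified" not in classes:
            return
        try:
            self.stamps.append(datetime.fromisoformat(attrs.get("title") or ""))
        except ValueError:
            pass  # ignore titles that are not datetimes


def latest_last_modified(html):
    """Return the latest datetime on the page, or None if there is none."""
    collector = TimestampCollector()
    collector.feed(html)
    collector.close()
    # Nesting is ignored entirely; only the values are compared.
    return max(collector.stamps, default=None)
```

Unlike the nesting-level approach, this gives the footer date for
both of the examples above, regardless of how deeply it is nested.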


This way we could use 'class="last-modified"' for a whole page
and for the items/entries of a page.

Call me ignorant, but I still don't see the benefits of _not_ using
'class="page-last-modified"' (or 'id="page-last-modified"') for
page-level timestamps and 'class="last-modified"' for item-level timestamps.
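
For comparison, a sketch of what the parser could look like with a
dedicated page-level class. PageTimestampFinder and
page_level_timestamp are illustrative names only, and the same
datetime.fromisoformat() assumption about the title applies.

```python
from datetime import datetime
from html.parser import HTMLParser


class PageTimestampFinder(HTMLParser):
    """Keeps the first <abbr class="page-last-modified"> datetime."""

    def __init__(self):
        super().__init__()
        self.page_stamp = None

    def handle_starttag(self, tag, attrs):
        if self.page_stamp is not None or tag != "abbr":
            return
        attrs = dict(attrs)
        if "page-last-modified" not in (attrs.get("class") or "").split():
            return
        try:
            self.page_stamp = datetime.fromisoformat(attrs.get("title") or "")
        except ValueError:
            pass  # not a parseable datetime; keep looking


def page_level_timestamp(html):
    finder = PageTimestampFinder()
    finder.feed(html)
    finder.close()
    return finder.page_stamp
```

No nesting bookkeeping and no comparison of dates is needed here;
item-level class="last-modified" elements are simply skipped.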

> Though I don't know why an author would want to display dates like
> this unless there was substantial content on the page in addition to
> the list of articles.
The most common real-world use of last-modified timestamps for
individual items that I know of is in forums that let users edit
their posts.



BTW:
I was curious how hard it would be to teach MediaWiki to include
a machine-readable date, so I've written a simple patch [1] for
MediaWiki 1.4.x.

See http://wiki.4any.org/MediaWikiExample/ for a live demonstration.


Robert

[1] http://rbach.priv.at/Misc/20050822T2230.diff
-- 
Robert Bachmann <rbach at rbach.priv.at> (OpenPGP KeyID: 0x4A5CCF10)


