Difference between revisions of "blog-post-brainstorming"

From Microformats Wiki
m (Reverted edits by VilinOerca (Talk) to last version by Tantek)
 
(96 intermediate revisions by 14 users not shown)
Line 1: Line 1:
== Discussion Participants ==
+
= Discussion Participants =
  
=== Editors ===
+
== Editors ==
 
* [http://www.blogmatrix.com David Janes]
 
* [http://www.blogmatrix.com David Janes]
  
=== Authors ===
+
== Authors ==
 
* [http://www.blogmatrix.com David Janes]
 
* [http://www.blogmatrix.com David Janes]
 
+
* [http://tantek.com/ Tantek Çelik]
=== Contributors ===
+
* [[MikeTaylor|Mike Taylor]]
 
* [http://www.oreillynet.com Justin Watt]
 
* [http://www.oreillynet.com Justin Watt]
  
== Purpose ==
+
= Purpose =
 
The 'blog-post-microformat' proposes a codification of how blog posts are identified within weblogs. It is hoped that this will be considered 'expansive': for example, the proposal could be used on [http://www.cnn.com CNN.com] to mark up news articles and summary pages.
 
The 'blog-post-microformat' proposes a codification of how blog posts are identified within weblogs. It is hoped that this will be considered 'expansive': for example, the proposal could be used on [http://www.cnn.com CNN.com] to mark up news articles and summary pages.
  
=== Related Pages ===
+
= Terminology =
 
 
* [[blog-post-formats]] - many examples taken from the real world about how blog content is marked up
 
* [[blog-post]] - coming soon; a proposal for a microformat
 
* [[blog-description-format]] - how to describe a blog (as opposed to the individual entries, which is what we're doing here)
 
 
 
== Terminology ==
 
  
This section explores the terminology that should be used to discuss a blog post microformat. To make it easier to talk about the different kinds of terminology, we're using an XML-like namespace notation so we can make statements like <code>atom:entry</code> is roughly equivalent to <code>rss20:item</code>, <code>atom:feed/atom:link@rel=alternate</code> is roughly equivalent to <code>rss20:channel/rss20:link</code>, or <code>atom:author</code> is '''not''' equivalent to <code>rss20:item/rss20:author</code> (because RSS 2.0 defines the author only as an email address).
+
This section explores the terminology that should be used to discuss a blog post microformat. To make it easier to talk about the different kinds of terminology, we're using an XML-like namespace notation so we can make statements like <code>atom:entry</code> is roughly equivalent to <code>rss20:item</code>, <code>atom:feed/atom:link@relalternate</code> is roughly equivalent to <code>rss20:channel/rss20:link</code>, or <code>atom:author</code> is '''not''' equivalent to <code>rss20:item/rss20:author</code> (because RSS 2.0 defines the author only as an email address).
  
=== Common terminology in weblogs ===
+
== Common terminology in weblogs ==
  
 
Reviewing [[blog-post-formats#Tools]], one can see that there's little standardization amongst tools, or even within an individual tool (such as 'blogger'), for names of elements of blog posts. There are, however, many common elements, including:
 
Reviewing [[blog-post-formats#Tools]], one can see that there's little standardization amongst tools, or even within an individual tool (such as 'blogger'), for names of elements of blog posts. There are, however, many common elements, including:
Line 37: Line 31:
 
Furthermore, in developing a microformat for weblog posts, we want to be careful not to break any (or many) templates. Note that many weblog templates will have to be updated as they produce somewhat crufty HTML rather than shiny XHTML.
 
Furthermore, in developing a microformat for weblog posts, we want to be careful not to break any (or many) templates. Note that many weblog templates will have to be updated as they produce somewhat crufty HTML rather than shiny XHTML.
  
=== Atom Terminology ===
+
== Atom Terminology ==
  
 
See [http://www.atomenabled.org/ here] for the spec and [[blog-post-formats#Atom]] for analysis.
 
See [http://www.atomenabled.org/ here] for the spec and [[blog-post-formats#Atom]] for analysis.
Line 46: Line 40:
 
** <code>atom:title</code> - the title of an atom:entry or an atom:feed
 
** <code>atom:title</code> - the title of an atom:entry or an atom:feed
 
** <code>atom:updated</code> - the last time the feed was updated
 
** <code>atom:updated</code> - the last time the feed was updated
** <code>atom:link@rel=alternate</code> - the home page of a feed
+
** <code>atom:link@relalternate</code> - the home page of a feed
** <code>atom:link@rel=self</code> - the URI of the feed (where it can be downloaded)
+
** <code>atom:link@relself</code> - the URI of the feed (where it can be downloaded)
 
** <code>atom:entry</code> - (composite) an entry within the feed
 
** <code>atom:entry</code> - (composite) an entry within the feed
 
*** <code>atom:content</code> - the entry's content
 
*** <code>atom:content</code> - the entry's content
 
*** <code>atom:summary</code> - a summary of the entry's content
 
*** <code>atom:summary</code> - a summary of the entry's content
 
*** <code>atom:entry/link</code> - the permanent URI of the entry
 
*** <code>atom:entry/link</code> - the permanent URI of the entry
 +
*** <code>atom:published</code> - the time of the initial creation or first availability of the entry
  
=== RSS Terminology ===
+
== RSS 2.0 Terminology ==
  
 
See [http://blogs.law.harvard.edu/tech/rss here] for the spec and [[blog-post-formats#RSS]] for analysis. There are a lot more elements in RSS but this covers the most commonly used ones.
 
See [http://blogs.law.harvard.edu/tech/rss here] for the spec and [[blog-post-formats#RSS]] for analysis. There are a lot more elements in RSS but this covers the most commonly used ones.
Line 59: Line 54:
 
* <code>rss2:channel</code> - (composite) a collection of entries plus information about them
 
* <code>rss2:channel</code> - (composite) a collection of entries plus information about them
 
** <code>rss2:author</code> - (composite) the author of a feed (may contain atom:email, atom:name, atom:uri)
 
** <code>rss2:author</code> - (composite) the author of a feed (may contain atom:email, atom:name, atom:uri)
** <code>rss2:link</code> - The URL to the HTML website corresponding to the channel (compare to atom:link@rel=alternate)
+
** <code>rss2:link</code> - The URL to the HTML website corresponding to the channel (compare to atom:link@relalternate)
 
** <code>rss2:title</code> - the title of an rss2:channel or an rss2:item
 
** <code>rss2:title</code> - the title of an rss2:channel or an rss2:item
 
** <code>rss2:pubDate</code> - The publication date for the content in the channel.
 
** <code>rss2:pubDate</code> - The publication date for the content in the channel.
Line 67: Line 62:
 
*** <code>rss2:author</code> - email address of the author of the item
 
*** <code>rss2:author</code> - email address of the author of the item
  
=== Recommendation ===
+
== Recommendation ==
  
 
Atom has a much more precise mechanism for defining syndication feeds and weblog data. A mechanical transformation from Atom -> RSS will always lead to a correct RSS feed; an RSS -> Atom translation would have to choose among multiple interpretations, and the choice may not always be correct: for example, the format of markup, the role of an author, or the meaning of a link.
 
Atom has a much more precise mechanism for defining syndication feeds and weblog data. A mechanical transformation from Atom -> RSS will always lead to a correct RSS feed; an RSS -> Atom translation would have to choose among multiple interpretations, and the choice may not always be correct: for example, the format of markup, the role of an author, or the meaning of a link.
Line 73: Line 68:
 
IMPORTANT: we shall talk about things such as 'marking elements <code>atom:feed</code>'; consider this a purely conceptual thing. The text 'atom:feed' will not appear in the XHTML microformat -- we may decide later to use the actual phrase 'atom_feed', 'feed', 'items' or 'googlybear'. In the case where there is no clear or applicable atom terminology, we shall use 'weblog:xxx'.
 
IMPORTANT: we shall talk about things such as 'marking elements <code>atom:feed</code>'; consider this a purely conceptual thing. The text 'atom:feed' will not appear in the XHTML microformat -- we may decide later to use the actual phrase 'atom_feed', 'feed', 'items' or 'googlybear'. In the case where there is no clear or applicable atom terminology, we shall use 'weblog:xxx'.
  
== Discovered Elements ==
+
= Discovered Elements =
  
 
This section explores the information discovered from [[blog-post-formats]] using the terminology discussed above. We will only focus on the major elements of weblog posts:
 
This section explores the information discovered from [[blog-post-formats]] using the terminology discussed above. We will only focus on the major elements of weblog posts:
  
* the entry container
+
* the EntryGroup
* the individual entry
+
* the individual Entry
* the entry title
+
* the Entry Title
* the content
+
* the Entry Content
* the permalink
+
* the Entry Permalink
 +
* the Entry Datetimes
  
 
For now, the codification of the following major elements will be deferred as there is/may be overlap with other microformats that should be explored further
 
For now, the codification of the following major elements will be deferred as there is/may be overlap with other microformats that should be explored further
  
* the poster/author
+
* the EntryGroup Title
* the posting date
+
* the EntryGroup Permalink
* the modified date
+
* the Entry Poster/Author - in particular, should hcard be used?
  
 
Further input from the community would be appreciated here
 
Further input from the community would be appreciated here
  
=== Entry Container ===
+
== EntryGroup ==
 
 
Roughly speaking, this corresponds to 'atom:feed' or 'rss2:channel' (in particular, the items within those elements).
 
 
 
==== Forms seen in the wild ====
 
  
* entries are within a container; that is, all entries are within an enclosing 'div'. This is common with weblog home pages ([http://microformats.org/ example]) or archive with multiple entries.
+
Roughly speaking, this corresponds to 'atom:feed' or 'rss2:channel' (in particular, the items within those XML elements). See [[blog-post-examples#EntryGroup]] for the various forms seen in the wild.
* entries are not within a container; that is, there are multiple entries on a single page but there is no explicit container element ([http://thecommunityengine.com/home/ example]). This is also a common use case for weblogs and archives.
 
* there may be multiple groups of entries on a single page that are tenuously connected ([http://www.truthlaidbear.com/ example-1] [http://news.google.ca/nwshp?hl=en&tab=wn&q= example-2]).
 
* there is only a single entry on a page. This is common with weblogs that archive on a per entry basis ([http://www.microformats.org/blog/2005/09/30/web-essentials-audio/ example]).
 
  
==== Recommendation for blog-post-format microformat ====
+
=== Microformat Recommendation ===
  
 
* weblog pages (including home pages, archives, category pages, tag pages and so forth) that may contain multiple entries MUST enclose the entries in an <code>atom:feed</code> element
 
* weblog pages (including home pages, archives, category pages, tag pages and so forth) that may contain multiple entries MUST enclose the entries in an <code>atom:feed</code> element
 
* weblog pages MAY have multiple <code>atom:feed</code> elements enclosing different groups of entries
 
* weblog pages MAY have multiple <code>atom:feed</code> elements enclosing different groups of entries
 
* <code>atom:feed</code> elements MUST NOT be nested
 
* <code>atom:feed</code> elements MUST NOT be nested
* weblog pages that have exactly on entry MAY use the <code>atom:feed</code>
+
* weblog pages that have exactly one entry MAY use the <code>atom:feed</code>
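The nesting rule above lends itself to a mechanical check. The following is a minimal sketch, assuming well-formed XHTML and assuming that the placeholder string 'atom:feed' appears literally in the <code>class</code> attribute (as in the examples on this page); the names <code>FeedNestingChecker</code> and <code>check_feeds</code> are hypothetical and not part of any proposal.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag; they are not tracked as "open".
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class FeedNestingChecker(HTMLParser):
    """Counts elements marked 'atom:feed' and flags any nested inside another."""

    def __init__(self):
        super().__init__()
        self._open = []        # one bool per open element: is it a feed?
        self.feed_count = 0
        self.nested = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        is_feed = "atom:feed" in classes
        if is_feed:
            self.feed_count += 1
            if any(self._open):      # already inside another feed
                self.nested = True
        if tag not in VOID:
            self._open.append(is_feed)

    def handle_endtag(self, tag):
        if tag not in VOID and self._open:
            self._open.pop()


def check_feeds(markup):
    """Return (number of feeds, whether any feed is nested)."""
    checker = FeedNestingChecker()
    checker.feed(markup)
    checker.close()
    return checker.feed_count, checker.nested
```

For instance, <code>check_feeds('<div class="atom:feed"><div class="atom:feed"></div></div>')</code> reports two feeds and flags the nesting, while two sibling feeds on one page pass the check.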
 +
 
 +
=== Example Transformation ===
 +
 
 +
''Note that the string 'atom:feed' is a placeholder for something to be decided later.''
 +
 
 +
Original (obviously, if there is no existing EntryGroup block element, one can be added):
 +
 
 +
<pre><nowiki>
 +
<div id="content">
 +
<h2 id="home-title">
 +
  Latest microformats news
 +
  <a href="http://www.microformats.org/feed/" title="link to RSS feed" id="feed-link">
 +
  <img src="/img/xml.gif" width="23" height="13" alt="XML" />
 +
  </a>
 +
</h2>
 +
 
 +
<div class="entry">
 +
  <h3 id="post-60">
 +
  <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
 +
  </h3>
 +
  ...
 +
</div>
 +
 
 +
...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformed:
 +
 
 +
<pre><nowiki>
 +
<div id="content" class="atom:feed">
 +
<h2 id="home-title">
 +
  Latest microformats news
 +
  <a href="http://www.microformats.org/feed/" title="link to RSS feed" id="feed-link">
 +
  <img src="/img/xml.gif" width="23" height="13" alt="XML" />
 +
  </a>
 +
</h2>
 +
 
 +
<div class="entry">
 +
  <h3 id="post-60">
 +
  <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
 +
  </h3>
 +
  ...
 +
</div>
  
=== Individual Entry ===
+
...
 +
</div>
 +
</nowiki></pre>
  
This corresponds almost exactly to the <code>atom:entry</code> or <code>rss2:item</code> elements.
+
== EntryGroup Title ==
 +
Not covered by this proposal yet.
  
==== Forms seen in the wild ====
+
== EntryGroup Permalink ==
 +
Not covered by this proposal yet.
  
* individual entries are within a container (commonplace)
+
== Individual Entry ==
* individual entries are not within a container (rare-ish)
 
* not all sub-elements of an individual entry are in the container (for example, the author and date may follow in a separate block)
 
  
As the latter two forms are more happenstance than design, we believe building from the first form is best.
+
This corresponds almost exactly to the <code>atom:entry</code> or <code>rss2:item</code> elements. See [[blog-post-examples#Individual_Entry]] for the various forms seen in the wild.
  
==== Recommendation for blog-post-format microformat ====
+
=== Microformat Recommendation ===
  
 
* weblog entries MUST be enclosed in a single <code>atom:entry</code> element
 
* weblog entries MUST be enclosed in a single <code>atom:entry</code> element
 
* <code>atom:entry</code> elements MUST NOT be nested
 
* <code>atom:entry</code> elements MUST NOT be nested
* <code>atom:entry</code> MUST NOT not belong to more than one <code>atom:feed</code> element
+
* <code>atom:entry</code> MUST NOT belong to more than one <code>atom:feed</code> element
 +
 
 +
=== Example Transformation ===
 +
''Note that the string 'atom:entry' is a placeholder for something to be decided later.''
 +
 
 +
==== Entries in existing block ====
 +
 
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="entry">
 +
  <h3 id="post-60">
 +
  <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
 +
  </h3>
 +
  ... rest of entry ...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformed:
 +
 
 +
<pre><nowiki>
 +
<div class="atom:feed">
 +
<div class="atom:entry entry">
 +
  <h3 id="post-60">
 +
  <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
 +
  </h3>
 +
  ... rest of entry ...
 +
</div>
 +
... additional entries ...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
==== Entries not in an existing block ====
 +
 
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<a name="112877372228959075">&amp;nbsp;</a>
 +
<br>
 +
  <strong>Just one problem, Minister.</strong> Last week, Bill Rammell,
 +
<br>
 +
</nowiki></pre>
 +
 
 +
Transformed:
 +
 
 +
<pre><nowiki>
 +
<div class="atom:feed">
 +
<div class="atom:entry" id="112877372228959075">
 +
  <br />
 +
  <strong>Just one problem, Minister.</strong> Last week, Bill Rammell,
 +
  <br />
 +
  ... rest of entry ...
 +
</div>
 +
... additional entries ...
 +
</div>
 +
</nowiki></pre>
  
=== Title ===
+
Note that the following additional changes were also made:
 +
* <code>&lt;br></code> was made XHTML compliant
 +
* <code>&lt;a name="..."></code> was converted to an <code>id="..."</code> (''confirm this is OK'')
  
This corresponds almost exactly to the <code>atom:title</code> or <code>rss2:title</code> elements.
+
==== Disjointed entries ====
  
==== Forms seen in the wild ====
+
Ignore any existing blocks and treat as the previous case of no block.
  
* entry titles are enclosed in <code>&lt;h#></code> block ([http://www.microformats.org/blog/2005/09/30/web-essentials-audio/ example])
+
== Entry Title ==
* entry titles are enclosed in a <code>&lt;div></code> (I've seen this but I can't find an example, hopefully implying this is somewhat rare)
 
* entry titles are enclosed in a presentation element, such as <code>&lt;b></code> ([http://nataliesolent.blogspot.com/ example])
 
* entry titles are enclosed in a <code>&lt;span></code> ([http://www.andrewsullivan.com/ example])
 
* an entry has no title ([http://www.instapundit.com/ example])
 
  
Thus there are two fundamental ways titles are used in the wild: at the block level and inline. Our proposal must be capable of handling both forms.
+
This corresponds almost exactly to the <code>atom:title</code> or <code>rss2:title</code> elements. See [[blog-post-formats#Titles]] for examples from which we see that there are two fundamental ways titles are used in the wild: at the block level and inline. Our proposal must be capable of handling both forms.
  
==== Recommendation for blog-post-format microformat ====
+
=== Microformat Recommendation ===
  
* <code>atom:entry</code>s SHOULD have at most 1 title
+
* <code>atom:entry</code>s SHOULD have at most one title
* block level titles SHOULD be represented using <code>&lt;h#></code>, the first such element for in a <code>atom:entry</code> being considered to be the title; this need not be marked up or identified in any other way as the title  
+
* block level titles SHOULD be represented using <code>&lt;h#></code>, the first such element in a <code>atom:entry</code> should be considered the title; this need not be marked up or identified in any other way as the title  
 
* inline titles MUST be marked as <code>atom:title</code>; it is also possible to do this using block level formatting such as <code>&lt;div></code>, but this is discouraged
 
* inline titles MUST be marked as <code>atom:title</code>; it is also possible to do this using block level formatting such as <code>&lt;div></code>, but this is discouraged
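The title rules above can be read as a small lookup procedure. The following is a minimal sketch, assuming well-formed XHTML for a single entry and assuming the placeholder 'atom:title' appears literally in the <code>class</code> attribute. One further assumption not stated in the recommendation: when both an explicitly marked title and an <code>&lt;h#></code> are present, the explicitly marked one wins. The names <code>TitleFinder</code> and <code>entry_title</code> are hypothetical.

```python
import re
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}
HEADING = re.compile(r"h[1-6]")


class TitleFinder(HTMLParser):
    """Finds the first <h#> and the first element marked 'atom:title'."""

    def __init__(self):
        super().__init__()
        self.heading = None    # text of the first <h#>
        self.marked = None     # text of the first class="atom:title" element
        self._capture = None   # [kind, depth, text parts] while inside a candidate

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._capture:
            self._capture[1] += 1        # nested element inside the candidate
            return
        classes = (dict(attrs).get("class") or "").split()
        if "atom:title" in classes and self.marked is None:
            self._capture = ["marked", 1, []]
        elif HEADING.fullmatch(tag) and self.heading is None:
            self._capture = ["heading", 1, []]

    def handle_endtag(self, tag):
        if tag in VOID or not self._capture:
            return
        self._capture[1] -= 1
        if self._capture[1] == 0:        # candidate element just closed
            kind, _, parts = self._capture
            setattr(self, kind, " ".join("".join(parts).split()))
            self._capture = None

    def handle_data(self, data):
        if self._capture:
            self._capture[2].append(data)


def entry_title(fragment):
    """Return the entry title, or None if the entry has no title."""
    finder = TitleFinder()
    finder.feed(fragment)
    finder.close()
    # Assumption: an explicit atom:title takes precedence over the first <h#>.
    return finder.marked if finder.marked is not None else finder.heading
```

Applied to an entry whose title sits in an <code>&lt;h2></code>, the sketch returns the heading text without any extra markup being needed; applied to an entry with only <code>&lt;strong class="atom:title"></code>, it returns the marked text.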
  
=== Content ===
+
=== Discussion: why not always <code>&lt;h#></code>? ===
 +
 
 +
Using CSS <code>display: inline</code>, block level elements can be converted to inline elements. Unfortunately, we cannot nest <code>&lt;h#></code> inside of a <code>&lt;p></code> block to achieve the correct effect. I.e. we cannot convert ...
 +
 
 +
<pre><nowiki>
 +
<p><strong>The Title</strong>: The Text...</p>
 +
</nowiki></pre>
 +
 
 +
... into ...
 +
 
 +
<pre><nowiki>
 +
<p><h3 style="display: inline">The Title</h3>: The Text ...</p>
 +
</nowiki></pre>
 +
 
 +
... because the XHTML will not validate. Also...
 +
 
 +
<pre><nowiki>
 +
<h3 style="display: inline">The Title</h3><p>: The Text ...</p>
 +
</nowiki></pre>
 +
 
 +
... will not work because the presentation will differ from what the user intends (because the <code>&lt;p></code> will introduce a line break).
 +
 
 +
=== Example Transformation ===
 +
 
 +
''Note that the string 'atom:title' is a placeholder for something to be decided later.''
 +
 
 +
==== Header in <code>&lt;h#></code> block ====
 +
Original (and Final):
 +
 
 +
<pre>
 +
<div class="atom:entry">
 +
<h2 id="post-59">Web Essentials Audio</h2>
 +
... rest of entry ...
 +
</div>
 +
</pre>
 +
 
 +
No transformation is needed -- the blog-post microformat will recognize this as the <code>atom:title</code>.
 +
 
 +
==== Header in other block element  ====
 +
 
 +
Original:
 +
 
 +
<pre>
 +
<div class="atom:entry">
 +
<div class="header">Web Essentials Audio</div>
 +
</div>
 +
</pre>
 +
 
 +
Transformed (the header level is to taste):
 +
 
 +
<pre>
 +
<div class="atom:entry">
 +
<h3>Web Essentials Audio</h3>
 +
</div>
 +
</pre>
 +
 
 +
It is possible to add <code>class="atom:title"</code> to the <code>div</code> but we recommend against it. However, we recognize that there may be certain [http://microformats.org/wiki/blog-post-formats#Discussion_Forum_.2F_Bulletin_Board_Formats BB Tools] for which making this change is too difficult.
 +
 
 +
==== Header in inline element  ====
 +
 
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="atom:entry" id="112877372228959075">
 +
  <br />
 +
  <strong>Just one problem, Minister.</strong> Last week, Bill Rammell,
 +
  <br />
 +
  ... rest of entry ...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformed:
  
This roughly corresponds to <code>atom:content</code> and/or <code>atom:summary</code> elements.
+
<pre><nowiki>
 +
<div class="atom:entry" id="112877372228959075">
 +
  <br />
 +
  <strong class="atom:title">Just one problem, Minister.</strong> Last week, Bill Rammell,
 +
  <br />
 +
  ... rest of entry ...
 +
</div>
 +
</nowiki></pre>
  
==== Forms seen in the wild ====
+
== Entry Content ==
  
* entry with no content present -- that is, just a link and the title pointing to a different URI which may actually have content
+
This roughly corresponds to <code>atom:content</code> and/or <code>atom:summary</code> elements. See [[blog-post-examples#Entry_Content]] for the various forms seen in the wild.
* entry with summary content only ([example http://www.torontosun.com/Money/home.html])
 
* entry with complete content
 
* entry with complete content, but the content is broken into multiple sections ([example http://www.samizdata.net/blog/] - look for "Read More" sections)
 
  
==== The split content problem ====  
+
=== Discussion: the split content problem ===  
  
 
The last item above (content broken into multiple sections) introduces a few unique problems. It is not sufficient to enclose all the different content sections in a <code>atom:content</code> element, as the following example illustrates:
 
The last item above (content broken into multiple sections) introduces a few unique problems. It is not sufficient to enclose all the different content sections in a <code>atom:content</code> element, as the following example illustrates:
  
<pre><code>
+
<pre><nowiki>
  &lt;atom:entry>
+
  <div class="atom:entry">
   &lt;atom:content>
+
   <div class="atom:content">
   first part of the content
+
   ... first part of the content ...
 
   "Read More"
 
   "Read More"
   second part of the content
+
   ... second part of the content ...
   &lt;/atom:content>
+
   </div>
  &lt;/atom:entry>
+
  </div>
</code></pre>
+
</nowiki></pre>
  
 
"Read More" is not part of the content! Therefore, we propose that ''multiple'' content sections be allowed in a single <code>atom:entry</code>. The concatenation of all these content blocks will define the complete content:
 
"Read More" is not part of the content! Therefore, we propose that ''multiple'' content sections be allowed in a single <code>atom:entry</code>. The concatenation of all these content blocks will define the complete content:
  
<pre><code>
+
<pre><nowiki>
  &lt;atom:entry>
+
  <div class="atom:entry">
   &lt;atom:content>
+
   <div class="atom:content">
   first part of the content
+
   ... first part of the content ...
   &lt;/atom:content>
+
   </div>
  "Read More"
+
  "Read More"
   &lt;atom:content>
+
   <div class="atom:content">
   second part of the content
+
   ... second part of the content ...
   &lt;/atom:content>
+
   </div>
  &lt;/atom:entry>
+
  </div>
</code></pre>
+
</nowiki></pre>
  
Note: once again, don't confuse <code>&lt;atom:content></code> with something we're going to actually see in the final microformat -- it's just a placeholder for a concept we're going to implement!
+
The same argument is applicable to <code>atom:summary</code>.
  
==== Recommendation for blog-post-format microformat ====
+
=== Microformat Recommendation ===
  
 
* an <code>atom:entry</code> MAY have zero or more <code>atom:summary</code> sections. There is no requirement that different representations of the same entry (on different URIs) use the same summaries.
 
* an <code>atom:entry</code> MAY have zero or more <code>atom:summary</code> sections. There is no requirement that different representations of the same entry (on different URIs) use the same summaries.
* an <code>atom:entry</code> MAY have zero or more <code>atom:content</code> sections. The serial concatenation of all the <code>atom:content</code> sections within the entry MUST represent the complete content of the entry.
+
* an <code>atom:entry</code> MAY have zero or more <code>atom:content</code> sections. The serial concatenation of all the <code>atom:content</code> sections within the entry MUST represent the complete content of the entry. Note that the rule here is slightly different than [http://www.atomenabled.org/developers/syndication/atom-format-spec.php#rfc.section.4.1.2 Atom] which only allows one <code>atom:content</code>.
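The concatenation rule above can be illustrated with a short sketch. It assumes well-formed XHTML for a single entry and assumes the placeholder 'atom:content' appears literally in the <code>class</code> attribute. It collects plain text only and joins sections with a single space; both choices are simplifications made for this illustration (a real consumer would presumably keep the markup). The names <code>ContentCollector</code> and <code>complete_content</code> are hypothetical.

```python
from html.parser import HTMLParser

VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class ContentCollector(HTMLParser):
    """Collects the text of every element marked 'atom:content', in document order."""

    def __init__(self):
        super().__init__()
        self.sections = []   # one normalized string per atom:content section
        self._depth = 0      # > 0 while inside an atom:content element
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._depth:
            self._depth += 1             # nested element inside a section
            return
        classes = (dict(attrs).get("class") or "").split()
        if "atom:content" in classes:
            self._depth = 1
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:             # section just closed
            self.sections.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


def complete_content(fragment):
    """Concatenate all atom:content sections of one entry, in order."""
    collector = ContentCollector()
    collector.feed(fragment)
    collector.close()
    return " ".join(collector.sections)
```

With the split-content example from the discussion above, the two sections are joined and the "Read More" text between them is left out, because it is not inside any <code>atom:content</code> element.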
 +
 
 +
=== Example Transformation ===
 +
 
 +
''Note that the strings 'atom:summary' and 'atom:content' (etc.) are placeholders for something to be decided later.''
 +
 
 +
==== Entry with summary content ====
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="inlineBlog">
 +
<h3 id="a003068">
 +
  <a href="http://thecommunityengine.com/h.../xfolk_vegomatic.html" class="taggedlink">xFolk Veg-o-matic Alpha</a>
 +
</h3>
 +
<p class="abstract extended">
 +
  We provide a way to surf the web and slice and dice information you find there into your own custom output stream.
 +
</p>
 +
... some tag and category stuff ...
 +
<p>
 +
  The folks at ... the rest of the content
 +
</p>
 +
<p class="extended">
 +
  <a href="http://thecommunityengine.com/.../xfolk_vegomatic.html#more">Continue reading "xFolk Veg-o-matic Alpha"</a>
 +
</p>
 +
...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformation:
 +
 
 +
<pre><nowiki>
 +
<div class="inlineBlog atom:entry">
 +
<h3 id="a003068">
 +
  <a href="http://thecommunityengine.com/h.../xfolk_vegomatic.html" class="taggedlink atom:permalink">xFolk Veg-o-matic Alpha</a>
 +
</h3>
 +
<p class="abstract extended">
 +
  We provide a way to surf the web and slice and dice information you find there into your own custom output stream.
 +
</p>
 +
... some tag and category stuff ...
 +
<div class="atom:summary">
 +
  <p>
 +
  The folks at ... the rest of the content
 +
  </p>
 +
</div>
 +
<p class="extended">
 +
  <a href="http://thecommunityengine.com/.../xfolk_vegomatic.html#more">Continue reading "xFolk Veg-o-matic Alpha"</a>
 +
</p>
 +
...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Notes:
 +
* we didn't do anything with the "abstract" section -- this is a discussion for another day
 +
* we didn't include the tag stuff in the summary, and probably wouldn't if this was the complete content
 +
 
 +
==== Entry with complete content ====
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="entry single">
 +
<h2 id="post-61">Class attributes are about more than styling</h2>
 +
 
 +
<p>When people talk about microformats, ... </p>
 +
 +
<blockquote cite="http://www.w3.org/TR/REC-html40/struct/global.html#h-7.5.2">
 +
  ... quoted text from elsewhere
 +
</blockquote>
 +
 
 +
<p>There&#8217;s a couple of points I&#8217;d like to highlight here:</p>
 +
 +
... more content ...
 +
 
 +
<h4 class="tags">Technorati Tags:</h4>
 +
<ul class="tags">
 +
  <li><a href="http://www.technorati.com/tag/css" rel="tag">css</a></li>
 +
  ...
 +
</ul>
 +
 
 +
<ul class="post-info">
 +
  ... footer stuff ...
 +
</ul>
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformation:
  
=== Permalink ===
+
<pre><nowiki>
 +
<div class="entry single atom:entry">
 +
<h2 id="post-61">Class attributes are about more than styling</h2>
  
Permalinks roughly correspond to <code>atom:entry/link</code>.
+
<div class="atom:content">
 +
  <p>When people talk about microformats, ... </p>
  
A permalink is called '''canonical''' if it is the best representation of the URI for that entry; the definition of what 'best representation' is is entirely at the discretion of the weblog's publisher. The issue of whether a URI is canonical or not adds some additional complexity to this microformat; the value in explicitly spelling this out is that we can then use the URI without transformation to link together multiple syndication feeds and multiple XHTML copies of weblog posts.
+
  <blockquote cite="http://www.w3.org/TR/REC-html40/struct/global.html#h-7.5.2">
 +
  ... quoted text from elsewhere
 +
  </blockquote>
  
==== Forms seen in the wild ====
+
  <p>There&#8217;s a couple of points I&#8217;d like to highlight here:</p>
==== Recommendation for blog-post-format microformat ====
 
  
* weblog entries MUST have exactly one <code>atom:entry/link</code>
+
  ... more content ...
* permalinks SHOULD be marked as <code>atom:entry/link</code>
+
</div>
* canonical permalinks SHOULD also be marked <code>blogpost:canonical</code>
+
 
* permalinks which are not canonical MUST NOT be marked <code>blogpost:canonical</code>
+
<h4 class="tags">Technorati Tags:</h4>
 +
<ul class="tags">
 +
  <li><a href="http://www.technorati.com/tag/css" rel="tag">css</a></li>
 +
  ...
 +
</ul>
 +
 
 +
<ul class="post-info">
 +
  ... footer stuff ...
 +
</ul>
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Notes:
 +
* the only thing that really needed to be done is enclose the content
 +
* my preference would be to move the post <code>id</code> to the <code>atom:entry</code>
 +
 
 +
==== Entry with split content (multiple sections) ====
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="blogbody">
 +
<a name="008148"></a>
 +
 
 +
<div class="title">
 +
  Face to face: why places will continue to exist
 +
</div>
 +
 
 +
<div class="posted">
 +
  <strong>Brian Micklethwait (London)</strong>
 +
  &nbsp;&nbsp;
 +
  <a href="...">Science &amp; Technology</a>
 +
</div>
 +
 
 +
<p>It is not just that I dislike filling in forms....</p>
 +
... the first section of the content ...
 +
 
 +
... this link makes the extended section show ...
 +
<span id="varP8148">
 +
  <img src="http://www.samizdata.net/blog/img/bullet_tri.gif" width="16" height="10" alt="" />
 +
  <a href="..." onclick="showMore(8148,'...');return false;">
 +
  Read more.
 +
  </a>
 +
</span>
 +
 
 +
<div id="varXYZ8148" style="display: none">
 +
  <p>The very gadgets – computers linked...</p>
 +
  ... the rest of the extended content ...
 +
 
 +
  ... this link makes the extended section hide ...
 +
  <img src="..." width="16" height="10" alt="" />
 +
  <a href="#008148" onclick="showMore(8148,0);return true;">
 +
    Read less.
 +
  </a>
 +
  </div>
 +
</div>
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformation:
 +
 
 +
<pre><nowiki>
 +
<div class="blogbody atom:entry" id="008148">
 +
<h3>
 +
  Face to face: why places will continue to exist
 +
</h3>
 +
 
 +
<div class="posted">
 +
  <strong>Brian Micklethwait (London)</strong>
 +
  &nbsp;&nbsp;
 +
  <a href="...">Science &amp; Technology</a>
 +
</div>
 +
 
 +
<div class="atom:content">
 +
  <p>It is not just that I dislike filling in forms....</p>
 +
  ... the first section of the content ...
 +
</div>
 +
 
 +
... this link makes the extended section show ...
 +
<span id="varP8148">
 +
  <img src="http://www.samizdata.net/blog/img/bullet_tri.gif" width="16" height="10" alt="" />
 +
  <a href="..." onclick="showMore(8148,'...');return false;">
 +
  Read more.
 +
  </a>
 +
</span>
 +
 
 +
<div id="varXYZ8148" style="display: none">
 +
  <div class="atom:content">
 +
  <p>The very gadgets – computers linked...</p>
 +
  ... the rest of the extended content ...
 +
  </div>
 +
 
 +
  ... this link makes the extended section hide ...
 +
  <img src="..." width="16" height="10" alt="" />
 +
  <a href="#008148" onclick="showMore(8148,0);return true;">
 +
    Read less.
 +
  </a>
 +
  </div>
 +
</div>
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Note:
 +
* <code>atom:content</code> <code>div</code>s were created for each of the text sections, so that non-content code would not be incorrectly marked
 +
* there are '''two''' <code>atom:content</code> sections; together they make the complete content
 +
* the conversion of <code>&lt;div class="title"></code> to <code>&lt;h3></code>
 +
* the addition of <code>atom:entry</code> as needed
 +
* the removal of the <code>&lt;a name="008148"></code> in favor of placing an <code>id</code> on the <code>atom:entry</code>
 +
* further manipulation of the author could be done
 +
* further manipulation of the category could be done
 +
 
 +
== Entry Permalink ==
 +
 
 +
Permalinks roughly correspond to <code>atom:link</code>. See [[blog-post-examples#Entry_Permalinks]] for examples.
 +
 
 +
A permalink is called '''canonical''' if it is the best representation of the URI for that entry; what counts as the 'best representation' is entirely at the discretion of the weblog's publisher. We recommend that weblogs use canonical URIs because doing so allows "threading" together multiple posts and sources with byte-level comparisons. In general, the canonical URI should be the link used in an Atom entry.
 +
 
 +
===  Microformat Recommendation ===
 +
 
 +
* an Entry MUST NOT have more than one permalink marked as <code>atom:link</code>
 
* permalinks SHOULD be absolute URIs
 
* permalinks SHOULD be canonical
* permalinks SHOULD be the same as the <code>atom:link</code> used in syndication feeds
 +
 
 +
=== Example Transformations ===
 +
 
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<div class="entry">
 +
<h3 id="post-45">
 +
  <a
 +
  href="http://www.microformats.org/blog/2005/08/21/foobar-microformats/"
 +
  rel="bookmark"
 +
  title="Permanent Link to FooBar Microformats">FooBar Microformats</a>
 +
  </h3>
 +
  ...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Transformation:
 +
 
 +
<pre><nowiki>
 +
<div class="atom:entry entry">
 +
<h3 id="post-45">
 +
  <a
 +
  href="http://www.microformats.org/blog/2005/08/21/foobar-microformats/"
 +
  rel="atom:link bookmark"
 +
  title="Permanent Link to FooBar Microformats">FooBar Microformats</a>
 +
  </h3>
 +
  ...
 +
</div>
 +
</nowiki></pre>
 +
 
 +
Original:
 +
 
 +
<pre><nowiki>
 +
<h3>YET ANOTHER INSTANCE OF THE WORLD FINALLY CATCHING UP TO THE BLOG</h3>
 +
<p>Today's news: Neuticles win ... award.</p>
 +
<p class="posted">
 +
Posted by judi on October  7, 2005 at 05:00 PM |
 +
<a href="http://blogs.herald.com/dave_barrys_blog/2005/10/yet_another_ins.html">Permalink</a>
 +
</p>
 +
</nowiki></pre>
 +
 
 +
Transformation:
 +
 
 +
<pre><nowiki>
 +
<div class="atom:entry">
 +
<h3>YET ANOTHER INSTANCE OF THE WORLD FINALLY CATCHING UP TO THE BLOG</h3>
 +
<p>Today's news: Neuticles win ... award.</p>
 +
<p class="posted">
 +
Posted by judi on October  7, 2005 at 05:00 PM |
 +
<a rel="atom:link" href="http://blogs.herald.com/dave_barrys_blog/2005/10/yet_another_ins.html">Permalink</a>
 +
</p>
</div>
 +
</nowiki></pre>
 +
 
 +
== Entry Datetimes - Creation and Modified ==
 +
Weblogs typically display (in HTML) the creation time of their posts (roughly but not exactly corresponding to <code>atom:published</code>) and not so much the last modified time (<code>atom:updated</code>).
  
Also see [[datetime-design-pattern]] for more information on specifying datetimes. The recommendation here is styled after datetimes in [[hcalendar]].
 +
 
 +
=== Forms seen in the wild ===
 +
See [[blog-post-formats#Datetimes]]
 +
 
 +
=== Microformat Recommendation ===
 +
 
 +
* date headers between weblog entries are outside of this microformat
 +
* <code>atom:published</code> SHOULD be indicated by an <code>abbr</code> element around the human readable version of the date or datetime.
 +
** the 'class' attribute MUST indicate <code>atom:published</code>
 +
** the 'title' attribute MUST be a complete datetime, in the format of [[datetime-design-pattern]]
 +
* likewise for <code>atom:updated</code>, if present
 +
 
 +
=== Example transformation ===
 +
 
 +
''Note that the string 'atom:published' is a placeholder for something to be decided later.''
 +
 
 +
Original:
 +
 
 +
<pre>
 +
<a href="...">Friday, September 30th, 2005 at 12:31 pm</a>
 +
</pre>
 +
 
 +
Transformed:
 +
 
 +
<pre>
 +
<a href="..."><abbr
class="atom:published"
title="20050930T12:31:01-0500">Friday, September 30th, 2005 at 12:31 pm</abbr></a>
 +
</pre>
 +
 
 +
== Entry Author ==
 +
A work in progress
 +
 
 +
=== Microformat Recommendation ===
 +
 
 +
* Entry Authors SHOULD be inside a <code>&lt;address></code> block
 +
 
 +
= Possible Atom to microformat(s) mapping =
 +
 
 +
* feed - "hfeed"
 +
** title - imply from &lt;title&gt; element
 +
** subtitle - re-use "description" per vCalendar, iCalendar, [[hcalendar|hCalendar]], [[xfolk|xFolk]], and [[hreview|hReview]].
 +
** id - imply from page URL
 +
** updated - "updated"
 +
** author - "author", if none found, imply from &lt;address&gt; (which SHOULD be used anyway), either way, MUST be an [[hcard|hCard]].
 +
** generator - set by the converting script / XSLT, omit from hAtom.  Similar to PRODID in [[hcalendar|hCalendar]].
 +
** logo - re-use "logo" from [[hcard|hCard]]
 +
** icon - define new [[rel-icon]] (see XHTML2) for this
 +
** category - [[rel-tag]] + [[rel-directory]]
 +
** rights - [[rel-license]]
 +
* entry - "hentry"
 +
** title - "headline"
 +
** link - [[rel-bookmark]] from HTML4
 +
** id - imply from permalink
 +
** summary - "excerpt"
 +
** content - "content"
 +
** published - "published"
 +
** updated - "updated"
 +
** author - "author", MUST be [[hcard|hCard]], SHOULD be &lt;address&gt;
 +
** rights - [[rel-license]]
 +
 
 +
== Multiple feeds on a page ==
 +
 
 +
Post hAtom 1.0: support multiple feeds on a single page.  Changes from above.
 +
 
 +
* feed
 +
** title - "headline", same as entry
 +
** id - define new [[rel-canonical]] microformat for this.
 +
** author - "author" required.
 +
* entry
 +
** id - re-use "uid" from [[hcalendar|hCalendar]].
 +
 
 +
=== Discussion ===
 +
 
 +
==== feed title ====
 +
 
 +
I initially thought "fn" would make sense for the feed title. However, having looked at some blogs/feeds, although in many cases the title of the blog/feed *is* its name, this is often not the case.
 +
 
 +
Two examples:
 +
# Some blog titles consist of the blog name and the date, e.g. Scripting News does this.
# Some blog titles consist of the blog name and a short temporary phrase or saying
 +
 
 +
I have seen both of these in the wild often enough to believe that blog title and blog name are not the same, thus it is inappropriate to re-use "fn" from [[hcard|hCard]], since the feed title does not mean the same thing as the *name* of the feed.  Thus I have removed the suggestion to re-use "fn" for feed title, and instead propose re-using "headline" from the entry, which does appear to have the same semantic.
 +
 
 +
== Additional Possibilities ==
 +
 
 +
More post hAtom 1.0 thoughts:
 +
 
 +
* entry
 +
** summary - "excerpt" or "abstract"
 +
** contributor - "contributor"
 +
** source - use &lt;blockquote cite=""&gt;, put source in cite attribute.
 +
 
 +
= Possible Uses =
  
 
This section describes potential applications for a blog post microformat
 
== Transformational Uses ==
 
By transformational, we mean feeding a weblog post to some sort of transformation tool (such as XSLT) to produce a different version of the post fit for a different use.
 
=== Printing Weblog Posts ===
 
=== Reblogging ===
 +
 
 +
* [http://blogs.zdnet.com/BTL/?p=2052&part=rss&tag=feed&subj=zdblog ZDNet] has a reblog button that would be made obsolete (or could be substantially improved) by use of this microformat
 +
* [http://reblg.com/ Reblog.com] was the inspiration for this idea. This may be renamed [http://redirectthis.com/ RedirectThis]?
  
== Archival Uses ==
 
By 'archival', we mean taking weblog entries and placing them in a database for later analysis, searching, aggregation and so forth.
 
=== Personal Database ===
 
=== Search Engines ===
 +
 
 +
=== Partial Text Blogs ===
 +
Partial content blogs can be created by producing the full html content of a blog entry but not marking it up as such. The atom:summary portion of that entry can be marked up as summary, or could be written up and placed in a hidden block element within the html. hAtom parsers would ignore the unannotated content and produce summary information only.
  
=Obstacles=
 
==Header Tag for Entry Title?==
 
--[[User:Bryan|Bryan]] 14:55, 14 Aug 2005 (PDT)
  
:Whether an h3 or h1 is used is irrelevant, the semantics will be applied with classnames. This is a non-issue. --[[User:RyanKing|RyanKing]] 22:35, 18 Aug 2005 (PDT)
 
 
 +
 
 +
=See Also=
 +
* [[hatom|hAtom]] - the draft proposal
 +
* [[hatom-issues]] - problems? complaints? ideas? Put them here
 +
* [[hatom-faq]] - knowledge base
 +
* [[blog-post-brainstorming]]
 
* [[blog-post-formats]]
 
 +
* [[blog-post-examples]]
 +
* [[blog-description-format]] - how to describe a blog (as opposed to the individual entries, which is what we're doing here)
 +
 
* [http://blogs.oreillynet.com/beasts/archives/2005/10/blog_post_microformat_proposal.html Blog Post Microformat Proposal] Some thoughts on the topic with useful illustrations.
 
 +
* [http://dannyayers.com/archives/2005/08/27/hatom-no-seriously/ Danny Ayers] proposes the name hAtom and some applications
 +
* [http://torrez.us/archives/2005/10/07/404 Elias Torres] says we need 'hAtom'

Latest revision as of 06:18, 27 December 2008

Discussion Participants

Editors

Authors

Purpose

The 'blog-post-microformat' proposes a codification of how blog posts are identified within weblogs. It is hoped that the proposal will be considered 'expansive': for example, it could be used on CNN.com to mark up news articles and summary pages.

Terminology

This section explores the terminology that should be used to discuss a blog post microformat. To make it easier to talk about the various types of terminology, we're using an XML-like namespace notation so we can make statements like atom:entry is roughly equivalent to rss2:item, atom:feed/atom:link@rel=alternate is roughly equivalent to rss2:channel/rss2:link, or atom:author is not equivalent to rss2:item/rss2:author (because the RSS 2.0 author element holds only an email address).

Common terminology in weblogs

Reviewing blog-post-formats#Tools, one can see that there's little standardization amongst tools, or even within an individual tool (such as 'blogger'), for the names of elements of blog posts. There are however many common elements, including:

  • a container for all posts/entries
  • a container for individual posts
  • the post content, which can be complete, summarized with a link to the complete post, or a couple of paragraphs with javascript/CSS tricks to reveal the remainder of the content
  • the name of the author
  • the posting date (in many many formats)

Although this looks like a bit of a dog's breakfast, there is usually a fair amount of rigour behind the presentation, as Atom and/or RSS feeds can be produced also from the same tools.

Furthermore, in developing a microformat for weblog posts, we want to be careful not to break any (or many) templates. Note that many weblog templates will have to be updated as they produce somewhat crufty HTML rather than shiny XHTML.

Atom Terminology

See here for the spec and blog-post-formats#Atom for analysis.

  • atom:feed - (composite) a collection of entries plus information about them
    • atom:author - (composite) the author of a feed (may contain atom:email, atom:name, atom:uri)
    • atom:id - a permanent identifier for a feed
    • atom:title - the title of an atom:entry or an atom:feed
    • atom:updated - the last time the feed was updated
    • atom:link@rel=alternate - the home page of a feed
    • atom:link@rel=self - the URI of the feed (where it can be downloaded)
    • atom:entry - (composite) an entry within the feed
      • atom:content - the entry's content
      • atom:summary - a summary of the entry's content
      • atom:entry/link - the permanent URI of the entry
      • atom:published - the time of the initial creation or first availability of the entry

RSS 2.0 Terminology

See here for the spec and blog-post-formats#RSS for analysis. There are a lot more elements in RSS but this covers the most commonly used ones.

  • rss2:channel - (composite) a collection of entries plus information about them
    • rss2:author - (composite) the author of a feed (may contain atom:email, atom:name, atom:uri)
    • rss2:link - The URL to the HTML website corresponding to the channel (compare to atom:link@rel=alternate)
    • rss2:title - the title of an rss2:channel or an rss2:item
    • rss2:pubDate - The publication date for the content in the channel.
    • rss2:item - (composite) an entry within the feed
      • rss2:item/link - The URL of the item. Note that this may not be a permalink for the item; it may be a link to some other page on the Internet that the rss2:item is about
      • rss2:description - The item synopsis [sic]. There is no special indication whether this is the full content of an entry, a summary, or a precis of what the rss2:item/link is pointing to
      • rss2:author - email address of the author of the item

Recommendation

Atom has a much more precise mechanism for defining syndication feeds and weblog data. A mechanical transformation from Atom -> RSS will always lead to a correct RSS feed; an RSS -> Atom translation would have to choose amongst multiple interpretations and may not always choose correctly, for example for the format of markup, the role of an author, or the meaning of a link.

IMPORTANT: we shall talk about things such as 'marking elements atom:feed'; consider this a purely conceptual thing. The text 'atom:feed' will not appear in the XHTML microformat -- we may decide later to use the actual phrase 'atom_feed', 'feed', 'items' or 'googlybear'. In the case where there is no clear or applicable atom terminology, we shall use 'weblog:xxx'.

Discovered Elements

This section explores the information discovered from Current Blog Formats using the terminology discussed above. We will only focus on the major elements of weblog posts:

  • the EntryGroup
  • the individual Entry
  • the Entry Title
  • the Entry Content
  • the Entry Permalink
  • the Entry Datetimes

For now, the codification of the following major elements will be deferred as there is/may be overlap with other microformats that should be explored further

  • the EntryGroup Title
  • the EntryGroup Permalink
  • the Entry Poster/Author - in particular, should hcard be used?

Further input from the community would be appreciated here

EntryGroup

Roughly speaking, this corresponds to 'atom:feed' or 'rss2:channel' (in particular, the items within those XML elements). See blog-post-examples#EntryGroup for the various forms seen in the wild.

Microformat Recommendation

  • weblog pages (including home pages, archives, category pages, tag pages and so forth) that may contain multiple entries MUST enclose the entries in an atom:feed element
  • weblog pages MAY have multiple atom:feed elements enclosing different groups of entries
  • atom:feed elements MUST NOT be nested
  • weblog pages that have exactly one entry MAY use an atom:feed element
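
The nesting rule above can be checked mechanically. What follows is only a sketch in Python using the standard library's html.parser, not part of the proposal: the names FeedNestingChecker and check_feeds are hypothetical, 'atom:feed' is still the placeholder class name, and the page is assumed to be well-formed (every non-void element is closed).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class FeedNestingChecker(HTMLParser):
    """Counts atom:feed containers and flags any that sit inside another one."""

    def __init__(self):
        super().__init__()
        self.stack = []       # one bool per open element: is it an atom:feed?
        self.feed_count = 0
        self.nested = False

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        is_feed = "atom:feed" in classes
        if is_feed:
            self.feed_count += 1
            if any(self.stack):   # an enclosing atom:feed is still open
                self.nested = True
        self.stack.append(is_feed)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()


def check_feeds(html):
    """Return (number of atom:feed elements, whether any of them is nested)."""
    checker = FeedNestingChecker()
    checker.feed(html)
    checker.close()
    return checker.feed_count, checker.nested
```

A page with two sibling atom:feed blocks passes this check; a page with one atom:feed inside another is flagged.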

Example Transformation

Note that the string 'atom:feed' is a placeholder for something to be decided later.

Original (obviously, if there is no existing EntryGroup block element, one can be added):

<div id="content">
 <h2 id="home-title">
  Latest microformats news 
  <a href="http://www.microformats.org/feed/" title="link to RSS feed" id="feed-link">
   <img src="/img/xml.gif" width="23" height="13" alt="XML" />
  </a>
 </h2>

 <div class="entry">
  <h3 id="post-60">
   <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
  </h3>
  ...
 </div>

 ...
</div>

Transformed:

<div id="content" class="atom:feed">
 <h2 id="home-title">
  Latest microformats news 
  <a href="http://www.microformats.org/feed/" title="link to RSS feed" id="feed-link">
   <img src="/img/xml.gif" width="23" height="13" alt="XML" />
  </a>
 </h2>

 <div class="entry">
  <h3 id="post-60">
   <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
  </h3>
  ...
 </div>

 ...
</div>

EntryGroup Title

Not covered by this proposal yet.

EntryGroup Permalink

Not covered by this proposal yet.

Individual Entry

This corresponds almost exactly to the atom:entry or rss2:item elements. See blog-post-examples#Individual_Entry for the various forms seen in the wild.

Microformat Recommendation

  • weblog entries MUST be enclosed in a single atom:entry element
  • atom:entry elements MUST NOT be nested
  • atom:entry MUST NOT belong to more than one atom:feed element

Example Transformation

Note that the string 'atom:entry' is a placeholder for something to be decided later.

Entries in existing block

Original:

 <div class="entry">
  <h3 id="post-60">
   <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
  </h3>
  ... rest of entry ...
 </div>

Transformed:

<div class="atom:feed">
 <div class="atom:entry entry">
  <h3 id="post-60">
   <a href="http://www.microformats.org/blog/2005/...">Wiki Attack</a>
  </h3>
  ... rest of entry ...
 </div>
 ... additional entries ...
</div>

Entries not in an existing block

Original:

 <a name="112877372228959075">&nbsp;</a>
 <br>
  <strong>Just one problem, Minister.</strong> Last week, Bill Rammell, 
 <br>

Transformed:

<div class="atom:feed">
 <div class="atom:entry" id="112877372228959075">
  <br />
   <strong>Just one problem, Minister.</strong> Last week, Bill Rammell, 
  <br />
  ... rest of entry ...
 </div>
 ... additional entries ...
</div>

Note that the following additional changes were also made:

  • <br> was made XHTML compliant
  • <a name="..."> was converted to an id="..." (confirm this is OK)

Disjointed entries

Ignore any existing blocks and treat as the previous case of no block.

Entry Title

This corresponds almost exactly to the atom:title or rss2:title elements. See blog-post-formats#Titles for examples, from which we see that there are two fundamental ways titles are used in the wild: at the block level and inline. Our proposal must be capable of handling both forms.

Microformat Recommendation

  • an atom:entry SHOULD have at most one title
  • block level titles SHOULD be represented using <h#>; the first such element in an atom:entry should be considered the title, and it need not be marked up or identified in any other way as the title
  • inline titles MUST be marked as atom:title; it is also possible to do this using block level formatting such as <div>, but this is discouraged
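
To make the title rules concrete, here is a rough Python sketch of how a consumer might find an entry's title. It is an illustration under assumptions, not a reference parser: TitleFinder and entry_title are hypothetical names, 'atom:title' is the placeholder class name, and the sketch simply takes whichever comes first in document order, an <h#> element or an element marked atom:title (the recommendation does not say which should win if both are present).

```python
import re
from html.parser import HTMLParser

HEADING = re.compile(r"h[1-6]$")


class TitleFinder(HTMLParser):
    """Captures the text of the first heading or atom:title element."""

    def __init__(self):
        super().__init__()
        self.title = None    # text of the first title element found
        self._tag = None     # tag name of the title element being read
        self._depth = 0      # nesting depth of same-named tags inside it
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if self._tag is not None:
            if tag == self._tag:
                self._depth += 1
            return
        if self.title is not None:
            return           # a title was already found; ignore later ones
        classes = (dict(attrs).get("class") or "").split()
        if HEADING.match(tag) or "atom:title" in classes:
            self._tag, self._depth, self._parts = tag, 1, []

    def handle_endtag(self, tag):
        if self._tag is not None and tag == self._tag:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace in the collected text.
                self.title = " ".join("".join(self._parts).split())
                self._tag = None

    def handle_data(self, data):
        if self._tag is not None:
            self._parts.append(data)


def entry_title(html):
    """Return the title text of one entry's markup, or None if there is none."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return finder.title
```

Both the block-level form and the inline <strong class="atom:title"> form go through the same function, which is the point of allowing either in the recommendation.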

Discussion: why not always <h#>?

Using CSS display: inline, block level elements can be converted to inline elements. Unfortunately, we cannot nest <h#> inside of a <p> block to achieve the correct effect. I.e. we cannot convert ...

<p><strong>The Title</strong>: The Text...</p>

... into ...

<p><h3 style="display: inline">The Title</h3>: The Text ...</p>

... because the XHTML will not validate. Also...

<h3 style="display: inline">The Title</h3><p>: The Text ...</p>

... will not work because the presentation will differ from what the user intends (the <p> will introduce a line break).

Example Transformation

Note that the string 'atom:title' is a placeholder for something to be decided later.

Header in <h#> block

Original (and Final):

<div class="atom:entry">
 <h2 id="post-59">Web Essentials Audio</h2>
 ... rest of entry ...
</div>

No transformation is needed -- the blog-post microformat will recognize this as the atom:title.

Header in other block element

Original:

<div class="atom:entry">
 <div class="header">Web Essentials Audio</div>
</div>

Transformed (the header level is to taste):

<div class="atom:entry">
 <h3>Web Essentials Audio</h3>
</div>

It is possible to add class="atom:title" to the div but we recommend against it. However, we recognize that there may be certain BB Tools for which making this change is too difficult.

Header in inline element

Original:

 <div class="atom:entry" id="112877372228959075">
  <br />
   <strong>Just one problem, Minister.</strong> Last week, Bill Rammell, 
  <br />
  ... rest of entry ...
 </div>

Transformed:

 <div class="atom:entry" id="112877372228959075">
  <br />
   <strong class="atom:title">Just one problem, Minister.</strong> Last week, Bill Rammell, 
  <br />
  ... rest of entry ...
 </div>

Entry Content

This roughly corresponds to atom:content and/or atom:summary elements. See blog-post-examples#Entry_Content for the various forms seen in the wild.

Discussion: the split content problem

The last item above (content broken into multiple sections) introduces a few unique problems. It is not sufficient to enclose all the different content sections in an atom:content element, as the following example illustrates:

 <div class="atom:entry">
  <div class="atom:content">
   ... first part of the content ...
   "Read More"
   ... second part of the content ...
  </div>
 </div>

"Read More" is not part of the content! Therefore, we propose that multiple content sections be allowed in a single atom:entry. The concatenation of all these content blocks will define the complete content:

 <div class="atom:entry">
  <div class="atom:content">
   ... first part of the content ...
  </div>
   "Read More"
  <div class="atom:content">
   ... second part of the content ...
  </div>
 </div>

The same argument is applicable to atom:summary.

Microformat Recommendation

  • an atom:entry MAY have zero or more atom:summary sections. There is no requirement that different representations of the same entry (on different URIs) use the same summaries.
  • an atom:entry MAY have zero or more atom:content sections. The serial concatenation of all the atom:content sections within the entry MUST represent the complete content of the entry. Note that the rule here is slightly different than Atom which only allows one atom:content.
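
The concatenation rule can be sketched in a few lines of Python. This is only an illustration under assumptions: ContentCollector and entry_content are hypothetical names, 'atom:content' is the placeholder class name, the markup is assumed to be well-formed, and only the text is collected (a real consumer would presumably keep the markup as well).

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they are not tracked on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ContentCollector(HTMLParser):
    """Collects the text of every atom:content section, in document order."""

    def __init__(self):
        super().__init__()
        self.sections = []   # text of each completed atom:content section
        self._stack = []     # one bool per open element: is it atom:content?
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        is_content = "atom:content" in classes
        if is_content and not any(self._stack):
            self._parts = []             # a new outermost section starts here
        self._stack.append(is_content)

    def handle_endtag(self, tag):
        if tag in VOID or not self._stack:
            return
        was_content = self._stack.pop()
        if was_content and not any(self._stack):
            # The outermost section just closed: store its collapsed text.
            self.sections.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if any(self._stack):             # only text inside atom:content counts
            self._parts.append(data)


def entry_content(html):
    """Return the complete content: all atom:content sections joined in order."""
    collector = ContentCollector()
    collector.feed(html)
    collector.close()
    return " ".join(collector.sections)
```

With the split-content example above, the "Read More" link sits between the two sections and is left out of the result.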

Example Transformation

Note that the strings 'atom:summary' and 'atom:content' (etc.) are placeholders for something to be decided later.

Entry with summary content

Original:

<div class="inlineBlog">
 <h3 id="a003068">
  <a href="http://thecommunityengine.com/h.../xfolk_vegomatic.html" class="taggedlink">xFolk Veg-o-matic Alpha</a>
 </h3>
 <p class="abstract extended">
  We provide a way to surf the web and slice and dice information you find there into your own custom output stream.
 </p>
 ... some tag and category stuff ...
 <p>
  The folks at ... the rest of the content
 </p>
 <p class="extended">
  <a href="http://thecommunityengine.com/.../xfolk_vegomatic.html#more">Continue reading "xFolk Veg-o-matic Alpha"</a>
 </p>
 ...
</div>

Transformation:

<div class="inlineBlog atom:entry">
 <h3 id="a003068">
  <a href="http://thecommunityengine.com/h.../xfolk_vegomatic.html" class="taggedlink atom:permalink">xFolk Veg-o-matic Alpha</a>
 </h3>
 <p class="abstract extended">
  We provide a way to surf the web and slice and dice information you find there into your own custom output stream.
 </p>
 ... some tag and category stuff ...
 <div class="atom:summary">
  <p>
   The folks at ... the rest of the content
  </p>
 </div>
 <p class="extended">
  <a href="http://thecommunityengine.com/.../xfolk_vegomatic.html#more">Continue reading "xFolk Veg-o-matic Alpha"</a>
 </p>
 ...
</div>

Notes:

  • we didn't do anything with the "abstract" section -- this is a discussion for another day
  • we didn't include the tag stuff in the summary, and probably wouldn't if this was the complete content

Entry with complete content

Original:

<div class="entry single">
 <h2 id="post-61">Class attributes are about more than styling</h2>

 <p>When people talk about microformats, ... </p>
 
 <blockquote cite="http://www.w3.org/TR/REC-html40/struct/global.html#h-7.5.2">
  ... quoted text from elsewhere
 </blockquote>

 <p>There’s a couple of points I’d like to highlight here:</p>
 
 ... more content ...

 <h4 class="tags">Technorati Tags:</h4>
 <ul class="tags">
  <li><a href="http://www.technorati.com/tag/css" rel="tag">css</a></li>
  ...
 </ul>

 <ul class="post-info">
  ... footer stuff ...
 </ul>
</div>

Transformation:

<div class="entry single atom:entry">
 <h2 id="post-61">Class attributes are about more than styling</h2>

 <div class="atom:content">
  <p>When people talk about microformats, ... </p>

  <blockquote cite="http://www.w3.org/TR/REC-html40/struct/global.html#h-7.5.2">
   ... quoted text from elsewhere
  </blockquote>

  <p>There’s a couple of points I’d like to highlight here:</p>

  ... more content ...
 </div>

 <h4 class="tags">Technorati Tags:</h4>
 <ul class="tags">
  <li><a href="http://www.technorati.com/tag/css" rel="tag">css</a></li>
  ...
 </ul>

 <ul class="post-info">
  ... footer stuff ...
 </ul>
</div>

Notes:

  • the only thing that really needed to be done is enclose the content
  • my preference would be to move the post id to the atom:entry

Entry with split content (multiple sections)

Original:

<div class="blogbody">
 <a name="008148"></a>

 <div class="title">
  Face to face: why places will continue to exist
 </div>

 <div class="posted">
  <strong>Brian Micklethwait (London)</strong>
    
  <a href="...">Science & Technology</a>
 </div>

 <p>It is not just that I dislike filling in forms....</p>
 ... the first section of the content ...

 ... this link makes the extended section show ...
 <span id="varP8148">
  <img src="http://www.samizdata.net/blog/img/bullet_tri.gif" width="16" height="10" alt="" />
  <a href="..." onclick="showMore(8148,'...');return false;">
   Read more.
  </a>
 </span>
  
 <div id="varXYZ8148" style="display: none">
  <p>The very gadgets – computers linked...</p>
  ... the rest of the extended content ...

  ... this link makes the extended section hide ...
  <img src="..." width="16" height="10" alt="" />
   <a href="#008148" onclick="showMore(8148,0);return true;">
    Read less.
   </a>
  </div>
 </div>
</div>

Transformation:

<div class="blogbody atom:entry" id="008148">
 <h3>
  Face to face: why places will continue to exist
 </h3>

 <div class="posted">
  <strong>Brian Micklethwait (London)</strong>
    
  <a href="...">Science & Technology</a>
 </div>

 <div class="atom:content">
  <p>It is not just that I dislike filling in forms....</p>
  ... the first section of the content ...
 </div>

 ... this link makes the extended section show ...
 <span id="varP8148">
  <img src="http://www.samizdata.net/blog/img/bullet_tri.gif" width="16" height="10" alt="" />
  <a href="..." onclick="showMore(8148,'...');return false;">
   Read more.
  </a>
 </span>
  
 <div id="varXYZ8148" style="display: none">
  <div class="atom:content">
   <p>The very gadgets – computers linked...</p>
   ... the rest of the extended content ...
  </div>

  ... this link makes the extended section hide ...
  <img src="..." width="16" height="10" alt="" />
  <a href="#008148" onclick="showMore(8148,0);return true;">
   Read less.
  </a>
 </div>
</div>

Note:

  • atom:content divs were created for each of the text sections, so that non-content code would not be incorrectly marked
  • there are two atom:content sections; together they make the complete content
  • the conversion of <div class="title"> to <h3>
  • the addition of atom:entry as needed
  • the removal of the <a name="008148"> in favor of placing an id on the atom:entry
  • further manipulation of the author could be done
  • further manipulation of the category could be done

Entry Permalink

Permalinks roughly correspond to atom:link. See blog-post-examples#Entry_Permalinks for examples.

A permalink is called canonical if it is the best representation of the URI for that entry; what counts as the 'best representation' is entirely at the discretion of the weblog's publisher. We recommend that weblogs use canonical URIs because doing so allows "threading" together multiple posts and sources with byte-level comparisons. In general, the canonical URI should be the link used in an Atom entry.

Microformat Recommendation

  • an Entry MUST NOT have more than one permalink marked as atom:link
  • permalinks SHOULD be absolute URIs
  • permalinks SHOULD be canonical
  • permalinks SHOULD be the same as the atom:link used in syndication feeds
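
As a rough illustration of these rules, the Python sketch below pulls the permalink out of one entry and reports violations of the first two rules. The names PermalinkFinder and entry_permalink are hypothetical, rel="atom:link" is still the placeholder value, and "absolute" is approximated as "has a scheme and a host", which is an assumption that suits http(s) links only. Whether a link is canonical cannot be checked from the markup alone.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PermalinkFinder(HTMLParser):
    """Collects the href of every <a> whose rel list contains atom:link."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()
        if "atom:link" in rels and attrs.get("href"):
            self.links.append(attrs["href"])


def entry_permalink(html):
    """Return (first permalink or None, list of rule violations found)."""
    finder = PermalinkFinder()
    finder.feed(html)
    finder.close()
    problems = []
    if len(finder.links) > 1:
        problems.append("more than one permalink marked atom:link")
    link = finder.links[0] if finder.links else None
    if link is not None:
        parts = urlparse(link)
        if not (parts.scheme and parts.netloc):
            problems.append("permalink is not an absolute URI")
    return link, problems
```

Because rel is a space-separated list, an existing rel="bookmark" can stay in place next to the new value.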

Example Transformations

Original:

<div class="entry">
 <h3 id="post-45">
  <a 
   href="http://www.microformats.org/blog/2005/08/21/foobar-microformats/" 
   rel="bookmark"
   title="Permanent Link to FooBar Microformats">FooBar Microformats</a>
  </h3>
   ...
</div>

Transformation:

<div class="atom:entry entry">
 <h3 id="post-45">
  <a 
   href="http://www.microformats.org/blog/2005/08/21/foobar-microformats/" 
   rel="atom:link bookmark"
   title="Permanent Link to FooBar Microformats">FooBar Microformats</a>
  </h3>
   ...
</div>

Original:

<h3>YET ANOTHER INSTANCE OF THE WORLD FINALLY CATCHING UP TO THE BLOG</h3>
<p>Today's news: Neuticles win ... award.</p>
<p class="posted">
Posted by judi on October  7, 2005 at 05:00 PM |
<a href="http://blogs.herald.com/dave_barrys_blog/2005/10/yet_another_ins.html">Permalink</a>
</p>

Transformation:

<div class="atom:entry">
 <h3>YET ANOTHER INSTANCE OF THE WORLD FINALLY CATCHING UP TO THE BLOG</h3>
 <p>Today's news: Neuticles win ... award.</p>
 <p class="posted">
 Posted by judi on October  7, 2005 at 05:00 PM |
 <a rel="atom:link" href="http://blogs.herald.com/dave_barrys_blog/2005/10/yet_another_ins.html">Permalink</a>
 </p>
</div>

Entry Datetimes - Creation and Modified

Weblogs typically display (in HTML) the creation time of their posts (roughly but not exactly corresponding to atom:published) and not so much the last modified time (atom:updated).

Also see Datetime Design Pattern for more information on specifying datetimes. The recommendation here is styled after datetimes in hCalendar 1.0.

Forms seen in the wild

See blog-post-formats#Datetimes

Microformat Recommendation

  • date headers between weblog entries are outside of this microformat
  • atom:published SHOULD be indicated by an abbr element around the human readable version of the date or datetime.
    • the 'class' attribute MUST indicate atom:published
    • the 'title' attribute MUST be a complete datetime, in the format of Datetime Design Pattern
  • likewise for atom:updated, if present
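
A consumer could read these values back roughly as follows. This is a sketch under assumptions: entry_datetimes and DatetimeFinder are hypothetical names, the class names are the placeholders used on this page, and the title is assumed to follow the layout YYYYMMDD, 'T', HH:MM:SS and a +/-HHMM offset; other layouts allowed by the Datetime Design Pattern would need additional format strings.

```python
from datetime import datetime
from html.parser import HTMLParser

WANTED = ("atom:published", "atom:updated")


class DatetimeFinder(HTMLParser):
    """Records the title of the first <abbr> seen for each datetime class."""

    def __init__(self):
        super().__init__()
        self.raw = {}

    def handle_starttag(self, tag, attrs):
        if tag != "abbr":
            return
        attrs = dict(attrs)
        for name in (attrs.get("class") or "").split():
            if name in WANTED and attrs.get("title"):
                self.raw.setdefault(name, attrs["title"])


def entry_datetimes(html):
    """Return a dict mapping class name to a timezone-aware datetime."""
    finder = DatetimeFinder()
    finder.feed(html)
    finder.close()
    # Assumed layout: YYYYMMDD 'T' HH:MM:SS followed by a +/-HHMM offset.
    return {name: datetime.strptime(value, "%Y%m%dT%H:%M:%S%z")
            for name, value in finder.raw.items()}
```

The human-readable text inside the abbr is never parsed; only the title attribute is, which is why the recommendation requires it to be a complete datetime.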

Example transformation

Note that the string 'atom:published' is a placeholder for something to be decided later.

Original:

<a href="...">Friday, September 30th, 2005 at 12:31 pm</a>

Transformed:

<a href="..."><abbr 
 class="atom:published" 
 title="20050930T12:31:01-0500">Friday, September 30th, 2005 at 12:31 pm</abbr></a>

Entry Author

A work in progress

Microformat Recommendation

  • Entry Authors SHOULD be inside a <address> block

Possible Atom to microformat(s) mapping

  • feed - "hfeed"
    • title - imply from <title> element
    • subtitle - re-use "description" per vCalendar, iCalendar, hCalendar, xFolk, and hReview.
    • id - imply from page URL
    • updated - "updated"
    • author - "author"; if none is found, imply from <address> (which SHOULD be used anyway). Either way, it MUST be an hCard.
    • generator - set by the converting script / XSLT, omit from hAtom. Similar to PRODID in hCalendar.
    • logo - re-use "logo" from hCard
    • icon - define new rel-icon (see XHTML2) for this
    • category - rel="tag" + rel-directory
    • rights - rel="license"
  • entry - "hentry"
    • title - "headline"
    • link - rel-design-pattern from HTML4
    • id - imply from permalink
    • summary - "excerpt"
    • content - "content"
    • published - "published"
    • updated - "updated"
    • author - "author", MUST be hCard, SHOULD be <address>
    • rights - rel="license"
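
To make the entry half of this mapping concrete, the following sketch renders one entry with the proposed class names. It is only an illustration under stated assumptions: the function name render_hentry and the dictionary ENTRY_CLASSES are hypothetical, rel="bookmark" is one HTML4 link type chosen here to stand in for the unspecified "rel-design-pattern", and author, summary and rights are left out for brevity.

```python
from html import escape

# Proposed Atom entry element -> class name, copied from the list above.
ENTRY_CLASSES = {
    "entry": "hentry",
    "title": "headline",
    "summary": "excerpt",
    "content": "content",
    "published": "published",
    "updated": "updated",
    "author": "author",
}


def render_hentry(title, permalink, published_iso, published_text, content_html):
    """Return an HTML fragment for one entry using the proposed names.

    content_html is inserted as-is (assumed to be trusted markup);
    every other argument is HTML-escaped.
    """
    c = ENTRY_CLASSES
    lines = [
        '<div class="%s">' % c["entry"],
        # rel="bookmark" is an assumption; the mapping only says "rel".
        ' <h3 class="%s"><a rel="bookmark" href="%s">%s</a></h3>'
        % (c["title"], escape(permalink), escape(title)),
        ' <abbr class="%s" title="%s">%s</abbr>'
        % (c["published"], escape(published_iso), escape(published_text)),
        ' <div class="%s">%s</div>' % (c["content"], content_html),
        "</div>",
    ]
    return "\n".join(lines)


# Illustrative values only.
fragment = render_hentry(
    "Cats & Dogs",
    "http://example.org/2005/09/cats-and-dogs",
    "20050930T12:31:01-0500",
    "Friday, September 30th, 2005 at 12:31 pm",
    "<p>Body of the post.</p>",
)
print(fragment)
```

Keeping the names in one table means a later rename (for example of "headline") touches a single line rather than every template.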

Multiple feeds on a page

Post hAtom 1.0: support multiple feeds on a single page. Changes from the mapping above:

  • feed
    • title - "headline", same as entry
    • id - define new rel-canonical microformat for this.
    • author - "author" required.
  • entry

Discussion

feed title

I initially thought "fn" would make sense for the feed title. Having looked at some blogs/feeds, though, I found that while in many cases the title of the blog/feed *is* the name of the blog/feed, this is often not the case.

Two examples:

  1. Some blog titles consist of the blog name and the date, e.g. Scripting News does this.
  2. Some blog titles consist of the blog name and a short temporary phrase or saying.

I have seen both of these in the wild often enough to believe that blog title and blog name are not the same, thus it is inappropriate to re-use "fn" from hCard, since the feed title does not mean the same thing as the *name* of the feed. Thus I have removed the suggestion to re-use "fn" for feed title, and instead propose re-using "headline" from the entry, which does appear to have the same semantic.

Additional Possibilities

More post hAtom 1.0 thoughts:

  • entry
    • summary - "excerpt" or "abstract"
    • contributor - "contributor"
    • source - use <blockquote cite="">, put source in cite attribute.

Possible Uses

This section describes potential applications for a blog post microformat.

Transformational Uses

By transformational, we mean feeding a weblog post to some sort of transformation tool (such as XSLT) to produce a different version of the post fit for a different use.

Printing Weblog Posts

Reblogging

  • ZDNet has a reblog button that would be made obsolete (or could be substantially improved) by use of this microformat
  • Reblog.com was the inspiration for this idea. This may be renamed RedirectThis?

Archival Uses

By 'archival', we mean taking weblog entries and placing them in a database for later analysis, searching, aggregation and so forth.

Personal Database

Search Engines

Partial Text Blogs

Partial content blogs can be created by producing the full HTML content of a blog entry but not marking it up as such. The atom:summary portion of that entry can be marked up as summary, or could be written up and placed in a hidden block element within the HTML. hAtom parsers would ignore the unannotated content and produce summary information only.
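
The parser behaviour described here can be sketched in a few lines. This is a hedged illustration, not a reference hAtom parser: it assumes the summary is marked with the class name "excerpt" from the mapping proposed earlier on this page, assumes well-formed nesting, and uses hypothetical names (ExcerptExtractor, extract_excerpt) and made-up sample markup.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class ExcerptExtractor(HTMLParser):
    """Collect text inside elements whose class list contains the
    summary class; all other (unannotated) text is ignored."""

    def __init__(self, class_name="excerpt"):
        super().__init__()
        self.class_name = class_name
        self.depth = 0  # greater than 0 while inside an annotated element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter an annotated element, or go one level deeper inside one.
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.chunks.append(data)


def extract_excerpt(html):
    """Return the annotated summary text with whitespace collapsed."""
    parser = ExcerptExtractor()
    parser.feed(html)
    return " ".join("".join(parser.chunks).split())


# Illustrative entry: full text present in the page but not annotated.
sample = """
<div class="hentry">
 <p class="excerpt">Short <em>summary</em> of the post.</p>
 <p>Full text of the post, left unannotated on purpose.</p>
</div>
"""
summary = extract_excerpt(sample)
print(summary)
```

A human reader of the page still sees the full post; only a consumer that relies on the class names ends up with the summary alone.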

Obstacles

Header Tag for Entry Title?

--Bryan 14:55, 14 Aug 2005 (PDT)

Many weblog CMSes allow for concurrent publishing of entries in the following ways:

  • multiple entries on a page (an "Index," monthly archive, category archive, etc. see Example)
  • one entry on a page (see Example)

Early attempts at Current Blog Formats have set the title of the blog post to use the h3 tag.

At least where individual entry pages are concerned (and possibly including indexes and archives), I recommend using h1 for the entry title, given that the entry is by far the most important chunk of information on the page, and it's what we'd want search engines to recognize as such. In the case where the h1 was used for the site title, fears about "losing" this information should be allayed by simply including the site name in the title tag, after the title of the article / entry / post.

Whether an h3 or h1 is used is irrelevant; the semantics will be applied with classnames. This is a non-issue. --RyanKing 22:35, 18 Aug 2005 (PDT)


See Also