blog-post-formats: Difference between revisions

= Current Blog Formats =
'''This page needs lots of updating, including:'''
* update participants (or remove)
* move analysis of implicit formats from tools to a section in [[blog-post-examples]]
* review remaining explicit formats for updates
* simplify/flatten sections/hierarchy
[[User:Tantek|Tantek]] 21:06, 3 October 2012 (UTC)
----
{{DISPLAYTITLE: Current Blog Formats }}


There is a need for developing standard classes for blog posts (i.e. a microformat!).

This page serves to document the current list of individual blog post schemas, formats, and efforts as background for the design of a simple blog post microformat.

The result of this exploration is:
* [[hAtom]]

== Discussion Participants ==

=== Editor ===
* [http://blog.factoryjoe.com/ Chris Messina], [http://flock.com Flock, Inc.]

=== Authors ===
* [http://tantek.com/log/ Tantek Çelik], [http://technorati.com/ Technorati]
* [http://blog.factoryjoe.com/ Chris Messina], [http://flock.com Flock, Inc.]
* [http://blogmatrix.blogmatrix.com/ David Janes], [http://www.blogmatrix BlogMatrix, Inc.]
* [http://www.geof.net/ Geof Glass]

=== Interested Folks ===
* [http://theryanking.com/blog/ Ryan King], [http://technorati.com/ Technorati]

== Tools ==
=== Blogger ===
* http://www.blogger.com/home
Blogger is one of the earliest, best-known, and probably most widely used blogging platforms. Blogger was bought by [http://www.google.com Google] in February 2003. Blogger allows users to create and edit their own templates and also provides a large number of (more or less) attractive templates from which the user can select. Unfortunately, you must log into a Blogger account to see the template selection.
Here are several Blogger templates, randomly selected from the presets. More recent templates seem to be converging on a vocabulary for identifying parts of posts. This may be because they share an evolutionary history from a common template. I've included three examples here:
<pre><nowiki>
<body>
<div id="content">
  <div id="main">
  ENTRIES
  </div>
</div>
</body>
<h2 class="date-header">POST DATE</h2>
<div class="post">
<a name="POST #"></a>
<h3 class="post-title">
  <a href="POST URI" title="external link">POST TITLE</a>
</h3>
<p>POST CONTENT</p>
<p class="post-footer">
  <em>posted by AUTHOR @ <a href="POST URI" title="permanent link">POST DATETIME</a></em>
</p>
</div>
</nowiki></pre>
<pre><nowiki>
<body>
<div id="main">
  <div id="main2">
  ENTRIES
  </div>
</div>
</body>
<div class="post">
<a name="POST #"></a>
<h3 class="post-title">
  <a href="sample_post.html" title="permanent link">POST TITLE</a>
</h3>
<div class="post-body">
  POST CONTENT
</div>
<p class="post-footer">
  <em>posted by AUTHOR @ <a href="POST URI" title="permanent link">POST TIME</a></em>
</p>
</div>
</nowiki></pre>
<pre><nowiki>
<body>
<div id="leftcontent">
ENTRIES
</div>
</body>
 
<div class="Post">
<a name="POST #"></a>
POST CONTENT
<span class="PostFooter">
  <a href="POST URI">POST TIME</a>
</span>
</div>
</nowiki></pre>
==== Key Class Names ====
* newer templates seem to use "main" to identify an enclosure for all entries
* newer templates use "post" to identify a weblog entry
* newer templates use "post-title" to identify the entry's title
* beyond this there is little standardization
==== Template Concepts ====
* all posts
* an individual post
* post title
* post author
* post posting time
* post content
* post URI (permalink)
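As a rough illustration of how a tool could rely on the "post", "post-title" and "post-footer" class names seen in the newer Blogger templates, here is a minimal Python sketch. The names <code>BloggerPostParser</code>, <code>extract_posts</code> and <code>SAMPLE_POST</code> are made up for this example, and the sketch assumes the title sits in an h3 and the footer in a p, as in the first two templates above. It is not how Blogger or any existing parser works.

```python
from html.parser import HTMLParser


class BloggerPostParser(HTMLParser):
    """Collect title and footer text for each <div class="post">."""

    def __init__(self):
        super().__init__()
        self.posts = []      # one dict of raw text per post
        self._depth = 0      # <div> nesting depth inside the current post
        self._field = None   # "title" or "footer" while inside that element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div":
            if self._depth:
                self._depth += 1          # nested <div> inside a post
            elif "post" in classes:
                self._depth = 1           # a new post starts here
                self.posts.append({"title": "", "footer": ""})
        elif self._depth and "post-title" in classes:
            self._field = "title"
        elif self._depth and "post-footer" in classes:
            self._field = "footer"

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
        elif tag in ("h3", "p"):
            self._field = None            # title and footer elements end here

    def handle_data(self, data):
        if self._depth and self._field:
            self.posts[-1][self._field] += data


def extract_posts(markup):
    """Return one dict per post with whitespace-normalized title and footer."""
    parser = BloggerPostParser()
    parser.feed(markup)
    parser.close()
    return [{k: " ".join(v.split()) for k, v in post.items()}
            for post in parser.posts]


# Hypothetical input: the first Blogger template above, placeholders left as-is.
SAMPLE_POST = """
<h2 class="date-header">POST DATE</h2>
<div class="post">
 <a name="POST #"></a>
 <h3 class="post-title">
  <a href="POST URI" title="external link">POST TITLE</a>
 </h3>
 <p>POST CONTENT</p>
 <p class="post-footer">
  <em>posted by AUTHOR @ <a href="POST URI" title="permanent link">POST DATETIME</a></em>
 </p>
</div>
"""

if __name__ == "__main__":
    for post in extract_posts(SAMPLE_POST):
        print(post)
```

Note that the author and the posting time cannot be separated without parsing the footer text itself, which is exactly the gap a blog post microformat would close.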
=== Blosxom ===
* http://www.blosxom.com/
* http://pyblosxom.sourceforge.net/


=== Drupal ===
* varies per theme

=== LiveJournal ===
* http://www.livejournal.com/

=== MovableType ===
  </div>
</div>

<div class="content">
  <p align="right">
</nowiki></pre>


==== Key Class Names ====

* "content" can enclose an individual entry or all entries, depending on the context
* there is no clear identification of the post's author
* the permalink is not necessarily on the page anywhere
==== Template Concepts ====
* all posts
* an individual post
* post title
* post author
* post posting time
* post content, which includes the next two
* post content (first part)
* post content (expanded part)
* post URI (permalink)


=== TypePad ===
TypePad is a MovableType hosting service. It provides a list of [http://help.typepad.com/tags/default_templates.html default templates] and [ "template modules"] from which users can construct or modify their own templates. Looking at several TypePad blogs, most or all of them follow the nomenclature and structure defined by these templates.

The standard structure is as follows:

<pre><nowiki>
<body class="layout-two-column-right">
<div id="container">
  <div id="container-inner" class="pkg">
  <div id="pagebody">
    <div id="pagebody-inner" class="pkg">
    <div id="alpha">
      <div id="alpha-inner" class="pkg">
        INDIVIDUAL ENTRY
      </div>
    </div>
    </div>
  </div>
  </div>
</div>
</body>


<div class="entry" id="entry-#####">
  </p>
</div>
</nowiki></pre>


</nowiki></pre>

==== Key Class Names ====
* "entry" encloses all elements within an entry
* "entry-content" contains all the entry text, plus additional text saying "here's more"
* there is no clear identification of "here's all the entries"
* there is no clear identification of the post's author
==== Template Concepts ====
* all posts
* an individual post
* post title
* post author
* post posting time
* post content, which includes the next two
* post content (first part)
* post content (expanded part)
* post URI (permalink)
=== WordPress ===
* http://wordpress.org
WordPress is a popular GPL-licensed blogging system based on PHP and MySQL. WordPress calls its templates "themes" ([http://wordpress.org/extend/themes/ more information]). WordPress does not have a standardized set of class names for identifying parts of the weblog content. I've included a number of examples of what is seen in the wild (move to [[blog-post-examples|examples]]?).
Example 1: [http://nokrev.com/older/ Fresh Bananas] (Ed: dead link)
<pre><nowiki>
<body id="blog">
<div id="wrap">
  <div id="content" class="two_column">
  <div class="left">
    ENTRIES
  </div>
  </div>
</div>
</body>
<h2>POST TITLE</h2>
POST CONTENT (PARTIAL)
<p>
<a href="POST URI" title="Continue reading this post">Continue reading</a>
</p>
</nowiki></pre>
Example 2: [http://www.vanillamist.com/blog/ VanillaMist]
<pre><nowiki>
<body>
<div id="main">
  <div id="content">
  ENTRIES
  </div>
</div>
</body>
<div class="post">
<p class="post-date">Wed 6 Jul 2005
</p>
<div class="post-info">
  <h2 class="post-title">
  <a href="http://vanillamist.com/blog/?p=89" rel="bookmark" title="Permanent Link: Podcasts and a new version of Connections soon">Podcasts and a new version of Connections soon</a>
  </h2>
  Posted by AUTHOR under <a href="POST URI" title="View all posts in Blogs and Blogging" rel="category tag">CATEGORY</a>
  <div class="post-content">
  POST CONTENT
  </div>
  <div class="post-footer">&nbsp;</div>
</div>
</div>
</nowiki></pre>
Example 3: [http://www.aamukaste.org/wpthemes/ Boredom]
<pre><nowiki>
<body>
<div id="content">
</div>
</body>
<div class="post">
<h2 id="post-5">
  <a href="POST URI" rel="bookmark" title="Permanent Link to POST TITLE">POST TITLE</a>
</h2>
<small>POST DATE</small>
<div class="entry">
  POST CONTENT
</div>
<p class="postmetadata">
</p>
</div>
</nowiki></pre>
==== Key Class Names ====
There is very little reuse amongst the various templates selected.
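To make the "little reuse" observation concrete, the following Python sketch collects the class names used in two of the examples above and intersects them. The helper <code>class_names</code> and the two constants are invented for this illustration, and the markup is an abbreviated copy of Examples 2 and 3 rather than the full themes.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Gather every class name that appears in a piece of markup."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def class_names(markup):
    """Return the set of class names used anywhere in the markup."""
    collector = ClassCollector()
    collector.feed(markup)
    collector.close()
    return collector.classes


# Abbreviated from Example 2 (VanillaMist) above.
VANILLAMIST = """
<div class="post">
 <p class="post-date">POST DATE</p>
 <div class="post-info">
  <h2 class="post-title"><a href="POST URI" rel="bookmark">POST TITLE</a></h2>
  <div class="post-content">POST CONTENT</div>
  <div class="post-footer"></div>
 </div>
</div>
"""

# Abbreviated from Example 3 (Boredom) above.
BOREDOM = """
<div class="post">
 <h2 id="post-5"><a href="POST URI" rel="bookmark">POST TITLE</a></h2>
 <small>POST DATE</small>
 <div class="entry">POST CONTENT</div>
 <p class="postmetadata"></p>
</div>
"""

if __name__ == "__main__":
    shared = class_names(VANILLAMIST) & class_names(BOREDOM)
    print(sorted(shared))
```

For these two themes the only shared name is "post". Example 1 uses no class names on the post itself, so it shares none.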
==== Template Concepts ====
* Post
* Title
* Author
* Date
* Content (partial)
* Content (full)
A list of all the template elements is [http://codex.wordpress.org/Template_Tags available here].
A discussion about blog post content published as XML and the nature of the required format in relation to WordPress is [http://www.codescheme.net/2008/03/31/so-where-is-wordpress-blogxml-when-you-need-it/ here]
=== Xanga ===
* http://www.xanga.com/
== Journal Formats ==
Before blogs there were journals.  Many journals were kept merely on people's computers and not necessarily published. 
=== VJOURNAL ===
RFC 2445 (iCalendar) defines the VJOURNAL object for storing journal entries, which are essentially the same as blog posts.  Note that [[hcalendar|hCalendar]], by virtue of referencing all of RFC 2445, could be said to define VJOURNAL class names.
The basic structure of a series of VJOURNAL entries:
<pre><nowiki>
VJOURNAL      - 1
  class        - 0-1
    classparam    - 0-N
    classvalue    - 1; PUBLIC/PRIVATE/CONFIDENTIAL
  created      - 0-1
  description  - 0-1
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1
  dtstart      - 0-1
  dtstamp      - 0-1
  last-mod      - 0-1
  organizer    - 0-1
    cnparam      - 0-1
    dirparam      - 0-1
    sentbyparam  - 0-1
    languageparam - 0-1
    caladdress    - 1
  recurid      - 0-1
  seq          - 0-1
  status        - 0-1
    statvalue    - 1 DRAFT/FINAL/CANCELLED
  summary      - 0-1
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1
  uid          - 0-1
  url          - 0-1
  attach        - 0-N
    fmttype      - 0-1; mime type
    url          - 1; url
  attendee      - 0-N
    cutypeparam  - 0-1
    memberparam  - 0-1
    roleparam    - 0-1
    partstatparam - 0-1
    rsvpparam    - 0-1
    deltoparam    - 0-1
    delfromparam  - 0-1
    sentbyparam  - 0-1
    cnparam      - 0-1
    dirparam      - 0-1
    languageparam - 0-1
    caladdress    - 1
  categories    - 0-N
    languageparam - 0-1
    text          - 1-N; text
  comment      - 0-N
    altrepparam  - 0-1
    languageparam - 0-1
    text          - 1; text
  contact      - 0-N
    altrepparam  - 0-1
    languageparam - 0-1
    text          - 1; text
  exdate        - 0-N
  exrule        - 0-N
  related      - 0-N
    reltypeparam  - 0-1
    text          - other iCalendar component
  rdate        - 0-N
  rrule        - 0-N
  rstatus      - 0-N
</nowiki></pre>
Here are some example VJOURNAL entries from the RFC:
<pre><nowiki>
    BEGIN:VJOURNAL
    UID:19970901T130000Z-123405@host.com
    DTSTAMP:19970901T1300Z
    DTSTART;VALUE=DATE:19970317
    SUMMARY:Staff meeting minutes
    DESCRIPTION:1. Staff meeting: Participants include Joe\, Lisa
      and Bob. Aurora project plans were reviewed. There is currently
      no budget reserves for this project. Lisa will escalate to
      management. Next meeting on Tuesday.\n
      2. Telephone Conference: ABC Corp. sales representative called
      to discuss new printer. Promised to get us a demo by Friday.\n
      3. Henry Miller (Handsoff Insurance): Car was totaled by tree.
      Is looking into a loaner car. 654-2323 (tel).
    END:VJOURNAL
    BEGIN:VCALENDAR
    VERSION:2.0
    PRODID:-//ABC Corporation//NONSGML My Product//EN
    BEGIN:VJOURNAL
    DTSTAMP:19970324T120000Z
    UID:uid5@host1.com
    ORGANIZER:MAILTO:jsmith@host.com
    STATUS:DRAFT
    CLASS:PUBLIC
    CATEGORY:Project Report, XYZ, Weekly Meeting
    DESCRIPTION:Project xyz Review Meeting Minutes\n
      Agenda\n1. Review of project version 1.0 requirements.\n2.
    Definition
      of project processes.\n3. Review of project schedule.\n
      Participants: John Smith, Jane Doe, Jim Dandy\n-It was
      decided that the requirements need to be signed off by
      product marketing.\n-Project processes were accepted.\n
      -Project schedule needs to account for scheduled holidays
      and employee vacation time. Check with HR for specific
      dates.\n-New schedule will be distributed by Friday.\n-
      Next weeks meeting is cancelled. No meeting until 3/23.
    END:VJOURNAL
    END:VCALENDAR
</nowiki></pre>
Here's some analysis of the parts of VJOURNAL that may be of interest to hAtom work. Items in bold are considered of interest to hAtom. The others can be discarded without further discussion, because they're either machine-only or just don't fit into the weblog use case.
* '''vjournal'''
** class - N/A to web, perhaps for archive format
** created - for machines only
** '''description''' - "full text"
** dtstart - when the entry is about - doesn't fit in weblog use case
** '''dtstamp''' - user creation time
** last-mod - machine use only (for the specific host's copy, not the same as 'updated')
** organizer - who organized the meeting (VJOURNAL's use case includes meeting notes)
** recurid - machine only
** seq - machine only
** status - N/A to WWW, consider for archive
** '''summary''' - "title"
** '''UID''' - "id/permalink"
** '''URL''' - "link"
** '''attach''' - "enclosure"
** attendee - who attended the meeting (think meeting notes)
** '''categories''' - "tags"
** comment - comments about the journal
** contact - author???
** exdate - N/A
** exrule - N/A
** '''related''' - just hyperlinks to other resources
** rrule - N/A
** rdate - N/A
** rstatus - N/A
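The mapping in the list above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names and the target field names ("title", "content", "published" and so on) are invented here, property parameters are discarded, quoted parameter values containing colons are not handled, and text escapes such as <code>\n</code> and <code>\,</code> are left untouched. The only iCalendar rule it relies on is line folding, where a line starting with a space or tab continues the previous line.

```python
def unfold(text):
    """Join folded content lines: a leading space or tab continues the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw:
            lines.append(raw)
    return lines


# Bolded VJOURNAL properties from the list above, mapped to illustrative
# (hypothetical) blog post field names.
VJOURNAL_TO_POST = {
    "SUMMARY": "title",
    "DESCRIPTION": "content",
    "DTSTAMP": "published",
    "UID": "id",
    "URL": "link",
    "CATEGORIES": "tags",
}


def parse_vjournal(text):
    """Return one dict per VJOURNAL component, keeping only the mapped properties."""
    entries, current = [], None
    for line in unfold(text):
        name, _, value = line.partition(":")
        name = name.split(";", 1)[0].upper()   # drop parameters such as ;VALUE=DATE
        if name == "BEGIN" and value.upper() == "VJOURNAL":
            current = {}
        elif name == "END" and value.upper() == "VJOURNAL":
            if current is not None:
                entries.append(current)
            current = None
        elif current is not None and name in VJOURNAL_TO_POST:
            current[VJOURNAL_TO_POST[name]] = value
    return entries


# Shortened from the first RFC example above.
SAMPLE_VJOURNAL = (
    "BEGIN:VJOURNAL\r\n"
    "UID:19970901T130000Z-123405@host.com\r\n"
    "DTSTAMP:19970901T1300Z\r\n"
    "DTSTART;VALUE=DATE:19970317\r\n"
    "SUMMARY:Staff meeting minutes\r\n"
    "DESCRIPTION:1. Staff meeting: Participants include Joe\\, Lisa\r\n"
    "  and Bob.\r\n"
    "END:VJOURNAL\r\n"
)

if __name__ == "__main__":
    for entry in parse_vjournal(SAMPLE_VJOURNAL):
        print(entry)
```

Properties judged machine-only or out of scope, such as DTSTART here, are simply dropped.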
== Syndication Feed Formats ==
All of the blogging tools above can produce syndication feeds from the same underlying content, so the feed formats are possibly worth discussing here.
[http://www.downes.ca/ Stephen Downes] postulates a [http://microformats.org/discuss/mail/microformats-discuss/2005-August/000670.html syndication/weblog equivalency rule], that RSS + XSLT = XHTML and XHTML + XSLT = RSS. In practice, this may not be exactly true, because syndication feeds often provide only summaries of the entry text and because the definitions of certain syndication feed elements can be ambiguous or open to interpretation.
''This section may be moved elsewhere.''
=== Atom ===
* http://www.atomenabled.org/
Here is the basic structure of an Atom document, showing only required and recommended elements and the number of them that may appear (the rules are a little more complicated than shown here, as some elements become optional or required depending on what else is included).
<pre><nowiki>
feed            - 1
  id            - 1
  title         - 1; type "text"
  updated       - 1
  link          - 0-1 recommended; type "link"
  author        - 0-N recommended; type "person"
    name        - 1
    email       - 0-1 recommended
    uri         - 0-1 recommended
  entry         - 0-N
    id          - 1
    title       - 1; type "text"
    updated     - 1
    published   - 0-1
    author      - 0-N recommended; type "person"
    content     - 0-1 recommended; type "text";
                  "contains or links to the complete content of the entry"
    link        - 0-N recommended; type "link"
    summary     - 0-1 recommended; type "text"
</nowiki></pre>
A note about Atom types:
* [http://www.atomenabled.org/developers/syndication/#person person] - describes a person, corporation, or similar entity
* [http://www.atomenabled.org/developers/syndication/#text text] - contains human-readable text; @type defines the encoding of the text itself: "text" (default), "html", "xhtml"
* [http://www.atomenabled.org/developers/syndication/#link link] - is patterned after HTML's [http://www.w3.org/TR/1999/REC-html401-19991224/struct/links.html#h-12.3 link element]; @href is required; @rel, @type, @hreflang, @title, and @length are optional.
* published and updated are datetimes
Here's an example Atom feed:
<pre><nowiki>
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>
</nowiki></pre>
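For comparison with the HTML templates in the Tools section, here is a small Python sketch that pulls the per-entry fields out of a feed like the one above using the standard library's ElementTree. The function name <code>atom_entries</code> and the returned dictionary keys are choices made for this example, and only the first link of each entry is read.

```python
import xml.etree.ElementTree as ET

# Atom 1.0 namespace, as declared on the <feed> element above.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_entries(xml_text):
    """Return one dict per <entry> with id, title, updated, link and summary."""
    feed = ET.fromstring(xml_text)
    entries = []
    for entry in feed.findall(ATOM + "entry"):
        link = entry.find(ATOM + "link")      # first <link> only, for brevity
        entries.append({
            "id": entry.findtext(ATOM + "id"),
            "title": entry.findtext(ATOM + "title"),
            "updated": entry.findtext(ATOM + "updated"),
            "link": link.get("href") if link is not None else None,
            "summary": entry.findtext(ATOM + "summary"),
        })
    return entries


# The example feed above, without the XML declaration.
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>
</feed>"""

if __name__ == "__main__":
    for entry in atom_entries(SAMPLE_ATOM):
        print(entry)
```

Every field here is addressed by element name alone, which is the property the blog templates lack.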
=== RSS 2.0 ===
* http://blogs.law.harvard.edu/tech/rss
RSS 2.0 is not to be confused with [http://web.resource.org/rss/1.0/ RSS 1.0], which is based on RDF. More about the various versions of RSS and the reasons for their existence can be read on [http://en.wikipedia.org/w/index.php?title=RSS_%28protocol%29 Wikipedia].
Here is the basic structure of an RSS document, showing only required and recommended elements and the number of them that may appear:
<pre><nowiki>
channel        - 1
  title          - 1; type "text"
  link            - 1; type "link"
  description    - 1; type "text"
  language        - 0-1; type lang
  copyright      - 0-1; type text
  managingEditor  - 0-1; type email
  webMaster      - 0-1; type email
  pubDate        - 0-1; type rfc822
  lastBuildDate  - 0-1; type rfc822
  category        - 0-N; type text
    @domain        - 0-1; type uri
  generator      - 0-1; type text
  docs            - 0-1; type url
  cloud          - 0-1
    @domain        - 1; type url domain
    @port          - 1; type url port
    @path          - 1; type url path
    @registerProcedure - 1; type token
    @protocol      - 1; type {http-post,xml-rpc,soap}
  ttl            - 0-1; type minutes
  image          - 0-1
    url            - 1; type url
    title          - 1; type text
    link            - 1; type url
    width          - 0-1; type pixels
    description    - 0-1; type text
  rating          - 0-1;
  textInput      - 0-1;
    title          - 1
    description    - 1
    name           - 1
    link           - 1
  skipHours      - 0-1;
  skipDays        - 0-1;
  item            - 0-N
    title          - 0-1*; type text
    link            - 0-1; type url
    description    - 0-1*; type text(?)
    author          - 0-1; type email
    category        - 0-N; type text
      @domain        - 0-1; type uri
    comments      - 0-1; type url
    enclosure    - 0-N;
      @url          - 1; type url
      @length      - 1; type bytes
      @type        - 1; type mime
    guid          - 0-1; type text
      @isPermaLink  - 0-1; if true, text is a url
    pubDate      - 0-1; type rfc822
    source        - 0-1; type text
      @url          - 1; type url
* One of title or description must be present
</nowiki></pre>
Here's an example RSS 2.0 feed [''some editing needed internally to make entities show the correct way'']:
<pre><nowiki>
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
      <title>Liftoff News</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Liftoff to Space Exploration.</description>
      <language>en-us</language>
      <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
      <item>
        <title>Star City</title>
        <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
        <description>How do Americans ...
          &lt;a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star City&lt;/a&gt;.
        </description>
        <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
        <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
      </item>
  </channel>
</rss>
</nowiki></pre>
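The same kind of extraction for RSS 2.0 might look like the Python sketch below. The function name <code>rss_items</code> and the dictionary keys are invented for this example. It uses <code>email.utils.parsedate_to_datetime</code> from the standard library for the RFC 822 dates and treats a guid without an isPermaLink attribute as a permalink, which is the default in RSS 2.0.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def rss_items(xml_text):
    """Return one dict per <item> in the feed's <channel>."""
    channel = ET.fromstring(xml_text).find("channel")
    items = []
    for item in channel.findall("item"):
        guid = item.find("guid")
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # ElementTree unescapes entities, so this holds the HTML markup.
            "description": item.findtext("description"),
            "published": parsedate_to_datetime(pub) if pub else None,
            "guid": guid.text if guid is not None else None,
            # isPermaLink is treated as true when the attribute is absent.
            "guid_is_permalink": guid is not None
            and guid.get("isPermaLink", "true") == "true",
        })
    return items


# Shortened from the example feed above, without the XML declaration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Liftoff News</title>
    <link>http://liftoff.msfc.nasa.gov/</link>
    <description>Liftoff to Space Exploration.</description>
    <item>
      <title>Star City</title>
      <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
      <description>How do Americans ... &lt;a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm"&gt;Star City&lt;/a&gt;.</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
      <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
    </item>
  </channel>
</rss>"""

if __name__ == "__main__":
    for item in rss_items(SAMPLE_RSS):
        print(item["title"], item["published"], item["guid"])
```

Unlike Atom, an item here may lack a title or a description, so a consumer has to cope with either field being None.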
=== An XHTML profile for RDF Site Summaries ===
See [http://www.w3.org/2000/08/w3c-synd/ Site Summaries in XHTML], used in production for the W3C RSS feed for several years now.
* {{ToDo}} fill this out
== Discussion Forum / Bulletin Board Formats ==
Discussion forum posts are similar to blog posts, so may be relevant.  Due to the complexity of table-based layouts, much of the mark-up in the following examples has been stripped.  The mark-up and analysis also excludes controls (such as reply buttons).
=== Moodle ===
* http://moodle.org/
Moodle is an open source course management system used in education.  It includes a discussion forum.  The forum does not support templating.
<pre><nowiki>
<body class="mod-forum" id="mod-forum-discuss">
<div id="page">
  <div id="header">
  <div class="headermain">COURSE TITLE</div>
  <div class="headermenu">(controls)</div>
  </div>
  <table class="navbar">
  <tr>
    <td>
    <div class="breadcrumb">
  <a href="BREADCRUMB LINK">BREADCRUMB TITLE</a>+
</div>
    (controls)
    </td>
  </tr>
  </table>
  <div id="content">
  (controls)
  ENTRIES
  </div>
</div>
</body>
<table class="forumpost">
<tr>
  <td>
  <img src="AVATAR"/>
  </td>
  <td class="topic">
  <div class="subject">POST TITLE</div>
  <div class="author">by <a>AUTHOR</a> - POST DATE</div>
  </td>
</tr>
<tr>
  <td class="content">
  POST CONTENT
  <div class="commands"><a>Show Parent</a> | <a>Reply</a></div>
  </td>
</tr>
</table>
</nowiki></pre>
==== Key Class Names ====
* "headermain" identifies the course title
* "breadcrumb" identifies breadcrumbs
* "forumpost" identifies a post
* "subject" identifies the post subject
* "author" identifies the author, but also includes the date and other text
* "content" identifies the content area, but includes some controls also
==== Concepts ====
* course title
* breadcrumb titles and links
* post title
* post author
* post date
* post content
* original post (to which this is a reply)
* author avatar
=== phpBB ===
* http://www.phpbb.com/
phpBB is a popular GPL discussion forum system.  This is based on the forum on the [http://www.phpbb.com/phpBB/ phpBB] website.
<pre><nowiki>
<body>
<table>
  <tr>
  <td class="bodyline" bgcolor="white">
    <table width="100%" cellspacing="0" cellpadding="10" border="0">
    <tr>
      <td>
      <table>
        <tr>
        <td>
          <a class="maintitle" href="FORUM LINK">FORUM TITLE</a>
        </td>
        </tr>
      </table>
      <table width="100%" cellspacing="2" cellpadding="2" border="0">
        <tr>
        (controls)
        <td align="left" valign="middle" width="100%">
          <span class="nav"><a href="BREADCRUMB LINK" class="nav">BREADCRUMB TITLE</a>+</span>
        </td>
        </tr>
      </table>
      ENTRIES
      (controls)
      (repeat of controls and navigation)
  ...
</table>
</body>
<table class="forumline">
<tr>
  <td class="catHead">(controls)</td>
</tr>
<tr>
  <td class="row1">
  <span class="name">
    <a name="###"></a>
    <b>AUTHOR</b>
  </span>
  <span class="postdetails">
    AUTHOR STATUS (registered etc.)
    <img src="AVATAR"/>
    Joined: AUTHOR JOIN DATE
    Posts: AUTHOR POST COUNT
    Location: etc.
  </span>
  </td>
  <td class="row1">
  <table>
    <tr>
    <td>
      <span class="postdetails">Posted: POST DATE Post subject: POST TITLE</span>
    </td>
    <td>(controls)</td>
    </tr>
    <tr>
    <td>
      <span class="postbody">
      CONTENT
      </span>
    </td>
    </tr>
  </table>
  </td>
</tr>
(controls)
</table>
</nowiki></pre>
==== Key Class Names ====
* "maintitle" identifies the forum title and link
* "nav" is used for breadcrumbs
* "forumline" encloses the post
* "name" identifies the author
* "postdetails" occurs twice, once providing information about the author, once enclosing the post date and title
* "postbody" identifies the post content
==== Concepts ====
* forum title and link
* breadcrumb titles and links
* post title
* post author
* post date
* post content
* author status
* author join date
* author avatar
* author post count
* author location
=== PunBB ===
* http://punbb.org/
PunBB is GPL forum software.  This structure is from the [http://forums.punbb.org/ forum] on their web site.
<pre><nowiki>
<body>
<div id="punwrap">
  <div id="punviewtopic" class="pun">
  <div id="brdheader" class="block">
    <div class="box">
    <div id="brdtitle" class="inbox">
      <h1><span>SITE SECTION TITLE</span></h1>
      <p><span>SITE SECTION BYLINE</span></p>
    </div>
    (controls)
    </div>
  </div>
  <div id="announce" class="block">
  <h2><span>ANNOUNCEMENT TITLE</span></h2>
  <div class="box">
    <div class="inbox">
    <div><span class="warntext">ANNOUNCEMENT CONTENT</span></div>
    </div>
  </div>
  </div>
  <div class="linkst">
  <div class="inbox">
    (controls)
    <ul>
    <li><a href="BREADCRUMB LINK">BREADCRUMB TITLE</a></li>+
    </ul>
  </div>
  </div>
  ENTRIES
</div>
</body>
<div id="HTML-ID" class="blockpost">
<h2>
  <span class="conr">POST NUMBER</span>
  <a href="PERMALINK">POST DATE</a></h2>
<div class="box">
  <div class="inbox">
  <div class="postleft">
    <dl>
    <dt><strong><a href="AUTHOR PROFILE LINK">AUTHOR NAME</a></strong></dt>
    <dd class="usertitle"><strong>AUTHOR STATUS (e.g. Moderator)</strong></dd>
    <dd class="postavatar"><img src="AVATAR"/></dd>
    <dd>From: AUTHOR LOCATION</dd>
    <dd>Registered: AUTHOR REGISTRATION DATE</dd>
    <dd>Posts: AUTHOR POST COUNT</dd>
    <dd class="usercontacts"><a href="AUTHOR URL">Website</a></dd>
    </dl>
  </div>
  <div class="postright">
    <h3>POST TITLE</h3>
    <div class="postmsg">
POST CONTENT
    </div>
    <div class="postsignature">AUTHOR SIGNATURE</div>
  </div>
  <div class="clearer"></div>
  <div class="postfootleft"><p>AUTHOR STATE (online, offline)</p></div>
  <div class="postfootright"><div></div></div>
  </div>
</div>
</div>
</nowiki></pre>
==== Key Class Names ====
* "blockpost" encloses a post;  the element also has an HTML id
* "conr" is the post number within the discussion thread
* h1 indicates the site section title
* "announce" and "warntext" indicate the announcement title and content
* ul inside "linkst" identifies breadcrumbs
* h2 encloses the post number, post date, and permalink, distinguishable by HTML a and span elements
* h3 identifies the post title
* "postmsg" identifies the post content
* the author name and URL are in dl dt a
* additional author information is in dl dd
* "usertitle" identifies the author status
* "postavatar" identifies the author's avatar
* "usercontacts" identifies author contact information
==== Concepts ====
* site section title and byline
* announcement title and content
* breadcrumb titles and links
* post title
* post author
* post date
* permalink
* post sequence number
* author status
* author avatar
* author location
* author registration date
* author post count
* author URL
* author profile link
* author signature
* author contact information (website)
=== YaBB ===
* http://www.yabbforum.com/
YaBB is a popular commercial/free forum system.  This example is based on the [http://www.yabbforum.com/community/ forum] on the YaBB site.  I have stripped some presentational class names.
<pre><nowiki>
<body>
<div class="container">
  <div class="maincontent">
  <div class="seperator">
    <table>
    (controls, login)
    <tr>
      <td>
      <span id="fscroller">FORUM WELCOME</span>
      (controls)
      </td>
    </tr>
    </table>
  </div>
  <div class="navbarcontainer">
    <table>
    <tr>
      <td>
      <span>
        <b><a href="BREADCRUMB LINK" class="nav">BREADCRUMB TITLE</a>+</b>
        (Moderators: <a href="MODERATOR LINK">MODERATOR NAME</a>+)
        <div class="seperator">
        <table>
        <tr>
          <td>
  <span>FORUM DESCRIPTION.</span>
          </td>
        </tr>
        </table>
      </div>
      </span>
      </td>
      (controls)
    </tr>
    </table>
  </div>
  ENTRIES
  </div>
  (repeat of navigation)
</div>
</body>
<div class="displaycontainer">
<table>
<tr>
  <td>
  <a><b>POST TITLE</b></a>
  AUTHOR STATUS (junior member, senior, etc.)<br />
  <img src="AVATAR" />
  AUTHOR BYLINE
  Posts: ###
  Gender: <img/>
  </td>
  <td>
  <div>
    <b>POST TITLE</b>
    <span><b>Reply #nnn on:</b> POST DATE</span>
  </div>
  <div>
    <span class="message">
    CONTENT
    </span>
  </div>
  </td>
</tr>
<tr>(controls)</tr>
</table>
</div>
</nowiki></pre>
==== Key Class Names ====
* "displaycontainer" encloses a post
* "message" identifies post content
* no other fields are identified by meaningful mark-up
==== Concepts ====
* forum welcome message
* forum description
* forum moderators (names and links)
* breadcrumbs (titles and links)
* post title
* post author
* post date
* post content
* original post (to which this is a reply)
* author status
* author byline
* author avatar
* author gender
* author post count


== Examples from the wild ==
* Early work on extending standardized nodes in Drupal: http://factorycity.net/demos/drupal/event_system/microformats/
* Microformat-style hooks in forum posts for Javascript annotation: http://www.geof.net/code/annotation/technical.html#microformats
== See Also ==
* [[hatom|hAtom]] - the draft proposal based on this information
* [[blog-post-brainstorming]]
* [[blog-post-formats]]
* [[blog-post-examples]]
* [[blog-description-format]] - how to describe a blog (as opposed to the individual entries, which is what we're doing here)

Latest revision as of 16:21, 18 July 2020

This page needs lots of updating, including:

  • update participants (or remove)
  • move analysis of implicit formats from tools to a section in blog-post-examples
  • review remaining explicit formats for updates
  • simplify/flatten sections/hierarchy

Tantek 21:06, 3 October 2012 (UTC)



There is a need for developing standard classes for blog posts (i.e. a microformat!).

This page serves to document the current list of individual blog post schemas, formats, and efforts as background for the design of a simple blog post microformat.

The result of this exploration is hAtom.

Discussion Participants

Editor

Authors

Interested Folks

Tools

Blogger

Blogger is one of the earliest, best-known, and probably most widely used blogging platforms. Blogger was bought by Google in February of 2003. Blogger allows users to create and edit their own templates and also provides a large number of (more or less) attractive templates from which the user can select. Unfortunately, you must log into a Blogger account to see the template selection.

Here are several Blogger templates, randomly selected from the presets. More recent templates seem to be converging on a vocabulary for identifying parts of posts. This may be because they share an evolutionary history from a common template. I've included three examples here:

<body>
 <div id="content">
  <div id="main">
   ENTRIES
  </div>
 </div>
</body>

<h2 class="date-header">POST DATE</h2>
<div class="post">
 <a name="POST #"></a>
 <h3 class="post-title">
  <a href="POST URI" title="external link">POST TITLE</a>
 </h3>
 <p>POST CONTENT</p>
 <p class="post-footer">
  <em>posted by AUTHOR @ <a href="POST URI" title="permanent link">POST DATETIME</a></em>
 </p>
</div>
<body>
 <div id="main">
  <div id="main2">
   ENTRIES
  </div>
 </div>
</body>

<div class="post">
 <a name="POST #"></a>
 <h3 class="post-title">
  <a href="sample_post.html" title="permanent link">POST TITLE</a>
 </h3>
 <div class="post-body">
   POST CONTENT
 </div>
 <p class="post-footer">
   <em>posted by AUTHOR @ <a href="POST URI" title="permanent link">POST TIME</a></em>
 </p>
</div>
<body>
 <div id="leftcontent">
 ENTRIES
 </div>
</body>
  
<div class="Post">
 <a name="POST #"></a>
 POST CONTENT
 <span class="PostFooter">
  <a href="POST URI">POST TIME</a> 
 </span>
</div>

Key Class Names

  • newer templates seem to use "main" to identify an enclosure for all entries
  • newer templates use "post" to identify a weblog entry
  • newer templates use "post-title" to identify the entry's title
  • beyond this there is little standardization
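
As a rough illustration of what these implicit class names already make possible, here is a minimal Python sketch that pulls post titles out of a page using only the "post-title" class. The parser class, the function name, and the sample markup are hypothetical and written for this page; only the class names come from the templates above.

```python
from html.parser import HTMLParser


class PostTitleParser(HTMLParser):
    """Collect the text of every element carrying the "post-title" class."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._title_tag = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "post-title" in classes and not self._in_title:
            self._in_title = True
            self._title_tag = tag
            self._buffer = []

    def handle_endtag(self, tag):
        # The title ends when the element that opened it is closed.
        if self._in_title and tag == self._title_tag:
            self.titles.append("".join(self._buffer).strip())
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._buffer.append(data)


def extract_post_titles(html):
    parser = PostTitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Hypothetical page fragment in the style of the newer templates.
SAMPLE = """
<div class="post">
 <h3 class="post-title">
  <a href="http://example.org/1" title="permanent link">First post</a>
 </h3>
 <div class="post-body">Hello</div>
</div>
"""

print(extract_post_titles(SAMPLE))
```

The older "Post"/"PostFooter" template has no title class at all, so the same function finds nothing there, which is the lack of standardization noted above.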

Template Concepts

  • all posts
  • an individual post
  • post title
  • post author
  • post posting time
  • post content
  • post URI (permalink)

Blosxom

Drupal

LiveJournal

MovableType

MovableType is a Perl-based blogging platform. Note that MT is old and widely deployed, so there are many variants of its templates in the wild.

The standard template for the weblog's main page (the "main index") has the following structure:

<body>
 <div class="content">
 <h2>DATE HEADER</h2>
 <h3 id="a####">POST TITLE</h3>
 POST CONTENT
 [ OPTIONAL LINK TO MORE POST CONTENT ]
 <p class="posted">Posted by AUTHOR at <a href="POST URI">POST DATE</a></p>
 </div>
</body>

<div class="content">
 <p align="right">
  <h2>POST DATE</h2>
 </p>
 <h3>POST TITLE</h3>
 POST CONTENT
 <div id="more">
  MORE POST CONTENT (optional)
 </div>
 <p class="posted">Posted by AUTHOR at DATE</p>
</div>

Key Class Names

  • "content" can enclose an individual entry or all entries, depending on the context
  • "h2" encloses the post date (literally: the time is not included)
  • "h3" encloses the title
  • there is no standard enclosure for all the content
  • there is no clear identification of "here's all the entries"
  • there is no clear identification of the post's author
  • the permalink is not necessarily on the page anywhere

Template Concepts

  • all posts
  • an individual post
  • post title
  • post author
  • post posting time
  • post content, which includes the next two
  • post content (first part)
  • post content (expanded part)
  • post URI (permalink)

TypePad

TypePad is a MovableType hosting service. It provides a list of default templates and "template modules" from which users can construct or modify their own templates. Of several TypePad blogs examined, most or all follow the nomenclature and structure defined by these templates.

The standard structure is as follows:

<body class="layout-two-column-right">
 <div id="container">
  <div id="container-inner" class="pkg">
   <div id="pagebody">
    <div id="pagebody-inner" class="pkg">
     <div id="alpha">
      <div id="alpha-inner" class="pkg">
        INDIVIDUAL ENTRY
      </div>
     </div>
    </div>
   </div>
  </div>
 </div>
</body>

<div class="entry" id="entry-#####">
  <h3 class="entry-header">POST TITLE</h3>
 <div class="entry-content">
  <div class="entry-body">
    POST CONTENT
  </div>
   <a id="more"></a>
   <div class="entry-more">
     MORE POST CONTENT
   </div>
 </div>
 <p class="entry-footer">
   POST FOOTER
 </p>
</div>

I cannot seem to track down in the templates where the POST FOOTER is defined. However, we can see the results from a sample blog:

<span class="post-footers">Posted by AUTHOR_NAME in CATEGORY</span> 
<span class="separator">|</span> 
<a class="permalink" href="ENTRY_URI">Permalink</a>
| <a href="COMMENT_URI">Comments (2)</a>
| <a href="TRACKBACKS_URI">TrackBack (0)</a>

Key Class Names

  • "entry" encloses all elements within an entry
  • "entry-content" contains all the entry text, plus additional text saying "here's more"
  • "entry-header" contains the title of the post
  • "permalink" contains the post's URI
  • there is no clear identification of "here's all the entries"
  • there is no clear identification of the post's author

Template Concepts

  • all posts
  • an individual post
  • post title
  • post author
  • post posting time
  • post content, which includes the next two
  • post content (first part)
  • post content (expanded part)
  • post URI (permalink)

WordPress

WordPress is a popular GPLed blogging system based on PHP and MySQL. WordPress calls its templates "themes". WordPress does not have a standardized set of class names for identifying parts of the weblog content. I've included a number of examples of what is seen in the wild (move to [examples]?)

Example 1: Fresh Bananas (Ed: dead link)

<body id="blog">
 <div id="wrap">
  <div id="content" class="two_column">
   <div class="left">
    ENTRIES
   </div>
  </div>
 </div>
</body>

<h2>POST TITLE</h2>
POST CONTENT (PARTIAL)
<p>
 <a href="POST URI" title="Continue reading this post">Continue reading</a>
</p>

Example 2: VanillaMist

<body>
 <div id="main">
  <div id="content">
   ENTRIES
  </div>
 </div>
</body>

<div class="post">
 <p class="post-date">Wed 6 Jul 2005
 </p>
 <div class="post-info">
  <h2 class="post-title">
   <a href="http://vanillamist.com/blog/?p=89" rel="bookmark" title="Permanent Link: Podcasts and a new version of Connections soon">Podcasts and a new version of Connections soon</a>
  </h2>
  Posted by AUTHOR under <a href="POST URI" title="View all posts in Blogs and Blogging" rel="category tag">CATEGORY</a>

  <div class="post-content">
   POST CONTENT
  </div>

  <div class="post-footer"> </div>
 </div>
</div>

Example 3: Boredom

<body>
 <div id="content">
 </div>
</body>

<div class="post">
 <h2 id="post-5">
  <a href="POST URI" rel="bookmark" title="Permanent Link to POST TITLE">POST TITLE</a>
 </h2>
 <small>POST DATE</small>
 <div class="entry">
  POST CONTENT
 </div>
 <p class="postmetadata">
 </p>
</div>

Key Class Names

There is very little reuse amongst the various templates selected.
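
To make the claim concrete, the following Python sketch counts which class names appear in more than one of the three example themes. The class sets are transcribed from the markup shown above; the variable and function names are made up for this illustration.

```python
from collections import Counter

# Class names transcribed from the three WordPress examples above.
TEMPLATE_CLASSES = {
    "Fresh Bananas": {"two_column", "left"},
    "VanillaMist": {
        "post", "post-date", "post-info",
        "post-title", "post-content", "post-footer",
    },
    "Boredom": {"post", "entry", "postmetadata"},
}


def shared_class_names(templates):
    """Return the class names used by more than one template, sorted."""
    counts = Counter(
        name for classes in templates.values() for name in classes
    )
    return sorted(name for name, seen in counts.items() if seen > 1)


print(shared_class_names(TEMPLATE_CLASSES))
```

Only "post" is shared, and only by two of the three themes.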

Template Concepts

  • Post
  • Title
  • Author
  • Date
  • Content (partial)
  • Content (full)

A list of all the template elements is available here.

A discussion about blog post content published as XML and the nature of the required format in relation to WordPress is here

Xanga

Journal Formats

Before blogs there were journals. Many journals were kept merely on people's computers and not necessarily published.

VJOURNAL

RFC 2445 (iCalendar) defines the VJOURNAL object for storing journal entries, which are essentially the same as blog posts. Note that hCalendar, by virtue of referencing all of RFC 2445, could be said to define VJOURNAL class names.

The basic structure of a series of VJOURNAL entries:

VJOURNAL      - 1
  class         - 0-1
    classparam    - 0-N
    classvalue    - 1; PUBLIC/PRIVATE/CONFIDENTIAL
  created       - 0-1
  description   - 0-1
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1
  dtstart       - 0-1
  dtstamp       - 0-1
  last-mod      - 0-1
  organizer     - 0-1
    cnparam       - 0-1
    dirparam      - 0-1
    sentbyparam   - 0-1
    languageparam - 0-1
    caladdress    - 1
  recurid       - 0-1
  seq           - 0-1
  status        - 0-1
    statvalue     - 1; DRAFT/FINAL/CANCELLED
  summary       - 0-1
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1
  uid           - 0-1
  url           - 0-1
  attach        - 0-N
    fmttype       - 0-1; mime type
    url           - 1; url
  attendee      - 0-N
    cutypeparam   - 0-1
    memberparam   - 0-1
    roleparam     - 0-1
    partstatparam - 0-1
    rsvpparam     - 0-1
    deltoparam    - 0-1
    delfromparam  - 0-1
    sentbyparam   - 0-1
    cnparam       - 0-1
    dirparam      - 0-1
    languageparam - 0-1
    caladdress    - 1
  categories    - 0-N
    languageparam - 0-1
    text          - 1-N; text
  comment       - 0-N
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1; text
  contact       - 0-N
    altrepparam   - 0-1
    languageparam - 0-1
    text          - 1; text
  exdate        - 0-N
  exrule        - 0-N
  related       - 0-N
    reltypeparam  - 0-1
    text          - other iCalendar component
  rdate         - 0-N
  rrule         - 0-N
  rstatus       - 0-N

Here are some example VJOURNAL entries from the RFC:

     BEGIN:VJOURNAL
     UID:19970901T130000Z-123405@host.com
     DTSTAMP:19970901T1300Z
     DTSTART;VALUE=DATE:19970317
     SUMMARY:Staff meeting minutes
     DESCRIPTION:1. Staff meeting: Participants include Joe\, Lisa
       and Bob. Aurora project plans were reviewed. There is currently
       no budget reserves for this project. Lisa will escalate to
       management. Next meeting on Tuesday.\n
       2. Telephone Conference: ABC Corp. sales representative called
       to discuss new printer. Promised to get us a demo by Friday.\n
       3. Henry Miller (Handsoff Insurance): Car was totaled by tree.
       Is looking into a loaner car. 654-2323 (tel).
     END:VJOURNAL

     BEGIN:VCALENDAR
     VERSION:2.0
     PRODID:-//ABC Corporation//NONSGML My Product//EN
     BEGIN:VJOURNAL
     DTSTAMP:19970324T120000Z
     UID:uid5@host1.com
     ORGANIZER:MAILTO:jsmith@host.com
     STATUS:DRAFT
     CLASS:PUBLIC
     CATEGORY:Project Report, XYZ, Weekly Meeting
     DESCRIPTION:Project xyz Review Meeting Minutes\n
      Agenda\n1. Review of project version 1.0 requirements.\n2.
     Definition
      of project processes.\n3. Review of project schedule.\n
      Participants: John Smith, Jane Doe, Jim Dandy\n-It was
       decided that the requirements need to be signed off by
       product marketing.\n-Project processes were accepted.\n
      -Project schedule needs to account for scheduled holidays
       and employee vacation time. Check with HR for specific
       dates.\n-New schedule will be distributed by Friday.\n-
      Next weeks meeting is cancelled. No meeting until 3/23.
     END:VJOURNAL
     END:VCALENDAR

Here's some analysis of parts of VJOURNAL which may be of interest to hAtom work. Items in bold are considered of interest to hAtom, others can be discarded without further discussion, because they're either machine-only or just don't fit into the weblog use-case.

  • vjournal
    • class - N/A to web, perhaps for archive format
    • created - for machines only
    • description - "full text"
    • dtstart - when the entry is about - doesn't fit in weblog use case
    • dtstamp - user creation time
    • last-mod - machine use only (for the specific host's copy, not the same as 'updated')
    • organizer - who organized the meeting (VJOURNAL's use case includes meeting notes)
    • recurid - machine only
    • seq - machine only
    • status - N/A to WWW, consider for archive
    • summary - "title"
    • UID - "id/permalink"
    • URL - "link"
    • attach - "enclosure"
    • attendee - who attended the meeting (think meeting notes)
    • categories - "tags"
    • comment - comments about the journal
    • contact - author???
    • exdate - N/A
    • exrule - N/A
    • related - just hyperlinks to other resources
    • rrule - N/A
    • rdate - N/A
    • rstatus - N/A
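
The quoted equivalents in the list above can be read as a lookup table. The Python sketch below applies that table to a single VJOURNAL entry. It is a simplification written for this page, not a full iCalendar parser: it unfolds continuation lines and drops property parameters, but it does not decode escapes such as `\n` or `\,`, and it assumes no parameter value contains a colon. The right-hand labels are the informal ones used in the list, not an agreed vocabulary.

```python
# Property-to-concept mapping taken from the quoted labels in the analysis.
VJOURNAL_TO_WEBLOG = {
    "SUMMARY": "title",
    "DESCRIPTION": "full text",
    "UID": "id/permalink",
    "URL": "link",
    "ATTACH": "enclosure",
    "CATEGORIES": "tags",
}


def unfold(text):
    """Undo iCalendar folding: a line starting with space or tab continues the previous line."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        elif raw.strip():
            lines.append(raw)
    return lines


def vjournal_to_weblog(text):
    """Map the first VJOURNAL in text onto the weblog concepts above."""
    entry = {}
    inside = False
    for line in unfold(text):
        name, _, value = line.partition(":")
        name = name.split(";", 1)[0].upper()  # drop parameters such as ;VALUE=DATE
        if name == "BEGIN" and value.upper() == "VJOURNAL":
            inside = True
        elif name == "END" and value.upper() == "VJOURNAL":
            break
        elif inside and name in VJOURNAL_TO_WEBLOG:
            entry[VJOURNAL_TO_WEBLOG[name]] = value
    return entry


# Shortened version of the first RFC example shown earlier.
SAMPLE = "\n".join([
    "BEGIN:VJOURNAL",
    "UID:19970901T130000Z-123405@host.com",
    "DTSTAMP:19970901T1300Z",
    "SUMMARY:Staff meeting minutes",
    "DESCRIPTION:1. Staff meeting: Participants include Joe\\, Lisa",
    "  and Bob.",
    "END:VJOURNAL",
])

print(vjournal_to_weblog(SAMPLE))
```

Properties without a quoted equivalent in the list, such as DTSTAMP, are simply left out of the result.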

Syndication Feed Formats

All of the blogging tools above can produce syndication feeds from the same underlying content, so the feed formats are possibly worth discussing here.

Stephen Downes postulates a syndication/weblog equivalency rule: RSS + XSLT = XHTML and XHTML + XSLT = RSS. In practice this may not hold exactly, because syndication feeds often provide only a summary of the entry text, and certain feed elements are defined ambiguously enough to allow more than one interpretation.
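
One direction of that rule can be sketched in a few lines. The example below uses Python's ElementTree in place of XSLT, purely for brevity, and borrows the "main", "post", "post-title" and "post-body" names from the Blogger templates above as the output vocabulary. The function name and the sample feed are hypothetical.

```python
import xml.etree.ElementTree as ET


def rss_to_xhtml(rss_text):
    """Render each RSS 2.0 item as a Blogger-style post fragment."""
    channel = ET.fromstring(rss_text).find("channel")
    container = ET.Element("div", {"id": "main"})
    for item in channel.findall("item"):
        post = ET.SubElement(container, "div", {"class": "post"})
        title = ET.SubElement(post, "h3", {"class": "post-title"})
        link = ET.SubElement(
            title, "a", {"href": item.findtext("link", default="")}
        )
        link.text = item.findtext("title", default="")
        body = ET.SubElement(post, "div", {"class": "post-body"})
        # If the feed carries only a summary, this is all we can recover.
        body.text = item.findtext("description", default="")
    return ET.tostring(container, encoding="unicode")


SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Example feed</description>
    <item>
      <title>Star City</title>
      <link>http://example.org/star-city</link>
      <description>Some text.</description>
    </item>
  </channel>
</rss>"""

print(rss_to_xhtml(SAMPLE_RSS))
```

The reverse direction is where the implicit formats fall short: without agreed class names, a transformation cannot reliably tell which element holds the author or the date.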

This section may be moved elsewhere.

Atom

Here is the basic structure of an Atom document, showing only required and recommended elements and the number of them that may appear (the rules are a little more complicated than shown here, as some elements become optional or required depending on what else is included).

feed           - 1
  id            - 1
  title         - 1; type "text"
  updated       - 1
  link          - 0-1 recommended; type "link"
  author        - 0-N recommended; type "person"
   name           - 1
   email          - 0-1 recommended
   uri            - 0-1 recommended
  entry         - 0-N
   id            - 1
   title         - 1; type "text"
   updated       - 1
   published     - 0-1
   author        - 0-N recommended; type "person"
   content       - 0-1 recommended; type "text";
                   "contains or links to the complete content of the entry"
   link          - 0-N recommended; type "link"
   summary       - 0-1 recommended; type "text"

A note about Atom types

  • person - describes a person, corporation, or similar entity
  • text - contains human-readable text; @type defines the encoding of the text itself: "text" (default), "html", "xhtml"
  • link - is patterned after html's link element; @href is required; @rel, @type, @hreflang, @title, and @length are optional.
  • published and updated are datetimes

Here's an example Atom feed

<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title>Example Feed</title>
  <link href="http://example.org/"/>
  <updated>2003-12-13T18:30:02Z</updated>
  <author>
    <name>John Doe</name>
  </author>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>

  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <link href="http://example.org/2003/12/13/atom03"/>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
    <summary>Some text.</summary>
  </entry>

</feed>
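
The cardinalities in the structure listing translate directly into a check. This Python sketch reports which of the three required entry children (id, title, updated) are missing from each entry. It deliberately ignores the conditional rules mentioned above, and the function name and sample feeds are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
REQUIRED_ENTRY_CHILDREN = ("id", "title", "updated")


def missing_entry_elements(feed_text):
    """Return (entry index, element name) pairs for absent required children."""
    root = ET.fromstring(feed_text)
    problems = []
    for index, entry in enumerate(root.findall(ATOM_NS + "entry")):
        for name in REQUIRED_ENTRY_CHILDREN:
            if entry.find(ATOM_NS + name) is None:
                problems.append((index, name))
    return problems


GOOD_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <updated>2003-12-13T18:30:02Z</updated>
  <id>urn:uuid:60a76c80-d399-11d9-b93C-0003939e0af6</id>
  <entry>
    <title>Atom-Powered Robots Run Amok</title>
    <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
    <updated>2003-12-13T18:30:02Z</updated>
  </entry>
</feed>"""

# Same feed with the entry's <updated> element removed.
BAD_FEED = GOOD_FEED.replace(
    "    <updated>2003-12-13T18:30:02Z</updated>\n  </entry>", "  </entry>"
)

print(missing_entry_elements(GOOD_FEED))
print(missing_entry_elements(BAD_FEED))
```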

RSS 2.0

RSS 2.0 is not to be confused with RSS 1.0, which is based on RDF. More about the various versions of RSS and the reasons for their existence can be read on Wikipedia.

Here is the basic structure of an RSS document, showing only required and recommended elements and the number of them that may appear:

channel         - 1
  title           - 1; type "text"
  link            - 1; type "link"
  description     - 1; type "text"
  language        - 0-1; type lang
  copyright       - 0-1; type text
  managingEditor  - 0-1; type email
  webMaster       - 0-1; type email
  pubDate         - 0-1; type rfc822
  lastBuildDate   - 0-1; type rfc822
  category        - 0-N; type text
    @domain         - 0-1; type uri
  generator       - 0-1; type text
  docs            - 0-1; type url
  cloud           - 0-1
    @domain         - 1; type url domain
    @port           - 1; type url port
    @path           - 1; type url path
    @registerProcedure - 1; type token
    @protocol       - 1; type {http-post,xml-rpc,soap}
  ttl             - 0-1; type minutes
  image           - 0-1
    url             - 1; type url
    title           - 1; type text
    link            - 1; type url
    width           - 0-1; type pixels
    description     - 0-1; type text
  rating          - 0-1;
  textInput       - 0-1;
    title           - 1
    description     - 1
    name            - 1
    link            - 1
  skipHours       - 0-1;
  skipDays        - 0-1;
  item            - 0-N
    title           - 0-1*; type text
    link            - 0-1; type url
    description     - 0-1*; type text(?)
    author          - 0-1; type email
    category        - 0-N; type text
      @domain         - 0-1; type uri
    comments        - 0-1; type url
    enclosure       - 0-N;
      @url            - 1; type url
      @length         - 1; type bytes
      @type           - 1; type mime
    guid            - 0-1; type text
      @isPermaLink    - 0-1; if true, text is a url
    pubDate         - 0-1; type rfc822
    source          - 0-1; type text
      @url            - 1; type url

* One of title or description must be present
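
The starred rule is easy to overlook because every individual item child is optional. As a hedged sketch, the Python function below lists the items that violate it. The function name and the sample channel are invented for this example.

```python
import xml.etree.ElementTree as ET


def items_missing_title_and_description(rss_text):
    """Return the indexes of items that have neither a title nor a description."""
    root = ET.fromstring(rss_text)
    bad = []
    for index, item in enumerate(root.iter("item")):
        if item.find("title") is None and item.find("description") is None:
            bad.append(index)
    return bad


SAMPLE_CHANNEL = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <link>http://example.org/</link>
    <description>Example feed</description>
    <item><title>Title only</title></item>
    <item><description>Description only</description></item>
    <item><link>http://example.org/neither</link></item>
  </channel>
</rss>"""

print(items_missing_title_and_description(SAMPLE_CHANNEL))
```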

Here's an example RSS 2.0 feed [some editing needed internally to make entities show the correct way]:

<?xml version="1.0"?>
<rss version="2.0">
   <channel>
      <title>Liftoff News</title>
      <link>http://liftoff.msfc.nasa.gov/</link>
      <description>Liftoff to Space Exploration.</description>
      <language>en-us</language>
      <pubDate>Tue, 10 Jun 2003 04:00:00 GMT</pubDate>
      <item>

         <title>Star City</title>
         <link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link>
         <description>How do Americans ...
           <a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm">Star City</a>.
         </description>
         <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
         <guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>

      </item>
   </channel>
</rss>

An XHTML profile for RDF Site Summaries

See Site Summaries in XHTML, used in production for the W3C RSS feed for several years now.

  • to do! fill this out

Discussion Forum / Bulletin Board Formats

Discussion forum posts are similar to blog posts, so may be relevant. Due to the complexity of table-based layouts, much of the mark-up in the following examples has been stripped. The mark-up and analysis also excludes controls (such as reply buttons).

Moodle

Moodle is an open source course management system used in education. It includes a discussion forum. The forum does not support templating.

<body class="mod-forum" id="mod-forum-discuss">
 <div id="page">
  <div id="header">
   <div class="headermain">COURSE TITLE</div>
   <div class="headermenu">(controls)</div>
  </div>
  <table class="navbar">
   <tr>
    <td>
     <div class="breadcrumb">
      <a href="BREADCRUMB LINK">BREADCRUMB TITLE</a>+
     </div>
     (controls)
    </td>
   </tr>
  </table>
  <div id="content">
   (controls)
   ENTRIES
  </div>
 </div>
</body>

<table class="forumpost">
 <tr>
  <td>
   <img src="AVATAR"/>
  </td>
  <td class="topic">
   <div class="subject">POST TITLE</div>
   <div class="author">by <a>AUTHOR</a> - POST DATE</div>
  </td>
 </tr>
 <tr>
  <td class="content">
   POST CONTENT
   <div class="commands"><a>Show Parent</a> | <a>Reply</a></div>
  </td>
 </tr>
</table>

Key Class Names

  • "headermain" identifies the course title
  • "breadcrumb" identifies breadcrumbs
  • "forumpost" identifies a post
  • "subject" identifies the post subject
  • "author" identifies the author, but also includes the date and other text
  • "content" identifies the content area, but includes some controls also
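
Because the "author" element mixes the author name with the date, a consumer has to split the text itself. The Python sketch below does that with a regular expression built on the "by AUTHOR - POST DATE" pattern shown in the markup. It is an assumption that the separator is always " - " and that author names never contain it; the date string in the usage line is invented.

```python
import re

# Pattern for the text content of <div class="author">, e.g. "by AUTHOR - POST DATE".
AUTHOR_LINE = re.compile(r"^\s*by\s+(?P<author>.+?)\s+-\s+(?P<date>.+?)\s*$")


def split_author_line(text):
    """Split the combined author/date text; return None if it does not match."""
    match = AUTHOR_LINE.match(text)
    if match is None:
        return None
    return {"author": match.group("author"), "date": match.group("date")}


print(split_author_line("by Jane Doe - Wednesday, 6 July 2005, 10:15 AM"))
```

Separate class names for the author and the date would make this kind of text scraping unnecessary.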

Concepts

  • course title
  • breadcrumb titles and links
  • post title
  • post author
  • post date
  • post content
  • original post (to which this is a reply)
  • author avatar

phpBB

phpBB is a popular GPL discussion forum system. This example is based on the forum on the phpBB website.

<body>
 <table>
  <tr>
   <td class="bodyline" bgcolor="white">
    <table width="100%" cellspacing="0" cellpadding="10" border="0">
     <tr>
      <td>
       <table>
        <tr>
         <td>
          <a class="maintitle" href="FORUM LINK">FORUM TITLE</a>
         </td>
        </tr>
       </table>
       <table width="100%" cellspacing="2" cellpadding="2" border="0">
        <tr>
         (controls)
         <td align="left" valign="middle" width="100%">
          <span class="nav"><a href="BREADCRUMB LINK" class="nav">BREADCRUMB TITLE</a>+</span>
         </td>
        </tr>
       </table>

       ENTRIES
       (controls)
       (repeat of controls and navigation)
  ...
 </table>
</body>

<table class="forumline">
 <tr>
  <td class="catHead">(controls)</td>
 </tr>
 <tr>
  <td class="row1">
   <span class="name">
    <a name="###"></a>
    <b>AUTHOR</b>
   </span>
   <span class="postdetails">
    AUTHOR STATUS (registered etc.)
    <img src="AVATAR"/>
    Joined: AUTHOR JOIN DATE
    Posts: AUTHOR POST COUNT
    Location: etc.
   </span>
  </td>
  <td class="row1">
   <table>
    <tr>
     <td>
      <span class="postdetails">Posted: POST DATE Post subject: POST TITLE</span>
     </td>
     <td>(controls)</td>
    </tr>
    <tr>
     <td>
      <span class="postbody">
       CONTENT
      </span>
     </td>
    </tr>
   </table>
  </td>
 </tr>
 (controls)
</table>

Key Class Names

  • "maintitle" identifies the forum title and link
  • "nav" is used for breadcrumbs
  • "forumline" encloses the post
  • "name" identifies the author
  • "postdetails" occurs twice, once providing information about the author, once enclosing the post date and title
  • "postbody" identifies the post content

Concepts

  • forum title and link
  • breadcrumb titles and links
  • post title
  • post author
  • post date
  • post content
  • author status
  • author join date
  • author avatar
  • author post count
  • author location

PunBB

PunBB is GPL forum software. This structure is from the forum on their web site.

<body>
 <div id="punwrap">
  <div id="punviewtopic" class="pun">
   <div id="brdheader" class="block">
    <div class="box">
     <div id="brdtitle" class="inbox">
      <h1><span>SITE SECTION TITLE</span></h1>
      <p><span>SITE SECTION BYLINE</span></p>
     </div>
     (controls)
    </div>
   </div>
  <div id="announce" class="block">
   <h2><span>ANNOUNCEMENT TITLE</span></h2>
   <div class="box">
    <div class="inbox">
     <div><span class="warntext">ANNOUNCEMENT CONTENT</span></div>
    </div>
   </div>
  </div>
  <div class="linkst">
   <div class="inbox">
    (controls)
    <ul>
     <li><a href="BREADCRUMB LINK">BREADCRUMB TITLE</a></li>+
    </ul>
   </div>
  </div>
  ENTRIES
 </div>
</body>

<div id="HTML-ID" class="blockpost">
 <h2>
  <span class="conr">POST NUMBER</span>
  <a href="PERMALINK">POST DATE</a></h2>
 <div class="box">
  <div class="inbox">
   <div class="postleft">
    <dl>
     <dt><strong><a href="AUTHOR PROFILE LINK">AUTHOR NAME</a></strong></dt>
     <dd class="usertitle"><strong>AUTHOR STATUS (e.g. Moderator)</strong></dd>
     <dd class="postavatar"><img src="AVATAR"/></dd>
     <dd>From: AUTHOR LOCATION</dd>
     <dd>Registered: AUTHOR REGISTRATION DATE</dd>
     <dd>Posts: AUTHOR POST COUNT</dd>
     <dd class="usercontacts"><a href="AUTHOR URL">Website</a></dd>
    </dl>
   </div>
   <div class="postright">
    <h3>POST TITLE</h3>
    <div class="postmsg">
     POST CONTENT
    </div>
    <div class="postsignature">AUTHOR SIGNATURE</div>
   </div>
   <div class="clearer"></div>
   <div class="postfootleft"><p>AUTHOR STATE (online, offline)</p></div>
   <div class="postfootright"><div></div></div>
  </div>
 </div>
</div>

Key Class Names

  • "blockpost" encloses a post; the element also has an HTML id
  • "conr" is the post number within the discussion thread
  • h1 indicates the site section title
  • "announce" and "warntext" indicate the announcement title and content
  • ul inside "linkst" identifies breadcrumbs
  • h2 encloses the post number, post date, and permalink, distinguishable by HTML a and span elements
  • h3 identifies the post title
  • "postmsg" identifies the post content
  • the author name and URL are in dl dt a
  • additional author information is in dl dd
  • "usertitle" identifies the author status
  • "postavatar" identifies the author's avatar
  • "usercontacts" identifies author contact information

Concepts

  • site section title and byline
  • announcement title and content
  • breadcrumb titles and links
  • post title
  • post author
  • post date
  • permalink
  • post sequence number
  • author status
  • author avatar
  • author location
  • author registration date
  • author post count
  • author URL
  • author profile link
  • author signature
  • author contact information (website)

YaBB

YaBB is a popular commercial/free forum system. This example is based on the forum on the YaBB site. I have stripped some presentational class names.

<body>
 <div class="container">
  <div class="maincontent">
   <div class="seperator">
    <table>
     (controls, login)
     <tr>
      <td>
       <span id="fscroller">FORUM WELCOME</span>
       (controls)
      </td>
     </tr>
    </table>
   </div>
   <div class="navbarcontainer">
    <table>
     <tr>
      <td>
       <span>
        <b><a href="BREADCRUMB LINK" class="nav">BREADCRUMB TITLE</a>+</b>
        (Moderators: <a href="MODERATOR LINK">MODERATOR NAME</a>+)
        <div class="seperator">
        <table>
         <tr>
          <td>
           <span>FORUM DESCRIPTION.</span>
          </td>
         </tr>
        </table>
       </div>
      </span>
      </td>
      (controls)
     </tr>
    </table>
   </div>
   ENTRIES
  </div>
  (repeat of navigation)
 </div>
</body>

<div class="displaycontainer">
<table>
 <tr>
  <td>
   <a><b>POST TITLE</b></a>
   AUTHOR STATUS (junior member, senior, etc.)<br />
   <img src="<AVATAR>" />
   AUTHOR BYLINE
   Posts: ###
   Gender: <img/>
  </td>
  <td>
   <div>
    <b>POST TITLE</b>
    <span><b>Reply #nnn on:</b> POST DATE</span>
   </div>
   <div>
    <span class="message">
     CONTENT
    </span>
   </div>
  </td>
 </tr>
 <tr>(controls)</tr>
</table>
</div>

Key Class Names

  • "displaycontainer" encloses a post
  • "message" identifies post content
  • no other fields are identified by meaningful mark-up

Concepts

  • forum welcome message
  • forum description
  • forum moderators (names and links)
  • breadcrumbs (titles and links)
  • post title
  • post author
  • post date
  • post content
  • original post (to which this is a reply)
  • author status
  • author byline
  • author avatar
  • author gender
  • author post count

Examples from the wild


See Also