blog-post-brainstorming

From Microformats Wiki

Terminology

This section explores the terminology that should be used to discuss a blog post microformat. To make it easier to talk about the various types of terminology, we're using an XML-like namespace notation so that we can make statements such as: atom:entry is roughly equivalent to rss2:item; atom:feed/atom:link@rel=alternate is roughly equivalent to rss2:channel/rss2:link; or atom:author is not equivalent to rss2:item/rss2:author (because RSS 2.0 defines the author only as an email address).
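
The namespace-style statements above can be captured in a small lookup table. The Python sketch below is only an illustration: the `ROUGH_EQUIVALENTS` and `rss_equivalent` names are invented for this example, and the `rss2:` prefix is assumed to match the one used in the RSS Terminology section.

```python
# Rough Atom-to-RSS 2.0 term equivalences, written in the XML-like
# namespace notation used on this page. None means "no true equivalent".
# The table and function names are hypothetical, for illustration only.
ROUGH_EQUIVALENTS = {
    "atom:feed": "rss2:channel",
    "atom:entry": "rss2:item",
    "atom:feed/atom:link@rel=alternate": "rss2:channel/rss2:link",
    # atom:author is composite (name, email, uri); the RSS 2.0 item
    # author is only an email address, so the two are not equivalent.
    "atom:author": None,
}


def rss_equivalent(atom_term):
    """Return the rough RSS 2.0 counterpart of an Atom term, or None."""
    return ROUGH_EQUIVALENTS.get(atom_term)


if __name__ == "__main__":
    for term in ROUGH_EQUIVALENTS:
        print(term, "->", rss_equivalent(term) or "(no equivalent)")
```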

Common terminology in weblogs

Reviewing blog-post-formats#Tools, one can see that there's little standardization amongst tools, or even within an individual tool (such as 'blogger'), for the names of elements of blog posts. There are, however, many common elements, including:

  • a container for all posts/entries
  • a container for individual posts
  • the post content, which can be complete, summarized with a link to the complete post, or a couple of paragraphs with JavaScript/CSS tricks to reveal the remainder of the content
  • the name of the author
  • the posting date (in many many formats)

Although this looks like a bit of a dog's breakfast, there is usually a fair amount of rigour behind the presentation, as the same tools can also produce Atom and/or RSS feeds.

Furthermore, in developing a microformat for weblog posts, we want to be careful not to break any (or many) templates. Note that many weblog templates will have to be updated as they produce somewhat crufty HTML rather than shiny XHTML.

Atom Terminology

See here for the spec and blog-post-formats#Atom for analysis.

  • atom:feed - (composite) a collection of entries plus information about them
    • atom:author - (composite) the author of a feed (may contain atom:email, atom:name, atom:uri)
    • atom:id - a permanent identifier for a feed
    • atom:title - the title of an atom:entry or an atom:feed
    • atom:updated - the last time the feed was updated
    • atom:link@rel=alternate - the home page of a feed
    • atom:link@rel=self - the URI of the feed (where it can be downloaded)
    • atom:entry - (composite) an entry within the feed
      • atom:content - the entry's content
      • atom:summary - a summary of the entry's content
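
As a rough illustration of how the elements listed above fit together, the following Python sketch reads them from a tiny Atom document using the standard library. The sample feed, its values, and the `describe_feed` name are all invented for this example; only the element names and the Atom 1.0 namespace come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

# The Atom 1.0 namespace, bound to the "atom" prefix used on this page.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Hypothetical sample feed; every value below is invented for illustration.
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Weblog</title>
  <id>tag:example.org,2005:weblog</id>
  <updated>2005-08-14T14:55:00Z</updated>
  <author><name>Example Author</name></author>
  <link rel="alternate" href="http://example.org/"/>
  <link rel="self" href="http://example.org/feed.atom"/>
  <entry>
    <title>First post</title>
    <id>tag:example.org,2005:weblog.1</id>
    <updated>2005-08-14T14:55:00Z</updated>
    <summary>A short summary.</summary>
    <content type="text">The full text of the post.</content>
  </entry>
</feed>
"""


def describe_feed(xml_text):
    """Collect the feed-level and entry-level elements discussed above."""
    feed = ET.fromstring(xml_text)
    # Index the feed-level links by their rel attribute.
    links = {
        link.get("rel"): link.get("href")
        for link in feed.findall("atom:link", ATOM_NS)
    }
    return {
        "title": feed.findtext("atom:title", namespaces=ATOM_NS),
        "id": feed.findtext("atom:id", namespaces=ATOM_NS),
        "updated": feed.findtext("atom:updated", namespaces=ATOM_NS),
        "author": feed.findtext("atom:author/atom:name", namespaces=ATOM_NS),
        "home_page": links.get("alternate"),  # atom:link@rel=alternate
        "feed_uri": links.get("self"),        # atom:link@rel=self
        "entries": [
            {
                "title": entry.findtext("atom:title", namespaces=ATOM_NS),
                "summary": entry.findtext("atom:summary", namespaces=ATOM_NS),
                "content": entry.findtext("atom:content", namespaces=ATOM_NS),
            }
            for entry in feed.findall("atom:entry", ATOM_NS)
        ],
    }


if __name__ == "__main__":
    print(describe_feed(SAMPLE_ATOM))
```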


RSS Terminology

See here for the spec and blog-post-formats#RSS for analysis. There are a lot more elements in RSS but this covers the most commonly used ones.

  • rss2:channel - (composite) a collection of entries plus information about them
    • rss2:managingEditor - email address of the person responsible for the channel's editorial content (RSS 2.0 has no channel-level author element; compare to the composite atom:author)
    • rss2:link - The URL to the HTML website corresponding to the channel (compare to atom:link@rel=alternate)
    • rss2:title - the title of an rss2:channel or an rss2:item
    • rss2:pubDate - The publication date for the content in the channel.
    • rss2:item - (composite) an entry within the feed
      • rss2:item/link - The URL of the item. Note that this may not be a permalink for the item; it may be a link to some other page on the Internet that the rss2:item is about
      • rss2:description - The item synopsis [sic]. There is no special indication whether this is the full content of an entry, a summary, or a precis of what the rss2:item/link is pointing to
      • rss2:author - email address of the author of the item
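
The same exercise for RSS 2.0 might look like the sketch below. The sample channel, its values, and the `describe_channel` name are assumptions made for this example. Note that the code cannot tell whether rss2:description holds a summary or the full text, nor whether rss2:item/link is a permalink, which is exactly the ambiguity described above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample channel; every value below is invented for illustration.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.org/</link>
    <pubDate>Sun, 14 Aug 2005 14:55:00 -0700</pubDate>
    <item>
      <title>First post</title>
      <link>http://example.org/2005/08/first-post</link>
      <description>Either a summary or the full text.</description>
      <author>author@example.org</author>
    </item>
  </channel>
</rss>
"""


def describe_channel(xml_text):
    """Collect the channel-level and item-level elements discussed above."""
    # RSS 2.0 elements are not namespaced, so plain tag names are enough.
    channel = ET.fromstring(xml_text).find("channel")
    return {
        "title": channel.findtext("title"),
        "home_page": channel.findtext("link"),  # compare atom:link@rel=alternate
        "pub_date": channel.findtext("pubDate"),
        "items": [
            {
                "title": item.findtext("title"),
                # May or may not be a permalink; RSS 2.0 does not say.
                "link": item.findtext("link"),
                # May be a summary, the full text, or a precis of the link.
                "description": item.findtext("description"),
                # An email address only, unlike the composite atom:author.
                "author": item.findtext("author"),
            }
            for item in channel.findall("item")
        ],
    }


if __name__ == "__main__":
    print(describe_channel(SAMPLE_RSS))
```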

Discovered Elements

This section explores the information discovered from blog-post-formats using the terminology discussed above. We will only focus on the major elements of weblog posts:

  • the entry container
  • the individual entry
  • the content
  • the permalink

For now, the codification of the following major elements will be deferred, as there is (or may be) overlap with other microformats that should be explored further:

  • the poster/author
  • the posting date
  • the modified date

Further input from the community would be appreciated here.

Entry Container

Roughly speaking, this corresponds to 'atom:feed' or 'rss2:channel' (in particular, the items within those elements).

Forms seen in the wild

  • entries are within a container; that is, all entries are within an enclosing 'div'. This is common with weblog home pages (example) or archives with multiple entries.
  • entries are not within a container; that is, there are multiple entries on a single page but there is no explicit container element (example). This is also a common case for weblogs and archives.
  • there may be multiple groups of entries on a single page that are tenuously connected (example-1 example-2).
  • there is only a single entry on a page. This is common with weblogs that archive on a per entry basis (example).

Recommendation for blog-post-format microformat

  • weblog pages (including home pages, archives, category pages, tag pages and so forth) that may contain multiple entries MUST enclose the entries in an atom:feed element
  • weblog pages MAY have multiple atom:feed elements enclosing different groups of entries
  • weblog pages that have exactly one entry MAY use the atom:feed element
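
To make the recommendation concrete, here is a minimal Python sketch that checks how entries are grouped on a page. It assumes, purely for illustration, that atom:feed and atom:entry are expressed as the class names `feed` and `entry`; this page does not settle on any class names, and the sample markup and the `FeedCounter` and `count_entries` names are likewise invented.

```python
from html.parser import HTMLParser

# Hypothetical class names standing in for atom:feed and atom:entry.
FEED_CLASS = "feed"
ENTRY_CLASS = "entry"

# Hypothetical page with two separate groups of entries.
SAMPLE_PAGE = """
<div class="feed">
  <div class="entry"><h3>First post</h3></div>
  <div class="entry"><h3>Second post</h3></div>
</div>
<div class="feed">
  <div class="entry"><h3>A linked item</h3></div>
</div>
"""


class FeedCounter(HTMLParser):
    """Count entries inside each feed container (assumes feeds do not nest)."""

    def __init__(self):
        super().__init__()
        self.open_elements = []     # stack of (tag, is_feed) for open elements
        self.entries_per_feed = []  # one count per feed container, in page order
        self.feed_depth = 0         # how many feed containers are currently open

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        is_feed = FEED_CLASS in classes
        if is_feed:
            self.entries_per_feed.append(0)
            self.feed_depth += 1
        elif ENTRY_CLASS in classes and self.feed_depth > 0:
            # Only entries enclosed by a feed container are counted.
            self.entries_per_feed[-1] += 1
        self.open_elements.append((tag, is_feed))

    def handle_endtag(self, tag):
        # Ignore stray end tags that match nothing currently open.
        if all(open_tag != tag for open_tag, _ in self.open_elements):
            return
        # Pop back to the matching start tag; this also discards void
        # elements such as <br> that never receive an end tag.
        while self.open_elements:
            open_tag, is_feed = self.open_elements.pop()
            if is_feed:
                self.feed_depth -= 1
            if open_tag == tag:
                break


def count_entries(html_text):
    """Return the number of entries found in each feed container."""
    parser = FeedCounter()
    parser.feed(html_text)
    parser.close()
    return parser.entries_per_feed


if __name__ == "__main__":
    print(count_entries(SAMPLE_PAGE))
```

A page whose entries sit outside any container yields an empty list here, which is the case the MUST in the first recommendation is meant to rule out.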

Individual Entry

Content

Permalink

Obstacles

The 'content' problem

The most inconsistent element of blog posts is the content of the posts themselves. For example, one webpage may have only a summary of the post, while another may contain the first part of the content with a "More" button to see the rest. These inconsistencies may make it difficult to rationally define (or clarify) a set of microformat elements to achieve blog-post-feed-equivalence.

Header Tag for Entry Title?

--Bryan 14:55, 14 Aug 2005 (PDT)

Many weblog CMSes allow for concurrent publishing of entries in the following ways:

  • multiple entries on a page (an "Index," monthly archive, category archive, etc.; see Example)
  • one entry on a page (see Example)

Early attempts at blog-post-formats have set the title of the blog post to use the h3 tag.

At least where individual entry pages are concerned (and possibly including indexes and archives), I recommend using h1 for the entry title, given that the entry is by far the most important chunk of information on the page, and it's what we'd want search engines to recognize as such. In the case where the h1 was used for the site title, fears about "losing" this information should be allayed by simply including the site name in the title tag, after the title of the article / entry / post.

Whether an h3 or h1 is used is irrelevant; the semantics will be applied with class names. This is a non-issue. --RyanKing 22:35, 18 Aug 2005 (PDT)
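
The point about class names can be sketched in Python: a consumer that looks for a class name finds the entry title whether the template uses h1 or h3. The `entry-title` class name is a placeholder assumed for this example (this page does not define one), and the `TitleExtractor` and `extract_titles` names are invented.

```python
from html.parser import HTMLParser

# Hypothetical class name marking an entry title; not defined on this page.
TITLE_CLASS = "entry-title"


class TitleExtractor(HTMLParser):
    """Collect the text of every element carrying the title class.

    Assumes a title element does not contain another element with the
    same tag name, which holds for ordinary headings.
    """

    def __init__(self):
        super().__init__()
        self.titles = []
        self._title_tag = None  # tag name of the title element being read
        self._parts = []        # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        # The tag name is ignored; only the class name matters.
        if self._title_tag is None and TITLE_CLASS in classes:
            self._title_tag = tag
            self._parts = []

    def handle_data(self, data):
        if self._title_tag is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._title_tag is not None and tag == self._title_tag:
            self.titles.append("".join(self._parts).strip())
            self._title_tag = None


def extract_titles(html_text):
    """Return entry titles found by class name, regardless of heading level."""
    parser = TitleExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    print(extract_titles('<h1 class="entry-title">Hello</h1>'))
    print(extract_titles('<h3 class="entry-title">Hello</h3>'))
```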

Possible Uses

This section describes possible applications for a blog post microformat.

See Also