[uf-discuss] code microformat

Tue Jan 30 01:55:01 PST 2007

On 1/30/07, Colin Barrett <timber at lava.net> wrote:
> I have been posting some code listings on my blog recently. It would
> be really nice to have these sections identified (so then a source
> coloring tool could identify them and color them)
> <pre><code>
> code
> </code></pre>
> is the awful HTML I have been using. It would be nice to have
> something more semantic to put up, particularly with regards to
> licensing -- Some of the code snippets are public domain, some are
> GPL, and I don't really have any way of noting this currently.

In a blog context, I've found that common RSS and Atom tools often
fail to preserve whitespace in posts, so relying on PRE isn't
foolproof, even though it makes sense.

I've ended up entity-encoding and adding <br /> to my code examples
when I use them.  I'm normally averse to <br /> but I think that code
listings are one of the few areas where they make sense.
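
As a rough illustration of that approach (a minimal sketch, not a
recommendation), the following entity-encodes a listing and joins its
lines with <br />.  The function name encode_listing is made up for
this example; html.escape is from the Python standard library.

```python
import html


def encode_listing(code: str) -> str:
    """Entity-encode a code listing and separate its lines with <br />.

    Hypothetical helper for illustration only: it escapes &, <, > and
    quotes, then joins the lines with an explicit line-break element so
    the line structure does not depend on PRE surviving syndication.
    """
    lines = code.splitlines()
    return "<br />\n".join(html.escape(line) for line in lines)


if __name__ == "__main__":
    print(encode_listing("if (a < b) {\n  run(a & b);\n}"))
```

Note that this only preserves line breaks.  Leading indentation would
still collapse outside PRE unless it is also converted (for example to
&nbsp;), which goes beyond what is described above.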

This isn't particularly structured stuff, and I'll try to tidy it up
for the wiki, but anecdotally I'd expect the following sorts of things:

* The language the code is written in.

The HTML @lang attribute seems to be vaguely relevant.  A look at
RFC1766[1] suggests the use of x-foo values for 'unusual' languages,
the example given being x-klingon.  Would @lang="x-PHP" be considered
abuse?

* The origin of the code

Most code displayed on the web (again, this is anecdotal) is in the
form of snippets.  Some reference to the complete listing, if
available, would seem to be in order.  Is this a possible
extension/application for hCite?  At the very least the semantics
would be similar.

* Authorship (hCard), licence details (@rel="license"?  May be scope issues)
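
To make the list above concrete, here is one way the pieces might be
combined, written as a small generator so the escaping is explicit.
This is only a sketch under stated assumptions: the class names
"code-snippet" and "source" and the function render_snippet are
invented for illustration and are not part of any agreed microformat.
The "vcard"/"fn" classes are the minimal hCard pattern, and
rel="license" uses the spelling of the existing rel-license
microformat.

```python
import html


def render_snippet(code, language, source_url, author, license_url):
    """Render a code snippet with language, origin, author and licence.

    Hypothetical markup: "code-snippet" and "source" are placeholder
    class names, and the x- language tag follows the question raised
    above rather than any settled practice.
    """
    return (
        '<div class="code-snippet">\n'
        # Language: private-use x- tag on the CODE element.
        f'<pre><code lang="x-{html.escape(language)}">'
        f'{html.escape(code)}</code></pre>\n'
        # Origin: link to the complete listing, if one exists.
        f'<a class="source" href="{html.escape(source_url)}">'
        'complete listing</a>\n'
        # Authorship: minimal hCard (vcard with a formatted name).
        f'<span class="vcard"><span class="fn">{html.escape(author)}'
        '</span></span>\n'
        # Licence: rel-license link; its scope here is the open question.
        f'<a rel="license" href="{html.escape(license_url)}">licence</a>\n'
        '</div>'
    )


if __name__ == "__main__":
    print(render_snippet(
        "<?php echo 1 < 2; ?>",
        "PHP",
        "http://example.org/full-listing",
        "Example Author",
        "http://example.org/licence",
    ))
```

Wrapping everything in one container is one possible answer to the
scope question, since rel="license" would otherwise normally be read
as applying to the whole page rather than to a single snippet.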

-Ciaran

[1] http://www.ietf.org/rfc/rfc1766.txt