[uf-discuss] code microformat

Tue Jan 30 07:34:43 PST 2007

On 1/30/07, Ciaran McNulty <mail at ciaranmcnulty.com> wrote:
> On 1/30/07, Colin Barrett <timber at lava.net> wrote:
> > I have been posting some code listings on my blog recently. It would
> > be really nice to have these sections identified (so then a source
> > coloring tool could identify them and color them)
> > <pre><code>
> > code
> > </code></pre>
> > is the awful HTML I have been using. It would be nice to have
> > something more semantic to put up, particularly with regards to
> > licensing -- Some of the code snippets are public domain, some are
> > GPL, and I don't really have any way of noting this currently.
>
> In a blog context, I've found that common RSS or Atom tools don't do
> well at retaining the whitespace in posts so relying on PRE isn't
> foolproof, even though it makes sense.
>
> I've ended up entity-encoding and adding <br /> to my code examples
> when I use them.  I'm normally averse to <br /> but I think that code
> listings are one of the few areas where they make sense.

I would consider the RSS and Atom tools not to be in accordance with
the (x)html specs on this one.  And I believe that muddying the code
with non-semantic tags makes it much more difficult to digest.  As I
suggested earlier, while these tags are fine for rendered (x)html,
most of the digestion of microformats is done at the source code level.
The result is that unnecessary tags found inside the source code
would either have to be stripped, or the code would have to be
rendered by the machine doing the digesting.  If a list of allowed
tags were decided upon I don't suppose stripping would be that big of a
deal, but at least for the first proposal I think that sticking with
plain text is best.
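
To give a rough idea of what "stripping" would mean for a consumer,
here is a minimal sketch in Python using the standard library's
html.parser.  The names CodeTextExtractor and extract_code_text are
made up for illustration, and the rule it applies (keep the text,
turn <br> into a newline, drop every other tag) is only one possible
reading of an "allowed tags" list, not anything that has been agreed.

```python
from html.parser import HTMLParser


class CodeTextExtractor(HTMLParser):
    """Collect the plain text found inside <code> elements.

    Hypothetical helper: the tag handling below is an assumed policy,
    not part of any microformat proposal.
    """

    def __init__(self):
        # convert_charrefs=True decodes entities such as &lt; for us.
        super().__init__(convert_charrefs=True)
        self.depth = 0    # nesting depth of open <code> elements
        self.parts = []   # text fragments collected so far

    def handle_starttag(self, tag, attrs):
        if tag == "code":
            self.depth += 1
        elif tag == "br" and self.depth:
            # Treat <br> inside a listing as a line break.
            self.parts.append("\n")
        # Any other tag inside <code> is dropped; only its text is kept.

    def handle_endtag(self, tag):
        if tag == "code" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_code_text(markup):
    """Return the text of all <code> elements in markup, tags removed."""
    parser = CodeTextExtractor()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    sample = ("<pre><code>if (a &lt; b) {<br />"
              "  <b>return</b> a;<br />}</code></pre>")
    print(extract_code_text(sample))
```

Even this small case needs a decision about what <br> means, which
is the sort of thing a plain-text-only rule would avoid.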

>
> This isn't particularly structured stuff, I'll try and tidy it up for
> the wiki, but anecdotally I'd expect the following sorts of things:
>
> * The language the code is written in.
>
> The HTML @lang attribute seems to be vaguely relevant.  A look at
> RFC1766[1] suggests the use of x-foo values for 'unusual' languages,
> the example given being x-klingon.  Would @lang="x-PHP" be considered
> abuse?

From the RFC:

2.1.  Meaning of the language tag

"The language tag always defines a language as spoken (or written) by
human beings for communication of information to other human beings.
Computer languages are explicitly excluded."

Unfortunately it gives no guidance as to what _should_ be done with
computer languages.

>
> * The origin of the code
>
> Most code  displayed on the web (again this is anecdotal) is in the
> form of snippets.  Some reference to the complete listing if available
> would seem to be in order.  Is this a possible extension/application
> for hCite?  At the very least the semantics would be similar.

Perhaps the "origin url" could point to the originating article, the
full listing, or whichever other source best matches "where this
snippet or example originated".

>
> * Authorship (hCard), licence details (@rel="licence"?  May be scope issues)
>
> -Ciaran
>
> [1] http://www.ietf.org/rfc/rfc1766.txt