[uf-new] hAudio - audio-album and audio-podcast

Joe Andrieu joe at andrieu.net
Fri Jun 1 02:41:51 PDT 2007

Chris Griego wrote:
> On 5/31/07, Manu Sporny <msporny at digitalbazaar.com> wrote:
> > > Your use of audio-album could cause problems later in the semantic
> > > meaning, iTunes has many celebrity playlists, which are not actually
> > > ALBUMS, but are a collection of related songs. The term podcast
> > > seems very 2005, in 4 years will we still use 'podcasts' maybe, maybe
> > > not?
> >
> > We're not concerned with what might happen in the future. We're 
> > concerned with what's already there - the cow paths. The two major 
> > types of grouping in the audio-examples are podcasts and albums.
>
> Albums, podcasts, whatever, aren't they all forms of playlists?
> haudio with optional hplaylist?

Not unless you can provide examples of them being displayed on the web as the same thing.

Albums suggest the semantics of a particular "publication", with a date, a title, cover art, publisher, artist, etc.

Playlists may or may not ever have been "published" in the traditional sense, especially personal playlists. A playlist /could/ just
be a list of audio clips without a dereferenceable link to the streams themselves. Note that FIQL[1] offers auto-play in Napster and
Rhapsody, but rarely are all of the songs in the list playable.

A podcast is really just a single chunk of audio offered as a "cast". It /does/ contain or reference a URL to listen to the audio
(or at least the URL where it was once available). Podcasts are "published" in the sense of being created and/or offered for
consumption on a particular date.

I would argue that a podcast is, in particular, a conceptually complete downloadable audio "performance".  That term may evolve, but
the concept is pretty stable and "podcast" is unique enough that it might survive. They usually have an artist and creator, but
"publishers" are optional. I suppose NPR acts as a publisher, but many podcasts are self-published.

Albums, on the other hand, are complete audio collections published /as a unit/ with associated meta-media and meta-data like cover
art, publisher, artist, etc. They typically consist of multiple songs, usually by a single artist, but not always.

Three very different things.

From reviewing the examples [2], I think there are five distinct semantic items here, at a minimum: album, playlist, track,
audiofile, and podcast.

The Beatles' White Album at Amazon [3] has two discs, which is not an uncommon situation. Each disc has its own playlist. At
Wikipedia [4], the same album has four sides, each with its own playlist.

Some are more atomic than others, but I think the overlap is entirely constructive if organized correctly.

Here's my brief thinking about how these fit together:
o An album contains one or more playlists plus album meta-data (release-date, etc.) and potential meta-media (album cover, etc.)
o A playlist is a list of tracks, plus perhaps meta-data such as genre, creator, etc.
o A track specifies a particular audio clip: length, author, title, etc. It may point to zero or more audio files for that track.
o An audiofile is a particular media file specifying format, URL, etc.
o A podcast is an audiofile plus publication meta-data such as publisher, release date, etc.
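
To make the five entities above a bit more concrete, here is a rough sketch of them as Python dataclasses. The field names
(media_format, label, cover_art_url, and so on) are my own placeholders, not anything from the proposed spec, and the sample data is
just the White Album tracks I use further down.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AudioFile:
    """A particular media file: format, URL, etc."""
    url: str
    media_format: str  # placeholder name, e.g. "audio/mpeg"


@dataclass
class Track:
    """A particular audio clip; may point to zero or more audio files."""
    title: str
    duration: Optional[str] = None
    author: Optional[str] = None
    files: List[AudioFile] = field(default_factory=list)


@dataclass
class Playlist:
    """A list of tracks plus optional meta-data."""
    tracks: List[Track] = field(default_factory=list)
    label: Optional[str] = None  # placeholder, e.g. "Disc: 1" or "Side 2"
    genre: Optional[str] = None
    creator: Optional[str] = None


@dataclass
class Album:
    """One or more playlists plus album meta-data and meta-media."""
    title: str
    artist: Optional[str] = None
    release_date: Optional[str] = None
    cover_art_url: Optional[str] = None
    playlists: List[Playlist] = field(default_factory=list)


@dataclass
class Podcast:
    """An audio file plus publication meta-data."""
    audio: AudioFile
    publisher: Optional[str] = None
    release_date: Optional[str] = None


white_album = Album(
    title="The White Album",
    artist="The Beatles",
    release_date="22 November 1968",
    playlists=[
        Playlist(label="Disc: 1", tracks=[
            Track("Back in the U.S.S.R.", "2:43"),
            Track("Dear Prudence", "3:56"),
        ]),
        Playlist(label="Disc: 2", tracks=[
            Track("Birthday", "2:42"),
            Track("Yer Blues", "4:00"),
        ]),
    ],
)
```

Note that a Track with an empty files list is perfectly valid here, which matches playlists that name songs without linking to any
audio.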

Any of these /entities/ could have a number of attributes.

Reviewing the proposed spec, it seems like all five of these entities are being jammed into one hAudio, which means I can't tell
which is a track or a podcast, which means a loss of semantic information.  It would also allow you to put an album inside a track,
inside a podcast. Kind of like a riddle wrapped in a puzzle wrapped in an enigma.
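
As a rough illustration of what distinct entity names buy you, here is a small Python sketch that checks nesting against an explicit
containment table. The table and the (kind, children) tuple shape are assumptions of mine for the sake of the example; they are not
part of the proposed spec.

```python
# Hypothetical containment rules: which entity may directly contain which.
ALLOWED_CHILDREN = {
    "album": {"playlist"},
    "playlist": {"track"},
    "track": {"audiofile"},
    "podcast": {"audiofile"},
    "audiofile": set(),
}


def nesting_errors(node, path=()):
    """Return a list of containment violations found under node.

    node is a (kind, children) tuple, where children is a list of nodes.
    """
    kind, children = node
    here = path + (kind,)
    errors = []
    for child in children:
        child_kind = child[0]
        if child_kind not in ALLOWED_CHILDREN.get(kind, set()):
            errors.append(f"{child_kind} not allowed inside {' > '.join(here)}")
        # Keep descending so that every violation is reported, not just the first.
        errors.extend(nesting_errors(child, here))
    return errors


good = ("album", [("playlist", [("track", [])])])
bad = ("podcast", [("track", [("album", [])])])

print(nesting_errors(good))
print(nesting_errors(bad))
```

With a single undifferentiated hAudio there is nothing to put in such a table, so the album-inside-a-track-inside-a-podcast case
cannot even be detected.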

Using Amazon's structure for the moment, how about something like this instead:
<div class="hAudio">
 <div class="album">
  <h1 class="title">The White Album</h1>
  <div class="release-date">22 November 1968</div>
  <h2 class="artist">The Beatles</h2>
  <div class="playlist">
   <h3 class="disc">Disc: 1</h3>
   <ol>
    <li class="track"><span class="trackTitle">Back in the U.S.S.R.</span> - <span class="duration">2:43</span></li>
    <li class="track"><span class="trackTitle">Dear Prudence</span> - <span class="duration">3:56</span></li>
   </ol>
  </div>
  <div class="playlist">
   <h3 class="disc">Disc: 2</h3>
   <ol>
    <li class="track"><span class="trackTitle">Birthday</span> - <span class="duration">2:42</span></li>
    <li class="track"><span class="trackTitle">Yer Blues</span> - <span class="duration">4:00</span></li>
   </ol>
  </div>
 </div>
</div>

Obviously, the attributes are just placeholders. I'm not sure if "fn" should be applied to The Beatles in this case, as they are not
a "person", but I expect others will find that acceptable. I do, however, like that "disc" as a potential part of "album" indicates
that this album is a CD, while a "side" would indicate vinyl. And I don't even want to get into the Date/ISO/ABBR issue, so just
replace the "release-date" with whatever semantics you prefer.

By being explicit about the "grouping" by using album and disc, we retain a lot of the semantic information that is visually obvious
to a human when looking at the page at Amazon or Wikipedia. Dump the above into an HTML file and you'll see it is quite legible. I
also think it is easier to understand and author.
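
To show that the explicit grouping is also easy for a machine to recover, here is a minimal Python sketch using the standard
library's html.parser. It assumes well-formed markup with the placeholder class names from my example (title, artist, playlist, disc,
track, trackTitle, duration); the AlbumParser class and the dictionary layout are hypothetical, and it makes no attempt to handle
real-world pages.

```python
from html.parser import HTMLParser

MARKUP = """
<div class="hAudio">
 <div class="album">
  <h1 class="title">The White Album</h1>
  <h2 class="artist">The Beatles</h2>
  <div class="playlist">
   <h3 class="disc">Disc: 1</h3>
   <ol>
    <li class="track"><span class="trackTitle">Back in the U.S.S.R.</span> - <span class="duration">2:43</span></li>
    <li class="track"><span class="trackTitle">Dear Prudence</span> - <span class="duration">3:56</span></li>
   </ol>
  </div>
  <div class="playlist">
   <h3 class="disc">Disc: 2</h3>
   <ol>
    <li class="track"><span class="trackTitle">Birthday</span> - <span class="duration">2:42</span></li>
   </ol>
  </div>
 </div>
</div>
"""


class AlbumParser(HTMLParser):
    """Collect title, artist, and per-disc track lists by class name."""

    TEXT_CLASSES = {"title", "artist", "disc", "trackTitle", "duration"}

    def __init__(self):
        super().__init__()
        self.album = {"title": None, "artist": None, "playlists": []}
        self._capture = None  # class name whose text we are currently reading

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if "playlist" in classes:
            # Each "playlist" element opens a new disc-level grouping.
            self.album["playlists"].append({"disc": None, "tracks": []})
        elif "track" in classes:
            # Assumes every track sits inside a playlist, as in MARKUP.
            self.album["playlists"][-1]["tracks"].append({})
        for name in classes:
            if name in self.TEXT_CLASSES:
                self._capture = name

    def handle_endtag(self, tag):
        self._capture = None

    def handle_data(self, data):
        name, text = self._capture, data.strip()
        if not name or not text:
            return
        if name in ("title", "artist"):
            self.album[name] = text
        elif name == "disc":
            self.album["playlists"][-1]["disc"] = text
        else:  # trackTitle or duration
            self.album["playlists"][-1]["tracks"][-1][name] = text


parser = AlbumParser()
parser.feed(MARKUP)
album = parser.album
```

The point is only that album, playlist, and track fall out of the class names directly, with no guessing about which hAudio is which.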

Note that in the Amazon case, the playlists do not provide links to audio files.

Good list of examples, btw. I appreciated the work you've already done. It helped me understand what you are trying to do. Thanks to
those who have contributed to that.


[1] http://www.fiql.com/
[2] http://microformats.org/wiki/audio-info-examples
[3] http://www.amazon.com/Beatles-White-Album/dp/B000002UAX

Joe Andrieu
SwitchBook Software
joe at switchbook.com
+1 (805) 705-8651 
