From Microformats Wiki
Revision as of 05:30, 18 January 2006 by Tantek (talk | contribs) (noted that many sections should be moved to cite-formats)

Citation Brainstorming

XHTML Structure

My experience working on X2V and hCa* has taught me which elements are easy to find and which are not. Since the Citation microformat is very new, it is possible to avoid making a lot of the same errors twice and to make it easier for extracting applications to find and imply certain properties.

  • There should be some sort of 'root node' that implies all child elements are for the Citation microformat.
  • Since most people will have multiple Citations, there should be a way to represent each Citation object as a unique block independent of the others. This is to keep the parser from finding 'author' and applying it to all citations. Each citation should be in a container (class="???") that is scoped from the others.
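
The scoping idea in the two points above can be sketched in Python. This is only an illustration, not a proposed parser: the root class name "citation" is a placeholder (the class name is still an open "???" on this page), and the class and function names are made up for the example. Each root container gets its own record, so an 'author' found in one citation never leaks into another.

```python
from html.parser import HTMLParser

ROOT_CLASS = "citation"  # placeholder: the real root class name is undecided
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # elements with no end tag


class CitationScoper(HTMLParser):
    """Collect 'author' values separately for each root container."""

    def __init__(self):
        super().__init__()
        self.citations = []       # one dict per root container
        self.depth = 0            # element depth inside the current root (0 = outside)
        self.author_depth = None  # depth at which the open 'author' element started

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth == 0:
            # Outside any citation: only a root node opens a new scope.
            if ROOT_CLASS in classes:
                self.citations.append({"author": []})
                self.depth = 1
            return
        self.depth += 1
        if "author" in classes and self.author_depth is None:
            self.author_depth = self.depth
            self.citations[-1]["author"].append("")

    def handle_endtag(self, tag):
        if self.depth == 0 or tag in VOID_TAGS:
            return
        if self.author_depth == self.depth:
            self.author_depth = None
        self.depth -= 1

    def handle_data(self, data):
        # Text only counts while an 'author' element is open inside a root.
        if self.author_depth is not None:
            self.citations[-1]["author"][-1] += data


def extract_citations(html):
    """Return one {'author': [...]} dict per root container found in html."""
    parser = CitationScoper()
    parser.feed(html)
    parser.close()
    return parser.citations
```

Feeding it a list with two containers yields two separate records, and an 'author' outside any container is ignored.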

@@ More points will be posted as I remember them.

Semantic Meaning

One of the guiding principles of Microformats is to use the most semantically rich element to describe each node (Point 2 of Semantic XHTML Design Principles: Use the most accurately precise semantic XHTML building block for each object, etc.). Since we are dealing with HTML and citations, several elements are candidates to be used to enrich the semantic meaning: CITE, BLOCKQUOTE, Q, A (are there more?).

The Citation Brainstorming Page has a few developments and ideas about how to give another person credit for a link. Some of the semantic ideas behind their choices of tags can be applied to a full bibliographic type reference.

ISBN:// Protocol


RFC 3187 defines how an ISBN can be written as a URN (urn:isbn:...).



I'm not sure if any browser uses this data, but it might have an application in citations describing registered materials with an ISBN.
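
As a rough sketch of how a citation tool might produce such an identifier, the Python below checks the standard ISBN-10 check digit and then prefixes the number with urn:isbn:. The function names are made up for this example, the sample ISBN is only illustrative, and 13-digit ISBNs are not handled.

```python
DIGITS = "0123456789"


def isbn10_is_valid(isbn):
    """Check the ISBN-10 check digit; hyphens and spaces are ignored."""
    chars = [c for c in isbn.upper() if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # 'X' stands for 10 and is only allowed as the check digit
        elif char in DIGITS:
            value = int(char)
        else:
            return False
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


def isbn_to_urn(isbn):
    """Return the ISBN as a urn:isbn: identifier, rejecting bad check digits."""
    if not isbn10_is_valid(isbn):
        raise ValueError("not a valid ISBN-10: %r" % isbn)
    return "urn:isbn:" + isbn.strip()


print(isbn_to_urn("0-395-36341-1"))
```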

Question: what about using something like OCLC's WorldCat for linking titles? - Tim White

This and That

After reading through a lot of different citation encoding formats, I noticed that each format was being used in one of two ways. It was either being used to describe the current page (THIS.PAGE) or to encode references that point to external resources (THAT.PAGE).

The information being encoded was identical for both resources (author, date, name, etc.); they just reference different things. For this microformat, I'm not sure if we want to try to solve both problems, or just one. The meta tags in the head element would be the ideal place for information about THIS.PAGE, but that is not in keeping with the ideals of microformats, where information is human-readable. The THAT.PAGE idea, where a list of references is at the end of a document in the form of a bibliography, is more in line with those ideals. That doesn't mean that data about the current document shouldn't be human-readable, so some of the same properties used to reference external resources can be used for the current document (THIS.PAGE). To do this, a different root item could be used, and transforming applications could extract either the citation data about the current page or the information about this page's references.

This is open for discussion, but either way, I believe that the properties used to describe a page will be the same for both THIS and THAT. - Brian Suda

Date Formatting

Since microformats are all about re-use, and the accepted way to encode Date-Time has been pretty much settled, this is a good place to start when dealing with all the different date citation types.

These are all the different fields from various citation formats that are of temporal nature:

* Date (available | created | dateAccepted | dateCopyrighted | dateSubmitted | issued | modified | valid)
* originInfo/dateIssued
* originInfo/dateCreated
* originInfo/dateCaptured
* originInfo/dateOther
* month
* year
* Copyright Year
* Date - Generic
* Date of Conference
* Date of Publication
* Date of update/revision/issuance of database record
* Former Date
* Entry Date for Database Record
* Database Update
* Year of Publication

Several properties are common across citation domains and will certainly be in the citation microformat. The unique instances will need further consideration; otherwise there could be no end to the possibilities.

There are also several properties (year, month, Year of Publication) that can be extracted from another source. Therefore, if you only encode a more specific property, such as Date of Publication, you can extract the 'year of publication' from that. Since the date-time format we are modeling after is the ISO date-time format, just the year portion is an acceptable date. So if you ONLY know the year of publication, then you can form a valid 'Date of Publication' as a microformat (which in turn is a valid 'year of publication'). Your mileage may vary when it comes to importing into citation applications.
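
A small Python sketch of that idea: one ISO-style date value is encoded, and the coarser properties (year, month) are derived from it. It only accepts the three calendar forms YYYY, YYYY-MM and YYYY-MM-DD, which is an assumption made for the example, and the function name is made up.

```python
import re

# Accepts YYYY, YYYY-MM or YYYY-MM-DD; other ISO 8601 forms are out of scope here.
ISO_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def date_parts(value):
    """Split a (possibly partial) ISO date into the parts that are present."""
    match = ISO_DATE.match(value.strip())
    if match is None:
        raise ValueError("not an ISO-style date: %r" % value)
    year, month, day = match.groups()
    parts = {"year": int(year)}
    if month is not None:
        parts["month"] = int(month)
    if day is not None:
        parts["day"] = int(day)
    return parts


# A year-only 'Date of Publication' is still a usable date value.
print(date_parts("1993"))
# A full date carries the year and month with it.
print(date_parts("1993-05-12"))
```

With this approach a separate 'year' or 'month' property is redundant whenever a more specific date property is present.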


It seems to me that these can be collapsed to maybe one or two different date properties. As far as the specific human readable formatting of the date, that can be chosen per whatever the presentation style guide says, and the Datetime Design Pattern used to simplify the markup. - Tantek

Types and Roles

(Section is informative only as a place to capture various parts of publication citations.)


There are many different types of publications and this information should be captured in the citation. Possible types include:

  • Novel/fiction (specify type -- literature, sci-fi, romance, etc.?)
  • Non-fiction
  • Poem
  • Play
  • Magazine
  • Reference (separate out encyclopedia, dictionary, almanac, etc.?)
  • Journal
  • Article within a journal
  • Chapter within a book
  • Dissertation
  • Web Site
  • Page within a web site
  • Music Recording
  • Video Recording
  • Interview
  • Physical object (Statue, Painting, etc.)
  • ??

Question: Certain works have specific types of citations. For example, the Bible (and, I assume, other religious works) has a very specific citation format with different relevant information (chapter/verse) than others, as do the works of Shakespeare. Should these be considered separate types/roles?

A: I think in terms of types, we should at least note the items (chapter, verse, etc). How they get dealt with is still way up in the air. - Tim

Likewise, there are several different roles associated with publications -- author, co-author, editor, translator, etc. Should these be captured under a master "role" or treated as individual elements?

A: Good question. I think there is an important distinction, but whether we follow a design pattern of "role-*" (or more likely "author-*") or some other pattern hasn't been discussed yet. - Tim


Some of the citation formats have a place for 'keywords' or 'generic tags', etc. This might be a good place to re-use the RelTag microformat. The downside would be that they are then forced to be links, which might be the correct way to mark up these terms.
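
To show what re-using RelTag would mean for an extracting application, here is a minimal Python sketch. In RelTag the tag is the last path segment of the link's href, not the link text, which is why each keyword has to be a link. The example.org URLs and the class and function names are placeholders for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import unquote, urlparse


class RelTagCollector(HTMLParser):
    """Collect tags from <a rel="tag"> links, reading the tag from the href."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "tag" in rels and href:
            # The tag is the last path segment of the URL, percent-decoded.
            path = urlparse(href).path.rstrip("/")
            self.tags.append(unquote(path.rsplit("/", 1)[-1]))


def extract_tags(html):
    """Return the RelTag keywords found in html, in document order."""
    collector = RelTagCollector()
    collector.feed(html)
    collector.close()
    return collector.tags


keywords = extract_tags(
    '<a rel="tag" href="http://example.org/tags/poetry">Poetry</a> '
    '<a rel="tag" href="http://example.org/tags/visual%20perception">Perception</a>'
)
print(keywords)
```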



OpenURL is in use in library software around the world to allow citation metadata to be embedded in URLs. Typically these URLs are used for targeting resolvers which essentially proxy access to licensed content. OpenURL also provides an XML encoding. The key/value pairs used in OpenURL could relatively easily be adapted to semantic HTML. This is a cow path that could be paved relatively easily using existing modules as appropriate.

OpenURL has a microformat-like encoding: COinS. This uses the 'title' attribute of HTML 'span' tags to embed OpenURL URLs.

Example (from a book review written using the Structured Blogging plugin):

<p><b>ISBN</b>: <span class='Z3988'
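
Separately from the example above, here is a rough Python sketch of how an application could read COinS data back out of a page. It assumes the 'title' attribute holds the OpenURL key/value pairs as an ordinary query string. The sample markup is made up for illustration, re-using the "Arithmetic" book described elsewhere on this page, and the class and function names are placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qs


class CoinsCollector(HTMLParser):
    """Collect the key/value pairs from every <span class="Z3988" title="...">."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if tag == "span" and "Z3988" in classes:
            # HTMLParser has already turned &amp; back into & at this point.
            self.records.append(parse_qs(attributes.get("title") or ""))


def parse_coins(html):
    """Return one dict of key -> list of values per COinS span in html."""
    collector = CoinsCollector()
    collector.feed(html)
    collector.close()
    return collector.records


sample = (
    '<span class="Z3988" title="ctx_ver=Z39.88-2004'
    '&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook'
    '&amp;rft.btitle=Arithmetic&amp;rft.au=Sandburg%2C+Carl'
    '&amp;rft.date=1993"></span>'
)
record = parse_coins(sample)[0]
print(record["rft.btitle"], record["rft.au"], record["rft.date"])
```

Because the values are plain key/value pairs, mapping them onto class names in semantic HTML would mostly be a renaming exercise.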

MARC / MODS / Dublin Core

The MODS (example) and Dublin Core (example) transformations of MARC21 may contain some useful ideas.

Here's a first attempt at rewriting the linked examples in XHTML (written in response to a mailing list query about encoding book information with microformats):

<div class="book" lang="en">
  <h3 class="fn">Arithmetic /</h3>
  <p>By <span class="creator"><span class="fn">Sandburg, Carl</span>,
     <span class="date">1878-1967</span></span>,
     and <span class="illustrator">Rand, Ted</span></p>
  <p>Publisher: <span class="publisher"><span class="fn">Harcourt Brace Jovanovich</span>,
     <span class="locality">San Diego</span></span></p>
  <p>Published: <span class="issued">1993</span></p>
  <p class="description">A poem about numbers and their characteristics. Features
     anamorphic, or distorted, drawings which can be restored to normal by viewing
     from a particular angle or by viewing the image's reflection in the provided
     Mylar cone.</p>
  <p class="note">One Mylar sheet included in pocket.</p>
  <ul>
    <li class="subject">Arithmetic</li>
    <li class="subject">Children's poetry, American.</li>
    <li class="subject">Arithmetic</li>
    <li class="subject">American poetry</li>
    <li class="subject">Visual perception</li>
  </ul>
</div>

Ann Arbor District Library XML feed


Here is an example of the HTML (inside an RSS feed) that the Ann Arbor District Library is generating and wants to mark up with some sort of publication microformat.

Here's a record in XML format from their project:

<callnum>823 Bu</callnum>
<author>Burkart, Gina, 1971-</author>
<fulltitle>A parent's guide to Harry Potter / Gina Burkart</fulltitle>
<title>A parent's guide to Harry Potter </title>
<pubinfo>Downers Grove, Ill. : InterVarsity Press, c2005</pubinfo>
<desc>112 p</desc>
<bibliography>Includes bibliographical references</bibliography>
The Harry hype -- More than a story -- The modern fairy tale -- Discussing fantasy with children -- Morals, not magic -- The real issues in Harry Potter -- Dealing with traumatic experiences -- Facing fears -- Battling bullies -- Delving into diversity -- Hiding hurts -- Letting go of anger -- Getting help -- Choosing good over evil -- The power of love -- Facing spiritual battles
<avail>No copies available</avail>
<recordlink xlink:href=""/>
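
As a starting point for discussion, the Python sketch below turns a record like this into class-annotated XHTML. Everything about the output is an assumption: the class names "book", "fn" and "creator" are borrowed from the first-attempt markup in the MARC / MODS / Dublin Core section, "pubinfo" simply mirrors the feed's element name, and the wrapping <record> element is added only so the fragment can be parsed as XML.

```python
import xml.etree.ElementTree as ET
from html import escape


def record_to_xhtml(record_xml):
    """Render a few fields of a catalogue record as class-annotated XHTML."""
    record = ET.fromstring(record_xml)

    def text(name):
        # Missing elements become empty strings; text is escaped for XHTML.
        return escape((record.findtext(name) or "").strip(), quote=False)

    lines = [
        '<div class="book">',
        '  <h3 class="fn">%s</h3>' % text("title"),
        '  <p>By <span class="creator">%s</span></p>' % text("author"),
        '  <p class="pubinfo">%s</p>' % text("pubinfo"),
        "</div>",
    ]
    return "\n".join(lines)


# The <record> wrapper is hypothetical; the fields are copied from the feed above.
sample_record = """<record>
<callnum>823 Bu</callnum>
<author>Burkart, Gina, 1971-</author>
<title>A parent's guide to Harry Potter </title>
<pubinfo>Downers Grove, Ill. : InterVarsity Press, c2005</pubinfo>
</record>"""

print(record_to_xhtml(sample_record))
```

Fields such as the author's birth year and the place, publisher and date inside pubinfo are left unsplit here; breaking them into separate properties is one of the open questions for the format.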