Difference between revisions of "citation-brainstorming"

From Microformats Wiki

Revision as of 21:38, 22 January 2006

Citation Brainstorming


  • ...
  • ... (a bunch of good folks!)
  • Tantek Çelik
  • Tim White

See also

XHTML Structure

My experience working on X2V and hCa* has taught me which elements are easy to find and which are not. Since the Citation microformat is very new, it is possible to avoid repeating many of the same mistakes and to make it easier for extracting applications to find and imply certain properties.

  • There should be some sort of 'root node' that implies all child elements belong to the Citation microformat.
  • Since most people will have multiple citations, there should be a way to represent each citation object as a unique block, independent of the others. This keeps the parser from finding 'author' and applying it to all citations. Each citation should be in a container (class="???") that is scoped from the others.
  • Perhaps class="hcite" with <cite> recommended as the root element. E.g. <cite class="hcite">
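
The scoping idea in the list above can be sketched as a small parser. This is only an illustration, assuming the proposed (not yet agreed) root class "hcite" and the hypothetical property class names "author" and "title"; it is not how X2V or any existing extractor works.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class ScopedCitationParser(HTMLParser):
    """Collects class-named properties separately for each root container."""

    def __init__(self, root_class="hcite"):  # "hcite" is only the proposed name
        super().__init__()
        self.root_class = root_class
        self.citations = []  # one dict of properties per citation found
        self._depth = 0      # nesting depth inside the current root (0 = outside)
        self._open = []      # (depth, property name) pairs still collecting text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth == 0:
            # Outside any citation: only a root element starts a new scope.
            if self.root_class in classes:
                self.citations.append({})
                self._depth = 1
            return
        self._depth += 1
        for name in classes:
            self._open.append((self._depth, name))

    def handle_endtag(self, tag):
        if self._depth == 0 or tag in VOID_TAGS:
            return
        # Properties opened at this depth are complete once the element closes.
        self._open = [(d, n) for d, n in self._open if d != self._depth]
        self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth == 0 or not text:
            return
        for _, name in self._open:
            self.citations[-1].setdefault(name, []).append(text)


# Hypothetical markup: two citations, each in its own root element.
markup = """
<ul>
  <li><cite class="hcite"><span class="author">Carl Sandburg</span>,
      <span class="title">Arithmetic</span></cite></li>
  <li><cite class="hcite"><span class="author">Ted Rand</span></cite></li>
</ul>
"""
parser = ScopedCitationParser()
parser.feed(markup)
print(parser.citations)
```

Because every property is stored in the dictionary of the root it was found in, the 'author' of one citation is never applied to another.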

Some Thoughts

What distinguishes a cite from say Media Info (e.g. media-info-examples) is that a cite is a reference to something explicitly external to the current piece of content or document, whereas Media Info describes information about content embedded or inline in the current document.

Semantic Meaning

One of the guiding principles of Microformats is to use the most semantically rich element to describe each node (Point 2 of Semantic XHTML Design Principles: Use the most accurately precise semantic XHTML building block for each object etc). Since we are dealing with HTML and citations, several elements are candidates for enriching the semantic meaning: CITE, BLOCKQUOTE, Q, A (are there more?).

The Citation Brainstorming Page has a few developments and ideas about how to give another person credit for a link. Some of the semantic ideas behind their choices of tags can be applied to a full bibliographic-type reference.

ISBN:// Protocol


RFC 3187 defines a URN namespace for ISBNs (urn:isbn:), not a separate isbn:// protocol.



I'm not sure if any browser uses this data, but it might have an application in citations describing registered materials with an ISBN.
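
As a rough sketch of how a citation tool might produce such an identifier, the helper below checks an ISBN-10 check digit and prefixes the number with urn:isbn:, the form RFC 3187 describes. The function names are made up for illustration, and 13-digit ISBNs are deliberately not handled.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Check the ISBN-10 check digit (weights 10..1, total divisible by 11)."""
    chars = isbn.replace("-", "").replace(" ", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # "X" stands for 10 and is only allowed as the check digit
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


def isbn_to_urn(isbn: str) -> str:
    """Return a urn:isbn: identifier, rejecting numbers with a bad check digit."""
    if not isbn10_is_valid(isbn):
        raise ValueError(f"not a valid ISBN-10: {isbn!r}")
    return "urn:isbn:" + isbn.strip()


print(isbn_to_urn("0-395-36341-1"))
```

Whether the resulting URN would be placed in an href, a title attribute, or plain text is left open here, as it is in the discussion above.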

Question: what about using something like OCLC's WorldCat for linking titles? - Tim White

This and That

After reading through a lot of different citation encoding formats, I noticed that each format was being used in one of two ways: either to describe the current page (THIS.PAGE) or to encode references that point to external resources (THAT.PAGE).

The information being encoded was identical for both resources (author, date, name, etc.); they just reference different things. For this microformat, I'm not sure whether we want to try to solve both problems or just one. The meta tags in the head element would be the ideal place for information about THIS.PAGE, but that does not follow the ideals of microformats, where information is human-readable. The THAT.PAGE idea, where a list of references sits at the end of a document in the form of a bibliography, is more in line with those ideals. That doesn't mean that data about the current document shouldn't be human-readable, so some of the same properties used to reference external resources can be used for the current document (THIS.PAGE). To do this, a different root item could be used, and transforming applications could extract either the citation data about the current page or information about this page's references.

This is open for discussion, but either way, I believe that the properties used to describe a page will be the same for both THIS and THAT. - Brian Suda

Date Formatting

Since microformats are all about re-use, and the accepted way to encode Date-Time has been pretty much settled, this is a good place to start when dealing with all the different date citation types.

These are all the different fields from various citation formats that are of a temporal nature:

* Date (available | created | dateAccepted | dateCopyrighted | dateSubmitted | issued | modified | valid)
* originInfo/dateIssued
* originInfo/dateCreated
* originInfo/dateCaptured
* originInfo/dateOther
* month
* year
* Copyright Year
* Date - Generic
* Date of Conference
* Date of Publication
* Date of update/revision/issuance of database record
* Former Date
* Entry Date for Database Record
* Database Update
* Year of Publication

Several properties are common across citation domains and will certainly be in the citation microformat. The unique instances will need further consideration; otherwise there could be no end to the possibilities.

There are also several properties (year, month, Year of Publication) that can be extracted from another source. If you only encode a more specific property, such as Date of Publication, you can extract the 'year of publication' from it. Since the date-time format we are modeling after is the ISO date-time format, the year portion alone is an acceptable date. So if you ONLY know the year of publication, then you can still form a valid 'Date of Publication' as a microformat (which in turn is a valid 'year of publication'). Your mileage may vary when it comes to importing into citation applications.
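
To make the "year from a more specific date" point concrete, here is a minimal sketch that pulls the year (and month and day, when present) out of an ISO-style date string. The function name and the returned dictionary shape are assumptions for illustration, and no range checking is done on month or day.

```python
import re

# Accepts YYYY, YYYY-MM, or YYYY-MM-DD; a full date may be followed by a time part.
ISO_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2})(?:T.*)?)?)?$")


def date_parts(value: str) -> dict:
    """Split an ISO-style date into whichever of year/month/day it contains."""
    match = ISO_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not an ISO-style date: {value!r}")
    year, month, day = match.groups()
    parts = {"year": int(year)}
    if month is not None:
        parts["month"] = int(month)
    if day is not None:
        parts["day"] = int(day)
    return parts


# A year-only 'Date of Publication' still yields a 'year of publication'.
print(date_parts("1993"))
print(date_parts("1993-05-12"))
```

A consumer that only wants the year can read the "year" key regardless of how precise the encoded date was.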


It seems to me that these can be collapsed to maybe one or two different date properties. As far as the specific human readable formatting of the date, that can be chosen per whatever the presentation style guide says, and the Datetime Design Pattern used to simplify the markup. - Tantek


Some of the citation formats have a place for 'keywords' or 'generic tags', etc. This might be a good place to re-use the RelTag microformat. The downside would be that they are then forced to be links, which might nonetheless be the correct way to mark up these terms.
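
If keywords were marked up as rel="tag" links, an extractor would read each keyword from the link target rather than from the link text. The sketch below takes the last path segment of the href as the tag; the function name and the example URLs are hypothetical, and ignoring a trailing slash is an assumption.

```python
from urllib.parse import unquote, urlparse


def tag_from_href(href: str) -> str:
    """Return the last path segment of a tag link, percent-decoded."""
    path = urlparse(href).path.rstrip("/")  # query string and fragment are ignored
    return unquote(path.rsplit("/", 1)[-1])


# Hypothetical tag URLs for two of the subjects a citation might carry.
print(tag_from_href("http://example.org/tags/poetry"))
print(tag_from_href("http://example.org/tags/visual%20perception/"))
```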



OpenURL is in use in library software around the world to allow citation metadata to be embedded in URLs. Typically these URLs are used for targeting resolvers which essentially proxy access to licensed content. OpenURL also provides an XML encoding. The key/value pairs used in OpenURL could relatively easily be adapted to semantic HTML. This is a cow path that could be paved relatively easily using existing modules as appropriate.

OpenURL has a microformat-like encoding: COinS. This uses the 'title' attribute of HTML 'span' tags to embed OpenURL URLs.

Example (from a book review written using the Structured Blogging plugin):

<p><b>ISBN</b>: <span class='Z3988'
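
Since a COinS title attribute holds URL-encoded key/value pairs, reading it back takes very little code. The following is a sketch under stated assumptions: the sample title string is invented for illustration (it is not the value from the truncated example above), and the function name is hypothetical.

```python
from html import unescape
from urllib.parse import parse_qs


def parse_coins_title(title: str) -> dict:
    """Decode a COinS title attribute into a dict of key -> list of values."""
    # In HTML source the pairs are separated by "&amp;", so unescape first.
    return parse_qs(unescape(title))


# Hypothetical title value for a book, using OpenURL key names.
sample_title = (
    "ctx_ver=Z39.88-2004"
    "&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook"
    "&amp;rft.btitle=Arithmetic"
    "&amp;rft.au=Carl+Sandburg"
    "&amp;rft.date=1993"
)
fields = parse_coins_title(sample_title)
print(fields["rft.btitle"], fields["rft.au"], fields["rft.date"])
```

The same key names could plausibly be reused as class names in semantic HTML, which is the adaptation suggested above.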

MARC / MODS / Dublin Core

The MODS (example) and Dublin Core (example) transformations of MARC21 may contain some useful ideas.

Here's a first attempt at rewriting the linked examples in XHTML (written in response to a mailing list query about encoding book information with microformats):

<div class="book" lang="en">
  <h3 class="fn">Arithmetic /</h3>
  <p>By <span class="creator"><span class="fn">Sandburg, Carl</span>,
     <span class="date">1878-1967</span></span>,
     and <span class="illustrator">Rand, Ted</span></p>
  <p>Publisher: <span class="publisher"><span class="fn">Harcourt Brace Jovanovich</span>,
     <span class="locality">San Diego</span></span></p>
  <p>Published: <span class="issued">1993</span></p>
  <p class="description">A poem about numbers and their characteristics. Features
     anamorphic, or distorted, drawings which can be restored to normal by viewing
     from a particular angle or by viewing the image's reflection in the provided
     Mylar cone.</p>
  <p class="note">One Mylar sheet included in pocket.</p>
    <li class="subject">Arithmetic</li>
    <li class="subject">Children's poetry, American.</li>
    <li class="subject">Arithmetic</li>
    <li class="subject">American poetry</li>
    <li class="subject">Visual perception</li>