citation-brainstorming: Difference between revisions

From Microformats Wiki
(→‎Contributors: add Brian)
m (s/<source>/<syntaxhighlight>/)
 
(98 intermediate revisions by 14 users not shown)
Line 1: Line 1:
<h1> Citation Brainstorming </h1>
{{DISPLAYTITLE: Citation Brainstorming }}


__TOC__
Part of the overall effort to develop a [[citation]] microformat.


== Contributors ==
== Use Cases ==
To focus the discussion, please add use cases below that will help show what problems the citation microformat will be solving.
 
Use cases for both publishing and consuming citation information can help to focus citation brainstorming on efforts that provide real world utility to users.
 
For now, please add any use cases you think of, however common or obscure (feel free to note opinions as to expected/known frequency of use of such use cases).


=== improve web citations ===
Articles on the web often cite other online articles with permalinks (e.g. <span id="Blogs_quoting_other_resources.2C_including_blogs">blogs quoting other resources, including blogs</span>). Such web citations could be improved both in content and interaction in a number of useful ways:
* '''richer citations.''' Existing web citations typically include only permalink URL and article title in a hyperlink. An explicit format (both in microformat and style) for web citations could encourage the use of richer citations with information like author and date(time) of publication. Author information is useful because it provides an immediate inline proxy for reputation, and date of publication is useful because it sets a context for the information backed by the citation.
* '''richer citation interfaces.''' Web articles sometimes provide an explicit user interface to copy/paste a permalink for reference purposes, or a hyperlink embed code for linking to the article from another web article. An explicit markup structure/format could encourage such interfaces to provide a richer citation structure (e.g. including author, date of publication) to copy/paste, with little to no change in overall UI. This is useful in that it would help propagate richer citations themselves, which have the advantages mentioned above.
Additional useful rich citation enhancements:
* '''access date.''' Rich citations could include the access date when an author (blogger) made a citation, because resources on the other side of those links can change without notice.
* ...
* ...
* ... (a bunch of good folks!)
* Tantek Çelik
* Tim White
* Michael McCracken
* Brian Suda


== See also ==
=== I read this ===
* [[citation]]
A reader wants to collect a set of things they've read (e.g. on the web), perhaps for the purposes of cataloging them, adding notes, and using the information to generate later citations, potentially in other forms, such as BibTeX or Docbook, for inclusion in a publication of their own.
* [[citation-examples]]
* [[citation-formats]]
* [[citation-faq]]


== Use Cases ==
If web articles (e.g. blog posts) contained discoverable descriptions of self-citations (e.g. permalinks plus authorship), browsers/aggregators could both automatically collect these, perhaps as part of an enhanced browser history functionality, or allow explicit collection, e.g. bookmarking with additional structure.


To focus the discussion, please add use cases below that will help show what problems the citation microformat will be solving.
Notes: In this case, it isn't important to the user what style the citation takes as displayed on the page where they find it. What *is* important is that it contains enough information to allow generation of the format they will ultimately re-publish it in. This implies that it may be worthwhile to err a little on the side of verbosity, but at most enough to provide typical TCMOS/APA/MLA citations.


I've included two, focusing on consuming information - I've assumed that use cases for generating microformatted content would just involve the desire to enable your content to be consumed better, but I'm interested to see if there's something I'm missing here -Mike
=== collect further reading ===
Was part of <span id="Acquiring_reference_information_from_the_web">Acquiring reference information from the web</span>.


=== Acquiring reference information from the web ===
A reader finds a list of citations (e.g. a paper's bibliography, an author's papers page, results of a search for academic papers), and wants to add them to a queue of things they'd like to read, perhaps as part of further research on whatever subject/person they were reading/researching.


A user either finds an author's papers page, or is viewing the results of a search and would like to import the information about the displayed papers into their local reference database, for the purposes of cataloging things they've read, adding notes, and using the information to generate later citations, potentially in other forms, such as BibTeX or Docbook, for inclusion in a publication of their own.
Marking up the list of citations with a microformat would enable browsers/aggregators to present an explicit list of structured citations with a user interface for one-click addition to a read-it-later list (or a local reference database).


Notes: In this case, it isn't important to the user what format the citation takes as displayed on the page where they find it. What *is* important is that it contains enough information to allow generation of the format they will ultimately re-publish it in. This implies that it may be worthwhile to err a little on the side of verbosity.
Links to downloadable full representations of the cited work (e.g. link to the PDF of a journal article, or to a music file) would help the reader find cited works, and perhaps even have their browser/aggregator prefetch/cache/download them.
 
Also, links to downloadable full representations of the cited work are very important - e.g. a link to the PDF of a journal article, or to a music file.


=== Subscribing to reading lists, periodicals, etc ===
I would like to be able to leverage my news aggregator with hAtom to subscribe to a remote source for citation information, for example:


Line 43: Line 45:


=== Aggregating reading lists and reviews ===
A citation microformat-specific aggregator could provide a decentralized version of [http://citeulike.org/ CiteULike]. Libraries, authors, research groups, and publishers could mark up their collections, while other people on weblogs or review sites could add tags and reviews.


Line 51: Line 52:
Capturing/copying HTML from web pages for use in other applications (especially when those apps present HTML as output), such as pasting into Word, or a specialized application like [http://www.google.com/notebook Google Notebook], [http://onfolio.com Onfolio] or [http://www.kaboodle.com Kaboodle].  When such captures are made, it makes sense to keep track of the full citation data, including the date it was accessed, which may or may not be the date it was published.  


=== Blogs quoting other resources, including blogs ===
===Finding in Library===
Any blog that cites online content, whether a blog or news article, could use an hCitation to properly link to the cited reference. Such citations could include the access date when the blogger made the citation, because resources on the other side of those links can change without notice.  
Find a copy of the cited work in a nearby library (as with [http://ocoins.info/ OpenCOinS]).


Instead, today we have simple formatting with a link to the permaURL. The citation data is completely lacking. See [http://doc.weblogs.com Doc Searls's blog] for a style of referencing that could benefit from a proper citation uF.
===Buy a copy===
Find the cited work on, for example, Amazon or [http://www.abebooks.com/ ABE]; or subscribe to a journal via its own website.


===Find reviews===
Find third-party reviews of the cited work.


Fascinating... after I added the last two use cases, I realized they focus on potentially marginal cases. The first because it is missing the "output" part of the cut & paste, where the uF would actually be used as part of the paste.  The latter because bloggers have a working citation mechanism that is just a link to the URL (hopefully a permaURL). One could argue they wouldn't want a full hCitation. And in fact, until a tool exists that makes it easy, they probably won't. However, a tool that cuts & pastes from anywhere on the web into a blog with a full citation seems like a nice tool.  But again, I'm not really paving the cowpaths with these ideas. -Joe Andrieu
===Give citation data for the page being visited===
Adding a class of, say, "self" to an attribute of the proposed strawman would allow users (or user agents) to extract the data required to cite the page being visited, when referring to it elsewhere. There would be the added advantage of allowing the citation to be ignored by any parser which might be building a "tree" of citations, and preventing the setting up of an infinite loop.  


== Original hBib Discussion ==
For evidence of published "self citation" data (albeit on a secondary page) see the "cite this article" link on any Wikipedia entry, e.g. [http://en.wikipedia.org/w/index.php?title=Special:Cite&page=West_Midland_Bird_Club&id=115894372] from [http://en.wikipedia.org/wiki/West_Midland_Bird_Club].


During the WWW2005 Developer's Day [[microformats]] track, Rohit Khare gave a [[presentations|presentation]] where he discussed the microformats [[process]], and then did  a quick demonstration wherein a bunch of us got on a shared Subethaedit document, and brainstormed some thoughts on what an "hBib" bibliography citation microformat would look like.  Rohit placed the [http://cnlabs.commerce.net/~rohit/hBib%20Discussion.html document on his Commercenet site].
See also [http://en.wikipedia.org/wiki/Wikipedia_talk:Citing_Wikipedia#Citation_data_should_be_on_the_page_concerned Proposal to include on-page citation data in Wikipedia]


* http://cnlabs.commerce.net/~rohit/hBib%20Discussion.html
=== Cite a journal on Wikipedia ===
* (from a mailing list):
:<blockquote>if you want to cite a [biomedical journal] journal article on Wikipedia [...] you can export a correctly-formatted citation for Wikipedia from HubMed using unAPI... http://hublog.hubmed.org/archives/001408.html</blockquote>


''An attempt to summarize and inline the linked document follows. -Mike''
*[http://www.zotero.org/ Zotero], a Firefox extension to help collect, manage, and cite research sources.  


Two major goals were outlined by the group:
== principles ==
Principles help guide and compare various different brainstorming proposals.


* Avoid re-keying references
In the first three years of development the citation microformat effort generated a number of brainstorm proposals without clear consensus or adoption of any of them in particular. Thus any new (2012+) proposals must be written with references to particular principles for each design decision, justifying why/how the new proposal is an improvement upon previous proposals.
* Adapt to new journal styles by changing CSS
The fundamental problem was discussed in terms of display: the ability to transform XHTML+hBib into the many journal-specific formats. For example, how to display "et al." when all authors are present in the source, and how to re-order the elements if a style defines a set order of elements that conflicts with the ordering in the source. Using hCard for authors was agreed on, and the beginnings of an example were shown.


== XHTML Structure ==
Principles to use:
My experience working on X2V and hCa* has taught me which elements are easy to find and which are not. Since the Citation microformat is very new, it is possible to avoid repeating many of the same errors and to make it easier for extracting applications to find and imply certain properties.
* microformats design [[principles]]
* Semantic HTML Design Principles
* use HTML semantics that are as precise as are available


* There should be some sort of 'root node' that implies all child elements are for the Citation microformat.
=== Semantic HTML Design Principles ===
* Since most people will have multiple citations, there should be a way to represent each Citation object as a unique block independent of the others. This is to keep the parser from finding 'author' and applying it to all citations. Each citation should be in a container (class="???") that is scoped from the others.
{{semantic-html-design-principles}}
* Perhaps class="hcite" with <code>&lt;cite&gt;</code> recommended as the root element. E.g. <code>&lt;cite class="hcite"&gt;</code>
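
A minimal sketch of the scoping idea in the root-node bullets, assuming the tentative <code>hcite</code> root class on <code>&lt;cite&gt;</code>. The inner class names <code>author</code> and <code>title</code> are placeholders for illustration only, not agreed property names. Each root element bounds its own properties, so a parser can stop at the container edge rather than applying one 'author' to every citation:

<syntaxhighlight lang="html">
<ul>
  <!-- first citation: everything inside this cite belongs to it alone -->
  <li><cite class="hcite">
    <span class="author">AUTHOR ONE</span>,
    <span class="title">TITLE ONE</span>
  </cite></li>
  <!-- second citation: a separate root, so its author is never mixed with the first -->
  <li><cite class="hcite">
    <span class="author">AUTHOR TWO</span>,
    <span class="title">TITLE TWO</span>
  </cite></li>
</ul>
</syntaxhighlight>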


== Citation vs. [[media-info]] ==
Brainstorm proposals should take into account the Semantic HTML Design Principles.


What distinguishes a cite from say [[media-info]] (e.g. [[media-info-examples]]) is that a cite is a reference to something explicitly external to the current piece of content or document, whereas [[media-info]] describes information about content embedded or inline in the current document.
=== semantic elements to consider ===
One of the guiding principles of Microformats is to encourage the use of the most precisely semantically rich element to describe each node (Point 2 of Semantic HTML Design Principles: Use the most accurately precise semantic HTML building block for each object etc). Since we are dealing with HTML and citations, several elements are candidates to be used to enrich the semantic meaning. [http://www.w3.org/TR/REC-html40/struct/text.html CITE, BLOCKQUOTE, Q, A], (are there more?)


== Semantic Meaning ==
== brainstorm proposals ==
One of the guiding principles of Microformats is to use the most semantically rich element to describe each node (Point 2 of Semantic XHTML Design Principles: Use the most accurately precise semantic XHTML building block for each object etc). Since we are dealing with HTML and citations, several elements are candidates to be used to enrich the semantic meaning. [http://www.w3.org/TR/REC-html40/struct/text.html CITE, BLOCKQUOTE, Q, A], (are there more?)


The [[citation-brainstorming|Citation Brainstorming Page]] has a few developments and ideas about how to give another person credit for a link. Some of the semantic ideas behind their choices of tags can be applied to a full bibliographic type reference. ''Does this sentence make sense only historically? -Mike''
=== web citations ===
{{main|h-cite}}


== OCLC's WorldCat for titles ==
This brainstorm has now been moved to a draft microformat:
Question: what about using something like OCLC's [http://www.oclc.org/worldcat/open/isbnissnlinking/default.htm WorldCat] for linking titles? - Tim White
* [[h-cite]]


== This and That ==
The remainder of this brainstorm proposal is left here for historical purposes:
After reading through a lot of different citation encoding formats, I noticed that each format was being used in one of two ways: either to describe the current page (THIS.PAGE) or to encode references that point to external resources (THAT.PAGE).


The information being encoded was identical for both resources (author, date, name, etc.); they just reference different things. For this microformat, I'm not sure if we want to try to solve both problems, or just one. The meta tags in the head element would be the ideal place for information about THIS.PAGE, but that does not follow the ideals of microformats, where information is human-readable. The THAT.PAGE idea, where a list of references is at the end of a document in the form of a bibliography, is more in line with the ideals of a microformat where the data is human-readable. That doesn't mean that data about the current document shouldn't be human-readable, so some of the same properties used to reference external resources can be used for the current document (THIS.PAGE). To do this a different root item could be used, and transforming applications could extract either the citation data about the current page or information about this page's references.
The <dfn>web citations</dfn> proposal uses a smaller, [[simpler]] set of only ''eight'' properties to solve the [[specific]] problem of how to markup citations in an article <em>on the web</em> that refers to other articles <em>on the web</em>. Offline to offline, and online to offline references are specifically not addressed.


This is open for discussion, but either way, i believe that the properties used to describe a page will be the same for both THIS and THAT. [http://suda.co.uk/ brian suda]
==== web citations background ====
This work is based on how existing [[citation-formats#styles|citation format ''styles'']] (APA, MLA, TCMOS) represent references to articles on the web, and is designed to match the implied schema of those styles. The web citations proposal defines how to markup such reference representation styles in order to satisfy the use-cases above.


== More on This and That ==
==== web citation illustrative example ====
Here is a simple minimal abstract web citation example:


Citation microformats are being explored as a possibility for citing genealogical information at [http://eatslikeahuman.blogspot.com Dan Lawyer's blog].
<syntaxhighlight lang="html">
<span class="h-cite">
  <time class="dt-published">YYYY-MM-DD</time>
  <span class="p-author h-card">AUTHOR</span>:
  <cite><a class="u-url p-name" href="URL">TITLE</a></cite>
</span>
</syntaxhighlight>


This is a case where frequently the citation would refer to (THIS.PAGE), but would have nested within it a reference to (THAT.PAGE), possibly a few levels deep. For instance, a web page might contain data extracted from a microfilm of a census. The citation would need to include information about the web page, information about the microfilm, and information about the census. Genealogical citations are expected to include the repository (where can this book or microfilm be found. Is this the same as ''venue''?). So, at each level the information should contain the repository of the referenced item. A nesting (recursive) mechanism for citation microformats would be useful in this case. Is this the function of the "container" element in the Straw Format?
==== web citation properties ====
root classname: '''<code>h-cite</code>'''


== Date Formatting ==
In rough order of presentation and relevance/frequency:
Since microformats are all about re-use and the accepted way to encode Date-Time has been pretty much settled, then this is a good place to start when dealing with all the different date citation types.


These are all the different fields from various citation formats that are of temporal nature:
properties:
* Date (available | created | dateAccepted | dateCopyrighted | dateSubmitted | issued | modified | valid)
* '''<code>dt-published</code>''' - reused from [[uf2]] h-entry
* originInfo/dateIssued
* '''<code>p-author</code>''' - same, with optional substructured h-card
* originInfo/dateCreated
* '''<code>p-name</code>''' - common property instead of entry-title
* originInfo/dateCaptured
* '''<code>u-url</code>''' - a URL to access the cited work
* originInfo/dateOther
* '''<code>u-uid</code>''' - a URL/URI that uniquely/canonically identifies the cited work, canonical permalink.
* month
* '''<code>p-publication</code>''' - for citing articles in publications with more than one author, or perhaps when the author has a specific publication vehicle for the cited work. Also works when the publication is known, but the authorship information is either unknown, ambiguous, unclear, or collaboratively complex enough to be unable to list explicit author(s), e.g. like with many wiki pages.
* year
* '''<code>dt-accessed</code>''' - date the cited work was accessed for whatever reason it is being cited. Useful in case online work changes and it's possible to access the dt-accessed datetimestamped version in particular, e.g. via the Internet Archive.
* Copyright Year
* '''<code>p-content</code>''' for when the citation includes the content itself, like when citing short text notes (e.g. tweets).
* Date - Generic
* Date of Conference
* Date of Publication
* Date of update/revision/issuance of database record
* Former Date
* Entry Date for Database Record
* Database Update
* Year of Publication
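
As a hedged illustration, the eight h-cite properties listed above (<code>dt-published</code> through <code>p-content</code>) could be combined in one abstract citation as follows. The property names come from the list; the arrangement, the punctuation, the use of <code>&lt;q&gt;</code> for the content, and putting <code>u-uid</code> on the same link as <code>u-url</code> (which assumes the link is the canonical permalink) are assumptions for illustration, not part of the proposal:

<syntaxhighlight lang="html">
<span class="h-cite">
  <time class="dt-published">YYYY-MM-DD</time>
  <span class="p-author h-card">AUTHOR</span>:
  <!-- u-uid shares the link with u-url only if URL is the canonical permalink -->
  <cite><a class="u-url u-uid p-name" href="URL">TITLE</a></cite>,
  <span class="p-publication">PUBLICATION</span>
  (accessed <time class="dt-accessed">YYYY-MM-DD</time>).
  <!-- p-content is only needed when the cited text itself is included -->
  <q class="p-content">CITED TEXT</q>
</span>
</syntaxhighlight>

Dropping every property except the first three still leaves a meaningful citation.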


Several properties are common across citation domains and will certainly be in the citation microformat. The unique instances will need further consideration; otherwise there could be no end to possibilities.
==== web citations vs previous proposals ====
I think the biggest problem with all previous proposals is that they tried to do too much. They didn't design a citation ''micro''format that could be used as a building block, but rather, erred on the side of attempting to describe the myriad types of references to dead-tree resources. They were so over-designed that their authors didn't even dogfood them on their own sites. -- [[User:Tantek|Tantek]] 00:56, 7 August 2012 (UTC)


There are also several properties (year, month, Year of Publication) that can be extracted from another source. Therefore, if you only encode a more specific property, such as Date of Publication, you can extract the 'year of publication' from that. Since the date-time format we are modeling after is the ISO date-time format, just the year portion is an acceptable date. So if you ONLY know the year of publication, then you can form a valid 'Date of Publication' as a microformat (which in turn is a valid 'year of publication'). Your mileage may vary when it comes to importing into citation applications.
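
A sketch of the year-only case: HTML's <code>&lt;time&gt;</code> element accepts a bare four-digit year as its value, so a publication date known only to the year can still be marked up. The <code>h-cite</code>, <code>p-author</code>, <code>p-name</code> and <code>dt-published</code> class names are borrowed from the web citations proposal on this page and are an assumption in this context; AUTHOR and TITLE are placeholders, and 1993 is simply an example year:

<syntaxhighlight lang="html">
<span class="h-cite">
  <span class="p-author h-card">AUTHOR</span>:
  <cite class="p-name">TITLE</cite>
  <!-- only the year is known; a four-digit year alone is a valid <time> value -->
  (<time class="dt-published">1993</time>)
</span>
</syntaxhighlight>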
A primary goal of the web citation effort is to both start small, and always "make small possible", that is, no matter how it is extended, continue permitting very small meaningful citations with perhaps only 2-3 properties (e.g. date published, author, name of work).


...
==== web citations design principles ====
Principles driving this proposal:
* '''solve a [[specific]] problem'''. In this case '''web citations''' seeks to solve a ''more'' specific problem than previous proposals, that of citations from the web to the web (more constrained than any publication to any publication).
* [[solve simpler problems first]]. Existing web-to-web citations contain very little information compared to generalized academic citations, thus '''web citations''' is greatly simplified compared to previous proposals by only starting with a handful of properties.
* [[humans first]] - '''web citations''' focuses on the human readability and writability aspects of citations in articles first and foremost, and only secondarily considers the machine readability/reusability of the data contained therein.
* [[reuse]] building blocks - by re-using the better designed aspects of [[citation-formats#styles|existing citation conventions]] for web resources, '''web citations''' builds on top of previous work to make citations human readable/writable, as well as what implied properties are commonly expressed by such previous work.


It seems to me that these can be collapsed to maybe one or two different date properties.  As far as the specific human readable formatting of the date, that can be chosen per whatever the presentation style guide says, and the [[datetime-design-pattern]] used to simplify the markup. - Tantek
==== web citation property details ====
(stub)


All web citation properties are derived from the implied schema in [[citation-formats#styles|existing citation styling guides]] for citing permalinks to articles and short text notes online.


'''Important'''
Date-time properties (dt-published, dt-accessed) may optionally include time information in addition to the date if relevant to the citation (e.g. when citing short text notes (tweets) of which there may be several in a single day).
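
For instance, a citation of a short text note might carry the full timestamp in the <code>datetime</code> attribute while showing readers a friendlier form. This is a sketch only: the uppercase values are placeholders, and using <code>&lt;q&gt;</code> with both <code>p-content</code> and <code>p-name</code> for the note text is an assumption rather than something the proposal specifies:

<syntaxhighlight lang="html">
<span class="h-cite">
  <span class="p-author h-card">AUTHOR</span>:
  <a class="u-url" href="URL"><q class="p-content p-name">NOTE TEXT</q></a>
  <!-- datetime holds the machine-readable timestamp; the element text is what readers see -->
  (<time class="dt-published" datetime="YYYY-MM-DDTHH:MM:SSZ">DISPLAYED DATE AND TIME</time>)
</span>
</syntaxhighlight>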
Sometimes we need a date range and not simply a date (e.g. 4-6 May 2006). See ''Conference Citation'' examples later on this page. - Discoleo


== Tags ==
To be added:
Some of the citation formats have a place for 'keywords' or 'generic tags', etc. This might be a good place to re-use the [http://microformats.org/wiki/rel-tag RelTag microformat]. The downside would be that they are then forced to be links, which might be the correct way to mark up these terms.
* for each property, what equivalent TCMOS, APA, MLA terms/vocabulary is being expressed/captured as researched in the [[citation-formats#styles|citation formats styles]] section.
* transforms from the web citations proposal properties into each of those citation styles.
** for citations of blog posts / articles
** for citations of text notes / tweets
** see examples in wild below for markup samples to style in each of the TCMOS/APA/MLA styles for blog/note citations.


==== web citation additional uses ====
The web citation proposal could be used for simple web-to-off-web citation use cases. As suggested by Ed Summers, dropping the hyperlink to the cited web article provides a simple off-web citation:


== MARC / MODS / Dublin Core ==
<syntaxhighlight lang="html">
<span class="h-cite">
  <time class="dt-published">YYYY-MM-DD</time>
  <span class="p-author h-card">AUTHOR</span>:
  <cite class="p-name">TITLE</cite>
</span>
</syntaxhighlight>


The MODS ([http://www.loc.gov/standards/marcxml/Sandburg/sandburgmods.xml example]) and Dublin Core ([http://www.loc.gov/standards/marcxml/Sandburg/sandburgdc.xml example]) transformations of MARC21 may contain some useful ideas.
Next steps:
* Try such markup with actual content being published on the web (perhaps a bibliography, list of papers in a resume, etc.)
* See how it works/feels there
* Determine what seems to be missing.  
* See if the "p-publisher" property helps in some web-to-off-web citation use cases.


Here's a first attempt at rewriting the linked examples in XHTML (written in response to a [http://microformats.org/discuss/mail/microformats-discuss/2005-December/002438.html mailing list query about encoding book information with microformats]):
==== web citation examples in the wild ====
Real world in the wild examples:
* ... add uses of h-cite you see in the wild here.


<pre><nowiki>
Real but not quite wild (use by the brainstorm author)
<div class="book" lang="en">
* Every blog post on http://tantek.com has a text field for copying h-cite markup for that blog post.
  <h3 class="fn">Arithmetic /</h3>
* <cite class="h-cite">[http://tantek.com/2012/353/b1/why-html-classes-css-class-selectors Why you should say HTML classes, CSS class selectors, or CSS pseudo-classes, but not CSS classes]</cite> has a couple of interesting uses of h-cite:
  <p>By <span class="creator"><span class="fn">Sandburg, Carl</span>,
** properties used: p-name, u-url, p-author, p-publication, dt-published, dt-accessed (basically all proposed properties except p-content!)
    <span class="date">1878-1967</span></span>,
** additional experimental property: p-x-translation - which refers to a nested h-cite with its own implied p-name and implied u-url.
    and <span class="illustrator">Rand, Ted</span></p>
* <cite class="h-cite">[http://christopheducamp.com/blog/test-indieweb-parvenir-a-posser-un-article-vers-twitter/ Test #IndieWeb : Parvenir à « POSSEr » un article vers Twitter]</cite> has an h-cite of a short note:
  <p>Publisher: <span class="publisher"><span class="fn">Harcourt Brace Jovanovich</span>,
    <span class="locality">San Diego</span></span></p>
  <p>Published: <span class="issued">1993</span></p>
  <p class="description">A poem about numbers and their characteristics. Features
    anamorphic, or distorted, drawings which can be restored to normal by viewing
    from a particular angle or by viewing the image's reflection in the provided
    Mylar cone.</p>
  <p class="note">One Mylar sheet included in pocket.</p>
  <p>Subjects:</p>
  <ul>
    <li class="subject">Arithmetic</li>
    <li class="subject">Children's poetry, American.</li>
    <li class="subject">Arithmetic</li>
    <li class="subject">American poetry</li>
    <li class="subject">Visual perception</li>
  </ul>
</div>
</nowiki></pre>


== Basic Citation Structures ==
<syntaxhighlight lang="html">
There are basic structures to any citation, this is an overview of some of the types
<blockquote><p>
[http://www.users.muohio.edu/darcusb/misc/citations-spec.html http://www.users.muohio.edu/darcusb/misc/citations-spec.html]
  <cite class="h-cite">
    <a class="u-url p-name" href="http://tantek.com/2013/104/t2/urls-readable-speakable-listenable-retypable">
      URLs should be readable, speakable, listenable, and unambiguously
retypable, e.g. from print: tantek.com/w/ShortURLPrintExample #UX
    </a>
  (<abbr class="p-author h-card" title="Tantek Çelik">Çelik</abbr>
    <time class="dt-published">2013-04-14</time>)
  </cite>
</p></blockquote>
</syntaxhighlight>


==== web citation references ====
I've been iterating on this design for some time, however, first publicly proposed it as the result of an interactive web citation design discussion during IndieWebCamp2012:
* <span class="h-cite"><time class="dt-published">2012-07-01</time> <span class="p-publisher">IndieWebCamp2012</span>: <cite class="p-name">[http://indiewebcamp.com/2012/Academic_Citations_Web Academic Citations for the Web]</cite></span>


== Concerns not addressed by existing formats ==


There are some aspects '''NOT adequately''' covered by existing formats. I have addressed this issue on the OpenOffice.org wiki page, too. [see http://wiki.services.openoffice.org/wiki/Bibliographic_Database for an extended discussion, the paragraph on ''Reference Types'']


----


These issues pertain mainly to '''Errata''', '''Comments and Authors Reply''' and '''Article Retractions'''.
=== A Prescriptive Proposal ===
* a bidirectional link could be necessary to implement these features (original article <=> erratum, reply, retraction letter)
([http://microformats.org/wiki/index.php?title=citation&diff=31441&oldid=27502 Contributed 2008-07-09] by [[User:Paramaeleon]] to the [[citation]] page). [[User:Tantek|Tantek]] 19:31, 27 July 2012 (UTC)
* '''IMPORTANT: Errata'''
** Errata: one or more Corrections might be posted in various issues of the journal
** this is usually cited as: Original Article Citation Data (Correction available in ''Journal, Issue Nr, Year, Pages'') (repeat for more than one correction)
** it is possibly never cited alone
** there should be a link to the original article, while the original article should contain a link to this ''Errata''
* '''IMPORTANT: Commentary and Author Reply'''
** similar to Errata, there might be one or more Comments and Author Replies; this should be stored, too
** however, it is usually not included in the original citation
** it might, however, be used in a citation, but I do not know exactly how to cite it optimally (the original article should be provided as well)
* '''IMPORTANT: Article Retraction'''
** an article may be retracted because of plagiarism or some other flaw
** this should not be used any further in the research
** however, it might be used e.g. for an article on plagiarism or flawed research
** there should be therefore one field storing this information, too, and a link to:
** the published withdrawal letter (which explains why the article was retracted)


* this issue may need a time-controlled event
Here is a proposal which was derived from what one actually has to give as information in a citation in university work. (I don't know where to put that, so I put it right here.)
* '''IMPORTANT: electronic publishing ahead of print (EPUB)'''
** more and more articles are initially posted online, before the published article gets actually printed
** How should this be used/cited?
** Is this changed, after the print version becomes available?


First, we need a frame, let's say "hcitation". Multiple citations can be put in a "hcitation" frame. Inside there, we need to describe the type of citation; I suggest "monograph", "anthology", "periodical", "reference", "thesis", "standard", "internet", or "specialist".


== Outstanding Issues ==
If a "label" was used to refer to the resource in the text (often in square brackets), it can be marked up as such.
The 3 main points I (Brian) have come across so far are:
1) IDENTIFIERS
2) FORMAT TYPES
3) NESTING


1) In hCard/hCalendar there is a UID field. Combined with URL it makes for a great unique identifier. There are loads of other identifiers besides URL: ISBN, LOC call number, SKU, ISSN, etc. Many of these are unique in their domain, but not globally unique. So how do they get marked up? Much like the hCard TEL/ADR properties, we can use something like:
Here comes the list of field names we need: "article", "atime", "author", "ctime", "department", "edition", "editor", "eligibility", "employer", "number", "overalltitle", "pagerange", "part", "place", "publisher", "subseries", "title", "type", "url", "volume", "volumetitle", "year".
<pre>
<nowiki>
<div class="uid"><span class="type">ISBN</span>: <span
class="value">123456</span></div>
</nowiki>
</pre>
This makes the encoding the most extensible: if we start using class="isbn" then it is an enumerated list, whereas with class="type" it is open-ended.
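
As a rough sketch of how this open-ended pattern might extend to other identifier schemes, the same markup could carry several identifiers on one citation. The values below are placeholders rather than real identifiers, and the type strings are only illustrative:
<pre>
<nowiki>
<div class="hcite">
  <!-- placeholder values, not real identifiers -->
  <div class="uid"><span class="type">ISBN</span>: <span class="value">123456</span></div>
  <div class="uid"><span class="type">ISSN</span>: <span class="value">0000-0000</span></div>
  <div class="uid"><span class="type">SKU</span>: <span class="value">ABC-123</span></div>
</div>
</nowiki>
</pre>
A parser would not need to know the list of types in advance; it could simply pair each "type" with the "value" inside the same "uid" element.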


2) I keep misusing "format": format is the medium (hardback, softback). The TYPE (there is probably a better word, perhaps container?) is book, article, conference, manifesto, etc. Much like the identifiers, we can make an enumerated list of values, class="book", class="article", but that boxes us in, whereas something like: <pre><nowiki><span class="type">article</span></nowiki></pre> leaves things more open.
The field "page" marks up which page you actually quote from. Marking something up as "prefix" indicates that it is displayed first but ignored when sorting. E.g. "The" should be marked as "prefix" both in "The Crocodile" and in "Crocodile, the".


3) Nesting citation data in a citation. The ability to nest the same microformat inside itself is something that other microformats don't explicitly handle.
<table border="1">
    <tr>
        <th>Field</th>
        <th>Description</th>
        <th><code>monograph</code></th>
        <th><code>anthology</code></th>
        <th><code>periodical</code></th>
        <th><code>thesis</code></th>
        <th><code>standard</code></th>
        <th><code>internet</code></th>
        <th><code>specialist</code></th>
    </tr>
    <tr>
        <td><code>article</code></td>
        <td>Name of the Article in question</td>
        <td align="center">&nbsp;</td>
        <td align="center">3</td>
        <td align="center">3</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>atime</code></td>
        <td>Last access time for online resources. Use abbr convention for datetime encoding.</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">11</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">5</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>author</code></td>
        <td>Creator. Use fn or n markup for every single entity.</td>
        <td align="center">1</td>
        <td align="center">1</td>
        <td align="center">1</td>
        <td align="center">1</td>
        <td align="center">&nbsp;</td>
        <td align="center">1</td>
        <td align="center">1</td>
    </tr>
    <tr>
        <td><code>ctime</code></td>
        <td>Date / Last modification. Use abbr convention for datetime encoding.</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">8</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">4</td>
        <td align="center">5</td>
    </tr>
    <tr>
        <td><code>department</code></td>
        <td>special field / faculty</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">6</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">3</td>
    </tr>
    <tr>
        <td><code>edition</code></td>
        <td>Edition information</td>
        <td align="center">6</td>
        <td align="center">8</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">2</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>editor</code></td>
        <td>Editors of an anthology. Use fn or n markup for every single entity. Add &quot;transl&quot; for translators and &quot;comp&quot; for compilers</td>
        <td align="center">&nbsp;</td>
        <td align="center">4</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>eligibility</code></td>
        <td>Qualification of a specialist</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">2</td>
    </tr>
    <tr>
        <td><code>employer</code></td>
        <td>Name of university eg.</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">4</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">4</td>
    </tr>
    <tr>
        <td><code>number</code></td>
        <td>Number</td>
        <td align="center">10</td>
        <td align="center">12</td>
        <td align="center">9</td>
        <td align="center">&nbsp;</td>
        <td align="center">1</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>overalltitle</code></td>
        <td>Overall Title / Title of series</td>
        <td align="center">9</td>
        <td align="center">11</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">8</td>
    </tr>
    <tr>
        <td><code>pagerange</code></td>
        <td>Page range of an article in an anthology / periodical</td>
        <td align="center">&nbsp;</td>
        <td align="center">13</td>
        <td align="center">10</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>part</code></td>
        <td>Part of article (if having several parts)</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">4</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>place</code></td>
        <td>Place of publication</td>
        <td align="center">7</td>
        <td align="center">9</td>
        <td align="center">&nbsp;</td>
        <td align="center">5</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>publisher</code></td>
        <td>Publication house</td>
        <td align="center">8</td>
        <td align="center">10</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>subseries</code></td>
        <td>Name of subseries, if any</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">6</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>title</code></td>
        <td>The main title. Anthology: name of anthology.  Periodical: name of periodical</td>
        <td align="center">3</td>
        <td align="center">5</td>
        <td align="center">5</td>
        <td align="center">3</td>
        <td align="center">3</td>
        <td align="center">3</td>
        <td align="center">6</td>
    </tr>
    <tr>
        <td><code>type</code></td>
        <td>Type (type of thesis or type of utterance (radio interview, e-mail, ...) of a specialist)</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">7</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">7</td>
    </tr>
    <tr>
        <td><code>url</code></td>
        <td>URL</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">12</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">6</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>volume</code></td>
        <td>Volume information (eg. Vol. 22)</td>
        <td align="center">4</td>
        <td align="center">6</td>
        <td align="center">7</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>volumetitle</code></td>
        <td>Volume title</td>
        <td align="center">5</td>
        <td align="center">7</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
        <td align="center">&nbsp;</td>
    </tr>
    <tr>
        <td><code>year</code></td>
        <td>Year of appearance. 4 digit year. Use abbr convention for datetime encoding.</td>
        <td align="center">2</td>
        <td align="center">2</td>
        <td align="center">2</td>
        <td align="center">2</td>
        <td align="center">&nbsp;</td>
        <td align="center">2</td>
        <td align="center">&nbsp;</td>
    </tr>
</table>


The two options are:
This table shows which fields go together. The numbers give the typical order of the values. Information other than that given here (e.g. ISBN) should not be put into citations; students would receive negative evaluations if they did so. (I hope this helps somehow. Sorry for the bad English.)
i) Using class="book"
<pre>
<nowiki>
<div class="hcite">
<div class="book">
  <span class="fn">Book Title</span>
  <div class="chapter">
    <span class="fn">Chapter Title</span>
  </div>
</div>
</div>
</nowiki>
</pre>


This makes things easy to nest and to figure out exactly what is
==== Sample Usage ====
associated with what, but the downside is that we have enumerated
lists of values for the class properties.


ii) using the TYPE for book
<pre><nowiki>
<pre>
<h1>The Bibliography</h1>
<nowiki>
<div class="hcite">
<div class="type">book</div>
<span class="fn">Book Title</span>
<div class="type">chapter</div>
<span class="fn">Chapter Title</span>
</div>
</nowiki>
</pre>


now the class="fn" is not nested inside the class="book" or
<table class="hcitation">
class="chapter" so there would have to be some other mechanism to
<tr>
associate the data with the type.
    <th scope="row" style="font-variant: small-caps; ">[MR06]</th>
    <td class="monograph">
        <a name="sr06">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>,
                <span class="given-name">Michael</span>
                <span class="additional-name">C.</span>
            </span> ;
            <span class="author">
                <span class="given-name">Mathew</span>
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>
            </span>
            (<span class="year">2006</span>):
            <span style="font-style: italic; ">
                <span class="title">Students' Jokes : A complete collection of jokes students laugh about</span>.
                Vol. <span class="volume">23</span>:
                <span class="volumetitle">Computational Linguists' Jokes</span>.
            </span>
            <span class="edition">4th completely revised Edition</span>.
            <span class="place">München</span> :
            <span class="publisher">Weltbild</span>
            (<span class="overalltitle">Fictional publications of munich's students</span>
            <span class="number">2675</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08a]</th>
    <td class="anthology">
        <a name="r08a">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>,
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            &bdquo;<span class="article">Using semantic HTML for bibliographic citations</span>.&ldquo;
            In:
            <span class="editor">
                <span class="given-name">Michael</span>
                <span class="additional-name">B.</span>
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>
            </span> ;
            <span class="editor">
                <span class="given-name">John</span>
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>
            </span>
            (Eds.)
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Being POSH : Usage of semantic HTML in web pages</span>.
                Vol. <span class="volume">4</span>:
                <span class="volumetitle">Whatever you read</span>.
            </span>
            <span class="edition">1st Edition</span>.
            <span class="place">New York</span> :
            <span class="publisher">Public Press</span>
            (<span class="overalltitle">Books on data processing</span>
            <span class="number">1435</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08b]</th>
    <td class="periodical">
        <a name="r08b">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>,
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            &bdquo;<span class="article">Using semantic HTML in scientific work</span>.&ldquo;
            P. <span class="part">1</span>; P. <span class="part">2</span>.
            In:
            <span style="font-style: italic; ">
                <span class="title">The Computational Linguist</span>.
            </span>
            <span class="subseries">Development of the Semantic Web</span>.
            <span class="volume">2</span>
            (<span class="ctime">2008</span>)
            No. <span class="number">16</span>,
            Pp. <span class="pagerange">124&ndash;131</span>
            (Access: <span class="atime"><abbr title="20080714T1612+0200">14.07.2008 16:12 CEST</abbr></span>)
            &lt;<span class="url">http://www.sample.url/web/address/1234.pdf</span>&gt;
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[S07]</th>
    <td class="thesis">
        <a name="s07">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>,
                <span class="given-name">John</span>
            </span>
            (<span class="year">2007</span>):
            <span style="font-style: italic; ">
                <span class="title">Semantic Data Extraction from the World Wide Web</span>.
            </span>
            <span class="employer">University of <span class="place">Munich</span></span>,
            <span class="department">Department of Computational Linguistics</span>,
            <span class="type">Diss.</span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[SVG11]</th>
    <td class="standard">
        <a name="svg11">
            <span class="number">ISO 1234567</span>
            (<span class="edition">1-2003</span>):
            <span style="font-style: italic; ">
                <span class="title">Scalable Vector Graphics (SVG) 1.1 Specification</span>.
            </span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[Wik08]</th>
    <td class="internet">
        <a name="wik08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wikipedia, the free encyclopedia</span>,
            </span>
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Microformat</span>.
            </span>
            (Version: <abbr class="ctime" title="2008-06-19">19th June 2008</abbr>.
            Access: <abbr class="atime" title="20080703T1423+0200">3rd July 2008 14:23 CEST</abbr>)
            &lt;<a href="http://en.wikipedia.org/w/index.php?title=Microformat&oldid=220275451" class="url">http://en.wikipedia.org/w/index.php?title=Microformat&amp;oldid=220275451</a>&gt;
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[W08]</th>
    <td class="specialist">
        <a name="w08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wang</span>,
                <span class="given-name">Wu</span>
            </span>
            (<span class="eligibility">Professor of Informatics</span>,
            <span class="department">Department of Applied Sciences</span>,
            <span class="employer">University of Michigan</span>)
            (<abbr class="ctime" title="20000801T0918+0100D0007">01.08.2000, 9:18&ndash;9:25 MEZ</abbr>)
            <span class="title">Science News</span>.
            <span class="type">Interview</span>.
            <span class="overalltitle">Michigan Television</span>
        </a>
    </td>
</tr>
</table>
</nowiki></pre>


== Brian's Straw format ==


=== implied schema (examples) ===
+ publisher
+ language
+ description
+ title
+ creator
+ journal
+ volume
+ issue
+ page
+ edition
+ identifier
+ tags
+ format
+ date published
+ copyright
- audience


=== implied schema (formats) ===
----
+ publisher
+ language
+ description
+ title
+ creator
+ volume
+ pages
+ edition
+ issue
+ identifier
+ tags
+ format
+ date published
+ date copyrighted
- subtitle
- image
- excerpt
- index terms
- series title
- publication
- journal
- part (1 of X)


UNION of the two schemas
=== XHTML Structure ===
+ (PLUS) means common properties
My experience working on X2V and hCa* has taught me which elements are easy to find and which are not. Since the citation microformat is very new, it is possible to avoid repeating a lot of the same errors and to make it easier for extracting applications to find and imply certain properties.
- (MINUS) means unique to the schema


=== Examples ===
* There should be some sort of 'root node' that implies all child elements are for the hCitation microformat.
* Since most people will have multiple citations, there should be a way to represent each hCitation object as a unique block independent of the others. This keeps the parser from finding 'author' and applying it to all citations. Each citation should be in a container (class="hcite") that is separated from the others.
* Perhaps class="hcite" with <code>&lt;cite&gt;</code> recommended as the root element. E.g. <code>&lt;cite class="hcite"&gt;</code>
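
A minimal sketch of the points above, assuming the "creator" and "fn" property names from the straw format (the names and titles are made up):
<pre><nowiki>
<ul class="bibliography">
  <li>
    <cite class="hcite">
      <span class="creator vcard"><span class="fn">John Doe</span></span>:
      <span class="fn">First Example Title</span>
    </cite>
  </li>
  <li>
    <cite class="hcite">
      <span class="creator vcard"><span class="fn">Jane Roe</span></span>:
      <span class="fn">Second Example Title</span>
    </cite>
  </li>
</ul>
</nowiki></pre>
Because each citation sits in its own class="hcite" container, a parser can scope "creator" to the nearest enclosing root instead of applying it to every citation on the page.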


Markup examples using the above format:
'''Note: This section was the original content of the document. Since then, class='hcite' has been agreed on as the root class name. See  [http://microformats.org/wiki?title=citation-brainstorming#.27hcite.27_as_Root_Element_name explanation].'''


==== Book ====


This is Brian's original example


<pre>
----
<nowiki>
<ul class="bibliography">
<li class="hcite" xml:lang="en-gb">
<!-- publisher data as hCard-->
<div class="publisher vcard">
<span class="fn org">ABC Publishing Co.</span>
<span class="country-name">United Kingdom</span>
...
</div>
<!-- author(s) data as hCard -->
<div class="creator vcard">
<span class="fn n"><span class="given-name">John</span> <span class="family-name">Doe</span></span>
...
</div>


<!-- location data -->
=== how to use with HTML5 ===
<span class="fn">Foobar!</span>
Per [http://lists.w3.org/Archives/Public/public-html/2009Jun/0811.html Theresa O'Connor's email to public-html]:
<span class="description">World Class Book about foobar</span>
Add a section in the citation microformat describing how to use the citation microformat in [[HTML5]], including optional use of HTML5's &lt;time&gt; element and microdata feature. Encourage [[HTML5]] to drop the "BibTeX" predefined microdata vocabulary and reference an updated citation microformat spec instead. [[User:Tantek|Tantek]]
<span class="volume">1</span>
<span class="issue">1</span>
<span class="edition">1</span>
<span class="pages">1-10</span>
<span class="format">article</span>
<!-- deferred to the UID debate -->
<span class="identifier">12345678</span>
<!-- keywords -->
<a class="keyword" rel="tag" href="/tags/foo">foo</a>
<span class="keyword">bar</span>
<!-- date properties -->
Published <abbr class="dtpublished" title="20060101">January 1st 2006</abbr>
Copyright <abbr class="copyright" title="20060101">2006</abbr>
</li>
...
</ul>


<p class="citation">Have you read <span class="title"><abbr title="book" class="format">Foo Bar</abbr></span>?
It was written by <span class="author vcard"><span class="fn">John Doe</span></span>.
It only came out a <abbr class="dtpublished" title="20060101">few months ago</abbr></p>
</nowiki>
</pre>


Note: the "format" property above is incorrect. Format would refer more to the physical characteristics of an item than to its type or genre (e.g. "article", "book", etc.). I'd rather have the main class for the li be "article" in this context than the fairly meaningless "citation." Of course, one could have both, which would be fine too. -- bruce


Note: Could we use ROLE from hCard to identify editors, translators, authors, etc?
----
This was discussed on the mailing list and the idea was dropped [http://microformats.org/discuss/mail/microformats-discuss/2006-September/005694.html]


'''Comments''' : [[User:Singpolyma|singpolyma]] 08:03, 16 Jun 2006 (PDT) : keywords should be [[rel-tag]], and probably also [[XOXO]] (the same way the citation list is)
=== OCLC's WorldCat for titles ===
Question: what about using something like OCLC's [http://www.oclc.org/worldcat/open/isbnissnlinking/default.htm WorldCat] for linking titles? - Tim White




==== Citing Private Communication ====


Needs an example.
----


==== Citing Legal Cases ====
=== This and That ===
Needs an example.  
After reading through a lot of different citation encoding formats, I noticed that each format was being used in one of two ways: either to describe the current page (THIS.PAGE) or to encode references that point to external resources (THAT.PAGE).
see [http://microformats.org/wiki/citation-examples-markup#Wikipedia_Court_Case Wikipedia example] for inspiration.


==== Citing a Book ====
The information being encoded was identical for both resources (author, date, name, etc.); they just reference different things. For this microformat, I'm not sure if we want to try to solve both problems, or just one. The meta tags in the head element would be the ideal place for information about THIS.PAGE, but that is not in keeping with the ideals of microformats, where information is human-readable. The THAT.PAGE idea, where a list of references is at the end of a document in the form of a bibliography, is more in line with the ideals of a microformat where the data is human-readable. That doesn't mean that data about the current document shouldn't be human-readable, so some of the same properties used to reference external resources can be used for the current document (THIS.PAGE). To do this, a different root item could be used, and transforming applications could extract either the citation data about the current page or the information about this page's references.


needs an example
This is open for discussion, but either way, I believe that the properties used to describe a page will be the same for both THIS and THAT. [http://suda.co.uk/ brian suda]


==== Citing a journal article ====
==== More on This and That ====
Citation microformats are being explored as a possibility for citing genealogical information at [http://eatslikeahuman.blogspot.com Dan Lawyer's blog].


needs an example
This is a case where frequently the citation would refer to (THIS.PAGE), but would have nested within it a reference to (THAT.PAGE), possibly a few levels deep. For instance, a web page might contain data extracted from a microfilm of a census. The citation would need to include information about the web page, information about the microfilm, and information about the census. Genealogical citations are expected to include the repository (where can this book or microfilm be found. Is this the same as ''venue''?). So, at each level the information should contain the repository of the referenced item. A nesting (recursive) mechanism for citation microformats would be useful in this case. Is this the function of the "container" element in the Straw Format?
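
One way this nesting might look, assuming the class="container hcite" pattern from the conference publication example can be applied recursively. The "repository" class name is hypothetical (it is not part of the straw format), and all titles and places are made up:
<pre><nowiki>
<cite class="hcite">
  <!-- THIS.PAGE: the web page carrying the extracted data -->
  <a class="fn url" href="...">Example Family Census Extracts</a>
  <cite class="container hcite">
    <!-- THAT.PAGE, level 1: the microfilm the data was read from -->
    <span class="fn">Example Census Microfilm, Roll 12</span>
    <span class="repository">Example County Library</span>
    <cite class="container hcite">
      <!-- THAT.PAGE, level 2: the census itself -->
      <span class="fn">Example Census</span>
      <span class="repository">Example National Archives</span>
    </cite>
  </cite>
</cite>
</nowiki></pre>
Each level keeps its own "fn" and repository, so an extracting application could walk from the web page down to the original census without mixing up which repository belongs to which item.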


==== Citing a magazine article ====


needs an example


==== Citing a Patent ====
----


Drawn from this [http://microformats.org/wiki/citation-examples#U.S._Patent example from Wikipedia]:
=== MARC / MODS / Dublin Core ===
The MODS ([http://www.loc.gov/standards/marcxml/Sandburg/sandburgmods.xml example]) and Dublin Core ([http://www.loc.gov/standards/marcxml/Sandburg/sandburgdc.xml example]) transformations of MARC21 may contain some useful ideas.


<pre><nowiki>
Here's a first attempt at rewriting the linked examples in XHTML (written in response to a [http://microformats.org/discuss/mail/microformats-discuss/2005-December/002438.html mailing list query about encoding book information with microformats]):
<li class="hcite"><a href="http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=4,405,829" class="url"
    title="http://patft.uspto.gov/netacgi/nph-Parser?patentnumber=4,405,829">
<span class="format">U.S. Patent</span> <span class="identifier">4,405,829</span></a>:
    <span class="description">The <a href="/wiki/RSA" title="RSA">RSA</a> patent, a famous software patent on the ground-breaking
    and highly unobvious algorithm for public key encryption, widely used for secure communications
    in many industries nowadays</span>
</li>
</nowiki></pre>
 
==== Citing a conference publication====
 
Based on the [http://microformats.org/wiki/citation-examples#ACM_Digital_Library_Search_Result_Examples conference publication reference example].
 
Changed Oct 06 to conform with [http://microformats.org/wiki/citation-brainstorming#Brian.27s_Straw_format Brian's format]. --[[User:Mike|Mike]] 18:09, 12 Oct 2006 (PDT)
(everything but the url class should be in line with that proposal)
 
L. Hochstein, J. Carver, F. Shull, S. Asgari, V. Basili, J. K. Hollingsworth, and M. Zelkowitz, “Hpc programmer productivity: A case study of novice hpc programmers,” in Proceedings of ACM/IEEE Supercomputing Conference, 2005.


<pre><nowiki>
<pre><nowiki>
<cite class="hcite">
<div class="book" lang="en">
<span class="creator vcard"><span class="fn">Lorin Hochstein</span><span class="org"> University of Maryland, College Park </span></span>,
  <h3 class="fn">Arithmetic /</h3>
<span class="creator vcard"><span class="fn"> Jeff Carver </span> <span class="org"> Mississippi State University </span> </span>,
  <p>By <span class="creator"><span class="fn">Sandburg, Carl</span>,
<span class="creator vcard"><span class="fn"> Forrest Shull </span> <span class="org"> Fraunhofer Center Maryland </span> </span>,
    <span class="date">1878-1967</span></span>,
<span class="creator vcard"><span class="fn"> Sima Asgari</span> <span class="org"> University of Maryland, College Park </span> </span>,
    and <span class="illustrator">Rand, Ted</span></p>
<span class="creator vcard"><span class="fn"> Victor Basili</span> <span class="org"> Fraunhofer Center Maryland </span> </span>,
  <p>Publisher: <span class="publisher"><span class="fn">Harcourt Brace Jovanovich</span>,
<span class="creator vcard"><span class="fn"> Jeffrey K. Hollingsworth</span> <span class="org"> University of Maryland, College Park </span> </span>, and
    <span class="locality">San Diego</span></span></p>
<span class="creator vcard"><span class="fn"> Marv Zelkowitz</span> <span class="org"> University of Maryland, College Park </span> </span>,
  <p>Published: <span class="issued">1993</span></p>
<a class="fn url" href="http://dx.doi.org/10.1109/SC.2005.53">HPC Programmer Productivity: A Case Study of Novice HPC Programmers</a>.  
  <p class="description">A poem about numbers and their characteristics. Features
        (<span class="format">conference publication</span>)
    anamorphic, or distorted, drawings which can be restored to normal by viewing
<cite class="container hcite">
    from a particular angle or by viewing the image's reflection in the provided
<a class="fn url" href="...">Proceedings of ACM/IEEE Supercomputing Conference</a>
    Mylar cone.</p>
<abbr class="dtpublished" title="20051126T0000-0800">2005</abbr>
  <p class="note">One Mylar sheet included in pocket.</p>
</cite>
  <p>Subjects:</p>
page <span class="pages">35</span>
   <ul>
<div class="publisher vcard">
     <li class="subject">Arithmetic</li>
   <span class=" fn">IEEE Computer Society
     <li class="subject">Children's poetry, American.</li>
  </span>
     <li class="subject">Arithmetic</li>
     <div class="adr">
    <li class="subject">American poetry</li>
      <span class="locality">Washington</span>,
    <li class="subject">Visual perception</li>
      <span class="region">DC</span>
  </ul>
     </div>
</div>
  </div>
<a class="url instantiation" href="http://portal.acm.org/ft_gateway.cfm?id=1105800&type=pdf&coll=portal&dl=ACM&CFID=68330711&CFTOKEN=39187329">PDF of full text from ACM</a>
 
DOI: <a class="url uid" href="http://dx.doi.org/10.1109/SC.2005.53">10.1109/SC.2005.53</a>
        Tags:
     <a class="keyword" rel="tag" href="results.cfm?query=genterm%3A%22Design%22 ...">
Design</a>,
    <a class="keyword" rel="tag" href="results.cfm?query=genterm%3A%22Experimentation%22 ....">
Experimentation</a>,
    <a class="keyword" rel="tag" href="results.cfm?query=genterm%3A%22Measurement%22...">
Measurement</a>,
    <a class="keyword" rel="tag" href="results.cfm?query=genterm%3A%22Performance%22 ...">
Performance</a>
 
<blockquote class="description">In developing High-Performance Computing (HPC) software, time to solution is an important metric. This metric is comprised of two main components: the human effort required developing the software, plus the amount of machine time required to execute it. To date, little empirical work has been done to study the first component: the human effort required and the effects of approaches and practices that may be used to reduce it. In this paper, we describe a series of studies that address this problem. We instrumented the development process used in multiple HPC classroom environments. We analyzed data within and across such studies, varying factors such as the parallel programming model used and the application being developed, to understand their impact on the development process.
</blockquote>
  </cite>
</nowiki></pre>
</nowiki></pre>


== comparison and use of other microformats ==
=== Citation vs. [[media-info]] ===
What distinguishes a cite from say [[media-info]] (e.g. [[media-info-examples]]) is that a cite is a reference to something explicitly external to the current piece of content or document, whereas [[media-info]] describes information about content embedded or inline in the current document.


'''Note''' (From [[Discoleo]], Sept. 06)
=== Date Formatting ===
* sometimes, the citation must include '''Town/Country''' and '''Precise Date/Date Range''', e.g.
Since microformats are all about re-use and the accepted way to encode Date-Time has been pretty much settled, then this is a good place to start when dealing with all the different date citation types.  
** ''Gillespie SH, Dickens A.'' Variation in mutation rate of quinolone resistance in Streptococcus pneumoniae [abstract P06-17A]. In: Abstracts of the 3rd International Symposium on Pneumococci and Pneumococcal Disease (Anchorage, 5-9 May 2002).Washington, DC: American Society of Microbiology, 2002.
** ''Bassetti, M.; Righi, E.; Rebesco, B.; Molinari, MP.; Costa, A.; Fasce, R.; Cruciani, M.; Bassetti, D.; Bobbio Pallavicini, F.'' 44th Annual Interscience Conference on Antimicrobial Agents and Chemotherapy (ICAAC). Washington, DC; 2004. Epidemiological trends in nosocomial candidemia in ICU: A five-year Italian perspective.
** ''Peacock JE, Wade JC, Lazarus HM, et al.'' Ciprofloxacin/piperacillin vs. tobramycin/piperacillin as empiric therapy for fever in neutropenic cancer patients, a randomized, double-blind trial [abstract 373]. In: Program and abstracts of the 37th Interscience Conference on Antimicrob Agents and Chemotherapy (Toronto). Washington, DC: American Society for Microbiology, 1997.


==== Citing an external website ====
These are all the different fields from various citation formats that are of temporal nature:
* Date (available | created | dateAccepted | dateCopyrighted | dateSubmitted | issued | modified | valid)
* originInfo/dateIssued
* originInfo/dateCreated
* originInfo/dateCaptured
* originInfo/dateOther
* month
* year
* Copyright Year
* Date - Generic
* Date of Conference
* Date of Publication
* Date of update/revision/issuance of database record
* Former Date
* Entry Date for Database Record
* Database Update
* Year of Publication


This is based on a formal citation of a website in the references section of a research paper, but could also be used for in-line links that had added information. Here's the original:
Several properties are common across citation domains and will certainly be in the citation microformat; the unique instances will need further consideration, otherwise there could be no end to the possibilities.


[25] David Stern, "eprint Moderator Model", http://www.library.yale.edu/scilib/modmodexplain.html  (version dated Jan 25, 1999)
Several properties (year, month, Year of Publication) can also be extracted from another property. If you encode only a more specific property, such as Date of Publication, you can extract the 'year of publication' from it. Since the date-time format we are modeling after is the ISO date-time format, the year portion alone is an acceptable date. So if you know ONLY the year of publication, you can still form a valid 'Date of Publication' as a microformat (which in turn is a valid 'year of publication'). Your mileage may vary when it comes to importing into citation applications.
<pre><nowiki>
<cite class="citation">
<a class="fn url" href="http://www.library.yale.edu/scilib/modmodexplain.html">eprint Moderator Model</a>
<span class="author vcard">
<a href="http://pantheon.yale.edu/~dstern/dsbio.html" class="url fn">David Stern</a>
</span> 
<abbr class="dtpublished" title="19990125T0000-0500">
    Jan 25, 1999
  </abbr>
</cite>
</nowiki></pre>


...


It seems to me that these can be collapsed to maybe one or two different date properties.  As far as the specific human readable formatting of the date, that can be chosen per whatever the presentation style guide says, and the [[datetime-design-pattern]] used to simplify the markup. - Tantek




== Old straw format discussion ==
'''Important'''
Sometimes we need a date range and not simply a date (e.g. 4-6 May 2006). See ''Conference Citation'' examples later on this page. - Discoleo


Saved here so that I'm not just deleting people's comments.
'''Seasons'''
Some journals have seasonal issues (e.g. "Summer 2006 edition") instead of, or as well as, editions labelled by month or other calendar-date. [[User:AndyMabbett|AndyMabbett]] 05:05, 4 Nov 2006 (PST)


=== Tags ===
Some of the citation formats have a place for 'keywords' or 'generic tags', etc. This might be a good place to reuse the [http://microformats.org/wiki/rel-tag RelTag microformat]. The downside is that the terms are then forced to be links, although that might be the correct way to mark them up anyway.


=== Mike straw format suggestion (Deprecated) ===
== past discussions ==
=== Original hBib Discussion ===
During the WWW2005 Developer's Day [[microformats]] track, Rohit Khare gave a [[presentations|presentation]] where he discussed the microformats [[process]], and then did  a quick demonstration wherein a bunch of us got on a shared Subethaedit document, and brainstormed some thoughts on what an "hBib" bibliography citation microformat would look like.  Rohit placed the [http://cnlabs.commerce.net/~rohit/hBib%20Discussion.html document on his Commercenet site].


In the interests of starting debate and having something concrete to fix, I suggest the following structure for a format. It is probably very incomplete and I claim no microformat expertise. I'm just trying to follow existing patterns. Comments and ridicule are both solicited. -Mike
* [[hBib Discussion]]


'''NOTE:''' This format is here for historical reference. Because it was not based on existing examples, I've deprecated it and contributed examples to Brian's format. If you feel that any missing elements in here should be in the final format, find examples for them and contribute to Brian's schema. Thanks! --[[User:Mike|Mike]] 18:22, 12 Oct 2006 (PDT)
''An attempt to summarize and inline the linked document follows. -Mike''


==== In General ====
Two major goals were outlined by the group:


The ''citation'' format is based on a set of fields common to many bibliographic data formats, which are often implied by standard citation display styles but not explicitly marked up in practice on the web.
* Avoid re-keying references
* Adapt to new journal styles by changing CSS
The fundamental problem was discussed in terms of display: the ability to transform XHTML+hBib into the many journal-specific formats. For example, how to display "et al." when all authors are present in the source, and how to re-order the elements if a style defines a set order of elements that conflicts with the ordering in the source. Using hCard for authors was agreed on, and the beginnings of an example were shown.


==== Schema ====
== Outstanding Issues ==
See [[citation-issues]].


The citation schema consists of the following:
== Examples in the wild ==
Pages that draw on the discussion above to create working examples of hcite:
(This section could be used as a base for a page like "hcite-examples-in-wild" later).


* cite
Please add new examples to the top of this section.
** title: required, text (class = fn)
* [http://www.demo.vorlagen.uni-erlangen.de/univis/mitarbeiter.shtml/georg-hager.shtml Example User Page] at the regional computer lab Erlangen, Germany, based on the universal information system UnivIS, marked up with vcard, hcalendar (optional, if the user gives a lecture) and hcite.
** subtitle: optional, text
** authors: optional, use hCard
** publication date: optional
** link(s) to instantiations, optional, url or use rel-enclosure? (class=url)
** UID, optional (for ISBN, DOI - use existing uid class) | permalink
** series (aka volume/issuenum) , optional (''not as sure how to handle these - suggestions?'')
** pages: startpage & endpage, optional, text
** venue, optional (hCard)
** publisher, optional (hCard)
** container: optional (nested hCite)
** abstract, optional (blockquote + class="abstract" ?)
** notes, optional (blockquote + class="notes" ?)
** keywords, optional (rel-tag)
** image, optional (for inclusion inline, unlike the url)
** copyright, optional (rel-license)
** ''what else am I missing?''
*** language, optional
 
''Looks good, but I question the use of hCard for names. Due to ambiguity issues, requiring hCard would lead to extra markup in order to apply just a name, hence [http://microformats.org/discuss/mail/microformats-discuss/2006-March/003487.html the need for a root element]. We should extract the N optimization of hCard like we did with adr, in order to ease this problem.'' --[[User:RCanine|Ryan Cannon]]
 
Perhaps a Retrieved Date or Access Date would be appropriate for citing online resources. For example at http://www.crlt.umich.edu/publinks/facment_biblio.html
you see citations like this<nowiki>:</nowiki>
<blockquote>
Chief Academic Officers of the Big 12 Universities (2000). Big 12 Faculty Fellowship Program. Retrieved December 20, 2000 from the World Wide Web: http://www.k-state.edu/provost/academic/big12/big12guide.htm.
</blockquote>
--[[User:JoeAndrieu|Joe Andrieu]]
 
 
==== Discussion about citing legal cases ====
 
Here's some info I found about citing law:
 
I'm not a lawyer, so I'm relying on the published [http://www.legalbluebook.com "blue book" standard], at least the only part of it I can get without paying $25. I'd be happy to hear improvements from experts in the field - how do lawyers mark up references to case law in HTML now?
 
From groklaw.net and eff.org, I find mostly just links to PDFs with the name of the case as the link text. Or just this, from EFF:
<pre><nowiki>
<h1>The Betamax Case</h1>
<h2>Sony Corp. of America v. Universal City Studios, 464 U.S. 417 (1984)</h2>
</nowiki></pre>
 
From an example at the sample bluepages: http://www.legalbluebook.com/pdfs/bluepages.pdf
5 basic components:
*1 name of the case (citation title)
*2 published source in which case may be found (citation containing publication?)
*3 a parenthetical indicating the court and year of decision (citation venue?)
*4 other parenthetical information, if any (citation notes?)
*5 subsequent history of the case, if any (citation notes?)
 
Here are two examples from the bluebook. Note that there are very strict rules about abbreviations in that source!
 
Holland v. Donnelly, 216 F. Supp. 2d 227, 230 (S.D.N.Y. 2002), aff'd, 324 F.3d 99 (2d Cir. 2003).
 
Green v. Georgia, 442 U.S. 95, 97 (1979) (per curiam) (holding that exclusion of relevant evidence at sentencing hearing constitutes denial of due process).


== discussions ==
* [[citation-irc-notes-2006-04-09]]
== See also ==
{{citation-related-pages}}

Latest revision as of 21:11, 26 July 2023


Part of the overall effort to develop a citation microformat.

Use Cases

To focus the discussion, please add use cases below that will help show what problems the citation microformat will be solving.

Use cases for both publishing and consuming citation information can help to focus citation brainstorming on efforts that provide real world utility to users.

For now, please add any use cases you think of, however common or obscure (feel free to note opinions as to expected/known frequency of use of such use cases).

improve web citations

Articles on the web often cite other online articles with permalinks (e.g. blogs quoting other resources, including blogs). Such web citations could be improved both in content and interaction in a number of useful ways:

  • richer citations. Existing web citations typically include only permalink URL and article title in a hyperlink. An explicit format (both in microformat and style) for web citations could encourage the use of richer citations with information like author and date(time) of publication. Author information is useful because it provides an immediate inline proxy for reputation, and date of publication is useful because it sets a context for the information backed by the citation.
  • richer citation interfaces. Web articles sometimes provide an explicit user interface to copy/paste a permalink for reference purposes, or a hyperlink embed code for linking to the article from another web article. An explicit markup structure/format could encourage such interfaces to provide a richer citation structure (e.g. including author, date of publication) to copy/paste, with little to no change in overall UI. This is useful in that it would help propagate richer citations themselves, which have the advantages mentioned above.

Additional useful rich citation enhancements:

  • access date. Rich citations could include the access date when an author (blogger) made a citation, because resources on the other side of those links can change without notice.
  • ...

I read this

A reader wants to collect a set of things they've read (e.g. on the web), perhaps for the purposes of cataloging them, adding notes, and using the information to generate later citations, potentially in other forms, such as BibTeX or Docbook, for inclusion in a publication of their own.

If web articles (e.g. blog posts) contained discoverable descriptions of self-citations (e.g. permalinks plus authorship), browsers/aggregators could both automatically collect these, perhaps as part of an enhanced browser history functionality, or allow explicit collection, e.g. bookmarking with additional structure.

Notes: In this case, it isn't important to the user what style the citation takes as displayed on the page where they find it. What *is* important is that it contains enough information to allow generation of the format they will ultimately re-publish it in. This implies that it may be worthwhile to err a little on the side of verbosity, but at most enough to provide typical TCMOS/APA/MLA citations.
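As a rough illustration of the re-publishing step, the sketch below turns a collected set of citation properties into a BibTeX entry. The property names (author, name, published, url, accessed), the function name, and the citation key are assumptions made for this example; the sample values come from the David Stern citation elsewhere on this page. Values are not escaped, so this is a starting point rather than a complete exporter.

```python
def to_bibtex(key, props):
    """Render collected citation properties as a BibTeX @misc entry."""
    # Map the assumed property names onto common BibTeX fields.
    fields = {
        "author": props.get("author"),
        "title": props.get("name"),
        # Only the year is kept; a year-only value such as "1999" also works.
        "year": (props.get("published") or "")[:4],
        "url": props.get("url"),
    }
    if props.get("accessed"):
        fields["note"] = "Accessed " + props["accessed"]

    lines = ["@misc{%s," % key]
    # Skip fields that are missing or empty.
    lines += ["  %s = {%s}," % (name, value)
              for name, value in fields.items() if value]
    lines.append("}")
    return "\n".join(lines)


print(to_bibtex("stern1999", {
    "author": "David Stern",
    "name": "eprint Moderator Model",
    "published": "1999-01-25",
    "url": "http://www.library.yale.edu/scilib/modmodexplain.html",
}))
```

Because missing fields are simply skipped, a citation with only a title and an access date still produces a usable entry, which matches the goal of tolerating sparse web citations.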

collect further reading

Was part of Acquiring reference information from the web.

A reader finds a list of citations (e.g. a paper's bibliography, an author's papers page, results of a search for academic papers), and wants to add them to a queue of things they'd like to read, perhaps as part of further research on whatever subject/person they were reading/researching.

Marking up the list of citations with a microformat would enable browsers/aggregators to present an explicit list of structured citations with a user interface for one-click addition to a read-it-later list (or a local reference database).

Links to downloadable full representations of the cited work (e.g. link to the PDF of a journal article, or to a music file) would help the reader find cited works, and perhaps even have their browser/aggregator prefetch/cache/download them.

Subscribing to reading lists, periodicals, etc

I would like to be able to leverage my news aggregator with hAtom to subscribe to a remote source for citation information, for example:

  • a reading list for a seminar
  • The publication list for a conference (e.g., subscribe to SIGGRAPH and see the updated conference proceedings every year)
  • the issues of a journal
  • a particular research group or researcher's publications
  • Not just research: a popular author's publications (e.g., Malcolm Gladwell's Archive)

Aggregating reading lists and reviews

A citation microformat-specific aggregator could provide a decentralized version of CiteULike. Libraries, authors, research groups, and publishers could mark up their collections, while other people on weblogs or review sites could add tags and reviews.

At the very least, a well-adopted microformat would make tools like CiteULike much easier to write, since CiteULike relies in some cases on screen-scraping publisher websites.

Cut & Paste from web pages

Capturing/copying HTML from web pages for use in other applications (especially when those apps present HTML as output), such as pasting into Word, or a specialized application like Google Notebook, Onfolio or Kaboodle. When such captures are made, it makes sense to keep track of the full citation data, including the date it was accessed, which may or may not be the date it was published.

Finding in Library

Find a copy of the cited work in a nearby library (as with OpenCOinS).

Buy a copy

Find the cited work on, for example, Amazon or ABE; or subscribe to a journal via its own website.

Find reviews

Find third-party reviews of the cited work.

Give citation data for the page being visited

Adding a class of, say, "self" to an attribute of the proposed strawman would allow users (or user agents) to extract the data required to cite the page being visited, when referring to it elsewhere. There would be the added advantage of allowing the citation to be ignored by any parser which might be building a "tree" of citations, and preventing the setting up of an infinite loop.

For evidence of published "self citation" data (albeit on a secondary page) see the "cite this article" link on any Wikipedia entry, e.g. [1] from [2].

See also Proposal to include on-page citation data in Wikipedia
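The loop-avoidance argument above can be sketched as follows. The input shape is hypothetical: `pages` maps a page URL to the `(cited_url, is_self)` pairs that some parser is assumed to have extracted, where `is_self` stands for the proposed "self" class. Self-citations are dropped, and a visited set keeps mutual citations from being followed forever.

```python
def collect_citations(start_url, pages):
    """Walk citation links breadth-first, ignoring self-citations.

    `pages` maps a page URL to a list of (cited_url, is_self) pairs,
    a hypothetical stand-in for what a real parser would extract.
    """
    seen = {start_url}
    queue = [start_url]
    edges = []
    while queue:
        page = queue.pop(0)
        for cited_url, is_self in pages.get(page, []):
            if is_self:
                # Describes the page itself, not an outgoing reference.
                continue
            edges.append((page, cited_url))
            if cited_url not in seen:
                seen.add(cited_url)
                queue.append(cited_url)
    return edges


pages = {
    "a": [("a", True), ("b", False)],
    "b": [("b", True), ("a", False)],
}
print(collect_citations("a", pages))
```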

Cite a journal on Wikipedia

  • (from a mailing list):

if you want to cite a [biomedical journal] journal article on Wikipedia [...] you can export a correctly-formatted citation for Wikipedia from HubMed using unAPI... http://hublog.hubmed.org/archives/001408.html

  • Zotero, a Firefox extension to help collect, manage, and cite research sources.

principles

Principles help guide and compare various different brainstorming proposals.

In the first three years of development the citation microformat effort generated a number of brainstorm proposals without clear consensus or adoption of any of them in particular. Thus any new (2012+) proposals must be written with references to particular principles for each design decision, justifying why/how the new proposal is an improvement upon previous proposals.

Principles to use:

  • microformats design principles
  • Semantic HTML Design Principles
  • use HTML semantics that are as precise as available

Semantic HTML Design Principles

  1. Reuse the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported microformats.
  2. When new schema are needed, reuse the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported other formats/standards by incorporation, following the microformats naming-principles. Re-do constraints expressed in the source standard from the perspective of microformats design principles and designed primarily for web authoring. Informatively mention source standard for reference purposes.
    1. For types with multiple components, use nested elements with class names equivalent to the names of the components.
    2. Plural components are made singular, and thus multiple nested elements are used to represent multiple text values that are comma-delimited.
  3. Use the most accurately precise semantic HTML building block for each object etc.
  4. Otherwise use a generic structural element (e.g. <span> or <div>), or the appropriate contextual element (e.g. an <li> inside a <ul> or <ol>).
  5. Use class names based on names from the original schema, unless the semantic HTML building block precisely represents that part of the original schema. If names in the source schema are case-insensitive, then use an all lowercase equivalent. Components names implicit in prose (rather than explicit in the defined schema) should also use lowercase equivalents for ease of use. Spaces in component names become dash '-' characters.
  6. Finally, if the format of the data according to the original schema is too long but still human readable/listenable, use <abbr> instead of a generic structural element, and place the literal longer data into the 'title' attribute (where abbr expansions go), and the briefer equivalent into the contents of the element itself. If, however, the format of the literal longer data is not human-friendly, then instead of <abbr>, use the value-class-pattern or HTML5 <time>/<data> elements as most semantically appropriate.

Brainstorm proposals should take into account the Semantic HTML Design Principles.

semantic elements to consider

One of the guiding principles of Microformats is to encourage the use of the most precisely semantically rich element to describe each node (Point 3 of Semantic HTML Design Principles: Use the most accurately precise semantic HTML building block for each object etc). Since we are dealing with HTML and citations, several elements are candidates to be used to enrich the semantic meaning: CITE, BLOCKQUOTE, Q, A (are there more?).

brainstorm proposals

web citations

Main article: h-cite

This brainstorm has now been moved to a draft microformat:

The remainder of this brainstorm proposal is left here for historical purposes:

The web citations proposal uses a smaller, simpler set of only eight properties to solve the specific problem of how to mark up citations in an article on the web that refers to other articles on the web. Offline to offline, and online to offline references are specifically not addressed.

web citations background

This work is based on how existing citation format styles (APA, MLA, TCMOS) represent references to articles on the web, and is designed to match the implied schema of those styles. The web citations proposal defines how to mark up such reference representation styles in order to satisfy the use cases above.

web citation illustrative example

Here is a simple minimal abstract web citation example:

<span class="h-cite">
  <time class="dt-published">YYYY-MM-DD</time> 
  <span class="p-author h-card">AUTHOR</span>: 
  <cite><a class="u-url p-name" href="URL">TITLE</a></cite>
</span>

web citation properties

root classname: h-cite

In rough order of presentation and relevance/frequency:

properties:

  • dt-published - reused from uf2 h-entry
  • p-author - same, with optional substructured h-card
  • p-name - common property instead of entry-title
  • u-url - a URL to access the cited work
  • u-uid - a URL/URI that uniquely/canonically identifies the cited work, canonical permalink.
  • p-publication - for citing articles in publications with more than one author, or perhaps when the author has a specific publication vehicle for the cited work. Also works when the publication is known, but the authorship information is either unknown, ambiguous, unclear, or collaboratively complex enough to be unable to list explicit author(s), e.g. like with many wiki pages.
  • dt-accessed - date the cited work was accessed for whatever reason it is being cited. Useful in case online work changes and it's possible to access the dt-accessed datetimestamped version in particular, e.g. via the Internet Archive.
  • p-content for when the citation includes the content itself, like when citing short text notes (e.g. tweets).
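To show how a consumer might read these properties, here is a minimal sketch using only Python's standard `html.parser`. It is not a conforming microformats2 parser: it assumes a fragment containing a single h-cite, takes u-* values from `href`, takes dt-* values from a `datetime` attribute when present (otherwise from the text), and does not build nested h-card objects. The class and function names and the example URL are made up for illustration.

```python
from html.parser import HTMLParser


class HCiteParser(HTMLParser):
    """Collect prefixed properties from a fragment holding one h-cite."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._open = []  # one (tag, [text property names]) per open element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        names = []
        for cls in (attrs.get("class") or "").split():
            prefix, _, name = cls.partition("-")
            if prefix == "u" and name and attrs.get("href"):
                self.props[name] = attrs["href"]
            elif prefix == "dt" and name and attrs.get("datetime"):
                self.props[name] = attrs["datetime"]
            elif prefix in ("p", "dt") and name:
                self.props.setdefault(name, "")
                names.append(name)  # value comes from the text content
        self._open.append((tag, names))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        if any(open_tag == tag for open_tag, _ in self._open):
            while self._open.pop()[0] != tag:
                pass

    def handle_data(self, data):
        # Text belongs to every open element that carries a text property.
        for _, names in self._open:
            for name in names:
                self.props[name] += data


def parse_hcite(html):
    parser = HCiteParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace in the collected values.
    return {k: " ".join(v.split()) for k, v in parser.props.items()}


example = (
    '<span class="h-cite">'
    '<time class="dt-published">2013-04-14</time> '
    '<span class="p-author h-card">Tantek Çelik</span>: '
    '<cite><a class="u-url p-name" href="http://example.com/post">'
    'Example post</a></cite></span>'
)
print(parse_hcite(example))
```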

web citations vs previous proposals

I think the biggest problem with all previous proposals is that they tried to do too much. They didn't design a citation microformat that could be used as a building block, but rather, erred on the side of attempting to describe the myriad types of references to dead-tree resources. They were so over-designed that their authors didn't even dogfood them on their own sites. -- Tantek 00:56, 7 August 2012 (UTC)

A primary goal of the web citation effort is to both start small, and always "make small possible", that is, no matter how it is extended, continue permitting very small meaningful citations with perhaps only 2-3 properties (e.g. date published, author, name of work).

web citations design principles

Principles driving this proposal:

  • solve a specific problem. In this case web citations seeks to solve a more specific problem than previous proposals, that of citations from the web to the web (more constrained than any publication to any publication).
  • solve simpler problems first. Existing web-to-web citations contain very little information compared to generalized academic citations, thus web citations is greatly simplified compared to previous proposals by only starting with a handful of properties.
  • humans first - web citations focuses on the human readability and writability aspects of citations in articles first and foremost, and only secondarily considers the machine readability/reusability of the data contained therein.
  • reuse building blocks - by re-using the better designed aspects of existing citation conventions for web resources, web citations builds on top of previous work to make citations human readable/writable, as well as what implied properties are commonly expressed by such previous work.

web citation property details

(stub)

All web citation properties are derived from the implied schema in existing citation styling guides for citing permalinks to articles and short text notes online.

Date-time properties (dt-published, dt-accessed) may optionally include time information in addition to the date if relevant to the citation (e.g. when citing short text notes (tweets) of which there may be several in a single day).
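Since a value may be a bare year, a date, or a full date-time, a consumer has to accept all three. The sketch below is one hedged way to do that with the standard library; the function name and the returned (precision, value) tuple are choices made for this example, and it assumes ISO 8601 input in the forms accepted by Python's `fromisoformat` (Python 3.7 or later).

```python
from datetime import date, datetime


def parse_cite_datetime(value):
    """Return (precision, value) for YYYY, YYYY-MM-DD, or an ISO date-time."""
    value = value.strip()
    if len(value) == 4 and value.isdigit():
        return ("year", int(value))
    if len(value) == 10:
        return ("date", date.fromisoformat(value))
    return ("datetime", datetime.fromisoformat(value))


print(parse_cite_datetime("1993"))
print(parse_cite_datetime("2013-04-14"))
print(parse_cite_datetime("2013-04-14T10:30:00-07:00"))
```

Keeping the precision alongside the value lets a formatter print "1993" for a year-only citation instead of inventing a month and day.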

To be added:

  • for each property, what equivalent TCMOS, APA, MLA terms/vocabulary is being expressed/captured as researched in the citation formats styles section.
  • transforms from the web citations proposal properties into each of those citation styles.
    • for citations of blog posts / articles
    • for citations of text notes / tweets
    • see examples in wild below for markup samples to style in each of the TCMOS/APA/MLA styles for blog/note citations.

web citation additional uses

The web citation proposal could be used for simple web-to-off-web citation use cases. As suggested by Ed Summers, dropping the hyperlink to the cited web article provides a simple off-web citation:

<span class="h-cite">
  <time class="dt-published">YYYY-MM-DD</time> 
  <span class="p-author h-card">AUTHOR</span>: 
  <cite class="p-name">TITLE</cite>
</span>

Next steps:

  • Try such markup with actual content being published on the web (perhaps a bibliography, list of papers in a resume, etc.)
  • See how it works/feels there
  • Determine what seems to be missing.
  • See if the "p-publisher" property helps in some web-to-off-web citation use cases.

web citation examples in the wild

Real world in the wild examples:

  • ... add uses of h-cite you see in the wild here.

Real but not quite wild (use by the brainstorm author)

<blockquote><p>
  <cite class="h-cite">
    <a class="u-url p-name" href="http://tantek.com/2013/104/t2/urls-readable-speakable-listenable-retypable"> 
      URLs should be readable, speakable, listenable, and unambiguously 
retypable, e.g. from print: tantek.com/w/ShortURLPrintExample #UX 
    </a> 
   (<abbr class="p-author h-card" title="Tantek Çelik">Çelik</abbr> 
    <time class="dt-published">2013-04-14</time>)
  </cite>
</p></blockquote>

web citation references

I've been iterating on this design for some time, but first publicly proposed it as the result of an interactive web citation design discussion during IndieWebCamp2012:



A Prescriptive Proposal

(Contributed 2008-07-09 by User:Paramaeleon to the citation page). Tantek 19:31, 27 July 2012 (UTC)

Here is a proposal which was derived from what one actually has to give as information in a citation in university work. (I don't know where to put that, so I put it right here.)

First, we need a frame, let's say "hcitation". Multiple citations can be put in a "hcitation" frame. Inside there, we need to describe the type of citation; I suggest "monograph", "anthology", "periodical" , "reference", "thesis" , "standard", "internet", or "specialist".

If a "label" was used to refer to the resource in the text (often in square brackets) it can be named so.

Here comes the list of field names we need: "article", "atime", "author", "ctime", "department", "edition", "editor", "eligibility", "employer", "number", "overalltitle", "pagerange", "part", "place", "publisher", "subseries", "title", "type", "url", "volume", "volumetitle", "year".

The field "page" marks up the page you actually quote from. Marking up text as "prefix" indicates that it is displayed in first place but ignored when sorting. E.g. "The" should be marked as "prefix" both in "The Crocodile" and in "Crocodile, the".
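A small sketch of what the "prefix" markup buys a consumer: if the prefix has been extracted separately from the rest of the title, sorting can ignore it while display keeps it. The (prefix, main) pair representation and the function names are assumptions for illustration only.

```python
def sort_titles(titles):
    """Sort (prefix, main) pairs by the main part only, ignoring case."""
    return sorted(titles, key=lambda title: title[1].casefold())


def display(title):
    """Show the prefix in first place when there is one."""
    prefix, main = title
    return prefix + " " + main if prefix else main


titles = [("The", "Crocodile"), ("", "Alligators"), ("A", "Zebra")]
print([display(t) for t in sort_titles(titles)])
```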

| Field | Description | monograph | anthology | periodical | thesis | standard | internet | specialist |
|---|---|---|---|---|---|---|---|---|
| article | Name of the Article in question | | 3 | 3 | | | | |
| atime | Last access time for online resources. Use abbr convention for datetime encoding. | | | 11 | | | 5 | |
| author | Creator. Use fn or n markup for every single entity. | 1 | 1 | 1 | 1 | | 1 | 1 |
| ctime | Date / Last modification. Use abbr convention for datetime encoding. | | | 8 | | | 4 | 5 |
| department | special field / faculty | | | | 6 | | | 3 |
| edition | Edition information | 6 | 8 | | | 2 | | |
| editor | Editors of an anthology. Use fn or n markup for every single entity. Add "transl" for translators and "comp" for compilers | | 4 | | | | | |
| eligibility | Qualification of a specialist | | | | | | | 2 |
| employer | Name of university eg. | | | | 4 | | | 4 |
| number | Number | 10 | 12 | 9 | | 1 | | |
| overalltitle | Overall Title / Title of series | 9 | 11 | | | | | 8 |
| pagerange | Page range of an article in an anthology / periodical | | 13 | 10 | | | | |
| part | Part of article (if having several parts) | | | 4 | | | | |
| place | Place of publication | 7 | 9 | | 5 | | | |
| publisher | Publication house | 8 | 10 | | | | | |
| subseries | Name of subseries, if any | | | 6 | | | | |
| title | The main title. Anthology: name of anthology. Periodical: name of periodical | 3 | 5 | 5 | 3 | 3 | 3 | 6 |
| type | Type (type of thesis or type of utterance (radio interview, e-mail, ...) of a specialist) | | | | 7 | | | 7 |
| url | URL | | | 12 | | | 6 | |
| volume | Volume information (eg. Vol. 22) | 4 | 6 | 7 | | | | |
| volumetitle | Volume title | 5 | 7 | | | | | |
| year | Year of appearance. 4 digit year. Use abbr convention for datetime encoding. | 2 | 2 | 2 | 2 | | 2 | |

This table shows which fields go together. The numbers give the typical order of the values within a citation of each type. Information other than that given here (e.g. ISBN) does not actually have to be put into citations; students would receive negative evaluations for including it. (I hope this will help somehow. Sorry for my bad English.)
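The numbers in the table can be read as a per-type field order. As a sketch, the orders for three of the types are transcribed below (monograph, standard, internet) and used to arrange whatever fields a citation actually has. The names `ORDER` and `ordered_fields` are hypothetical, and the sample values are placeholders.

```python
# Field order per citation type, transcribed from the table above.
ORDER = {
    "monograph": ["author", "year", "title", "volume", "volumetitle",
                  "edition", "place", "publisher", "overalltitle", "number"],
    "standard": ["number", "edition", "title"],
    "internet": ["author", "year", "title", "ctime", "atime", "url"],
}


def ordered_fields(kind, values):
    """Return (field, value) pairs in the typical order for this type.

    Fields that the table does not list for the type are left out,
    and listed fields with no value are skipped.
    """
    return [(field, values[field]) for field in ORDER[kind] if field in values]


print(ordered_fields("internet", {
    "url": "http://example.com/",
    "title": "Example page",
    "author": "A. Author",
    "isbn": "not listed for this type",
}))
```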

Sample Usage

<h1>The Bibliography</h1>

<table class="hcitation">
<tr>
    <th scope="row" style="font-variant: small-caps; ">[MR06]</th>
    <td class="monograph">
        <a name="mr06">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>, 
                <span class="given-name">Michael</span>
                <span class="additional-name">C.</span>
            </span> ;
            <span class="author">
                <span class="given-name">Mathew</span>
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>
            </span>
            (<span class="year">2006</span>):
            <span style="font-style: italic; ">
                <span class="title">Students' Jokes : A complete collection of jokes students laugh about</span>.
                Vol. <span class="volume">23</span>:
                <span class="volumetitle">Computational Linguists' Jokes</span>.
            </span>
            <span class="edition">4th completely revised Edition</span>.
            <span class="place">München</span> :
            <span class="publisher">Weltbild</span>
            (<span class="overalltitle">Fictional publications of munich's students</span>
            <span class="number">2675</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08a]</th>
    <td class="anthology">
        <a name="r08a">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>, 
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            „<span class="article">Using semantic HTML for bibliographic citations</span>.“
            In: 
            <span class="editor">
                <span class="given-name">Michael</span>
                <span class="additional-name">B.</span>
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>
            </span> ;
            <span class="editor">
                <span class="given-name">John</span>
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>
            </span>
            (Eds.)
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Being POSH : Usage of semantic HTML in web pages</span>.
                Vol. <span class="volume">4</span>:
                <span class="volumetitle">Whatever you read</span>.
            </span>
            <span class="edition">1st Edition</span>.
            <span class="place">New York</span> :
            <span class="publisher">Public Press</span>
            (<span class="overalltitle">Books on data processing</span>
            <span class="number">1435</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08b]</th>
    <td class="periodical">
        <a name="r08b">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>, 
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            „<span class="article">Using semantic HTML in scientific work</span>.“
            P. <span class="part">1</span>; P. <span class="part">2</span>.
            In:
            <span style="font-style: italic; ">
                <span class="title">The Computational Linguist</span>.
            </span>
            <span class="subseries">Development of the Semantic Web</span>.
            <span class="volume">2</span>
            (<span class="ctime">2008</span>)
            No. <span class="number">16</span>,
            Pp. <span class="pagerange">124–131</span>
            (Access: <span class="atime"><abbr title="20080714T1612+0200">14.07.2008 16:12 CEST</abbr></span>)
            &lt;<span class="url">http://www.sample.url/web/address/1234.pdf</span>&gt;
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[S07]</th>
    <td class="thesis">
        <a name="s07">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>, 
                <span class="given-name">John</span>
            </span>
            (<span class="year">2007</span>):
            <span style="font-style: italic; ">
                <span class="title">Semantic Data Extraction from the World Wide Web</span>.
            </span>
            <span class="employer">University of <span class="place">Munich</span></span>,
            <span class="department">Department of Computational Linguistics</span>,
            <span class="type">Diss.</span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[SVG11]</th>
    <td class="standard">
        <a name="svg11">
            <span class="number">ISO 1234567</span>
            (<span class="edition">1-2003</span>):
            <span style="font-style: italic; ">
                <span class="title">Scalable Vector Graphics (SVG) 1.1 Specification</span>.
            </span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[Wik08]</th>
    <td class="internet">
        <a name="wik08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wikipedia, the free encyclopedia</span>, 
            </span>
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Microformat</span>.
            </span>
            (Version: <abbr class="ctime" title="2008-06-19">19th June 2008</abbr>.
            Access: <abbr class="atime" title="20080703T1423+0200">3rd July 2008 14:23 CEST</abbr>)
        </a>
        &lt;<a href="http://en.wikipedia.org/w/index.php?title=Microformat&amp;oldid=220275451" class="url">http://en.wikipedia.org/w/index.php?title=Microformat&amp;oldid=220275451</a>&gt;
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[W08]</th>
    <td class="specialist">
        <a name="w08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wang</span>, 
                <span class="given-name">Wu</span>
            </span>
            (<span class="eligibility">Professor of Informatics</span>,
            <span class="department">Department of Applied Sciences</span>,
            <span class="employer">University of Michigan</span>)
            (<abbr class="ctime" title="20000801T0918+0100D0007">01.08.2000, 9:18–9:25 MEZ</abbr>)
            <span class="title">Science News</span>.
            <span class="type">Interview</span>.
            <span class="overalltitle">Michigan Television</span>
        </a>
    </td>
</tr>
</table>



XHTML Structure

My experience working on X2V and hCa* has taught me which elements are easy to find and which are not. Since the citation microformat is very new, it is possible to avoid repeating many of the same errors and to make it easier for extracting applications to find and imply certain properties.

  • There should be some sort of 'root node' that implies all child elements are for the hCitation microformat.
  • Since most people will have multiple citations, there should be a way to represent each hCitation object as a unique block independent of the others. This is to keep the parser from finding 'author' and applying it to all citations. Each citation should be in a container (class="hcite") that is separated from the others.
  • Perhaps class="hcite" with <cite> recommended as the root element. E.g. <cite class="hcite">

Note: This section was the original content of the document. Since then, class='hcite' has been agreed on as the root class name. See explanation.
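
The scoping argument can be sketched with Python's standard html.parser module. This is only an illustration, not a conforming parser: the class name HciteParser is hypothetical, every class name inside a root is treated as a property, and the input is assumed to be well-formed XHTML (unclosed elements such as a bare <br> would throw off the depth counter).

```python
from html.parser import HTMLParser


class HciteParser(HTMLParser):
    """Collect class-named properties separately for each class="hcite" root."""

    def __init__(self):
        super().__init__()
        self.citations = []   # one dict of properties per hcite root
        self.depth = 0        # element depth inside the current root (0 = outside)
        self.open_props = []  # (depth, name) for property elements still open

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.depth == 0:
            if "hcite" in classes:
                # A new root starts a fresh, independent citation.
                self.citations.append({})
                self.depth = 1
            return
        self.depth += 1
        for name in classes:
            self.open_props.append((self.depth, name))
            self.citations[-1].setdefault(name, []).append("")

    def handle_endtag(self, tag):
        if self.depth == 0:
            return
        # Close any properties opened by the element that is ending.
        self.open_props = [p for p in self.open_props if p[0] != self.depth]
        self.depth -= 1

    def handle_data(self, data):
        # Text belongs to every property element that is currently open.
        for _, name in self.open_props:
            self.citations[-1][name][-1] += data


doc = (
    '<ul><li><cite class="hcite"><span class="author">Roth</span>, '
    '<span class="title">Being POSH</span></cite></li>'
    '<li><cite class="hcite"><span class="author">Smith</span></cite></li></ul>'
)
parser = HciteParser()
parser.feed(doc)
print(parser.citations)
```

Because each root opens its own dictionary, the 'author' found in the first citation is never applied to the second.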



how to use with HTML5

Per Theresa O'Connor's email to public-html: Add a section in the citation microformat describing how to use the citation microformat in HTML5, including optional use of HTML5's <time> element and microdata feature. Encourage HTML5 to drop the "BibTeX" predefined microdata vocabulary and reference an updated citation microformat spec instead. Tantek



OCLC's WorldCat for titles

Question: what about using something like OCLC's WorldCat for linking titles? - Tim White



This and That

After reading through a lot of different citation encoding formats, I noticed that each format was being used in one of two ways: either to describe the current page (THIS.PAGE) or to encode references that point to external resources (THAT.PAGE).

The information being encoded was identical for both resources (author, date, name, etc.); they just reference different things. For this microformat, I'm not sure whether we want to try to solve both problems or just one. The meta tags in the head element would be the ideal place for information about THIS.PAGE, but that does not follow the microformats ideal that information be human-readable. The THAT.PAGE idea, where a list of references sits at the end of a document in the form of a bibliography, is more in line with that ideal. That doesn't mean that data about the current document shouldn't be human-readable, so some of the same properties used to reference external resources can be used for the current document (THIS.PAGE). To do this, a different root item could be used, and transforming applications could extract either the citation data about the current page or information about this page's references.

This is open for discussion, but either way, I believe that the properties used to describe a page will be the same for both THIS and THAT. brian suda

More on This and That

Citation microformats are being explored as a possibility for citing genealogical information at Dan Lawyer's blog.

This is a case where the citation would frequently refer to THIS.PAGE but would have nested within it a reference to THAT.PAGE, possibly a few levels deep. For instance, a web page might contain data extracted from a microfilm of a census. The citation would need to include information about the web page, information about the microfilm, and information about the census. Genealogical citations are expected to include the repository (where this book or microfilm can be found; is this the same as venue?). So, at each level the information should contain the repository of the referenced item. A nesting (recursive) mechanism for citation microformats would be useful in this case. Is this the function of the "container" element in the Straw Format?
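
To make the nesting idea concrete, here is a minimal sketch of a recursive citation record. The Citation class, its field names and the repository values are all hypothetical illustrations; nothing here is taken from the Straw Format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    title: str
    repository: Optional[str] = None      # where this item can be found
    source: Optional["Citation"] = None   # the item this one was derived from


def provenance(citation):
    """Walk from the cited page down to the original record."""
    chain = []
    while citation is not None:
        chain.append((citation.title, citation.repository))
        citation = citation.source
    return chain


# Hypothetical example data: web page -> microfilm -> census.
census = Citation("Census", repository="National archive")
microfilm = Citation("Microfilm of the census", repository="Family history library", source=census)
page = Citation("Web page with extracted data", repository="Web", source=microfilm)
print(provenance(page))
```

Each level carries its own repository, which is the requirement described above for genealogical citations.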



MARC / MODS / Dublin Core

The MODS (example) and Dublin Core (example) transformations of MARC21 may contain some useful ideas.

Here's a first attempt at rewriting the linked examples in XHTML (written in response to a mailing list query about encoding book information with microformats):

<div class="book" lang="en">
  <h3 class="fn">Arithmetic /</h3>
  <p>By <span class="creator"><span class="fn">Sandburg, Carl</span>,
     <span class="date">1878-1967</span></span>,
     and <span class="illustrator">Rand, Ted</span></p>
  <p>Publisher: <span class="publisher"><span class="fn">Harcourt Brace Jovanovich</span>,
     <span class="locality">San Diego</span></span></p>
  <p>Published: <span class="issued">1993</span></p>
  <p class="description">A poem about numbers and their characteristics. Features
     anamorphic, or distorted, drawings which can be restored to normal by viewing
     from a particular angle or by viewing the image's reflection in the provided
     Mylar cone.</p>
  <p class="note">One Mylar sheet included in pocket.</p>
  <p>Subjects:</p>
  <ul>
    <li class="subject">Arithmetic</li>
    <li class="subject">Children's poetry, American.</li>
    <li class="subject">Arithmetic</li>
    <li class="subject">American poetry</li>
    <li class="subject">Visual perception</li>
  </ul>
</div>

comparison and use of other microformats

Citation vs. media-info

What distinguishes a cite from say media-info (e.g. media-info-examples) is that a cite is a reference to something explicitly external to the current piece of content or document, whereas media-info describes information about content embedded or inline in the current document.

Date Formatting

Since microformats are all about re-use, and the accepted way to encode date-time has been pretty much settled, this is a good place to start when dealing with all the different date citation types.

These are all the different fields from various citation formats that are of temporal nature:

* Date (available | created | dateAccepted | dateCopyrighted | dateSubmitted | issued | modified | valid)
* originInfo/dateIssued
* originInfo/dateCreated
* originInfo/dateCaptured
* originInfo/dateOther
* month
* year
* Copyright Year
* Date - Generic
* Date of Conference
* Date of Publication
* Date of update/revision/issuance of database record
* Former Date
* Entry Date for Database Record
* Database Update
* Year of Publication

Several properties are common across citation domains and will certainly be in the citation microformat. The unique instances will need further consideration, otherwise there could be no end to the possibilities.

There are also several properties (year, month, Year of Publication) that can be extracted from another source. If you encode only a more specific property such as Date of Publication, you can extract the 'year of publication' from it. Since the date-time format we are modeling after is the ISO date-time format, the year portion alone is an acceptable date. So if you ONLY know the year of publication, then you can form a valid 'Date of Publication' as a microformat (which in turn is a valid 'year of publication'). Your mileage may vary when it comes to importing into citation applications.
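
A minimal sketch of that extraction, assuming the encoded value starts with a four-digit year in either the extended (2008-06-19) or basic (20080703T1423+0200) ISO form; the function name year_of is hypothetical.

```python
import re

# Four-digit year, optionally followed by month and day, with or without
# hyphens. Anything after the date part (time, zone) is ignored.
ISO_DATE = re.compile(r"^(\d{4})(?:-?(\d{2})(?:-?(\d{2}))?)?")


def year_of(date_value):
    """Return the year from an ISO-style date string, or None if there is none."""
    match = ISO_DATE.match(date_value)
    return int(match.group(1)) if match else None


for value in ("2008-06-19", "20080703T1423+0200", "1993"):
    print(value, year_of(value))
```

A year-only value such as "1993" passes through unchanged, which matches the point that the year alone is already an acceptable date.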

...

It seems to me that these can be collapsed to maybe one or two different date properties. As far as the specific human readable formatting of the date, that can be chosen per whatever the presentation style guide says, and the datetime-design-pattern used to simplify the markup. - Tantek


Important: Sometimes we need a date range and not simply a date (e.g. 4-6 May 2006). See Conference Citation examples later on this page. - Discoleo

Seasons: Some journals have seasonal issues (e.g. "Summer 2006 edition") instead of, or as well as, editions labelled by month or other calendar-date. AndyMabbett 05:05, 4 Nov 2006 (PST)

Tags

Some of the citation formats have a place for 'keywords' or 'generic tags', etc. This might be a good place to re-use the RelTag microformat. The downside would be that the terms are then forced to be links, although that might be the correct way to mark them up anyway.
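
For reference, rel-tag takes the tag from the last path segment of the link's URL rather than from the link text. A minimal sketch of that rule follows; the function name tag_from_href and the example.org URLs are hypothetical, and ignoring a trailing slash is an assumption of this sketch.

```python
from urllib.parse import unquote, urlsplit


def tag_from_href(href):
    """Return the tag encoded in a rel="tag" link: the last path segment."""
    path = urlsplit(href).path.rstrip("/")  # assumption: trailing slash ignored
    return unquote(path.rsplit("/", 1)[-1])


print(tag_from_href("http://example.org/tags/semantic%20web"))
print(tag_from_href("http://example.org/tags/poetry/"))
```

A keyword list in a citation would therefore need a tag space (a URL prefix) to link into, which is the cost of forcing keywords to be links.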

past discussions

Original hBib Discussion

During the WWW2005 Developer's Day microformats track, Rohit Khare gave a presentation where he discussed the microformats process and then did a quick demonstration in which a bunch of us got on a shared SubEthaEdit document and brainstormed some thoughts on what an "hBib" bibliography citation microformat would look like. Rohit placed the document on his CommerceNet site.

An attempt to summarize and inline the linked document follows. -Mike

Two major goals were outlined by the group:

  • Avoid re-keying references
  • Adapt to new journal styles by changing CSS

The fundamental problem was discussed in terms of display: the ability to transform XHTML+hBib into the many journal-specific formats. For example, how to display "et al." when all authors are present in the source, and how to re-order the elements if a style defines a set order of elements that conflicts with the ordering in the source. Using hCard for authors was agreed on, and the beginnings of an example were shown.
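
The "et al." question is the kind of transformation that has to happen outside the markup, since the source keeps every author. A minimal sketch, assuming a hypothetical style rule that lists at most max_shown authors (real journal styles differ in both threshold and separators):

```python
def format_authors(authors, max_shown=3):
    """Render an author list, abbreviating with "et al." when it is too long.

    `max_shown` is a hypothetical style parameter, not taken from any
    particular journal style.
    """
    if len(authors) <= max_shown:
        return "; ".join(authors)
    return f"{authors[0]} et al."


print(format_authors(["Miller, Michael C.", "Roth, Mathew"]))
print(format_authors(["Miller", "Roth", "Smith", "Wang"]))
```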

Outstanding Issues

See citation-issues.

Examples in the wild

Pages that have started to use the discussion above to create working examples of hcite. (This section could later serve as the basis for a page like "hcite-examples-in-wild".)

Please add new examples to the top of this section.

  • Example User Page at the regional computer lab in Erlangen, Germany, based on the universal information system UnivIS and marked up with vcard, hCalendar (optional, if the user gives a lecture) and hcite.

discussions

See also