citation
Citation microformat efforts
This wiki page outlines the overall effort to develop a citation microformat. We are documenting current examples of cites/citations on the web today, their implicit/explicit schemas, and current cite/citation formats, with the intent of deriving a cite microformat from that research.
- Authors
- Tantek Çelik
- Brian Suda
- Ed Summers
Copyright
This specification is (C) 2004-2025 by the authors. However, the authors intend to submit (or already have submitted, see details in the spec) this specification to a standards body with a liberal copyright/licensing policy such as the GMPG, IETF, and/or W3C. Anyone wishing to contribute should read their copyright principles, policies and licenses (e.g. the GMPG Principles) and agree to them, including licensing of all contributions under all required licenses (e.g. CC-by 1.0 and later), before contributing.
- Tantek: I release all my contributions to this specification into the public domain and I encourage the other authors to do so as well.
- When all authors/editors have done so, we can remove the MicroFormatCopyrightStatement template reference and replace it with the MicroFormatPublicDomainContributionStatement.
 
- Brian Suda: I release all my contributions to this specification into the public domain and I encourage the other authors to do so as well.
Semantic XHTML Design Principles
Note: the Semantic XHTML Design Principles were written primarily within the context of developing hCard and hCalendar, thus it may be easier to understand these principles in the context of the hCard design methodology (i.e. read that first). Tantek
XHTML is built on XML, and thus XHTML-based formats can be used not only for convenient display presentation, but also for general-purpose data exchange. In many ways, XHTML-based formats exemplify the best of both the HTML and XML worlds. However, when building XHTML-based formats, it helps to have a guiding set of principles.
- Reuse the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported standards by reference.  Avoid restating constraints expressed in the source standard.  Informative mentions are ok.
- For types with multiple components, use nested elements with class names equivalent to the names of the components.
- Plural components are made singular, and thus multiple nested elements are used to represent multiple text values that are comma-delimited.
 
- Use the most accurate and precise semantic XHTML building block for each object, etc.
- Otherwise use a generic structural element (e.g. <span> or <div>), or the appropriate contextual element (e.g. an <li> inside a <ul> or <ol>).
- Use class names based on names from the original schema, unless the semantic XHTML building block precisely represents that part of the original schema. If names in the source schema are case-insensitive, then use an all-lowercase equivalent. Component names implicit in prose (rather than explicit in the defined schema) should also use lowercase equivalents for ease of use. Spaces in component names become dash '-' characters.
- Finally, if the format of the data according to the original schema is too long and/or not human-friendly, use <abbr> instead of a generic structural element, and place the literal data into the 'title' attribute (where abbr expansions go), and the more brief and human-readable equivalent into the element itself. Further informative explanation of this use of <abbr>: Human vs. ISO8601 dates problem solved
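The last two principles can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of any specification: the helper names `to_class_name` and `abbr_datetime` are hypothetical, the source schema is assumed to be case-insensitive, and `datetime.isoformat()` emits the extended ISO 8601 form (with dashes and colons) rather than the compact form used elsewhere on this page.

```python
from datetime import datetime, timedelta, timezone
from html import escape


def to_class_name(component: str) -> str:
    """Hypothetical helper: lowercase a schema component name and
    turn runs of spaces into single dashes."""
    return "-".join(component.strip().lower().split())


def abbr_datetime(class_name: str, when: datetime, human_text: str) -> str:
    """Hypothetical helper: put the machine-readable ISO 8601 value in
    the 'title' attribute and the human-friendly text inside <abbr>."""
    return '<abbr class="{}" title="{}">{}</abbr>'.format(
        escape(class_name, quote=True),
        escape(when.isoformat(), quote=True),
        escape(human_text),
    )


if __name__ == "__main__":
    print(to_class_name("Family Name"))
    accessed = datetime(2008, 7, 3, 14, 23, tzinfo=timezone(timedelta(hours=2)))
    print(abbr_datetime("atime", accessed, "3rd July 2008 14:23 CEST"))
```

Escaping both the attribute value and the element text keeps the generated markup well-formed even when the human-readable text contains characters such as `&` or `<`.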
Example Citations
Citation Examples are citations found in the wild that could benefit from semantic mark-up. This is a growing list of examples from all sorts of places including W3C specifications, RFCs and others. These are the examples which will determine the schema for the citation microformat.
Known Citation Formats
The Citation Formats page will keep a running list of known formats for publishing citations.
Eventually, I would like to see a chart of how each value from the implicit schema determined by the citation-examples is represented in each format, and which formats have additional properties that do not map to the others. (For example, one format may call a property 'author' while another calls the same property 'writer'.)
A Prescriptive Proposal
Here is a proposal derived from the information one actually has to give in a citation in university work. (I don't know where to put that, so I put it right here.)
First, we need a frame, let's say "hcitation". Multiple citations can be put in one "hcitation" frame. Inside it, we need to describe the type of each citation; I suggest "monograph", "anthology", "periodical", "reference", "thesis", "standard", "internet", or "specialist".
If a label was used to refer to the resource in the text (often in square brackets), it can be marked up as "label".
Here comes the list of field names we need: "article", "atime", "author", "ctime", "department", "edition", "editor", "eligibility", "employer", "number", "overalltitle", "pagerange", "part", "place", "publisher", "subseries", "title", "type", "url", "volume", "volumetitle", "year".
The field "page" marks up the page you actually quote from. Marking something up as "prefix" hints that it is to be put in first place when displayed, but ignored when sorting. E.g. "The" should be marked as "prefix" both in "The Crocodile" and in "Crocodile, the".
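A minimal Python sketch of how a consumer might treat "prefix" when sorting. The `Title` type and both functions are hypothetical names introduced for illustration, and the sketch assumes the prefix has already been extracted from the markup.

```python
from typing import NamedTuple


class Title(NamedTuple):
    prefix: str  # text marked up as "prefix", e.g. "The"; empty if none
    main: str    # the rest of the title, the only part used for sorting


def display(title: Title) -> str:
    """The prefix goes first when the title is shown."""
    return f"{title.prefix} {title.main}".strip()


def sort_titles(titles: list[Title]) -> list[Title]:
    """Sort by the main part only, ignoring the prefix and letter case."""
    return sorted(titles, key=lambda t: t.main.casefold())


if __name__ == "__main__":
    shelf = [Title("The", "Crocodile"), Title("", "Zebra"), Title("An", "Antelope")]
    for entry in sort_titles(shelf):
        print(display(entry))
```

With this approach "The Crocodile" sorts under C, whichever of the two written forms the page author chose.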
| Field | Description | monograph | anthology | periodical | thesis | standard | internet | specialist |
|---|---|---|---|---|---|---|---|---|
| article | Name of the article in question | | 3 | 3 | | | | |
| atime | Last access time for online resources. Use abbr convention for datetime encoding. | | | 11 | | | 5 | |
| author | Creator. Use fn or n markup for every single entity. | 1 | 1 | 1 | 1 | | 1 | 1 |
| ctime | Date / Last modification. Use abbr convention for datetime encoding. | | | 8 | | | 4 | 5 |
| department | Special field / faculty | | | | 6 | | | 3 |
| edition | Edition information | 6 | 8 | | | 2 | | |
| editor | Editors of an anthology. Use fn or n markup for every single entity. Add "transl" for translators and "comp" for compilers | | 4 | | | | | |
| eligibility | Qualification of a specialist | | | | | | | 2 |
| employer | Name of the employer, e.g. a university | | | | 4 | | | 4 |
| number | Number | 10 | 12 | 9 | | 1 | | |
| overalltitle | Overall title / Title of series | 9 | 11 | | | | | 8 |
| pagerange | Page range of an article in an anthology / periodical | | 13 | 10 | | | | |
| part | Part of article (if having several parts) | | | 4 | | | | |
| place | Place of publication | 7 | 9 | | 5 | | | |
| publisher | Publishing house | 8 | 10 | | | | | |
| subseries | Name of subseries, if any | | | 6 | | | | |
| title | The main title. Anthology: name of anthology. Periodical: name of periodical | 3 | 5 | 5 | 3 | 3 | 3 | 6 |
| type | Type (type of thesis, or type of utterance of a specialist: radio interview, e-mail, ...) | | | | 7 | | | 7 |
| url | URL | | | 12 | | | 6 | |
| volume | Volume information (e.g. Vol. 22) | 4 | 6 | 7 | | | | |
| volumetitle | Volume title | 5 | 7 | | | | | |
| year | Year of appearance. 4 digit year. Use abbr convention for datetime encoding. | 2 | 2 | 2 | 2 | | 2 | |
This table shows which fields go together. The numbers give the typical order of the values within each citation type. Information other than that given here (e.g. ISBN) should actually not be put into citations; students would receive negative evaluations if they did so. (I hope this helps somehow. Sorry for my bad English.)
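The ordering in the table can be expressed as data. The following Python sketch is one possible reading, not a normative mapping. It assumes each number belongs to the column in which it completes an unbroken 1..n sequence for that citation type, and the names `FIELD_ORDER` and `order_fields` are hypothetical.

```python
# Typical field order per citation type, as read from the table above.
# The assignment of numbers to columns is an assumption of this sketch.
FIELD_ORDER = {
    "monograph": ["author", "year", "title", "volume", "volumetitle",
                  "edition", "place", "publisher", "overalltitle", "number"],
    "anthology": ["author", "year", "article", "editor", "title", "volume",
                  "volumetitle", "edition", "place", "publisher",
                  "overalltitle", "number", "pagerange"],
    "periodical": ["author", "year", "article", "part", "title", "subseries",
                   "volume", "ctime", "number", "pagerange", "atime", "url"],
    "thesis": ["author", "year", "title", "employer", "place",
               "department", "type"],
    "standard": ["number", "edition", "title"],
    "internet": ["author", "year", "title", "ctime", "atime", "url"],
    "specialist": ["author", "eligibility", "department", "employer",
                   "ctime", "title", "type", "overalltitle"],
}


def order_fields(kind: str, values: dict[str, str]) -> list[tuple[str, str]]:
    """Return the (field, value) pairs present in `values` in table order.
    Fields not listed for this citation type are silently dropped."""
    return [(name, values[name]) for name in FIELD_ORDER[kind] if name in values]


if __name__ == "__main__":
    svg = {"title": "Scalable Vector Graphics (SVG) 1.1 Specification",
           "number": "ISO 1234567", "edition": "1-2003"}
    for field, value in order_fields("standard", svg):
        print(field, "=", value)
```

Dropping unlisted fields mirrors the remark that extra information such as an ISBN does not belong in these citations; a stricter consumer could raise an error instead.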
Sample Usage
<h1>The Bibliography</h1>
<table class="hcitation">
<tr>
    <th scope="row" style="font-variant: small-caps; ">[MR06]</th>
    <td class="monograph">
        <a name="mr06">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>, 
                <span class="given-name">Michael</span>
                <span class="additional-name">C.</span>
            </span> ;
            <span class="author">
                <span class="given-name">Mathew</span>
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>
            </span>
            (<span class="year">2006</span>):
            <span style="font-style: italic; ">
                <span class="title">Students' Jokes : A complete collection of jokes students laugh about</span>.
                Vol. <span class="volume">23</span>:
                <span class="volumetitle">Computational Linguists' Jokes</span>.
            </span>
            <span class="edition">4th completely revised Edition</span>.
            <span class="place">München</span> :
            <span class="publisher">Weltbild</span>
            (<span class="overalltitle">Fictional publications of munich's students</span>
            <span class="number">2675</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08a]</th>
    <td class="anthology">
        <a name="r08a">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>, 
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            „<span class="article">Using semantic HTML for bibliographic citations</span>.“
            In: 
            <span class="editor">
                <span class="given-name">Michael</span>
                <span class="additional-name">B.</span>
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>
            </span> ;
            <span class="editor">
                <span class="given-name">John</span>
                <span class="family-name" style="font-variant: small-caps; ">Miller</span>
            </span>
            (Eds.)
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Being POSH : Usage of semantic HTML in web pages</span>.
                Vol. <span class="volume">4</span>:
                <span class="volumetitle">Whatever you read</span>.
            </span>
            <span class="edition">1st Edition</span>.
            <span class="place">New York</span> :
            <span class="publisher">Public Press</span>
            (<span class="overalltitle">Books on data processing</span>
            <span class="number">1435</span>)
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[R08b]</th>
    <td class="periodical">
        <a name="r08b">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Roth</span>, 
                <span class="given-name">Mathew</span>
            </span>
            (<span class="year">2008</span>):
            „<span class="article">Using semantic HTML in scientific work</span>.“
            P. <span class="part">1</span>; P. <span class="part">2</span>.
            In:
            <span style="font-style: italic; ">
                <span class="title">The Computational Linguist</span>.
            </span>
            <span class="subseries">Development of the Semantic Web</span>.
            <span class="volume">2</span>
            (<span class="ctime">2008</span>)
            No. <span class="number">16</span>,
            Pp. <span class="pagerange">124–131</span>
            (Access: <span class="atime"><abbr title="20080714T1612+0200">14.07.2008 16:12 CEST</abbr></span>)
            &lt;<span class="url">http://www.sample.url/web/address/1234.pdf</span>&gt;
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[S07]</th>
    <td class="thesis">
        <a name="s07">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Smith</span>, 
                <span class="given-name">John</span>
            </span>
            (<span class="year">2007</span>):
            <span style="font-style: italic; ">
                <span class="title">Semantic Data Extraction from the World Wide Web</span>.
            </span>
            <span class="employer">University of <span class="place">Munich</span></span>,
            <span class="department">Department of Computational Linguistics</span>,
            <span class="type">Diss.</span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[SVG11]</th>
    <td class="standard">
        <a name="svg11">
            <span class="number">ISO 1234567</span>
            (<span class="edition">1-2003</span>):
            <span style="font-style: italic; ">
                <span class="title">Scalable Vector Graphics (SVG) 1.1 Specification</span>.
            </span>
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[Wik08]</th>
    <td class="internet">
        <a name="wik08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wikipedia, the free encyclopedia</span>, 
            </span>
            (<span class="year">2008</span>):
            <span style="font-style: italic; ">
                <span class="title">Microformat</span>.
            </span>
            (Version: <abbr class="ctime" title="2008-06-19">19th June 2008</abbr>.
            Access: <abbr class="atime" title="20080703T1423+0200">3rd July 2008 14:23 CEST</abbr>)
            &lt;<a href="http://en.wikipedia.org/w/index.php?title=Microformat&amp;oldid=220275451" class="url">http://en.wikipedia.org/w/index.php?title=Microformat&amp;oldid=220275451</a>&gt;
        </a>
    </td>
</tr>
<tr>
    <th scope="row" style="font-variant: small-caps; ">[W08]</th>
    <td class="specialist">
        <a name="w08">
            <span class="author firstauthor">
                <span class="family-name" style="font-variant: small-caps; ">Wang</span>, 
                <span class="given-name">Wu</span>
            </span>
            (<span class="eligibility">Professor of Informatics</span>,
            <span class="department">Department of Applied Sciences</span>,
            <span class="employer">University of Michigan</span>)
            (<abbr class="ctime" title="20000801T0918+0100D0007">01.08.2000, 9:18–9:25 MEZ</abbr>)
            <span class="title">Science News</span>.
            <span class="type">Interview</span>.
            <span class="overalltitle">Michigan Television</span>
        </a>
    </td>
</tr>
</table>
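To show how markup like the sample could be consumed, here is a rough Python sketch using only the standard library's `html.parser`. It is not an official parser for this proposal: the class `CitationFieldParser` is a hypothetical name, the markup is assumed to be well-formed, and nested elements (such as the name parts inside "author") are simply flattened into the text of the enclosing field.

```python
from html.parser import HTMLParser

# Field names taken from the proposal's list of fields.
FIELD_NAMES = {
    "article", "atime", "author", "ctime", "department", "edition", "editor",
    "eligibility", "employer", "number", "overalltitle", "pagerange", "part",
    "place", "publisher", "subseries", "title", "type", "url", "volume",
    "volumetitle", "year",
}
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # never closed


class CitationFieldParser(HTMLParser):
    """Hypothetical sketch: collects (field, text) pairs from elements
    whose class attribute names one of the proposed fields."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.found = []   # (field, text) pairs, in order of closing tag
        self._open = []   # stack of (field or None, list of text chunks)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((c for c in classes if c in FIELD_NAMES), None)
        self._open.append((field, []))

    def handle_data(self, data):
        # Text belongs to every field element that is currently open.
        for field, chunks in self._open:
            if field is not None:
                chunks.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        field, chunks = self._open.pop()
        if field is not None:
            # Collapse runs of whitespace left over from indentation.
            self.found.append((field, " ".join("".join(chunks).split())))


SAMPLE = """
<td class="monograph">
  <span class="author firstauthor">
    <span class="family-name">Miller</span>,
    <span class="given-name">Michael</span>
  </span>
  (<span class="year">2006</span>):
  <span class="title">Students' Jokes</span>.
</td>
"""

if __name__ == "__main__":
    parser = CitationFieldParser()
    parser.feed(SAMPLE)
    parser.close()
    for field, text in parser.found:
        print(field, "=", text)
```

A real consumer would also need to read the `title` attribute of `<abbr>` elements for "atime" and "ctime", which this sketch leaves out for brevity.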
Issues
The citation-issues page is intended to capture ongoing issues.
To Do
- Using existing class names and creating new names, create property names for the profile
- Based on implicit schemas in citation-examples, and terms from one or more citation-formats, do some citation-brainstorming for a simple citation microformat.
- Create additional strawman proposals
Modularity
My hope for this microformat is that it can serve as a module that other microformats can use. Once it is developed and fleshed out, citation references could easily be used for publications on a resume/CV; the citation microformat would then be a module (subset) of all the possible resume values.
Other Microformats that could use the Citation Module
Other Microformats that the Citation Module will use
- hCard encodings for things like Author, Publisher (people and companies)
- hAtom encodings as a possible container, and author/date-time properties
- rel-tag encoding for keywords
- rel-license encoding for copyright
References
Informative References
- COinS
- XMLResume: if part of the drive for citations is for publications for a resume/CV then some of this information could be useful
- CiteULike is a free service that helps academics share, store, and organise the academic papers they are reading
- Connotea is a scientific bookmarking service from Nature.
- OpenURL with Autodiscovery
- "Gather, Create, Share" and "Personal Collection Systems" memes, and systems implementing either or both
- Metadata Object Description Schema developed by the Library of Congress
- Guidelines for Encoding Bibliographic Citation Information in Dublin Core Metadata
- BibTeX reference from Dana Jacobsen
- RIS Format Specification from Thomson ResearchSoft, makers of ReferenceManager
- Zotero - "Firefox extension to help you collect, manage, and cite your research sources"
- DOI (CrossRef Guidelines for use of DOIs in citations)
- INFO URI (URI scheme for representing legacy namespaces)
- ISBN (ISBN on Wikipedia)
- ISSN (ISSN on Wikipedia)
- Open Citation Project - OpCit, a three-year (1999-2002) R&D project funded by the Joint NSF - JISC International Digital Libraries Research Programme.
- DocBook (markup language for books); DocBook on Wikipedia
- RDA: Resource Description and Access
See Also
- citation
- citation-examples
- citation-examples-markup
- citation-formats
- citation-brainstorming
- citation-faq
- citation-issues
- Strawman microformats: