citation
Citation Formats
Authors
Copyright
This specification is (C) 2004-2025 by the authors. However, the authors intend to submit (or already have submitted, see details in the spec) this specification to a standards body with a liberal copyright/licensing policy such as the GMPG, IETF, and/or W3C. Anyone wishing to contribute should read their copyright principles, policies and licenses (e.g. the GMPG Principles) and agree to them, including licensing of all contributions under all required licenses (e.g. CC-by 1.0 and later), before contributing.
Introduction
There has been some discussion about a citation format. This wiki page documents current examples of cites/citations on the web today, current cite/citation formats, and their implicit/explicit schemas, with the intent of deriving a cite microformat from that research.
Semantic XHTML Design Principles
Note: the Semantic XHTML Design Principles were written primarily within the context of developing hCard and hCalendar, thus it may be easier to understand these principles in the context of the hCard design methodology (i.e. read that first). Tantek
XHTML is built on XML, and thus XHTML based formats can be used not only for convenient display presentation, but also for general purpose data exchange. In many ways, XHTML based formats exemplify the best of both HTML and XML worlds. However, when building XHTML based formats, it helps to have a guiding set of principles.
- Reuse the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported standards by reference.  Avoid restating constraints expressed in the source standard.  Informative mentions are ok.
- For types with multiple components, use nested elements with class names equivalent to the names of the components.
- Plural components are made singular, and thus multiple nested elements are used to represent multiple text values that are comma-delimited.
 
- Use the most accurately precise semantic XHTML building block for each object etc.
- Otherwise use a generic structural element (e.g. <span> or <div>), or the appropriate contextual element (e.g. an <li> inside a <ul> or <ol>).
- Use class names based on names from the original schema, unless the semantic XHTML building block precisely represents that part of the original schema. If names in the source schema are case-insensitive, then use an all lowercase equivalent. Component names implicit in prose (rather than explicit in the defined schema) should also use lowercase equivalents for ease of use. Spaces in component names become dash '-' characters.
- Finally, if the format of the data according to the original schema is too long and/or not human-friendly, use <abbr> instead of a generic structural element, and place the literal data into the 'title' attribute (where abbr expansions go), and the more brief and human readable equivalent into the element itself. Further informative explanation of this use of <abbr>: Human vs. ISO8601 dates problem solved
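As a rough illustration of how these principles combine, the following Python sketch renders one citation as nested, class-named elements. The class names (citation, author, title, date-published) and the function names are hypothetical, chosen only for this example; they are not a proposed vocabulary, and the sample data is a placeholder.

```python
from html import escape


def class_name(component):
    """Lowercase a schema component name and turn spaces into dashes."""
    return "-".join(component.lower().split())


def render_citation(title, authors, date_iso, date_human):
    """Render one citation as nested, class-named XHTML elements."""
    parts = ['<span class="citation">']
    # Plural "authors" becomes singular: one "author" element per value.
    for name in authors:
        parts.append('<span class="%s">%s</span>'
                     % (class_name("Author"), escape(name)))
    # <cite> is used here as the closest semantic element for a work's title.
    parts.append('<cite class="%s">%s</cite>'
                 % (class_name("Title"), escape(title)))
    # The ISO 8601 date goes in the title attribute of <abbr>,
    # the human-readable form goes inside the element.
    parts.append('<abbr class="%s" title="%s">%s</abbr>'
                 % (class_name("Date Published"),
                    escape(date_iso), escape(date_human)))
    parts.append("</span>")
    return "".join(parts)


# Placeholder data, not a real publication.
print(render_citation("An Example Title", ["A. Author", "B. Author"],
                      "2005-10-24", "24 October 2005"))
```

Whether the title needs a class name at all when <cite> is used is exactly the kind of question the principle on class names leaves open; the sketch keeps the class for uniformity.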
Known Citation Formats
This is a list of the known formats for creating citations; this microformat will be a blend of some or all of them. The Citation Formats Page will keep a running list of these formats.
Eventually, I would like to see a chart of how each value is represented in each format, and which formats have additional properties that do not map between them. (For example, Format1 calls 'author' 'author', while Format2 calls 'author' 'writer', etc.)
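One way such a chart could be kept is as a small lookup table. The Python sketch below is only an assumption about its shape: the row names are hypothetical neutral property names, and only two formats and a handful of fields are filled in (BibTeX field names and elements from the 15-element Dublin Core set). None marks a property with no dedicated equivalent in that format.

```python
# Rows: a neutral property name (hypothetical, for this sketch only).
# Columns: the field that carries that property in each format.
FIELD_CHART = {
    "author":    {"BibTeX": "author",    "Dublin Core": "creator"},
    "title":     {"BibTeX": "title",     "Dublin Core": "title"},
    "publisher": {"BibTeX": "publisher", "Dublin Core": "publisher"},
    "date":      {"BibTeX": "year",      "Dublin Core": "date"},
    "pages":     {"BibTeX": "pages",     "Dublin Core": None},
}


def unmapped(chart, fmt):
    """Properties that have no dedicated field in the given format."""
    return sorted(prop for prop, names in chart.items()
                  if names.get(fmt) is None)


def differing(chart):
    """Properties that exist in several formats under different names."""
    result = []
    for prop, names in chart.items():
        present = {name for name in names.values() if name is not None}
        if len(present) > 1:
            result.append(prop)
    return sorted(result)


print(unmapped(FIELD_CHART, "Dublin Core"))
print(differing(FIELD_CHART))
```

Adding a format means adding one column to each row, which makes gaps (None entries) easy to spot.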
Example Citations
Citation Examples are citations found in the wild that could benefit from semantic mark-up. This is a growing list of examples from all sorts of places including W3C specifications, RFCs and others.
Citation Brainstorming Ideas
This is the brainstorming page where just about anything can be put out for discussion.
Todo
- select a bibliography format to model
- look for HTML tags that give the most semantic meaning
Questions
- what is the difference between hReview and a Citation format?
- Right, a citation is actually very different from a review. Even though a review could be said to contain a citation to the item being reviewed, in practice the two are very different.
 
- if a citation is an author or publisher, isn't that just an hCard?
- Citations usually contain two parts: a notice that the material is quoted or paraphrased from a source, and a reference to the location of that source. It seems like we're attempting to do both simultaneously. Should we make more of an effort to differentiate the two?
Modularity
My hope for this microformat is that it can be a sort of module that can be used in other microformats. Once it is developed and fleshed out, citation references could easily be used for publications on a Resume/CV; the citation microformat would then be a module (subset) of all the possible Resume values.
Other Microformats that use the Citation Module
- Resume Microformat (possibly)
Other Microformats that the Citation Module will use
- hCard encodings for things like Author, Publisher (people and companies)
- hAtom encodings as a possible container, and author/date-time properties
- rel-tag encoding for keywords
- rel-license encoding for copyright
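To make the reuse concrete, here is a hedged Python sketch of how the existing building blocks listed above might appear inside a citation: an author marked up as an hCard (class "vcard" with an "fn" name), a keyword as a rel-tag link, and copyright as a rel-license link. The helper names, the example.org URLs, and the "author" class on the hCard wrapper are assumptions made for illustration only.

```python
from html import escape
from urllib.parse import quote


def author_hcard(name):
    """An author as a minimal hCard: a vcard containing a formatted name."""
    return ('<span class="author vcard"><span class="fn">%s</span></span>'
            % escape(name))


def keyword_tag(keyword, tag_space="http://example.org/tags/"):
    """A keyword as a rel-tag link; the tag is the last URL path segment."""
    href = tag_space + quote(keyword)
    return '<a href="%s" rel="tag">%s</a>' % (escape(href), escape(keyword))


def license_link(url, label):
    """Copyright/licensing information as a rel-license link."""
    return '<a href="%s" rel="license">%s</a>' % (escape(url), escape(label))


# Placeholder values, not taken from any real citation.
print(author_hcard("A. Author"))
print(keyword_tag("semantic web"))
print(license_link("http://example.org/license", "Example License"))
```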
References
Informative References
- COinS
- XMLResume: if part of the drive for citations is for publications for a resume/CV then some of this information could be useful
- CiteULike is a free service to help academics share, store, and organise the academic papers they are reading
- OpenURL with Autodiscovery
- "Gather, Create, Share" and "Personal Collection Systems" memes, and systems implementing either or both
- Metadata Object Description Schema developed by the Library of Congress
- Dublin Core Metadata
- BibTeX references (I think a citation micro-format would be useful, but BibTeX is not the best model to use. It has a flat metadata model that does a really poor job representing the sort of citations that people outside of the hard sciences cite).
Comments
I'm the author of the citeproc project, which includes a micro-format of sorts (though I never thought of it as such) in its XHTML output mode. See here for an example. The difference compared to the BibTeX-derived model is that it is a) more generic and b) hierarchical.
It would certainly be possible to do a flat model if there were a good technical reason not to go hierarchical (though is there?), but then you need to think outside the BibTeX box in any case. Any model of this sort ought to be able to handle legal citations, magazine articles, patents, etc.; not just a narrow range of BibTeX types.
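To illustrate the flat versus hierarchical distinction raised in this comment, here is a minimal Python sketch. It is not citeproc's actual schema; the Work class, its fields, and the sample data are hypothetical. Each cited item points at the thing that contains it, so a magazine article, a book chapter, or a patent can share one structure instead of each needing its own flat field set.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Work:
    """A cited item that may be part of a larger containing work."""
    title: str
    kind: str
    creators: List[str] = field(default_factory=list)
    part_of: Optional["Work"] = None


def container_chain(work):
    """Walk from the cited item up through its containers."""
    chain = []
    node = work
    while node is not None:
        chain.append((node.kind, node.title))
        node = node.part_of
    return chain


# Placeholder data: an article inside a magazine.
magazine = Work(title="An Example Magazine", kind="magazine")
article = Work(title="An Example Article", kind="article",
               creators=["A. Author"], part_of=magazine)
print(container_chain(article))
```

A flat model would instead need separate fields such as an article title and a magazine title on the same record, and a new pair of fields for every further level of nesting.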