citation-issues: Difference between revisions

From Microformats Wiki
(moving "requirements" from brainstorming to issues.)

Revision as of 23:56, 8 April 2007

BenWest will start this by reorganizing material from http://microformats.org/wiki?title=citation-brainstorming&diff=0&oldid=15286.

Issues

  • Generally, use cases are used to flesh out requirements, but I don't see any on this page, so I've added a new section for this. Here are some suggested requirements. ThomasBreuel
  • I've made these into issues. [[BenWest 16:56, 8 Apr 2007 (PDT)]]
  • open issue!

Lossless Round-Trip Conversions

Should the citation format support round-trip conversions? Which formats should be supported? One of the primary uses for a citation format is to let people put individual citations or entire bibliographies on the web. For that purpose, it's important that if someone puts my bibliography on the web and someone else downloads it, they get the citations back correctly and don't have to spend time fixing them up manually. Therefore, I suggest the following requirement.

If X is one of the common citation formats (BibTeX, EndNote, etc.), then conversion of the form X -> hCitation -> X must not lose information and must not require manual fixing up of the result.

Note that this has multiple components. First, for a format like BibTeX, it's important that the field names be preserved. Second, in general, markup (italics, math, chemical formulas, spacing, special characters) needs to be preserved.
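As a rough illustration of what this requirement asks for, the sketch below pushes a set of BibTeX-style fields through an HTML encoding and back, then checks that nothing changed. The encoding is an assumption made up for this example (a `div` with class `bx-entry` and one `span` per field, with class `bx-<fieldname>`); it is not the proposed hCitation markup, and the helper names `to_html` and `from_html` are hypothetical. Entry type and citation key are left out for brevity.

```python
from html import escape
from html.parser import HTMLParser

PREFIX = "bx-"  # hypothetical class-name prefix, not part of any proposal


def to_html(fields):
    """Encode BibTeX-style fields as HTML, one span per field.

    The field name is kept verbatim in the class name, and the value is
    escaped so that TeX markup and special characters survive.
    """
    lines = ['<div class="bx-entry">']
    for name, value in fields.items():
        lines.append(f'  <span class="{PREFIX}{escape(name)}">{escape(value)}</span>')
    lines.append("</div>")
    return "\n".join(lines)


class _FieldReader(HTMLParser):
    """Collect the text of every span whose class starts with the prefix."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class") or ""
        if tag == "span" and cls.startswith(PREFIX):
            self._current = cls[len(PREFIX):]
            self.fields[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        if tag == "span":
            self._current = None


def from_html(markup):
    """Recover the field dictionary written by to_html."""
    reader = _FieldReader()
    reader.feed(markup)
    reader.close()
    return reader.fields


# Made-up entry with TeX markup and characters that HTML treats specially.
original = {
    "author": "Doe, Jane and Roe, Richard",
    "title": "Growth of {H$_2$O} crystals & <other> \\emph{matters}",
    "year": "2007",
}
recovered = from_html(to_html(original))
```

If `recovered == original` holds for every entry in a bibliography, the conversion meets the requirement for that bibliography. Note that this only works because the values are carried as opaque text; a format that reinterprets names or markup would need a much more careful check.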

  • open issue!

Citation Markup

Should citations preserve presentation? Citations may contain markup, such as italics, subscripts, superscripts, special characters, and chemical formulas. For a correct presentation of the citation format to the user, the format must permit even fairly complex markup. Note that this markup cannot easily be converted automatically between different bibliographic processors.

  • open issue!

Encapsulation of Non-Textual Content

Should citations support non-textual content? Systems like document image processors need to be able to represent the semantic roles of parts of pages without actually giving a usable textual representation. For example, a system might segment citations into authors, titles, volumes, and years, but represent the actual content of those fields using image tokens rather than characters. Furthermore, there may be no text available to put into an ABBR tag.

  • open issue!

No New Semantics

Should citations avoid introducing new semantics? The proposals for a citation microformat, as they now stand, suggest creating a new format that differs from existing formats not just syntactically, but semantically (different choices of field types than other formats, different handling of proper names, different handling of publications that are part of collections, etc.). This has some serious consequences. In particular, translation into any existing format is no longer a simple syntactic transformation: every tool that deals with the citation microformat needs to be updated to handle new semantics in addition to new syntax. An alternative is to define one or more microformats that are strictly a syntactic transformation of existing formats (e.g., encapsulated BibTeX, encapsulated EndNote).

So, a possible requirement to consider is that citation microformats introduce no new semantics, but are a strict syntactic encapsulation of existing citation formats.
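One way to picture the "strict syntactic encapsulation" alternative is to carry the BibTeX source verbatim inside an HTML element, so that a consumer only has to unwrap it and hand it to an existing BibTeX tool. The following is a minimal sketch under that assumption; the `pre` element, the class name `encapsulated-bibtex`, and the function names are hypothetical choices for this example and have not been proposed anywhere.

```python
from html import escape
from html.parser import HTMLParser

CLASS_NAME = "encapsulated-bibtex"  # hypothetical class name


def encapsulate(bibtex_source):
    """Wrap raw BibTeX source in a pre element, escaping HTML specials."""
    return f'<pre class="{CLASS_NAME}">{escape(bibtex_source)}</pre>'


class _Extractor(HTMLParser):
    """Collect the text content of every pre element with the class."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._buffer = None  # list of text chunks while inside a block

    def handle_starttag(self, tag, attrs):
        if tag == "pre" and dict(attrs).get("class") == CLASS_NAME:
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "pre" and self._buffer is not None:
            self.entries.append("".join(self._buffer))
            self._buffer = None


def extract(page):
    """Return the BibTeX sources embedded in an HTML page, unchanged."""
    parser = _Extractor()
    parser.feed(page)
    parser.close()
    return parser.entries


# Made-up entry; the content is never interpreted, only carried.
source = '@article{doe2007,\n  author = {Doe, Jane},\n  title = {On <H$_2$O> & "ice"},\n}\n'
page = "<p>My bibliography:</p>\n" + encapsulate(source)
unwrapped = extract(page)
```

Because the wrapper never looks inside the BibTeX text, it adds no semantics of its own; the trade-off is that the page carries no field-level structure for other HTML tools to use.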