ISBN
[copy of mailing list post; currently unformatted and unedited - work in progress]
Abstract
We can, quickly, have microformats for ISBNs and ISSNs. The very specific nature of the way in which those codes are formatted means that further evidence gathering would be superfluous. Such microformats would be useful components in a number of other, currently proposed, microformats. Acting now will ensure that a standard method can be applied across all such formats.
Editor
Coverage
This page addresses the use of ISBNs; almost all of it applies to ISSNs equally.
Problem
How to mark up ISBNs, so that they can be extracted and the relevant book found in on-line library catalogues, bookshops, etc., for example in the way in which Wikipedia uses ISBNs. Indeed, that service could be the target used by a user agent (as could WorldCat for ISSNs). This would be of benefit to the many thousands of publishers, and the many millions of consumers, of ISBNs and ISSNs.
How to make a standard ISBN microformat available as a component for other microformats.
ISBNs include a checksum digit and can (only) be either 10 or 13 digits long; the check digit of a 10-digit ISBN may be "X". (ISSNs are 8 digits long.)
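To illustrate the checksum, here is a minimal Python sketch that computes the check digit for each length. It assumes the usual check-digit rules: weights 10 down to 2, modulo 11, for 10-digit ISBNs (with "X" standing for 10), and alternating weights 1 and 3, modulo 10, for 13-digit ISBNs. The function names are hypothetical and are not part of any proposal on this page.

```python
import re


def isbn10_check_digit(first9: str) -> str:
    """Return the check character for the first nine digits of a 10-digit ISBN."""
    if not re.fullmatch(r"[0-9]{9}", first9):
        raise ValueError("expected exactly nine digits")
    # Weights run from 10 (first digit) down to 2 (ninth digit).
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    # A check value of 10 is written as "X".
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12: str) -> str:
    """Return the check digit for the first twelve digits of a 13-digit ISBN."""
    if not re.fullmatch(r"[0-9]{12}", first12):
        raise ValueError("expected exactly twelve digits")
    # Weights alternate 1, 3, 1, 3, ... starting with the first digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)
```

For example, isbn10_check_digit("095115320") gives "X" and isbn10_check_digit("031332847") gives "1", matching the ISBNs 0-95115-320-X and 0-313-32847-1 used in the examples on this page.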
Proposal
There has been previous discussion of marking up ISBNs (International Standard Book Numbers); for example, around the proposed citation microformat.
I (Andy Mabbett) believe, however, that a microformat for ISBN should stand alone, and thus be available for use in any other microformat (for recipes, for example, or for hResume, hAtom, hReview or hListing, as well as citations).
Evidence
So many websites use ISBNs that we would never have time to examine even 1% - so we could never be sure that we were dealing with cases on the larger side of an 80-20 divide.
However, there are very few ways in which ISBNs can be marked up in a meaningful and valid sense. Furthermore, the very nature of ISBNs, with rigidly defined formats and checksums, means that detecting, validating and parsing ISBNs is relatively easy to describe.
Consider these examples (all found on the aforesaid citations-examples):
<div class="isbn">0-313-32847-1</div>
<span id="lblIsbn">0-313-32847-1</span>
<span class="isbnNumber">0195162471</span>
(note that "isbnNumber" is a tautology!)
and these other possible methods of marking up those ISBNs:
<div class="isbn">ISBN 0-313-32847-1</div>
<span id="lblIsbn">ISBN: 0-313-32847-1</span>
<span class="isbnNumber">the ISBN is 0195162471</span>
<span class="isbn">ISBN: 0-95115-320-X</span>
In each case, the marked-up text includes a valid ISBN (some with permitted, but superfluous, dashes) and, in the latter cases, some other non-numerical characters. All a parser need do is discard the non-numerical characters, apart from a possible "X" check-digit (it's interesting to note that no "X" check-digit occurs on citation-examples), and check that the remaining digits validate against the included checksum digit.
If the mark-up has introduced additional digits:
<span class="isbnNumber">the ISBN of book #42 is 0195162471</span>
then a parser may simply discard the result as invalid, which in turn requires the author to apply the element more tightly around the ISBN.
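The parsing rule described above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a reference implementation: the function name parse_isbn is hypothetical, the input is assumed to be the text content of the marked-up element (tags already removed), and the checks assume the usual ISBN rules (weighted sum divisible by 11 for 10-digit ISBNs, alternating weights 1 and 3 with a sum divisible by 10 for 13-digit ISBNs).

```python
import re
from typing import Optional


def parse_isbn(text: str) -> Optional[str]:
    """Return the normalised ISBN found in text, or None if it is invalid."""
    # Discard everything except digits and a possible "X" check digit.
    candidate = re.sub(r"[^0-9Xx]", "", text).upper()

    if re.fullmatch(r"[0-9]{9}[0-9X]", candidate):
        # 10-digit form: weights 10 down to 1; "X" counts as 10.
        values = [10 if c == "X" else int(c) for c in candidate]
        total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
        return candidate if total % 11 == 0 else None

    if re.fullmatch(r"[0-9]{13}", candidate):
        # 13-digit form: weights alternate 1, 3, 1, 3, ...
        total = sum(int(c) * (3 if i % 2 else 1) for i, c in enumerate(candidate))
        return candidate if total % 10 == 0 else None

    # Wrong length, or an "X" anywhere other than the final position.
    return None
```

With this sketch, "ISBN: 0-95115-320-X" yields "095115320X" and "the ISBN is 0195162471" yields "0195162471", while "the ISBN of book #42 is 0195162471" yields None, because the stray "42" leaves twelve digits. One consequence of keeping every "X" is that a stray letter "x" in the surrounding text also causes rejection, in the same way as a stray digit.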
Next steps
Decide whether ISBNs should be used as TYPEs of UIDs, or be a stand-alone microformat.
A test-case implementation should be made available, perhaps in a beta version of Operator.
A test page using class="isbn", which includes two ISBNs with "X" check digits, is available, and further examples can be added to it.
Additional examples should be created using UID, and examples of mark-up that already exist in the wild should be marked up with UID to see whether it covers all these scenarios.