[uf-new] ISBN, ISSN and the case for moving forward now

Andy Mabbett andy at pigsonthewing.org.uk
Sun Mar 18 03:11:40 PST 2007


Abstract
========

We can have uFs for ISBNs and ISSNs. The very specific nature of the way
in which those codes are formatted means that further evidence gathering
would be superfluous. Such uFs would be useful components in a number of
other, currently-proposed, microformats. Acting now will ensure that a
standard method can be applied across such formats.


Coverage
========

This post addresses the use of ISBNs; almost all of it applies to ISSNs
equally.
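
As a rough illustration of how closely ISSNs follow the same pattern, here
is a minimal Python sketch of an ISSN check. It assumes the standard ISSN
rule (eight characters, weights 8 down to 1, total divisible by 11, with
"X" standing for 10 in the final position). The function name
is_valid_issn is hypothetical and is not part of any proposal; the sample
ISSN 0006-3657 is the one used in the WorldCat link later in this message.

```python
import re


def is_valid_issn(text):
    """Return True if the text contains exactly one well-formed ISSN.

    Hypothetical helper: keeps digits and an upper-case "X", discards
    everything else, then applies the modulus-11 check.
    """
    candidate = re.sub(r"[^0-9X]", "", text)
    # An ISSN has eight characters; "X" may only be the final check digit.
    if len(candidate) != 8 or "X" in candidate[:-1]:
        return False
    # Weights run 8, 7, ..., 1 from left to right; "X" counts as 10.
    total = sum(
        (8 - i) * (10 if ch == "X" else int(ch))
        for i, ch in enumerate(candidate)
    )
    return total % 11 == 0


print(is_valid_issn("ISSN 0006-3657"))
print(is_valid_issn("0006-3658"))
```

The first call should report a valid ISSN and the second an invalid one,
since only the check digit differs.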


Proposal
========

There has been previous discussion of marking-up ISBNs (International
Standard Book Numbers):

        <http://en.wikipedia.org/wiki/ISBN>

ISBNs include a checksum digit:

        <http://en.wikipedia.org/wiki/ISBN#Check_digit_in_ISBN_10>

and can (only) be either 10 or 13 digits long.

Such discussion has, for example, taken place around the proposed
citation uF:

        <http://microformats.org/wiki/citation-examples>


I believe, however, that a uF for ISBN should stand alone, and thus be
available for use in any other microformat (for recipes, for example, or
for hResume, hAtom, hReview or hListing).


I then started to think about evidence-gathering, and it occurred to me
that so many websites use ISBNs that I would never have time to examine
even 1% - so I could never be sure that I was dealing with cases on the
larger side of an 80-20 divide.

It then occurred to me that there are very few ways in which ISBNs can
be marked up, in a meaningful and valid sense. Furthermore,
the very nature of ISBNs, with rigidly defined formats, and check-sums,
means that detecting, validating and parsing ISBNs is relatively easy to
describe.


Consider these examples (all found on the aforesaid citation-examples):

        <div class="isbn">0-313-32847-1</div>

        <span id="lblIsbn">0-313-32847-1</span>

        <span class="isbnNumber">0195162471</span>

(note that "isbnNumber" is a tautology!)

and these other possible methods of marking up those ISBNs:

        <div class="isbn">ISBN 0-313-32847-1</div>

        <span id="lblIsbn">ISBN: 0-313-32847-1</span>

        <span class="isbnNumber">the ISBN is 0195162471</span>

        <span class="isbn">ISBN: 0-95115-320-X</span>

In each case, the marked-up text includes a valid ISBN (some with
permitted, but superfluous, dashes) and, in the latter cases, some other
non-numerical characters. All a parser need do is discard the
non-numerical characters, apart from a possible "X" check-digit (it's
interesting to note that no "X" check-digit occurs on
citation-examples), and check that the remaining digits validate against
the included checksum digit.

If the mark-up has introduced additional digits:

        <span class="isbnNumber">the ISBN of book #42 is
        0195162471</span>

then a parser may simply discard the result as invalid, which obliges
the author to apply the element more tightly around the ISBN itself.
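
The parsing rule described above can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not a
reference implementation: the function name extract_isbn is hypothetical,
only an upper-case "X" is kept, and the 13-digit check (alternating
weights 1 and 3, total divisible by 10) comes from the ISBN standard
rather than from the examples above.

```python
import re


def extract_isbn(text):
    """Return the bare ISBN found in the marked-up text, or None.

    Hypothetical helper: discards everything except digits and "X",
    then validates the remainder against its own check digit.
    """
    candidate = re.sub(r"[^0-9X]", "", text)

    if len(candidate) == 10:
        # "X" (worth 10) is only allowed as the final check digit.
        if "X" in candidate[:-1]:
            return None
        # ISBN-10: weights 10 down to 1, total must be divisible by 11.
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch))
            for i, ch in enumerate(candidate)
        )
        return candidate if total % 11 == 0 else None

    if len(candidate) == 13 and candidate.isdigit():
        # ISBN-13: weights alternate 1, 3; total must be divisible by 10.
        total = sum(
            int(ch) * (3 if i % 2 else 1)
            for i, ch in enumerate(candidate)
        )
        return candidate if total % 10 == 0 else None

    # Wrong length (for instance, stray digits in the element): discard.
    return None


for sample in (
    "0-313-32847-1",
    "ISBN: 0-95115-320-X",
    "the ISBN is 0195162471",
    "the ISBN of book #42 is 0195162471",
):
    print(repr(sample), "->", extract_isbn(sample))
```

The last sample yields twelve digits once the text is stripped, so it is
rejected on length alone. One limitation of so simple a rule: surrounding
text that happens to contain a capital "X" would also be rejected.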


Use case
========

Marking up ISBNs would allow tools to help users quickly locate the
relevant title in a shop or library; for example, in the way in which
Wikipedia uses ISBNs:

        <http://en.wikipedia.org/w/index.php?title=Special:Booksources&isbn=0950788120>

Indeed, that service could be the target used by a user agent (as could:

        <http://worldcat.org/issn/0006-3657>

for ISSNs).
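
As a sketch of what a user agent might do with a parsed value, the
following Python builds look-up links in the two URL patterns quoted
above. The function names isbn_lookup_url and issn_lookup_url are
hypothetical, and the code assumes the value has already been validated;
it only strips punctuation and labels.

```python
def isbn_lookup_url(isbn):
    """Build a Wikipedia Booksources link for an ISBN (hypothetical helper)."""
    # Keep only digits and a possible "X" check digit.
    bare = "".join(ch for ch in isbn if ch in "0123456789X")
    return (
        "http://en.wikipedia.org/w/index.php"
        "?title=Special:Booksources&isbn=" + bare
    )


def issn_lookup_url(issn):
    """Build a WorldCat link for an ISSN (hypothetical helper)."""
    bare = "".join(ch for ch in issn if ch in "0123456789X")
    # The WorldCat link quoted above uses the hyphenated NNNN-NNNN form.
    return "http://worldcat.org/issn/" + bare[:4] + "-" + bare[4:]


print(isbn_lookup_url("ISBN 0950788120"))
print(issn_lookup_url("00063657"))
```

These two calls reproduce the two links given above.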


Conclusion
==========

I believe that we could, quickly, have uFs for ISBN and ISSN; and that
this would be of benefit to the development of any other uF which
includes an ISBN or ISSN component - as well as benefiting the many
thousands of publishers, and the many millions of consumers, of ISBNs and
ISSNs.

All that remains is to agree on a suitable class-name (ISBN vs. hISBN,
for example).


Next steps
==========

I hope Mike Kaply, to whom I'm forwarding a copy of this e-mail, will
agree to place a test-case version in a beta version of Operator.


I've marked up a test-case (using class="isbn") on:

        <http://www.westmidlandbirdclub.com/biblio/sandwellISBNTEST.htm>

which includes two ISBNs with "X" check digits, and I'll be happy to add
further examples.

I'll be placing a version of this message on the 'wiki' in due course.

-- 
Andy Mabbett
                 <http://www.pigsonthewing.org.uk/uFsig/>

            Ten-day moderation delays amount to a de facto ban!

