[uf-discuss] hCite progress

Scott Reynen scott at randomchaos.com
Tue Nov 14 21:15:59 PST 2006


On Nov 14, 2006, at 8:26 PM, Jeremy Boggs wrote:

> On Nov 13, 2006, at 2:20 PM, Brian Suda wrote:
>
>> But as Bruce said: start-end pages are not really important, just
>> capture the string "pages 10-50". So I think something akin to the
>> first example here will work.
>
> One reason why a string might not be useful is capturing a citation  
> for a specific page of a work versus capturing a citation of a work  
> in its entirety. It's one thing to cite a specific quote from page  
> 40 of an article in a journal, and another to cite an entire  
> article that exists on pages 37-65 in a journal.
>
> If I were to quote something specific, or refer to a specific idea  
> or statement in a journal article on page 40, I would use some  
> variation of the following:
>
> John Doe, "Lorem Ipsum Dolor," _Sit Amet_ vol. 81, no. 3 (2000), 40.
>
> If, however, I would want to refer to the entire article, I would  
> use the following:
>
> John Doe, "Lorem Ipsum Dolor," _Sit Amet_ 81, no. 3 (2000), 37-65.
>
> I don't see how leaving pages as a simple string can account for  
> this difference. I wouldn't want a parser to say that the article  
> is only one page long, and that it exists only on page 40 of a  
> journal.

I think the idea is that the parser isn't saying much of anything  
about the pages, just that a given string is a textual description of  
them, and a human reader needs to take it from there.

> Granted, neither of these citations, in and of themselves, really  
> lets the reader know whether the entire article, or just a portion  
> of it, is being cited. In this case, start-end pages are important.

I think the reader can read that just as well in HTML as in any  
academic citation on paper.  But a machine parser can't, or at least  
we haven't determined any rules by which a machine parser could.

> I'm not really sure offhand how to remedy this, but I'll certainly  
> think about it and offer up whatever I come up with. (I've tended  
> to do that on this list; raise questions without offering much on  
> solutions. My apologies.) Does anyone else have thoughts about this?

There are specific formatting rules for page ranges in various formal  
citation styles, right?  Are they clear and consistent enough that we  
can just adopt one of those for page ranges?

For example, can we say that any string matching the format "AA-BB"  
means page AA, page BB, and all pages in between?  And any string  
matching the format "A, B, C" means the pages A, B, and C? It's been  
a while since I wrote a formal citation, but I remember there were  
syntax rules for this sort of thing, so how about just adopting those  
rules instead of adding markup?  That seems similar to adopting the  
syntax rules for ISO 8601 instead of adding additional class="year",  
class="month", etc. markup.  And for syntax that doesn't follow a  
given syntax standard, we could use <abbr> just like with the date  
syntax standard, e.g. <abbr class="pages" title="20-30">20 to 30</abbr>.
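
To make that concrete, here is a minimal sketch in Python of what a parser could do under such a convention. Everything in it is an assumption for illustration: the function names parse_pages and pages_value are hypothetical, the only two patterns recognized are the "AA-BB" and "A, B, C" forms described above, and no existing citation style or microformat draft is being implemented. Anything that does not match is left as an opaque string for a human reader.

```python
import re

# Hypothetical patterns for the two forms discussed above (assumptions,
# not taken from any citation style guide or microformat spec).
_RANGE = re.compile(r"^(\d+)\s*-\s*(\d+)$")   # e.g. "37-65"
_LIST = re.compile(r"^\d+(\s*,\s*\d+)*$")     # e.g. "40" or "10, 12, 15"


def parse_pages(text):
    """Return the list of page numbers a string denotes, or None.

    None means "no rule matched": the string is only a textual
    description of the pages.
    """
    text = text.strip()
    match = _RANGE.match(text)
    if match:
        start, end = int(match.group(1)), int(match.group(2))
        if start > end:
            # Could be an abbreviated or malformed range; do not guess.
            return None
        return list(range(start, end + 1))
    if _LIST.match(text):
        return [int(part) for part in text.split(",")]
    return None


def pages_value(visible_text, title=None):
    """Prefer the <abbr> title attribute, if present, over the visible text."""
    source = title if title is not None else visible_text
    return parse_pages(source)


if __name__ == "__main__":
    print(parse_pages("37-65"))                   # whole-article range
    print(parse_pages("10, 12, 15"))              # individual pages
    print(pages_value("20 to 30", title="20-30")) # the <abbr> fallback case
    print(parse_pages("20 to 30"))                # unmatched, stays a string
```

Note that this still does not answer Jeremy's question of whether "40" is the cited page or the whole article. It only says which pages the string names.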

Peace,
Scott

