[uf-discuss] hCite progress

Brian Suda brian.suda at gmail.com
Mon Nov 13 08:39:44 PST 2006

I have had a few free cycles this last week, so I have been making some
headway with the citation microformat.

I took some time to reorganize the XSLT code, so it should now be a lot
easier to create new transformations. I have added Dublin Core and
RIS as possible output types.

This is the new home for all the citation transformations:

Once we get our version system set up for a citation test suite, I will
be creating HTML and cite-specific formats, and will need some feedback
on other things to check in (anyone else is more than welcome to
create some tests too *hint* *hint* :) ).

There have been a few hiccups that I am starting to uncover, so any
guidance is welcome.
1) The term "pages": I think it actually has two meanings, which I
have confused in the implied schema. The first is "this book is 45
pages long", which is metadata about the book and is in the realm of
the media-info microformat. The second is "this cites pages 43-45",
meaning a location. So now we need to figure out what to do. Does the
first become <span class="page-count">45</span> while the
citation stays "pages", or do we have "start-page" and "end-page", or
something else? Some systems use "pages" as a string, "43-45"; others
have it broken out into SP (start page) 43 and EP (end page) 45. I'm not
sure how they handle references in something like a newspaper, where
the article starts on page 1 and then jumps to page 43... that is not
start-end, but a list of pages. That leads to our
"singularization" of plural terms. In vCard it is "categories" (plural),
but we use "category" (singular) and just let you have multiple
instances... can "pages" go the same way? The first instance of
class="page" is the start page, and the last instance is the last
page? Any suggestions?
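
To make the "string vs. SP/EP vs. list of pages" question concrete,
here is a rough Python sketch of how a transforming app might normalize
a free-form "pages" value. The function names parse_pages and start_end
are made up for illustration and are not part of hCite or of the XSLT
code; it only handles plain digits, so something like "xii" would raise
an error.

```python
import re


def parse_pages(pages):
    """Parse a pages string such as "43-45" or "1, 43" into (start, end) ranges.

    Hypothetical helper: only decimal page numbers are handled.
    """
    ranges = []
    for part in re.split(r"\s*,\s*", pages.strip()):
        if not part:
            continue
        match = re.fullmatch(r"(\d+)\s*-\s*(\d+)", part)
        if match:
            # A "start-end" span, e.g. "43-45".
            ranges.append((int(match.group(1)), int(match.group(2))))
        else:
            # A single page, stored as a one-page range.
            ranges.append((int(part), int(part)))
    return ranges


def start_end(ranges):
    """Collapse the ranges into one SP/EP pair: first page seen, last page seen."""
    return ranges[0][0], ranges[-1][1]
```

With this, "43-45" becomes a single range, while the newspaper case
"1, 43" stays a list of two pages. Collapsing that list to SP 1 / EP 43
loses the jump, which is the problem described above.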

2) One of the mandatory properties across several different citation
formats is TYPE: is this a book, journal entry, thesis, video, etc.?
It is usually an enumerated list of values. The issue is that EVERYONE
does it differently... so should hCite have an enumerated list of types,
such as "Thesis", which maps to BibTeX "mastersthesis" and RIS
"THES", or should that be something transforming apps handle? I'm not
sure how to handle this (I'd prefer not to use enumerated lists of
possible values), but if we allow open values and I write <span
class="type">Thesis</span>, and that gets converted to a citation
format, it will fail in most of the formats because the string "Thesis"
is not a valid type. I also think it is silly to write <span
class="type">THES</span> and then be valid for only one format. This
is where a hard-coded list of values in hCite would help: hCite's
"thesis" can be interpreted into various formats' TYPE values.
I don't like that idea, but I don't have any other suggestions
except to ignore it and let the implementor figure it out? Any
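
As an illustration of the "let the transforming app handle it" option,
here is a small Python sketch of a lookup table from an hCite type to
per-format values. The hCite-side names ("thesis", "book", "journal")
and the function convert_type are assumptions for the example, not an
agreed vocabulary. The BibTeX and RIS values are those formats' own
type names, and unknown types fall back to each format's generic type.

```python
# Hypothetical hCite type names mapped to each output format's TYPE value.
TYPE_MAP = {
    "thesis": {"bibtex": "mastersthesis", "ris": "THES"},
    "book": {"bibtex": "book", "ris": "BOOK"},
    "journal": {"bibtex": "article", "ris": "JOUR"},
}

# Generic types used when the hCite value is not in the table.
FALLBACK = {"bibtex": "misc", "ris": "GEN"}


def convert_type(hcite_type, target):
    """Return the target format's type for an hCite type string.

    Matching is case-insensitive, so "Thesis" and "thesis" are equivalent.
    """
    key = hcite_type.strip().lower()
    return TYPE_MAP.get(key, {}).get(target, FALLBACK[target])
```

So <span class="type">Thesis</span> would come out as "mastersthesis"
for BibTeX and "THES" for RIS, and an open value that nobody has mapped
still produces a valid (if generic) record instead of failing.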

Also, I have wiki-fied several citation examples from a previous
email, with accessed date. I have not updated any implied schemas to
reflect any changes yet. I haven't output the accessed date into
BibTeX, RIS or Dublin Core because I don't know which field it equates
to. A lot of this will get fleshed out when we start building examples
and tests.

All input is welcomed.


brian suda
