[uf-discuss] Citation format straw proposal on the wiki

Wed Mar 29 09:40:14 PST 2006

At the risk of boring people, but for benefit of the archives ...

On 3/29/06, Ross Singer <ross.singer at library.gatech.edu> wrote:

> On 3/29/06, Bruce D'Arcus <bdarcus.lists at gmail.com> wrote:
>
> > But the XML representation is basically the same notion: key/values in
> > XML.
>
>  No, it's not.  They are key/values based /contextually/ on what the thing
> is.

So they are key/values. To indicate an analytical title, you do:

<atitle>Some Title</atitle>

You do not do:

<analytical>
  <title>Some Title</title>
</analytical>
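
To make the difference concrete, here is a minimal Python sketch using only
the standard library. The <citation> wrapper element, the <journal> container,
and the helper names titles_flat / titles_nested are my own assumptions for
illustration; they are not taken from any OpenURL schema.

```python
import xml.etree.ElementTree as ET

# Flat style: the role of each title is baked into the element name.
FLAT = """
<citation>
  <atitle>Some Title</atitle>
  <jtitle>Some Journal</jtitle>
</citation>
"""

# Nested style: one generic <title>; the role is carried by the container.
NESTED = """
<citation>
  <analytical><title>Some Title</title></analytical>
  <journal><title>Some Journal</title></journal>
</citation>
"""

# A flat consumer needs a hard-coded list of every title-like key.
KNOWN_TITLE_KEYS = ("atitle", "title", "stitle", "jtitle")


def titles_flat(xml_text):
    """Return {key: title} for the title keys this consumer knows about."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root if child.tag in KNOWN_TITLE_KEYS}


def titles_nested(xml_text):
    """Return {container: title} for any container that holds a <title>."""
    root = ET.fromstring(xml_text)
    found = {}
    for container in root:
        title = container.find("title")
        if title is not None:
            found[container.tag] = title.text
    return found


print(titles_flat(FLAT))
print(titles_nested(NESTED))
```

A new container such as <conference> is picked up by titles_nested with no
code change, whereas a new flat key such as ctitle is silently ignored until
KNOWN_TITLE_KEYS is updated.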

> > Consider these elements:
> >
> > atitle
> > title
> > stitle
> > jtitle
> >
> > We have three that are specific to journals. One should be able to just
> > have title and short-title and use those in ANY resource.
>
>  This is also incorrect.  atitle is also used in books to denote a chapter
> or bookpart.  It's also used to identify the specific proceeding from a
> conference.  title and jtitle (in this case) are interchangeable (the analog
> to books is 'btitle'), but you can just use title and be done with it.

Don't blame me for unclear documentation. The schema you pointed me to
includes this:

<xs:element name="atitle" type="xs:string" minOccurs="0">
  <xs:annotation>
      <xs:documentation>Article title</xs:documentation>
  </xs:annotation>
</xs:element>

So you can see why I might conclude reasonably that "atitle" is
specific to journal articles.

Regardless, notwithstanding the url linking use case, I think this is
a kind of data modeling that goes against the grain of all the
advances we've seen in the past decade: DC, RDF, MODS, hCard, etc.,
etc.

> > It's just that when one adopts that flat approach, then in order to
> > encode different resources, one has to add new properties, which tools
> > then have to be updated to understand. So if I need to encode a
> > conference paper, then that suggests we need to add:
>
>  The community profiles are designed to handle different resources.
> Currently there are 'book', 'journals', 'dissertations' and 'patents' (as
> well as dc and marcxml).  The spec is designed to handle /exactly/ what
> you're talking about here.

I really didn't want this to be about OpenURL in general; my point
here has always been that it's not appropriate for this context.

That said ...

So every time I need to include some new resource, I have to create
another community profile? No thanks. RDF accomplishes more with
greater power and generality.

> > ptitle
> > ctitle
>
>  No, these would be atitle and title, accordingly.  If the proceedings are
> in a serialized format, you would use the community profile for 'journal'
> and add the genre 'conference' or 'proceeding', depending on the granularity
> of the citation.  If the conference was published as a monograph, you would
> use the community profile for book; otherwise the rest is the same.

OK.
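
For the archives, here is my reading of that mapping as a small Python sketch.
The function name, its parameters, and the plain dict representation are
hypothetical; only the profile names, the two genre values, and the
atitle/title keys come from the description quoted above.

```python
def conference_paper(paper_title, proceedings_title, serialized, genre):
    """Describe a conference paper with existing keys instead of ptitle/ctitle.

    serialized: True if the proceedings are published as a serial,
                False if they are published as a monograph.
    genre:      'conference' or 'proceeding', chosen by the caller
                according to the granularity of the citation.
    """
    if genre not in ("conference", "proceeding"):
        raise ValueError("genre must be 'conference' or 'proceeding'")
    return {
        "profile": "journal" if serialized else "book",
        "genre": genre,
        "atitle": paper_title,        # what I had called ptitle
        "title": proceedings_title,   # what I had called ctitle
    }


print(conference_paper("Some Paper", "Some Proceedings", True, "proceeding"))
```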

> > The coding of authors has similar issues (in addition, it uses very
> > Western -- even U.S. -- specific name structures).
>
>  So do the publishers.  What's your solution to this?

Raise the bar. Push the publishers to do better.

This is among my problems, incidentally, with OpenURL; it seems
entirely vendor driven (and library focused).

> > Isn't it just easier and more robust to exploit the fact that you can
> > use more than one class, or containment?
>
>  Sure, I think that's what I'm saying, but one of those /had better/ get the
> user to full-text (or the possibility thereof) or the whole exercise will
> leave the user wanting.

OK, here we can agree. I have no quarrel with that (though I think
robust identifiers that can be rendered as URIs are even more important).

> > OpenURL has to adopt the flat approach because its primary use case
> > is to provide a url.
>
>  Currently.  Now that we seem to have gotten that part down, we can actually
> make (and are starting to do) the OpenURL do what it was intended to do.

Which is? Finding stuff?

> > I guess my point is it's hard to fit a square peg (relational bib
> > data) in a round hole (flat data structures).
>
>  HTML is a flat data structure.  We're talking about flat data structures.
> There are no relations to work with in a 'published HTML citation'.  How
> your citation manager deals with it, or how the backend database that
> produces these citations deals with it is an entirely different matter.  For
> display, they are flat.

Is display all we are concerned about? If yes, why bother with a
microformat at all?

It seems to me with all the interest in stuff like web clipboards and
using microformats for data exchange and such, we can do better.

To quote Tantek's message from a bit ago:

===
In short, properties of a microformat MUST go in descendant elements
*inside* the root class element of that microformat.  Similarly with
subproperties of a property (e.g. "region" must be on a descendant of
"adr").
===

This is a good principle; the OpenURL model violates it.
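
A rough Python sketch of how a parser could check that principle, using only
the standard library's html.parser. The DescendantChecker class, the check
helper, and the sample markup are assumptions for illustration, and the sketch
assumes well-formed markup; real microformat parsers do considerably more.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID = {"br", "img", "hr", "input", "meta", "link"}


class DescendantChecker(HTMLParser):
    """Record whether each `prop` element sits inside a `root` element."""

    def __init__(self, root, prop):
        super().__init__()
        self.root, self.prop = root, prop
        self.stack = []    # class lists of the currently open elements
        self.results = []  # one bool per `prop` element found

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.prop in classes:
            # True only if some open ancestor carries the root class.
            self.results.append(any(self.root in c for c in self.stack))
        if tag not in VOID:
            self.stack.append(classes)

    def handle_endtag(self, tag):
        if tag not in VOID and self.stack:
            self.stack.pop()


def check(html, root, prop):
    parser = DescendantChecker(root, prop)
    parser.feed(html)
    parser.close()
    return parser.results


GOOD = '<div class="adr"><span class="region">GA</span></div>'
BAD = '<div class="adr"></div><span class="region">GA</span>'

print(check(GOOD, "adr", "region"))
print(check(BAD, "adr", "region"))
```

In GOOD the "region" property is contained by "adr"; in BAD it is only a
sibling, which is the kind of flat arrangement the principle rules out.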

Bruce