[uf-discuss] hCite elevator pitch and my bibliography generator
hsivonen at iki.fi
Thu Mar 22 13:08:54 PST 2007
(Sorry about my frustrated tone. I always get frustrated when I try
to extract implementation directions from the wiki and fail. This
isn't the first time. And I can read specs in general.)
On Mar 10, 2007, at 23:10, Paul Wilkins wrote:
> Henri Sivonen wrote:
>> I needed a .bib-based bibliography generator for XHTML, so I
>> wrote one with help from a friend who had developed a .bib
>> parser. The output of my generator can be seen at
>> I've wrapped the values of .bib fields in elements whose class
>> name is the .bib field name. I did it just in case. I don't have
>> any consumer use case for those class names. It was just super-
>> easy to generate them.
>> My use case (publishing an academic paper with a bibliography) is
>> not mentioned as a use case at
>> http://microformats.org/wiki/citation-brainstorming . More to the
>> point, the wiki has no consumer use case for my publication use case.
>> Does this mean that hCite is not for me at all?
> Not at all. You are using the BibTeX format, which is covered on
> the citation formats page: http://microformats.org/wiki/citation-formats
Sure, but considering that I share my .bib, should I expect people to
want to scrape my (X)HTML-formatted bibliography?
>> If hCite is for me, what's the elevator pitch convincing me to
>> put more effort into my generator? What benefits should I expect
>> if I do? Is hCite mature enough to be implemented yet?
> The citation microformat is a work in progress at this stage, so
> it's not mature enough for programs to extract information from it,
I guess this means that I shouldn't try to support hCite on the
generator side in my thesis, considering that the document should go
final in the first week of April.
Would it be of any use to anyone if I wrapped the name of each author/
editor in a <span class='fn'> if I otherwise leave my markup the way
it is now?
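
For concreteness, here is a minimal Python sketch of what that wrapping could look like. The function name `mark_up_names` is hypothetical and is not taken from the generator discussed here. It assumes the .bib author/editor field separates names with the word "and" (the BibTeX convention) and ignores braces and the "Last, First" form. Each formatted name is published as-is, with no attempt to split it into parts.

```python
import re
from html import escape


def mark_up_names(author_field: str) -> str:
    """Wrap each name of a BibTeX author/editor field in <span class='fn'>."""
    # BibTeX separates multiple names with the word "and".
    names = [n.strip() for n in re.split(r"\s+and\s+", author_field)]
    # Escape the formatted name and wrap it unchanged; no name splitting here.
    spans = [f"<span class='fn'>{escape(n)}</span>" for n in names if n]
    return ", ".join(spans)


print(mark_up_names("Gavin Thomas Nicol and Henrik Frystyk Nielsen"))
```
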
> The benefits are that people visiting your content with next
> generation tools will be able to easily extract and use the
> information in more interesting and useful ways.
So basically, my effort would not be about catering to specific
realistic foreseeable use cases. Instead, it would be about putting
data out there in case someone figures out a use case later on.
> Tantek has a recent presentation about the big picture of
> microformats at http://tantek.com/presentations/2007/02/microformats/
I think I know the base theory. I am interested in practical use
cases and implementability in this particular case.
>> Moreover, is it even possible to generate hCite from my source
>> data (http://hsivonen.iki.fi/thesis/dippa.bib) without
>> sacrificing the presentation that I want and without potentially
>> generating bogus markup for personal names?
> One of the big ideas behind the use of microformats is that it will
> allow you to markup the content on your page without modifying the
> presentation of it.
Somehow, I was under the impression that hCite required bibliography
items as <li>s instead of <dt>/<dd> pairs (which is what I use and
what W3C and WHATWG specs use).
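
For reference, the kind of <dt>/<dd> markup in question, with each field value wrapped in an element whose class name is the .bib field name, can be sketched in Python as follows. This is only an illustration of the generator side: `render_entry`, the entry key, and the field values are hypothetical placeholders, and the sketch makes no claim about which list structure hCite accepts.

```python
from html import escape


def render_entry(key: str, fields: dict) -> str:
    """Render one already-parsed .bib entry as a <dt>/<dd> pair."""
    # One span per field; the class name is simply the .bib field name.
    spans = " ".join(
        f"<span class='{escape(name)}'>{escape(value)}</span>"
        for name, value in fields.items()
    )
    return f"<dt id='{escape(key)}'>[{escape(key)}]</dt>\n<dd>{spans}</dd>"


# Placeholder entry, not real bibliographic data.
entry = {"author": "Gavin Thomas Nicol", "title": "An Example Title", "year": "1999"}
print("<dl>\n" + render_entry("Nicol99", entry) + "\n</dl>")
```
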
>> For example, my source data does not explicitly encode the given
>> name, the family name and other stuff that isn't quite either.
>> As far as I can tell, it is impossible to tell heuristically that
>> the middle token in these two names is semantically different:
>> Gavin Thomas Nicol
>> Henrik Frystyk Nielsen
> Those issues haven't yet been covered for the citation
What I'm trying to say is that I think hCite should allow names to be
marked up as formatted names, tossing the deformatting problem to the
consumer. After all, one of the most popular bibliography data
formats, BibTeX, stores formatted names.
> It may be possible for a generator to parse through them and
> extract the appropriate information though.
> For example, honorific-prefix and honorific-suffix are a rather
> short list. Then after those, the given name, family name and
> additional name could be extracted in that particular order.
Using heuristics in the generator to make explicit metadata
statements is generally a bad idea. If the result is wrong, it still
pretends to be authoritative. If heuristics are involved, the input
to the heuristic should be sent and consumers should be able to
compete on how good their heuristics are.
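
To make the objection concrete, here is a minimal Python sketch of the positional heuristic suggested above. `naive_split` is a hypothetical helper, not part of any generator or spec, and it assumes a non-empty, whitespace-separated name with no honorifics. It puts "Thomas" and "Frystyk" into the same slot even though, as noted earlier in this message, the two middle tokens are semantically different. At least one of the resulting explicit statements is therefore wrong while still looking authoritative.

```python
def naive_split(formatted_name: str) -> dict:
    """Positional heuristic: first token given, last family, rest additional."""
    tokens = formatted_name.split()
    return {
        "given-name": tokens[0],
        "additional-name": " ".join(tokens[1:-1]),
        "family-name": tokens[-1],
    }


for name in ("Gavin Thomas Nicol", "Henrik Frystyk Nielsen"):
    # Both names get exactly the same structure; the heuristic cannot tell them apart.
    print(name, "->", naive_split(name))
```

Publishing only the formatted name leaves this decision to consumers, who can then improve their own heuristics without the generator having asserted anything false.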
hsivonen at iki.fi