process, [citation] (was Re: [uf-new] announcing the hOCR and hBIB microformats)

Tantek Çelik tantek at
Wed Mar 28 00:03:19 PST 2007

On 3/28/07 12:25 AM, "Thomas Breuel" <tmbdev at> wrote:

> We're currently developing a new open source OCR system, with a focus on
> digital library applications (  As part of this, we needed
> formats for representing both OCR output and bibliographic metadata, and we
> have defined two new microformats for this purpose: hOCR and hBIB.



First of all, welcome, and you have found the right mailing-list to discuss
new microformats.

Second, it is great news that you are working on an *open source*
OCR system.

Third, the path to defining a new microformat is through the microformats
process.

The goals of the process are to ensure that the microformats defined follow
the microformats principles.  Among those is to reuse existing work, and
thus minimize reinvention.  In particular, as far as hBIB, note that the
microformats community has done a significant amount of research and work
developing a citation microformat.  Start with reading these:

Finally, I strongly encourage you both to read those pages and to ask any
questions you have about the process or the citation microformat to date
here on the list.

I sincerely hope you join the effort to develop a citation microformat and
help with your contributions and experience.

Thanks and welcome,


