[uf-new] Microformats for hidden data

Fiann O'Hagan fianno at jshub.org
Thu Nov 26 03:27:14 PST 2009


Hi everyone,

A little while ago my colleague Liam posted on this list about the
jsHub project and our ideas for a microformat to replace the
proprietary JavaScript currently used for web analytics metadata. He
got some good feedback, and I can see there's work we need to do.

Here's the use case we want to address: there is a lot of information
currently stored in pages which is encoded in vendor-specific
JavaScript variables. There are many reasons why the microformat
approach (in principle) would be better than the current situation.
Publishers of big sites find that they are now running several
analytics tags, so it makes sense to keep a single version of the data
about each page rather than re-declaring it in multiple formats. I
also believe that some of this information (page name and category,
for example) would be of great interest to search engine spiders if it
were accessible.

I would like to take a step back from comments on our specific
proposal and ask a much more general question.

Is there any existing material on how to handle information which is
not in the visible HTML of the page?

As far as I can see, all the microformats currently in use start with
information which is visible in the page, and then add markup to
indicate what it represents. For example, with hProduct, you start
with the existing product name, price, etc. in the page, and add the
appropriate classes to indicate what these fields represent.

But there is a wealth of information hidden within the page in <meta>
tags and in JS blocks. For example, on the microformats.org wiki at
http://microformats.org/wiki/hcard-faq the page source includes:

var wgPageName = "hcard-faq";
var wgTitle = "hcard-faq";
var wgAction = "view";

It's quite possible that for web analytics purposes, you might want to
use the page name "hcard-faq" which is different from both the HTML
title element "hCard FAQ · Microformats Wiki" and the URL path
"/wiki/hcard-faq".
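
As a rough illustration of how a consumer could get at values like
these, here is a minimal sketch that pulls string-valued var
declarations out of a block of script text. The function name
extractPageVars and the regular expression are assumptions made up for
this example; they are not part of MediaWiki or jsHub, and a real page
would need a proper JavaScript parser rather than a pattern match.

```javascript
// Hypothetical helper (not a MediaWiki or jsHub API): collects simple
// `var name = "value";` declarations from a string of script text.
function extractPageVars(scriptText) {
  const vars = {};
  // Only handles double-quoted string values on a single declaration,
  // e.g. var wgTitle = "hcard-faq";
  const pattern = /var\s+([A-Za-z_$][\w$]*)\s*=\s*"([^"]*)"\s*;/g;
  let match;
  while ((match = pattern.exec(scriptText)) !== null) {
    vars[match[1]] = match[2];
  }
  return vars;
}

// The three declarations quoted above, used as sample input.
const sample = [
  'var wgPageName = "hcard-faq";',
  'var wgTitle = "hcard-faq";',
  'var wgAction = "view";'
].join("\n");

const pageVars = extractPageVars(sample);
console.log(pageVars.wgPageName); // prints "hcard-faq"
```

The fragility of this kind of scraping (it depends on one site's
variable names and quoting style) is part of why a shared, declarative
format for the same data seems worth discussing.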

Is there any guidance available about these cases, where the
information we want to capture is not part of the visible page? Please
note that the data is human readable, but the person consuming it is
not the end user browsing the site. It is, for example, someone
looking at reports on the most popular pages on the site.

This means that some of the microformats principles, such as "visible
data, not invisible metadata", can't be applied directly.

Thanks for any feedback,

Fiann


