[uf-discuss] FYI: Jeff Jarvis on microformats and Google Base

Brian Suda brian.suda at gmail.com
Mon Nov 21 18:05:54 PST 2005


I agree with Danny Ayers' comment in the Blog post:

> Ok, so let's say you've prepared this bunch of data about
> your site, all neatly encoded in one of their format options.
> So you give it to Google. So why not give it to the rest of
> the Web as well?

This doesn't have to be an either/or issue.

The paradigm is one of submit, rather than crawl. Google wants you to
submit your info into their data silo. There is no reason you can't
send it or keep it somewhere else as well. (On your own site, or in
"OpenBase")

If I had a "wishlist" of ways Google Base could incorporate
Microformats, it would be something like the following:

Allow a file-upload type of XHTML, where things are marked-up with
class="vcard", class="vevent", etc.

This way you can generate the data ONCE and use it in several places. I
don't think this would be hard for Google because they are already
parsing RSS and Atom, which are much more of a moving target than
XHTML. Plus, the number of HTML editors is vastly greater than the number
of RSS editors out there.
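
To make the upload idea concrete, here is a rough sketch of how a receiver
might pull hCard data out of an uploaded XHTML file. It is only an
illustration: the class name HCardExtractor, the helper extract_hcards, and
the sample markup are my own assumptions, it handles just three text
properties, and it assumes well-formed XHTML. It is not how Google Base
works.

```python
from html.parser import HTMLParser

# A small, text-only subset of hCard property class names (assumption:
# real hCard defines many more, and some take values from attributes).
HCARD_PROPERTIES = {"fn", "org", "tel"}


class HCardExtractor(HTMLParser):
    """Collect text of known properties found inside class="vcard" elements."""

    def __init__(self):
        super().__init__()
        self.cards = []    # one dict per vcard found
        self._stack = []   # per open element: "vcard", a property name, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        marker = None
        if "vcard" in classes:
            self.cards.append({})
            marker = "vcard"
        elif "vcard" in self._stack:
            # Only look for properties while inside an open vcard.
            marker = next((c for c in classes if c in HCARD_PROPERTIES), None)
        self._stack.append(marker)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the innermost enclosing property, if any.
        for marker in reversed(self._stack):
            if marker == "vcard":
                return
            if marker is not None:
                card = self.cards[-1]
                card[marker] = (card.get(marker, "") + " " + text).strip()
                return


def extract_hcards(xhtml):
    """Return a list of dicts, one per class="vcard" element in the input."""
    parser = HCardExtractor()
    parser.feed(xhtml)
    parser.close()
    return parser.cards


# Sample markup for illustration only.
SAMPLE = """
<div class="vcard">
  <span class="fn">Brian Suda</span>
  <span class="org">Example Org</span>
</div>
"""

if __name__ == "__main__":
    print(extract_hcards(SAMPLE))
```

The same XHTML file could sit on your own site and be uploaded unchanged,
which is the whole point of generating the data once.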

The reasons you should still have to 'submit' your data, rather than
have it crawled, are twofold.
1) Google Base has additional fields that things like hCard do not
have built in (e.g. Marital Status: vCard doesn't care about this, but
an online dating service does). The two options are to create something
new besides hCard (which is a bad idea), or to extend hCard, which then
isn't a 1:1 mapping of vCard. So this would be an hCard with a few
additional class="" values inside that are ignored by everyone but
Google Base. (This alone is not a requirement for "submitting"
data.)
2) My site has an hCard on it. If Google started using the web spider
to extract this data and put me unwillingly onto their dating service,
I might not be too happy. Same goes for events, etc. So there should
still be an explicit way to say NOT to save this data. The easiest way
is to index nothing unless told to do so.
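
Both points can be sketched in a few lines. This is only a sketch under my
own assumptions: the "marital-status" class name, the read_card helper, and
the OptInIndex class are hypothetical, and the input is assumed to be a
dict of class name to value that some hCard parser has already produced.

```python
# Property names that map 1:1 onto vCard (a small subset, for illustration).
VCARD_PROPERTIES = {"fn", "org", "tel", "email", "url", "adr"}

# Hypothetical extension: "marital-status" is NOT part of hCard or vCard.
EXTENSION_PROPERTIES = {"marital-status"}


def read_card(raw, extensions=frozenset()):
    """Keep only the properties this consumer understands.

    A plain hCard consumer passes no extensions and so silently ignores
    the extra class values; an extended consumer opts in to them.
    """
    allowed = VCARD_PROPERTIES | set(extensions)
    return {name: value for name, value in raw.items() if name in allowed}


class OptInIndex:
    """Index nothing unless the owner has explicitly submitted the URL."""

    def __init__(self):
        self._records = {}

    def submit(self, url, raw):
        # Explicit submission: the owner asked for this data to be stored.
        self._records[url] = read_card(raw, EXTENSION_PROPERTIES)

    def crawl(self, url, raw):
        # A crawler may refresh data that was submitted earlier, but it
        # never adds a URL on its own. Returns True if anything was stored.
        if url not in self._records:
            return False
        self._records[url] = read_card(raw, EXTENSION_PROPERTIES)
        return True

    def get(self, url):
        return self._records.get(url)


if __name__ == "__main__":
    card = {"fn": "Example Person", "marital-status": "single"}
    print(read_card(card))                        # plain hCard consumer
    print(read_card(card, EXTENSION_PROPERTIES))  # extended consumer
```

An address book that only knows vCard still gets a valid card, and a
spider that stumbles over the page stores nothing.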

So my other "wishlist" items would be the obvious: display the data in
Microformat-encoded formats (where applicable).

Create a Trackback/ping service, so when you add a blog post that DOES
have microformat mark-up you can send Google Base a ping/trackback and
they can either fetch the data, or you send it in the request. Then
Google knows you explicitly want this to be saved into Google
Base, while still displaying it on your own site. If Google wants to
make sure you are 'registered', you could either 'connect' your blog's
URL so any ping/trackback from that URL is connected to your account,
or add a Developer Key as a query parameter, so they know this
Ping/Trackback came from this user, etc.
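
A ping like that could be as simple as a URL with two query parameters.
The sketch below is purely hypothetical: no such Google Base endpoint
exists as far as I know, and the endpoint address, the parameter names
"url" and "key", and both function names are made up for illustration. It
only builds and interprets the URL; it does not send anything.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; a placeholder, not a real service.
PING_ENDPOINT = "https://base.example.com/ping"


def build_ping_url(post_url, developer_key, endpoint=PING_ENDPOINT):
    """Sender side: build a ping URL naming the post and the sender's key."""
    query = urlencode({"url": post_url, "key": developer_key})
    return f"{endpoint}?{query}"


def resolve_account(ping_url, keys_to_accounts, blogs_to_accounts):
    """Receiver side: work out which account a ping belongs to.

    Tries the Developer Key first, then falls back to blog URLs that the
    user has 'connected' to an account. Returns None for unknown senders.
    """
    params = parse_qs(urlsplit(ping_url).query)
    key = params.get("key", [None])[0]
    post_url = params.get("url", [""])[0]
    if key in keys_to_accounts:
        return keys_to_accounts[key]
    for blog_url, account in blogs_to_accounts.items():
        if post_url.startswith(blog_url):
            return account
    return None


if __name__ == "__main__":
    ping = build_ping_url("http://example.org/blog/post-1", "DEMO-KEY")
    print(ping)
    print(resolve_account(ping, {"DEMO-KEY": "demo-account"}, {}))
```

Either way the receiver only stores data from senders it can tie to an
account, which keeps the explicit opt-in intact.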

This doesn't have to be an either/or issue. You can have your cake and
eat it too! Let's just hope that Google decides to play nice.

-brian

--
brian suda
http://suda.co.uk

