[uf-discuss] semantic web and microformats

Tom Morris bbtommorris at gmail.com
Tue Oct 9 10:27:33 PDT 2007

On 10/9/07, Patrick Aljord <patcito at gmail.com> wrote:
> Hey all,
> I need to do a presentation on the semantic Web and all the articles I
> read about it talks about RDF and usually show this schema:
> http://en.wikipedia.org/wiki/Image:W3c-semantic-web-layers.svg
> Could anyone please tell me what's the relation between the semantic
> web and microformats, where would the microformats stands on that
> schema for example?
> Can microformats and the SW be related and in what way?

Microformats and the Semantic Web are related, but how they are
related is something people dispute.

1. There are a number of people who think that microformats are a
drop-in replacement for the W3C and Tim Berners-Lee's vision of the
SW. They see the SW as too complicated, bureaucratic and out-of-step
with current web development practice. For instance, at FOWA in London
this week, one presentation described the relationship of microformats
to RDF as being like the relationship of REST to SOAP.

2. There are obviously people who think the complete opposite - that
microformats are an irrelevant distraction from the Semantic Web.
There are probably more people in the former group than the latter.

3. There are people in the middle, who think that microformats are a
valuable approach, but that there is room for work that bridges the two.

I'm definitely of the 'compatibilist' bent. Here's why:

I think that the Semantic Web approach, with its use of URIs and
namespacing, allows solutions to domain-specific problems that can't
necessarily be solved by microformats, because domain-specific
problems do not correlate very well with the principles underlying
microformats - namely codifying existing practices. For instance, I've
been looking at how we could represent fictional characters in an
hCard-style format.

I think that there is value in a "top-down" approach too, because
sometimes there are problems where there is no bubble-up solution
forthcoming. For instance, I visited the Gene Campus in Cambridge (UK)
recently, and saw that they are just working out how to publish huge
volumes of genetic data online in a big database. They are going to
provide an API, but were also looking for a more light-weight data
format. There would be no microformat for this - because of the social
organisation of the microformats community. There is value in the way
that microformats are organised and run, but there is value in
allowing people to experiment and come up with their own forms of
social organisation. It may be that in some circumstances, one person
working on a schema on their own may come up with a better solution
than everybody else. Because RDF and the Semantic Web approach are
based on URIs, all it takes to coin a new schema is some URI-space -
and everybody can get URI space for a reasonably low cost.

The cost of entry for extending existing standards in a useful way is
lowered. For instance, Danny Ayers wrote a FOAF extension a while back
that allowed you to represent your pets. Has it taken off? No. Does it
matter? Not particularly. If approach X doesn't work, *anyone* can try
to approach the problem in a different way...

The microformats community works on the basis of having the data
embedded into the HTML. The RDF/SemWeb approach looks to have a
consistent data model, and then having as many representations as you
like of that data model. The data model for microformats differs based
on which tool you use (perhaps it's a key-value-pair array, or an
object, or an XML format) - even though the source syntax is always
the same (HTML or XHTML). With RDF, you have the same model (subject,
predicate, object) but with different syntaxes (XML, JSON,
(X)HTML-with-GRDDL, N3/N-Triples, TriX, the new JavaScript proposal
that's been circulating on the W3C semantic web mailing list, internal
memory models, SQL tables, etc.).
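
To make the "one model, many syntaxes" point concrete, here is a
minimal sketch in Python using only the standard library. It is an
illustration, not how any real RDF toolkit is implemented: the
example.org URIs, the "pet" property and the function names are all
made up for this example (only the FOAF name property is real), and
the literal escaping is a simplification.

```python
import json

# The one data model: a list of (subject, predicate, object) statements.
# The example.org URIs are hypothetical; foaf:name is a real FOAF property.
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
TRIPLES = [
    ("http://example.org/people/alice", FOAF_NAME, "Alice"),
    ("http://example.org/people/alice",
     "http://example.org/vocab/pet",
     "http://example.org/pets/rex"),
]


def is_uri(term):
    """Crude assumption for this sketch: anything starting with http(s) is a URI."""
    return term.startswith("http://") or term.startswith("https://")


def to_ntriples(triples):
    """Serialise the triples in an N-Triples-like line format."""
    lines = []
    for s, p, o in triples:
        # Simplification: json.dumps is used to quote and escape plain literals.
        obj = "<%s>" % o if is_uri(o) else json.dumps(o)
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


def to_json(triples):
    """Serialise the same triples as nested JSON: subject -> predicate -> objects."""
    doc = {}
    for s, p, o in triples:
        doc.setdefault(s, {}).setdefault(p, []).append(o)
    return json.dumps(doc, indent=2, sort_keys=True)


print(to_ntriples(TRIPLES))
print(to_json(TRIPLES))
```

Both functions read the same list of statements; only the surface
syntax changes.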

There is also value in the 'write the parser once' approach. Each new
microformat requires new tool support: Operator, Tails, X2V, Optimus
and so on will have to be rewritten or extended to cover it. But RDF
tools keep on reading RDF regardless of how many
new schemas people create. Imagine if we had to recreate the DOM, XML
parsers, XSLT, XPath, validators, XQuery and the rest of the XML stack
whenever anyone came up with a new XML-based specification.
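
As a rough illustration of 'write the parser once', the sketch below
(plain Python, no RDF library) uses a single pattern-matching function
over triples. It is an assumption-laden toy: the example.org people
and the "pets" vocabulary are hypothetical, and match() is a name
invented here, not an API from any real toolkit. The point is only
that the function never needs to know which vocabulary it is reading.

```python
FOAF = "http://xmlns.com/foaf/0.1/"          # real FOAF namespace
PETS = "http://example.org/vocab/pets#"      # hypothetical vocabulary

# Statements from two different vocabularies, mixed in one list.
TRIPLES = [
    ("http://example.org/people/alice", FOAF + "name", "Alice"),
    ("http://example.org/people/alice", FOAF + "knows",
     "http://example.org/people/bob"),
    ("http://example.org/people/alice", PETS + "hasPet",
     "http://example.org/pets/rex"),
    ("http://example.org/pets/rex", PETS + "species", "dog"),
]


def match(triples, s=None, p=None, o=None):
    """Yield every triple that fits the pattern; None acts as a wildcard."""
    pattern = (s, p, o)
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip(pattern, triple)):
            yield triple


# The same function answers questions about either vocabulary.
names = [o for _, _, o in match(TRIPLES, p=FOAF + "name")]
pets = [o for _, _, o in match(TRIPLES, p=PETS + "hasPet")]
print(names)
print(pets)
```

Adding a third vocabulary would mean adding more triples, not
changing match().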

Microformats have changed how Semantic Web development is taking
place. There is a small but growing number of increasingly pragmatic
developers who are less concerned with large-scale ontology projects,
less concerned with Rules and inference (etc.) and more concerned with
publishing the data out on the web, now. That said, often I feel that
people just try and make problems disappear. Which never happens, of
course. Problems never really disappear. They just reappear in more
complex forms.

For me, microformats are like the 'literals' of the Semantic Web - the
most reusable chunks (people, places, events, 'votes', tags and
reviews) - which can then be used and extended with the pre-existing
Semantic Web technologies for custom purposes.

The GRDDL approach - which recently got the W3C seal of approval - is
a bridge between microformats - both those officially created by
microformats.org ("upper case microformats"? heh heh), and those just
defined into existence by authors - and the Semantic Web. What I find
neat about it is that anyone can just define a GRDDL profile and start
using it - without having to spend time arguing on mailing lists.
e.g. http://tommorris.org/profiles/nsfw

I'm working on showing ways that we can do the sort of mashup-style
behaviour that is currently done with APIs quite easily with the
Semantic Web approach. Here's a guide I published recently on how you
can use RDF in Python to quickly query data:

If you want to, feel free to e-mail me off-list.


Tom Morris
