[microformats-discuss] a micro micro-format for an 'item'

Thu Oct 13 12:01:34 PDT 2005

On 10/13/05, S. Sriram <ssriram at gmail.com> wrote:
> From: "David Janes -- BlogMatrix" <davidjanes at blogmatrix.com>

Well now, I have a use case/application which will require an XHTML
expression of an item, and intend to develop this generally in line
with the microformats process (I probably won't be doing the actual
development myself, but this is what I'll be recommending). The kind
of item is a commercial product, in a fairly general sense.

There isn't very much in the way of *consistent* existing practice for
this, beyond things like title/name and description. The application
will be RDF-oriented, so whatever is used for the format must map
simply to an RDF model, but I don't anticipate that being an issue.

One approach would be to go in the kind of direction S. Sriram
proposes, to generalise back to a generic item. This would be in line
with the principles of not repeating yourself or making things harder
than they need to be. RDF certainly can support this approach, and
there are many in the knowledge representation community to whom such
an approach would appeal (see [1]).

But I think in the context of microformats this is almost certainly a
bad idea. An item in the blog-post definition will mean a blog-post
item, never a product item. If you only have generic-item, then you've
lost a useful chunk of semantics.

It's also worth remembering that the Web already has a concept of a
resource, which is essentially anything that can be identified (with a
URI). That isn't far off the notion of a generic item.

Anyhow, most microformat documents will contain fairly domain-specific
information, so only one or two of the microformat definitions will be
needed per document. It remains to be seen how well XHTML can support
complex mixing of microformats. I strongly suspect it would get ugly
and unusable very quickly once you go beyond two or three in the same
doc. Take a look at some RDF/XML, or the KIF used for SUMO at [1]:
those get hard to read quickly, and they are designed to represent
such stuff.

I believe the pragmatic approach is that if you want a model like an
item hierarchy, then keep that in the model, and map from the
domain-specific format to the model. This allows bottom-up
development, without premature commitment to any kind of top-down
semantics.
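
A minimal sketch of that mapping step, in Python, under loud
assumptions: the class names ("product-item", "title", "description")
and the "product:" prefix are hypothetical, not taken from any
published microformat, and the parser assumes property elements hold
plain text only and that items are not nested. It only illustrates
keeping the markup domain-specific while the model lives elsewhere.

```python
from html.parser import HTMLParser

# Hypothetical mapping from microformat class names to RDF properties.
# Neither the class names nor the "product:" prefix come from a spec.
CLASS_TO_PROPERTY = {
    "title": "product:title",
    "description": "product:description",
}
ROOT_CLASS = "product-item"  # hypothetical root class name


class ProductItemParser(HTMLParser):
    """Collects (subject, predicate, object) triples from marked-up XHTML."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._subject = None   # identifier of the item being read
        self._property = None  # property whose text is being collected
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if ROOT_CLASS in classes:
            # The element id stands in for the item's URI fragment.
            self._subject = "#" + attrs.get("id", "item")
            self.triples.append((self._subject, "rdf:type", "product:item"))
            return
        if self._subject is None or self._property is not None:
            return
        for name in classes:
            if name in CLASS_TO_PROPERTY:
                self._property = CLASS_TO_PROPERTY[name]
                self._text = []
                break

    def handle_data(self, data):
        if self._property is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Simplification: the first end tag closes the open property.
        if self._property is not None:
            value = " ".join("".join(self._text).split())
            self.triples.append((self._subject, self._property, value))
            self._property = None


def extract_triples(xhtml):
    parser = ProductItemParser()
    parser.feed(xhtml)
    parser.close()
    return parser.triples


SAMPLE = """
<div class="product-item" id="widget">
  <span class="title">Blue Widget</span>
  <p class="description">A small,
     blue widget.</p>
</div>
"""

triples = extract_triples(SAMPLE)
for triple in triples:
    print(triple)
```

Nothing in the markup mentions a generic item; any hierarchy would be
added on the model side, after the triples have been extracted.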

Going back to the product example, there is an existing vocabulary,
FRBR, which contains the descriptions needed for the entities I'm
looking at. This vocabulary has been expressed as RDF [2]. There's a
whole lot more in that vocabulary than I'll need, so my plan is to look
at what's needed/used in the wild and cherry-pick the terms with the
required
semantics (more than once I've seen criticism of RDF vocabularies
saying they're over-engineered - this is usually misguided, because
RDF supports cherry-picking, you don't have to commit to everything,
and missing isn't broken). The net result will be a specialised
microformat, which I anticipate will be comparable in size and
specialisation to hReview or similar.

<aside>
However, this will have an RDF mapping, and in systems that understand
that model it will be straightforward to say that:

blogpost:item rdfs:subClassOf generic:item .
product:item rdfs:subClassOf generic:item .
blogpost:title rdfs:subPropertyOf generic:title .
product:title rdfs:subPropertyOf generic:title .

An RDF system that supports inference can derive the instance data
from this that would allow SPARQL queries with patterns like:

?x rdf:type generic:item .
?x generic:title ?y .

which would produce a set of results containing both the
blogpost:items and their titles together with the product:items and
their titles.
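
A rough sketch of that inference in plain Python, for illustration
only: triples are tuples, the "ex:" instance data is made up, and just
two RDFS entailment rules (rdfs7 for subproperties, rdfs9 for
subclasses) are applied. A real system would use an RDF store with a
reasoner rather than anything like this.

```python
# Schema statements, as given in the text above.
SCHEMA = [
    ("blogpost:item", "rdfs:subClassOf", "generic:item"),
    ("product:item", "rdfs:subClassOf", "generic:item"),
    ("blogpost:title", "rdfs:subPropertyOf", "generic:title"),
    ("product:title", "rdfs:subPropertyOf", "generic:title"),
]

# Hypothetical instance data, invented for this example.
DATA = [
    ("ex:post1", "rdf:type", "blogpost:item"),
    ("ex:post1", "blogpost:title", "Hello microformats"),
    ("ex:widget", "rdf:type", "product:item"),
    ("ex:widget", "product:title", "Blue Widget"),
]


def infer(triples):
    """Apply rules rdfs7 and rdfs9 until no new triples appear."""
    closure = set(triples)
    while True:
        sub_class = {(s, o) for s, p, o in closure if p == "rdfs:subClassOf"}
        sub_prop = {(s, o) for s, p, o in closure if p == "rdfs:subPropertyOf"}
        new = set()
        for s, p, o in closure:
            # rdfs9: an instance of a subclass is an instance of the superclass.
            if p == "rdf:type":
                new.update((s, "rdf:type", sup)
                           for sub, sup in sub_class if sub == o)
            # rdfs7: a statement made with a subproperty also holds
            # for the superproperty.
            new.update((s, sup, o) for sub, sup in sub_prop if sub == p)
        if new <= closure:
            return closure
        closure |= new


def generic_items_with_titles(triples):
    """Match the pattern: ?x rdf:type generic:item . ?x generic:title ?y ."""
    items = {s for s, p, o in triples
             if p == "rdf:type" and o == "generic:item"}
    return sorted((s, o) for s, p, o in triples
                  if p == "generic:title" and s in items)


results = generic_items_with_titles(infer(SCHEMA + DATA))
for subject, title in results:
    print(subject, title)
```

Run against the raw data without infer(), the same pattern matches
nothing, since no triple mentions generic:item or generic:title
directly.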
</aside>

So basically I'm with David on this one ;-)

Cheers,
Danny.

[1] http://ontology.teknowledge.com/
[2] http://vocab.org/frbr/core

--

http://dannyayers.com