[uf-new] Microformats support for pagination
Toby A Inkster
mail at tobyinkster.co.uk
Wed Jan 14 14:46:51 PST 2009
Brian Suda wrote:
> But this isn't unique to microformats, other semantic technologies
> would have this issue as well.
FOAF (and RDF more generally) has a set of well-established
conventions for merging data. Certain properties are taken to be what
is called "inverse functional properties" (IFPs) - what that means in
English is that if P is an IFP, and two people have a property P with
the same value, then they're really the same person.
foaf:mbox is for example defined as an IFP - each mailbox marked up
with foaf:mbox belongs to exactly one person. If two people share a
foaf:mbox, then they are the same person, so their data can be
merged. (I know what you're thinking... there are people who share a
mailbox, so doesn't this break? In theory, no it doesn't break - the
specification says that it's for "personal mailboxes" only, "ie. an
Internet mailbox associated with exactly one owner". In practice,
people occasionally ignore the spec, but for the most part it works
well.) There are other IFPs too, such as foaf:jabberID, foaf:openid,
etc.
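To make the merging rule concrete, here is a minimal sketch in Python. It is not how any particular FOAF tool does it. The record layout (a dict mapping property names to sets of values), the IFPS tuple and the merge_by_ifp function are all hypothetical names chosen for illustration. Records that share a value for any listed IFP are grouped with a small union-find, so the merge is transitive: if A shares a mailbox with B, and B shares an OpenID with C, all three collapse into one record.

```python
from collections import defaultdict

# Assumption: the properties we choose to treat as inverse functional.
IFPS = ("mbox", "jabberID", "openid")


def merge_by_ifp(records, ifps=IFPS):
    """Merge records (dicts of property -> set of values) that share an IFP value."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            # Keep the lower index as representative so output order is stable.
            parent[max(ri, rj)] = min(ri, rj)

    seen = {}  # (property, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for prop in ifps:
            for value in rec.get(prop, ()):
                key = (prop, value)
                if key in seen:
                    union(idx, seen[key])
                else:
                    seen[key] = idx

    # Pool every property of every record into its group's merged record.
    merged = defaultdict(lambda: defaultdict(set))
    for idx, rec in enumerate(records):
        root = find(idx)
        for prop, values in rec.items():
            merged[root][prop].update(values)
    return [{p: set(v) for p, v in merged[root].items()} for root in sorted(merged)]


# Made-up example data, not taken from any real FOAF file.
people = [
    {"name": {"Alice"}, "mbox": {"mailto:alice@example.org"}},
    {"nick": {"al"}, "mbox": {"mailto:alice@example.org"},
     "openid": {"http://alice.example.org/"}},
    {"name": {"Bob"}, "mbox": {"mailto:bob@example.org"}},
    {"name": {"A. Smith"}, "openid": {"http://alice.example.org/"}},
]
result = merge_by_ifp(people)
```

Note that the sketch trusts the data completely: a single shared mailbox that breaks the "exactly one owner" rule would wrongly merge two people.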
So, for hCard/vCard, what are candidates for IFPs? We've discussed
"uid" before, and the general agreement is that that should be fairly
safe. "Photo" looks like a good candidate at first, and would probably
work in practice, but the vCard spec defines it far too loosely -
nothing stops two people from legitimately having the same photo.
"Key" is pretty much in the same bucket as "photo", but is
probably less useful as few people use it anyway. So really, "uid" is
just about it - shame not many people use that either.
> wouldn't you just keep a list of the pages you have already
> crawled? So if you find a tagcloud on page /item1.html and it links to
> /tags/tag1 then on page item2.htm you re-find the tag cloud which
> links to /tags/tag1 you don't follow it again?
I don't think that that's quite André's point. A lot of blogs have
tag clouds - long lists of perhaps a hundred tags, in various sized
fonts which act as jumping off points to other parts of the site.
They are not tags in the rel=tag sense of the word in that they do
not describe the content of the current page, but of the site as a
whole. People should not be marking them with rel=tag, but
nonetheless some people do. And it means that essentially every
single page on their site has the same massive set of tags - rel=tag
becomes useless on the whole site.
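One way a consumer might cope, sketched in Python under loud assumptions: this heuristic is not part of rel=tag or any microformats spec, and the function name drop_sitewide_tags, the page_tags layout and the 0.9 threshold are all hypothetical. The idea is simply that a tag found on nearly every crawled page of a site says nothing about any one page, so it is discarded before the per-page tags are used.

```python
from collections import Counter


def drop_sitewide_tags(page_tags, max_share=0.9):
    """Remove tags that occur on more than max_share of a site's pages.

    page_tags maps a page URL to the set of rel=tag values found on it.
    max_share is an arbitrary threshold, picked here for illustration only.
    """
    if not page_tags:
        return {}
    # Count each tag once per page, however often it is repeated there.
    counts = Counter(tag for tags in page_tags.values() for tag in set(tags))
    limit = max_share * len(page_tags)
    sitewide = {tag for tag, n in counts.items() if n > limit}
    return {page: set(tags) - sitewide for page, tags in page_tags.items()}


# Made-up crawl result: "blog" and "misc" come from a tag cloud on every page.
crawled = {
    "/item1.html": {"python", "blog", "misc"},
    "/item2.html": {"rdf", "blog", "misc"},
    "/item3.html": {"python", "blog", "misc"},
}
cleaned = drop_sitewide_tags(crawled)
```

The obvious weakness is a small or single-topic site, where a tag can legitimately apply to every page and would be thrown away along with the tag cloud.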
--
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>