[uf-new] Microformats support for pagination

Toby A Inkster mail at tobyinkster.co.uk
Wed Jan 14 14:46:51 PST 2009


Brian Suda wrote:

> But this isn't unique to microformats, other semantic technologies
> would have this issue as well.

FOAF (and RDF more generally) has a set of well-established  
conventions for merging data. Certain properties are taken to be what  
is called "inverse functional properties" (IFPs) - what that means in  
English is that if P is an IFP, and two people have a property P with  
the same value, then they're really the same person.

foaf:mbox is for example defined as an IFP - each mailbox marked up  
with foaf:mbox belongs to exactly one person. If two people share a  
foaf:mbox, then they are the same person, so their data can be  
merged. (I know what you're thinking... there are people who share a  
mailbox, so doesn't this break? In theory, no it doesn't break - the  
specification says that it's for "personal mailboxes" only, "ie. an  
Internet mailbox associated with exactly one owner". In practice,  
people occasionally ignore the spec, but for the most part it works  
well.) There are other IFPs too, such as foaf:jabberID, foaf:openid,  
etc.
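
As a rough illustration of the merging convention described above, here is a minimal Python sketch. It assumes each person is a plain dict mapping a property name to a set of values, and that the consumer treats "mbox", "jabberID" and "openid" as IFPs. The record layout and the function name merge_by_ifp are hypothetical, not part of FOAF or of any RDF library.

```python
from collections import defaultdict

# Assumption: the consumer treats these property names as IFPs.
IFPS = ("mbox", "jabberID", "openid")


def merge_by_ifp(records, ifps=IFPS):
    """Merge records (dicts of property -> set of values) sharing any IFP value."""
    # Union-find over record indexes, so merging is transitive:
    # A shares mbox with B, B shares openid with C => A, B, C are one person.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    first_seen = {}  # (property, value) -> index of first record carrying it
    for idx, rec in enumerate(records):
        for prop in ifps:
            for value in rec.get(prop, ()):
                key = (prop, value)
                if key in first_seen:
                    union(idx, first_seen[key])
                else:
                    first_seen[key] = idx

    # Pool every property value of a group into one merged record.
    merged = defaultdict(lambda: defaultdict(set))
    for idx, rec in enumerate(records):
        for prop, values in rec.items():
            merged[find(idx)][prop].update(values)
    return [dict(props) for _, props in sorted(merged.items())]
```

Non-IFP properties such as a name are never used to decide identity here; they are only pooled once an IFP match has been found. For hCard the same approach would apply with "uid" as the only entry in the IFP list.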

So, for hCard/vCard, what are candidates for IFPs? We've discussed  
"uid" before, and the general agreement is that that should be fairly  
safe. "Photo" looks like it might be a good candidate to begin with,  
and probably will do in practice, but in theory the vCard spec  
defines it far too loosely - two people could allowably have the same  
photo. "Key" is pretty much in the same bucket as "photo", but is  
probably less useful as few people use it anyway. So really, "uid" is  
just about it - shame not many people use that either.

> wouldn't you just keep a list of the pages you have already
> crawled? So if you find a tagcloud on page /item1.html and it links to
> /tags/tag1 then on page item2.htm you re-find the tag cloud which
> links to /tags/tag1 you don't follow it again?


I don't think that that's quite André's point. A lot of blogs have  
tag clouds - long lists of perhaps a hundred tags, in various sized  
fonts which act as jumping off points to other parts of the site.  
They are not tags in the rel=tag sense of the word in that they do  
not describe the content of the current page, but of the site as a  
whole. People should not be marking them with rel=tag, but  
nonetheless some people do. And it means that essentially every  
single page on their site has the same massive set of tags - rel=tag  
becomes useless on the whole site.
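
One way a consumer might cope with this, sketched in Python under loud assumptions: treat any tag that turns up on more than some share of a site's pages as part of a site-wide cloud and ignore it. The function name page_specific_tags and the 0.8 threshold are hypothetical choices for illustration, not anything defined by rel=tag.

```python
from collections import Counter


def page_specific_tags(tags_by_page, max_share=0.8):
    """Drop tags that appear on more than max_share of a site's pages.

    tags_by_page maps a page URL to the rel=tag values found on that page.
    max_share is an assumed threshold, not part of any specification.
    """
    total = len(tags_by_page)
    # Count each tag at most once per page.
    counts = Counter(tag for tags in tags_by_page.values() for tag in set(tags))
    # With a single page there is nothing to compare against, so keep everything.
    sitewide = {
        tag for tag, n in counts.items() if total > 1 and n / total > max_share
    }
    return {page: set(tags) - sitewide for page, tags in tags_by_page.items()}
```

This is only a heuristic. It cannot tell a tag that genuinely describes a page from the same tag repeated in the cloud, so it will throw away some real tags on sites that misuse rel=tag in this way.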

-- 
Toby A Inkster
<mailto:mail at tobyinkster.co.uk>
<http://tobyinkster.co.uk>





