[uf-new] Microformats support for pagination

Wed Jan 14 15:17:14 PST 2009

On Wed, Jan 14, 2009 at 10:46 PM, Toby A Inkster <mail at tobyinkster.co.uk> wrote:
> Brian Suda wrote:
>
>> But this isn't unique to microformats, other semantic technologies
>> would have this issue as well.
>
> FOAF (and RDF more generally) has a set of well-established conventions for
> merging data. Certain properties are taken to be what is called "inverse
> functional properties" (IFPs) - what that means in English is that if P is
> an IFP, and two people have a property P with the same value, then they're
> really the same person.
>

Wow... I wasn't aware of this. Thanks for the tip.

> foaf:mbox is for example defined as an IFP - each mailbox marked up with
> foaf:mbox belongs to exactly one person. If two people share a foaf:mbox,
> then they are the same person, so their data can be merged. (I know what
> you're thinking... there are people who share a mailbox, so doesn't this
> break? In theory, no it doesn't break - the specification says that it's for
> "personal mailboxes" only, "ie. an Internet mailbox associated with exactly
> one owner". In practice, people occasionally ignore the spec, but for the
> most part it works well.) There are other IFPs too, such as foaf:jabberID,
> foaf:openid, etc.
>
> So, for hCard/vCard, what are candidates for IFPs? We've discussed "uid"
> before, and the general agreement is that that should be fairly safe.
> "Photo" looks like it might be a good candidate to begin with, and probably
> will do in practice, but in theory the vCard spec defines it far too loosely
> - two people could allowably have the same photo. "Key" is pretty much in
> the same bucket as "photo", but is probably less useful as few people use it
> anyway. So really, "uid" is just about it - shame not many people use that
> either.
>

Hmmm... can't we use emails? If two hCards have the same email, aren't
they the same entity?
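
To make the question concrete, here is a rough sketch of what merging on
a shared value could look like. It assumes hCards have already been
parsed into plain dicts of property name to list of values; the
merge_by_ifp name and the dict layout are hypothetical, not taken from
any existing parser. Treating "email" as an IFP is exactly the
assumption being asked about, not something the vCard spec promises.

```python
def merge_by_ifp(cards, ifp="email"):
    """Merge cards that share at least one value for the `ifp` property.

    `cards` is a list of dicts mapping property names to lists of values
    (an assumed layout, for illustration only). Single pass: two groups
    that are only later bridged by a third card are NOT joined.
    """
    merged = []   # merged card dicts, in first-seen order
    index = {}    # ifp value -> position of its card in `merged`

    for card in cards:
        values = card.get(ifp, [])
        # Look for an already-merged card sharing any of the ifp values.
        hit = next((index[v] for v in values if v in index), None)

        if hit is None:
            # No match: start a new merged card (copy the value lists).
            merged.append({k: list(v) for k, v in card.items()})
            hit = len(merged) - 1
        else:
            # Match: fold this card's values in, skipping duplicates.
            target = merged[hit]
            for prop, vals in card.items():
                bucket = target.setdefault(prop, [])
                for val in vals:
                    if val not in bucket:
                        bucket.append(val)

        for v in values:
            index[v] = hit

    return merged
```

Swapping ifp="email" for ifp="uid" gives the safer behaviour Toby
describes; the shared-mailbox problem he mentions for foaf:mbox would
apply to the email version in just the same way.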

>> wouldn't you just keep a list of the pages you have already
>> crawled? So if you find a tagcloud on page /item1.html and it links to
>> /tags/tag1 then on page item2.htm you re-find the tag cloud which
>> links to /tags/tag1 you don't follow it again?
>
>
> I don't think that that's quite André's point. A lot of blogs have tag
> clouds - long lists of perhaps a hundred tags, in various sized fonts which
> act as jumping off points to other parts of the site. They are not tags in
> the rel=tag sense of the word in that they do not describe the content of
> the current page, but of the site as a whole. People should not be marking
> them with rel=tag, but nonetheless some people do. And it means that
> essentially every single page on their site has the same massive set of tags
> - rel=tag becomes useless on the whole site.

Exactly. I agree that this is not the purpose of rel-tag; I only
brought it up because, out of a very small sample, quite a few examples
popped up. The only way out of this mess that I can think of is to
create a microformat for tag clouds: a root element with
class="tagcloud" (the actual name could be based on the most used
term). That would give parsers a mechanism to either exclude all
rel-tags inside .tagcloud, or to grab only the rel-tags inside
.tagcloud and bail out...
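
Something along these lines is what I have in mind on the parser side.
It is only a sketch using Python's standard html.parser; the "tagcloud"
class name is the hypothetical one from above, RelTagCollector is a
made-up name, and the depth counting assumes reasonably well-formed
markup (every non-void element is explicitly closed).

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class RelTagCollector(HTMLParser):
    """Collect rel=tag links, separating those inside a tag cloud root."""

    def __init__(self, cloud_class="tagcloud"):
        super().__init__()
        self.cloud_class = cloud_class  # hypothetical root class name
        self.depth = 0        # nesting depth inside the cloud; 0 = outside
        self.page_tags = []   # rel=tag hrefs describing the page itself
        self.cloud_tags = []  # rel=tag hrefs found inside the tag cloud

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()

        if self.depth > 0:
            if tag not in VOID:
                self.depth += 1
        elif self.cloud_class in classes and tag not in VOID:
            self.depth = 1  # entering the tag cloud root element

        rels = (attrs.get("rel") or "").split()
        if tag == "a" and "tag" in rels and attrs.get("href"):
            bucket = self.cloud_tags if self.depth else self.page_tags
            bucket.append(attrs["href"])

    def handle_endtag(self, tag):
        if self.depth > 0 and tag not in VOID:
            self.depth -= 1
```

A parser that wants to ignore tag clouds reads page_tags after calling
feed(); one that only wants the site-wide tags reads cloud_tags instead.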

This brings me to yet another point that I considered when I gave that
talk... if there was a semantic way of attaching a site-wide weight to
a rel-tag, that would be *awesome* for these cases. :) But we've seen
that embedding machine-data into microformats is a dangerous path...
;)

Thanks for your feedback,
André Luís