[uf-discuss] hCard slowing adoption of microformats?

Tantek Çelik tantek at cs.stanford.edu
Fri Nov 28 16:11:07 PST 2008


Originally sent as a private reply, though I had intended it for the list.

---------- Forwarded message ----------
Date: Sun, Nov 23, 2008 at 12:03 PM
Subject: Re: [uf-discuss] hCard slowing adoption of microformats?

Dan,

I do know of several instances (since corrected, so I won't name names)
of sites with thousands to millions of users publishing birthdays,
emails, and email hashes (which can be used to perform unintended
identity consolidation) in their FOAF files (while not on visible
profile pages).

The problem is that web pages are typically designed by web designers
who take a very strong user-centric (privacy, expectations, etc.)
perspective, whereas abstract format files are written by programmers.
To them such files look like a form to be filled out from a database
query, so they happily do so, empirically often without considering
user perspectives.

Thus there is another tendency: such invisible data (and invisible data
formats) induce leakage of private data from databases, simply by how
their design itself influences the population that
supports/publishes/programs them.

Republishing is a challenge for all data on the web, but users
understand copy & paste of visible text on the web. They're surprised
when private details become public.

There is also a "quantity surprise" effect when people see thousands of
pieces of text being copy/pasted/indexed, and currently the Google
Social Graph API (SGAPI) is providing an interesting test of that
expectation wrt XFN.

So far the anecdotal surprises about SGAPI have been far more "wow
cool" than "yikes creepy".

We'll see what happens when we see web-wide hCard fielded search (more
than just raw search as Y! Searchmonkey supports).

Tantek

-----Original Message-----
From: Dan Brickley <danbri at danbri.org>

Date: Sun, 23 Nov 2008 20:47:31
To: <tantek at cs.stanford.edu>; Microformats
Discuss<microformats-discuss at microformats.org>
Subject: Re: [uf-discuss] hCard slowing adoption of microformats?


Hi Tantek,

Tantek Celik wrote:
> This is also a classic visible data (e.g. on HTML pages) vs invisible data (e.g. at URLs not linked to or at least not easily viewable in browsers in random/rare(r) XML formats) problem.
>
> The more visible the data, the less likely users will be surprised by having data they may have thought was private (because they didn't see it on the web) be scraped, aggregated, indexed, republished.
>
> When data *is* visible that users don't feel comfortable publishing, they take steps to remove or make it private.
>
> Hence we discourage publishing of invisible data. It's user unfriendly, and leads to far more frequent violations of user expectations.

I generally agree. We discourage people from exposing anything in FOAF
that isn't otherwise available in textual form in public HTML. While it
seems (I never got the details confirmed before it was switched off)
that Tribe may have exposed more in the RDF/XML than in the HTML, from
reading through the many user comments it was the wholesale-ness of the
thing that really upset people. It looked like their entire profile
*and* those of their buddies had been copied/cloned. This could
equally well have been accomplished through use of curl/wget and some
scraping tools, and most users wouldn't have been any the wiser, or any
the happier.

You can make your own mind up here,
http://brainstorm.tribe.net/thread/34fb1a79-351d-4251-8318-829623c1c9cb

The initial post is pretty indicative of the tone,
       "Can someone please tell me why my bio and all of my tribe friends are
listed on a site I have never been to or heard of? I didn't think this
was Tribes style. I feel cheated and betrayed. If I wanted my profile to
be farmed out, I would join Facebook."

Short of keeping all public profile data buried inside hard-to-parse
GIFs, any markup describing profiles and linking to buddies is at risk
of being 'exploited' in just this way.

I think the main reason we haven't seen many complaints (about FOAF or
hCard+XFN) is not the visible/invisible issue, but simply that there
aren't many sites that have taken a "download the entire set of people
descriptions and re-assemble them on another site" approach. Thankfully.

cheers,

Dan

--
http://danbri.org/

