[uf-discuss] Microformats search engine: virel

Angus McIntyre angus at pobox.com
Mon Jul 7 16:43:28 PDT 2008

Dan Brickley wrote:
> ... we need to get much better (across various of these projects)
> in making clear to users what's going on, including the bad things that
> might happen.


> Some examples:

These are good examples, not least because they relate to 'good actors'
(rather than 'bad actors', such as spammers, who can be expected to behave
badly). Even well-intentioned (re-)use has implications and consequences.

> 3. identi.ca, twitter-like microblog (opensource as laconi.ca)
> This microblogging platform encourages users to attach a Creative
> Commons license to their postings, which should give downstream
> aggregators a clearer sense of what can and can't be done with the data.
> We lack similar practice for FOAF and microformat content.

I think this is an interesting point. It might be worth reflecting on some
other mechanisms that are used for expressing directives as to how content
can be used.

Of the mechanisms that I've come across, the most obvious are the
CC-licenses that Dan mentioned. Next up is the robots exclusion protocol
[1], and the extensions to it now supported by Google and Yahoo! [2]. The
X-Robots-Tag with its 'noarchive' and 'nosnippet' directives provides
fairly granular control over what may be done with content. Finally,
there's the 'media:restriction' element used in mediaRSS [3]. In the
standard, that's limited to specifying a country and "deny" to indicate
that a given piece of media isn't for distribution to that country.
However, some video hosting services overload it to specify restrictions
on how their content may or may not be aggregated (and by whom).
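
As a rough illustration of the X-Robots-Tag directives mentioned above,
here is a minimal Python sketch of how a well-behaved aggregator might
check them before archiving or excerpting a page. The function names are
hypothetical, and the sketch assumes the header value is a plain
comma-separated list of directives (no user-agent prefix, no dated
directives).

```python
def parse_robots_tag(header_value):
    """Split an X-Robots-Tag header value into a set of lowercase directives.

    Assumption: the value is a simple comma-separated list such as
    "noarchive, nosnippet"; per-crawler prefixes are not handled here.
    """
    return {part.strip().lower() for part in header_value.split(",") if part.strip()}


def may_archive(header_value):
    """True unless the publisher asked that no cached copy be kept."""
    return "noarchive" not in parse_robots_tag(header_value)


def may_show_snippet(header_value):
    """True unless the publisher asked that no excerpt be shown."""
    return "nosnippet" not in parse_robots_tag(header_value)


if __name__ == "__main__":
    value = "noarchive, nosnippet"
    print(may_archive(value), may_show_snippet(value))
```

An empty header value yields an empty set, so both checks default to
"allowed", which matches the opt-out nature of these directives.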

Possible directives governing use might include:

  individual-only - for use by tools like Operator, but not to be crawled
  do-not-republish - allows automated processing, but not republishing
  non-commercial - only non-commercial republishing allowed
  no-spam - commercial republishing OK, but don't make unsolicited contact
  unrestricted - any legal use permissible

If these directives actually form a continuum, running from most to least
restrictive, then you can make it a principle that data can only be
republished under the same or more restrictive terms: if A publishes data
with 'non-commercial' republishing allowed, then B may only republish it
as 'non-commercial', 'do-not-republish' or 'individual-only'.
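
The "same or more restrictive" principle can be sketched as an ordered
enumeration. This is only an illustration under the assumption that the
five directives really do form a single linear scale, in the order listed
above; the class and function names are hypothetical.

```python
from enum import IntEnum


class Use(IntEnum):
    """Directives ordered from most restrictive (0) to least restrictive (4).

    Assumption: the directives form one linear continuum, as proposed above.
    """
    INDIVIDUAL_ONLY = 0
    DO_NOT_REPUBLISH = 1
    NON_COMMERCIAL = 2
    NO_SPAM = 3
    UNRESTRICTED = 4


def may_republish_as(source, target):
    """A republisher may keep the source terms or tighten them, never loosen."""
    return target <= source


def allowed_republish_terms(source):
    """List every directive under which data with `source` terms may be passed on."""
    return [term for term in Use if may_republish_as(source, term)]


if __name__ == "__main__":
    for term in allowed_republish_terms(Use.NON_COMMERCIAL):
        print(term.name)
```

For data published as 'non-commercial', this yields the three terms named
in the example above and excludes 'no-spam' and 'unrestricted'.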


[1] http://www.robotstxt.org/


[3] https://www.google.com/webmasters/tools/video/en/video.html
