Non-visible microformats was [uf-discuss] Principles of Microformats?

Angus McIntyre angus at pobox.com
Sat Dec 16 05:24:43 PST 2006


At 04:54 -0500 16.12.2006, Mike Schinkel wrote:
>When I've discussed proposing numerous non-visible Microformats, Tantek (and
>others) told me "no" (numerous times.) I can dig up some quotes if need be.
>So I was documenting and clarifying philosophy, not prior deviations from.

I had to go back to the FAQ and read up on this one, to find out 
exactly why invisible microformats are disparaged. After all, it 
seems superficially that we already have some core microformats that 
define information that's not human visible, in the form of Robots 
META, various 'rel' standards and XFN.

The FAQ makes it clear that the concern is that invisible markup is 
associated with spam. Spammers like to hide content in their pages 
so that a search engine sees them differently from human viewers. 
Any proposed microformat that encourages people to put 
normally-visible information into a page invisibly therefore runs the 
risk that pages carrying it will be penalized by Google and friends 
once the spammers learn how to abuse it.

The existing invisible microformats relate to information that (a) is 
never visible anyway and (b) is harder to abuse this way. (Those are 
distinct points: there is abusable invisible information, as shown by 
the fact that Google doesn't index META keywords and descriptions.)

I can see various possible ways out of this quandary (or perhaps 
deeper into it).

#1 is to stick to the party line and say "Microformats are visible 
for the reasons stated and there's no point encouraging or endorsing 
anything that goes against that."

#2 would be to launch a distinct initiative (with its own site, wiki, 
mailing list, and line of endearing plush toys) to define non-visible 
things-that-are-like-microformats for those who want to take the risk 
of incurring the wrath of Google, accompanied by the clear caveat 
that "this stuff can get you banned from every decent search engine".

#3 would be to develop a microformat convention that allows you to 
mark blocks of embedded data as being 
not-for-search-engine-consumption; a 'no-index' convention at the 
block level that would allow you to say "I'm not trying to spamdex 
you, just ignore this bit and leave it for the robots that understand 
this specific markup". The obvious flaw here is that it requires the 
crawler-makers to tune their bots to handle it, and I doubt they will.
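
To make #3 concrete, here is a rough sketch of what a cooperating 
crawler would have to do when extracting indexable text. The class 
name "robots-noindex" is purely hypothetical (no such convention 
exists), and the code assumes reasonably well-nested markup.

```python
from html.parser import HTMLParser

# Hypothetical marker class; nothing in any existing spec defines it.
SKIP_CLASS = "robots-noindex"

# Void elements never get a closing tag, so they must not change depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class IndexableText(HTMLParser):
    """Collects text, ignoring anything inside a SKIP_CLASS block."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # > 0 while inside a skipped block
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        # Enter a skipped block, or go one level deeper inside one.
        if self.skip_depth or SKIP_CLASS in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a cooperating crawler would index."""
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

A specialised robot would do the opposite and read only the marked 
blocks. The sketch also shows why adoption is the hard part: every 
crawler would need this extra bookkeeping.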

#4 would be to accept that 'buried' semantic data ought to be 
external to the page, but define a way for a savvy search engine to 
recover the data associated with any individual element. The kind of 
marker required could be just as much a microformat as any of the 
'rel' conventions. The objection here is that maintaining two 
parallel documents is tedious (less so if you're not handbuilding 
pages). If that's not an obstacle, a combination of a 
microformat-endorsed marking convention and something like Andy 
Mabbett's proposed UNAPI <http://unapi.info/> might be a solution 
here.
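
As an illustration of #4, the sketch below shows how a savvy consumer 
might turn in-page markers into requests for external records. It 
assumes the unAPI conventions as I understand them (a LINK element 
with rel="unapi-server", elements with class "unapi-id" carrying the 
identifier in their title attribute, and "id" and "format" query 
parameters); the function name and the example URL are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class UnapiScanner(HTMLParser):
    """Finds the server link and the per-element identifiers."""

    def __init__(self):
        super().__init__()
        self.server = None  # href of the rel="unapi-server" link
        self.ids = []       # identifiers in document order

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and "unapi-server" in (a.get("rel") or "").split():
            self.server = a.get("href")
        elif "unapi-id" in (a.get("class") or "").split() and a.get("title"):
            self.ids.append(a["title"])


def external_record_urls(html, fmt):
    """Build one request URL per marked element (hypothetical helper)."""
    scanner = UnapiScanner()
    scanner.feed(html)
    scanner.close()
    if scanner.server is None:
        return []
    sep = "&" if "?" in scanner.server else "?"
    return [scanner.server + sep + urlencode({"id": i, "format": fmt})
            for i in scanner.ids]
```

The page itself carries only a small, harmless marker per element; 
the 'buried' data lives on the server and is fetched on demand.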

#5 would be to investigate whether a proposed type of 
not-for-human-consumption data could be handled as visible data, but 
designed in such a way that clever CSS styling would allow authors to 
render it more palatable to human viewers (e.g. by pulling it out of 
page flow, using :after content to surround it with 
human-intelligible labels, etc.).

It would seem to me that #4 and #5 might be worth considering in the 
context of microformats; #2 and #3 probably wouldn't be (but that 
doesn't mean that they're not worth considering in their own right, 
somewhere else).

Angus

