Non-visible microformats was [uf-discuss] Principles of
angus at pobox.com
Sat Dec 16 05:24:43 PST 2006
At 04:54 -0500 16.12.2006, Mike Schinkel wrote:
>When I've discussed proposing numerous non-visible Microformats, Tantek (and
>others) told me "no" (numerous times.) I can dig up some quotes if need be.
>So I was documenting and clarifying philosophy, not prior deviations from.
I had to go back to the FAQ and read up on this one, to find out
exactly why invisible microformats are disparaged. After all, it
seems superficially that we already have some core microformats that
define information that's not human visible, in the form of Robots
META, various 'rel' standards and XFN.
The FAQ makes it clear that the concern is that invisible markup is
associated with spam. Because spammers like to hide stuff in their
pages that causes a search engine to see them differently from human
viewers, any proposed microformat that encourages people to put
normally-visible markup into a page invisibly runs the risk of
getting pages that carry it penalized by Google and friends once the
spammers learn how to abuse it.
The existing invisible microformats relate to information that is (a)
never visible and (b) harder to abuse this way. (Those are
distinct points: there is abusable invisible information, as shown by
the fact that Google doesn't index META keywords and descriptions.)
I can see various possible ways out of this quandary (or perhaps
deeper into it).
#1 is to stick to the party line and say "Microformats are visible
for the reasons stated and there's no point encouraging or endorsing
anything that goes against that."
#2 would be to launch a distinct initiative (with its own site, wiki,
mailing list, and line of endearing plush toys) to define non-visible
things-that-are-like-microformats for those who want to take the risk
of incurring the wrath of Google, accompanied by the clear caveat
that "this stuff can get you banned from every decent search engine".
#3 would be to develop a microformat convention that allows you to
mark blocks of embedded data as being
not-for-search-engine-consumption; a 'no-index' convention at the
block level that would allow you to say "I'm not trying to spamdex
you, just ignore this bit and leave it for the robots that understand
this specific markup". The obvious flaw here is that it requires the
crawler-makers to tune their bots to handle it, and I doubt they will.
#4 would be to accept that 'buried' semantic data ought to be
external to the page, but define a way for a savvy search engine to
recover the data associated with any individual element. The kind of
marker required could be just as much a microformat as any of the
'rel' conventions. The objection here is that maintaining two
parallel documents is tedious (less so if you're not handbuilding
pages). If that's not an obstacle, a combination of a
microformat-endorsed marking convention and something like Andy
Mabbett's proposed UNAPI <http://unapi.info/> might be a solution.
#5 would be to investigate whether a proposed type of not-for-human
consumption data could be handled as visible data, but designed in
such a way that clever CSS styling would allow authors to render it
more palatable to human viewers (i.e. by pulling it out of page flow,
using :after content to surround it with human-intelligible labels).
It would seem to me that #4 and #5 might be worth considering in the
context of microformats; #2 and #3 probably wouldn't be (but that
doesn't mean that they're not worth considering in their own right).
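To make #3 a bit more concrete, here is a rough sketch in Python of what a
cooperating crawler might do with a block-level 'no-index' marker. The class
name "no-index", the helper indexable_text(), and the sample markup are all
made up for illustration; no such convention exists, and the sketch assumes
well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Hypothetical marker class; nothing in any microformats spec defines it.
NOINDEX_CLASS = "no-index"

# Void elements never get an end tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class IndexableTextExtractor(HTMLParser):
    """Collects text, skipping anything inside a block marked no-index."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a no-index block
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            # Already inside a skipped block: track nesting only.
            self.skip_depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if NOINDEX_CLASS in classes:
            self.skip_depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip_depth:
            self.chunks.append(text)


def indexable_text(html):
    """Return the text a cooperating crawler would feed to its index."""
    parser = IndexableTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = (
    '<p>Visible text</p>'
    '<div class="no-index"><span class="geo">51.5;-0.1</span><br>hidden</div>'
    '<p>More</p>'
)
print(indexable_text(page))
```

The extraction itself is the easy part. The flaw noted under #3 still stands:
this only helps if the crawler-makers agree to honour the marker.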