[uf-discuss] 2 billion hCards! gathering material for a "microformats.org turns 5" blog post

Peter Mika pmika at yahoo-inc.com
Tue Jul 6 01:27:03 PDT 2010

Hi Ed,

The comparison to the number of people online is misleading, because the 
microformat stats quoted (both the Google and Yahoo figures) include 
duplicate counting. One of my illustrative examples is 
news.stanford.edu, where the microformat annotation is in the template, 
and thus every single page has exactly the same microformat markup, i.e. 
the address of Stanford University.

To verify, try the query

searchmonkey:com.yahoo.page.uf.hcard site:stanford.edu

in Yahoo Search.
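
To make the duplicate-counting effect concrete, here is a minimal Python sketch. The URLs, the card strings, and the `count_hcards` helper are made up for illustration and do not reflect how Google or Yahoo actually compute their figures. It contrasts the number of pages carrying an hCard with the number of distinct hCards once template repeats are collapsed.

```python
from collections import Counter

# Hypothetical crawl sample: (page URL, normalized hCard content on that page).
# Three pages share one template-level card, as in the news.stanford.edu case.
pages = [
    ("http://news.example.edu/a", "Example University|450 Main St"),
    ("http://news.example.edu/b", "Example University|450 Main St"),
    ("http://news.example.edu/c", "Example University|450 Main St"),
    ("http://blog.example.org/about", "Jane Doe|jane@example.org"),
]

def count_hcards(pages):
    """Return (pages_with_hcard, distinct_hcards) for a list of (url, card) pairs."""
    cards = Counter(card for _url, card in pages)
    # Summing the counts gives the page-level figure; the number of keys
    # gives the figure after identical cards are collapsed.
    return sum(cards.values()), len(cards)

page_count, distinct_count = count_hcards(pages)
print(page_count, distinct_count)  # 4 2
```

A page-level count reports 4 here, while only 2 distinct cards exist, which is why comparing the page-level figure to a head count of people online overstates things.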

The second point to make is that RDFa usage is underreported by [1]. Compare




These indicate that there are 2.7B pages with RDFa compared to 2B pages 
with hCard. There are many caveats to these numbers, but they are more 
or less on equal footing.



Ed Summers wrote:
> On Sat, Jul 3, 2010 at 10:18 PM, Tantek Çelik <tantek at cs.stanford.edu> wrote:
>> Some additional recent news:
>> * microformats has 94% marketshare compared to alternatives (e.g.
>> RDFa) according to Google (announced at the Semantic Technology
>> conference)
>>  - http://www.readwriteweb.com/archives/google_semantic_web_push_rich_snippets_usage_grow.php
>>  - http://www.readwriteweb.com/images/richsnippets_june10b.jpg
> Was it clear if Google's stats were comparing all microformat usage
> with usage of only their particular rich snippet vocabulary [1]? I'd
> be surprised if it was *all* RDFa vocabulary use, since that would
> mean that Google are indexing all RDFa on the web. John Breslin asked
> a similar question in the comments on that RWW post [2].
> If it isn't clear, I'd probably refrain from citing the 94% market
> share statistic in the microformats-turns-5 post. Although I guess
> this sort of posturing is to be expected, and most people take it as a
> given that "there are three kinds of lies: lies, damned lies, and
> statistics.", especially in religious debates [3]
> The 2 Billion statistic is astounding, considering there are an
> estimated 1.8 Billion people online [3]. It makes me appreciate how
> important efforts are to give people the ability to identify, link, and
> unlink their online identities [4].
> //Ed
> [1] http://rdf.data-vocabulary.org/rdf.xml
> [2] http://www.readwriteweb.com/archives/google_semantic_web_push_rich_snippets_usage_grow.php#comment-219873
> [3] "There are three kinds of lies: lies, damned lies, and statistics."
> [4] http://code.google.com/apis/opensocial/
