Category: News

Google launches microformat-powered recipe search

Today, Google launched a new search feature: Recipe View!

The new search category lets users discover recipes from a variety of sources that have been marked up following the hRecipe specification, with a new level of accuracy. Google have made it easy for users to find recipes because authors are now making it easy for Google to locate recipe data within their web pages.

Wired reports:

“Our intent is to make better user experience to see if we can jumpstart this ecosystem,” Menzel said. “That way when someone thinks ‘Hey, I just invented a great recipe, let me put it on my blog,’ and that person’s recipe should be a candidate.”

But Menzel insists it’s got to be easy and that Google doesn’t want to push busy webmasters to do any work that won’t result in more traffic.

“This is really a pragmatic response to the dream of the semantic web,” Menzel said. “We would love if the XML world existed – it would make search awesome, but no one is going to do it. But we need to start somewhere, and a lot of the internet is built manually by people and their time is valuable.”

ReadWriteWeb also notes:

Google didn’t indicate if it has plans to expand this sort of markup into other search efforts, but it’s a good reason – at the very least for recipe publishers – to mark up your websites.

We’re very excited about this new feature from Google, and are pleased to see the organisation continue to support and implement open standards that are simple for authors everywhere to use.

For more information on the specification, check out the hRecipe wiki pages.

Facebook Adds hCalendar and hCard Microformats to Millions of Events

As of today, Facebook has marked up all events with the hCalendar microformat, and marked up their venues with hCard as well. According to a simple Google search, that’s millions of public events now with microformats (if anyone knows a more precise number for the total number of Facebook events including private ones, or how many events are created per day, please let us know!).

Here’s an upcoming example: the Great British Booze-up at SXSW 2011 (which I highly recommend if you’re into high fidelity markup, as a bunch of us will be there sharing drinks and post-future-panel microformats conversations).

Visit it in a browser that supports microformats (there are now plugins for Chrome, Firefox, Internet Explorer, and Safari). E.g. try Firefox 4 with the Operator Add-On, and you’ll see “Events (1)” in the toolbar. Just one click reveals “Great British Booze-up” which you can then choose to export to your iCal, 30 Boxes, Google, or Yahoo calendars.

And if you’re a user, that’s it. The ability to copy events to where you want them. If you publish hCalendar, Google will index and show rich snippets for your events. That’s what microformats empower you to do.

For the web authors/designers/developers out there, let’s take a closer look at Facebook’s markup. If you view source on the page and search for “vevent”, you see the following code:

<body class="vevent ...

Which indicates that this page is an event. Searching further in the source you’ll find that Facebook generates the rest of the body from a script. While this isn’t ideal (in general it’s better to have the markup in markup), apparently there are sometimes performance reasons for script-generated content. No matter, we can use Firefox’s “View Selection Source” context menu to view the generated markup.

I want to call out two specific aspects of Facebook’s implementation. Select the entire date time row: “Time Monday, March 14 · 6:00pm – 9:00pm” and right/control-click on it and choose “View Selection Source”. Here’s most of the markup you’ll see (white-space added for readability).

 <div>Monday, March 14 · 
  <span class="dtstart">
   <span class="value-title" title="2011-03-14T18:00:00"> </span>6:00pm</span> - 
  <span class="dtend">
   <span class="value-title" title="2011-03-14T21:00:00"> </span>9:00pm</span>
 </div>

Minor update 2011-02-18: The “-07:00” timezone offsets were removed as they were not reliably accurate. Better to omit the timezone offset, use floating datetimes and let the consumer infer timezone from location as needed.

Encoding dates and times that work for humans and machines has been one of the biggest challenges in microformats, and what you’re seeing here is the result of years of community iteration (techniques, feedback, research) called the ‘value-title’ technique of the value class pattern. In short, by placing the machine readable ISO datetime into the title attribute of a harmless empty span element adjacent to the human readable date and time, we are able to achieve a pragmatic balance between user experience, content fidelity, and minimizing the effects of what is essentially duplicating the data (a DRY violation, something we avoid due to potential inconsistencies unless absolutely necessary for a greater principle, such as usability – humans first as it were).

This is the largest deployment of the ‘value-title’ technique known to date, and works great with the value class pattern support in Operator.
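For anyone writing a consumer, here is a minimal Python sketch of reading those datetimes back out, using only the standard library's html.parser. The names ValueTitleParser and parse_event_times are made up for illustration. The sketch only handles a value-title span nested directly inside a dtstart or dtend span; the full value class pattern has more parsing rules than this, and this is not how Operator or any other tool is implemented.

```python
from datetime import datetime
from html.parser import HTMLParser

# Markup adapted from the Facebook excerpt above.
EVENT_HTML = """
<div>Monday, March 14 ·
  <span class="dtstart">
    <span class="value-title" title="2011-03-14T18:00:00"> </span>6:00pm</span> -
  <span class="dtend">
    <span class="value-title" title="2011-03-14T21:00:00"> </span>9:00pm</span>
</div>
"""


class ValueTitleParser(HTMLParser):
    """Collects value-title datetimes found directly inside dtstart/dtend spans."""

    PROPERTIES = ("dtstart", "dtend")

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open <span>: a property name or None
        self.values = {}  # property name -> machine-readable datetime string

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        parent = self.stack[-1] if self.stack else None
        title = attrs.get("title")
        # The machine-readable value lives in the title of the empty span.
        if "value-title" in classes and parent and title:
            self.values.setdefault(parent, title)
        prop = next((p for p in self.PROPERTIES if p in classes), None)
        self.stack.append(prop)

    def handle_endtag(self, tag):
        if tag == "span" and self.stack:
            self.stack.pop()


def parse_event_times(html):
    """Return a dict of the dtstart/dtend values found in the given HTML."""
    parser = ValueTitleParser()
    parser.feed(html)
    parser.close()
    return parser.values


times = parse_event_times(EVENT_HTML)
start = datetime.fromisoformat(times["dtstart"])
end = datetime.fromisoformat(times["dtend"])
print(times)
print("duration:", end - start)
```

Because the values carry no timezone offset, they parse as floating (naive) datetimes, which matches the advice in the update note above: a consumer can infer the timezone from the venue if it needs one.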

Let’s take a look at the venue. View Selection Source on the entire block from “Location Shakespeare’s Pub” to “78701” and you’ll see (again with whitespace added):

 <div class="location vcard">
  <span class="fn org">Shakespeare's Pub </span>
  <div class="adr">
   <div class="street-address">314 East 6th Street</div>
   <div class="locality">Austin, TX 78701</div>
  </div>
 </div>

This is an excellent example of using a nested hCard for an hCalendar event venue, except for one thing:

   <div class="locality">Austin, TX 78701</div>

Which should be marked up more like:

   <span class="locality">Austin</span>, 
   <span class="region">TX</span>
   <span class="postal-code">78701</span>

I’m guessing what’s going on here is too coarse an interface between a backend and a frontend system. That is, for convenience the developers may be retrieving the entirety of the city, state, and zip from their backend as a single string, and thus the best the front end can do is to mark up the entire thing as the city (locality).
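If the front end really does receive a single string, one option is to split it before marking it up. The following is a rough Python sketch of that idea, not a description of Facebook's code: the function names split_city_line and adr_markup are hypothetical, and the pattern assumes a US-style "City, ST 12345" line. Anything that doesn't match falls back to the coarse markup rather than guessing.

```python
import re
from html import escape

# Assumption: a US-style line such as "Austin, TX 78701" (optionally ZIP+4).
US_CITY_LINE = re.compile(
    r"^\s*(?P<locality>[^,]+?)\s*,\s*"
    r"(?P<region>[A-Z]{2})\s+"
    r"(?P<postal_code>\d{5}(?:-\d{4})?)\s*$"
)


def split_city_line(line):
    """Return locality/region/postal_code as a dict, or None if the line doesn't fit."""
    match = US_CITY_LINE.match(line)
    return match.groupdict() if match else None


def adr_markup(line):
    """Mark up a city line with hCard adr sub-properties when it can be split."""
    parts = split_city_line(line)
    if parts is None:
        # Unknown shape: keep the single coarse property instead of guessing.
        return f'<div class="locality">{escape(line)}</div>'
    return (
        f'<span class="locality">{escape(parts["locality"])}</span>, '
        f'<span class="region">{escape(parts["region"])}</span> '
        f'<span class="postal-code">{escape(parts["postal_code"])}</span>'
    )


print(adr_markup("Austin, TX 78701"))
```

Splitting on the way out is a workaround at best. Passing the three fields separately from the backend would avoid pattern matching on addresses altogether.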

While not ideal, this isn’t horrible either. Using Operator again, choose “Export Contact” for Shakespeare’s Pub, and note that your address book displays it just fine (even if the fields aren’t exactly in the right spots). Copying and pasting that address to a map site also works. The markup isn’t ideal, but it’s usable and useful, and I for one am happy that Facebook chose to go ahead and make that pragmatic decision and ship now, while knowing they could iterate and improve data fidelity in a subsequent update.

Facebook’s deployment of hCalendar is just the latest in their series of slow but steadily increasing support for open standards and microformats in particular. Over two years ago Facebook added hCard support to their user profiles. Last year they announced support for OAuth 2.0, as well as adding XFN rel-me support to user profiles, thus interconnecting with the world wide distributed social web. They proudly documented their use of HTML5. And now, millions of hCalendar events with hCard venues. Looking forward to seeing what they support next.

Well done Facebook, and keep up the good work.

Wiki Updates In IRC Again: Welcome Loqi

Inspired by the Wikipedia community, in the early days of microformats.org we had a bot, mfbot, set up by microformats.org co-founder and admin Ryan King, that reported wiki changes to the #microformats IRC channel. mfbot performed an invaluable role, both helping newcomers see what areas of the wiki were active, and helping admins quickly revert and block spammers. Unfortunately mfbot turned out to require just enough handholding to make it too much work to keep running over the long run (the cognitive surplus of our all-volunteer admins tends to be quite limited; we’re busy folks).

Enter Loqi. While we were setting up indiewebcamp.com (which all microformatters should check out) today, Aaron Parecki set up his IRC bot, Loqi, to report IndieWebCamp wiki changes to #indiewebcamp. It turns out Loqi is running continuously and can handle watching changes across multiple wikis, notifying the appropriate IRC channels.

So Loqi is now watching the changes from the microformats wiki and reporting them to our IRC channel (click to join).

Welcome Loqi!

Seven Year Itch: What’s next for microformats

7 years ago Kevin Marks and I presented “real world semantics” in an after-hours open sign-up slot at O’Reilly’s Emerging Technology (ETech) conference and first publicly introduced “microformats” to the world.

To put it in perspective, that very same ETech saw the launch of Flickr, which just celebrated its 7th birthday yesterday. Happy Birthday Flickr!

Since then we’ve had many ups and downs in microformats, learned many hard lessons in community management, and seen billions of pages add microformats, the adoption of new microformats like hRecipe, search engine support, and widespread adoption by the SEO crowd (certainly quite a measure of having “made it”).

We’ve seen the emergence of competing syntax standards like RDFa and microdata, both of which have sought to solve general purpose problems beyond the common use cases addressed by microformats. We’ve also seen the rise of proprietary non-standard (AKA “snowflake”) APIs, and Facebook’s standards-based Open Graph Protocol (OGP).

From a cultural perspective, the launch of microformats.org in 2005 inspired and influenced the creation of numerous additional independent community-based standards efforts outside traditional organizations like W3C and IETF. From OAuth (which Twitter and others now depend on) to OEmbed, from the HTML Design Principles (many of which echoed microformats principles), to perhaps most recently ActivityStreams, and their real-world pragmatic microformats-process-like approach to introducing new object types and verbs.

Despite all these successes, there are still some longstanding issues affecting microformats, both the formats themselves overall, and the community. Upon reflecting on the past 7 years as well as learning from what’s worked (and not) in other open standards and open source organizations (W3C, IETF, Mozilla, WHATWG) it’s clear to me that these community and cross-format issues are what need to be addressed for microformats to continue advancing.

I’m not going to attempt to explore all the issues in a single blog post. Suffice it to say there are concrete issues around specification stages, simplifying microformats (use of, parsing, extending), and overall community participation models that need attention.

One aspect of microformats that has stood the test of time is the set of microformats principles. The first principle encourages us to solve a specific problem. That principle applies to processes as well as formats and as such the path forward is to solve specific, real world overall microformats problems, one at a time.

I’ve chosen document stages as the first such specific overall problem to solve, documented a real-world problem statement, and begun brainstorming more precise definitions for “draft”, “specification”, and a new one, “standard”. These definitions will update, expand, and clarify the microformats process and what we mean by microformats specifications. If you’re curious, you can take a look at the brainstorming in progress.

These new document stages are just one of many updates to microformats that we are working on and will be introducing and implementing in the coming months. Stay tuned: join the IRC channel, mailing lists, and follow @microformats on Twitter.

Mark your calendars: in just over a month’s time, fellow admins Frances Berriman (Nature Publishing Group) and Ben Ward (Twitter), Paul Tarjan (Facebook), and I will present a panel at SXSW Interactive 2011 on “The Future of Microformats”, where a lot of this and more will be discussed.

Join us and help shape the future.

microformats.org at 5: Two Billion Pages With hCards, 94% of Rich Snippets

The microformats.org community recently celebrated its 5th birthday – five plus years of openly researching, creating, and iterating on web standards to express common semantics designed for humans first, machines second.

Two Billion pages with hCards

Originally brainstormed in September 2004, and rapidly adopted by numerous tools and by sites large and small, hCard is now on more than 2 billion pages: the number of pages published with one or more hCards crossed that mark a few days ago according to Yahoo Search Monkey, making it the most popular format for people or organizations on the web:

screenshot of Yahoo Search Monkey search results for pages with hCards showing just over 2 billion pages with hCards, taken 2010-07-03 at 7pm Pacific Time

Search Monkey’s results do tend to fluctuate a few percentage points, even hour by hour, so you may see different numbers, both lower, and over time, higher and higher. Here are a few recent hCard deployments that no doubt contributed to crossing the two billion mark:

1. Basecamp adds hCards: people and companies

Just a few days ago, Jason Zimdars of 37 Signals reported that Basecamp has been updated to support hCards for people and companies, and is now looking into more uses:

I’m pretty happy with this added functionality so I intend to explore using hCards in other parts of our apps where it makes sense.

Thanks to Jeremy Keith for making the request and following-up with 37 Signals.

2. All .tel domains now support hCard

And just yesterday Telnic announced that all .tel names now support the hCard microformat.

3. Over 14 Million Gravatar Profile hCards

About a month ago, Automattic’s Gravatar launched public, linkable profiles for all WordPress.com users, beautifully presented and marked up with hCard, e.g. check out Beau Lebens’s profile:

screenshot of Beau Lebens's Gravatar profile loaded in Firefox with the Operator toolbar showing one hCard

That’s another 14+ million hCards (figure from WordPress.com), each representing an individual blogger on the public web.

4. Over 20 Million BrightKite hCards

Finally, just before microformats.org’s 5th birthday on this past June 20th, developers of BrightKite informed us that they’ve fully implemented hCard on all of their 5.5 million registered user profiles and 16.5 million venue pages – another 22 million new hCards. Thanks for the birthday present BrightKite!

94% of rich snippets markup

All of these deployments come from the powerful combination of: 1. microformats’ ease of authoring (the easiest way to semantically mark up people, venues, etc. in HTML), and 2. the fact that search engines like Yahoo and Google index microformats and make them visible in their user interfaces.

In May of 2009 Google launched Rich Snippets with support for microformats and RDFa, with a set of content partners like Yelp who all chose to use microformats to produce rich snippets in Google search results.

screenshot fragment of a Google Rich Snippet of a Yelp search result showing average rating and number of reviews from their use of the hReview-aggregate

Starting with support for hCard, hReview, hReview-aggregate, and hProduct, over the past year, Google added support for hCalendar and hRecipe as well.

For all of these, Google provided side-by-side examples for each snippet type in multiple formats (microformats, RDFa, microdata), which in many ways has helped to demonstrate how much simpler/easier microformats are in many respects (and some of the promise that microdata shows for more general extensibility).

As recently reported by ReadWriteWeb, Google themselves reported at the Semantic Technologies conference that when Google finds data for rich snippets on pages, 94% of the time that data is marked up with microformats (40,091 vs. 2,514, conservatively assuming none of those pages contain both; if they did, the 94% number would be even higher).

photograph of a slide presented by Google at the Semantic Technologies conference showing a table of sources of rich snippets comparing microformats, about 40k total, vs. RDFa at about 2.5k total.

Photo credit: Read Write Web: Google’s Semantic Web Push: Rich Snippets Usage Growing.

The numbers comparing hCard vs. alternative person markup are particularly staggering:

  • ~30x more person snippets use hCard (33,675 vs. 1,160).

This is no surprise: The State of Web Development 2010 survey showed nearly an order of magnitude gap, with far more (6x more) web developers using microformats in their day-to-day work (34.52% use microformats vs. 5.63% who use RDFa, per the survey).

Given many more web developers are using microformats, it’s not surprising that Google is finding more microformats than alternatives. What is interesting though is that while 6x more developers use microformats, Google is finding 16x more microformats for rich snippets than alternatives.

One could conclude from these two numbers that developers using microformats are 2-3 times more net productive in terms of number of pages produced with rich snippets. This net productivity could be because microformats are easier (take less time) to author, and possibly that microformats are easier to get right, and thus have Google recognize them, as compared to alternatives.
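For those who want to check the arithmetic, here is a small Python sketch that recomputes the ratios from the figures quoted above. The variable names are mine, and the "net productivity" figure is simply the page ratio divided by the developer ratio, which is my reading of how the 2-3 times estimate is derived.

```python
# Figures quoted above from the Google slide and the survey.
microformats_pages = 40_091
rdfa_pages = 2_514
hcard_person_snippets = 33_675
other_person_snippets = 1_160
devs_using_microformats_pct = 34.52
devs_using_rdfa_pct = 5.63

# Share of rich snippet pages using microformats, assuming no page has both.
share = microformats_pages / (microformats_pages + rdfa_pages)

# How many times more pages, and how many times more developers.
page_ratio = microformats_pages / rdfa_pages
dev_ratio = devs_using_microformats_pct / devs_using_rdfa_pct

# Pages per developer, relative to the alternative.
net_productivity = page_ratio / dev_ratio

person_ratio = hcard_person_snippets / other_person_snippets

print(f"microformats share of rich snippet pages: {share:.1%}")  # 94.1%
print(f"page ratio: {page_ratio:.1f}x")                          # 15.9x
print(f"developer ratio: {dev_ratio:.1f}x")                      # 6.1x
print(f"net productivity: {net_productivity:.1f}x")              # 2.6x
print(f"person snippet ratio: {person_ratio:.1f}x")              # 29.0x
```

The results line up with the rounded numbers in the text: about 94%, about 16x, about 6x, and a net productivity of roughly 2.6, which sits inside the 2-3 times range.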

Making Microformats Even Simpler

Still, we can do even better than that. And no, I’m not just talking about going from 94% to 99+%.

The Google presentation slide noted that the results were out of one million web pages sampled from the Internet. Out of that, only ~40,000 had microformats. Given that nearly every web page mentions people, organizations, events, or some other popular microformat, that number should be much higher.

Thus there is much room for us to improve. In particular, based on feedback from Google, Yahoo, and numerous smaller companies and independent web developers, we can and should make microformats even simpler: simpler to write, easier to get right, and ideally, even more micro – less code, less page weight. Starting with a few ideas brainstormed a couple of months ago, there are now a few folks working on a “microformats 2.0” to achieve these goals.

Do you have feedback or ideas about how microformats could be made even simpler and easier for authors?

Please add your thoughts to the “microformats-made-simpler” wiki page.

Have you implemented hCard profiles on your site?

Add your site to the hCard supporting user profiles page.

Thanks to all of the hard work and contributions by everyone in the microformats community for an excellent fifth year of microformats.org. Here’s looking forward to even more microformats accomplishments in our sixth year.