misconceptions

From Microformats Wiki
Revision as of 22:38, 6 November 2007 by Tantek (talk | contribs) (added misconception: current behaviors and usage patterns in general)

misconceptions about microformats

Some misconceptions appear often enough, or have been held by people who are experts in markup, web standards, etc., that it is useful to document and debunk them.

misconceptions

microformats use unscoped class names

misconception: Microformats definitions use unqualified/unscoped class attribute strings as semantic tags.

This is incorrect on two counts.

First, microformats make use of profile URLs to define class name semantics. Thus it is entirely possible (though both unlikely and undesirable) for someone to redefine class names with their own definitions at their own profile URL.

Second, compound microformats (e.g. hCard, hCalendar, hReview) use a fairly uniquely chosen string for the "root element" (e.g. vcard, vcalendar, vevent, hreview) and use generic terms only inside that root element. Thus any use of generic terms is scoped (some might even say "namespaced") within the fairly uniquely chosen "root element" class name.
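
The scoping described above can be illustrated with a minimal Python sketch. It is not how any particular microformats tool is implemented: the ScopedExtractor class, the extract helper, the sample page, and the short list of generic terms are all assumptions made for illustration, and the parser assumes well-formed markup. The idea is only that a generic class name such as fn counts when it appears inside an element carrying a root class name, and is ignored elsewhere.

```python
from html.parser import HTMLParser

# Root class names mentioned in the text above.
ROOT_CLASSES = {"vcard", "vcalendar", "vevent", "hreview"}
# A short, illustrative set of generic terms (an assumption, not a full list).
GENERIC_TERMS = {"fn", "org", "summary"}
# Elements with no end tag; they are never pushed onto the stack.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ScopedExtractor(HTMLParser):
    """Hypothetical extractor: generic terms count only inside a root element."""

    def __init__(self):
        super().__init__()
        self.open = []   # one dict per currently open element
        self.found = []  # (root, term, text) triples

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = set((dict(attrs).get("class") or "").split())
        enclosing = self.open[-1]["root"] if self.open else None
        roots = sorted(classes & ROOT_CLASSES)
        # The nearest root is this element's own root class, else the inherited one.
        root = roots[0] if roots else enclosing
        # Generic terms are ignored unless some root class is in scope.
        terms = sorted(classes & GENERIC_TERMS) if root else []
        self.open.append({"root": root, "terms": terms, "text": []})

    def handle_data(self, data):
        # Every open element accumulates the text it contains.
        for element in self.open:
            element["text"].append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.open:
            return
        element = self.open.pop()
        text = "".join(element["text"]).strip()
        for term in element["terms"]:
            self.found.append((element["root"], term, text))


def extract(html):
    """Return the (root, term, text) triples found in an HTML string."""
    parser = ScopedExtractor()
    parser.feed(html)
    parser.close()
    return parser.found


# Sample page (invented): "summary" and one "fn" are used outside any root.
SAMPLE_PAGE = """
<div class="summary">Site news</div>
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="org">Example Corp</span>
</div>
<p class="fn">footnote styling, unrelated to hCard</p>
"""

if __name__ == "__main__":
    for root, term, text in extract(SAMPLE_PAGE):
        print(root, term, text)
```

In this sketch only the fn and org values inside the vcard element are collected; the summary and fn classes used elsewhere on the page for unrelated purposes are skipped because no root class name is in scope.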

microformats use non URI based extensibility

misconception: Microformats use non-URI based extensibility.

Microformats use the profile attribute on the <head> element (per the HTML4 spec) to reference one or more XMDP profile documents (XMDP is derived from the "hints" in HTML4 as to what a profile document "could" be). Those profiles define specific rel values (e.g. the XFN 1.1 profile) and class values (e.g. the hCard profile).
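
As a rough illustration of what referencing profiles from the <head> element looks like to a consuming tool, here is a minimal Python sketch that reads the whitespace-separated profile URIs from a page. The ProfileReader class, the profile_uris helper, and the sample page are hypothetical; http://gmpg.org/xfn/11 is the XFN 1.1 profile URI, while the example.org URI is a placeholder for an author's own profile.

```python
from html.parser import HTMLParser


class ProfileReader(HTMLParser):
    """Hypothetical helper: collects the profile URIs declared on <head>."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            # HTML4 allows several profile URIs separated by white space.
            value = dict(attrs).get("profile") or ""
            self.profiles.extend(value.split())


def profile_uris(html):
    """Return the list of profile URIs declared by an HTML document."""
    reader = ProfileReader()
    reader.feed(html)
    reader.close()
    return reader.profiles


# Sample page (invented for illustration).
PROFILED_PAGE = """
<html>
<head profile="http://gmpg.org/xfn/11 http://example.org/my-profile">
  <title>Blogroll</title>
</head>
<body><a href="http://example.com/" rel="friend met">A friend</a></body>
</html>
"""

if __name__ == "__main__":
    for uri in profile_uris(PROFILED_PAGE):
        print(uri)
```

A tool that wanted to respect profiles could check this list before interpreting rel or class values; a page with no profile attribute simply yields an empty list.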

Thus microformats are built upon a form of URI based extensibility. Tantek did this by design for XFN, his first experiment into formally extending HTML, and before he even coined the word "microformat" (XMDP & XFN were developed in 2003, "microformats" were first proposed in 2004).

What we have found is that, just as HTML was often used in the wild without explicit DOCTYPE URIs (and tools, e.g. browsers, supported it), microformats are often used in the wild without explicit profile URLs (and tools, e.g. browser plugins, support it).

Missing a DOCTYPE does little or no damage today, as (modulo tag soup issues) the DOCTYPE is a link in the chain of reasoning about what the document means. It has been asserted that the HTML profile for microformats is, however, a crucial link, which is perhaps similar to the assertion made by the SGML community back when HTML was introduced that the DOCTYPE for HTML is a crucial link. The two situations are closely parallel.

However, despite how browsers make good sense of HTML sans DOCTYPEs today, witness how almost no general user-centric user agents have been built to make sense of the babel of XML sans DOCTYPEs that is being published. Given the failure of XML use in practice to make use of URI based extensibility, and the subsequent absence of any widespread user-centric user agents (e.g. browsers) that make use of that content, the lesson to learn here is that it is important to use the profile attribute for microformats, and to encourage its use.

The XMDP spec and the GRDDL spec show how to make a profile, and how generic data clients can follow it, either to ground the data into RDF or to use the data directly as microformats with terms defined by their XMDP+ID URIs. This will maximize re-use of the data in combination with other data. There is a growing class of GRDDL-aware systems which will use GRDDL-enabled microformat data without any alteration.

microformats tools will erroneously pick up data

misconception: One danger of omitting profiles is that, because tools such as browser plugins support microformats without checking for a profile, those tools will erroneously pick up data from pages which use classes for a completely unrelated purpose. This attributes to the author information which they never meant to give.

This scenario is highly unlikely, and has yet to occur on the real web, because tools look for the fairly uniquely chosen root class names of microformats before they look for the more generically named classes inside them.

no generic data gathering device can be built

misconception: The other danger of omitting profiles is that no generic data-gathering device can be built. The web ceases to be self-describing, in that there then would be no one common algorithm for deriving the data from a given page.

It is difficult to disprove a negative statement such as the first sentence, and this is strictly a theoretical problem, as no generic data-gathering device has been built.

Similarly, the web is not self-describing currently, given the widespread use of HTML, tag soup, invalid XML, etc. Saying that it "ceases" to be self-describing is therefore misleading, because there is no "ceasing". If anything, by increasing the semantics expressed on the existing web, the use of microformats increases how much each page self-describes its own semantics.

do not scale when domain specific microformats are added

misconception: Microformats do not scale when domain-specific, or culture-specific, or company-specific microformats are added.

Microformats are not trying to solve all problems; in fact, that is a specific non-goal. See the microformats principles. In practice, one does not have to solve all problems, nor even make it possible to solve all problems.

Microformats are trying to represent the 80/20 of semantics on the public web, and thus solve most problems that will actually help most people on the public web.

For domain-specific, or culture-specific, or company-specific semantics, authors should simply make use of the best POSH that they can, with their own profiles and profile URLs. If those semantics become widely adopted on the web, that may provide a good case for taking their POSH through the microformats process to develop a new microformat.


current behaviors and usage patterns in general

More than once, folks have made the overgeneralization that microformats are adapted to "current behaviors and usage patterns" in general, and then used that overgeneralization to justify:

  • current behaviors of user agents
  • current browser usage patterns
  • other generalizations

as design centers for microformats. Unfortunately this is incorrect and dilutes the focus of microformats.

A quote taken out of context, such as the following from the microformats about page, has been used to justify the overgeneralization:

microformats are: [...] adapted to current behaviors and usage patterns

This precisely demonstrates the "taking out of context" logical flaw. If you read the entire quote:

Designed for humans first and machines second, microformats are a set of simple, open data formats built upon existing and widely adopted standards. Instead of throwing away what works today, microformats intend to solve simpler problems first by adapting to current behaviors and usage patterns (e.g. XHTML, blogging).

You can see that the "current behaviors and usage patterns" specifically applies to *markup*, *content*, and *publishing* "(e.g. XHTML, blogging)".

Microformats adapt to current content publishing behaviors and markup usage patterns, not to current behaviors and usage patterns in general. The principles and the process make this clearer.

This is actually a very important distinction, as focusing on the content publishing side of things is one of the ways microformats greatly succeed. By focusing on making things easier for publishers (rather than developers, parsers, browser vendors, etc.), microformats lower barriers for the most people. Specifically, lowering barriers for publishers to publish semantic content (from POSH on up) helps solve the chicken-and-egg problem that such content often suffers from, as publishers will often do something if it has at least some benefit and is very easy to do.

see also