microformats: Difference between revisions
|  (Added related principles, point about using UTF-8) | ScottReynen (talk | contribs)  | ||
Revision as of 04:30, 23 June 2006
microformats
What are microformats?
microformats are
- a way of thinking about data
- design principles for formats
- adapted to current behaviors and usage patterns ("Pave the cow paths." - Adam Rifkin)
- highly correlated with semantic XHTML, AKA real world semantics, AKA the lowercase semantic web, AKA lossless XHTML
- described by Tantek's recent presentation at SXSW: The Elements of Meaningful XHTML
- a set of simple open data format standards that a diverse community of individuals and organizations are actively developing and implementing for more/better structured blogging and web microcontent publishing in general.
- "An evolutionary revolution" - Ryan King
- all of the above.
microformats are not
- a new language
- infinitely extensible and open-ended
- an attempt to get everyone to change their behavior and rewrite their tools
- a whole new approach that throws away what already works today
- a panacea for all taxonomies, ontologies, and other such abstractions
- defining the whole world, or even just boiling the ocean
- any of the above
the microformats principles
- solve a specific problem
- start as simple as possible
- solve simpler problems first
- make evolutionary improvements
 
- design for humans first, machines second
- be presentable and parsable
- visible data is much better for humans than invisible metadata
- adapt to current behaviors and usage patterns, e.g. (X)HTML, blogging
- ease of authoring is important
 
- reuse building blocks from widely adopted standards
- semantic, meaningful (X)HTML. See SemanticXHTMLDesignPrinciples for more details.
- existing microformats
- well established schemas from interoperable RFCs
 
- modularity / embeddability
- design to be reused and embedded inside existing formats and microformats
 
- enable and encourage decentralized and distributed development, content, services
- explicitly encourage the original "spirit of the Web"
 
- Related Principles we re-use from other design paradigms
- DRY (Don't Repeat Yourself)
- Least Surprise
- Pareto Principle (80/20)
- Data Integrity.  One of the common objectives which many of the principles help achieve is data integrity.
- Visible data = more accurate data. Designing for humans first and making the data presentable means that people view and verify it. The data therefore tends to be more accurate to begin with (errors are quickly noticed by those viewing the pages/sites) and stays accurate over time: changes are noticed, and if data becomes out-of-date or obsolete, that is more likely to be noticed as well. This is in direct contrast to "side files" and invisible data like that contained in <meta> tags.
- Not repeating yourself (following DRY) - means there are fewer chances for inconsistency
- Multi-language integrity. Perhaps not a principle, but many of those involved with microformats have found that consistently using UTF-8 helps ensure that the human-readable text content itself is not corrupted, especially when it contains non-ASCII characters.
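As a rough illustration of "be presentable and parsable" and of the UTF-8 point, the sketch below reads data straight out of visible markup instead of from a side file or <meta> tags. It is not a parser defined by this page: the class names (vcard, fn, org), the sample text, and the helper names ClassTextExtractor and extract are all assumptions chosen for the example, and the code assumes well-formed XHTML in which every element is closed.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the visible text of elements that carry one of the wanted class names.

    Hypothetical helper for illustration; assumes well-formed XHTML
    (every element closed, e.g. <br /> rather than <br>).
    """

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.stack = []   # one entry per open element: matched class name or None
        self.found = {}   # class name -> visible text inside that element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        match = next((c for c in classes if c in self.wanted), None)
        self.stack.append(match)
        if match is not None:
            self.found.setdefault(match, "")

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Text belongs to every enclosing element that matched a wanted class.
        for name in self.stack:
            if name is not None:
                self.found[name] += data


def extract(page_bytes, encoding="utf-8"):
    """Decode the page with the given encoding and pull out the marked-up text."""
    parser = ClassTextExtractor(["fn", "org"])
    parser.feed(page_bytes.decode(encoding))
    parser.close()
    return parser.found


# The same text a human reads on the page is the data a program extracts.
page_bytes = (
    '<div class="vcard">'
    '<span class="fn">Zoë Müller</span> works at '
    '<span class="org">Example Café</span>'
    '</div>'
).encode("utf-8")

good = extract(page_bytes)                         # decoded as UTF-8, as written
garbled = extract(page_bytes, encoding="latin-1")  # wrong encoding assumed

print(good)
print(garbled)
```

Decoding the same bytes as Latin-1 still yields a result, but the non-ASCII characters in the names come out corrupted, which is the kind of silent damage that consistent use of UTF-8 is meant to avoid. Because the extracted text is also what readers see on the page, a wrong name or organization is likely to be spotted and corrected.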
 
 
current microformats
See the main page for a list of current microformats specifications, drafts, and discussions.
more thoughts on how microformats are different
There are plenty of existing formats that are almost entirely useless or ignored.
They're not totally useless, though. They illustrate what at least someone thought might be useful. Unfortunately, that someone is typically a lone inventor working a priori without any domain expertise.
At the other extreme, there are lots of corporate inventors with plenty of experience who over-design a format for what might be needed some day. In particularly bad cases, the corporate vendors collude to prevent openness and/or adoption by the open source community. Media standards often suffer from this kind of deliberate "strategic" positioning.
We seek to combat all of those problems with the microformat approach.
- We're not lone inventors; we're a community.
- We don't work a priori ("from reason alone"); we require documentation of existing examples and of previous attempts at formats. See process.
- When lacking domain expertise, we seek out the domain experts to provide it, and we immerse ourselves in examples and prior art from the domain (see previous point).
- We do our work in the open with open discussion forums.
- We're a diverse mix of corporate, independent, hobbyist, enthusiast.
- We don't over-design. We under-design, deliberately, and then only add things when they are absolutely necessary.
- We adopt very liberal copyright/licensing (CC, GMPG, IETF, W3C) and patent positions (RF, IETF, W3C).
- We ruthlessly self-criticize based on our principles in order to keep to the above.
Some ask what the purpose of the (intended) standards is.
Why do you need a purpose? More often than not, a premature focus on purpose distorts data formats towards a particular application, which may not be all that relevant. Hence, rather than focus on an a priori purpose, we focus on modeling existing behavior, knowing that additional structure will yield plenty of interesting uses, most of which we will not be able to predict in advance.
This is obviously a very different approach from traditional data format efforts.