If this is your first visit, please see the Introduction to Microformats page first.
<entry-title>The microformats process</entry-title>
So you wanna develop a new microformat?
Or just a new vocabulary?
Or create a new standard based on empirical research and scientific methods?
This document will help guide you through the steps to take towards achieving these goals.
The microformats process has been cited as inspiration for other standards groups and efforts such as the WHATWG and ActivityStreams.
- Tantek Çelik
- 1 Get some experience first
- 2 Why?
- 3 Check Exploratory Discussions For Previous Work
- 4 Document Current Behavior
- 5 Document Previous Efforts
- 6 Brainstorm Proposals
- 7 Iterate
- 8 Naming considerations
- 9 Pages to create
- 10 Other Documents
- 11 Related
Get some experience first
Before even considering pursuing a new microformat, first:
- Make your site posh.
- Add existing microformats to your sites like hCard for your contact info etc., hCalendar for your events, hAtom for your episodic content (e.g. blogs). See Get Started for more specific examples of adding microformats to your sites.
This will help familiarize you with how posh and microformats currently work. Such "real world" experience will greatly help you with the development of a new microformat. For more on this see why-using-existing-matters.
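To make "adding existing microformats" a little more concrete, here is a minimal sketch in Python. The markup uses the hCard class names `vcard`, `fn`, and `org`; the person and organization are made-up placeholders, and `ClassTextCollector` is a hypothetical toy helper, not an official or spec-complete parser.

```python
from html.parser import HTMLParser

# Placeholder contact details; only the class names (vcard, fn, org) come from hCard.
HCARD_HTML = """
<div class="vcard">
  <span class="fn">Jane Example</span>
  <span class="org">Example Widgets</span>
</div>
"""

class ClassTextCollector(HTMLParser):
    """Collect the text found directly inside each classed element (toy parser)."""

    def __init__(self):
        super().__init__()
        self.open_classes = []   # stack of class lists, one entry per open element
        self.properties = {}     # class name -> collected text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.open_classes.append(classes)

    def handle_endtag(self, tag):
        if self.open_classes:
            self.open_classes.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.open_classes:
            return
        # Attribute the text to the classes of the innermost open element only.
        for name in self.open_classes[-1]:
            self.properties[name] = (self.properties.get(name, "") + " " + text).strip()

collector = ClassTextCollector()
collector.feed(HCARD_HTML)
collector.close()
print(collector.properties)  # {'fn': 'Jane Example', 'org': 'Example Widgets'}
```

The point is only that the same visible text a human reads is what a machine picks up through the class names.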
Then, ask yourself:
Why?
There must be a problem to be solved (i.e. a real world use case). No problem, no microformat.
Once you've found your 'problem,' ask yourself: 'is there a simpler problem here?' If so, let's solve that problem first. We want to deal with the simplest problems first and only then build up to more complex problems.
Perhaps someone else is pursuing (or has pursued) a similar problem.
Check the Exploratory Discussions page, and search around on the web. Chances are that someone else has encountered the same problem as you. There are very few truly new problems.
If you still believe that you have a new and unsolved problem, post something to the #microformats chat channel, and if no one is around, then the microformats-new mailing list. It's better to get the community involved in the discussion. Start the discussion before you start creating any pages on the wiki.
We're not using the wiki as a general "scratch pad". If you can't summarize the problem you are trying to solve in a short email, and feel like you need a long document, you're probably trying to solve too big of a problem - see previous point about 'is there a simpler problem here?' and simplify until you describe your problem and real world use case in a short paragraph.
Check Exploratory Discussions For Previous Work
It may be possible that someone else has tried (or is trying) to solve the same real world problems as you.
Check the Exploratory Discussions page before starting new pages for your effort to see if you can add to / iterate on an existing effort before creating a new one.
Document Current Behavior
Document examples of current human publishing behavior.
Why examples first? Read that. It's important and gets to the core of what makes microformats efforts different than many/most previous/other format development efforts.
We're paving the cowpaths: before you do that, you have to find the cowpaths.
By cowpaths here we mean real world publishing examples. Your examples page (see Best Practices for Examples Pages) should be a collection of real world sites and pages which are publishing the kind of data you wish to structure with a microformat. From those pages and sites, you should extract markup examples and especially the schemas implied therein, and provide analysis.
This collection of examples should be public, preferably on the microformats wiki because it's unlikely you can do it by yourself (no matter how many of you there are), and having others even just check your work will help improve its quality.
The reviews-formats page is a good example of research done before the creation of a microformat. Before developing hReview, the collaborators went out, documented current practices around reviews on web sites, and provided some analysis of the schemas implied therein.
Note: these examples are about the content (and schemas implied therein) not about formats, class names, pre-existing standards etc. That comes next.
Study the examples and determine the implied (not explicit) schema of data that is being published (the conceptual properties about the data, not literal class names). Sort these by frequency and determine which appear in roughly 80% of cases (per the 80/20 rule, also known as the Pareto principle).
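The frequency analysis can be sketched in a few lines of Python. Everything in `examples` below is invented for illustration (it is not real research data), and `common_properties` is a hypothetical helper; the 0.8 threshold simply mirrors the "roughly 80%" guideline.

```python
from collections import Counter

# Hypothetical research notes: the conceptual properties seen on each example page.
examples = [
    {"item name", "rating", "reviewer", "date"},
    {"item name", "rating", "reviewer"},
    {"item name", "rating", "summary"},
    {"item name", "rating", "reviewer", "tags"},
    {"item name", "reviewer", "date"},
]

def common_properties(examples, threshold=0.8):
    """Return (property, share) pairs seen in at least `threshold` of the examples,
    most frequent first (ties broken alphabetically)."""
    counts = Counter(prop for example in examples for prop in example)
    total = len(examples)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(prop, count / total) for prop, count in ranked if count / total >= threshold]

for prop, share in common_properties(examples):
    print(f"{prop}: {share:.0%}")
```

With this made-up data, "item name", "rating", and "reviewer" make the cut, while "date", "summary", and "tags" fall into the 20% that the schema would deliberately leave out.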
If you can't find real world publishing examples of the type of data you want to represent using a microformat, then stop here. It doesn't need a microformat.
Document Previous Efforts
Once you've documented real world publishing examples of the kinds of data you want to represent, the next step is to research efforts at developing formats for those kinds of data.
There are almost always previous efforts at formats for whatever kind of data you want to represent using microformats.
Learning from previous format efforts is important, because it helps:
- minimize repeating mistakes made by others
- explore why a previous format succeeded (or didn't)
- provide a source of vocabulary to express a new microformat
In particular, ask yourself: "are there any well established, interoperably implemented standards we can look at which address this problem?"
For example, hCard and hCalendar were built on top of the IETF standards for vCard and iCal, respectively, both of which are widely interoperably implemented, and dominant in their space (there were no competing formats with anywhere near the same levels of adoption).
The developers of those standards had already spent many years in standards committees arguing about and developing their schemas. Better to leverage all the hard work that others have done before you, than to go off as a solo cowboy inventor, and waste time repeating all their mistakes. It's also much easier to start from a well established schema, and map it into semantic HTML, than to develop a new schema.
It's quite possible during this step that you'll find someone else who has dealt with the problem(s) you're addressing. Perhaps even solved it/them. Do your best to open a dialog with others who have encountered the same problem(s). We don't want to build walls between competing communities - we want people to work together to develop a good solution which will cover the majority of cases.
Brainstorm Proposals
By now you've researched previous examples and previous formats and you're finally ready to write up brainstorms towards a new microformat.
There are other things to try before developing a microformat. First, ask yourself these questions:
- Is there a standard element in HTML that would work?
- Is there a compound of HTML elements that would work?
If so, document these on the brainstorming page, and stop. There is no need for a microformat.
Let's not unnecessarily reinvent what you can already do with HTML.
For more details on semantic HTML, examples of using HTML elements, and constructing HTML (and XHTML) compounds, see The Elements of Meaningful XHTML.
Otherwise, if you can clearly and confidently answer "no" to the above two questions, we can talk about a microformat.
Before you start your act of creation, familiarize yourself with the microformats principles.
There are basically two steps to writing up a brainstorm proposal:
- Go back to that list of real world publishing examples that you researched, and collect the 80/20 of the implied conceptual schema from the examples.
- Give each of those concepts a property name, re-using from, in order:
  - existing microformats
  - previous formats
  - simple but specific human readable English words and phrases
Congratulations, you've written up a brainstorm proposal with a list of class names for a possible microformat.
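The naming order above can be read as a simple priority lookup. The following Python sketch is only an illustration: `choose_property_name` is a hypothetical helper, `fn` is the hCard name property, and the "previous format" vocabulary (`full-name`, `score`) is invented for the example.

```python
def choose_property_name(concept, existing_microformats, previous_formats):
    """Pick a name for a concept, preferring sources in the order:
    existing microformats, then previous formats, then plain English."""
    if concept in existing_microformats:
        return existing_microformats[concept], "existing microformat"
    if concept in previous_formats:
        return previous_formats[concept], "previous format"
    # Fall back to a simple, specific, human-readable English phrase.
    return concept.lower().replace(" ", "-"), "plain English"

existing = {"person name": "fn"}                           # fn is an hCard property
previous = {"person name": "full-name", "rating": "score"}  # invented legacy vocabulary

for concept in ("person name", "rating", "Item Description"):
    print(concept, "->", choose_property_name(concept, existing, previous))
```

Note that "person name" resolves to `fn` even though the invented previous format also has a name for it: existing microformats win.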
You may notice that we completely skipped naming the potential new microformat itself. This is not an accident; it is deliberate. Naming is tempting, and good naming is hard. Thus naming is discussed later.
The key is, the explicit schema of the microformat (what properties and hence class names are in it) is more important than the name.
Remember, microformats should be designed for humans first and machines second. Here are a few questions that may help you decide if you really need a microformat for the problem you are trying to solve:
- If I looked at this microformat in a browser that didn't support CSS or had CSS turned off, would it still be human-readable?
- Are this format's elements stylable with CSS?
If the proposed format doesn't pass these two tests, it's not likely to gain much acceptance. Remember: humans first, machines second.
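One rough way to approximate the first question is to strip the markup and look at what text is left. The Python sketch below is an assumption-laden illustration, not a real accessibility check: `visible_text` is a hypothetical helper, and both markup snippets (including the `data-*` attributes) are invented.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Gather text nodes, roughly what a reader sees with CSS turned off."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()  # flush any trailing text
    return " ".join(parser.chunks)

# Data published as visible content, annotated with class names.
visible = '<span class="rating">4</span> out of 5 for <span class="item">Cafe Example</span>'
# The same data hidden in attributes, with nothing for a human to read.
hidden = '<span class="rating" data-value="4"></span><span class="item" data-name="Cafe Example"></span>'

print(repr(visible_text(visible)))  # '4 out of 5 for Cafe Example'
print(repr(visible_text(hidden)))   # ''
```

The first snippet still reads as a sentence without any styling; the second leaves a human with nothing.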
Iterate
Now that you have a simple brainstorm proposal, what do you do?
Iterate. Iterate. Iterate.
In the process of developing a microformat, you'll likely get a lot of feedback from others interested in microformats. The effort needs to be iterated and adapted. Microformat development should be open, and preferably collaborative and community-based.
Here's an ASCII-art flow diagram of where you're going:
problem statement---->research/discussion---->proposal/draft---->specification
^________________V   ^___________________V   ^______________V
Note that each stage involves iteration. That iteration consists of discussion and feedback and may result in major changes. Do not be afraid to make major changes and please don't get too attached to any particular solutions.
Feel free to explore multiple proposals one after another on the brainstorming page. The goal here is to explore reasonable microformat solutions, including multiple possibilities, alternatives etc.
Naming considerations
DO NOT even start by labeling your effort "hXYZ". This is a very common mistake.
Always start with the general problem area.
Thus name the problem area (*- pages below) generically and specifically avoid starting with a code name or brand name like hNewCoolFormat.
Good: Product Examples. Bad: hProduct-examples.
After you've iterated your research and brainstorming at least a bit, you've likely gained sufficient understanding of the problem-space that you're solving to start looking at naming considerations.
Pages to create
After a specific problem area (*) has been determined (principle 1), consider creating and filling out the following pages for it, and add the first to Exploratory Discussions. If you're unable to come up with material for the pages, then you should probably reconsider whether or not the problem is worth (or ready for) solving.
- *-examples Find examples on today's web of the type of content you think needs a microformat. Document them with URLs. Document the schemas implied by the content examples. This is the action that helps follow principle 3, design for humans first, machines second ... adapt to current behaviors and usage patterns. Start by cloning the Best Practices for Examples Pages page and filling it out.
- *-formats Find widely adopted interoperable current data formats and standards that attempt to or have attempted to solve the problem previously. Document their explicit schemas. This is a necessary prerequisite for following through with principle 4, "reuse building blocks from widely adopted standards".
- *-brainstorming Use the current real-world web examples and their implicit schemas to determine an 80/20 as-simple-as-possible (principle 2) generic schema to represent their data. Yes, this means you will explicitly omit some features of some use cases, or perhaps entire use cases which are more edge-cases than representative of larger, aggregate/macro behaviors. See which existing microformats can be reused as building blocks (principle 5, modularity). Use those existing data formats and schemas as a source of names for the fields (principle 4). Consider how you would embed this microformat in other formats (also principle 5, embeddability). See the Brainstorming page for a bit more info.
With an 80/20 schema, and a source of field names, write up one or more straw proposals for a microformat in the *-brainstorming page. Make sure the straw proposals encourage the decentralized distribution of data (principle 6). Postpone the choice of root class name as it often overlaps with the naming of the microformat itself. Always keep close at hand the microformats naming principles when choosing names.
Brainstorming about the substance of the microformat (its properties and values) should precede naming the microformat itself. Thus after proposals have been written up and are being discussed for the schema, create a naming section for the microformat itself and root class name, where various names can be considered.
- ** When it seems like there is some amount of consensus around one of the brainstorm proposals for a microformat with a specific name(**), write it up as a separate wiki page as a draft specification (see style-guide), and then start creating the following pages to track it.
- **-faq There will likely be common questions about the new microformat which can/should be answered in an FAQ page.
- **-issues Folks may also raise issues about the microformat which aren't immediately addressable. An issues document helps capture these issues, who raised them, and when, so that folks working on the microformat can be sure to go through and thoroughly answer them.
- **-examples Eventually there may be too many real world examples of a microformat to document them in an informative section at the end of the specification, thus the list deserves its own page.
- **-implementations Eventually there may be too many implementations of a microformat to document them in an informative section at the end of the specification, thus the list deserves its own page.
- **-brainstorming Eventually there will be non-trivial proposals/suggestions/clarifications for changes in the microformat as part of iteration. Create a format-specific brainstorming page for such suggestions.
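The page naming pattern above (* for the problem area, ** for the named microformat) can be summarized in a short Python sketch. The helper names are hypothetical, and "resume" and "hreview" are used only as sample inputs.

```python
# Suffixes taken from the lists above.
RESEARCH_SUFFIXES = ("examples", "formats", "brainstorming")
TRACKING_SUFFIXES = ("faq", "issues", "examples", "implementations", "brainstorming")

def research_pages(problem_area):
    """Pages named after the generic problem area (the * pages)."""
    return [f"{problem_area}-{suffix}" for suffix in RESEARCH_SUFFIXES]

def tracking_pages(format_name):
    """Pages named after the microformat itself (the ** pages), created only
    once a draft specification exists."""
    return [f"{format_name}-{suffix}" for suffix in TRACKING_SUFFIXES]

print(research_pages("resume"))
# ['resume-examples', 'resume-formats', 'resume-brainstorming']
print(tracking_pages("hreview"))
# ['hreview-faq', 'hreview-issues', 'hreview-examples', 'hreview-implementations', 'hreview-brainstorming']
```

Keeping the two sets separate reflects the ordering in the process: the problem-area pages come first, and the format-named pages only appear after consensus on a proposal.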
Moving from Stage to Stage
These stages of development are mirrored on the main page where microformats are divided into "Exploratory Discussions", "Drafts", and "Specifications".
How do microformats move from one stage to the other?
Document your problem area, existing research, and brainstorm proposals on the Exploratory Discussions page. You should do it on this wiki using current items under exploratory discussion as a guide.
Then send a note to the microformats-new list to get the attention of others who are interested in microformats. This is probably a good chance to pull in people from outside the current microformats community who may also be experiencing the same issue.
Feedback will probably run the gamut. Others may challenge your problem statement or the need for a microformat, concur with it, or add to it. All constructive feedback is good.
As a result of feedback, you may decide to abandon your microformat idea or substantially modify it or add more alternatives to brainstorm proposals to consider.
One thing you want to be sure to do at this stage is to avoid reinventing the wheel.
Are there elemental microformats you can reuse as building blocks? Doing this will save you effort and help you get implemented later because implementers will have less work to do.
Once there seems to be some degree of consensus around a particular brainstorm proposal, including a specific schema and list of properties, write that brainstorm proposal into a draft.
Here, you need to write what is essentially a specification, but with the idea that it could change a lot. Again, this needs to go in the wiki, and you should send a note to the microformats-new mailing list to alert people that something new has happened. Continue encouraging feedback from relevant resources both inside and outside the community. Drafts need to include at least the following:
- Statements regarding the fact that you will not patent and are adopting appropriate copyright (preferably PUBLIC DOMAIN) as illustrated by current drafts.
- An XMDP stating explicit property and value names. You may want to place this on a separate wiki page and link to it. In that case use the naming convention *-profile, e.g. hCard Profile.
- Examples from current practice that show how the microformat would be used. Keep an eye out for how the microformat is actually improving things. If it's not, that might be an indication that you either need to abandon or change a lot.
- Use of rfc-2119 terms.
- References that back up your design decisions for the microformat. To the extent possible, you do not want to invent things from whole cloth. This should be relatively easy if you have followed the process so far with proper research.
- A list of examples in the wild (an empty section is fine and expected to start with)
- A list of implementations (an empty section is fine and expected to start with)
- An issues section for people to provide feedback.
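The list above works well as a checklist. Here is a minimal Python sketch of that idea; the part labels are shorthand paraphrases of the bullets, and `missing_draft_parts` is a hypothetical helper, not part of any wiki tooling.

```python
# Shorthand labels for the draft requirements listed above.
REQUIRED_DRAFT_PARTS = [
    "patent and copyright statements",
    "XMDP profile",
    "examples from current practice",
    "RFC 2119 terms",
    "references",
    "examples in the wild",    # may start as an empty section
    "implementations",         # may start as an empty section
    "issues",
]

def missing_draft_parts(present):
    """Return the required parts not yet present, in checklist order."""
    return [part for part in REQUIRED_DRAFT_PARTS if part not in present]

draft_so_far = {"XMDP profile", "examples from current practice", "issues"}
print(missing_draft_parts(draft_so_far))
```

For this invented draft, the sketch reports the five parts that still need to be written before the draft is complete.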
You will usually need at least one iteration to get past the draft stage.
By the time something becomes a specification, it should be stable so that developers can pick it up and write to it. This in turn implies that there are at least a couple of implementations.
Before moving to the specifications section, drop a note to microformats-new and encourage discussion and major objections. If none are forthcoming, suggest moving the microformat to the specifications area. If there is sufficient explicit positive support from the community to do so, then do so. If not, then leave as draft. The absence of feedback is not approval and should be noted instead as a lack of interest.
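The promotion rule can be sketched as a small decision function. This is an illustration under stated assumptions: the process only says "sufficient" support and "at least a couple" of implementations, so `supporters_needed=3` is an invented threshold, "a couple" is read as two, and `ready_for_specification` is a hypothetical helper.

```python
def ready_for_specification(implementations, major_objections,
                            explicit_supporters, supporters_needed=3):
    """Decide whether a draft may move to the specifications area.

    supporters_needed is an assumed number; the process does not define
    how much explicit support counts as sufficient.
    """
    if implementations < 2:       # "at least a couple of implementations"
        return False
    if major_objections > 0:      # unresolved major objections keep it a draft
        return False
    # Silence is not approval: only explicit positive support counts.
    return explicit_supporters >= supporters_needed

print(ready_for_specification(implementations=3, major_objections=0, explicit_supporters=0))  # False
print(ready_for_specification(implementations=3, major_objections=0, explicit_supporters=4))  # True
```

The first call shows the key point: a draft with implementations and no objections still stays a draft when nobody has explicitly spoken up for it.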
What about other types of pages on the wiki, like "patterns" (e.g. include-pattern)?
How do those get created?
The short explanation is this:
The patterns are not formats at all. They do not stand on their own. They are merely documentation of pieces of other formats that are expected to be (or are being) reused by other formats.
The real world examples for includes in particular were documented in the context of resume-examples, Resume Formats, and resume-brainstorming, as noted at the very top of the document: "Initially developed as part of resume-brainstorming..." Later, real world examples for reviews were found to need the include pattern as well.
When real world examples for multiple microformats require the solution of the same or a similar problem, then it is worth exploring the creation of a pattern that solves the problem across microformats.