The microformats process

From Microformats Wiki

If this is your first visit, please see the introduction page first.


So you wanna develop a new microformat?

Or just a new vocabulary?

Or create a new standard based on empirical research and scientific methods?

This document will help guide you through the steps to take towards achieving these goals.

The microformats process has been cited as inspiration for other standards groups and efforts such as the WHATWG and ActivityStreams.

Editor
Tantek Çelik

Get some experience first

Before even considering pursuing a new microformat, first:

  1. Make sure your site uses POSH.
  2. Add existing microformats to your sites like h-card for your contact info, h-event for your events, h-entry for your episodic content (e.g. blogs). See get-started for more specific examples of adding microformats to your sites.
    • In particular, start adding microformats2 markup for all your microformats. Any new microformats must use microformats2 - using it will help you become familiar with it.

This will help familiarize you with how POSH and microformats currently work. Such "real world" experience will greatly help you with the development of a new microformat. For more on this see why-using-existing-microformats-matters.
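As a rough illustration of what microformats2 markup looks like once it is on a page, the sketch below lists the microformats2-style class names in a small h-card snippet. The sample markup, the person, and the helper names (`Mf2ClassLister`, `list_mf2_classes`) are made up for this example, and the prefix check is a simplification, not a conformant microformats2 parser.

```python
from html.parser import HTMLParser

# Hypothetical sample markup: a minimal h-card for a made-up person.
SAMPLE_HTML = """
<p class="h-card">
  <a class="p-name u-url" href="https://example.com/">Jane Example</a>,
  <span class="p-org">Example Org</span>
</p>
"""

# microformats2 class name prefixes: h- (root), p- (plain text),
# u- (URL), dt- (date/time), e- (embedded markup).
MF2_PREFIXES = ("h-", "p-", "u-", "dt-", "e-")


class Mf2ClassLister(HTMLParser):
    """Collects class names that start with a microformats2 prefix."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                for cls in value.split():
                    # Simplification: a real parser validates names more strictly.
                    if cls.startswith(MF2_PREFIXES):
                        self.found.append(cls)


def list_mf2_classes(html):
    """Return the microformats2-style class names in document order."""
    lister = Mf2ClassLister()
    lister.feed(html)
    return lister.found


print(list_mf2_classes(SAMPLE_HTML))
# ['h-card', 'p-name', 'u-url', 'p-org']
```

Running something like this against your own pages is a quick way to see which existing microformats you already publish before proposing a new one.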

Then, ask yourself:

Why?

There must be a problem to be solved. No problem, no microformat.

Once you've found your 'problem,' ask yourself: 'is there a simpler problem here?' If so, let's solve that problem first. We want to deal with the simplest problems first and only then build up to more complex problems.

Perhaps someone else is pursuing (or has pursued) a similar problem.

Check the exploratory-discussion page, and search around on the web. Chances are that someone else has encountered the same problem as you. There are very few truly new problems.

If you still believe that you have a new and unsolved problem, post a one sentence summary of the problem to the irc channel. If no one answers, perhaps start a page documenting the problem you're solving, then ask again.

It's better to get the community involved in the discussion as the community can help find previous attempts at describing or solving the problem.

Start the discussion before you start creating any pages on the wiki.

We're not using the wiki as a general "scratch pad".

If you can't summarize the problem you are trying to solve in a short message on IRC (think tweet-sized), and feel like you need a long document, you're probably trying to solve too big of a problem - see the previous point about 'is there a simpler problem here?' and simplify until you can describe your problem and real world use case in a short paragraph.

What is the use case

Once you've documented the problem, start exploring and documenting additional use-cases that would be addressed/enabled by a solution.

Check Exploratory Discussions For Previous Work

It may be possible that someone else has tried (or is trying to) solve the same real world problems as you.

Check the exploratory-discussions page before starting new pages for your effort to see if you can add to / iterate on an existing effort before creating a new one.

Document Current Behavior

Document examples of current real world human publishing behavior of the type of content you want to mark up in a *-examples wiki page.

Why examples first? Read that. It's important and gets to the core of what makes microformats efforts different than many/most previous/other format development efforts.

We're paving the cowpaths, but before you do that you have to find the cowpaths.

By cowpaths here we mean real world publishing examples. Your examples should be a collection of real world sites and pages which are publishing the kind of data you wish to structure with a microformat. From those pages and sites, you should extract markup examples and especially the schemas implied therein, and provide analysis.

This collection of examples should be public, preferably on the microformats wiki because it's unlikely you can do it by yourself (no matter how many of you there are), and having others even just check your work will help improve its quality.

The review-examples page is a good example of research done before the creation of a microformat. Before developing hReview, the collaborators went out, documented current practices around reviews on web sites, and provided some analysis of the schemas implied therein.

Note: these examples are about the content (and schemas implied therein) not about formats, class names, pre-existing standards etc. That comes next.

Study the examples and determine the implied (not explicit) schema of data that is being published (the conceptual properties about the data - not literal class names). Sort these by frequency and determine which appear in roughly 80% of cases (per the 80/20 rule, Pareto principle).
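The frequency analysis described above can be sketched in a few lines. The example sites, the property lists, and the exact 0.8 cutoff are all assumptions for illustration; the process only asks for properties that appear in "roughly 80%" of cases, so treat the threshold as a judgment call rather than a rule.

```python
from collections import Counter

# Hypothetical research notes: for each documented example page, the
# conceptual properties it was observed to publish (not class names).
EXAMPLES = {
    "site-a": {"item name", "rating", "reviewer", "date"},
    "site-b": {"item name", "rating", "reviewer"},
    "site-c": {"item name", "rating", "date", "pros and cons"},
    "site-d": {"item name", "rating", "reviewer", "date"},
    "site-e": {"item name", "reviewer", "date"},
}


def property_frequencies(examples):
    """Return (property, share of examples) pairs, most frequent first."""
    counts = Counter(prop for props in examples.values() for prop in props)
    total = len(examples)
    return sorted(
        ((prop, count / total) for prop, count in counts.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


def common_properties(examples, threshold=0.8):
    """Properties that appear in at least `threshold` of the examples."""
    return [
        prop
        for prop, share in property_frequencies(examples)
        if share >= threshold
    ]


print(common_properties(EXAMPLES))
# ['item name', 'date', 'rating', 'reviewer']
```

Here "pros and cons" shows up in only one of five examples, so it falls outside the implied 80/20 schema.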

If you can't find real world publishing examples of the type of data you want to represent using a microformat, then stop here. It doesn't need a microformat.

Document Previous Formats

Document examples of previous formats related to the problem area in a *-formats wiki page.

Once you've documented real world publishing examples of the kinds of data you want to represent, the next step is to research efforts at developing formats for those kinds of data.

There are almost always previous efforts at formats for whatever kind of data you want to represent using microformats.

Documenting these previous-formats helps in a number of ways:

  • explore why previous format(s) succeeded (or didn't)
  • avoid (or at least minimize) repeating mistakes of previous formats
  • use successful previous formats for interoperability
  • provide a source of vocabulary to express a new microformat

In particular, ask yourself: "are there any well established, interoperably implemented standards we can look at which address this problem?"

For example, hCard and hCalendar were built on top of the IETF standards for vCard and iCal, respectively, both of which are widely interoperably implemented, and dominant in their space (there were no competing formats with anywhere near the same levels of adoption).

The developers of those standards had already spent many years in standards committees arguing about and developing their schemas. Better to leverage all the hard work that others have done before you, than to go off as a solo cowboy inventor, and waste time repeating all their mistakes. It's also much easier to start from a well established schema, and map it into semantic HTML, than to develop a new schema.

It's quite possible during this step that you'll find someone else who has dealt with the problem(s) you're addressing. Perhaps even solved it/them. Do your best to open a dialog with others who have encountered the same problem(s). We don't want to build walls between competing communities - we want people to work together to develop a good solution which will cover the majority of cases.

If you can't find previous efforts at formats for the data you want to microformat, ask in IRC.

Brainstorm Proposals

By now you've researched previous examples and previous formats and you're finally ready to write up brainstorms towards a new microformat in a *-brainstorming wiki page.

Actually, don't!

There are other things to try before developing a microformat. First, ask yourself these questions:

  1. Is there a standard element in HTML that would work?
  2. Is there a compound of HTML elements that would work?

If so, document these on the brainstorming page, and stop. There is no need for a microformat.

Let's not unnecessarily reinvent what you can already do with HTML.

For more details on semantic HTML, examples of using HTML elements, and constructing HTML (and XHTML) compounds, see The Elements of Meaningful XHTML.

Otherwise, if you can clearly and confidently answer "no" to the above two questions, we can talk about a microformat.

Before you start your act of creation, familiarize yourself with the microformats principles.

There are basically two steps to writing up a brainstorm proposal:

  1. Go back to that list of real world publishing examples that you researched, and collect the 80/20 of the implied conceptual schema from the examples.
  2. Give each of those concepts a property name, re-using from, in order:
    1. existing microformats
    2. previous formats
    3. simple but specific human readable English words and phrases

Congratulations, you've written up a brainstorm proposal with a list of class names for a possible microformat.
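The reuse order for property names can be expressed as a simple lookup chain. This is only a sketch: the two lookup tables are illustrative stand-ins for what your own *-formats research and a review of existing microformats would produce, and `choose_property_name` is a hypothetical helper, not part of any microformats tooling.

```python
# Illustrative lookup tables (assumptions for this example): concept ->
# name already used for that concept in each kind of source.
EXISTING_MICROFORMAT_NAMES = {"item name": "name", "date": "published"}
PREVIOUS_FORMAT_NAMES = {"date": "dtreviewed", "rating": "rating"}


def choose_property_name(concept, existing_microformats, previous_formats):
    """Pick a name for a concept, preferring reuse in the documented order."""
    for source, names in (
        ("existing microformat", existing_microformats),
        ("previous format", previous_formats),
    ):
        if concept in names:
            return names[concept], source
    # Fall back to a simple, specific, human-readable English phrase.
    return concept.lower().replace(" ", "-"), "plain English"


for concept in ("date", "rating", "pros and cons"):
    print(
        concept,
        "->",
        choose_property_name(
            concept, EXISTING_MICROFORMAT_NAMES, PREVIOUS_FORMAT_NAMES
        ),
    )
# date -> ('published', 'existing microformat')
# rating -> ('rating', 'previous format')
# pros and cons -> ('pros-and-cons', 'plain English')
```

Note that "date" appears in both tables and the existing microformat name wins, which is the point of checking the sources in order.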

You may notice that we completely skipped naming the potential new microformat itself. This is not an accident, this is deliberate. Naming is tempting, and good naming is hard. Thus naming is discussed later.

The key is, the explicit schema of the microformat (what properties and hence class names are in it) is more important than the name.

Remember, microformats should be designed for humans first and machines second. Here are a few questions that may help you decide if you really need a microformat for the problem you are trying to solve:

  1. If I looked at this microformat in a browser that didn't support CSS or had CSS turned off, would it still be human-readable?
  2. Are this format's elements stylable with CSS?

If the proposed format doesn't pass these two tests, it's not likely to gain much acceptance. Remember: humans first, machines second.

Iterate

Now that you have a simple brainstorm proposal, what do you do?

Iterate. Iterate. Iterate.

In the process of developing a microformat, you'll likely get a lot of feedback from others interested in microformats. The effort needs to be iterated and adapted. Microformat development should be open, and preferably collaborative and community-based.

Here's an ASCII-art flow diagram of where you're going:

DIAGRAM:
problem statement---->research/discussion---->proposal/draft---->specification
^________________V   ^___________________V   ^______________V

Note that each stage involves iteration. That iteration consists of discussion and feedback and may result in major changes. Do not be afraid to make major changes and please don't get too attached to any particular solutions.

Feel free to explore multiple proposals one after another on the brainstorming page. The goal here is to explore reasonable microformat solutions, including multiple possibilities, alternatives etc.

Naming considerations

DO NOT start with even labeling your effort "hXYZ". This is a very common mistake.

Always start with the general problem area.

Thus name the problem area (*- pages below) generically and specifically avoid starting with code name / brand name like hNewCoolFormat.

Good: product-examples. Bad: hProduct-examples.

After you've iterated your research and brainstorming at least a bit, you've likely gained sufficient understanding of the problem-space that you're solving to start looking at naming considerations.

Pages to create

After a specific problem area (*) has been determined (principle 1), consider creating and filling out the following pages for it, and add the first to exploratory-discussions. If you're unable to come up with material for the pages, then you should probably reconsider whether or not the problem is worth (or ready for) solving.

  1. *-examples Find examples on today's web of the type of content you think needs a microformat. Document them with URLs. Document the schemas implied by the content examples. This is the action that helps follow principle 3, design for humans first, machines second ... adapt to current behaviors and usage patterns. Start by cloning the examples page and filling it out.
  2. *-formats Find widely adopted interoperable current data formats and standards that attempt to or have attempted to solve the problem previously. Document their explicit schemas. This is a necessary prerequisite for following through with principle 4, "reuse building blocks from widely adopted standards".
  3. *-brainstorming Use the current real-world web examples and their implicit schemas to determine an 80/20 as-simple-as-possible (principle 2) generic schema to represent their data. Yes, this means you will explicitly omit some features of some use cases, or perhaps entire use cases which are more edge-cases than representative of larger, aggregate/macro behaviors. See which existing microformats can be reused as building blocks (principle 5, modularity). Use those existing data formats and schemas as a source of names for the fields (principle 4). Consider how you would embed this microformat in other formats (also principle 5, embeddability). See the brainstorming page for a bit more info.

    With an 80/20 schema, and a source of field names, write up one or more straw proposals for a microformat in the *-brainstorming page. Make sure the straw proposals encourage the decentralized distribution of data (principle 6). Postpone the choice of root class name as it often overlaps with the naming of the microformat itself. Always keep close at hand the microformats naming principles when choosing names.

    Brainstorming about the substance of the microformat (its properties and values) should precede naming the microformat itself. Thus after proposals have been written up and are being discussed for the schema, create a naming section for the microformat itself and root class name, where various names can be considered.

  4. ** When it seems like there is rough consensus around one of the brainstorm proposals for a microformat with a specific name(**), write it up as a separate wiki page as a draft specification (see style-guide), and then start creating the following pages to track it.
  5. **-faq There will likely be common questions about the new microformat which can/should be answered in an FAQ page.
  6. **-issues Folks may also raise issues about the microformat which aren't immediately addressable. An issues document helps capture these issues, who raised them, and when, so that folks working on the microformat can be sure to go through and thoroughly answer them.
  7. **-examples Eventually there may be too many real world examples of a microformat to document them in an informative section at the end of the specification, thus the list deserves its own page.
  8. **-implementations Eventually there may be too many implementations of a microformat to document them in an informative section at the end of the specification, thus the list deserves its own page.
  9. **-brainstorming Eventually there will be non-trivial proposals/suggestions/clarifications for changes in the microformats as part of iteration. Create such a format specific brainstorming page for such suggestions.
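The naming convention for these pages is mechanical enough to sketch. The helper names below are hypothetical, and "product" and "hcard" are only sample inputs: `*` stands for the generic problem area, `**` for the microformat's name once one has been chosen.

```python
RESEARCH_SUFFIXES = ("examples", "formats", "brainstorming")
TRACKING_SUFFIXES = (
    "faq",
    "issues",
    "examples",
    "implementations",
    "brainstorming",
)


def research_pages(problem_area):
    """Pages for a generic problem area (the * pages)."""
    return [f"{problem_area}-{suffix}" for suffix in RESEARCH_SUFFIXES]


def tracking_pages(microformat_name):
    """The draft page plus its tracking pages (the ** pages)."""
    return [microformat_name] + [
        f"{microformat_name}-{suffix}" for suffix in TRACKING_SUFFIXES
    ]


print(research_pages("product"))
# ['product-examples', 'product-formats', 'product-brainstorming']
print(tracking_pages("hcard"))
# ['hcard', 'hcard-faq', 'hcard-issues', 'hcard-examples',
#  'hcard-implementations', 'hcard-brainstorming']
```

Only the first function applies while you are still researching; the second only makes sense after rough consensus has produced a named draft.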

Moving from Stage to Stage

These stages of development are mirrored on the main page where microformats are divided into "Exploratory Discussions", "Drafts", and "Specifications". Based on feedback we've added a new stage: "Standards".

How do microformats move from one stage to the other?

Exploratory Discussions

Document your problem area, existing research, and brainstorm proposals on the exploratory-discussions page. You should do it on this wiki using current items under exploratory discussion as a guide.

Then send a note to irc to get the attention of others who are interested in microformats. This is probably a good chance to pull in people from outside the current microformats community who may also be experiencing the same issue.

Feedback will probably run the gamut. Others may challenge your problem statement or the need for a microformat, concur, or add to your proposal. All constructive feedback is good.

As a result of feedback, you may decide to abandon your microformat idea or substantially modify it or add more alternatives to brainstorm proposals to consider.

One thing you want to be sure to do at this stage is to avoid reinventing the wheel.

Are there elemental microformats you can reuse as building blocks? Doing this will save you effort and help you get implemented later because implementers will have less work to do.

Drafts

Once there seems to be rough consensus around a particular brainstorm proposal, including a specific schema and list of properties, write that brainstorm proposal into a draft.

Summary of what is a draft:

  • draft - experimental new microformat
    • preconditions:
      • rough consensus has been reached among the various *-brainstorming proposals
    • actions:
      • determine a name for the new microformat (**)
      • write-up consensus brainstorm into a stand-alone draft
      • create **-issues page for collecting feedback
    • means:
      • ready for publishers to start experimenting with on their public pages
      • ready for consumers to start experimenting with consuming such publishers
      • and both providing feedback for draft iteration accordingly
    • stability:
      • major changes can still occur
      • e.g. add/drop/rename arbitrary features (per research/feedback from publishers/consumers)

Drafts are written up on the wiki using the draft template. Upon writing up a draft, send a note to the IRC channel with the URL to alert people that something new has happened. Continue encouraging feedback from relevant resources both inside and outside the community. Drafts need to include at least the following:

  • Statements regarding the fact that you will not patent and are adopting appropriate copyright (preferably PUBLIC DOMAIN) as illustrated by current drafts.
  • An XMDP stating explicit property and value names. You may want to place this on a separate wiki page and link to it. In that case use the naming convention *-profile, e.g. hcard-profile.
  • Examples from current practice that show how the microformat would be used. Keep an eye out for how the microformat is actually improving things. If it's not, that might be an indication that you either need to abandon or change a lot.
  • Use of rfc-2119 terms.
  • References that back up your design decisions for the microformat. To the extent possible, you do not want to invent things from whole cloth. This should be relatively easy if you have followed the process so far with proper research.
  • A list of implementations (if any).
  • An issues section for people to feed back to you with detailed objections.

Specifications

You will usually need at least one iteration to get past the draft stage.

By the time something becomes a specification, it should be stable so that developers can pick it up and write to it. This in turn implies that there are at least a couple of implementations.

This in turn implies that there are at least a couple of sites/implementations that have shown interest beyond the creator/editors of the specification.

Summary of what is a specification:

  • specification - a stable and mature draft
    • preconditions:
      • all outstanding **-issues resolved (hit zero issues for a few weeks)
      • 1+ solid real world publisher(s) that
        • are not the editor(s) of the spec (demonstrates some breadth of interest)
        • benefit end users (i.e. not just artificial data published for data's sake)
      • 1+ solid real world consumer(s) that
        • are not the editor(s) of the spec (demonstrates some breadth of interest)
        • successfully consume the microformats published by the publisher(s).
        • provide real world end user benefits / solve use-cases
        • (libraries/opensource are nice as enablers but insufficient for this)
    • actions:
      • update draft to use specification template
      • create **-examples-in-wild, **-implementations pages to collect them
    • means:
      • ready for publishers to start depending on
      • ready for consumers to start depending on
    • stability:
      • minor changes can still occur
      • e.g. dropping of properties and values (e.g. to reach "standard" - see below)
      • iteration based on real-world publisher/consumer experience
    • actions:
      • encourage widespread adoption by all publishers and consumers
      • evaluate properties published by <2 publishers
      • evaluate properties consumed by <2 consumers
        • such properties block advancement from specification to standard.
        • drop such under-supported property(ies) - insufficient market support to keep them. properties which fail in the real world should be and get dropped (per the simplify principle). features may be reconsidered for future versions.
        • or wait for more additional publisher(s)/consumer(s) to support them. up to editor's preference.
    • example microformats to advance:
      • hReview and hAtom can become "specifications" with a bit of work:
        • resolve issues, incorporate into spec updates (0.4, 0.2 respectively)
        • *-examples-in-wild *-implementations pages show publisher(s)/consumer(s) support
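The "evaluate properties published by <2 publishers / consumed by <2 consumers" action can be sketched as a small audit. All property names, publishers, and consumers below are invented sample data, and `under_supported` is a hypothetical helper; what to do with the flagged properties (drop them or wait) remains the editor's call.

```python
def under_supported(properties, published_by, consumed_by, minimum=2):
    """Map each under-supported property to the reasons it was flagged."""
    flagged = {}
    for prop in properties:
        publishers = published_by.get(prop, set())
        consumers = consumed_by.get(prop, set())
        reasons = []
        if len(publishers) < minimum:
            reasons.append(f"published by {len(publishers)}")
        if len(consumers) < minimum:
            reasons.append(f"consumed by {len(consumers)}")
        if reasons:
            flagged[prop] = reasons
    return flagged


# Hypothetical support data gathered from implementation pages.
PROPERTIES = ["name", "url", "key"]
PUBLISHED_BY = {
    "name": {"pub-a", "pub-b"},
    "url": {"pub-a", "pub-b"},
    "key": {"pub-a"},
}
CONSUMED_BY = {
    "name": {"con-a", "con-b"},
    "url": {"con-a"},
}

print(under_supported(PROPERTIES, PUBLISHED_BY, CONSUMED_BY))
# {'url': ['consumed by 1'], 'key': ['published by 1', 'consumed by 0']}
```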


Before moving to the specifications section, you must have resolved all outstanding issues.

Then, drop a note to microformats-new noting that all issues have been resolved, and encourage discussion and major objections. Since this may elicit additional feedback, allow some time (editor discretion, 3 weeks suggested) for more issues, resolutions, and fixes.

If none are forthcoming, suggest moving the microformat to the specifications area. If there is sufficient explicit positive support from the community to do so, then do so. If not, then leave as draft. The absence of feedback is not approval and should be noted instead as a lack of interest.

Standard

This is a new document stage to represent not only stability, but market acceptance.

Summary of what is a standard:

  • standard - market proven microformat
    • preconditions:
      • test suite with 1+ test per feature (e.g. property) (should be easy)
      • 2+ solid real world publishers as defined above for "specification"
      • 2+ solid real world consumers as defined above for "specification" that:
        • pass the test suite tests
        • interoperably consume microformats published by the publishers.
      • each property is published by 2+ of those publishers
      • each property is consumed by 2+ of those consumers
    • actions:
      • bump the version # to 1.0 (if it isn't there already)
    • means:
      • the market is interoperably publishing and consuming this microformat
    • stability:
      • the microformat is considered stable
      • minor errata expected per real world publishing/consuming experience
    • example microformats to advance:
      • hCard and hCalendar can become "standards" with a bit of work:
        • incorporate resolved issues into 1.0.1 releases (use 1.0.1 due to how long we left them at 1.0)
        • write up remaining missing test cases for a full property (not necessarily value) coverage test suite
        • drop insufficiently supported properties (e.g. "class", "key")
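A partial check of the "standard" preconditions might look like the sketch below. It covers only test coverage (1+ test per property) and the 2+ publisher and 2+ passing-consumer counts; the per-property support requirement and the judgment of what counts as a "solid real world" publisher or consumer are left out. All names and data are hypothetical.

```python
def standard_blockers(properties, tests, publishers, passing_consumers):
    """Return unmet preconditions among the ones this sketch covers."""
    blockers = []
    # `tests` maps a test name to the property it exercises.
    tested = set(tests.values())
    for prop in properties:
        if prop not in tested:
            blockers.append(f"no test for property '{prop}'")
    if len(publishers) < 2:
        blockers.append(f"only {len(publishers)} publisher(s), need 2+")
    if len(passing_consumers) < 2:
        blockers.append(
            f"only {len(passing_consumers)} consumer(s) passing the "
            "test suite, need 2+"
        )
    return blockers


# Hypothetical inputs for illustration.
blockers = standard_blockers(
    properties=["name", "url", "key"],
    tests={"test-name-1": "name", "test-name-2": "name", "test-url-1": "url"},
    publishers={"pub-a", "pub-b"},
    passing_consumers={"con-a"},
)
for blocker in blockers:
    print(blocker)
# no test for property 'key'
# only 1 consumer(s) passing the test suite, need 2+
```

An empty list from a check like this means only that these countable preconditions are met, not that the microformat is ready to be called a standard.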

documenting implementation support

Ideally implementation support should be documented with test suite results:

  • implementation test suite report: list all tests and whether an implementation passes them or not
    • these can be added to entries for each implementation in the **-implementations wiki page
  • test suite implementation summary report: list all tests and implementations that pass each one
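Both report shapes can be derived from the same raw results. In this sketch the implementations, test names, and outcomes are invented, and the function names are hypothetical; the first function gives the per-implementation report and the second the per-test summary.

```python
# Hypothetical raw results: implementation -> {test name: passed?}
RESULTS = {
    "parser-a": {"test-name": True, "test-url": True, "test-org": False},
    "parser-b": {"test-name": True, "test-url": False, "test-org": False},
}


def implementation_report(results, implementation):
    """List every test and whether the implementation passes it."""
    return sorted(results[implementation].items())


def summary_report(results):
    """List every test with the implementations that pass it."""
    summary = {}
    for implementation, outcomes in results.items():
        for test, passed in outcomes.items():
            # Register the test even if no implementation passes it.
            summary.setdefault(test, [])
            if passed:
                summary[test].append(implementation)
    return {test: sorted(names) for test, names in sorted(summary.items())}


print(implementation_report(RESULTS, "parser-b"))
# [('test-name', True), ('test-org', False), ('test-url', False)]
print(summary_report(RESULTS))
# {'test-name': ['parser-a', 'parser-b'], 'test-org': [], 'test-url': ['parser-a']}
```

Tests that nobody passes still appear in the summary with an empty list, which makes unsupported features easy to spot.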

Claimed implementation: it's also useful to document:

  • what properties/values/features an implementation claims to support
    • with links to supporting documentation from the implementer's site
    • list properties/values an implementation claims to support on the implementation's entry in the **-implementations wiki page

related issues questions regarding document stages

  • if/when does it make sense to demote a microformat spec to a lower level?
    • can a standard be undone? we haven't seen any examples of this, but it is certainly possible that sufficient implementations/publishers of a standard could disappear until it no longer meets requirements. should (and when should) it be demoted to just a specification or perhaps even a draft?
    • from spec back to draft: it's possible that implementations or publishers may disappear, and thus what used to qualify as a specification no longer meets the requirements. when should it be demoted explicitly back to a draft?
      • Can we just mark these as "specification" but also "deprecated"? Possibly with a note that says something along the lines of "this draft is deprecated because of X, Y, Z reasons. Returning this specification to a standard requires A, B, C work". I imagine this situation is most likely to arise in the situation that a problem with the standard arose, preventing or reducing implementations, right? Phae 09:51, 21 September 2011 (UTC)
    • archived (was: undrafts) e.g. there are plenty of drafts that never got any traction, should we have another category for "uninteresting-draft" that means no one other than the editor(s)/author(s) really cared for it, and thus it isn't a priority for the community. maybe we could call these "undrafts" - where drafts go when they don't make it to spec after some amount of time. From there it's probably best to simply use them for more brainstorming, and not encumber any future microformat effort with their legacy. This is likely important to keep the list of drafts as accurate/recent, and will also likely be challenging case-by-case judgment calls.
      • Perhaps rather than ‘undraft’, which is a slightly horrendous word, or ‘uninteresting’ which carries a subjective judgement, we could instead approach this with a kind of archiving strategy. After some period of non-progress/non-involvement/instability a draft becomes ‘Archived’ (or ‘Archived Draft’) to indicate the stagnation. It could become a regular draft again if someone finds it an interesting base. --BenWard 00:48, 20 September 2011 (UTC)
        • I know this is super unimportant in the scheme of things, but in my mind, "archive" also tends to suggest old, if not also uninteresting. Couldn't they just be "Suggested draft" or something that suggests that they're not necessarily naff, they just don't have any friends/evidence/solidified paths yet, but could do if you went and had a look and got on-board with the effort for that draft? Should we have "rejected drafts" for those we really do mean to put down because they've been identified as out of scope etc. Phae 09:51, 21 September 2011 (UTC)
        • I like the balance struck by "Archived" as suggested by Ben Ward. It's fairly neutral while accurately reflecting an apparent lack of interest. "Suggested draft" seems like a stronger endorsement than is deserving, "suggested" that is, seems wrong for something that has clearly been rejected by the community/market. Let's go with "Archived", until/unless someone suggests something better. Tantek 00:00, 4 August 2012 (UTC)

Other Documents

Patterns

What about other types of pages on the wiki, like "patterns" (e.g. include-pattern)?

How do those get created?

The short explanation is this:

The patterns are not formats at all. They do not stand on their own. They are merely documentation of pieces of other formats that are expected to (or are) being reused by other formats.

The real world examples for includes in particular were documented in the context of resume-examples, resume-formats, and resume-brainstorming (as noted at the very top of the document) "Initially developed as part of resume-brainstorming..." Later, real world examples for reviews were found to need the include pattern as well.

When real world examples for multiple microformats require the solution of the same or a similar problem, then it is worth exploring the creation of a pattern that solves the problem across microformats.

Other Groups

Aspects of the microformats process have inspired, or been used in their entirety by, other groups working on standards/standardization of features and formats.


Related