microformats.org at 11

Thanks to Julie Anne Noying for the meme birthday card.

10,000s of microformats2 sites and now 10 microformats2 parsers

The past year saw a huge leap in the number of sites publishing microformats2, from 1000s to now 10s of thousands of sites, primarily by adoption in the IndieWebCamp community, and especially the excellent Known publishing system and continually improving WordPress plugins & themes.

New modern microformats2 parsers continue to be developed in various languages, and this past year, four new parsing libraries (in three different languages) were added, almost doubling our previous set of six (in five different languages) that brought our year 11 total to 10 microformats2 parsing libraries available in 8 different programming languages.

microformats2 parsing spec updates

The microformats2 parsing specification has made significant progress in the past year, all of it incremental iteration based on real world publishing and parsing experience, each improvement discussed openly, and tested with real world implementations. The microformats2 parsing spec is the core of what has enabled even simpler publishing and processing of microformats.

The specification has reached a level of stability and interoperability where fewer issues are being filed, and those that are being filed are in general more and more minor, although once in a while we find some more interesting opportunities for improvement.

We reached a milestone two weeks ago of resolving all outstanding microformats2 parsing issues thanks to Will Norris leading the charge with a developer spec hacking session at the recent IndieWeb Summit where he gathered parser implementers and myself (as editor) and walked us through issue by issue discussions and consensus resolutions. Some of those still require minor edits to the specification, which we expect to complete in the next few days.

One of the meta-lessons we learned in that process is that the wiki really is less suitable for collaborative issue filing and resolving, and as of today are switching to using a GitHub repo for filing any new microformats2 parsing issues.

more microformats2 parsers

The number of microformats2 parsers in different languages continues to grow, most of them with deployed live-input textareas so you can try them on the web without touching a line of parsing code or a command line! All of these are open source (repos linked from their sections), unless otherwise noted. These are the new ones:

Elixir – created by ckruse.
Haskell – created by myfreeweb
Java (2)

The Java parsers are a particularly interesting development as one is part of the upgrade to Apache Any23 to support microformats2 (thanks to Lewis John McGibbney). Any23 is a library used for analysis of various web crawl samples to measure representative use of various forms of semantic markup.

The other Java parser is mf2j, an early-stage Java microformats2 parser, created by Kyle Mahan.

The Elixir, Haskell, and Java parsers add to our existing in-development parser libraries in Go and Ruby. The Go parser in particular has recently seen a resurgence in interest and improvement thanks to Will Norris.

These in-development parsers add to existing production parsers, that is, those being used live on websites to parse and consume microformats for various purposes:

Javascript (cross-browser client-side, and Node)
PHP
Python

As with any open source projects, tests, feedback, and contributions are very much welcome! Try building the production parsers into your projects and sites and see how they work for you.

Still simpler, easier, and smaller after all these years

Usually technologies (especially standards) get increasingly complex and more difficult to use over time. With microformats we have been able to maintain (and in some cases improve) their simplicity and ease of use, and continue to this day to get testimonials saying as much, especially in comparison to other efforts:

…hmm, looks like I should use a separate meta element: https://schema.org/startDate .

Man, Schema is verbose. @microformats FTW!

On the broader problem of schema.org verbosity (no matter the syntax), Kevin Marks wrote a very thorough blog post early in the past year:

Microformats 2 and Schema 2015-06-30

More testimonials:

I still prefer @microformats over microdata

* * *

@microformats are easier to write, easier to maintain and the code is so much smaller than microdata.

* * *

I am not a big fan of RDF, semanticweb, or predefined ontologies. We need something lightweight and emergent like the microformats

This last testimonial really gets at the heart of one of the deliberate improvements we have made to iterating on microformats vocabularies in particular.

evolving h-entry

We have had an implementation-driven and implementation-tested practice for the microformats2 parsing specification for quite some time.

More and more we are adopting a similar approach to growing and evolving microformats vocabularies like h-entry.

We have learned to start vocabularies as minimal as possible, rather than start with everything you might want to do. That “start with everything you might want” is a common theory-first approach taken by a-priori vocabularies or entire “predefined ontologies” like schema.org’s 150+ objects at launch, very few of which (single digits?) Google or anyone bothers to do anything with, a classic example of premature overdesign, of YAGNI).

With h-entry in particular, we started with an implementation filtered subset of hAtom, and since then have started documenting new properties through a few deliberate phases (which helps communicate to implementers which are more experimental or more stable)

Proposed Additions – when someone proposes a property, gets some sort of consensus among their community peers, and perhaps one more person to implementing it in the wild beyond themselves (e.g. as the IndieWebCamp community does), it’s worth capturing it as a proposed property to communicate that this work is happening between multiple people, and that feedback, experimentation, and iteration is desired.
Draft Properties – when implementations begin to consume proposed properties and doing something explicit with them, then a postive reinforcement feedback loop has started and it makes sense to indicate that such a phase change has occured by moving those properties to “draft”. There is growing activity around those properties, and thus this should be considered a last call of sorts for any non-trivial changes, which get harder to make with each new implementation.
Core Properties – these properties have gained so much publishing and consuming support that they are for all intents and purposes stable. Another phase change has occured: it would be much harder to change them (too many implementations to coordinate) than keep them the same, and thus their stability has been determined by real world market adoption.

The three levels here, proposed, draft, and core, are merely “working” names, that is, if you have a better idea what to call these three phases by all means propose it.

In h-entry in particular, it’s likely that some of the draft properties are now mature (implemented) enough to move them to core, and some of the proposed properties have gained enough support to move to draft. The key to making this happen is finding and citing documentation of such implementation and support. Anyone can speak up in the IRC channel etc. and point out such properties that they think are ready for advancement.

How we improve moving forward

We have made a lot of progress and have much better processes than we did even a year ago, however I think there’s still room for improvement in how we evolve both microformats technical specifications like the microformats2 parsing spec, and in how we create and improve vocabularies.

It’s pretty clear that to enable innovation we have to ways of encouraging constructive experimentation, and yet we also need a way of indicating what is stable vs in-progress. For both of those we have found that real world implementations provide both a good focusing mechanism and a good way to test experiments.

In the coming year I expect we will find even better ways to explain these methods, in the hopes that others can use them in their efforts, whether related to microformats or in completely different standards efforts. For now, let’s appreciate the progress we’ve made in the past year from publishing sites, to parsing implementations, from process improvements, to continuously improving living specifications. Here’s to year 12.

Originally published at: tantek.com/2016/173/b1/microformats-org-at-11.