Blog Archive for the 'News' Category

Improving the php-mf2 parser

During the past year, the popular php-mf2 microformats parser has received quite a few improvements. My site runs ProcessWire and one of the plugins for it uses php-mf2, so I have been spending some time on it.

My own experience with microformats started when I discovered the hCard microformat. I was impressed with the novelty of adding some simple HTML classes around contact information and having a browser extension parse it into an address book. Years later, when I started to get involved in the IndieWeb community, I learned a lot more about microformats2 and they became a key building block of my personal site.

php-mf2 is now much better at backwards-compatible parsing of microformats1. This is important because software should be able to consistently consume content whether it’s marked up with microformats1, microformats2, or a combination. An experimental feature for parsing language attributes has also been added. Finally, it’s now using the microformats test suite. Several other parsers use this test suite as well. This will make it easier to catch bugs and improve all of the different parsers.

php-mf2 is a stable library that’s ready to be installed in your software to start consuming microformats. It is currently used in Known, WordPress plugins, and ProcessWire plugins for richer social interactions. It’s also used in tools like XRay and microformats.io. I’m looking forward to more improvements to php-mf2 in the coming year as well as more software using it!

Originally published at: https://gregorlove.com/2017/06/improving-the-php-mf2-parser/

Evolving for 12 Years

For the 12th birthday of microformats.org (congratulations!) Tantek asked the community if any of us would like to highlight whatever we liked in a guest post. I am taking this opportunity to talk about my favourite feature of microformats: its constant evolution.

Sometimes it feels like a standard is done. Sometimes it feels like a standard is abandoned before its time. In a few special cases a standard keeps evolving. I think we can agree that microformats belongs in that last category. This is largely thanks to the fact that anyone can help it grow.

As you read this, work is being done to upgrade h-event from a Draft to a full Specification. This prompted a few of us to have a look at what people are doing with the format. As it turns out: it has departed from the Draft!

The IndieWeb community is putting events in their feeds, interleaving them with other items (like blog posts) that use h-entry. To make the events fit within this context, properties that are completely new to h-event are being copied over from h-entry. Somehow these separate implementations introduced the same properties, showing how h-event is evolving more quickly than its Draft Specification without splintering into lots of different versions. Naturally evolving the format forwards!

Then there are the small, fringe changes. Work on pronouns in h-cards has been mostly dormant since 2015. I spent time with it during IndieWebCamp Nuremberg and came to a completely different conclusion on how to mark up my pronouns. The beauty there is that anyone can do the same! All it takes is to put something on your site, like the IndieWeb community did with h-event, and tell the world about this piece of extra information they now have access to.

Here is to one more year of constantly tinkering with our HTML and giving more meaning to the information we publish 🥂

microformats.org at 11

ERMERGERD!!! (excited girl holding a large microformats logo) HERPER BERTHDER!!!
Thanks to Julie Anne Noying for the meme birthday card.

10,000s of microformats2 sites and now 10 microformats2 parsers

The past year saw a huge leap in the number of sites publishing microformats2, from 1000s to now 10s of thousands of sites, primarily by adoption in the IndieWebCamp community, and especially the excellent Known publishing system and continually improving WordPress plugins & themes.

New modern microformats2 parsers continue to be developed in various languages. This past year, four new parsing libraries (in three different languages) were added, almost doubling our previous set of six (in five different languages) and bringing our year 11 total to 10 microformats2 parsing libraries available in 8 different programming languages.

microformats2 parsing spec updates

The microformats2 parsing specification has made significant progress in the past year, all of it incremental iteration based on real world publishing and parsing experience, each improvement discussed openly, and tested with real world implementations. The microformats2 parsing spec is the core of what has enabled even simpler publishing and processing of microformats.

The specification has reached a level of stability and interoperability where fewer issues are being filed, and those that are being filed are in general more and more minor, although once in a while we find some more interesting opportunities for improvement.

We reached a milestone two weeks ago of resolving all outstanding microformats2 parsing issues, thanks to Will Norris leading the charge with a developer spec hacking session at the recent IndieWeb Summit, where he gathered parser implementers and me (as editor) and walked us through issue-by-issue discussions and consensus resolutions. Some of those still require minor edits to the specification, which we expect to complete in the next few days.

One of the meta-lessons we learned in that process is that the wiki really is less suitable for collaborative issue filing and resolving, and as of today we are switching to using a GitHub repo for filing any new microformats2 parsing issues.

more microformats2 parsers

The number of microformats2 parsers in different languages continues to grow, most of them with deployed live-input textareas so you can try them on the web without touching a line of parsing code or a command line! All of these are open source (repos linked from their sections), unless otherwise noted. These are the new ones:

The Java parsers are a particularly interesting development as one is part of the upgrade to Apache Any23 to support microformats2 (thanks to Lewis John McGibbney). Any23 is a library used for analysis of various web crawl samples to measure representative use of various forms of semantic markup.

The other Java parser is mf2j, an early-stage Java microformats2 parser, created by Kyle Mahan.

The Elixir, Haskell, and Java parsers add to our existing in-development parser libraries in Go and Ruby. The Go parser in particular has recently seen a resurgence in interest and improvement thanks to Will Norris.

These in-development parsers add to existing production parsers, that is, those being used live on websites to parse and consume microformats for various purposes:

As with any open source projects, tests, feedback, and contributions are very much welcome! Try building the production parsers into your projects and sites and see how they work for you.

Still simpler, easier, and smaller after all these years

Usually technologies (especially standards) get increasingly complex and more difficult to use over time. With microformats we have been able to maintain (and in some cases improve) their simplicity and ease of use, and continue to this day to get testimonials saying as much, especially in comparison to other efforts:

…hmm, looks like I should use a separate meta element: https://schema.org/startDate .

Man, Schema is verbose. @microformats FTW!

On the broader problem of schema.org verbosity (no matter the syntax), Kevin Marks wrote a very thorough blog post early in the past year:

More testimonials:

I still prefer @microformats over microdata

* * *

@microformats are easier to write, easier to maintain and the code is so much smaller than microdata.

* * *

I am not a big fan of RDF, semanticweb, or predefined ontologies. We need something lightweight and emergent like the microformats

This last testimonial really gets at the heart of one of the deliberate improvements we have made to iterating on microformats vocabularies in particular.

evolving h-entry

We have had an implementation-driven and implementation-tested practice for the microformats2 parsing specification for quite some time.

More and more we are adopting a similar approach to growing and evolving microformats vocabularies like h-entry.

We have learned to start vocabularies as minimal as possible, rather than start with everything you might want to do. That “start with everything you might want” is a common theory-first approach taken by a-priori vocabularies or entire “predefined ontologies” like schema.org’s 150+ objects at launch, very few of which (single digits?) Google or anyone bothers to do anything with: a classic example of premature overdesign, of YAGNI.

With h-entry in particular, we started with an implementation-filtered subset of hAtom, and since then have started documenting new properties through a few deliberate phases (which helps communicate to implementers which are more experimental or more stable):

  1. Proposed Additions – when someone proposes a property, gets some sort of consensus among their community peers, and perhaps gets one more person to implement it in the wild beyond themselves (e.g. as the IndieWebCamp community does), it’s worth capturing it as a proposed property to communicate that this work is happening between multiple people, and that feedback, experimentation, and iteration are desired.
  2. Draft Properties – when implementations begin to consume proposed properties and do something explicit with them, then a positive reinforcement feedback loop has started and it makes sense to indicate that such a phase change has occurred by moving those properties to “draft”. There is growing activity around those properties, and thus this should be considered a last call of sorts for any non-trivial changes, which get harder to make with each new implementation.
  3. Core Properties – these properties have gained so much publishing and consuming support that they are for all intents and purposes stable. Another phase change has occurred: it would be much harder to change them (too many implementations to coordinate) than keep them the same, and thus their stability has been determined by real world market adoption.

The three levels here, proposed, draft, and core, are merely “working” names, that is, if you have a better idea what to call these three phases by all means propose it.

In h-entry in particular, it’s likely that some of the draft properties are now mature (implemented) enough to move them to core, and some of the proposed properties have gained enough support to move to draft. The key to making this happen is finding and citing documentation of such implementation and support. Anyone can speak up in the IRC channel etc. and point out such properties that they think are ready for advancement.

How we improve moving forward

We have made a lot of progress and have much better processes than we did even a year ago. However, I think there’s still room for improvement in how we evolve microformats technical specifications like the microformats2 parsing spec, and in how we create and improve vocabularies.

It’s pretty clear that to enable innovation we have to have ways of encouraging constructive experimentation, and yet we also need a way of indicating what is stable vs in-progress. For both of those we have found that real world implementations provide both a good focusing mechanism and a good way to test experiments.

In the coming year I expect we will find even better ways to explain these methods, in the hopes that others can use them in their efforts, whether related to microformats or in completely different standards efforts. For now, let’s appreciate the progress we’ve made in the past year from publishing sites, to parsing implementations, from process improvements, to continuously improving living specifications. Here’s to year 12.

Originally published at: tantek.com/2016/173/b1/microformats-org-at-11.

microformats.org turns 9 — upgrade to microformats2 and more

Nine years ago we launched microformats.org with a basic premise: that it is possible to express meaning on the web in HTML in a simple way—far simpler than the complex alternatives (XML) being promoted by mature companies and standards organizations alike.

Today microformats.org continues to be a gathering place for those seeking simpler ways to express meaning in web pages, most recently the growing IndieWeb movement.

Looking back nine years ago, none of the other alternatives promoted in the 2000s (even by big companies like Google and Yahoo) survive to this day in any meaningful way:

From this experience, we conclude that what large companies support (or claim to prefer) is often a trailing indicator (at best).

Large companies tend to promote more complex solutions, perhaps because they can afford the staff, time, and other resources to develop and support complex solutions. Such approaches fundamentally lack empathy for independent developers and designers, who don’t have time to keep up with all the complexity.

If there’s one value that’s at the heart of microformats’ focus and continued evolution of simplicity, it is that empathy for independent developers and designers, for small consulting shops, for curious hobbyists who are most enabled and empowered by the simplest possible solutions to problems.

We now know that no amount of large company marketing and evangelism can make up for a focus on ever simpler solutions which take less time to learn, use, and reliably maintain. As long as we focus on that, we will create better solutions.

Community Changes

Speaking of taking less time, we’ve learned some community lessons about that too. Perhaps the most important is that as a community we are far more productive using just IRC and the wiki than with any amount of email. In fact, the microformats drafts that were developed with the most email (e.g. hAudio) turned out to be the hardest to follow and discuss (too many long emails), and sadly ended up lacking the simplicity that real world publishers wanted (e.g. last.fm).

Email tends to bias design and discussions towards those who have more time to read and write long emails (and apparently enjoy that for its own sake), and away from those who want to quickly research & brainstorm, and get to actually creating, building, and deploying things with microformats.

Thus we’re making these changes effective today:

  • IRC for all microformats discussions, whether research, questions, or brainstorming
  • email only for occasional announcements and to direct people to IRC.
  • wiki for capturing questions, brainstorming, conclusions, and different points of view

We’re going to update the site to direct all discussion (e.g. links) to the IRC channel accordingly.

Hope to see you there: #microformats on irc.freenode.net

Upgrading to microformats2

Over the past few years microformats2 has proven itself in practice, with numerous sites both publishing and consuming, several open source parsing libraries, and a growing test suite. All the lessons learned from the evolution from original microformats, from RDFa, and from microdata have been incorporated into microformats2 which is now the simplest to both publish and parse.

It’s time to throw the switch and upgrade everything to microformats2. This means three things:

Upgrading microformats.org

First, we’re starting by upgrading the links on the microformats.org home page to point to the microformats2 drafts, which are ready for use.

We’ll be incrementally upgrading the markup of the microformats.org site itself to use microformats2 markup.

Upgrade sites

Second, if you publish any kind of semantic information, start upgrading your web pages to microformats2 across the board.

If you’re concerned about what search engines claim to support, there are two approaches to choose from:

  1. Know that search engines are a trailing indicator, and as microformats2 usage grows, they’ll index it as well.
  2. Or: Use one classic microformat (supported by all major search engines) at the top of your page, e.g. on the <body>, in addition to your microformats2 markup throughout your pages. Search engines only really care to summarize the primary topic or purpose of a web page in their “rich snippets” or “cards”, and thus that’s sufficient.

Check out the latest validators which now include some microformats2 support as well!

Upgrade tools

Third, this is a call to upgrade all microformats supporting tools to microformats2. As nearly all of these are open source, this is an open call for contributions, updates, patches, etc. for:

If it generates microformats, upgrade it to instead generate microformats2.

If it consumes microformats, upgrade it to also consume microformats2 (which may be most easily done by making use of one of the microformats2 parsers that has backward compatible parsing built in).
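Both kinds of upgrade rest on the correspondence between classic classnames and their microformats2 equivalents. As a rough sketch only, the lookup below covers a handful of hCard names; the table is deliberately partial, the function name `upgrade_hcard_classes` is made up for illustration, and a real tool would follow the full backward-compatibility rules in the parsing specification instead.

```python
# A few classic-to-microformats2 equivalents; NOT the full backward-compatibility table.
CLASSIC_ROOTS = {"vcard": "h-card", "vevent": "h-event", "hentry": "h-entry"}
CLASSIC_HCARD_PROPERTIES = {"fn": "p-name", "url": "u-url", "photo": "u-photo"}


def upgrade_hcard_classes(class_attr):
    """Rewrite known classic hCard classnames; leave every other classname alone."""
    upgraded = []
    for name in class_attr.split():
        replacement = CLASSIC_ROOTS.get(name) or CLASSIC_HCARD_PROPERTIES.get(name)
        upgraded.append(replacement or name)
    return " ".join(upgraded)


root_classes = upgrade_hcard_classes("vcard")
link_classes = upgrade_hcard_classes("fn n url")
print(root_classes)
print(link_classes)
```

Names missing from this partial table, such as `n` here, pass through unchanged, so styling hooks and unknown classnames survive the rewrite.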

10th Year Goal

As we enter the tenth year of microformats.org let’s make it our collective goal to upgrade our pages, our sites, and our tools to microformats2.

Our goal is to complete all the above upgrades by microformats.org’s tenth birthday, if not sooner. Let’s get to work.

Thanks to Barnaby Walters and fellow microformats admins Rohit Khare, Kevin Marks, & Ted O’Connor for reviewing drafts of this post. Thanks to Kevin especially for some copy edits!

This post was originally posted on tantek.com.

Getting Started With microformats2

Classic microformats have been serving the web community’s need to extend HTML’s expressive power since 2004. Through an evolutionary, open, rigorous community process and human-first design principles, structured use of the class and rel attributes has paved the cowpaths of publishing data about people, places, events, reviews, products and more.

Microformats2 is the next big effort by the community to improve how microformats are authored, parsed and defined. Version two has multiple working open source implementations which independents are using in production and is easier to publish and consume than ever.

In this series of guides I’ll show you how to be the next site publishing and consuming microformats.

You can see how a microformats parser sees your markup by pasting any of the code samples below into this php-mf2 sandbox. Go ahead and experiment with adding more properties and see what happens!

Incremental Steps

In order to demonstrate some of the differences between microformats 2 and Classic Microformats/other competing technologies, I’ll use the process of content-out markup — going from plain text to HTML and finally adding a sprinkling of microformats. Let’s start with my favourite example: mentioning a person.

As plain text:

Barnaby Walters 

With HTML:

<a href="http://waterpigs.co.uk">Barnaby Walters</a> 

With classic microformats:

<span class="vcard"><a class="fn n url" href="http://waterpigs.co.uk">Barnaby Walters</a></span>

That’s 37 extra characters (44 counting the closing </span>) and a whole extra nested element just to say “This link is to a person”, not to mention the strangely named root classname (vcard? I thought this was an hcard?) and multiple cryptic fn n classnames. Competing technologies are typically even longer and messier.

With microformats 2 this all becomes much simpler:

<a class="h-card" href="http://waterpigs.co.uk">Barnaby Walters</a> 

Weighing in at only 15 extra characters, this is quicker to type, easier on the eyes and easier to remember.
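The character counts are easy to check. The short script below compares each marked-up version against the plain HTML link, using the markup exactly as shown above. The figure of 37 for classic microformats covers the opening span and the class attribute; counting the closing </span> as well brings it to 44.

```python
# Markup copied from the three examples above.
plain_html = '<a href="http://waterpigs.co.uk">Barnaby Walters</a>'
classic = ('<span class="vcard"><a class="fn n url" '
           'href="http://waterpigs.co.uk">Barnaby Walters</a></span>')
mf2 = '<a class="h-card" href="http://waterpigs.co.uk">Barnaby Walters</a>'

# Overhead is everything added on top of the plain HTML link.
classic_overhead = len(classic) - len(plain_html)
classic_without_closing_tag = classic_overhead - len("</span>")
mf2_overhead = len(mf2) - len(plain_html)

print(classic_overhead)             # 44
print(classic_without_closing_tag)  # 37
print(mf2_overhead)                 # 15
```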

There are two fundamental changes in microformats 2 which make this helium-esque lightness possible: Implied Properties and prefixed classnames.

Implied Properties

When you give class=h-card to an element, you’re saying “This element represents a person”. In many cases the element will be simple; just a name, perhaps with a link or photo. Why should you add extra elements and classnames just to tell a dumb computer which bit is the name, URL or photo URL when that information is already expressed by the markup?

Implied properties save you from this tedium. When you specify an element as an h-card without explicitly defining which parts are the name, url or photo url, the parser will figure out what you meant. And it’s not just for h-cards either — thanks to the new generic parsing in microformats 2, this shorthand works for any microformat.
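To make the idea concrete, here is a toy sketch of implied properties written with Python's standard html.parser. It is not the real parsing algorithm: it only handles a single, non-nested link carrying an h-* class, and the class name `ImpliedCardParser` is invented for this example. The actual parsing specification covers many more cases.

```python
from html.parser import HTMLParser


class ImpliedCardParser(HTMLParser):
    """Toy parser: handles only a single, non-nested <a class="h-*"> root."""

    def __init__(self):
        super().__init__()
        self.in_root = False
        self.result = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        roots = [c for c in classes if c.startswith("h-")]
        if tag == "a" and roots and self.result is None:
            self.in_root = True
            self.result = {"type": roots, "properties": {"name": [""]}}
            if attrs.get("href"):
                # No explicit u-url, so the link target is implied as the URL.
                self.result["properties"]["url"] = [attrs["href"]]

    def handle_data(self, data):
        if self.in_root:
            # No explicit p-name, so the text content is implied as the name.
            self.result["properties"]["name"][0] += data

    def handle_endtag(self, tag):
        if tag == "a" and self.in_root:
            self.in_root = False
            name = self.result["properties"]["name"][0]
            self.result["properties"]["name"][0] = name.strip()


parser = ImpliedCardParser()
parser.feed('<a class="h-card" href="http://waterpigs.co.uk">Barnaby Walters</a>')
parser.close()
card = parser.result
print(card)
```

The markup never mentions p-name or u-url, yet the result carries both a name and a url property, which is the whole point of the shorthand.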

Prefixed Classnames

Classic microformats used plain classnames which looked like any other (e.g. vcard, n or note). There were a few problems with this — classnames would clash, cause false positives or be thrown away by developers who weren’t microformats-aware (“these classnames aren’t doing anything!”).

This also meant parsers were tricky to write, as each one had to maintain a long list of classnames used by each microformat, resulting in many parsers quickly going out of date.

Prefixing classnames solves both of these problems: semantic microformats2 classnames are set apart from styling hooks, and parsers can figure out which classnames to look for, cutting down on maintenance. There are 5 prefixes:

  • h-* root classnames specify that an element is a microformat, e.g. <span class="h-card">
  • p-* specifies an element as a plain-text property, e.g. <span class="p-name">My Name</span>
  • u-* parses an element as a URL, e.g. <a class="u-url" href="/"></a>
  • dt-* parses an element as a date/time, e.g. <time class="dt-published" datetime="2013-05-02 12:00:00"></time>
  • e-* parses an element’s whole inner HTML, e.g. <div class="e-content">
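Because the prefix alone says how a classname should be treated, a consumer can sort classnames without a per-vocabulary list. The sketch below illustrates that with a regular expression. The pattern is a simplifying assumption (lowercase letters separated by hyphens), and `classify` and `PREFIX_KINDS` are names made up for this example rather than part of any parser.

```python
import re

# Prefix meanings as listed above.
PREFIX_KINDS = {
    "h": "root",
    "p": "plain-text property",
    "u": "URL property",
    "dt": "date/time property",
    "e": "embedded HTML property",
}

# Assumption: the name after the prefix is lowercase letters, optionally hyphenated.
MF2_CLASS = re.compile(r"^(h|p|u|dt|e)-([a-z]+(?:-[a-z]+)*)$")


def classify(class_attr):
    """Split a class attribute into microformats2 names and other classnames."""
    found, other = [], []
    for name in class_attr.split():
        match = MF2_CLASS.match(name)
        if match:
            found.append((PREFIX_KINDS[match.group(1)], match.group(2)))
        else:
            other.append(name)
    return found, other


found, other = classify("e-content p-summary p-name post-body")
print(found)
print(other)
```

Here post-body is left alone as a styling hook, while the three prefixed names are recognised without the code knowing anything about h-entry.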

I’ll demonstrate the use of all of these prefixes with some real-world examples.

Firstly, another h-card, more fleshed out than the earlier example. This might be the sort that you put on your homepage:

<div class="h-card">
  <p><img class="u-photo" src="/me.png" alt="" /></p>
  <p class="p-name">
    <a class="u-url" href="http://waterpigs.co.uk">Barnaby Walters</a>
  </p>
</div>

p-name, u-url and u-photo are fairly standard properties you’ll see over and over again. Another improvement in microformats 2 is increasing consistency between different microformat specifications, which again makes them easier for authors to remember and for consumers to understand. A nice side effect is that a single element can be more than one type of microformat at once, for example an h-entry and an h-review.

To demonstrate dt-* and e-*, here’s a note (like a tweet or short blog post), marked up using h-entry:

<article class="h-entry">
  <div class="e-content p-summary p-name">
    <p>Just writing a guide to using <a href="http://microformats.org/wiki/microformats-2">
microformats-2</a></p>
  </div>
  <time class="dt-published" datetime="2013-05-01 12:00:00">20 minutes ago</time> 
</article> 

Here, I want the HTML inside the content to be passed to the parser, so I mark it up as e-*. Notice I’m also specifying that content as the summary and name — one element can be parsed as multiple properties.

I’m using the time element to mark up the time this note was published. Because I’ve prefixed the classname with dt- and it’s on a time element, parsers know to look in the datetime attribute if it exists.
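The attribute-first behaviour can be sketched in a few lines. This is a simplified illustration, assuming only the rule described above (use the datetime attribute when present, otherwise fall back to the element's text); `PublishedParser` and `published` are hypothetical names, and real parsers handle more elements and value formats.

```python
from datetime import datetime
from html.parser import HTMLParser


class PublishedParser(HTMLParser):
    """Toy parser: reads the first <time class="dt-published"> it sees."""

    def __init__(self):
        super().__init__()
        self.seen = False
        self.capturing = False
        self.attr_value = None
        self.text = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "time" and "dt-published" in classes and not self.seen:
            self.seen = True
            self.capturing = True
            self.attr_value = attrs.get("datetime")

    def handle_data(self, data):
        if self.capturing:
            self.text += data

    def handle_endtag(self, tag):
        if tag == "time":
            self.capturing = False

    def value(self):
        # Prefer the machine-readable attribute; fall back to the visible text.
        return self.attr_value if self.attr_value is not None else self.text.strip()


def published(html):
    parser = PublishedParser()
    parser.feed(html)
    parser.close()
    return parser.value()


with_attr = published(
    '<time class="dt-published" datetime="2013-05-01 12:00:00">20 minutes ago</time>'
)
without_attr = published('<time class="dt-published">2013-05-01 12:00:00</time>')
parsed = datetime.fromisoformat(with_attr)
print(with_attr, without_attr, parsed.year)
```

This is why the note can show a friendly "20 minutes ago" to readers while still giving consumers an exact timestamp.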

Combining Microformats

The third area in which microformats2 improves on the previous version is in combining microformats and making each microformat specification more reusable — for example, both a person or an event might have an address, so it makes sense to reuse the same markup for both.

There are many reasons to combine microformats — say you want to specify the author of a blog post or review. You would do so by making the p-author of the post an h-card:

<article class="h-entry">
  <p class="p-author h-card">Barnaby Walters</p>
  <p class="p-content">Blah blah blah</p>
  … 

Or a comment on an article, via h-cites as p-comments:


  …
  <article class="p-comment h-cite">
    <p class="p-author h-card">Jón Jónsson</p>
    <p class="p-summary">Woah that’s insightful.</p>
    <p><a class="u-url" href="http://jonsson.com/replies/1">
      <time class="dt-published" datetime="2014-03-01T14:00:25+00:00">
        2014-03-01 14:00
      </time>
    </a></p>
  </article>
</article> 

The reason comments are h-cites instead of h-entries is that h-entry implies syndication: it’s something you’ve posted, or have re-posted, whereas a comment is a reference to a post on another site.

Or the address of a person or event, using h-adr:

<div class="h-event">
  <h1 class="p-name">Microformats Meetup</h1>
  <p>Join us at <b class="p-adr h-adr">
    <span class="p-street-address">Some Bar</span>,
    <span class="p-locality">Someplace</span></b>
  </p>
</div> 
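On the consuming side, combined microformats show up as items nested inside the properties of other items. The sketch below walks such a structure. The data is hand-written for illustration rather than real parser output, and it assumes the usual parsed shape of a "type" list plus a "properties" dictionary whose values are always lists; `nested_types` is a made-up helper name.

```python
# Hand-written data in the shape of parsed microformats2 output (an assumption:
# "type" and "properties" keys, with every property value held in a list).
parsed_entry = {
    "type": ["h-entry"],
    "properties": {
        "author": [
            {
                "type": ["h-card"],
                "properties": {"name": ["Barnaby Walters"]},
                "value": "Barnaby Walters",
            }
        ],
        "content": ["Blah blah blah"],
    },
}


def nested_types(item, path=""):
    """Yield (property path, types) for an item and every item nested inside it."""
    yield path or "(root)", item["type"]
    for prop, values in item["properties"].items():
        for value in values:
            # Plain strings are ordinary values; dicts with a "type" are nested items.
            if isinstance(value, dict) and "type" in value:
                yield from nested_types(value, f"{path}.{prop}" if path else prop)


found_items = list(nested_types(parsed_entry))
print(found_items)
```

The same walk would apply to the comment and address examples, since a nested h-cite or h-adr sits inside its parent's properties in the same way.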

Further Reading

Hopefully this overview piqued your interest and gave you a firm foundation for further investigation. To learn more about the topics covered in this post, see the following URLs:

This article is reposted from Getting Started With microformats2 on waterpigs.co.uk, and is the first of a series covering microformats2. Be sure to follow my articles feed to be notified about the others. I also post notes about microformats quite often. They’re syndicated to my Twitter account too.