From Microformats Wiki

The Problem

HTML has long used the meta tag for metadata that describes the contents of a document. While this works well for "intrinsic" metadata related to authoring (e.g., category, copyright), there is no equivalent mechanism for "extrinsic" metadata provided by the server or by external sources (e.g., last accessed time, size, alternate representations).

To address this need, [WebDAV] defined a new set of "PROP" methods to create, search, and retrieve properties. Unfortunately, in addition to defining a whole new protocol, this violates the [rest]ful notion that each resource has a URL for manipulating it. This raises the question: "What is the RESTful way to use HTML and HTTP to provide useful properties?"



Our proposal, currently called "relProperty", is motivated by the following principles:

  1. Every property must have at least one well-defined URL that points to it
  2. That URL must be usable to both retrieve and update that property
  3. There must be an easy way to discover all the properties associated with a given document
  4. It must be simple to implement on existing web servers without requiring non-trivial modifications
  5. It should respect and build on existing microformat principles and practices
  6. It should be consistent with URL rest/opacity (properly understood)

While other systems (e.g., RDF) nominally attempt to solve similar problems, the advantages of a microformats approach are:

  • no new schemas or syntax to learn
  • easily embedded in standard, validated HTML
  • trivially accessed/updated via existing HTTP methods


Together, these imply that the optimal way to associate a property with a document is via the HTML link tag (or its equivalent, if somewhat deprecated, the HTTP Link: header). This gives the client the requisite mechanism for constructing an appropriate URL for getting or setting each property, as in:

<link rel="property" href=".;prop1">

This syntax assumes precisely one property per link statement, though it may be possible or desirable to "chain" multiple property declarations into a single statement.
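
As an illustration of the discovery step, a client might scan a document for link elements whose rel includes "property" and collect their href values. The following is a minimal sketch using Python's standard html.parser module; the class name PropertyLinkFinder and the sample markup are assumptions made for this example, not part of the proposal. The hrefs are returned unresolved, since the right relative syntax is still an open issue.

```python
from html.parser import HTMLParser


class PropertyLinkFinder(HTMLParser):
    """Collects the href of every <link> whose rel contains 'property'."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel is a space-separated list of tokens in HTML
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        if "property" in rel_tokens and href is not None:
            self.hrefs.append(href)


def find_property_links(html_text):
    """Return the property hrefs declared in html_text, in document order."""
    finder = PropertyLinkFinder()
    finder.feed(html_text)
    return finder.hrefs


# Hypothetical sample document with two declared properties
SAMPLE = """
<html><head>
<link rel="stylesheet" href="site.css">
<link rel="property" href=".;prop1">
<link rel="property" href=".;prop2">
</head><body></body></html>
"""

property_links = find_property_links(SAMPLE)
```

Here property_links holds only the two property hrefs; the stylesheet link is ignored because its rel does not contain "property".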


Since individual properties must each be explicitly specified by a link, they can in principle be expressed using any notation whatsoever. However, we propose here a convention for describing, chaining, and nesting properties that uses the semicolon (";") as the first character of each property. This follows the convention used in, e.g., ColdFusion, and aids human readability.

The proposed convention is as follows:

  1. Each and every ';' in the URL indicates the beginning of a property
  2. Thus, multiple properties can be unambiguously chained into a single URL, though this SHOULD NOT be done unless the properties in question are very closely coupled (e.g. ";month=10;day=2")
  3. Within a given property, the user is free to define their own namespace. To avoid confusion with HTTP hierarchies, we recommend the use of '.' as an internal separator. However, because of the rules above, the full namespace must be spelled out explicitly for every property
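
The convention above can be sketched as a small parser. This is only an illustration under stated assumptions: the function name split_properties, the "name=value" reading of each property (taken from the ";month=10;day=2" example), and the example.org URLs are all hypothetical. It uses urllib.parse.urlsplit, which leaves ';' inside the path rather than splitting it out.

```python
from urllib.parse import urlsplit


def split_properties(url):
    """Split a URL path into (resource_path, properties).

    Every ';' starts a new property (rule 1). Each property is returned
    as (namespace_parts, value), where namespace_parts is the name split
    on '.' (rule 3) and value is None when no '=' is present.
    """
    path = urlsplit(url).path
    resource, *raw_props = path.split(";")
    properties = []
    for raw in raw_props:
        if not raw:
            continue  # tolerate a stray ';' with nothing after it
        name, sep, value = raw.partition("=")
        properties.append((name.split("."), value if sep else None))
    return resource, properties


# Two closely coupled properties chained into one URL (rule 2)
chained = split_properties("http://example.org/log.html;month=10;day=2")

# A namespaced property: the full namespace is spelled out in the name
namespaced = split_properties("http://example.org/doc.html;acl.owner")
```

With these inputs, chained separates "/log.html" from the month and day properties, and namespaced yields the single property name split into its "acl" and "owner" parts.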

Response format

Unless otherwise specified, we recommend that property queries be returned as type "html", using XOXO encoding to return multiple or composite values.
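
A server-side helper for this response format could look like the sketch below. It mirrors the bare ordered-list shape used in the examples on this page; the function name to_xoxo is an assumption, and a fuller XOXO encoder would also need to handle nested and composite values.

```python
from html import escape


def to_xoxo(values):
    """Render a flat sequence of property values as an ordered HTML list."""
    # Escape each value so markup characters in a property cannot break the list
    items = "".join(f"<li>{escape(str(value))}</li>" for value in values)
    return f"<ol>{items}</ol>"


single = to_xoxo(["John Doe"])
multiple = to_xoxo(["passed", "<draft>"])
```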


primary document, returns HTML
get property, returns "<ol><li>2006T...</li></ol>"
set property ";owner" to the value "John Doe"
set two properties, ";status" and ";lastTested"
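
To make the retrieve and update cases concrete, here is a hedged sketch of how a client might build such requests with Python's urllib.request. The document URL, the helper names, the choice of PUT for updates, and the plain-text request body are all assumptions for illustration; the proposal itself only says that existing HTTP methods are used. The requests are constructed but not sent.

```python
from urllib.request import Request

# Hypothetical primary document; fetching it returns ordinary HTML
DOCUMENT_URL = "http://example.org/report.html"


def get_property_request(document_url, prop):
    """Build a GET request for one property of the document."""
    return Request(f"{document_url};{prop}", method="GET")


def set_property_request(document_url, prop, value):
    """Build a request that updates one property.

    PUT with a plain-text body is an assumption, not part of the proposal.
    """
    return Request(
        f"{document_url};{prop}",
        data=value.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="PUT",
    )


read_request = get_property_request(DOCUMENT_URL, "lastTested")
write_request = set_property_request(DOCUMENT_URL, "owner", "John Doe")
```

Passing either object to urllib.request.urlopen would send it; whether a server accepts PUT on a property URL depends entirely on the implementation.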

Open Issues

  • Is ".;prop1" the right 'href' syntax for a relative property?
  • Are those the right rules for chaining? Do we even need to worry about namespace/hierarchy?
  • Should we not allow multiple updates? If we do, should we just make "&" the separator instead of ';', for clarity?
  • Would we need to worry about semicolon exploits?
    Probably not in general, but any implementation that uses a database-style back end (vs. flat files) would need to guard against injection attacks. The ";" itself isn't the problem, except that some applications have treated it as an "escape sequence" and fed the string that follows it directly into a database (bad!).
  • Are there other conventions we should follow/avoid?
  • Should the "Link:" header itself be declared in HTTP "OPTIONS"?
  • How does this relate to RDF? Could a normative RDF schema be mapped into relProperty? Or just vice versa?
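
On the semicolon-exploit question, a database-backed implementation can sidestep injection by never splicing URL text into SQL. The sketch below uses Python's sqlite3 module with parameterized queries; the table layout and the function names are assumptions for illustration only.

```python
import sqlite3


def open_store():
    """Create an in-memory property store (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE properties ("
        "resource TEXT, name TEXT, value TEXT, "
        "PRIMARY KEY (resource, name))"
    )
    return conn


def set_property(conn, resource, name, value):
    # The '?' placeholders pass values as data, never as SQL text
    conn.execute(
        "INSERT OR REPLACE INTO properties (resource, name, value) "
        "VALUES (?, ?, ?)",
        (resource, name, value),
    )
    conn.commit()


def get_property(conn, resource, name):
    row = conn.execute(
        "SELECT value FROM properties WHERE resource = ? AND name = ?",
        (resource, name),
    ).fetchone()
    return row[0] if row else None


store = open_store()
# A hostile value that would be dangerous if concatenated into a query
hostile = "x'); DROP TABLE properties; --"
set_property(store, "/doc.html", "owner", hostile)
stored = get_property(store, "/doc.html", "owner")
```

Because the hostile string is bound as a parameter, it is stored and returned verbatim and the table is left intact.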