XMDP Brainstorming

From Microformats Wiki


introduction

Tantek Çelik developed XMDP to define extensions to XHTML including rel values, class names, and <meta name> properties and values. Per the XMDP spec, a link to a microformat's XMDP in the profile attribute of the head element indicates that that microformat's vocabulary is formally defined in the document. A parser could read the allowed attribute values from the linked XMDP and thus know explicitly which microformats may be in use, and which class names are meant to convey which meanings.

This page is for exploring possible additions / extensions to XMDP, contributed by numerous folks in the microformats community.

See xmdp-faq and xmdp-issues for questions and issues.

Some of the below are probably better addressed as questions and/or issues and should be moved to those pages accordingly. -- Tantek

requests from TimBL

At the 2009 W3C Technical Plenary I (Tantek) had a conversation with Tim Berners-Lee about what he would like to see in XMDP to enable rich(er) translation into RDFSchema (RDFS).

The following subsections represent my notes on specific asks/requests/feedback from Tim. Tantek 01:16, 5 November 2009 (UTC)

labels

  • labels are useful for multiple languages
  • "fn" - is a property name
  • rdfs:label would be "formatted name" - but not a long explanation, e.g. also "nom" in French
  • ok to use existing HTML "lang" attribute and standard language codes
  • XMDP should offer labels for terms, with labels in specific (human) languages

serve RDFS using conneg

It would be useful if a request to a microformats profile, e.g. http://microformats.org/profile/hcard, made with an Accept header requesting the MIME type of RDFS (conneg / content negotiation), returned an automatic translation (perhaps using XSLT) of the XMDP to RDFS.
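
A minimal sketch of the server-side choice this request implies, in Python. It assumes the RDFS translation would be offered as application/rdf+xml and the XMDP itself as text/html; the function name and the simplified Accept handling (q-values are honoured, wildcards are ignored) are illustrative only and not part of any XMDP spec.

```python
def pick_representation(accept_header, default="text/html"):
    """Return the media type to serve for a profile URL.

    Assumption: the RDFS translation is offered as application/rdf+xml
    and the XMDP itself as text/html. Wildcards such as */* are ignored
    and fall back to the default.
    """
    offered = ("text/html", "application/rdf+xml")
    best_type, best_q = default, 0.0
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0
        # Keep the offered type with the highest q; the first one wins ties.
        if media_type in offered and q > best_q:
            best_type, best_q = media_type, q
    return best_type
```

A client that sends only "application/rdf+xml" would get the translation; a browser that prefers text/html would keep getting the XMDP page.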

aliasing

TimBL likes to be able to say this term is the same as this other term.

atomic types

It would be useful to specify the atomic type of a microformats property, e.g. one of the following:

  • datetime
  • url/email
  • number/fixed
  • string

TimBL also suggested location lat/long/altitude, however that's more of a composite type (e.g. geo) that is made of multiple atomic types
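
As a rough illustration of what declared atomic types would buy a parser, here is a Python sketch. The converter table, the type keys ("url", "number", and so on) and the function names are hypothetical; XMDP does not currently define any of them. It also shows geo as a composite built from two atomic numbers.

```python
from datetime import datetime
from decimal import Decimal
from urllib.parse import urlparse

# Hypothetical converters keyed by (a subset of) the atomic type names above.
CONVERTERS = {
    "datetime": datetime.fromisoformat,
    "url": lambda text: urlparse(text).geturl(),
    "number": Decimal,
    "string": str,
}

def convert(value, atomic_type):
    """Convert the text of a property value according to its declared type."""
    return CONVERTERS[atomic_type](value.strip())

def convert_geo(latitude, longitude):
    """geo is composite: it is built from two atomic 'number' values."""
    return (convert(latitude, "number"), convert(longitude, "number"))
```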

Possible XMDP Additions

resolving when microformats may be in use

Currently the potential existence of microformats in a document can be declared by referencing the profile URLs for those microformats in the profile attribute of the head element of that document.

In addition to the profile attribute, the rel-profile value is being strongly considered for inclusion in an update to XMDP. See the rel-profile page for details.

In short: another way would be to include the <a rel="profile" href="XMDP URL">powered by microformat xyz</a> within the container element for the microformat. The XMDP spec could then specify that when the <a> element is used in this way, it indicates that the microformat is used by the element containing the <a> element.

Issues:

  • Not every microformat has a container element. Consider rel-tag one of the most widely used microformats.
    • RESOLVED. This is easily resolved by having the context of the rel-profile be the parent of the element with rel-profile and its descendants, or perhaps the later siblings of the element with rel-profile and their descendants.
  • To some extent, using microformats adds to the size of the document, just as using markup adds to the size of a plain text document. Putting <a> elements with each microformat adds unwanted links on top of that.
    • RESOLVED. There is no need to add an <a> for each instance of a microformat, as the profile for a microformat can be declared once, perhaps near the top of the body of the document. In practice, many pages that use microformats already link to the microformats specs themselves with badges or "powered by" links which could easily be modified to link to profiles using <a rel="profile"> hyperlinks, no additional links needed.
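
A minimal Python sketch of how a parser might gather declared profiles from both mechanisms, using only the standard library. It deliberately treats every rel="profile" link as applying to the whole document, so it does not model the parent/sibling scoping discussed in the first issue; the class and function names are illustrative.

```python
from html.parser import HTMLParser

class ProfileCollector(HTMLParser):
    """Collect profile URLs from head@profile and from rel="profile" links."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "head":
            # The profile attribute holds a space-separated list of URLs.
            self._add((attributes.get("profile") or "").split())
        elif tag in ("a", "link"):
            rels = (attributes.get("rel") or "").lower().split()
            href = attributes.get("href")
            if "profile" in rels and href:
                self._add([href])

    def _add(self, urls):
        # Keep first-seen order and skip duplicates.
        for url in urls:
            if url not in self.profiles:
                self.profiles.append(url)

def find_profiles(html_text):
    collector = ProfileCollector()
    collector.feed(html_text)
    collector.close()
    return collector.profiles
```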

root class name identification

Use-case:

It could be quite convenient for "generic/universal" microformat parsers if they could read an XMDP profile and understand which of the defined class names were root class names for microformats, and thus be able to distinguish those object boundaries.

XMDP profiles can and do contain definitions for multiple root class names (e.g. http://microformats.org/wiki/hcard defines "vcard", "adr", and "geo").

possible solutions

XMDP definition flag

Introduce some sort of markup or textual flag that indicates inside an XMDP definition (<dd>) for a class name that the class name may be used as a root class name.

rejected solutions

first class name defined in a profile

One simple thought would be that the first class name defined in a profile (e.g. hcard-profile) is the root class for that microformat.

Critical problem(s):

  • Does not handle the case of multiple root class names in an XMDP, e.g. a microformat that defines multiple possible root class names (hCalendar permits "vcalendar" or "vevent", hAtom permits "hfeed" or "hentry").

publisher linking to root class name

The author including a reference to the XMDP could link directly to the root class name.

<!-- This profile link indicates that "vcard" is a root class name. -->
<head profile="http://www.w3.org/2006/03/hcard#vcard">

Critical problem(s):

  • The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every reference to the XMDP, whereas the XMDP itself should be defining what is a root class.

publisher inline additional class name

Another possibility that may be worth exploring, is the ability to indicate inline in the code that a class name is the root class name for a microformat, rather than (or perhaps in addition to) the XMDP.

E.g.

<span class="vcard ufroot">
 <span class="fn">Tantek Çelik</span>
</span>

would indicate that the element with classname of "vcard" is the root of a microformatted piece of information.

Critical problem(s):

  • The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every instance of the microformat, whereas the XMDP itself should be defining what is a root class.

Possible drawbacks:

  • How would you know which class name (other than "ufroot") was the root class name? e.g.
    class="vcard person ufroot"
    • perhaps by only looking at classes defined in the XMDPs for the document.
    • perhaps by only allowing one root class name in addition to the "ufroot"
    • or perhaps by saying that all of the other class names in the same attribute are root class names, so that for example you could say:
      <span class="root hreview hentry">

This is also very similar to, but not the same as, the mfo problem, and should be considered in that context as an independent solution.

linking to the XMDP

As hinted in the note on "when microformats may be in use", there are additional methods under discussion for linking to the XMDP in addition to the current method of using the profile attribute of the head element:

  • Using <link rel="profile" href="link to XMDP"/>. This method can be used now and will be formalized in XHTML 2.
    • A problem with this method is that it (still) requires access to the head element.
  • Using <a rel="profile" href="link to XMDP">powered by microformat xyz</a> in the body of the document.
    • As noted by a number of people, this approach has the added benefit of creating a viral marketing opportunity for the microformats used. For instance, developers could add badges saying they are using microformat xyz as suggested by the example.
    • Blog authoring environments allow you to insert links at will, so this squarely obviates the need to access the head element.

includes / aggregate profiles

Methods for including one or more values, properties, or an entire XMDP into another XMDP as a way of creating an aggregate profile that effectively contains definitions from multiple profiles would be quite useful. They would enable documents with microformats to simply refer to a single profile URL rather than a complete space-separated set of all the profile URLs of the microformats that may be in use.
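
One way a consumer might flatten such an aggregate profile, sketched in Python. The data shape is an assumption made for illustration: each profile is represented as a dict with hypothetical "terms" and "includes" keys, since XMDP defines no include mechanism today. The sketch guards against profiles that include each other.

```python
def resolve_terms(profile_url, profiles, _seen=None):
    """Return all terms defined by a profile plus those it includes.

    `profiles` maps a profile URL to a dict with two hypothetical keys:
    "terms" (names defined locally) and "includes" (other profile URLs).
    """
    seen = set() if _seen is None else _seen
    if profile_url in seen:  # guard against include cycles
        return set()
    seen.add(profile_url)
    profile = profiles.get(profile_url, {})
    terms = set(profile.get("terms", ()))
    for included in profile.get("includes", ()):
        terms |= resolve_terms(included, profiles, seen)
    return terms
```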

vocabulary aliasing

An XMDP document could be used to define a microformat profile that is nothing more than a simple dictionary mapping between an existing, non-standard set of HTML classes and the terms in a standard microformat profile. This would allow a publisher to support a given microformat by merely using the URI of a new profile document as the value of an individual document's head/profile attribute, rather than modifying the individual class values throughout each document to conform to an existing profile. Initial suggestion with use case description in this microformats-discuss post. Note (from Kevin's response) that HTML class attributes can contain multiple values, e.g. class="post hentry", so a publisher doesn't have to discard their existing class values to use those of a microformat.
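
A small Python sketch of what a consumer could do with such a mapping once it has been read from an aliasing profile. The dict form of the mapping and the function name are assumptions; how the mapping would be expressed in XMDP markup is exactly what remains undecided.

```python
def apply_aliases(class_value, aliases):
    """Expand a class attribute value with the microformat terms it aliases.

    `aliases` maps a publisher's own class name to a microformat class name.
    Original class names are kept; mapped terms are appended once.
    """
    tokens = class_value.split()
    result = list(tokens)
    for token in tokens:
        mapped = aliases.get(token)
        if mapped and mapped not in result:
            result.append(mapped)
    return " ".join(result)
```

With a mapping of "post" to "hentry", a parser would treat class="post" as if the publisher had written class="post hentry".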

subclassing / ontology addition

One may want to introduce a new property (or value) and base it on an existing property (or value). In this sample XMDP, the value "self" is defined, based on the value "me" from XFN 1.1:


<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
   <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>

Two interesting pieces have been added: a URL with a fragment identifier pointing into another XMDP profile, and a rev attribute. The rev value in this example is 'extends'. This means that the value the link refers to ("me") is extended by the value "self". So you could make an XMDP that lists all the possible rev values, 'extends', 'inverse', 'equivalent', etc. Then you could 'alias' one microformat property to another.

A universal XMDP validator/parser/etc could extract data across two or more XMDP profiles and potentially reason between them. This could create a small ontology.
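
As a sketch of the extraction step only, the following Python reads the sample definition list shown earlier and returns (term, relation, target) triples that a reasoner could consume. It assumes the fragment is well-formed XML without an XHTML namespace; the function name is illustrative, and 'extends' is the proposed rev value from this section, not an established one.

```python
import xml.etree.ElementTree as ET

SAMPLE = """
<dl class="rel">
  <dt id="self"><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
  <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>
"""

def term_relations(xmdp_fragment):
    """Return (term, relation, target URL) triples found in a definition list."""
    root = ET.fromstring(xmdp_fragment.strip())
    triples = []
    for dt in root.iter("dt"):
        for link in dt.iter("a"):
            # rev may hold several space-separated values.
            for relation in link.get("rev", "").split():
                triples.append((dt.get("id"), relation, link.get("href")))
    return triples
```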

It is not clear if this idea actually has utility or is simply a solution looking for a problem.

XMDP XML Schema

The link shows a bad example of creating XMDP from an XSD schema. The big question, I guess, is why? Having XMDP defined in XSD should make it easier for machines to read Microformats: rules and strict data typing will allow Microformats to be validated when contained within an XML/XHTML document. If a document is using microformats with an XSD behind them, simple XPath queries can be used to harvest the information, which can then be rendered to straight XML for translation to RDF or other XML transport formats.

XSD behind XMDP also has distinct advantages for CMS authors: the XSD sitting behind xforms or sxforms to allow data entry into a CMS can also be used to generate XMDP and valid Microformats when rendering content. This in theory should make it easier for CMS authors to develop a semantic core around data before exporting to XHTML + Microformats, RDF etc. and/or make data querying via web services a little more straightforward.

Follow up

Having looked into Microformats a little more I realise how bad that example is; however I still feel that placing a schema behind XMDP is a worthwhile exercise. I don't mind spending a little time on this if anyone feels it's a worthwhile exercise, but I'd propose the following:

  • Define a loose set of microformat conventions (i.e. a meta property will be bound to an attribute etc.), and have these defined in a microformat namespace (mf:?).
  • Create an XSD for common microformat fields without structures (dtStart etc.), with XSD typing and mf: rules (i.e. mf:optional-html-attribute-binding="title" or mf:html-attribute-binding="href" - names were never my strong point)
  • Start working towards creating XSD schema including the common schema for agreed specifications

There would still need to be some form of link between the XMDP and the defining XSD (profile attribute or link element?). With these in place it should be possible for an application like tails, or new apps to pick up on any Microformat in a page and display the data, without the application having to be aware of the specific Microformat standard.

Microformats are cool, especially the fact that you don't have to be a rocket scientist to start using them. However if there can be a way of interleaving grassroots microformat adoption into the more complex semantic forms (RDF etc.), through XML then that's got to be a bonus?

more here

ID Attribute

  • A problem that I've had using XMDP is that it requires the use of the ID attribute (e.g. <dt id="foo">foo</dt>) to define the term "foo". As (X)HTML only allows one element with any given ID, this raises problems if you need to define the same term multiple times -- e.g. to define "category" as a class within both hcard and hcalendar, or to define "copyright" as both a class value and a rel value. TobyInk 06:26, 18 Feb 2008 (PST)
    • Two things. First, "category" MUST NOT be different between hCard and hCalendar, and thus it is a feature, not a problem, that there can only be one id="category" between the two of them. Second, for the rel case, this is solved by using ID values prefixed with "rel-" for rel values. E.g. in http://gmpg.org/xmdp/1, rel-profile is defined with id="rel-profile", and the class name "profile" is defined with id="profile". Tantek 17:48, 4 October 2009 (UTC)

automatic parsability enabling

The current XMDP is useful for people to read and learn about a microformat, but of very limited utility to automate parsing microformats/poshformats (simply identification of vocabulary to parse for, and what attributes to parse for them). It would be nice if people could design their own poshformats, create an XMDP profile, and for the poshformat to be thus instantly parsable by machines. Here is the information that I think would need to be added to XMDP for this to be possible:

For each profile defined:

  • What is/are the root class name(s) (as previously brainstormed above: root class name identification) of the microformats being defined by the XMDP (required)
  • What are the properties of each microformat? Or alternatively (and preferably), which microformat(s) may a property be used with? (to handle the common and encouraged case of vocabulary re-use across microformats) (required)

For each property defined:

  • A human-readable description of what the property means (XMDP already has this)
  • Is it a class/rel/id (or rev, but deprecated) value (XMDP already has this)
  • Is it singular or plural? (default: plural)
  • What datatype is it? (e.g. text, URI, email, datetime, duration. default:text)
  • Might it contain a nested poshformat/microformat? If so, then this profile should link to the profile of the nested poshformat /microformat. (Multiple formats could be defined in the same XMDP profile, using ID attributes to link from one to the other.)
  • What nested subproperties might be found within it? Or alternatively (and preferably), whether a property is actually a subproperty, and if so, which properties may it be used inside? (again, to handle the common and encouraged case of vocabulary re-use) (Perhaps this could be indicated using a nested profile.)
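
To make the list above concrete, here is one possible in-memory model of that information, sketched in Python. Every class and field name is hypothetical; the defaults (plural, text) follow the suggestions in the list, and `used_in` records which roots or parent properties a property may appear inside, which is the preferred direction for vocabulary re-use.

```python
from dataclasses import dataclass, field

@dataclass
class PropertyDef:
    name: str
    description: str = ""
    kind: str = "class"        # "class", "rel", "id" (or deprecated "rev")
    plural: bool = True        # default suggested above: plural
    datatype: str = "text"     # default suggested above: text
    nested_profile: str = ""   # URL or fragment of a nested format, if any
    used_in: list = field(default_factory=list)  # roots or parent properties

@dataclass
class ProfileDef:
    roots: list
    properties: dict = field(default_factory=dict)

    def add(self, prop):
        self.properties[prop.name] = prop

    def properties_of(self, parent):
        """Names of the properties that may appear inside `parent`."""
        return sorted(
            p.name for p in self.properties.values() if parent in p.used_in
        )
```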

We must expect that there will always be some parsing rules (e.g. hAtom's "hunt the author" game) which will not be expressible in a machine readable profile format, but it may be possible to cover 90% of the information a parser should need for most microformats.

Indeed, experience has shown that any "real world" semantic markup language that gets significant use requires LOTS of special custom parsing rules (e.g. HTML is not fully parseable simply from the DTD, nor is RSS from the RSS DTD).

Thus while it may make sense to take incremental steps towards capturing more about a microformat in XMDP, full enabling of machine parsability should not be a short-term (nor even medium-term) goal, as others have tried (DTD, RelaxNG, XML Schema) and failed to achieve this.

See Also