xmdp-brainstorming: Difference between revisions

From Microformats Wiki
m (Replace <entry-title> with {{DISPLAYTITLE:}})
 
(28 intermediate revisions by 10 users not shown)
= XMDP Brainstorming =
{{DISPLAYTITLE:XMDP Brainstorming}}


__TOC__
== introduction ==
Tantek Çelik developed [http://gmpg.org/xmdp/ XMDP] to define extensions to XHTML including rel values, class names, and &lt;meta name&gt; properties and values.  Per the [http://gmpg.org/xmdp/description XMDP spec], a link to a microformat's XMDP in the profile attribute of the head element indicates that that microformat's vocabulary is formally defined in the document.  A parser could read the allowed attribute values from the linked XMDP and thus know explicitly which microformats may be in use, and which class names are meant to convey which meanings.


== Authors ==
This page is for exploring possible additions / extensions to XMDP, contributed by numerous folks in the microformats community.


* [http://tantek.com/log/ Tantek Çelik]
See [[xmdp-faq]] and [[xmdp-issues]] for questions and issues.
* [http://thecommunityengine.com/home Bud Gibson]


Add your name here if you make significant contributions to this page and wish to take responsibility for them.
Some of the below are probably better addressed as questions and/or issues and should be moved to those pages accordingly. -- [[User:Tantek|Tantek]]


=== UNDER CONSTRUCTION ===
== requests from TimBL ==
At the [http://www.w3.org/2009/11/TPAC/ 2009 W3C Technical Plenary] I (Tantek) had a conversation with Tim Berners-Lee about what he would like to see in XMDP to enable rich(er) translation into RDFSchema (RDFS).


NOTE: This page is currently a bit of a mishmash of [[xmdp-faq]] , [[xmdp-issues]], and XMDP brainstorming.  I'm going to need to spend some time separating all this out.  - [http://tantek.com/log/ Tantek Çelik]
The following subsections represent my notes on specific asks/requests/feedback from Tim. [[User:Tantek|Tantek]] 01:16, 5 November 2009 (UTC)


= XMDP brainstorming =
=== labels ===


== Introduction ==
* labels are useful for multiple languages
* "fn" - is a property name
* rdfs:label would be "formatted name" - but not a long explanation, e.g. also "nom" in French
* ok to use existing HTML "lang" attribute and standard language codes
* XMDP should offer labels for terms, with labels in specific (human) languages


Tantek Çelik, Matt Mullenweb and Eric Meyer have developed the [http://gmpg.org/xmdp/ XMDP] to define extensions to XHTML including rel values, class names, and &lt;meta name&gt; properties and values. Per the [http://gmpg.org/xmdp/description XMDP spec], a link to a microformat's XMDP in the profile attribute of the head element indicates that that microformat's vocabulary is formally defined in the document.  A parser could read the allowed attribute values from the linked XMDP and use their presence in the document to infer that that particular microformat was in use.
=== serve RDFS using conneg ===
It would be useful if requests to microformats profiles (e.g. http://microformats.org/profile/hcard), when made with an Accept header requesting the MIME type of RDFS (conneg / content negotiation), returned an automatic translation (perhaps using XSLT) of the XMDP to RDFS.


=== Raised Issues ===
=== aliasing ===
* Just because a profile value mentioned in a microformat's linked XMDP also appears in the document does not mean that that microformat is in use.  Such co-occurrences could be purely by chance.
TimBL likes to be able to say this term is the same as this other term.
** REJECTED. No, this does not make sense.  By definition, an XMDP profile defines certain properties and values.  Any use of such a property or value in the document is thus defined by the definition in the XMDP.
** [[User:Bud|Bud]] 20:01, 13 Jul 2005 (PDT): Actually, this is far from clear.  Reading this excerpt from [http://gmpg.org/xmdp/description the XMDP description]:  "This specification does not define a set of legal meta data properties. The meaning of a property and the set of legal values for that property should be defined in a reference lexicon called a profile. For example, a profile designed to help search engines index documents might define properties such as "author", "copyright", "keywords", etc." seems not to imply exclusivity for the whole document, only for the part covered by the profile.  If we assumed the quoted words implied exclusivity for the whole document, then only defined attribute values could be used '''for the whole document'''.  The current usage suggests that we mean the profile to only cover the part of the document covered by the microformat.  As such, we cannot use occurrence of a value to connote presence of the microformat.  Consider this example, xFolk and hCalendar both use a description class attribute value.  Presence of that value is therefore indeterminate as to which format is being used, even if we accepted your claim here, which seems dubious.
** Bud, that quote you give is XMDP quoting HTML4; please re-read the XMDP spec more carefully.  This is a non-issue.
* Currently, the XMDP can only be linked from the profile attribute of the head element.  In many instances, authors will not have access to the head element.
** ACCEPTED. There are two additional proposed ways to link to XMDP profiles
**# <code>&lt;link rel="profile"&gt;</code>, as introduced in the XMDP poster submitted to WWW2005.
**# <code>&lt;a rel="profile" href&gt;</code>, as similarly discussed.


* Documents with user-generated content are hard to parse, and microformats present particular parsing challenges.
=== atomic types ===
** REJECTED. This is a straw man issue.
It would be useful to specify the atomic type of a microformats property, e.g. one of the following:
** [[User:Bud|Bud]] 19:44, 13 Jul 2005 (PDT): Tantek needs to supply some justification for why this is a strawman as every developer I have talked to has raised it.  It may be that the solutions described below are sufficient to solve the issue. More neutral statements to that effect might be more constructive.
* datetime
** Bud, saying "particular parsing challenges", without stating them is meaningless.  Hence strawman.  I think you may be mistaking questions for issues.
* url/email
* number/fixed
* string


''Feel free to add issues here.  Keep issues in this list in summary form. Save lengthy discussion and potential solutions for elaboration below.''
TimBL also suggested location lat/long/altitude; however, that is more of a composite type (e.g. [[geo]]) made of multiple atomic types.


== Addressing issues ==
== Possible XMDP Additions ==
=== resolving when microformats may be in use ===
Currently the potential existence of microformats in a document can be declared by referencing the profile URLs for those microformats in the profile attribute of the head element of that document.


These are in no particular order, but an issue should appear in the issues list above if it is addressed here.
In addition to the profile attribute, the [[rel-profile]] value is being strongly considered for inclusion in an update to XMDP.  See the [[rel-profile]] page for details.


=== Linking to the XMDP ===
In short: another way would be to include the <nowiki><a rel="profile" href="XMDP URL">powered by microformat xyz</a></nowiki> within the container element for the microformat.  The XMDP spec could then specify that when the <a> element is used in this way, it indicates that the microformat is used by the element containing the <a> element.


There are at least two additional methods under discussion for linking to the XMDP in addition to the current method of using the profile attribute of the head element:
Issues:
 
* Not every microformat has a container element.  Consider [[rel-tag]], one of the most widely used microformats.
** RESOLVED. This is easily resolved by having the context of the [[rel-profile]] be the parent of the element with [[rel-profile]] and its descendants, or perhaps later siblings of the element with [[rel-profile]] and their descendants.
* To some extent, using microformats adds to the size of the document, just as using markup adds to the size of a plain text document.  Putting <nowiki><a></nowiki> elements with each microformat adds unwanted links on top of that.
** RESOLVED. There is no need to add an <nowiki><a></nowiki> for each instance of a microformat, as the profile for a microformat can be declared once, perhaps near the top of the body of the document. In practice, many pages that use microformats already link to the microformats specs themselves with badges or "powered by" links which could easily be modified to link to profiles using <code>&lt;a rel="profile"&gt;</code> hyperlinks, no additional links needed.
 
=== root class name identification ===
Use-case:
 
It could be quite convenient for "generic/universal" microformat parsers if they could read an XMDP profile and understand which of the defined class names were ''root'' class names for microformats, and thus be able to distinguish those object boundaries.
 
XMDP profiles can and do contain definitions for multiple root class names (e.g. http://microformats.org/wiki/hcard defines "vcard", "adr", and "geo").
 
==== possible solutions ====
===== XMDP definition flag =====
Introduce some sort of markup or textual flag that indicates inside an XMDP definition (&lt;dd&gt;) for a class name that the class name may be used as a root class name.
 
==== rejected solutions ====
===== first class name defined in a profile =====
One simple thought would be that the ''first'' class name defined in a profile
(e.g. [[hcard-profile]]) is the root class for that microformat.
 
Critical problem(s):
* Does not handle the case of multiple root class names in an XMDP. E.g. a microformat that defines multiple possible root class names (e.g. [[hcalendar|hCalendar]] permits "vcalendar" or "vevent", [[hatom|hAtom]] permits "hfeed" or "hentry").
 
===== publisher linking to root class name =====
The author including a reference to the XMDP could link directly to the root class name.
<pre>&lt;!-- This profile link indicates that "vcard" is a root class name. -->
&lt;head profile="http://www.w3.org/2006/03/hcard#vcard"></pre>
 
Critical problem(s):
* The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every reference to the XMDP, whereas the XMDP itself should be defining what is a root class.
 
===== publisher inline additional class name =====
Another possibility that may be worth exploring is the ability to indicate inline in the code that a class name is the root class name for a microformat, rather than (or perhaps in addition to) in the XMDP.
 
E.g.
<source lang=html4strict>
<span class="vcard ufroot">
<span class="fn">Tantek Çelik</span>
</span>
</source>
 
would indicate that the element with classname of "vcard" is the root of a microformatted piece of information.
 
Critical problem(s):
* The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every instance of the microformat,  whereas the XMDP itself should be defining what is a root class.
 
Possible drawbacks:
* How would you know which class name (other than "ufroot") was the root class name? e.g. <pre>class="vcard person ufroot"</pre>
** perhaps by only looking at classes defined in the XMDPs for the document.
** perhaps by only allowing one root class name in addition to the "ufroot"
** or perhaps by saying that all of the other class names in the same attribute are root class names (so that, for example, you could say: <pre>&lt;span class="ufroot hreview hentry"&gt;</pre>)
 
This is also very similar to, but not the same as, the [[mfo]] problem, and should be considered in that context as an independent solution.
 
=== linking to the XMDP ===
 
As hinted in the note on "when microformats may be in use", there are additional methods under discussion for linking to the XMDP in addition to the current method of using the profile attribute of the head element:
* Using <nowiki><link rel="profile" href="link to XMDP"/></nowiki>.  This method can be used now and will be formalized in XHTML 2.   
** A problem with this method is that it requires access to the head element.
** A problem with this method is that it (still) requires access to the head element.
* Using <nowiki><a rel="profile" href="link to XMDP">powered by microformat xyz</a></nowiki> in the body of the document.
** As noted by a number of people, this approach has the added benefit of creating a viral marketing opportunity for the microformats used.  For instance, developers could add badges saying they are using microformat xyz as suggested by the example.
** Blog authoring environments allow you to insert links at will, so this squarely <abbr title="avoids">obviates</abbr> the need to access the head element.


It should be noted that none of these linking solutions addresses the issue of when exactly the microformat is being used in the document.  They only indicate that the microformat may be in use.
=== includes / aggregate profiles===
 
Methods for including one or more values, properties, or an entire XMDP into another XMDP as a way of creating an aggregate profile that effectively contains definitions from multiple profiles would be quite useful.  They would enable documents with microformats to simply refer to a single profile URL rather than a complete space-separated set of all the profile URLs of the microformats that may be in use.
 
=== vocabulary aliasing ===
 
An XMDP document could be used to define a microformat profile that is nothing more than a simple dictionary mapping between an existing, non-standard set of HTML classes and the terms in a standard microformat profile. This would allow a publisher to support a given microformat by merely using the URI of a new profile document as the value of an individual document's head/profile attribute, rather than modifying the individual class values throughout each document to conform to an existing profile. Initial suggestion with use case description in this  [http://microformats.org/discuss/mail/microformats-discuss/2005-October/001623.html microformats-discuss post]. Note (from [http://microformats.org/discuss/mail/microformats-discuss/2005-October/001633.html Kevin's response]) that HTML class attributes can contain multiple values, e.g. class="post hentry", so a publisher doesn't have to discard their existing class values to use those of a microformat.
 
=== subclassing / ontology addition ===
One may want to introduce a new property (or value) and base it on an existing  property (or value).  In this sample XMDP, the value "self" is defined, based on the value "me" from XFN 1.1:
 
<pre>
<nowiki>
<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
  <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>
</nowiki>
</pre>
There are two interesting pieces that have been added: a URL with an anchor to another XMDP profile, and a rev attribute. The rev value in
this example is 'extends'. This means that the term being referred to is extended by the property "self". So you could make an XMDP that
lists all the possible rev values ('extends', 'inverse', 'equivalent', etc.). Then you could 'alias' one microformat property to another.
 
A universal XMDP validator/parser/etc could extract data across two or more XMDP profiles and potentially reason between them. This could create a small ontology.
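
As a rough illustration of what such a parser might extract, the following Python sketch reads the sample definition list above and collects (term, relation, target) triples from <code>rev</code> links inside <code>&lt;dt&gt;</code> elements. The class and function names are invented for this example, and the convention of putting the relation in <code>rev</code> is only the proposal described here, not part of XMDP.

```python
from html.parser import HTMLParser


class TermRelationFinder(HTMLParser):
    """Collect (term id, rev value, href) triples from links inside <dt>."""

    def __init__(self):
        super().__init__()
        self.current_term = None   # id of the <dt> we are currently inside
        self.relations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dt":
            self.current_term = attrs.get("id")
        elif tag == "a" and self.current_term:
            rev, href = attrs.get("rev"), attrs.get("href")
            if rev and href:
                # rev may hold several space-separated values
                for relation in rev.split():
                    self.relations.append((self.current_term, relation, href))

    def handle_endtag(self, tag):
        if tag == "dt":
            self.current_term = None


def find_relations(xmdp_html):
    finder = TermRelationFinder()
    finder.feed(xmdp_html)
    finder.close()
    return finder.relations


# The sample XMDP fragment from this section.
SAMPLE = """
<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
  <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>
"""

print(find_relations(SAMPLE))
```

A reasoner could then follow each target URL to the other profile and treat "self" wherever "me" is understood.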
 
It is not clear if this idea actually has utility or is simply a solution looking for a problem.
 
=== XMDP XML Schema ===
* [http://www.redantdesign.com/hcard/ XSD and XMDP for Microformats]
The link shows a bad example of creating XMDP from an XSD schema.  The big question I guess is why?
Having XMDP defined in XSD should make it easier for machines to read Microformats; rules and strict data typing will allow Microformats to be validated when contained within an XML/XHTML document.  If a document is using microformats with an XSD behind them, simple XPath queries can be used to harvest the information, which can then be rendered to straight XML for translation to RDF or other XML transport formats.
 
XSD behind XMDP also has distinct advantages for CMS authors, the XSD sitting behind xforms or sxforms to allow data entry into a CMS can be used to generate XMDP and valid Microformats when rendering content.  This in theory should make it easier for CMS authors to develop a semantic core around data before exporting to XHTML + Microformats, RDF etc. and/or make data querying via web services a little more straightforward.
 
==== Follow up ====
Having looked into Microformats a little more I realise how bad that example is; however I still feel that placing a schema behind XMDP is a worthwhile exercise.  I don't mind spending a little time on this if anyone feels it's a worthwhile exercise, but I'd propose the following:
* Define a loose set of microformat conventions (i.e. a meta property will be bound to an attribute etc.), and have these defined in a microformat namespace (mf:?).
* Create an XSD for common microformat fields without structures (dtStart etc.), with XSD typing and mf: rules (i.e. mf:optional-html-attribute-binding="title" or mf:html-attribute-binding="href" - names were never my strong point)
* Start working towards creating XSD schema including the common schema for agreed specifications
 
There would still need to be some form of link between the XMDP and the defining XSD (profile attribute or link element?).  With these in place it should be possible for an application like tails, or new apps to pick up on any Microformat in a page and display the data, without the application having to be aware of the specific Microformat standard.


No. That is false.  Referencing an XMDP introduces its definitions into the document.  Period. Those definitions then take effect for the properties and values defined therein.
Microformats are cool, especially the fact that you don't have to be a rocket scientist to start using them.  However, if there can be a way of interleaving grassroots microformat adoption into the more complex semantic forms (RDF etc.) through XML, then that's got to be a bonus?


[[User:Bud|Bud]] 20:06, 13 Jul 2005 (PDT): Again, a read of the text I quote above does not support this conclusion. If it did, you could only use values defined in the XMDP.
[http://www.redantdesign.com/hcard/take2.asp more here]


Bud, see above, you are confusing a quote for prose in the spec. It's marked up and displayed and cited as a blockquote.  Please read the spec more carefully. And where did you get "only" from?
== ID Attribute ==
<div class="discussion">
* A problem that I've had using XMDP is that it requires the use of the ID attribute (e.g. &lt;dt id="foo">foo&lt;/dt>) to define the term "foo". As (X)HTML only allows one element with any given ID, this raises problems if you need to define the same term multiple times -- e.g. to define "category" as a class within both hcard and hcalendar, or to define "copyright" as both a class value and a rel value. [[User:TobyInk|TobyInk]] 06:26, 18 Feb 2008 (PST)
** Two things. First, "category" MUST NOT be different between hCard and hCalendar, and thus it is a feature, not a problem, that there can only be one id="category" between the two of them.  Second, for the rel case, this is solved by using ID values prefixed with "rel-" for rel values. E.g. in http://gmpg.org/xmdp/1, rel-profile is defined with id="rel-profile", and the class name "profile" is defined with id="profile". [[User:Tantek|Tantek]] 17:48, 4 October 2009 (UTC)
</div>
== automatic parsability enabling ==


=== Resolving when microformats are actually in use ===
The current XMDP is useful for people to read and learn about a microformat, but of very limited utility to automate parsing microformats/[[poshformats]] (simply identification of vocabulary to parse for, and what attributes to parse for them). It would be nice if people could design their own poshformats, create an XMDP profile, and for the poshformat to be thus instantly parsable by machines. Here is the information that I think would need to be added to XMDP for this to be possible:


One solution to this issue is simply to include the <nowiki><a rel="profile" href="link to XMDP">powered by microformat xyz</a></nowiki> within the container element for the microformat.  The XMDP spec could then specify that when the <a> element is used in this way, it indicates that the microformat is used by the element containing the <a> element.
For each profile defined:


There are, however, several clear issues with this proposal:
* What is/are the root class name(s) (as previously brainstormed above: [[xmdp-brainstorming#root_class_name_identification|root class name identification]]) of the microformats being defined by the XMDP (required)
* What are the properties of each microformat? Or alternatively (and preferably), which microformat(s) may a property be used with? (to handle the common and encouraged case of vocabulary re-use across microformats) (required)


* Not every microformat has a container element.  Consider [[reltag]] one of the most widely used microformats.
For each property defined:
* To some extent, using microformats adds to the cost of writing the document.  It's like filling in a form just to write your thoughts.  Putting <a> elements with each microformat adds unwanted links on top of that.


=== Parsing microformats ===
* A human-readable description of what the property means (XMDP already has this)
* Is it a class/rel/id (or rev, but deprecated) value (XMDP already has this)
* Is it singular or plural? (default: plural)
* What datatype is it? (e.g. text, URI, email, datetime, duration. default:text)
* Might it contain a nested poshformat/microformat? If so, then this profile should link to the profile of the nested poshformat/microformat. (Multiple formats could be defined in the same XMDP profile, using ID attributes to link from one to the other.)
* What nested subproperties might be found within it? Or alternatively (and preferably), whether a property is actually a subproperty, and if so, which properties may it be used inside? (again, to handle the common and encouraged case of vocabulary re-use) (Perhaps this could be indicated using a nested profile.)
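
To make the per-property list above concrete, here is a minimal Python sketch of the record a parser might build for each property once a profile carried this information. Every field name is an assumption chosen for illustration; nothing here is defined by XMDP today. The defaults follow the list above (plural, text).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PropertyInfo:
    """Hypothetical machine-readable facts about one profile property."""
    name: str
    attribute: str = "class"               # class, rel, id (or rev)
    plural: bool = True                    # default per the list above
    datatype: str = "text"                 # text, URI, email, datetime, duration
    nested_profile: Optional[str] = None   # URL of a nested format's profile
    used_inside: Tuple[str, ...] = ()      # root or parent class names


def keep_values(info, found):
    """Apply the singular/plural rule to the values a parser found."""
    values = list(found)
    return values if info.plural else values[:1]


# Illustrative entries only; a real profile would supply these facts.
FN = PropertyInfo(name="fn", plural=False, used_inside=("vcard",))
URL = PropertyInfo(name="url", datatype="URI", used_inside=("vcard",))

print(keep_values(FN, ["Tantek Çelik", "Tantek"]))
print(keep_values(URL, ["http://tantek.com/", "http://tantek.com/log/"]))
```

Recording which roots a property may be used inside (rather than listing properties per root) matches the preference stated above for vocabulary re-use across microformats.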


Parsing user-generated content is challenging.  Frequently, it does not validate and may not even be well formed. Therefore, microformat discovery mechanisms that depend on documents having even minimal xml properties like well-formedness will often fail. This is true, in particular, of [http://suda.co.uk/projects/X2V/ Brian Suda's frequently cited X2V hCard and hCalendar discovery and transformation prototypes] which use XSLT.
We must expect that there will always be some parsing rules (e.g. hAtom's [http://microformats.org/wiki/hatom#Entry_Author "hunt the author" game]) which will not be expressible in a machine readable profile format, but it may be possible to cover 90% of the information a parser should need for most microformats.


However, most microformats, which tend to be agnostic about things like exact element type used, typically require that the developer resort to tools like XPATH that assume well-formedness. Mark Pilgrim's example [http://sourceforge.net/projects/feedparser/ universal feed parser] suggests that it may be possible to sanitize user html to an extent that it is suitable for later processing as xml.
Indeed, experience has shown that any "real world" semantic markup language that gets significant use requires LOTS of special custom parsing rules (e.g. HTML is not fully parseable simply from the DTD, nor is RSS from the RSS DTD).


From a pragmatic developer perspective, parsing web pages to discover microformats is likely to be an area of much work.
Thus while it may make sense to take incremental steps towards capturing more about a microformat in XMDP, full enabling of machine parsability should not be a short-term (nor even medium-term) goal, as others have tried (DTD, RelaxNG, XML Schema) and failed to achieve this.


Bud, this section is conflating several questions and issues and needs to be broken down further in order to make sense.
== See Also ==
* [[xmdp]]
* [[xmdp-faq]]
* [[xmdp-issues]]

Latest revision as of 16:35, 18 July 2020


introduction

Tantek Çelik developed XMDP to define extensions to XHTML including rel values, class names, and <meta name> properties and values. Per the XMDP spec, a link to a microformat's XMDP in the profile attribute of the head element indicates that that microformat's vocabulary is formally defined in the document. A parser could read the allowed attribute values from the linked XMDP and thus know explicitly which microformats may be in use, and which class names are meant to convey which meanings.

This page is for exploring possible additions / extensions to XMDP, contributed by numerous folks in the microformats community.

See xmdp-faq and xmdp-issues for questions and issues.

Some of the below are probably better addressed as questions and/or issues and should be moved to those pages accordingly. -- Tantek

requests from TimBL

At the 2009 W3C Technical Plenary I (Tantek) had a conversation with Tim Berners-Lee about what he would like to see in XMDP to enable rich(er) translation into RDFSchema (RDFS).

The following subsections represent my notes on specific asks/requests/feedback from Tim. Tantek 01:16, 5 November 2009 (UTC)

labels

  • labels are useful for multiple languages
  • "fn" - is a property name
  • rdfs:label would be "formatted name" - but not a long explanation, e.g. also "nom" in French
  • ok to use existing HTML "lang" attribute and standard language codes
  • XMDP should offer labels for terms, with labels in specific (human) languages

serve RDFS using conneg

It would be useful if requests to microformats profiles (e.g. http://microformats.org/profile/hcard), when made with an Accept header requesting the MIME type of RDFS (conneg / content negotiation), returned an automatic translation (perhaps using XSLT) of the XMDP to RDFS.
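
The server-side decision could look roughly like the Python sketch below, which picks a representation from the Accept header. It assumes RDFS would be served as RDF/XML (application/rdf+xml); the request above does not name a media type, and the function names are made up for this example. The XSLT translation itself is not shown.

```python
# Assumption: the RDFS translation is served as RDF/XML.
RDF_XML = "application/rdf+xml"
HTML = "text/html"


def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so ties keep the order the client listed them in
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_representation(accept_header):
    """Pick RDF/XML only when the client prefers it; otherwise the XMDP HTML."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == RDF_XML:
            return RDF_XML
        if media_type in (HTML, "application/xhtml+xml", "text/*", "*/*"):
            return HTML
    return HTML


print(choose_representation("application/rdf+xml"))
print(choose_representation("text/html, application/rdf+xml;q=0.5"))
```

Falling back to HTML keeps existing browsers and parsers working unchanged, since the XMDP itself remains the default representation.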

aliasing

TimBL likes to be able to say this term is the same as this other term.

atomic types

It would be useful to specify the atomic type of a microformats property, e.g. one of the following:

  • datetime
  • url/email
  • number/fixed
  • string

TimBL also suggested location lat/long/altitude; however, that is more of a composite type (e.g. geo) made of multiple atomic types.
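
A parser that knew a property's atomic type could run a basic check on extracted values. The Python sketch below is only one loose interpretation of the four types listed above; the type names used as dictionary keys and the checks themselves are assumptions, not anything specified by XMDP.

```python
from datetime import datetime
from urllib.parse import urlparse


def is_datetime(value):
    """Accept ISO 8601 forms that datetime.fromisoformat understands."""
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False


def is_url(value):
    """Loose check: some scheme plus something after it (covers mailto:)."""
    parts = urlparse(value)
    return bool(parts.scheme) and bool(parts.netloc or parts.path)


def is_number(value):
    try:
        float(value)
        return True
    except ValueError:
        return False


# Hypothetical type names, following the list above.
ATOMIC_TYPES = {
    "datetime": is_datetime,
    "url": is_url,
    "number": is_number,
    "string": lambda value: True,
}


def check_value(type_name, value):
    """Return True if value is acceptable for the named atomic type."""
    return ATOMIC_TYPES[type_name](value)


print(check_value("datetime", "2009-11-05T01:16:00"))
print(check_value("url", "mailto:someone@example.org"))
print(check_value("number", "not a number"))
```

A composite type such as geo would then be described as a group of properties, each with its own atomic type.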

Possible XMDP Additions

resolving when microformats may be in use

Currently the potential existence of microformats in a document can be declared by referencing the profile URLs for those microformats in the profile attribute of the head element of that document.

In addition to the profile attribute, the rel-profile value is being strongly considered for inclusion in an update to XMDP. See the rel-profile page for details.

In short: another way would be to include the <a rel="profile" href="XMDP URL">powered by microformat xyz</a> within the container element for the microformat. The XMDP spec could then specify that when the <a> element is used in this way, it indicates that the microformat is used by the element containing the <a> element.

Issues:

  • Not every microformat has a container element. Consider rel-tag, one of the most widely used microformats.
    • RESOLVED. This is easily resolved by having the context of the rel-profile be the parent of the element with rel-profile and its descendants, or perhaps later siblings of the element with rel-profile and their descendants.
  • To some extent, using microformats adds to the size of the document, just as using markup adds to the size of a plain text document. Putting <a> elements with each microformat adds unwanted links on top of that.
    • RESOLVED. There is no need to add an <a> for each instance of a microformat, as the profile for a microformat can be declared once, perhaps near the top of the body of the document. In practice, many pages that use microformats already link to the microformats specs themselves with badges or "powered by" links which could easily be modified to link to profiles using <a rel="profile"> hyperlinks, no additional links needed.

root class name identification

Use-case:

It could be quite convenient for "generic/universal" microformat parsers if they could read an XMDP profile and understand which of the defined class names were root class names for microformats, and thus be able to distinguish those object boundaries.

XMDP profiles can and do contain definitions for multiple root class names (e.g. http://microformats.org/wiki/hcard defines "vcard", "adr", and "geo").

possible solutions

XMDP definition flag

Introduce some sort of markup or textual flag that indicates inside an XMDP definition (<dd>) for a class name that the class name may be used as a root class name.
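
A minimal Python sketch of how a generic parser might consume such a flag follows. The flag used here, a class value of "xmdp-root" on the <dd>, is purely hypothetical (no flag has been agreed), and the profile fragment is a simplified illustration rather than the real hCard profile.

```python
from html.parser import HTMLParser

ROOT_FLAG = "xmdp-root"  # hypothetical flag name, not part of XMDP


class RootClassFinder(HTMLParser):
    """Collect ids of <dt> terms whose following <dd> carries ROOT_FLAG."""

    def __init__(self):
        super().__init__()
        self.current_term = None   # id of the most recent <dt>
        self.roots = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dt":
            self.current_term = attrs.get("id")
        elif tag == "dd" and self.current_term:
            classes = (attrs.get("class") or "").split()
            if ROOT_FLAG in classes:
                self.roots.append(self.current_term)


def find_root_class_names(xmdp_html):
    finder = RootClassFinder()
    finder.feed(xmdp_html)
    finder.close()
    return finder.roots


# Simplified, illustrative profile fragment with two flagged terms.
SAMPLE_PROFILE = """
<dl class="profile">
  <dt id="vcard">vcard</dt>
  <dd class="xmdp-root">A container for the rest of the class names.</dd>
  <dt id="fn">fn</dt>
  <dd>The formatted name.</dd>
  <dt id="adr">adr</dt>
  <dd class="xmdp-root">A postal address.</dd>
</dl>
"""

print(find_root_class_names(SAMPLE_PROFILE))
```

Because the flag lives in the profile, it handles several root class names per XMDP without requiring anything extra from publishers.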

rejected solutions

first class name defined in a profile

One simple thought would be that the first class name defined in a profile (e.g. hcard-profile) is the root class for that microformat.

Critical problem(s):

  • Does not handle the case of multiple root class names in an XMDP. E.g. a microformat that defines multiple possible root class names (e.g. hCalendar permits "vcalendar" or "vevent", hAtom permits "hfeed" or "hentry").

publisher linking to root class name

The author including a reference to the XMDP could link directly to the root class name.

<!-- This profile link indicates that "vcard" is a root class name. -->
<head profile="http://www.w3.org/2006/03/hcard#vcard">

Critical problem(s):

  • The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every reference to the XMDP, whereas the XMDP itself should be defining what is a root class.

publisher inline additional class name

Another possibility that may be worth exploring is the ability to indicate inline in the code that a class name is the root class name for a microformat, rather than (or perhaps in addition to) in the XMDP.

E.g.

<span class="vcard ufroot">
 <span class="fn">Tantek Çelik</span>
</span>

would indicate that the element with classname of "vcard" is the root of a microformatted piece of information.

Critical problem(s):

  • The problem is this moves the information of what is the root class to perhaps one of the worst places, which is in every instance of the microformat, whereas the XMDP itself should be defining what is a root class.

Possible drawbacks:

  • How would you know which class name (other than "ufroot") was the root class name? e.g.
    class="vcard person ufroot"
    • perhaps by only looking at classes defined in the XMDPs for the document.
    • perhaps by only allowing one root class name in addition to the "ufroot"
    • or perhaps by saying that all of the other class names in the same attribute are root class names (so that, for example, you could say:
      <span class="ufroot hreview hentry">)

This is also very similar to, but not the same as, the mfo problem, and should be considered in that context as an independent solution.

linking to the XMDP

As hinted in the note on "when microformats may be in use", there are additional methods under discussion for linking to the XMDP in addition to the current method of using the profile attribute of the head element:

  • Using <link rel="profile" href="link to XMDP"/>. This method can be used now and will be formalized in XHTML 2.
    • A problem with this method is that it (still) requires access to the head element.
  • Using <a rel="profile" href="link to XMDP">powered by microformat xyz</a> in the body of the document.
    • As noted by a number of people, this approach has the added benefit of creating a viral marketing opportunity for the microformats used. For instance, developers could add badges saying they are using microformat xyz as suggested by the example.
    • Blog authoring environments allow you to insert links at will, so this squarely obviates the need to access the head element.
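A consumer that wanted to honor all three linking methods could gather profile URIs along these lines. This is a minimal sketch using Python's standard `html.parser`; the class name `ProfileLinkCollector`, the helper `collect_profiles`, and the `example.org` URL are made up for illustration.

```python
from html.parser import HTMLParser


class ProfileLinkCollector(HTMLParser):
    """Collect profile URIs from head@profile, link rel=profile, and a rel=profile."""

    def __init__(self):
        super().__init__()
        self.profiles = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head" and attrs.get("profile"):
            # The profile attribute is a space-separated list of URIs.
            self.profiles.extend(attrs["profile"].split())
        elif tag in ("link", "a"):
            rels = (attrs.get("rel") or "").lower().split()
            if "profile" in rels and attrs.get("href"):
                self.profiles.append(attrs["href"])


def collect_profiles(html):
    collector = ProfileLinkCollector()
    collector.feed(html)
    collector.close()
    return collector.profiles


page = """<html><head profile="http://www.w3.org/2006/03/hcard">
<link rel="profile" href="http://gmpg.org/xfn/11"/></head>
<body><a rel="profile" href="http://example.org/xyz-profile">powered by microformat xyz</a></body></html>"""
found = collect_profiles(page)
```

Treating the three sources uniformly keeps the rest of a parser independent of where the publisher was able to place the link.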

includes / aggregate profiles

Methods for including one or more values, properties, or an entire XMDP into another XMDP, as a way of creating an aggregate profile that effectively contains definitions from multiple profiles, would be quite useful. They would enable documents with microformats to refer to a single profile URL rather than a complete space-separated set of all the profile URLs of the microformats that may be in use.
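No include syntax has been proposed here, so the following only sketches the resolution step a parser would need once one exists. It assumes profiles have already been fetched into a dictionary with hypothetical "terms" and "includes" keys; the `example.org` URLs are placeholders.

```python
def resolve_profile(url, profiles, _seen=None):
    """Return every term defined by a profile, following includes and skipping cycles."""
    seen = set() if _seen is None else _seen
    if url in seen or url not in profiles:
        return set()
    seen.add(url)
    entry = profiles[url]
    terms = set(entry.get("terms", ()))
    for included in entry.get("includes", ()):
        terms |= resolve_profile(included, profiles, seen)
    return terms


# Hypothetical, already-fetched profiles keyed by URL.
profiles = {
    "http://example.org/aggregate": {"includes": ["http://example.org/a", "http://example.org/b"]},
    "http://example.org/a": {"terms": ["vcard", "fn"]},
    "http://example.org/b": {"terms": ["vevent", "summary"], "includes": ["http://example.org/aggregate"]},
}
aggregate_terms = resolve_profile("http://example.org/aggregate", profiles)
```

Tracking visited URLs matters because nothing would stop two aggregate profiles from including each other.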

vocabulary aliasing

An XMDP document could be used to define a microformat profile that is nothing more than a simple dictionary mapping between an existing, non-standard set of HTML classes and the terms in a standard microformat profile. This would allow a publisher to support a given microformat by merely using the URI of a new profile document as the value of an individual document's head/profile attribute, rather than modifying the individual class values throughout each document to conform to an existing profile. Initial suggestion with use case description in this microformats-discuss post. Note (from Kevin's response) that HTML class attributes can contain multiple values, e.g. class="post hentry", so a publisher doesn't have to discard their existing class values to use those of a microformat.
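On the parsing side, such a dictionary profile might be applied as in this rough sketch. The mapping format and the name `effective_classes` are assumptions for illustration; the "post" to "hentry" pair mirrors the class="post hentry" example above.

```python
def effective_classes(class_attr, aliases):
    """Expand a class attribute with the standard terms its local names are aliased to."""
    result = []
    for name in class_attr.split():
        # Keep the publisher's own name, then add the standard term it maps to, if any.
        for candidate in (name, aliases.get(name)):
            if candidate and candidate not in result:
                result.append(candidate)
    return result


# Hypothetical alias profile: existing site class -> standard microformat class.
aliases = {"post": "hentry", "post-title": "entry-title"}
expanded = effective_classes("post featured", aliases)
```

The parser would then treat the element as if the publisher had written both class names, without the publisher touching existing markup.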

subclassing / ontology addition

One may want to introduce a new property (or value) and base it on an existing property (or value). In this sample XMDP, the value "self" is defined, based on the value "me" from XFN 1.1:


<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
   <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>

There are two interesting additions: a URL with a fragment pointing into another XMDP profile, and a rev attribute. The rev value in this example is 'extends'. This means that the value the link refers to is extended by the value "self". So you could make an XMDP that lists all the possible rev values ('extends', 'inverse', 'equivalent', etc.), and then 'alias' one microformat property to another.

A universal XMDP validator/parser/etc could extract data across two or more XMDP profiles and potentially reason between them. This could create a small ontology.
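As a first step toward such a tool, the relationships could be pulled out of an XMDP roughly as follows. This is a sketch under the assumption that each relationship is expressed as an a element with a rev attribute inside a dt that carries an id, as in the sample above; the class name `RevLinkExtractor` is hypothetical.

```python
from html.parser import HTMLParser


class RevLinkExtractor(HTMLParser):
    """Collect (term id, rev value, target URI) triples from an XMDP definition list."""

    def __init__(self):
        super().__init__()
        self.current_id = None
        self.relations = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dt":
            self.current_id = attrs.get("id")
        elif tag == "a" and self.current_id and attrs.get("rev") and attrs.get("href"):
            # rev may hold several space-separated values.
            for rev in attrs["rev"].split():
                self.relations.append((self.current_id, rev, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "dt":
            self.current_id = None


xmdp = """<dl class="rel">
  <dt id='self'><a href="http://www.gmpg.org/xfn/11#me" rev="extends">self</a></dt>
   <dd>This is a pointer to me, it extends the "me" value of XFN</dd>
</dl>"""
extractor = RevLinkExtractor()
extractor.feed(xmdp)
extractor.close()
relations = extractor.relations
```

The resulting triples are the raw material for any reasoning across profiles; what "extends" should entail for a parser is left open here, as it is in the text.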

It is not clear if this idea actually has utility or is simply a solution looking for a problem.

XMDP XML Schema

The link shows a bad example of creating XMDP from an XSD schema. The big question, I guess, is why? Having XMDP defined in XSD should make it easier for machines to read Microformats: rules and strict data typing would allow Microformats to be validated when contained within an XML/XHTML document. If a document is using microformats with an XSD behind them, simple XPath queries can be used to harvest the information, which can then be rendered to straight XML for translation to RDF or other XML transport formats.

XSD behind XMDP also has distinct advantages for CMS authors: the XSD sitting behind the xforms or sxforms that allow data entry into a CMS can be used to generate XMDP and valid Microformats when rendering content. This in theory should make it easier for CMS authors to develop a semantic core around data before exporting to XHTML + Microformats, RDF etc., and/or make data querying via web services a little more straightforward.

Follow up

Having looked into Microformats a little more I realise how bad that example is; however I still feel that placing a schema behind XMDP is a worthwhile exercise. I don't mind spending a little time on this if anyone feels it's a worthwhile exercise, but I'd propose the following:

  • Define a loose set of microformat conventions (i.e. a meta property will be bound to an attribute etc.), and have these defined in a microformat namespace (mf:?).
  • Create an XSD for common microformat fields without structures (dtStart etc.), with XSD typing and mf: rules (i.e. mf:optional-html-attribute-binding="title" or mf:html-attribute-binding="href"; names were never my strong point)
  • Start working towards creating XSD schema including the common schema for agreed specifications

There would still need to be some form of link between the XMDP and the defining XSD (profile attribute or link element?). With these in place it should be possible for an application like tails, or new apps to pick up on any Microformat in a page and display the data, without the application having to be aware of the specific Microformat standard.

Microformats are cool, especially the fact that you don't have to be a rocket scientist to start using them. However if there can be a way of interleaving grassroots microformat adoption into the more complex semantic forms (RDF etc.), through XML then that's got to be a bonus?

more here

ID Attribute

  • A problem that I've had using XMDP is that it requires the use of the ID attribute (e.g. <dt id="foo">foo</dt>) to define the term "foo". As (X)HTML only allows one element with any given ID, this raises problems if you need to define the same term multiple times -- e.g. to define "category" as a class within both hcard and hcalendar, or to define "copyright" as both a class value and a rel value. TobyInk 06:26, 18 Feb 2008 (PST)
    • Two things. First, "category" MUST NOT be different between hCard and hCalendar, and thus it is a feature, not a problem, that there can only be one id="category" between the two of them. Second, for the rel case, this is solved by using ID values prefixed with "rel-" for rel values. E.g. in http://gmpg.org/xmdp/1, rel-profile is defined with id="rel-profile", and the class name "profile" is defined with id="profile". Tantek 17:48, 4 October 2009 (UTC)

automatic parsability enabling

The current XMDP is useful for people to read and learn about a microformat, but of very limited utility to automate parsing microformats/poshformats (simply identification of vocabulary to parse for, and what attributes to parse for them). It would be nice if people could design their own poshformats, create an XMDP profile, and for the poshformat to be thus instantly parsable by machines. Here is the information that I think would need to be added to XMDP for this to be possible:

For each profile defined:

  • What is/are the root class name(s) (as previously brainstormed above: root class name identification) of the microformats being defined by the XMDP (required)
  • What are the properties of each microformat? Or alternatively (and preferably), which microformat(s) may a property be used with? (to handle the common and encouraged case of vocabulary re-use across microformats) (required)

For each property defined:

  • A human-readable description of what the property means (XMDP already has this)
  • Is it a class/rel/id (or rev, but deprecated) value (XMDP already has this)
  • Is it singular or plural? (default: plural)
  • What datatype is it? (e.g. text, URI, email, datetime, duration. default:text)
  • Might it contain a nested poshformat/microformat? If so, then this profile should link to the profile of the nested poshformat /microformat. (Multiple formats could be defined in the same XMDP profile, using ID attributes to link from one to the other.)
  • What nested subproperties might be found within it? Or alternatively (and preferably), whether a property is actually a subproperty, and if so, which properties may it be used inside? (again, to handle the common and encouraged case of vocabulary re-use) (Perhaps this could be indicated using a nested profile.)
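The information listed above could be modeled, very roughly, as the following data structures. This is a sketch and not a proposed XMDP syntax: the names `PropertyDef`, `ProfileDef`, and `shape_values` are hypothetical, and keeping only the first value for a singular property is an assumption made here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PropertyDef:
    name: str
    description: str = ""                  # human-readable meaning (XMDP already has this)
    attribute: str = "class"               # class, rel, id (or rev)
    plural: bool = True                    # default: plural
    datatype: str = "text"                 # default: text
    used_in: List[str] = field(default_factory=list)  # microformats or parent properties
    nested_profile: Optional[str] = None   # link to the profile of a nested format


@dataclass
class ProfileDef:
    roots: List[str]
    properties: Dict[str, PropertyDef]


def shape_values(prop, values):
    """Apply the singular/plural rule to the raw values found for a property."""
    if prop.plural:
        return list(values)
    # Assumption: a singular property keeps only the first value found.
    return values[0] if values else None


hcard = ProfileDef(
    roots=["vcard"],
    properties={
        "fn": PropertyDef("fn", plural=False, used_in=["vcard"]),
        "category": PropertyDef("category", used_in=["vcard", "vevent"]),
    },
)
fn_value = shape_values(hcard.properties["fn"], ["Tantek Çelik", "ignored"])
```

Recording `used_in` on the property, rather than a property list on each microformat, follows the preference stated above for handling vocabulary re-use across microformats.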

We must expect that there will always be some parsing rules (e.g. hAtom's "hunt the author" game) which will not be expressible in a machine readable profile format, but it may be possible to cover 90% of the information a parser should need for most microformats.

Indeed, experience has shown that any "real world" semantic markup language that gets significant use requires LOTS of special custom parsing rules (e.g. HTML is not fully parseable simply from the DTD, nor is RSS from the RSS DTD).

Thus while it may make sense to take incremental steps towards capturing more about a microformat in XMDP, full enabling of machine parsability should not be a short-term (nor even medium-term) goal, as others have tried (DTD, RelaxNG, XML Schema) and failed to achieve this.

See Also