xmdp-brainstorming

* Not every microformat has a container element.  Consider [[reltag]], one of the most widely used microformats.
* To some extent, using microformats adds to the cost of writing the document.  It's like filling in a form just to write your thoughts.  Putting <a> elements with each microformat adds unwanted links on top of that.

Revision as of 21:53, 13 July 2005

XMDP Brainstorming

This wiki page offers a location to brainstorm methods for discovering microformats.

Authors

Bud Gibson

Add your name here if you make significant contributions to this page and wish to take responsibility for them.

Introduction

Tantek Çelik has developed the <a href="http://gmpg.org/xmdp/" title="XHTML Meta-data Profile">XMDP</a> to describe the allowed class attribute values for microformats. A link to a microformat's XMDP in the profile attribute of the head element indicates that the microformat may be used in the document. A parser could read the allowed attribute values from the linked XMDP and treat their presence in the document as evidence that the microformat is in use.
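
The presence check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class list is hard-coded as a stand-in for values that a real parser would read from the linked XMDP, and the names ClassCollector and profile_classes_in_use are hypothetical. The standard library's html.parser is used because it tolerates markup that is not well-formed.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every class name used anywhere in an HTML document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.seen.update(value.split())


def profile_classes_in_use(html_text, profile_classes):
    """Return the subset of profile_classes that occurs in the document."""
    collector = ClassCollector()
    collector.feed(html_text)
    collector.close()
    return collector.seen & set(profile_classes)


# Assumption: an illustrative class list standing in for values
# that would be read from a linked XMDP.
HCARD_CLASSES = {"vcard", "fn", "org", "tel"}

page = '<div class="vcard"><span class="fn">Bud Gibson</span><p>unclosed'
print(sorted(profile_classes_in_use(page, HCARD_CLASSES)))
```

A non-empty result only suggests that the microformat may be in use. Ordinary class names that happen to match a profile value would produce false positives.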

There are clearly issues with this approach:

Feel free to add issues here. Keep issues in this list in summary form. Save lengthy discussion and potential solutions for elaboration below.

Addressing issues

These are in no particular order, but an issue should appear in the issues list above if it is addressed here.

Linking to the XMDP

At least two methods are under discussion for linking to the XMDP, in addition to the current method of using the profile attribute of the head element:

None of these linking solutions addresses the issue of where exactly in the document the microformat is being used. They indicate only that the microformat may be in use.

Resolving when microformats are actually in use

One solution to this issue is simply to include the <a rel="profile" href="link to XMDP">powered by microformat xyz</a> within the container element for the microformat. The XMDP spec could then specify that when the <a> element is used in this way, it indicates that the microformat is used by the element containing the <a> element.
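
A parser following this proposal would need to find each profile link and report the element that contains it. The sketch below shows one way to do that with Python's html.parser. The class name ProfileLinkFinder, the example URL, and the choice to treat the immediate parent as the container are all assumptions made for illustration. The proposal itself does not say how the container should be resolved.

```python
from html.parser import HTMLParser

# HTML elements that never have an end tag and so never contain anything.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class ProfileLinkFinder(HTMLParser):
    """Record the parent element of every <a rel="profile"> link."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open elements as (tag, attrs dict)
        self.found = []  # (profile href, container tag, container attrs)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rels = (attrs.get("rel") or "").split()
        if tag == "a" and "profile" in rels and self.stack:
            parent_tag, parent_attrs = self.stack[-1]
            self.found.append((attrs.get("href"), parent_tag, parent_attrs))
        if tag not in VOID:
            self.stack.append((tag, attrs))

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break


# Hypothetical document and profile URL.
doc = ('<div class="vcard"><span class="fn">Bud Gibson</span> '
       '<a rel="profile" href="http://example.org/hcard-profile">'
       'powered by hCard</a></div>')

finder = ProfileLinkFinder()
finder.feed(doc)
finder.close()
print(finder.found)
```

A profile link with no enclosing element is silently skipped here, which is one place where a real specification would have to make a decision.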

There are, however, several clear issues with this proposal:

Parsing microformats

Parsing user-generated content is challenging. Frequently, it does not validate and may not even be well-formed. Microformat discovery mechanisms that depend on documents having even minimal XML properties, such as well-formedness, will therefore often fail. This is true, in particular, of <a href="http://suda.co.uk/projects/X2V/">Brian Suda's frequently cited X2V hCard and hCalendar discovery and transformation prototypes</a>, which use XSLT.

However, most microformats are agnostic about details such as the exact element type used, which typically pushes the developer toward tools like XPath that assume well-formedness. Mark Pilgrim's <a href="http://sourceforge.net/projects/feedparser/">Universal Feed Parser</a> suggests that it may be possible to sanitize user HTML to the point where it is suitable for later processing as XML.
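
As a rough illustration of the sanitizing idea, the sketch below rebuilds possibly malformed HTML into a well-formed tree using only the Python standard library, then queries it with ElementTree's limited XPath support. It does not reproduce how the Universal Feed Parser or X2V work. The TreeRepairer class, the synthetic root wrapper, and the sample markup are assumptions made for this example, and the repair rules are far simpler than real-world content would need.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML elements that never have an end tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class TreeRepairer(HTMLParser):
    """Build a well-formed element tree from possibly malformed HTML."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("root")  # synthetic wrapper (an assumption)
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        attrib = {k: (v if v is not None else "") for k, v in attrs}
        element = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close every element opened since the matching start tag;
        # stray end tags with no match are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text after a child belongs to that child's tail
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


# Unquoted attribute, unclosed <p>, and a bare <br>: not well-formed XML.
messy = ('<div class=vcard><span class="fn">Bud Gibson</span>'
         '<br><p>unclosed paragraph</div>')

repairer = TreeRepairer()
repairer.feed(messy)
repairer.close()

xml_text = ET.tostring(repairer.root, encoding="unicode")
tree = ET.fromstring(xml_text)  # succeeds only if the output is well-formed

# ElementTree's XPath subset can select on attribute presence;
# the space-separated class test is done in Python.
names = [el.text for el in tree.findall(".//*[@class]")
         if "fn" in el.get("class", "").split()]
print(names)
```

Element or attribute names that are not legal in XML, and control characters in text, are not handled here, so this sketch would still fail on some real pages.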

From a pragmatic developer perspective, parsing web pages to discover microformats is likely to require considerable work.
