[uf-discuss] Microformats are being discussed on Slashdot

Sören Nils 'chucker' Kuklau chucker23n at gmail.com
Tue Jul 11 21:57:52 PDT 2006


On 7/12/06, Tantek Çelik <tantek at cs.stanford.edu> wrote:
> Sounds like a good source for some easily answered FAQs.
>
> Who wants to help dissect, itemize, and answer?

As was pointed out in the IRC room a few hours ago (mostly by
ryanking, IIRC), some of the example XPath expressions in this article
are highly problematic.

For example, this:
> //div[@class='vevent']

is bad in three ways:
1) it always starts at the document root, not at the current context
node
2) it only checks for div elements -- vevent classes can appear in
different elements, too
3) it doesn't match class attributes that, in addition to vevent, have
other values.

We'll address these one by one:

1) Easy enough. Instead of //, use .//, so you get
.//div[@class='vevent'] (interestingly, the author does this later on
in some parts of the article).
2) Simply use * instead of div. .//*[@class='vevent'] will already
match many more kinds of vevent instances.
3) This one's more tricky. The author uses the contains() function
later on, e.g.: .//*[contains(@class,'vevent')] -- a big improvement.
But this would also match, say, class="somestringwithveventinit". You
don't really want that; you want 'vevent' to be separated from
everything else by spaces.
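
To make the difference between those two predicates concrete, here is
a rough Python sketch. The helper names and sample class values are my
own stand-ins, not anything from the article; each function only
mimics what the corresponding XPath predicate does to the class
string.

```python
# Python stand-ins for two XPath predicates (helper names are hypothetical).

def exact_match(class_attr):
    # Mimics [@class='vevent']: the whole attribute must equal 'vevent'.
    return class_attr == "vevent"

def substring_match(class_attr):
    # Mimics [contains(@class,'vevent')]: any substring hit counts.
    return "vevent" in class_attr

samples = ["vevent", "vevent vcalendar", "somestringwithveventinit"]
for value in samples:
    print(repr(value), exact_match(value), substring_match(value))
```

The exact comparison misses "vevent vcalendar", while the substring
test wrongly accepts "somestringwithveventinit".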

A further improvement is .//*[matches(@class,'\\bvevent\\b')], which,
to my embarrassment, I used, but ryanking pointed out to me that this
isn't correct either: it will match, for example,
class="somestring-vevent-anotherstring".
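
The word-boundary problem is easy to reproduce. The sketch below uses
Python's re module purely as a stand-in; the regex flavour behind
XPath's matches() may differ in details, so treat this as an
illustration of the \b behaviour rather than of any XPath engine.

```python
import re

# \b sits between a word character and a non-word character, and a
# hyphen is a non-word character, so it counts as a boundary.
WORD_BOUNDARY = re.compile(r"\bvevent\b")

def boundary_match(class_attr):
    return WORD_BOUNDARY.search(class_attr) is not None

for value in ["vevent",
              "somestringwithveventinit",
              "somestring-vevent-anotherstring"]:
    print(repr(value), boundary_match(value))
```

The second value is correctly rejected, but the hyphenated one still
matches, which is exactly the false positive ryanking described.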

Brian Suda's solution in X2V, on the other hand, is ideal. As you can
see in his XSLT file at
<http://suda.co.uk/projects/X2V/xhtml2vcal.xsl>, he uses XPath
statements such as:

.//*[contains(concat(' ',normalize-space(@class),' '),' vevent ')]

While a little long, this is, until proven otherwise, the "proper" way
to match a class name.

To summarize, we'll go from:

//div[@class='vevent']

to:

.//*[contains(concat(' ',normalize-space(@class),' '),' vevent ')]

This ensures that *any* element, *anywhere* below the context node,
will be returned if *any* space-separated value of its class attribute
is 'vevent'.
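
For anyone who wants to try the logic without an XSLT processor, here
is a minimal sketch in Python. Python's bundled ElementTree doesn't
implement contains(), concat() or normalize-space(), so the predicate
is emulated by a helper (has_class_token is my own hypothetical name,
and the sample markup is made up); only the './/*' step is real
ElementTree path syntax.

```python
import re
import xml.etree.ElementTree as ET

def has_class_token(class_attr, token):
    # Mimics contains(concat(' ', normalize-space(@class), ' '), ' token ').
    # normalize-space() trims and collapses space, tab, CR and LF.
    parts = [p for p in re.split(r"[ \t\r\n]+", class_attr) if p]
    normalized = " ".join(parts)
    return f" {token} " in f" {normalized} "

# Made-up sample markup for illustration only.
SAMPLE = """
<body>
  <div class="vevent">a</div>
  <span class="  vcalendar   vevent ">b</span>
  <p class="somestringwithveventinit">c</p>
  <li class="somestring-vevent-anotherstring">d</li>
  <div>e</div>
</body>
""".strip()

context = ET.fromstring(SAMPLE)
# './/*' visits every descendant element of the context node.
matches = [el.tag for el in context.findall(".//*")
           if has_class_token(el.get("class", ""), "vevent")]
print(matches)  # prints ['div', 'span']
```

Padding both the class value and the token with spaces is what turns a
plain substring test into a whole-token test.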

Hope this is a useful contribution. :-)

-- 
Sören Kuklau

