parsing-microformats: Difference between revisions
m (Reverted edit of Us1Iiq, changed back to last version by RyanKing)
Revision as of 16:55, 23 June 2007
Parsing Microformats
Microformat parsers that depend on documents having even minimal XML properties, such as well-formedness, may fail when consuming non-well-formed content. Tidy or, even better, CyberNeko may be a useful workaround. In particular, X2V uses XSLT, and runs Tidy to clean any non-well-formed input before processing it.
Parsing class values
When parsing class values care must be taken:
- Class attributes may contain multiple class names, e.g.:
class="foo vcard bar"
- Class attributes may contain class names which merely contain the class name used by a microformat, and which must not be treated as matches, e.g.:
class="foovcardbar"
class="foovcard"
class="vcardbar"
- Multiple class names are separated by one or more whitespace characters.
- Class names are case sensitive.
See http://www.w3.org/TR/html401/struct/global.html#h-7.5.2.
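The rules above can be sketched as a small test on the raw attribute value. This is only an illustration, not code from any particular parser: the function name hasClassName is hypothetical, and it assumes "whitespace" means space, tab, line feed, form feed and carriage return.

```javascript
// Hypothetical helper (not part of any library): returns true if the
// class attribute value `classValue` contains `name` as a whole class name.
function hasClassName(classValue, name) {
  if (!classValue) {
    return false; // missing or empty class attribute
  }
  // Split on runs of whitespace (assumed here: space, tab, LF, FF, CR).
  // filter(Boolean) drops the empty strings produced by leading or
  // trailing whitespace.
  var tokens = classValue.split(/[ \t\n\f\r]+/).filter(Boolean);
  // Exact, case-sensitive comparison, so "foovcard" and "VCard"
  // do not match "vcard".
  return tokens.indexOf(name) !== -1;
}
```

For example, hasClassName("foo vcard bar", "vcard") is true, while hasClassName("foovcardbar", "vcard") and hasClassName("VCard", "vcard") are both false.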
JavaScript example
The Ultimate getElementsByClassName JavaScript function may be useful. Then you can do:
var adrs = document.getElementsByClassName(document, "*", "adr");
or even:
var cities = document.getElementsByClassName(document, "*", "locality");
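If that helper is not available, a similar lookup can be sketched with standard DOM calls only. This is a rough illustration, not the function referenced above: the name elementsWithClass is hypothetical, and it assumes root supports getElementsByTagName and that its elements support getAttribute.

```javascript
// Hypothetical helper (not the function referenced above): collects all
// descendant elements of `root` whose class attribute contains `name`
// as a whole, case-sensitive class name.
function elementsWithClass(root, name) {
  var all = root.getElementsByTagName("*"); // every descendant element
  var matches = [];
  for (var i = 0; i < all.length; i++) {
    // getAttribute returns null when the attribute is absent.
    var classValue = all[i].getAttribute("class") || "";
    // Split on whitespace runs and drop empty strings.
    var tokens = classValue.split(/\s+/).filter(Boolean);
    if (tokens.indexOf(name) !== -1) {
      matches.push(all[i]);
    }
  }
  return matches;
}
```

In a browser this could be called as elementsWithClass(document, "adr").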
XSLT example
<xsl:if test="contains(
concat(' ', normalize-space(@class), ' '),
' vcard '
)" > ...
See also the xpath generator, which helps you generate those long, ugly XPath queries. [link broken as of 8 August 2006]
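Since the generator link is broken, here is a minimal sketch of how such queries could be built as strings. It is not the linked tool: the function names classNameTest and elementsWithClassXPath are hypothetical, and it uses the standard XPath 1.0 functions contains, concat and normalize-space. Class names containing whitespace or quotes are rejected rather than escaped.

```javascript
// Hypothetical helper: builds an XPath 1.0 boolean expression that is
// true when @class contains `name` as a whole class name.
function classNameTest(name) {
  // Reject names that would break the quoted string literal or that
  // could never be a single class name.
  if (!/^[^\s'"]+$/.test(name)) {
    throw new Error("unsupported class name: " + name);
  }
  // Pad both the normalized attribute value and the name with spaces,
  // so that "foovcard" does not match "vcard".
  return "contains(concat(' ', normalize-space(@class), ' '), ' " + name + " ')";
}

// Hypothetical helper: XPath selecting every element with that class name.
function elementsWithClassXPath(name) {
  return "//*[" + classNameTest(name) + "]";
}
```

The string returned by classNameTest("vcard") can be placed in the test attribute of an xsl:if, and elementsWithClassXPath("adr") yields a query for any XPath 1.0 engine.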
Parsing rel/rev values
Parsing rel and rev values is similar to parsing class values except for the following differences:
- rel and rev values should be separated by one space.
- rel and rev values are case insensitive.
See http://www.w3.org/TR/html401/types.html#type-links.
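The differences above can be sketched as a case-insensitive variant of the class name test. This is only an illustration under stated assumptions: the function name hasLinkType is hypothetical, and it is deliberately lenient, accepting any run of whitespace between values even though authors should use a single space.

```javascript
// Hypothetical helper (not part of any library): returns true if the
// rel or rev attribute value `relValue` contains the link type `type`.
function hasLinkType(relValue, type) {
  if (!relValue || !type) {
    return false; // missing attribute or empty link type
  }
  // Link types are compared case-insensitively, so lowercase both sides.
  var wanted = type.toLowerCase();
  // Lenient split on any whitespace run; drop empty strings.
  var tokens = relValue.toLowerCase().split(/\s+/).filter(Boolean);
  return tokens.indexOf(wanted) !== -1;
}
```

For example, hasLinkType("Tag nofollow", "tag") is true, while hasLinkType("tagged", "tag") is false.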