[microformats-discuss] funness -> validator
Bud Gibson
bud at thecommunityengine.com
Tue Aug 16 13:40:15 PDT 2005
On Aug 16, 2005, at 15:12, Tantek Çelik wrote:
> The longer answer is yes, that we have XMDP which at least defines the
> vocabulary of a microformat, and the remaining constraints are (MUST BE)
> defined in the specifications. As Brian noted, he's working on a generic
> XMDP validator, but any validator for a particular format will need to have
> hand coded rules for the specific format (just like *every* other format
> validator out there, e.g. the HTML, CSS, RSS, Atom validators etc.).

I've actually created a protean validator for xFolk using JavaScript
and find it quite useful. As Tantek observes, you have to hand-code
rules as there is no schema language (not a knock, just an
observation). I think the need to hand-code may actually be a
benefit. It forces you into the realm of real-world coding that
implementers will face.
My "validator" goes through and colors patches identified as xFolk
entries and then their component parts, a different color for each
part. What I have found useful while using the validator is not so
much that it identifies "valid" xFolk as that it shows me how my
particular rendition of xFolk on a page will be perceived by
parsers. To say the least, that is eye-opening, and I would suggest
developing such a validator as a general strategy for people writing
microformats. It's not hard, and it is a check on how coherent your
specification really is.
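A minimal sketch of such a colouring pass, not the validator described above. It assumes the xFolk class names `xfolkentry`, `taggedlink`, and `description` plus `rel="tag"` links; the colour table and the function names `hasToken`, `classify`, and `highlightXFolk` are hypothetical. Parts are only coloured when they sit inside an entry, which is roughly how a parser would see them.

```javascript
// Hypothetical colour scheme: one colour per xFolk part.
var COLORS = {
  entry: "#ffffcc",
  taggedlink: "#ccffcc",
  tag: "#ccccff",
  description: "#ffcccc"
};

// True if a space-separated attribute value contains the given token.
function hasToken(value, token) {
  return typeof value === "string" && value.split(/\s+/).indexOf(token) !== -1;
}

// Decide which xFolk part (if any) an element represents.
// Class names are assumptions taken from the xFolk draft.
function classify(node, insideEntry) {
  if (hasToken(node.className, "xfolkentry")) return "entry";
  if (!insideEntry) return null; // parts only count inside an entry
  if (hasToken(node.className, "taggedlink")) return "taggedlink";
  if (hasToken(node.rel, "tag")) return "tag";
  if (hasToken(node.className, "description")) return "description";
  return null;
}

// Walk the tree, colour every recognised part, and return a tally.
function highlightXFolk(root) {
  var counts = { entry: 0, taggedlink: 0, tag: 0, description: 0 };
  (function walk(node, insideEntry) {
    if (node.nodeType === 1) { // 1 = element node
      var part = classify(node, insideEntry);
      if (part) {
        node.style.backgroundColor = COLORS[part];
        counts[part]++;
        if (part === "entry") insideEntry = true;
      }
    }
    var kids = node.childNodes || [];
    for (var i = 0; i < kids.length; i++) walk(kids[i], insideEntry);
  })(root, false);
  return counts;
}
```

In a browser this would be called as `highlightXFolk(document.body)`; the returned tally gives a quick check of how many entries and parts were actually recognised.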
It is impossible to overstate how useful a simple visualization of
the microformat and its component parts in the wild can be,
particularly when you have user-generated data swimming into the mix.
One of the things I have found in developing things for xFolk (most
of which is not currently public) is that DOM-based methods work
well. There are three, listed from least to best supported:
1. CSS-selectors: not well supported in *programming* tools with
the exception of the behavior.js library.
2. XPath: Good server-side support, but problematic with pages that
are not well-formed XML. Good support in Firefox for HTML even when
not well formed.
3. DOM level 1: Great support in browsers. Also nice because
JavaScript allows you to mix in regular expressions in your
selectors. This is where I am focusing right now.
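To illustrate the third option, here is a minimal sketch of a DOM level 1 traversal that selects elements by matching a regular expression against their class names. The function name `findByClass` is hypothetical, and the walk relies only on `nodeType`, `childNodes`, and `className`.

```javascript
// Collect every element under `root` that has a class token matching `pattern`.
// Pass a regular expression without the `g` flag, since `test` on a global
// regex keeps state between calls.
function findByClass(root, pattern) {
  var found = [];
  (function walk(node) {
    if (node.nodeType === 1) { // 1 = element node
      var cls = typeof node.className === "string" ? node.className : "";
      var tokens = cls.split(/\s+/);
      for (var i = 0; i < tokens.length; i++) {
        if (pattern.test(tokens[i])) {
          found.push(node);
          break; // one matching token is enough
        }
      }
    }
    var kids = node.childNodes || [];
    for (var j = 0; j < kids.length; j++) walk(kids[j]);
  })(root);
  return found;
}
```

For example, `findByClass(document.body, /^xfolkentry$/i)` would return the candidate entries in document order, and a looser pattern such as `/^xfolk/i` would also catch misspelled or variant class names worth flagging.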
My final observation in this observation omnibus is that I have come
to the conclusion that, as much as possible, you should attempt to
preserve the tree-structured nature of microformats when harvesting,
storing, and republishing them. Some programmers may find value in
deserializing microformatted content into some sort of data structure
and then reserializing on output, but that just seems to add
complexity to me and may run counter to the extremely flexible nature
of these things. This last is just an off-the-cuff observation. YMMV.
Bud