[microformats-discuss] Microformat validation

Bud Gibson bud at thecommunityengine.com
Sun Aug 21 05:23:30 PDT 2005

I have to confess that most of this flew over my head.  What I think  
you are saying is something like this.

"While microformats are not optimized for machine validation out of  
the box, we might be able to get around that by associating them with  
some parallel ontology constructions that are more amenable to  
automated processing.  Then, by analogy with these parallel  
constructions, we could determine whether the page author was using  
microformat components consistent with their semantics."

Sounds like a worthy goal, but I wonder if you might not achieve a  
similar end by dropping the middleman (RDF/OWL?) and writing the XSLT  
rules (or some imperative equivalent) directly.  It would be a bit  
more ad hoc, but, as you mention, we don't have the RDF for most  
microformats and the machinery for the processing is not there yet.
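
As a rough illustration of the "imperative equivalent", here is a minimal
Python sketch that checks one rule directly against the markup, with no
RDF step in between. The rule itself (every element with class "vcard"
must contain a descendant with class "fn") is an assumption picked for
illustration, as are the function names and the sample markup. It also
assumes the page parses as well-formed XML, which real pages using named
entities may not.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: the second card has no "fn" element.
SAMPLE = """<div>
  <div class="vcard"><span class="fn">Example Person</span></div>
  <div class="vcard"><span class="org">Example Org</span></div>
</div>"""


def class_tokens(element):
    """Return the whitespace-separated class names of an element as a set."""
    return set(element.get("class", "").split())


def find_hcards_missing_fn(xhtml_text):
    """Return every "vcard" element that has no descendant with class "fn".

    Assumed rule, hard-coded for illustration; a real validator would
    carry one such check per rule of each microformat.
    """
    root = ET.fromstring(xhtml_text)
    problems = []
    for element in root.iter():
        if "vcard" not in class_tokens(element):
            continue
        # element.iter() yields the element itself first, so skip it.
        has_fn = any(
            "fn" in class_tokens(descendant)
            for descendant in element.iter()
            if descendant is not element
        )
        if not has_fn:
            problems.append(element)
    return problems
```

Calling find_hcards_missing_fn(SAMPLE) returns the list of offending
elements, so an empty list means the assumed rule holds for the page.
Each additional rule would need its own function, which is the ad hoc
cost mentioned above.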

On Aug 20, 2005, at 8:35, Danny Ayers wrote:

> Hi,
> Such a busy list, it's easy to miss gems. One I did spot is the idea
> of a validator for microformats ([1] and elsewhere). This is an
> excellent idea, IMHO. Some thoughts  -
> A precedent for this being worthwhile is the Feed Validator [2], which
> has not only provided a handy check for Atom/RSS data (and for
> developers of tools that produce it, massively amplifying the
> benefit) but has also helped encourage consideration of tests when
> developing formats. Overall, I'm pretty sure the validator has had a
> major impact on the quality of syndication data on the Web.
> Regarding validation of microformat docs, the base syntax rules are
> those of XHTML, so that layer is well covered. For validation of a
> single specific format it's not hard to imagine a schema being made
> available for the purpose. Relax NG seems to have emerged as the most
> flexible language for this kind of job. A Relax NG schema could
> (somehow) be associated with each XMDP profile.
> What may be problematic is validation of docs where there are multiple
> microformats in play and interactions that cannot be conveniently
> expressed in something like Relax NG. For example if a blog post
> containing hReview is followed by one containing hCalendar, both
> appear on the blog front page which already uses XFN... My guess is
> that an awful lot could be covered at the syntax level (in the same
> way as XHTML+SVG+MathML can be validated), but suspect that because
> the data is kind-of tunnelled, it might leak a little, especially
> where individual entities could take different roles under different
> profiles.
> I believe there may be a fairly straightforward alternate/complementary
> solution, working above the syntax: use an XSLT transformation to
> RDF/XML (i.e. GRDDL), merge the result with an RDF Schema/OWL ontology
> which effectively contains the rules, then run model consistency
> checking, i.e. a semantic validator. This could potentially be fully
> automated and operate on arbitrary docs/microformat combinations.
> (I've been meaning to try the same technique with Atom for a while,
> never enough time...)
> If a particular microformat does happen to be associated with an RDF
> Schema/OWL ontology then this should be already possible (with a
> little glue). For example there are already RDF mappings for vCard
> [3], iCalendar [4] and a model for reviews [5], so hCard, hCalendar
> and hReview are already candidates. (The extent to which the semantics
> of the schemas/ontologies can encode the domain language rules is
> another matter - my guess is this will again be "mostly").
> What I'm not sure of is how best to derive the rules for what is/isn't
> allowed where RDF schemas/OWL ontologies aren't available. This
> shouldn't be a major issue: there isn't a flood of new formats to deal
> with, so creating the schemas as needed is feasible. (Maybe the
> machine-readable data in XMDP docs can help?) The other issue that
> stands out is how to determine automatically that a given RDF schema
> should be associated with a given microformat doc. Again, this isn't a
> showstopper: the interesting microformat docs will contain the URI of
> their profile in the <head>, so all any application such as a
> validator would need is a table mapping these to schema/ontology URIs,
> something that could be prepared manually if need be.
> Cheers,
> Danny.
> [1] http://microformats.org/discuss/mail/microformats-discuss/2005-July/000306.html
> [2] http://feedvalidator.org/
> [3] http://www.w3.org/TR/vcard-rdf
> [4] http://esw.w3.org/topic/RdfCalendar
> [5] http://www.purl.org/stuff/rev
> http://dannyayers.com
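
On the lookup table described at the end of the quoted message: a minimal
Python sketch of mapping the profile URIs found in a document's <head> to
schema/ontology URIs. The profile URIs under example.org are hypothetical
placeholders, not the real profile locations; the schema URIs are the
ones given in references [3], [4] and [5] above. The function name is
also made up, and the sketch assumes the document parses as well-formed
XML.

```python
import xml.etree.ElementTree as ET

# Manually prepared table. Keys are HYPOTHETICAL profile URIs; values are
# the RDF mappings cited as [3], [4] and [5] in the quoted message.
PROFILE_TO_SCHEMA = {
    "http://example.org/profiles/hcard": "http://www.w3.org/TR/vcard-rdf",
    "http://example.org/profiles/hcalendar": "http://esw.w3.org/topic/RdfCalendar",
    "http://example.org/profiles/hreview": "http://www.purl.org/stuff/rev",
}


def schemas_for_document(xhtml_text):
    """Return schema URIs for every known profile listed on <head>."""
    root = ET.fromstring(xhtml_text)
    schemas = []
    for element in root.iter():
        # Drop any "{namespace}" prefix ElementTree puts on tag names.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "head":
            continue
        # The profile attribute may hold several space-separated URIs.
        for profile_uri in element.get("profile", "").split():
            schema_uri = PROFILE_TO_SCHEMA.get(profile_uri)
            if schema_uri is not None:
                schemas.append(schema_uri)
    return schemas
```

Profiles that are not in the table are silently skipped here; a validator
would more likely report them so the table can be extended by hand.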
