[uf-discuss] ufXtract - new microformats parser

Glenn Jones glenn.jones at madgex.com
Mon Nov 26 04:58:13 PST 2007


 
Guillaume Lebleu wrote:
> I was wondering what the configuration objects look like. Do you use a
> grammar for each uf expressed?

They are C# collections. The plan is that once I have tuned the
component's compliance, I will add XML serialisation. This will mean that
anyone will be able to define their own POSH pattern or test new uf
ideas.

I believe this is similar to how Michael Kaply used JavaScript objects
to define microformats in Operator. Take a look at hAtom.js on
http://www.kaply.com/weblog/operator-user-scripts/. 

The XML from a ufXtract configuration object should look like:

<ufformatdescriber>
	<name>geo</name>
	<description>Location constructed of latitude and longitude</description>
	<type>geo</type>
	<ufelementdescriber name="geo" attribute="class"
		mandatory="false" multiples="true" concatenatevalues="false"
		type="text">
		<ufelementdescriber name="latitude" attribute="class"
			mandatory="false" multiples="false" concatenatevalues="false"
			type="text" />
		<ufelementdescriber name="longitude" attribute="class"
			mandatory="false" multiples="false" concatenatevalues="false"
			type="text" />
	</ufelementdescriber>
</ufformatdescriber>
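
To make the shape of such a describer concrete, here is a minimal sketch of
reading it back into objects. It is written in Python rather than ufXtract's
C#, and the class and function names (ElementDescriber, FormatDescriber,
load_format) are made up for illustration; they are not ufXtract's actual
types. It uses a well-formed copy of the geo describer.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Well-formed copy of the geo describer from this message.
GEO_XML = """
<ufformatdescriber>
  <name>geo</name>
  <description>Location constructed of latitude and longitude</description>
  <type>geo</type>
  <ufelementdescriber name="geo" attribute="class" mandatory="false"
      multiples="true" concatenatevalues="false" type="text">
    <ufelementdescriber name="latitude" attribute="class" mandatory="false"
        multiples="false" concatenatevalues="false" type="text" />
    <ufelementdescriber name="longitude" attribute="class" mandatory="false"
        multiples="false" concatenatevalues="false" type="text" />
  </ufelementdescriber>
</ufformatdescriber>
"""


@dataclass
class ElementDescriber:  # hypothetical name, not a ufXtract class
    name: str
    attribute: str
    mandatory: bool
    multiples: bool
    concatenate_values: bool
    type: str
    children: List["ElementDescriber"] = field(default_factory=list)


@dataclass
class FormatDescriber:  # hypothetical name, not a ufXtract class
    name: str
    description: str
    type: str
    root: ElementDescriber


def _flag(node, key):
    # Attributes such as mandatory="false" are stored as text; map to bool.
    return node.get(key, "false").lower() == "true"


def load_element(node):
    # Recurse over directly nested <ufelementdescriber> children only.
    return ElementDescriber(
        name=node.get("name"),
        attribute=node.get("attribute"),
        mandatory=_flag(node, "mandatory"),
        multiples=_flag(node, "multiples"),
        concatenate_values=_flag(node, "concatenatevalues"),
        type=node.get("type"),
        children=[load_element(c) for c in node.findall("ufelementdescriber")],
    )


def load_format(xml_text):
    doc = ET.fromstring(xml_text)
    return FormatDescriber(
        name=doc.findtext("name"),
        description=doc.findtext("description"),
        type=doc.findtext("type"),
        root=load_element(doc.find("ufelementdescriber")),
    )


geo = load_format(GEO_XML)
print(geo.name, [child.name for child in geo.root.children])
```

The nesting of ufelementdescriber elements maps directly onto a tree of
objects, which is why a new pattern can be described without touching the
parser itself.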

These are more complex in real life, but this should give you an idea. You
cannot define everything this way; there are some rules, like the hCard
implied 'n' optimization, which cannot be described with this type of
schema. That said, it covers most cases without having to add new
hardcoded rules to the parser.
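
For comparison, this is roughly what a hardcoded rule of that kind looks
like. The sketch below is in Python, not ufXtract's C#, and follows the
implied 'n' optimization as it is commonly read from the hCard spec: when
'n' is missing, 'fn' differs from 'org', and 'fn' is exactly two words, the
name parts are inferred from word order. The function name implied_n and
its return shape are assumptions made for illustration.

```python
def implied_n(fn, org=None):
    """Infer (family_name, given_name) from fn, or None if no rule applies.

    Hypothetical helper: a sketch of the hCard implied 'n' optimization,
    not ufXtract's code. Call it only when the hCard has no explicit 'n'.
    """
    # If fn and org are the same, the card describes an organisation.
    if org is not None and fn.strip() == org.strip():
        return None

    words = fn.split()
    if len(words) != 2:
        return None  # the optimization only covers two-word names

    first, second = words

    # "Jones, Glenn": a trailing comma marks the first word as family name.
    if first.endswith(","):
        return (first[:-1], second)

    # "Jones G" or "Jones G.": a lone initial in second place is the given name.
    if len(second) == 1 or (len(second) == 2 and second.endswith(".")):
        return (first, second)

    # Default "Glenn Jones": given name first, family name second.
    return (second, first)


print(implied_n("Glenn Jones"))
print(implied_n("Jones, Glenn"))
```

The result depends on word order and punctuation inside a single value,
which is procedural logic rather than something a list of element names and
attributes can express.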

Glenn Jones	
www.glennjones.net



