[uf-dev] Microformat Parser for .Net
Paul Kinlan
paul.kinlan at gmail.com
Fri Sep 7 13:58:27 PDT 2007
Hi all,
I am new to the development list, but I have been on the uf-discuss list for
a while now. I thought that this list was the best place to announce that I
have created a usable [although beta] release of a generic microformat
parser for .Net.
The project can be found on CodePlex at http://www.codeplex.com/microformat.
The current release is Iteration 3.
The parser is stream based and uses an application configuration (see below
for an example) to define how it should parse the HTML/XML stream. Because
the rules live in configuration, a change to a microformat spec, or the
introduction of a new microformat, needs only a configuration change: no
framework code has to change for users of the framework to see the new data.
<configSections>
  <section name="MicroformatsSection"
           type="Microformats.ConfigurationSections.MicroformatConfigSection, Microformat.net"/>
</configSections>
<MicroformatsSection>
  <Microformats>
    <Microformat type="rel-tag" rootType="rel" root="tag" dataType="System.Uri" />
    <Microformat type="hCard" rootType="class" root="vcard" dataType="System.String">
      <Fields>
        <Field name="fn" dataType="System.String" plurality="Singular"/>
        <Field name="url" dataType="System.Uri" plurality="Singular"/>
        <Field name="email" dataType="System.Uri" plurality="Singular"/>
        <Field name="adr" dataType="Microformat" plurality="Singular"/>
      </Fields>
    </Microformat>
    <Microformat type="adr" rootType="class" root="adr" dataType="System.String">
      <Fields>
        <Field name="post-office-box" dataType="System.String" plurality="Singular"/>
        <Field name="extended-address" dataType="System.String" plurality="Singular"/>
        <Field name="street-address" dataType="System.String" plurality="Singular"/>
        <Field name="locality" dataType="System.String" plurality="Singular"/>
        <Field name="region" dataType="System.String" plurality="Singular"/>
        <Field name="postal-code" dataType="System.String" plurality="Singular"/>
        <Field name="country-name" dataType="System.String" plurality="Singular"/>
      </Fields>
    </Microformat>
  </Microformats>
</MicroformatsSection>
The above configuration says that the following microformats are to be
searched for: rel-tag, hCard and adr. Microformat configurations can also be
nested: the hCard spec allows an adr inside an hCard, so the hCard entry
declares an "adr" field with dataType="Microformat" and reuses the separate
adr definition. This saves duplicating configuration information.
(Unfortunately, it is currently possible to define a circular reference in
the configuration, and plurality of elements is not implemented yet. Both
will be fixed soon.)
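
To illustrate the circular reference problem, here is a rough sketch in
Python (not code from the project) of how a configuration could be checked
for cycles before parsing starts. It assumes that a Field with
dataType="Microformat" refers to the Microformat whose type equals the
field's name; that lookup rule and the function names are my assumptions for
the example, not the framework's API.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the configuration, used only for this illustration.
CONFIG = """
<Microformats>
  <Microformat type="hCard" rootType="class" root="vcard" dataType="System.String">
    <Fields>
      <Field name="fn" dataType="System.String" plurality="Singular"/>
      <Field name="adr" dataType="Microformat" plurality="Singular"/>
    </Fields>
  </Microformat>
  <Microformat type="adr" rootType="class" root="adr" dataType="System.String">
    <Fields>
      <Field name="locality" dataType="System.String" plurality="Singular"/>
    </Fields>
  </Microformat>
</Microformats>
"""


def nested_references(config_xml):
    """Map each microformat type to the microformat types its fields nest.

    Assumption: a nested field's name is the type of the microformat it uses.
    """
    root = ET.fromstring(config_xml.strip())
    graph = {}
    for mf in root.iter("Microformat"):
        graph[mf.get("type")] = [
            field.get("name")
            for field in mf.iter("Field")
            if field.get("dataType") == "Microformat"
        ]
    return graph


def find_cycle(graph):
    """Return the list of types forming a cycle, or None if there is none."""
    in_progress, done = set(), set()
    path = []

    def visit(node):
        in_progress.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            if nxt in in_progress:
                # Reached a type that is still being expanded: a cycle.
                return path[path.index(nxt):] + [nxt]
            if nxt not in done:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        in_progress.discard(node)
        done.add(node)
        return None

    for node in graph:
        if node not in done:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    print(find_cycle(nested_references(CONFIG)))
```

A check like this could run once when the configuration is loaded, so a bad
configuration is rejected up front instead of sending the parser into
endless recursion.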
This configuration does not define the whole hCard spec; I trimmed it to
keep the example simple. The consequence is that any part of a microformat
that is not listed in the configuration will not appear in the output of the
framework.
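
The idea of configuration-driven extraction can be sketched roughly as
follows, again in Python rather than .Net. This is not how the framework is
implemented: it builds a tree instead of reading a stream, it requires
well-formed markup, and the CONFIG dictionary, the sample markup and the
function names are all invented for the example. It only shows that
configured fields (including a nested adr) are returned and unconfigured
ones (here "tel") are skipped.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-in for the XML configuration:
# microformat type -> root class name and the fields to extract.
CONFIG = {
    "hCard": {"root": "vcard",
              "fields": {"fn": "System.String", "adr": "Microformat"}},
    "adr": {"root": "adr",
            "fields": {"locality": "System.String"}},
}

# Made-up sample markup; "tel" is deliberately absent from CONFIG.
SAMPLE = """
<div class="vcard">
  <span class="fn">Jane Doe</span>
  <span class="tel">555-0100</span>
  <div class="adr"><span class="locality">Springfield</span></div>
</div>
"""


def has_class(element, name):
    """True if the element's class attribute contains the given class name."""
    return name in (element.get("class") or "").split()


def first_with_class(element, name):
    """Return the first descendant carrying the class name, or None."""
    for candidate in element.iter():
        if candidate is not element and has_class(candidate, name):
            return candidate
    return None


def extract(element, mf_type):
    """Collect only the fields that the configuration lists for mf_type."""
    result = {}
    for field, data_type in CONFIG[mf_type]["fields"].items():
        node = first_with_class(element, field)
        if node is None:
            continue
        if data_type == "Microformat":
            # Assumption: the field name doubles as the nested type name.
            result[field] = extract(node, field)
        else:
            result[field] = "".join(node.itertext()).strip()
    return result


def parse(markup):
    """Return (type, fields) pairs for every configured root that is found."""
    root = ET.fromstring(markup.strip())
    found = []
    for mf_type, spec in CONFIG.items():
        for element in root.iter():
            if has_class(element, spec["root"]):
                found.append((mf_type, extract(element, mf_type)))
    return found


if __name__ == "__main__":
    for mf_type, fields in parse(SAMPLE):
        print(mf_type, fields)
```

In this sketch the nested adr is reported twice, once inside the hCard and
once on its own, because adr is also configured as a top-level microformat.
Whether the framework does the same is not something the sketch tells you.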
I still have a lot of work to do, but the framework already appears (to me
at least) to be quite flexible. I would greatly appreciate any comments and
feedback, and if you use the framework I would love to hear about it. If
anyone is interested in joining the project, let me know.
Kind Regards,
Paul Kinlan
NB: The code is released under the Microsoft Permissive Licence. This
licence fits best with the SGML reader code by Chris Lovett that is included
in the project.