[uf-discuss] Somewhat Universal Microformat Parser
David Janes -- BlogMatrix
davidjanes at blogmatrix.com
Wed Dec 7 06:51:45 PST 2005
Here's what I've been working on for the last couple of days. It's a
service -- actually, a front end onto a Python library/framework -- that
can rip apart microformats into a (hopefully) simpler format that will
be easier for programs to manipulate.
pages:
- the interface [1]
- an example of hAtom parsing [2]
You can also paste XHTML fragments into the interface -- try something from the hReview page [3].
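
As a rough illustration of what "ripping apart" a fragment into a simpler format could mean, here is a minimal sketch using only the Python standard library. It is not the library described in this post: `extract_classes` and the sample fragment are hypothetical, and it assumes well-formed XHTML. It reads the `title` attribute of `<abbr>` elements, which is how microformats usually carry machine-readable dates.

```python
import xml.etree.ElementTree as ET

# Hypothetical hReview-like fragment, made up for this sketch.
FRAGMENT = """
<div class="hreview">
  <span class="summary">Great coffee</span>
  <abbr class="dtreviewed" title="2005-12-07">Dec 7</abbr>
  <span class="rating">4</span>
</div>
"""

def extract_classes(xhtml, names):
    """Return {class name: value} for the first element carrying each class."""
    root = ET.fromstring(xhtml)
    result = {}
    for element in root.iter():
        classes = element.get("class", "").split()
        for name in names:
            if name in classes and name not in result:
                # Prefer the machine-readable title on <abbr>, else the text.
                if element.tag == "abbr" and element.get("title"):
                    result[name] = element.get("title")
                else:
                    result[name] = "".join(element.itertext()).strip()
    return result

review = extract_classes(FRAGMENT, ["summary", "dtreviewed", "rating"])
print(review)
```

The result is a flat dictionary, which is much easier for a program to work with than the original markup.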
microformats supported:
- hatom - pretty good
- hreview - a lot of work is needed
- hcard - pretty good
- rel-tag - actually, a slightly expanded "rel-reviewed-tag" from hreview
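
For readers unfamiliar with rel-tag, the sketch below shows plain rel-tag extraction only, not the expanded "rel-reviewed-tag" variant mentioned above. In rel-tag the tag is the last path segment of the link's `href`, not the link text. The function name `extract_rel_tags` and the sample links are assumptions made for illustration, and the input is assumed to be well-formed XHTML.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, unquote

def extract_rel_tags(xhtml):
    """Return the tag names of all <a rel="tag"> links in the fragment."""
    root = ET.fromstring(xhtml)
    tags = []
    for link in root.iter("a"):
        rels = link.get("rel", "").split()
        href = link.get("href")
        if "tag" in rels and href:
            # The tag is the last path segment of the URL, percent-decoded.
            path = urlparse(href).path.rstrip("/")
            tags.append(unquote(path.rsplit("/", 1)[-1]))
    return tags

# Hypothetical fragment: two tag links and one ordinary link.
FRAGMENT = """<p>
  <a rel="tag" href="http://technorati.com/tag/microformats">uf</a>
  <a rel="tag" href="http://example.com/tags/python/">Python</a>
  <a href="http://example.com/about">about</a>
</p>"""

tags = extract_rel_tags(FRAGMENT)
print(tags)
```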
I hope to have vCalendar and xEntry in there this afternoon/tomorrow.
Here's what a parser looks like [4]
Regards, etc...
David
http://www.blogmatrix.com
[1] http://www.davidjanes.com/microformats/extract/
[2]
http://www.davidjanes.com/microformats/extract/?uri=http%3A%2F%2Fblog.davidjanes.com%2F&microformat=hatom&submit=Submit
[3] http://microformats.org/wiki/hreview
[4]
class MicroformatHReview(microformat.Microformat):
    def __init__(self):
        microformat.Microformat.__init__(self, "hreview")
        self.CollectClassText('version')
        self.CollectClassText('summary', text_type = microformat.TT_XML_INNER)
        self.CollectClassText('description', text_type = microformat.TT_XML_INNER)
        self.CollectClassText('type')
        self.CollectClassText('dtreviewed', text_type = microformat.TT_ABBR_DT)
        self.CollectClassText('info', text_type = microformat.TT_XML_OUTER)
        self.CollectClassText('reviewer', text_type = microformat.TT_XML_OUTER)
        self.CollectRelAttribute('permalink', 'href')
        self.CollectClassText('rating', text_type = microformat.TT_ABBR)
        self.CollectClassText('best', text_type = microformat.TT_ABBR)
        self.CollectClassText('worst', text_type = microformat.TT_ABBR)
        self.CollectClassModifier('item')
        self.CollectRelReparse('tag', reltag.MicroformatRelTag())
        self.CollectClassReparse('reviewer', hcard.MicroformatHCard())
        self.DeclareRepeatingName('reviewer')
        self.DeclareRepeatingName('tag')
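
The parser above is declarative: the constructor only registers rules, and the base class does the actual work. The post does not show that base class, so the following is a guess at how such a design might work, not the real `microformat.Microformat`. Everything here is hypothetical (`SketchMicroformat`, `SketchHReview`, the `Parse` method, and the simplified signatures without `text_type`), and it assumes well-formed XHTML and no nested roots.

```python
import xml.etree.ElementTree as ET

class SketchMicroformat:
    """Hypothetical stand-in for microformat.Microformat, not the real class."""

    def __init__(self, root_class):
        self.root_class = root_class
        self.class_rules = []    # class names whose text is collected
        self.rel_rules = []      # (rel value, attribute name) pairs
        self.repeating = set()   # names that collect into lists

    def CollectClassText(self, name):
        self.class_rules.append(name)

    def CollectRelAttribute(self, rel, attribute):
        self.rel_rules.append((rel, attribute))

    def DeclareRepeatingName(self, name):
        self.repeating.add(name)

    def _store(self, result, name, value):
        if name in self.repeating:
            result.setdefault(name, []).append(value)
        else:
            result.setdefault(name, value)  # keep the first value only

    def Parse(self, xhtml):
        """Return one dict per element whose class list has the root class."""
        root = ET.fromstring(xhtml)
        results = []
        for node in root.iter():
            if self.root_class not in node.get("class", "").split():
                continue
            result = {}
            for child in node.iter():
                classes = child.get("class", "").split()
                rels = child.get("rel", "").split()
                for name in self.class_rules:
                    if name in classes:
                        text = "".join(child.itertext()).strip()
                        self._store(result, name, text)
                for rel, attribute in self.rel_rules:
                    if rel in rels and child.get(attribute):
                        self._store(result, rel, child.get(attribute))
            results.append(result)
        return results


class SketchHReview(SketchMicroformat):
    """Cut-down analogue of the hReview parser, using the sketch base class."""

    def __init__(self):
        SketchMicroformat.__init__(self, "hreview")
        self.CollectClassText("summary")
        self.CollectClassText("rating")
        self.CollectRelAttribute("tag", "href")
        self.DeclareRepeatingName("tag")
```

Under these assumptions, `SketchHReview().Parse(xhtml)` returns a list of dictionaries, one per `hreview` element. Names declared as repeating hold lists, and all other names hold the first value found.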