[uf-dev] PHP XPath Extractor

Charl van Niekerk charlvn at charlvn.com
Fri Aug 7 04:41:23 PDT 2009


Hi All,

I recently wrote a small PHP class called Extractor (for lack of a
better name). It uses the PHP5 DOM extension to parse an HTML document
and then applies a collection of XPath expressions to it in a
hierarchical fashion:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/Extractor.php
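
As a rough illustration of the idea (this is not the actual Extractor
code), a minimal sketch of hierarchical XPath extraction with the DOM
extension might look like the following. The sample markup, the class
names in it and the output keys are all made up for the example:

```php
<?php
// Hypothetical sample markup; not taken from the Extractor project.
$html = <<<HTML
<html><body>
  <div class="vcard">
    <span class="fn">Alice Example</span>
    <a class="url" href="http://example.org/alice">home</a>
  </div>
  <div class="vcard">
    <span class="fn">Bob Example</span>
    <a class="url" href="http://example.org/bob">home</a>
  </div>
</body></html>
HTML;

$doc = new DOMDocument();
// loadHTML() tolerates real-world markup; @ silences parser warnings.
@$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$cards = array();
// The outer expression selects each container node...
foreach ($xpath->query('//div[@class="vcard"]') as $card) {
    // ...and the inner expressions are evaluated relative to that node.
    $cards[] = array(
        'fn'  => $xpath->evaluate('string(.//span[@class="fn"])', $card),
        'url' => $xpath->evaluate('string(.//a[@class="url"]/@href)', $card),
    );
}

print_r($cards);
```

The second argument to evaluate() is the context node, which is what
makes the inner expressions relative to each matched container.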

The expressions and their hierarchy are read from a simple XML
configuration file, for example:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/config.xml
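
To give a feel for how a configuration-driven version could work, here
is a small sketch. Note that the XML layout below (nested "field"
elements with "name" and "xpath" attributes) and the extract_fields()
helper are assumptions made up for this example; they are not
necessarily the format or API used by the config.xml and Extractor.php
linked above:

```php
<?php
// Hypothetical config layout: nested <field> elements, each carrying a
// name and an XPath expression relative to its parent's matches.
$config = <<<XML
<extractor>
  <field name="cards" xpath="//div[@class='vcard']">
    <field name="fn" xpath=".//span[@class='fn']"/>
    <field name="url" xpath=".//a[@class='url']/@href"/>
  </field>
</extractor>
XML;

// Hypothetical sample markup to run the config against.
$html = '<html><body>'
      . '<div class="vcard"><span class="fn">Alice Example</span>'
      . '<a class="url" href="http://example.org/alice">home</a></div>'
      . '</body></html>';

// Walk the config tree; each level's expression is applied to every
// node matched by the level above it.
function extract_fields(DOMXPath $xpath, SimpleXMLElement $parent, $context = null)
{
    $result = array();
    foreach ($parent->field as $field) {
        $expr  = (string) $field['xpath'];
        $nodes = ($context === null)
            ? $xpath->query($expr)
            : $xpath->query($expr, $context);

        $values = array();
        foreach ($nodes as $node) {
            if (count($field->field) > 0) {
                // Nested fields: recurse with this node as the context.
                $values[] = extract_fields($xpath, $field, $node);
            } else {
                // Leaf field: take the text (or attribute) value.
                $values[] = trim($node->textContent);
            }
        }
        $result[(string) $field['name']] = $values;
    }
    return $result;
}

$doc = new DOMDocument();
@$doc->loadHTML($html); // @ silences warnings from lenient HTML parsing
$data = extract_fields(new DOMXPath($doc), simplexml_load_string($config));

print_r($data);
```

Keeping the expressions in a separate file like this means the
extraction rules can be changed without touching the PHP code.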

A simple example of use would be:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/index.php

This is still very much pre-alpha, but any feedback in the meantime
would be very welcome!

Thanks,
Charl

