From charlvn at charlvn.com Fri Aug 7 04:41:23 2009
From: charlvn at charlvn.com (Charl van Niekerk)
Date: Fri Aug 7 04:41:28 2009
Subject: [uf-dev] PHP XPath Extractor
Message-ID: <3817f86f0908070441i4d20d340mfad326ae4b34d8c1@mail.gmail.com>

Hi All,

I recently wrote a small PHP class called Extractor (for lack of a better
name). It uses the PHP5 DOM extension to parse an HTML document and then
applies a collection of XPath expressions to it in a hierarchical fashion:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/Extractor.php

The expressions and their hierarchy are read from a simple XML
configuration file, for example:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/config.xml

A simple example of use would be:

http://code.google.com/p/hyperdata/source/browse/trunk/extractor/index.php

This is still very much pre-alpha, but any feedback in the meantime would
be very welcome!

Thanks,
Charl
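
For readers who have not opened the links, the general idea of applying XPath expressions hierarchically with the PHP DOM extension might look roughly like the sketch below. It is not taken from Extractor.php: the function name extractNodes, the rule layout, and the sample markup are all made up for illustration. The rules are a plain PHP array here rather than an XML configuration file, and the syntax assumes a current PHP version rather than the PHP5 of the original post.

```php
<?php
// Hypothetical sketch only: extractNodes(), the $rules layout and the sample
// HTML are illustrative assumptions, not the actual Extractor class or config.

/**
 * Apply each rule's XPath expression relative to $context. Rules with
 * 'children' recurse into every matched node; leaf rules return trimmed text.
 */
function extractNodes(DOMXPath $xpath, DOMNode $context, array $rules): array
{
    $result = [];
    foreach ($rules as $name => $rule) {
        $items = [];
        foreach ($xpath->query($rule['path'], $context) as $node) {
            $items[] = empty($rule['children'])
                ? trim($node->textContent)
                : extractNodes($xpath, $node, $rule['children']);
        }
        $result[$name] = $items;
    }
    return $result;
}

$html = <<<HTML
<html><body>
  <div class="vcard"><span class="fn">Alice</span><a class="url" href="http://a.example/">site</a></div>
  <div class="vcard"><span class="fn">Bob</span></div>
</body></html>
HTML;

// Outer expression selects each card; inner expressions are relative to it.
$rules = [
    'vcard' => [
        'path' => "//div[@class='vcard']",
        'children' => [
            'fn'  => ['path' => ".//span[@class='fn']"],
            'url' => ['path' => ".//a[@class='url']/@href"],
        ],
    ],
];

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();

$data = extractNodes(new DOMXPath($doc), $doc, $rules);
print_r($data);
```

The point of the nesting is that the inner expressions start with "." and are evaluated with the matched outer node as the context, so each card only picks up its own name and URL.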