[uf-discuss] Parsing XFN in PHP
Julian Bond
julian_bond at voidstar.com
Tue Apr 8 05:10:35 PDT 2008
I need some advice about reading rel="me" links in arbitrary web pages
using PHP. I intend to use this to help build a lifestream-style
function. The basic aim is to cut down the amount of data entry the
user has to do. When they give me a MyBlogLog, FriendFeed or Plaxo Pulse
page that lists links to their profile pages, I should be able to
avoid asking them for all of those links again. So:
- User gives me a URL for one of their profile pages
- Use cURL to collect the source
- Parse the source looking for links with rel="me"
- Extract an array of link URL => link text
- Do something useful with the array. (???? followed by Profit!)
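
The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested library: fetchPage() and extractRelMeLinks() are hypothetical names, and it assumes PHP's bundled curl and dom extensions are available. It relies on DOMDocument::loadHTML(), which uses libxml's forgiving HTML parser rather than a strict XML parser, so badly formed markup usually still yields a usable tree. It only looks at <a> elements (not <link>), and treats rel as a space-separated list so that rel="me nofollow" also counts.

```php
<?php
// Hypothetical helper: fetch a page body with cURL; returns null on failure.
function fetchPage(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // profile URLs often redirect
        CURLOPT_TIMEOUT        => 10,   // arbitrary timeout, in seconds
    ]);
    $body = curl_exec($ch);
    return is_string($body) ? $body : null;
}

// Hypothetical helper: return [['url' => ..., 'text' => ...], ...]
// for every <a> whose rel attribute contains the token "me".
function extractRelMeLinks(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects an empty string
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        // rel is a space-separated token list; compare case-insensitively.
        $rel    = strtolower(trim($a->getAttribute('rel')));
        $tokens = preg_split('/\s+/', $rel, -1, PREG_SPLIT_NO_EMPTY);
        if (in_array('me', $tokens, true) && $a->hasAttribute('href')) {
            $links[] = [
                'url'  => $a->getAttribute('href'),
                'text' => trim($a->textContent),
            ];
        }
    }
    return $links;
}
```

Usage would be something like $links = extractRelMeLinks(fetchPage($url) ?? ''); relative hrefs would still need resolving against the page URL, and pages that do not declare their character encoding may need converting before parsing.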
I've been searching this morning for a PHP library to do the parsing and
link extraction, or PHP examples, or an example regex to use in
preg_match_all, or something/anything, without success. Since the source
data is probably badly written and broken HTML, I don't think I can use
XML methods, as all the XML unserialising code I've used barfs on badly
formed XML. One possibility, I suppose, is to run it through HTML Tidy
first, but I run the (admittedly small) risk of HTML Tidy wiping out
some of the links.
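
For the regex route, a rough sketch with preg_match_all might look like the following. relMeLinksRegex() is a hypothetical name, and the patterns are assumptions about typical markup: they only handle quoted attribute values and properly closed <a>...</a> pairs, and a regex will always be more fragile on real-world HTML than a tolerant parser.

```php
<?php
// Hypothetical helper: regex-only extraction of rel="me" links.
// Returns [['url' => ..., 'text' => ...], ...].
function relMeLinksRegex(string $html): array
{
    $links = [];
    // Capture the attribute string and inner content of each <a>...</a>.
    if (!preg_match_all('~<a\b([^>]*)>(.*?)</a>~is', $html, $matches, PREG_SET_ORDER)) {
        return $links;
    }
    foreach ($matches as $m) {
        $attrs = $m[1];
        // Quoted rel value only; unquoted values are not handled.
        if (!preg_match('~(?<![\w-])rel\s*=\s*(["\'])(.*?)\1~is', $attrs, $rel)) {
            continue;
        }
        // rel is a space-separated token list, e.g. rel="me nofollow".
        $tokens = preg_split('~\s+~', strtolower(trim($rel[2])), -1, PREG_SPLIT_NO_EMPTY);
        if (!in_array('me', $tokens, true)) {
            continue;
        }
        if (!preg_match('~(?<![\w-])href\s*=\s*(["\'])(.*?)\1~is', $attrs, $href)) {
            continue;
        }
        $links[] = [
            'url'  => html_entity_decode($href[2]),  // e.g. &amp; in query strings
            'text' => trim(strip_tags($m[2])),       // drop nested markup in the link text
        ];
    }
    return $links;
}
```

This avoids the HTML Tidy step entirely, at the cost of missing links whose markup falls outside those assumptions.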
So what do people use to consume XFN with PHP?
--
Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173
Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433
Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat
Not Tested On Animals