[uf-discuss] Re: Perl microformat parsing

Tatsuhiko Miyagawa miyagawa at gmail.com
Sat Feb 23 02:46:40 PST 2008


On 2/22/08, Takatsugu Shigeta <takatsugu.shigeta at gmail.com> wrote:
> my $url = 'http://diveintomark.org/projects/greasemonkey/hcard/tests/2-4-2-vcard.xhtml';
>
> my $fn = scraper {
>    process '.vcard .fn', 'fn[]' => 'TEXT';
>    process '.vcard .tel', 'tel[]' => 'TEXT';
>    process '.vcard .title', 'title[]' => 'TEXT';
>    result 'fn', 'tel', 'title';
> }->scrape(URI->new($url));

For nicer nested output, put a scraper inside the outer one so that each
.vcard becomes its own hash:

use strict;
use Web::Scraper;
use URI;

my $uri = URI->new("http://diveintomark.org/projects/greasemonkey/hcard/tests/2-4-2-vcard.xhtml");

my $scraper = scraper {
    process ".vcard", "vcards[]" => scraper {
        process ".email", email => '@href';
        process ".fn", fullname => "TEXT";
        process ".tel", tel => "TEXT";
        process ".title", title => "TEXT";
    };
};
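my $result = $scraper->scrape($uri);

# Dump the nested structure; the output is shown after __END__.
use Data::Dumper;
print Dumper($result);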

__END__
$VAR1 = {
  'vcards' => [
    {
      'email' => bless( do{\(my $o = 'mailto:jfriday@host.com')}, 'URI::mailto' ),
      'tel' => '+1-919-555-7878',
      'fullname' => 'Joe Friday',
      'title' => 'Area Administrator, Assistant'
    },
  ]
};

Well, you get this vcard twice because the page has a nested .vcard, but
I guess that's fine :)
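
If the duplicate does bother you, one option is to filter the result after
scraping. The following is only a rough sketch, not part of Web::Scraper:
unique_vcards is a hypothetical helper name, and it assumes that two entries
count as duplicates when every field stringifies to the same value (URI
objects such as the email stringify to their URL). The sample data is made
up to mirror the dump above.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first of any vcards whose fields all match.
sub unique_vcards {
    my ($vcards) = @_;
    my (%seen, @unique);
    for my $vcard (@$vcards) {
        # Build a comparison key from the sorted field names and stringified values.
        my @parts;
        for my $field (sort keys %$vcard) {
            my $value = defined $vcard->{$field} ? "$vcard->{$field}" : '';
            push @parts, "$field=$value";
        }
        my $key = join "\x1F", @parts;
        push @unique, $vcard unless $seen{$key}++;
    }
    return \@unique;
}

# Made-up sample mirroring the duplicated entry in the dump above.
my $sample = [
    { fullname => 'Joe Friday', tel => '+1-919-555-7878' },
    { fullname => 'Joe Friday', tel => '+1-919-555-7878' },
];
my $unique = unique_vcards($sample);
print scalar(@$unique), "\n";    # prints 1
```

With the real scraper you would call it as unique_vcards($result->{vcards}).
Note that this only drops entries that are identical field for field; a
nested vcard with different content would still be kept.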

Thanks,

-- 
Tatsuhiko Miyagawa

