[uf-discuss] stats on well formed XHTML
kevinmarks at gmail.com
Fri Jan 18 00:06:35 PST 2008
What I found with the Technorati crawler was that Atom timestamps
were more reliable than RSS ones, as RSS timezones were underspecified.
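To illustrate the difference (a rough sketch only, not the Technorati crawler's code): Atom dates are RFC 3339 and always carry a UTC offset, while RSS 2.0 uses RFC 822 style dates, and feeds that leave the zone off give you a time with no offset at all. The sample timestamps and the helper names parse_atom and parse_rss are made up for this example.

```python
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def parse_atom(ts: str) -> datetime:
    # Atom uses RFC 3339, so the UTC offset is part of the value itself.
    return datetime.fromisoformat(ts)


def parse_rss(ts: str) -> datetime:
    # RSS 2.0 uses RFC 822 style dates. If the feed omits the zone,
    # the parser returns a naive datetime (tzinfo is None).
    return parsedate_to_datetime(ts)


# Sample values, invented for illustration.
atom = parse_atom("2008-01-18T00:06:35-08:00")
rss_named = parse_rss("Fri, 18 Jan 2008 00:06:35 PST")
rss_bare = parse_rss("Fri, 18 Jan 2008 00:06:35")

print(atom.utcoffset())       # offset is explicit in the Atom value
print(rss_named.utcoffset())  # offset inferred from the zone abbreviation
print(rss_bare.tzinfo)        # no zone given, so nothing to anchor the time to
```

A crawler that gets the last form has to guess the zone, which is where the unreliability creeps in.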
Talking of hAtom, here's a tool that uses it:
Niall told me last night that he's submitted a patch to MT 4.1 that
will make it output hAtom too.
On Jan 17, 2008 4:05 PM, Kevin Burton <burton at tailrank.com> wrote:
> > Not so. The Internet Archive knows the first time they've seen an URL,
> > over the past ten years; they can also tell you when the content has
> > significantly changed. Obviously, there is a bias towards pages (and
> > sites) with higher traffic, but that seems reasonable if you're
> > evaluating standard practices. ~ Derrick Pallas
> Yes... but it would suffer from crawler priority bias.
> If it was a low-ranked page it might take a few months to get around to
> crawling it.
> Spinn3r would have better data here because we're real time....
> Observing the URL and hAtom timestamp as I mentioned before would be
> nice but would suffer from bias again.