[uf-discuss] Enumerating Microformats on a Page
haacked at gmail.com
Fri Mar 24 14:33:57 PST 2006
> First of all, welcome to the list, Phil. I encourage you to read
> through the list archives and the wiki, there's a lot of background
> material and previous discussion there.
Thanks Ryan. I'm working through the material, but as you mentioned below,
navigating and indexing content is a problem for the web as well. ;)
>> I suppose if I wanted to help both people and an aggregator find
>> various Microformats of interest, there could be a microformat for a
>> site index. My homepage could include it or simply link to it using
>> some other microformat.
> Hmm, this sounds to me like a theoretical argument. I'd like to hear
> what experience people have had here. Has anyone here worked on
> crawling to index microformats? If so, what challenges did you face?
Well, the closest experience I have is writing auto-discovery algorithms for
RSS and Atom links on a web page. Everybody seems to publish their Atom and
RSS links in a variety of ways. Some put them right on top. Some in the
footer. Some in a column. Yet others link to another page that then
displays a list of available feeds (category feeds, comment feeds, etc.).
As a user, I find that some designs make it difficult to find the link. If
there is one there, I pretty much always find it eventually. But I've seen
pages where it took a while. Likewise, if there is no RSS link, it can take
a while before I realize that.
However, when I point my auto-discovery algorithm at a page, my aggregator
quickly pops up a list of RSS feeds. Sometimes it discovers feeds I didn't
notice on the page but am interested in. I like this sort of discovery.
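
For anyone who hasn't written one of these, here is a rough sketch of the kind
of auto-discovery I mean. It is not my actual aggregator code. It assumes feeds
are advertised with rel="alternate" and an RSS or Atom MIME type on <link> or
<a> elements, and the names FeedLinkParser and discover_feeds are made up for
illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds in this sketch (an assumption, not a full list).
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collects (title, absolute URL) pairs for feed links in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Feeds may be advertised in the head (<link>) or in the body (<a>).
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower().strip()
        href = attr.get("href")
        if href and "alternate" in rels and mime in FEED_TYPES:
            # Resolve relative hrefs against the URL of the page itself.
            self.feeds.append((attr.get("title") or "", urljoin(self.base_url, href)))


def discover_feeds(html, base_url):
    """Return every feed link found in the given HTML, in document order."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.feeds
```

A page that only links to its feeds with plain anchors and no rel or type
attributes would slip past this, which is exactly the kind of publishing
variety I was complaining about above.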
Of course this can be solved with having everybody start using cleaner
>> Thus for the human, there would be a simple link to follow <a
>> href="/siteindex/" rel="siteindex">Site Map</a>. Likewise, my
>> would look for this if it didn't find the xmdp profile for a
>> sitemap on the current page.
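
To make the lookup I described there a bit more concrete, here is a small
sketch of how a crawler might find such a link. The rel value "siteindex" is
only my proposal from the quoted text, not an existing standard, and
RelLinkParser and find_site_index are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelLinkParser(HTMLParser):
    """Collects href values of <a> and <link> elements carrying a given rel."""

    def __init__(self, wanted_rel):
        super().__init__()
        self.wanted_rel = wanted_rel.lower()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attr = dict(attrs)
        # rel is a space-separated list, so compare token by token.
        rels = (attr.get("rel") or "").lower().split()
        if self.wanted_rel in rels and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_site_index(html, page_url, rel="siteindex"):
    """Return the absolute URL of the first matching link, or None."""
    parser = RelLinkParser(rel)
    parser.feed(html)
    parser.close()
    if not parser.hrefs:
        return None
    return urljoin(page_url, parser.hrefs[0])
```

The human follows the visible "Site Map" link, and the crawler follows the
same link by its rel value, so there is only one thing to publish.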
> Hmm, what's wrong with using the standard 'Contents' [http://
Probably nothing. Just my ignorance. I'll take a look. Thanks!