[uf-discuss] Enumerating Microformats on a Page

Phil Haack haacked at gmail.com
Fri Mar 24 14:33:57 PST 2006

> First of all, welcome to the list, Phil. I encourage you to read  
> through the list archives and the wiki, there's a lot of background  
> material and previous discussion there.

Thanks, Ryan. I'm working through the material, but as you mention below,
navigating and indexing content is a problem for the web as well. ;)

>> I suppose if I wanted to help both people and an aggregator find 
>> various Microformats of interest, there could be a microformat for a 
>> site index.  My homepage could include it or simply link to it using 
>> some other microformat.

> Hmm, this sounds to me like a theoretical argument. I'd like to hear  
> what experience people have had here. Has anyone here worked on  
> crawling to index microformats? If so, what challenges did you face?

Well, the closest experience I have is writing auto-discovery algorithms for
RSS and Atom links on a web page.  Everybody seems to publish their Atom and
RSS links in a different way. Some put them right at the top. Some in the
footer. Some in a column.  Yet others link to another page that then
displays a list of available feeds (category feeds, comment feeds, etc.).

As a user, I find that some designs make it difficult to find the link. If
there is one, I pretty much always find it eventually, but I've seen pages
where it took a while.  Likewise, if there is no RSS link, it can take a while
before I realize that.

However, when I point my auto-discovery algorithm at a page, my aggregator
quickly pops up a list of RSS feeds.  Sometimes it discovers feeds I didn't
notice on the page, but am interested in.  I like this sort of discovery.
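
A minimal sketch of this kind of auto-discovery, in Python, might look like
the following. It is an illustration only, not the algorithm from the
aggregator mentioned above: it assumes feeds are advertised with
rel="alternate" and an RSS or Atom MIME type on <link> or <a> elements, and
the names FeedLinkFinder and discover_feeds are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types conventionally used to advertise RSS and Atom feeds.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkFinder(HTMLParser):
    """Collect feed URLs from <link> and <a> elements (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Feeds may be in the head (<link>) or anywhere in the body (<a>).
        if tag not in ("link", "a"):
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        mime = attr.get("type", "").lower().strip()
        href = attr.get("href", "")
        # Assumption: a feed is marked rel="alternate" plus a feed MIME type.
        if href and "alternate" in rels and mime in FEED_TYPES:
            url = urljoin(self.base_url, href)
            if url not in self.feeds:
                self.feeds.append(url)


def discover_feeds(html, base_url):
    """Return absolute feed URLs in document order, without duplicates."""
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    return finder.feeds
```

Because the parser visits every element, it does not matter whether the
feed link sits at the top, in the footer, or in a column. Feeds that are
only listed on a separate page would need that page to be fetched and
scanned as well, which this sketch does not attempt.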

Of course, this could be solved by having everybody start using cleaner
designs. ;)

>> Thus for the human, there would be a simple link to follow: <a
>> href="/siteindex/" rel="siteindex">Site Map</a>.  Likewise, my aggregator
>> would look for this if it didn't find the XMDP profile for a sitemap on
>> the current page.

> Hmm, what's wrong with using the standard 'Contents'
> [http://www.w3.org/TR/REC-html40/types.html#type-links]?

Probably nothing.  Just my ignorance. I'll take a look. Thanks!
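
For comparison, a rough Python sketch of how an aggregator might look for
either kind of link is shown below. It assumes the standard HTML 4
'Contents' link type (and its 'ToC' synonym) should win over the
"siteindex" value proposed earlier in this thread, which is not a
registered link type. The names IndexLinkFinder and find_site_index are
hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Checked in priority order. "contents" and "toc" come from the HTML 4 link
# types; "siteindex" is only the value floated in this thread (assumption).
INDEX_RELS = ("contents", "toc", "siteindex")


class IndexLinkFinder(HTMLParser):
    """Record the first href seen for each rel value of interest."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = {}  # rel value -> absolute URL

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attr = dict(attrs)
        href = attr.get("href")
        if not href:
            return
        # rel holds space-separated, case-insensitive link types.
        for rel in (attr.get("rel") or "").lower().split():
            if rel in INDEX_RELS and rel not in self.found:
                self.found[rel] = urljoin(self.base_url, href)


def find_site_index(html, base_url):
    """Return the best candidate index URL, or None if the page has none."""
    finder = IndexLinkFinder(base_url)
    finder.feed(html)
    for rel in INDEX_RELS:
        if rel in finder.found:
            return finder.found[rel]
    return None
```

With the example link from the quoted message and a base URL of
http://example.com/, find_site_index would resolve the href to
http://example.com/siteindex/.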
