[uf-discuss] Enumerating Microformats on a Page

Breton Slivka zen at zenpsycho.com
Fri Mar 24 12:40:31 PST 2006


I find it to be an interesting idea, though I strongly suggest that  
such a sitemap should be optional, and user agents should crawl the  
entire site when no such sitemap exists.

Historically, sitemaps have served several specific purposes:
  - Provide links to orphan pages
  - Exclude pages which the author does not want indexed (as in robots.txt)
  - Provide an index page for users

It seems to me that a sitemap would not allow significantly more
rapid discovery of microformats than simply crawling the site
normally and looking for supported root class names. You also face the
problem of a sitemap falling out of sync with your content, so that
newer or forgotten content is missed because of an out-of-date
TOC. To solve that problem you end up maintaining two versions of the
data, and you've eliminated one of the key benefits of microformats,
namely only having to maintain one source of data.
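
To make "looking for supported root class names" concrete, here is a
minimal sketch in Python. The set of class names and the function name
are my own assumptions for illustration; a real user agent would carry
its own list of the formats it supports.

```python
from html.parser import HTMLParser

# Assumed, illustrative list of root class names (not a complete registry).
ROOT_CLASSES = {"vcard", "vevent", "vcalendar", "hreview"}


class RootClassFinder(HTMLParser):
    """Collects any known root class names seen on start tags."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.found.update(ROOT_CLASSES.intersection(value.split()))


def find_microformats(html):
    """Return the set of supported root class names present in the page."""
    finder = RootClassFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

A crawler would run something like this on every page it fetches anyway,
which is why a separate index buys little for discovery alone.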

On the other hand, looking at it from a user-centered and search
engine point of view, having a sitemap is good practice anyway, and
if you're going to maintain one for the benefit of a search engine,
why not have a standardized "best practice" for marking one up? Such
an index could contain not only the links to all the pages on your
site, but also rel="nofollow" links for pages that you don't want
indexed, links to all the feeds on your site, and some kind of
metadata format which indicates whether a linked page contains
microformats. I suggest such data should not be relied upon, but
should instead inform a weighting mechanism such that pages indicated
as containing microformats are crawled first in the queue, allowing a
more responsive experience in any UA which implements this. Another
possible choice is to use such links to present a menu of options to
the user, to allow more discriminating selection of microformat content.
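
As a rough sketch of the weighting idea, assuming the UA has already
read the index into (url, hinted) pairs, where "hinted" means the index
claims the page contains microformats. The function name and the input
shape are hypothetical; the hint only changes the order, and every page
is still crawled.

```python
import heapq
from itertools import count


def crawl_order(links):
    """Return URLs with hinted pages first, keeping index order otherwise.

    links: iterable of (url, hinted) pairs (an assumed input shape).
    """
    tie_breaker = count()  # preserves the original order among equal priorities
    heap = []
    for url, hinted in links:
        priority = 0 if hinted else 1  # lower number is crawled sooner
        heapq.heappush(heap, (priority, next(tie_breaker), url))

    order = []
    while heap:
        _, _, url = heapq.heappop(heap)
        order.append(url)
    return order


pages = [("/about", False), ("/contact", True), ("/blog", False), ("/events", True)]
print(crawl_order(pages))
# prints ['/contact', '/events', '/about', '/blog']
```

Because unhinted pages stay in the queue, a stale index costs only some
responsiveness and never hides content.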

To this end, a good place to start would be to look at existing
sitemaps, including Google's sitemap XML markup, and the markup
used by various websites across the net which contain sitemaps.






On Mar 24, 2006, at 1:16 PM, Phil Haack wrote:

> People do read Microformat content directly which I understand.  It fits
> with the "Human First" principle.
>
> But references to the xmdp profiles are in the <head> element which is NOT
> human readable.  So there is precedent for non-human readable
> discoverability mechanism within Microformats.
>
> At Mix06, Tantek pointed out that listing all the xmdp profiles that a site
> used on a homepage could get unwieldy.
>
> I suppose if I wanted to help both people and an aggregator find various
> Microformats of interest, there could be a microformat for a site index.  My
> homepage could include it or simply link to it using some other microformat.
>
> Thus for the human, there would be a simple link to follow
> <a href="/siteindex/" rel="siteindex">Site Map</a>.  Likewise, my aggregator
> would look for this if it didn't find the xmdp profile for a sitemap on the
> current page.
>
> I think this might be useful so aggregators (and users) don't have to crawl
> an entire site.
>
> Has there been any work done in this area? Is it a bad idea?
>
>
> -----Original Message-----
> From: microformats-discuss-bounces at microformats.org
> [mailto:microformats-discuss-bounces at microformats.org] On Behalf Of
> Scott Reynen
> Sent: Friday, March 24, 2006 11:50 AM
> To: Microformats Discuss
> Subject: Re: [uf-discuss] Enumerating Microformats on a Page
>
> Because feed auto-discovery links are in the content, not the headers
> of HTTP responses, aggregators have to download the entire page, and
> most aggregators search first for <link type="alternate" ...> tags,
> and second for something like <a href="something.rss">RSS</a>.  The
> link tag makes more sense here because people don't read feeds
> directly, so it doesn't make a lot of sense to provide human-readable
> <a> links to feeds.  But people *do* read microformat content
> directly, so if it's related to the current page, it should be linked
> from the current page, and any human or machine looking site-wide for
> microformat content (or anything else) should follow links throughout
> the site.
>
> Peace,
> Scott
>
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss at microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss


