[uf-new] Tag clouds: [was Microformats support for pagination]

Kevin Marks kevinmarks at gmail.com
Thu Jan 15 01:03:28 PST 2009


On Thu, Jan 15, 2009 at 12:30 AM, Brian Suda <brian.suda at gmail.com> wrote:
>
> On 1/14/09, André Luís <andr3.pt at gmail.com> wrote:
> > Hmmm... can't we use emails? If two hCards have the same email, aren't
> >  they the same entity?
>
> --- yes, but you also have the situation where you have two different
> email addresses and it is the same entity.
>
> >  > I don't think that that's quite André's point. A lot of blogs have tag
> >  > clouds - long lists of perhaps a hundred tags, in various sized fonts which
> >  > act as jumping off points to other parts of the site. They are not tags in
> >  > the rel=tag sense of the word in that they do not describe the content of
> >  > the current page, but of the site as a whole. People should not be marking
> >  > them with rel=tag, but nonetheless some people do. And it means that
> >  > essentially every single page on their site has the same massive set of tags
> >  > - rel=tag becomes useless on the whole site.
> >
> >
> > Exactly. I agree that this is not the purpose of rel-tags but I only
> >  brought it up because out of a very small sample, quite a few examples
> >  popped out. The only way out of this mess that I can think of, is to
> >  create a microformat for tagclouds, like a root element with
> >  class="tagcloud" (the actual name could be based on the most used
> >  term) and that would give parsers the mechanism to either exclude all
> >  rel-tags inside .tagcloud or to grab the rel-tags inside of the
> >  .tagcloud and bail out...
>
> --- OK, now this makes more sense. Yes, there are several ways to get
> around this. One would be to ignore it in the results if it was part
> of a tag cloud. Also, if they are publishing hAtom, you could do the
> inverse and only look at the rel-tags inside an hEntry. Finally, you
> might be able to apply some sort of normalizing algorithm to the data
> set. If every page had the same 15 tags, plus X more, you could drop
> the 15 from every entry thus removing the influence of the tag cloud
> on each page.
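
A minimal sketch of that normalizing idea, assuming the rel-tags have
already been extracted into one set per page. The function name and the
input shape are hypothetical, and "appears on every page" is only one
possible definition of the shared set; a tag that genuinely applies to
every page would be dropped as well.

```python
def strip_sitewide_tags(tags_by_page):
    """Remove tags that occur on every page of a site.

    tags_by_page maps a page URL to the collection of tags found on it.
    Tags present on all pages are assumed to come from a site-wide tag
    cloud rather than from the page's own content.
    """
    # With fewer than two pages there is nothing to compare against.
    if len(tags_by_page) < 2:
        return {page: set(tags) for page, tags in tags_by_page.items()}

    # The shared set is the intersection of every page's tags.
    common = set.intersection(*(set(tags) for tags in tags_by_page.values()))

    # Subtract the shared set from each page's own tags.
    return {page: set(tags) - common for page, tags in tags_by_page.items()}


pages = {
    "/2009/01/a": {"apple", "kiwi", "pear", "css"},
    "/2009/01/b": {"apple", "kiwi", "pear", "html"},
}
cleaned = strip_sitewide_tags(pages)
```

Here "apple", "kiwi" and "pear" appear on both pages, so only "css" and
"html" survive as page-level tags.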

The latest HTML5 draft covers tag clouds:

http://www.whatwg.org/specs/web-apps/current-work/#tag-clouds

4.5.13.1 Tag clouds

This specification does not define any markup specifically for marking
up lists of keywords that apply to a group of pages (also known as tag
clouds). In general, authors are encouraged to either mark up such
lists using ul elements with explicit inline counts that are then
hidden and turned into a presentational effect using a style sheet, or
to use SVG.

Here, three tags are included in a short tag cloud:

<style>
@media screen, print, handheld, tv {
  /* should be ignored by non-visual browsers */
  .tag-cloud > li > span { display: none; }
  .tag-cloud > li { display: inline; }
  .tag-cloud-1 { font-size: 0.7em; }
  .tag-cloud-2 { font-size: 0.9em; }
  .tag-cloud-3 { font-size: 1.1em; }
  .tag-cloud-4 { font-size: 1.3em; }
  .tag-cloud-5 { font-size: 1.5em; }
}
</style>
...
<ul class="tag-cloud">
 <li class="tag-cloud-4"><a title="28 instances" href="/t/apple">apple</a> <span>(popular)</span>
 <li class="tag-cloud-2"><a title="6 instances" href="/t/kiwi">kiwi</a> <span>(rare)</span>
 <li class="tag-cloud-5"><a title="41 instances" href="/t/pear">pear</a> <span>(very popular)</span>
</ul>

The actual frequency of each tag is given using the title attribute. A
CSS style sheet is provided to convert the markup into a cloud of
differently-sized words, but for user agents that do not support CSS
or are not visual, the markup contains annotations like "(popular)" or
"(rare)" to categorise the various tags by frequency, thus enabling
all users to benefit from the information.
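
For anyone generating this markup, here is a rough sketch of producing
the list from tag counts. The draft does not say how counts map to the
five size classes, so the proportional bucketing, the label wording and
the "/t/" URL prefix below are all assumptions. Under this rule "kiwi"
lands in tag-cloud-1 rather than the tag-cloud-2 shown in the example.

```python
from html import escape
from urllib.parse import quote

# Hypothetical annotation text for each size class.
LABELS = {1: "very rare", 2: "rare", 3: "common", 4: "popular", 5: "very popular"}


def cloud_level(count, max_count, levels=5):
    """Scale a count to 1..levels relative to the largest count (assumed rule)."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return max(1, (count * levels + max_count - 1) // max_count)


def render_tag_cloud(counts):
    """Return a <ul class="tag-cloud"> for a mapping of tag name to count."""
    if not counts:
        return '<ul class="tag-cloud">\n</ul>'
    top = max(counts.values())
    items = []
    for tag in sorted(counts):  # alphabetical, as in the draft's example
        level = cloud_level(counts[tag], top)
        items.append(
            f' <li class="tag-cloud-{level}">'
            f'<a title="{counts[tag]} instances" href="/t/{quote(tag)}">'
            f"{escape(tag)}</a> <span>({LABELS[level]})</span></li>"
        )
    return '<ul class="tag-cloud">\n' + "\n".join(items) + "\n</ul>"


markup = render_tag_cloud({"apple": 28, "kiwi": 6, "pear": 41})
```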

The ul element is used (rather than ol) because the order is not
particularly important: while the list is in fact ordered
alphabetically, it would convey the same information if ordered by,
say, the length of the tag.

The tag rel-keyword is not used on these a elements because they do
not represent tags that apply to the page itself; they are just part
of an index listing the tags themselves.
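
On the consuming side, a parser could defend against pages that do put
rel=tag inside a cloud by skipping anything under the cloud container.
This is only a sketch using Python's standard html.parser; the class
names "tag-cloud" and "tagcloud" are guesses at what authors use, not
anything a spec defines, and RelTagExtractor is a made-up name.

```python
from html.parser import HTMLParser


class RelTagExtractor(HTMLParser):
    """Collect hrefs of rel=tag links, ignoring those inside a tag cloud."""

    def __init__(self, cloud_classes=("tag-cloud", "tagcloud")):
        super().__init__()
        self.cloud_classes = set(cloud_classes)  # assumed container class names
        self.cloud_element = None  # element name that opened the cloud, if any
        self.cloud_depth = 0       # nesting depth of that same element name
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.cloud_element is None:
            classes = set((attrs.get("class") or "").split())
            if classes & self.cloud_classes:
                # Entering a cloud: remember which element must close it.
                self.cloud_element = tag
                self.cloud_depth = 1
                return
        elif tag == self.cloud_element:
            # Only the container's own element name is counted, so
            # unclosed <li> elements inside the cloud do no harm.
            self.cloud_depth += 1
        if tag == "a" and self.cloud_element is None:
            rels = (attrs.get("rel") or "").lower().split()
            href = attrs.get("href")
            if "tag" in rels and href:
                self.tags.append(href)

    def handle_endtag(self, tag):
        if tag == self.cloud_element:
            self.cloud_depth -= 1
            if self.cloud_depth == 0:
                self.cloud_element = None


def extract_page_tags(html_text):
    parser = RelTagExtractor()
    parser.feed(html_text)
    return parser.tags


sample = (
    '<ul class="tag-cloud">'
    '<li><a rel="tag" href="/t/apple">apple</a>'
    '<li><a rel="tag" href="/t/kiwi">kiwi</a></ul>'
    '<div class="hentry"><a rel="tag" href="/t/pear">pear</a></div>'
)
page_tags = extract_page_tags(sample)
```

Only the link outside the cloud is kept. The inverse approach Brian
mentions, keeping only rel-tags inside an hEntry, would follow the same
pattern with the test flipped.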


