[uf-new] img alt content (was: hAudio implemented on Bitmunk (with one snag))

Manu Sporny msporny at digitalbazaar.com
Wed Jul 11 07:15:10 PDT 2007


Andy Mabbett wrote:
>> Images for example would have "Copyright CityLife Auckland. Suite at
>> our Auckland hotel accommodation"
> Unless that's the graphical content of the image, which seems
> unlikely, that's an abuse of the alt attribute; such text should be in
> the title attribute. It *does* violate the HTML specs. And how is it
> "pertinent to microformats"?

There seem to be two parts to this discussion:

1. HTML specification violation (alt attribute misuse)
2. How the alt attribute is being used in the real world

Andy's line of reasoning is sound regarding the HTML specification
violation. I don't think that anybody can dispute that putting text in
the alt attribute that does not match the graphical content of the
image goes against the HTML specification.

The second part is how the alt attribute is being used in the real
world. Tantek has asserted that 'alt' is being misused on a wide
scale on the Interwebs.

As Scott has pointed out, the only way to know this is to start
gathering real data. I am in the process of writing an image crawler
(which will hopefully be done by tonight) to gather these statistics.

The crawler will crawl the web for image tags and gather statistics
regarding:

- How many image tags have an 'alt' attribute specified.
- How many image tags have a 'title' attribute specified.
- How many image tags have both specified.
- Whether or not the 'alt' text matches the image being displayed (I'll
  set up a website for all of us to help in analyzing this data)
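
As a rough illustration of the first three counts, here is a minimal
sketch using only Python's standard library. It is not the crawler
itself: fetching pages is left out, the names ImgAttrCounter and
count_img_attrs are made up for this example, and an empty alt=""
is counted as "specified", which is an assumption we may want to
revisit. Judging whether the text matches the image still needs a human.

```python
from html.parser import HTMLParser


class ImgAttrCounter(HTMLParser):
    """Tally alt/title usage on <img> tags in one HTML document."""

    def __init__(self):
        super().__init__()
        self.stats = {"img": 0, "alt": 0, "title": 0, "both": 0}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us, and
        # routes self-closing <img ... /> tags through this method too.
        if tag != "img":
            return
        names = {name for name, _value in attrs}
        has_alt = "alt" in names      # assumption: alt="" still counts
        has_title = "title" in names
        self.stats["img"] += 1
        self.stats["alt"] += int(has_alt)
        self.stats["title"] += int(has_title)
        self.stats["both"] += int(has_alt and has_title)


def count_img_attrs(html):
    """Return the img/alt/title/both counts for one HTML string."""
    parser = ImgAttrCounter()
    parser.feed(html)
    parser.close()
    return parser.stats
```

Feeding each fetched page through count_img_attrs() and summing the
dictionaries would give the totals for the whole sample.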

I'm assuming 125,626,329,000 unique images on the web (125,626,329
unique sites on the web x 1,000 unique images per site).

Statistically, I think we would only need around 385 unique site samples
to get a 5% margin of error at a 95% confidence level (somebody correct
me if this is wrong). To be safe, I'll collect 100,000 unique image
tags, 1 per site, to get our initial sample set.
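
For anyone who wants to check the 385 figure, here is a small sketch of
the usual sample-size formula for estimating a proportion,
n = z^2 * p * (1 - p) / e^2. It assumes simple random sampling, the
worst-case proportion p = 0.5, and z = 1.96 for 95% confidence; the
function names are just for this example.

```python
import math


def sample_size(margin=0.05, z=1.96, p=0.5, population=None):
    """Samples needed to estimate a proportion within +/- margin."""
    n = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        # Finite population correction; negligible for very large populations.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error achieved by a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)


print(sample_size())                      # 5% margin, 95% confidence
print(sample_size(population=125_626_329))
print(round(margin_of_error(100_000), 4))
```

Under these assumptions the population size barely matters, and a
100,000-image sample would bring the margin of error down to roughly
0.3%, provided the sites really are picked at random.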

Any objections to this method of data collection?

-- manu


