[uf-new] img alt content statistics
Derrick Lyndon Pallas
derrick at pallas.us
Sat Jul 14 13:42:18 PDT 2007
Manu Sporny wrote:
> "59% of most websites are complying with the HTML 4.01 specification
> regarding usage of 'alt' with image tags."
>
> I used the terminology "most websites" because the data gathered is,
> statistically speaking, overkill. Assuming 125,626,329 websites (per
> Netcraft) we would need a sample set of 384 websites to get a 95%
> confidence level with an interval of 5%.
>
> So, we needed 384 samples - we got 224,671 across 14,077 websites.
That's assuming that any given page from a website is representative of
that website. What you really want are examples of <img/> usage on the
web; the number of samples you need is based on usages/page *
pages/unique site * unique sites/internet.
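For reference, the 384 figure in the quoted text matches the usual sample-size formula for estimating a proportion. The sketch below is an assumption about how it was derived, not something stated in the thread: z = 1.96 for 95% confidence, worst-case p = 0.5, a 5% margin, and a finite population correction. It also assumes a simple random sample of independent units, which is exactly what pages drawn from the same site are not.

```python
def sample_size(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for estimating a proportion from a simple random sample.

    z, margin and p are assumed values (95% confidence, +/-5%, worst case).
    """
    # Size needed for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction; negligible when population is huge.
    return n0 / (1 + (n0 - 1) / population)


# 125,626,329 is the Netcraft site count quoted above.
n = sample_size(125_626_329)
print(round(n))
```

With clustered data (many pages per site, many images per page) the effective sample size is smaller than the raw count of images, so the raw count overstates the precision.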
For what it's worth, I actually did start an analysis but haven't had
time to do much with the data. I took a random chunk of our archive,
looked for every <a/>, storing the content of the anchor so I could look
for lonely <img/>s with @alt text.
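A minimal sketch of that kind of scan, using only Python's standard html.parser. This is not the code used for the run; the class name, the counters, and the reading of "starts with an <img/>" (no non-whitespace text before the first image) are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class AnchorImageStats(HTMLParser):
    """Record, for every <a>, the attributes of the <img> elements inside it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.anchors = []     # one dict per finished <a>
        self._current = None  # the <a> being read, or None

    def _close_anchor(self):
        if self._current is not None:
            self.anchors.append(self._current)
            self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors cannot nest, so a new <a> ends any open one.
            self._close_anchor()
            self._current = {"imgs": [], "seen_text": False,
                             "starts_with_img": False}
        elif tag == "img" and self._current is not None:
            a = self._current
            if not a["imgs"] and not a["seen_text"]:
                a["starts_with_img"] = True
            a["imgs"].append(dict(attrs))

    def handle_endtag(self, tag):
        if tag == "a":
            self._close_anchor()

    def handle_data(self, data):
        if self._current is not None and data.strip():
            self._current["seen_text"] = True

    def close(self):
        super().close()
        self._close_anchor()  # flush an <a> left open at end of input


def summarize(html):
    parser = AnchorImageStats()
    parser.feed(html)
    parser.close()
    anchors = parser.anchors

    def count(pred):
        return sum(1 for a in anchors if pred(a))

    return {
        "anchors": len(anchors),
        "with_img": count(lambda a: a["imgs"]),
        "starts_with_img": count(lambda a: a["starts_with_img"]),
        "with_alt": count(lambda a: any("alt" in i for i in a["imgs"])),
        "with_nonempty_alt": count(
            lambda a: any((i.get("alt") or "").strip() for i in a["imgs"])),
    }


SAMPLE = """
<a href="/"><img src="logo.png" alt="Home"></a>
<a href="/x">text <img src="x.png" alt=""></a>
<a href="/y">plain</a>
<a href="/z"><img src="z.png"/></a>
"""
print(summarize(SAMPLE))
```

The same per-anchor records can be reused for @title by swapping the attribute name in the last two counters.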
The proof run found 1.4M <a/> on 14k pages. Of these anchors,
* 240k contain at least one <img/>
* 228k start with an <img/>
* 152k contain at least one <img/> with an @alt
* 121k contain at least one <img/> with a non-empty @alt
* 25k contain at least one <img/> with a @title
* 24k contain at least one <img/> with a non-empty @title
A total of 247k <img/> were found in anchors. Of these images,
* 151k contain an @alt
* 120k contain a non-empty @alt
* 25k contain a @title
* 23k contain a non-empty @title
* 11k have a garbage phrase (e.g. "click here", "use the right mouse
button to save", etc.) in @alt or @title
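One way such a "garbage phrase" check could look, as a rough sketch. Only the two phrases quoted above come from the post; the rest of the actual phrase list is not given, so GARBAGE_PHRASES and has_garbage_phrase are hypothetical names with a deliberately incomplete list.

```python
# Only these two phrases are taken from the post; the full list is unknown.
GARBAGE_PHRASES = (
    "click here",
    "use the right mouse button to save",
)


def has_garbage_phrase(attrs):
    """Return True if @alt or @title contains a known filler phrase.

    attrs is a dict of attribute name -> value (value may be None).
    """
    for name in ("alt", "title"):
        # Lower-case and collapse whitespace so spacing and case do not matter.
        text = " ".join((attrs.get(name) or "").lower().split())
        if any(phrase in text for phrase in GARBAGE_PHRASES):
            return True
    return False


print(has_garbage_phrase({"alt": "Click  HERE to enter"}))
print(has_garbage_phrase({"alt": "Company logo"}))
```

Substring matching is crude; it will flag any @alt that merely mentions one of the phrases.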
Of the 228k starting <img/>s,
* 142k contain an @alt
* 114k contain a non-empty @alt
* 24k contain a @title
* 22k contain a non-empty @title
* 11k have a garbage phrase in @alt or @title
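To put the image counts on the same footing as the quoted 59% figure, they can be turned into shares of the 247k images found in anchors. The counts below are the rounded thousands from the lists above, so the resulting percentages are approximate.

```python
# Counts as reported above, rounded to the nearest thousand.
IMAGES_IN_ANCHORS = 247_000
COUNTS = {
    "has @alt": 151_000,
    "non-empty @alt": 120_000,
    "has @title": 25_000,
    "non-empty @title": 23_000,
    "garbage phrase": 11_000,
}

# Share of all images found inside anchors.
shares = {label: n / IMAGES_IN_ANCHORS for label, n in COUNTS.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

On these numbers roughly 61% of images inside anchors carry an @alt at all, and a little under half carry a non-empty one.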
The non-proof run is looking at 50x as many pages. All of this was
gleaned from the services at <http://tinyurl.com/23czqt>
~ Derrick Pallas