From arash.amiri at researchstudio.at Wed Jul 4 05:18:38 2007
From: arash.amiri at researchstudio.at (Arash Amiri)
Date: Wed Jul 4 05:19:07 2007
Subject: [uf-new] micorformat for a shopping task
Message-ID: <468B901E.5080607@researchstudio.at>
Hi!
I was wondering if it makes sense to create a microformat for a
shopping task. This microformat would describe what you want to buy, by
when you want to have it (a deadline?), and maybe some more things...
The reason I mention this is to find some "portable description" of
things you need. For example, you walk past a supermarket and get
reminded that you need some bread. There is probably no format
encapsulating this "I need this by then..." thing.
Any comments? (Or is this just out of scope?)
From scott at makedatamakesense.com Wed Jul 4 07:52:36 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Wed Jul 4 07:52:41 2007
Subject: [uf-new] micorformat for a shopping task
In-Reply-To: <468B901E.5080607@researchstudio.at>
References: <468B901E.5080607@researchstudio.at>
Message-ID: <5C5B4EA3-B828-4FC5-8079-0A87C6E68DEC@makedatamakesense.com>
On Jul 4, 2007, at 6:18 AM, Arash Amiri wrote:
> I was wondering if it makes sense to create some microformat for a
> shopping task. This microformat describes what you want to buy,
> until when you want to have it (deadline?), and maybe some more
> things...
>
> The reason why I mention this is to find some "portable
> description" of things you need. For example, you walk past a
> supermarket and get reminded that you need some bread. There is
> probably no format encapsulating this "I need this until then..."-
> thing.
>
> any comments (or is that just out of scope of the idea?)
Have you tried applying hListing to this?
http://microformats.org/wiki/hlisting
--
Scott Reynen
MakeDataMakeSense.com
From msporny at digitalbazaar.com Sun Jul 8 13:39:33 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 8 13:39:37 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
Message-ID: <46914B85.4040903@digitalbazaar.com>
We've gone through and implemented hAudio on Bitmunk.com (one of our
service websites). David Lehn, one of our semantic web guys, has also
created an hAudio plug-in for Operator. Mike Kaply, author of Operator,
said that he will make it available via the Operator download section
within the next week or two. To view some hAudio compliant markup, you
can go to the following link:
http://www.bitmunk.com/view/media/6011098
There are over 850,000 songs that have been marked up on the website. We
are in the process of talking our partners, colleagues and competitors
into using hAudio to mark up their audio content as well. So, good
progress is being made in implementing hAudio.
However, we've hit a snag when it comes to usability with hAudio and
Operator/Firefox 3.
Problem Description:
Quite often, a site uses an image instead of a text link to
present actions. For example, instead of using the text "Download", it
will use a graphic image with a downward-facing arrow.
In other words, if we have this:
Download:
How do we present this option to a human being in a non-web-page UI?
How it relates to the Examples:
We (Bitmunk.com) have this problem with 'rel-sample', 'rel-enclosure',
and 'rel-payment'. Most of the examples also contain images instead of
text for samples, downloads and purchase links. This is a demonstrable,
widespread problem.
The problem with Operator and screen readers:
If there is no text to display, then how does one place the item into a
menu/display for Operator/Firefox? Grabbing the image and placing it in
a UI is a difficult argument to make - there are a variety of image
sizes that might not do well in the Operator UI (or Firefox 3 UI).
Proposed solution:
We have a fix for Operator that uses the link title text if there is no
internal text. This fixes the problem for the Operator menu display, the
Firefox 3 UI display, and screen readers. Here's how the site author
would change the markup above:
Download:
This approach is beneficial for the following reasons:
1. It POSH-ifies the website.
2. It works well with Operator, Firefox 3 and other uF parsers/UIs.
3. It fixes the accessibility/screen reader problem.
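The fallback described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Operator patch (Operator itself is a Firefox extension); the class name `LinkLabelParser`, the helper `link_labels`, the `title` value and the image file name are all assumptions made for the example.

```python
from html.parser import HTMLParser


class LinkLabelParser(HTMLParser):
    """Collects a (rel, label) pair for every <a rel="..."> element."""

    def __init__(self):
        super().__init__()
        self.labels = []      # list of (rel, label) tuples
        self._current = None  # state of the link currently being parsed

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "rel" in attrs:
            self._current = {
                "rel": attrs["rel"],
                "title": attrs.get("title") or "",
                "text": [],
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            text = "".join(self._current["text"]).strip()
            # Proposed rule: prefer the inner text, fall back to the title.
            label = text or self._current["title"].strip()
            self.labels.append((self._current["rel"], label))
            self._current = None


def link_labels(html):
    """Return the (rel, label) pairs found in an HTML fragment."""
    parser = LinkLabelParser()
    parser.feed(html)
    parser.close()
    return parser.labels


# Hypothetical markup: an image-only download link with a title attribute.
markup = ('<a rel="enclosure" title="Download MySong" '
          'href="http://my.site.com/download/MySong.mp3">'
          '<img src="download.png"></a>')
print(link_labels(markup))  # [('enclosure', 'Download MySong')]
```

A link with no inner text and no title still yields an empty label, so a UI would need some further default for that case.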
We need feedback/consensus from the uF community before submitting the
patch for inclusion into Operator/Firefox 3. Is there anybody that
disagrees with this approach or has a better approach?
-- manu
From andy at pigsonthewing.org.uk Sun Jul 8 14:10:23 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Sun Jul 8 14:10:30 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To: <46914B85.4040903@digitalbazaar.com>
References: <46914B85.4040903@digitalbazaar.com>
Message-ID:
In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny
writes
>Problem Description:
>
>It is quite often that a site uses an image instead of a text link to
>present actions. For example: Instead of using the text "Download",
>they will use a graphic image with a downward-facing arrow.
>
>In other words, if we have this:
>
>Download:
>
>
>
>
>How do we present this option to a human being in a non-web-page UI?
The HTML is invalid, lacking the alt attribute which should fix this
problem.
--
Andy Mabbett
From msporny at digitalbazaar.com Sun Jul 8 14:30:12 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 8 14:30:15 2007
Subject: [uf-new] Mapping the hAudio Microformat to hAudio RDFa
Message-ID: <46915764.1070900@digitalbazaar.com>
What is the process for mapping the hAudio Microformat to RDFa? We
need to do this because the hAlbum/hVideo specs that
we've been researching internally have some nasty long-term design
issues due to the no-namespace/scope-less approach that uFs have adopted.
We'd like to make hAudio a standard across "semantic languages". We
don't want to go through the same arduous process that everybody had to
go through on here concerning hAudio with another community... it would
be a waste of everybody's time. The hard work (research, examples and
analysis) was accomplished via the Microformats process.
Implementing hAudio using the Microformats approach isn't going to work
for complicated/nested audio/video/image structures. We have a very
large database that we would like to semantic-ify and we would like to
do it in a standards-compliant way. How can we propose an RDFa standard
for hAudio that is scrutinized and adopted by this community?
In other words - Microformats did a good job with the design. We'd like
to give people the option to implement using either hAudio uF or hAudio
RDFa.
-- manu
From microformats at kaply.com Mon Jul 9 05:37:49 2007
From: microformats at kaply.com (Mike Kaply)
Date: Mon Jul 9 05:37:53 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To:
References: <46914B85.4040903@digitalbazaar.com>
Message-ID:
Actually, the alt attribute WON'T fix this problem, because the
microformat attribute is on the anchor tag, not the image.
Microformat parsers grab the text inside the tag. They only grab the image alt
text if the microformat class is on the image itself. Here's a
different example:
I realize this is a little contrived, but you get the idea.
In this case, the fn is empty.
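The rule described above can be sketched in Python roughly like this. It is a simplified illustration of the behaviour as stated in this message, not Operator's code and not the full hCard parsing rules; `PropertyParser`, `extract_property` and the sample markup are assumptions.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PropertyParser(HTMLParser):
    """Finds the first element carrying a given class and extracts its value."""

    def __init__(self, prop):
        super().__init__()
        self.prop = prop
        self.value = None  # None until the property has been found
        self._depth = 0    # > 0 while inside the element carrying the class
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth:
            if tag not in VOID_TAGS:
                self._depth += 1
            return
        if self.value is not None:
            return
        if self.prop in (attrs.get("class") or "").split():
            if tag == "img":
                # The class is on the image itself: use its alt text.
                self.value = attrs.get("alt") or ""
            elif tag not in VOID_TAGS:
                self._depth = 1

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Only text nodes count; alt text of nested images is ignored.
            self.value = "".join(self._text).strip()


def extract_property(html, prop):
    parser = PropertyParser(prop)
    parser.feed(html)
    parser.close()
    return parser.value


# Class on a wrapper around the image: the value comes out empty.
print(repr(extract_property(
    '<span class="fn"><img src="foo.jpg" alt="Mike Kaply"></span>', "fn")))
# Class on the image itself: the alt text is used.
print(repr(extract_property(
    '<img class="fn" src="foo.jpg" alt="Mike Kaply">', "fn")))
```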
Mike Kaply
On 7/8/07, Andy Mabbett wrote:
> In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny
> writes
>
> >Problem Description:
> >
> >It is quite often that a site uses an image instead of a text link to
> >present actions. For example: Instead of using the text "Download",
> >they will use a graphic image with a downward-facing arrow.
> >
> >In other words, if we have this:
> >
> >Download:
> >
> >
> >
> >
> >How do we present this option to a human being in a non-web-page UI?
>
> The HTML is invalid, lacking the alt attribute which should fix this
> problem.
>
> --
> Andy Mabbett
> _______________________________________________
> microformats-new mailing list
> microformats-new@microformats.org
> http://microformats.org/mailman/listinfo/microformats-new
>
From andy at pigsonthewing.org.uk Mon Jul 9 07:14:12 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Mon Jul 9 07:14:16 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To:
References: <46914B85.4040903@digitalbazaar.com>
Message-ID: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
On Mon, July 9, 2007 13:37, Mike Kaply wrote:
> On 7/8/07, Andy Mabbett wrote:
>
>> In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny
>> writes
>>> if we have this:
>>> Download:
>>>
>>>
>>>
>>> How do we present this option to a human being in a non-web-page UI?
>> The HTML is invalid, lacking the alt attribute which should fix this
>> problem.
> Actually the alt attribute WON'T fix this problem. Because the
> microformat attribute is on the anchor tag, not the image. Microformats
> grab the text in the tag. They only grab the image alt text if the
> microformat class is on the image itself. Here's a different example:
>
>
alt="Mike Kaply">
>
> I realize this is a little contrived, but you get the idea.
> In this case, the fn is empty.
My argument is that the fn should /not/ be empty; the "alt" attribute
contains the text equivalent of the image. To discount it as you suggest
is to ignore the semantics of the mark-up presented to you.
--
Andy Mabbett
** via webmail **
From scott at makedatamakesense.com Mon Jul 9 07:37:49 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Mon Jul 9 07:38:02 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID:
On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote:
>> They only grab the image alt text if the
>> microformat class is on the image itself. Here's a different example:
>>
>>
> alt="Mike Kaply">
>>
>> I realize this is a little contrived, but you get the idea.
>
>> In this case, the fn is empty.
>
> My argument is that the fn should /not/ be empty; the "alt" attribute
> contains the text equivalent of the image.
I agree this matches the semantics of the alt attribute; however, I
suspect few publishers are currently using this attribute
appropriately, so I think we should do more research into the likely
ramifications of such a change before making it.
--
Scott Reynen
MakeDataMakeSense.com
From derrick at pallas.us Mon Jul 9 08:42:58 2007
From: derrick at pallas.us (Derrick Lyndon Pallas)
Date: Mon Jul 9 08:42:59 2007
Subject: [uf-new] img alt content
In-Reply-To:
References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID: <46925782.3010006@pallas.us>
Actually, I can probably be of help here, having written the Alexa Image
Search indexer. While I can't divulge too much about what goes into
building the index, I'll see if I can find some time to take a look at
the usage of img/@alt inside hcard/fn some time this week. Is there
anything specific anyone would like me to look for? ~D
Scott Reynen wrote:
> On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote:
>
>>> They only grab the image alt text if the
>>> microformat class is on the image itself. Here's a different example:
>>>
>>>
>> alt="Mike Kaply">
>>>
>>> I realize this is a little contrived, but you get the idea.
>>
>>> In this case, the fn is empty.
>>
>> My argument is that the fn should /not/ be empty; the "alt" attribute
>> contains the text equivalent of the image.
>
> I agree this matches the semantics of the alt attribute; however, I
> suspect few publishers are currently using this attribute
> appropriately, so I think we should do more research into the likely
> ramifications of such a change before making it.
>
> --
> Scott Reynen
> MakeDataMakeSense.com
From bewest at gmail.com Mon Jul 9 12:34:22 2007
From: bewest at gmail.com (Benjamin West)
Date: Mon Jul 9 12:34:24 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To: <46914B85.4040903@digitalbazaar.com>
References: <46914B85.4040903@digitalbazaar.com>
Message-ID: <8ad71be30707091234r17640ce3jcf5bd36d3abe6fb9@mail.gmail.com>
One possible solution is to use an image replacement technique. Also,
you may choose to send the content in an element and then hide it when it's
unnecessary.
-Ben
On 7/8/07, Manu Sporny wrote:
> We've gone through and implemented hAudio on Bitmunk.com (one of our
> service websites). David Lehn, one of our semantic web guys, has also
> created an hAudio plug-in for Operator. Mike Kaply, author of Operator,
> said that he will make it available via the Operator download section
> within the next week or two. To view some hAudio compliant markup, you
> can go to the following link:
>
> http://www.bitmunk.com/view/media/6011098
>
> There are over 850,000 songs that have been marked up on the website. We
> are in the process of talking our partners, colleagues and competitors
> into using hAudio to mark up their audio content as well. So, good
> progress is being made in implementing hAudio.
>
> However, we've hit a snag when it comes to usability with hAudio and
> Operator/Firefox 3.
>
> Problem Description:
>
> It is quite often that a site uses an image instead of a text link to
> present actions. For example: Instead of using the text "Download", they
> will use a graphic image with a downward-facing arrow.
>
> In other words, if we have this:
>
> Download:
>
>
>
>
> How do we present this option to a human being in a non-web-page UI?
>
> How it relates to the Examples:
>
> We (Bitmunk.com) have this problem with 'rel-sample', 'rel-enclosure',
> and 'rel-payment'. Most of the examples also contain images instead of
> text for samples, downloads and purchase links. This is a demonstrable,
> widespread problem.
>
> The problem with Operator and screen readers:
>
> If there is no text to display, then how does one place the item into a
> menu/display for Operator/Firefox? Grabbing the image and placing it in
> a UI is a difficult argument to make - there are a variety of image
> sizes that might not do well in the Operator UI (or Firefox 3 UI).
>
> Proposed solution:
>
> We have a fix for Operator that uses the link title text if there is no
> internal text. This fixes the problem for both Operator menu display,
> Firefox 3 UI display and for screen readers. Here's how the site author
> would change the text above:
>
> Download:
> href="http://my.site.com/download/MySong.mp3">
>
>
>
> This approach is beneficial for the following reasons:
>
> 1. It POSH-ifies the website.
> 2. It works well with Operator, Firefox 3 and other uF parsers/UIs.
> 3. It fixes the accessibility/screen reader problem.
>
> We need feedback/consensus from the uF community before submitting the
> patch for inclusion into Operator/Firefox 3. Is there anybody that
> disagrees with this approach or has a better approach?
>
> -- manu
From tantek at cs.stanford.edu Mon Jul 9 12:36:54 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 12:37:03 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk
(with one snag))
In-Reply-To:
Message-ID:
On 7/9/07 7:37 AM, "Scott Reynen" wrote:
> On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote:
>
>>> They only grab the image alt text if the
>>> microformat class is on the image itself. Here's a different example:
>>>
>>>
>> alt="Mike Kaply">
>>>
>>> I realize this is a little contrived, but you get the idea.
>>
>>> In this case, the fn is empty.
>>
>> My argument is that the fn should /not/ be empty; the "alt" attribute
>> contains the text equivalent of the image.
>
> I agree this matches the semantics of the alt attribute; however, I
> suspect few publishers are currently using this attribute
> appropriately, so I think we should do more research into the likely
> ramifications of such a change before making it.
This was deliberately rejected at the creation of hCard to give publishers
more control.
All too often there is "garbage" (or just extra unwanted text) in alt
attributes for a variety of publisher reasons.
Thus only if the publisher explicitly *wants* the text from the alt
attribute do they add the respective class value to get it.
I've added this to the hCard FAQ as well:
http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
Tantek
From microformats at kaply.com Mon Jul 9 13:25:16 2007
From: microformats at kaply.com (Mike Kaply)
Date: Mon Jul 9 13:25:21 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID:
OK, let's try a different example:
Michael
Kaply
Mike Kaply
On 7/9/07, Andy Mabbett wrote:
> On Mon, July 9, 2007 13:37, Mike Kaply wrote:
> > On 7/8/07, Andy Mabbett wrote:
> >
> >> In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny
> >> writes
>
>
> >>> if we have this:
>
> >>> Download:
> >>>
> >>>
> >>>
>
> >>> How do we present this option to a human being in a non-web-page UI?
>
> >> The HTML is invalid, lacking the alt attribute which should fix this
> >> problem.
>
> > Actually the alt attribute WON'T fix this problem. Because the
> > microformat attribute is on the anchor tag, not the image. Microformats
> > grab the text in the tag. They only grab the image alt text if the
> > microformat class is on the image itself. Here's a different example:
> >
> >
> alt="Mike Kaply">
> >
> > I realize this is a little contrived, but you get the idea.
>
> > In this case, the fn is empty.
>
> My argument is that the fn should /not/ be empty; the "alt" attribute
> contains the text equivalent of the image. To discount it as you suggest
> is to ignore the semantics of the mark-up presented to you.
>
> --
> Andy Mabbett
> ** via webmail **
From andy at pigsonthewing.org.uk Mon Jul 9 13:51:37 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Mon Jul 9 13:53:18 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To:
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID:
In message
, Mike
Kaply writes
>OK, let's try a different example:
>
>Michael
src="foo.jpg" alt="Aaron"> Kaply
Under what circumstances would "Aaron" be appropriate alt text? What
would the picture show?
--
Andy Mabbett
From chris at placenamehere.com Mon Jul 9 16:41:58 2007
From: chris at placenamehere.com (Chris Casciano)
Date: Mon Jul 9 16:42:15 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
In-Reply-To:
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID:
On Jul 9, 2007, at 4:51 PM, Andy Mabbett wrote:
> In message
> , Mike
> Kaply writes
>
>> OK, let's try a different example:
>>
>> Michael
> src="foo.jpg" alt="Aaron"> Kaply
>
> Under what circumstances would "Aaron" be appropriate alt text?
> What would the picture show?
how about some stylized first letter if styling a whole name doesn't
float your boat
ichael Aaron Kaply
[i hope this doesn't start turning into a discussion of image
replacement methods]
Another case I've run across has been one of listing of vendors or
associates, some with logos some without...
Company B
So the question becomes are the above two items functionally
equivalent or are they not?
And if they are functionally different does that mean that my CMS or
authoring tool or other templating logic need to be smart enough to
move the classes around to different elements depending on the data
provided for the entry?
--
[ Chris Casciano ]
[ chris@placenamehere.com ] [ http://placenamehere.com ]
From msporny at digitalbazaar.com Mon Jul 9 19:06:46 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 9 19:06:49 2007
Subject: [uf-new] img alt content
In-Reply-To: <46925782.3010006@pallas.us>
References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
<46925782.3010006@pallas.us>
Message-ID: <4692E9B6.9020905@digitalbazaar.com>
Derrick Lyndon Pallas wrote:
> Actually, I can probably be of help here, having written the Alexa Image
> Search indexer. While I can't divulge too much about what goes into
> building the index, I'll see if I can find some time to take a look at
> the usage of img/@alt inside hcard/fn some time this week. Is there
> anything specific anyone would like me to look for? ~D
Derrick, thanks for offering to help. It would be a great help if you
could give us the stats on the following:
How often is ONLY an image used as the target of a link in
hAtom/hReview/hCard/hCalendar? In other words, how often does this happen:
How often is 'alt' defined for those images?
How often is 'title' defined for those images?
Does the alt/title usually match what the image is depicting?
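As a rough illustration of how the first three questions could be counted over a page, here is a small Python sketch. It is an assumption-laden simplification: it looks at every link on the page rather than only links inside hAtom/hReview/hCard/hCalendar, the names `ImageOnlyLinkCounter` and `count_image_only_links` are made up for this example, and the last question (whether the text matches the image) still needs a human to judge.

```python
from html.parser import HTMLParser


class ImageOnlyLinkCounter(HTMLParser):
    """Counts links whose only content is one or more images."""

    def __init__(self):
        super().__init__()
        self.stats = {"image_only_links": 0, "with_alt": 0, "with_title": 0}
        self._in_link = False
        self._images = []       # attribute dicts of <img> inside the current link
        self._has_text = False  # True once non-whitespace text is seen in the link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._images = []
            self._has_text = False
        elif tag == "img" and self._in_link:
            self._images.append(dict(attrs))

    def handle_data(self, data):
        if self._in_link and data.strip():
            self._has_text = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            if self._images and not self._has_text:
                self.stats["image_only_links"] += 1
                # Empty alt/title values are treated as "not defined" here.
                if any(img.get("alt") for img in self._images):
                    self.stats["with_alt"] += 1
                if any(img.get("title") for img in self._images):
                    self.stats["with_title"] += 1
            self._in_link = False


def count_image_only_links(html):
    counter = ImageOnlyLinkCounter()
    counter.feed(html)
    counter.close()
    return counter.stats


# Hypothetical page fragment with three links.
page = ('<a href="a"><img src="1.png" alt="Download"></a>'
        '<a href="b"><img src="2.png"></a>'
        '<a href="c">Text <img src="3.png" alt="x"></a>')
print(count_image_only_links(page))
# {'image_only_links': 2, 'with_alt': 1, 'with_title': 0}
```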
-- manu
From msporny at digitalbazaar.com Mon Jul 9 19:30:38 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 9 19:30:41 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
Message-ID: <4692EF4E.60709@digitalbazaar.com>
Andy Mabbett wrote:
> My argument is that the fn should /not/ be empty; the "alt" attribute
> contains the text equivalent of the image. To discount it as you suggest
> is to ignore the semantics of the mark-up presented to you.
I believe Andy and Scott are referring to Section 13.2 of the HTML 4.01
specification:
http://www.w3.org/TR/html4/struct/objects.html#h-13.2
alt %Text; #REQUIRED -- short description --
and Section 13.8 of the HTML 4.01 specification:
http://www.w3.org/TR/html4/struct/objects.html#h-13.8
Specifically, the sections that state the following concerning alternate
text for images:
* Do not specify irrelevant alternate text when including images
intended to format a page, for instance, alt="red ball" would be
inappropriate for an image that adds a red ball for decorating a heading
or paragraph. In such cases, the alternate text should be the empty
string (""). Authors are in any case advised to avoid using images to
format pages; style sheets should be used instead.
* Do not specify meaningless alternate text (e.g., "dummy text").
Not only will this frustrate users, it will slow down user agents that
must convert text to speech or braille output.
I think Andy and Scott have the correct approach to this problem. All
one must do is view the following in a text-based browser, such as
Lynx... or in Firefox/Opera/etc and the answer becomes much clearer:
Test of link with image with alt text
Here's a link with an image with alt text:
It!
The text that is displayed as a link in Lynx is : "Microformat It!"
The text that is displayed as a link in Firefox is: "Microformat It!"
Mike, would it be possible to write a parseTagTextFromImages() function
that would extract the 'alt' text from images? Therefore, running it
over the following HTML:
Kaply
Would yield the text "Michael Kaply" for 'fn'. Using this approach would
also solve the hAudio problem as well as the problems that have been
raised thus far in this thread.
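A function along those lines could look roughly like the Python sketch below. It only illustrates the proposal (text nodes and image alt text concatenated in document order); it is not existing Operator code, the snake_case name is a Python rendering of parseTagTextFromImages(), and the sample markup is hypothetical.

```python
from html.parser import HTMLParser


class _AltTextCollector(HTMLParser):
    """Collects text nodes and <img> alt text in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                # Treat the alt text as if it were inline text.
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def parse_tag_text_from_images(fragment):
    """Return the fragment's text with image alt text substituted in."""
    collector = _AltTextCollector()
    collector.feed(fragment)
    collector.close()
    # Collapse runs of whitespace so the result reads as one phrase.
    return " ".join("".join(collector.parts).split())


# Hypothetical hCard fragment: first name as an image, last name as text.
fragment = '<span class="fn"><img src="foo.jpg" alt="Michael"> Kaply</span>'
print(parse_tag_text_from_images(fragment))  # Michael Kaply
```

The same function applied to an image-only link returns the image's alt text, which is how it would address the hAudio case.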
-- manu
From tantek at cs.stanford.edu Mon Jul 9 19:40:52 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 19:40:52 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk
(with one snag))
In-Reply-To: <4692EF4E.60709@digitalbazaar.com>
Message-ID:
On 7/9/07 7:30 PM, "Manu Sporny" wrote:
> Mike, would it be possible to write a parseTagTextFromImages() function
> that would extract the 'alt' text from images? Therefore, running it
> over the following HTML:
>
>
>
Kaply
>
>
> Would yield the text "Michael Kaply" for 'fn'. Using this approach would
> also solve the hAudio problem as well as the problems that have been
> raised thus far in this thread.
This would be non-compliant with hCard parsing and thus should be AVOIDED.
http://microformats.org/wiki/hcard-parsing
See the recent FAQ for more details.
http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
Thanks,
Tantek
From msporny at digitalbazaar.com Mon Jul 9 19:51:10 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 9 19:51:15 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To:
References:
Message-ID: <4692F41E.2010503@digitalbazaar.com>
Tantek Çelik wrote:
> This was deliberately rejected at the creation of hCard to give publishers
> more control.
>
> All too often there is "garbage" (or just extra unwanted text) in alt
> attributes for a variety of publisher reasons.
Doesn't doing this go against the HTML 4.01 specification? You aren't
supposed to have anything in the 'alt' attribute of the image tag that
isn't pertinent:
http://www.w3.org/TR/html4/struct/objects.html#h-13.8
> I've added this to the hCard FAQ as well:
>
> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
The above link states:
"In addition all too often there is "garbage" (or just extra unwanted
text) in alt attributes for a variety of publisher reasons, and that
extraneous text would pollute otherwise clean property values in
numerous existing sites."
I couldn't find a reference to the analysis that led to this
conclusion. What constitutes "garbage"? What reasons would a publisher
have to do this? If they're doing this, aren't they quite blatantly
violating the HTML 4.01 and XHTML 1.0 specifications?
The link stated above also says:
"Finally, it is simpler and more predictable for publishers if they know
that for images and other such URL related elements (a, object, etc.)
that whether they are specifying a URL property (like "email", "photo",
"url", etc.) or a text property (like "fn", "nickname", etc.) in either
case directly specifying the property on the element is the way to do it."
If we were to adopt this approach, I don't see how we could ever get the
following chunk of HTML working for hAudio:
Sample, as defined by hAudio is:
rel-sample. optional. sample file/stream using rel-design-pattern with
'sample' as the mf-rel-value.
Rel-patterns are only available on links... thus the "move the URL
property such that it is specified directly" approach doesn't work for
any Microformat that uses the rel-design-pattern.
-- manu
From tantek at cs.stanford.edu Mon Jul 9 20:11:11 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 20:11:11 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk
(with one snag))
In-Reply-To: <4692F41E.2010503@digitalbazaar.com>
Message-ID:
On 7/9/07 7:51 PM, "Manu Sporny" wrote:
> Tantek Çelik wrote:
>> This was deliberately rejected at the creation of hCard to give publishers
>> more control.
>>
>> All too often there is "garbage" (or just extra unwanted text) in alt
>> attributes for a variety of publisher reasons.
>
> Doesn't doing this go against the HTML 4.01 specification? You aren't
> supposed to have anything in the 'alt' attribute of the image tag that
> isn't pertinent:
>
> http://www.w3.org/TR/html4/struct/objects.html#h-13.8
Yes, many publishers go against many aspects of the HTML 4.01 specification,
not least by publishing invalid content.
>> I've added this to the hCard FAQ as well:
>>
>> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
>
> The above link states:
>
> "In addition all too often there is "garbage" (or just extra unwanted
> text) in alt attributes for a variety of publisher reasons, and that
> extraneous text would pollute otherwise clean property values in
> numerous existing sites."
>
> I couldn't find a reference to the analysis that led to this
> conclusion?
Unfortunately, we didn't capture it at the time; we're being more
thorough now. We did actually try it the other way first (including all
nested "alternative" content) and found it worked worse across a variety of
existing real-world sites: not just 1-2 examples, but LOTS.
> What constitutes "garbage"?
In this case things like duplicated text, text for chrome/UI etc.
> What reasons would a publisher
> have to do this? If they're doing this, aren't they quite blatantly
> violating the HTML 4.01 and XHTML 1.0 specification?
Not necessarily.
> The link stated above also says:
>
> "Finally, it is simpler and more predictable for publishers if they know
> that for images and other such URL related elements (a, object, etc.)
> that whether they are specifying a URL property (like "email", "photo",
> "url", etc.) or a text property (like "fn", "nickname", etc.) in either
> case directly specifying the property on the element is the way to do it."
>
> If we were to adopt this approach, I don't see how we could ever get the
> following chunk of HTML working for hAudio:
>
>
>
>
>
> Sample, as defined by hAudio is:
>
> rel-sample. optional. sample file/stream using rel-design-pattern with
> 'sample' as the mf-rel-value.
>
> Rel-patterns are only available on links... thus the "move the URL
> property such that it is specified directly" approach doesn't work for
> any Microformat that uses the rel-design-pattern.
rel does not apply to <img>;
therefore this is not a problem.
Tantek
From scott at makedatamakesense.com Mon Jul 9 20:26:39 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Mon Jul 9 20:26:51 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To: <4692EF4E.60709@digitalbazaar.com>
References: <46914B85.4040903@digitalbazaar.com>
<18788.80.86.36.97.1183990452.squirrel@www.gradwell.com>
<4692EF4E.60709@digitalbazaar.com>
Message-ID:
On Jul 9, 2007, at 8:30 PM, Manu Sporny wrote:
> I think Andy and Scott have the correct approach to this problem. All
> one must do is view the following in a text-based browser, such as
> Lynx... or in Firefox/Opera/etc and the answer becomes much clearer:
Um, that's not really my approach to this problem at all. I
suggested more research was required before making any changes, not
more hypothetical markup to support a predetermined conclusion. And
I suggested more research because I suspect that "red ball" section
was included in the HTML spec specifically as a result of many
publishers using such alt values, which aren't really content. I
prefer to follow the semantics defined in the spec, but I do not
think we should do that with complete disregard to how people
actually use HTML.
--
Scott Reynen
MakeDataMakeSense.com
From andy at pigsonthewing.org.uk Tue Jul 10 00:10:43 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Tue Jul 10 00:11:58 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To:
References: <4692EF4E.60709@digitalbazaar.com>
Message-ID:
In message , Tantek Çelik
writes
>On 7/9/07 7:30 PM, "Manu Sporny" wrote:
>
>> Mike, would it be possible to write a parseTagTextFromImages() function
>> that would extract the 'alt' text from images? Therefore, running it
>> over the following HTML:
>>
>>
>>
Kaply
>>
>>
>> Would yield the text "Michael Kaply" for 'fn'. Using this approach would
>> also solve the hAudio problem as well as the problems that have been
>> raised thus far in this thread.
>
>This would be non-compliant with hCard parsing and thus should be
>AVOIDED.
>
> http://microformats.org/wiki/hcard-parsing
In other words, the microformat parsing rules are non-compliant with the
HTML specification.
I think that's something which should be fixed.
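For reference, the parseTagTextFromImages() behaviour proposed in the quoted text can be sketched in Python as follows. This is only an illustration under assumptions: the sample markup (a span with class "fn" wrapping an img whose alt is "Michael", followed by the text "Kaply") is a guess at the example the archive scrubbed, the name parse_tag_text_from_images is hypothetical, and this is exactly the alt-inclusive behaviour that the hCard parsing rules cited above reject.

```python
from html.parser import HTMLParser


class AltAwareText(HTMLParser):
    """Collect text nodes and img alt values in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> via handle_startendtag.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        self.parts.append(data)


def parse_tag_text_from_images(html):
    """Return the text of an HTML fragment with img alt text included.

    Hypothetical helper: pieces are joined with a space and whitespace is
    collapsed, so adjacent inline fragments always end up space-separated.
    """
    parser = AltAwareText()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


# Assumed markup; the original example was scrubbed from the archive.
sample = '<span class="fn"><img src="mike.png" alt="Michael" /> Kaply</span>'
print(parse_tag_text_from_images(sample))
```

With alt text such as the keyword-laden values discussed elsewhere in this thread, the same function would fold that text into 'fn' as well, which is the trade-off under debate.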
--
Andy Mabbett
From andy at pigsonthewing.org.uk Tue Jul 10 13:16:09 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Tue Jul 10 13:17:44 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one
snag))
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com>
Message-ID:
In message , Tantek Çelik
writes
>On 7/9/07 7:51 PM, "Manu Sporny" wrote:
>
>> Tantek Çelik wrote:
>>> This was deliberately rejected at the creation of hCard to give publishers
>>> more control.
>>>
>>> All too often there is "garbage" (or just extra unwanted text) in alt
>>> attributes for a variety of publisher reasons.
>>
>> Doesn't doing this go against the HTML 4.01 specification? You aren't
>> supposed to have anything in the 'alt' attribute of the image tag that
>> isn't pertinent:
>>
>> http://www.w3.org/TR/html4/struct/objects.html#h-13.8
>
>Many publishers go against many aspects of the HTML 4.01 specification,
>yes, not least by publishing invalid content.
Is the best way to encourage "POSH" to adhere to standards, or to pander
to those who break them?
>>> I've added this to the hCard FAQ as well:
>>>
>>> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
>>
>> The above link states:
>>
>> "In addition all too often there is "garbage" (or just extra unwanted
>> text) in alt attributes for a variety of publisher reasons, and that
>> extraneous text would pollute otherwise clean property values in
>> numerous existing sites."
>>
>> I couldn't find a reference to the analysis that led to this
>> conclusion?
>
>We didn't capture it at the time unfortunately, and we're being more
>thorough now. We did actually try it the other way first (including
>all nested "alternative" content) and found it worked worse across a
>variety of existing real world sites, not just 1-2 examples but LOTS.
It is indeed unfortunate that such evidence hasn't been captured;
especially given your strong advocacy of evidence-based working and a
"scientific" process. Someone cynical might think it hypocritical of you
to then assert something without providing evidence for it. Perhaps it
would be a good idea if you could provide at least a minimum amount of
such evidence; preferably with URLs; per:
Use real world examples
People often invent completely fictitious (and
theoretical) examples in order to try to make a point
they are trying to make. Microformats themselves are
based on studying real world examples and designing for
real world examples. Thus arguments based on theoretical
examples hold much less weight in microformats
discussions and are apt to be ignored. Please avoid
posting arguments / questions based solely on
theoretical examples.
Ask for real world examples
If someone discusses or provides arguments based on
theoretical examples, ask them to provide a real world
example and point them to the above guideline.
Use URLs to examples
Please provide URLs to real world examples when
possible. This helps to validate that such examples
truly are "real world" as they are on the public Web,
and provides additional context around the example which
might be crucial to understanding it or answering
questions about it.
Ask for URLs to examples
When people do not provide a specific URL to a test case
or example, then especially as a developer, PLEASE ask
them to provide a specific URL (and cite the previous
guideline) rather than attempting to work out how an
inline snippet of code might work.
(which I believe you wrote) to forestall such criticism?
>> What constitutes "garbage"?
>
>In this case things like duplicated text, text for chrome/UI etc.
>
>> What reasons would a publisher
>> have to do this? If they're doing this, aren't they quite blatantly
>> violating the HTML 4.01 and XHTML 1.0 specification?
>
>Not necessarily.
Can you give a real world example of someone publishing such "garbage"
alt text, pertinent to microformats (and again with URLs as above),
which does not violate the HTML specs, please?
--
Andy Mabbett
From paul_wilkins at xtra.co.nz Tue Jul 10 14:52:26 2007
From: paul_wilkins at xtra.co.nz (Paul Wilkins)
Date: Tue Jul 10 14:52:35 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
onesnag))
References: <4692F41E.2010503@digitalbazaar.com>
Message-ID: <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
From: "Andy Mabbett"
>>> What reasons would a publisher
>>> have to do this?
[garbage in alt attributes]
>>> If they're doing this, aren't they quite blatantly
>>> violating the HTML 4.01 and XHTML 1.0 specification?
>>
>>Not necessarily.
>
> Can you give a real world example of someone publishing such "garbage"
> alt text, pertinent to microformats (and again with URLs as above),
> which does not violate the HTML specs, please?
I can.
Our website uses feature pages for our clients to help improve their
visibility to the general public through search engines. One of the ways of
doing this is to load the page with specific keywords and phrases for our
clients.
Images for example would have "Copyright CityLife Auckland. Suite at our
Auckland hotel accommodation"
A google search for "auckland hotel accommodation" results in their feature
page being the third result.
http://www.google.co.nz/search?q=auckland+hotel+accommodation&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
In terms of the page and ensuring high visibility, this is the right thing
to do, but in terms of microformats and providing the information that's
required, using this alt information is the wrong thing to do.
As far as my boss is concerned, microformats are a tiny blip on our radar
and are not worth his time. I believe that he is wrong there, and am
steadily massaging our information so that microformats can be applied as
easily as possible when the time comes.
However, as a business we have a commitment to our clients to provide them
the best results that we can. When the time comes, microformats will need to
take such issues into account before we apply them, because they must not
reduce the effectiveness of our results. Our alt attributes will contain
whatever they must to maintain their high search engine placements, and
anything that interferes with that will fall by the wayside.
--
Paul Wilkins
From scott at makedatamakesense.com Tue Jul 10 15:10:14 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Tue Jul 10 15:10:29 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
one snag))
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com>
Message-ID: <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>
On Jul 10, 2007, at 2:16 PM, Andy Mabbett wrote:
>> Many publishers go against many aspects of the HTML 4.01
>> specification,
>> yes, not least by publishing invalid content.
>
> Is the best way to encourage "POSH" to adhere to standards, or to
> pander
> to those who break them?
If we had more control of web publishing, I would support pushing
complete adherence to HTML specs. But we don't, so we have to
balance de jure standards with de facto standards. We treat all alt
values as content at the risk of discouraging publishers who use alt
values for non-content from using microformats. Whether or not
that's a worthwhile trade-off depends on how many publishers we're
talking about.
> Perhaps it
> would be a good idea if you could provide at least a minimum amount of
> such evidence; preferably with URLs
Indeed, we should collect more examples of how alt is used in
practice, because that's a very important factor in deciding how we
should treat them. But if we're just collecting such examples with
an eye toward supporting pre-determined conclusions, there's really
no point.
> Can you give a real world example of someone publishing such "garbage"
> alt text, pertinent to microformats (and again with URLs as above),
> which does not violate the HTML specs, please?
While the HTML specs are a very important consideration, they are not
the only consideration. While encouraging adherence to HTML, we need
to recognize that such adherence is quite rare in practice. How many
of us even have perfectly valid websites? Complete adherence to HTML
is simply not a practical criterion to apply without concession on
today's web. We should push it where we can and choose those battles
carefully.
--
Scott Reynen
MakeDataMakeSense.com
From andy at pigsonthewing.org.uk Wed Jul 11 01:07:26 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Wed Jul 11 01:09:00 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
onesnag))
In-Reply-To: <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
References: <4692F41E.2010503@digitalbazaar.com>
<003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
Message-ID:
In message <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>, Paul Wilkins
writes
>> Can you give a real world example of someone publishing such "garbage"
>> alt text, pertinent to microformats (and again with URLs as above),
>> which does not violate the HTML specs, please?
>
>I can.
>
>Our website uses feature pages for our clients to help improve their
>visibility to the general public through search engines. One of the
>ways of doing this is to load the page with specific keywords and
>phrases for our clients.
>
>Images for example would have "Copyright CityLife Auckland. Suite at
>our Auckland hotel accommodation"
Unless that's the graphical content of the image, which seems unlikely,
that's an abuse of the alt attribute; such text should be in the title
attribute. It *does* violate the HTML specs. And how is it "pertinent to
microformats"?
>as a business we have a commitment to our clients to provide them the
>best results that we can.
What about their responsibility to their customers, some of whom will
have a visual disability?
--
Andy Mabbett
From andy at pigsonthewing.org.uk Wed Jul 11 01:12:56 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Wed Jul 11 01:17:06 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk
(with one snag))
In-Reply-To: <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>
References: <4692F41E.2010503@digitalbazaar.com>
<531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>
Message-ID:
In message <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>,
Scott Reynen writes
>> Can you give a real world example of someone publishing such "garbage"
>> alt text, pertinent to microformats (and again with URLs as above),
>> which does not violate the HTML specs, please?
>
>While the HTML specs are a very important consideration, they are not
>the only consideration. While encouraging adherence to HTML, we need
>to recognize that such adherence is quite rare in practice. How many
>of us even have perfectly valid websites? Complete adherence to HTML
>is simply not a practical criterion to apply without concession on
>today's web. We should push it where we can and choose those battles
>carefully.
If that's true - which I dispute - then who's going to re-write:
The first rule of POSH is that you must validate your POSH.
accordingly?
--
Andy Mabbett
From msporny at digitalbazaar.com Wed Jul 11 07:15:10 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Wed Jul 11 07:15:14 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with
onesnag))
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
Message-ID: <4694E5EE.2060902@digitalbazaar.com>
Andy Mabbett wrote:
>> Images for example would have "Copyright CityLife Auckland. Suite at
>> our Auckland hotel accommodation"
> Unless that's the graphical content of the image, which seems
> unlikely, that's an abuse of the alt attribute; such text should be in
> the title attribute. It *does* violate the HTML specs. And how is it
> "pertinent to microformats"?
There seem to be two parts to this discussion:
1. HTML specification violation (alt attribute misuse)
2. How the alt attribute is being used in the real world
Andy's line of reasoning is sound regarding the HTML specification
violation. I don't think that anybody can deny that alt text that
does not match the graphical content of the image goes against the
HTML specification.
The second part is how the alt attribute is being used in the
real world. Tantek has asserted that 'alt' is being misused on a wide
scale on the Interwebs.
As Scott has pointed out, the only way to know this is to start
gathering real data. I am in the process of writing an image crawler
(which will hopefully be done by tonight) to gather these statistics.
The crawler will crawl the web for image tags and gather statistics
regarding:
- How many image tags have 'alt' attributes specified.
- How many image tags have 'title' attributes specified.
- How many image tags have both specified.
- Whether or not the 'alt' text matches the image being displayed (I'll
set up a website for all of us to help in analyzing this data)
I'm assuming 125,626,329,000 unique images on the web (125,626,329
unique sites on the web times 1000 unique images per site).
Statistically, I think we would only need around 385 unique site samples
to get a 95% confidence level with a 5% margin of error (somebody correct
me if this is wrong). To be safe, I'll collect 100,000 unique image tags,
1 per site, to get our initial sample set.
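The 385 figure can be checked with the usual sample-size formula for estimating a proportion, n = z^2 * p * (1 - p) / e^2. A minimal Python sketch, assuming simple random sampling, the worst-case proportion p = 0.5 and z = 1.96 for 95% confidence; the function name and its parameters are illustrative only:

```python
import math


def sample_size(margin, z=1.96, p=0.5, population=None):
    """Samples needed to estimate a proportion to within +/- margin.

    z is the normal quantile for the confidence level (1.96 for 95%),
    p is the assumed proportion (0.5 is the most conservative choice),
    population optionally applies the finite population correction.
    """
    n = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(sample_size(0.05))                          # unlimited population
print(sample_size(0.05, population=125_626_329))  # site count assumed above
```

The raw value is about 384.16, which rounds up to 385, and the population size barely matters at this scale. The formula assumes independent random samples, which a web crawl may not provide.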
Any objections to this method of data collection?
-- manu
From scott at makedatamakesense.com Wed Jul 11 07:49:25 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Wed Jul 11 07:49:42 2007
Subject: [uf-new] img alt content
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com>
<531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>
Message-ID:
On Jul 11, 2007, at 2:12 AM, Andy Mabbett wrote:
>> Complete adherence to HTML
>> is simply not a practical criterion to apply without concession on
>> today's web.
>
> If that's true - which I dispute - then who's going to re-write:
>
>
>
> The first rule of POSH is that you must validate your POSH.
>
> accordingly?
Validation and adherence to the HTML spec are not exactly the same
thing. All spec-adherent websites are valid, but not all valid sites
are spec-adherent. So full adherence to the spec is more work to ask
of publishers than simple validation. Ironically, I think the HTML
validator actually encourages poor use of the alt attribute because
it returns an error on missing alt attributes, but doesn't make any
mention that alt should be empty for non-content images. So
publishers who leave out alt on non-content images see this error and
end up adding alt attributes with exactly the kind of "red ball"
values the HTML spec discourages.
I completely agree such publishers should be encouraged to stop doing
this; I just doubt whether such encouragement should come from the
microformats community. I see our goal as a bit more specific than
general encouragement of better HTML: making better HTML publishing
more appealing by establishing practical benefits. And I think the
best way to do this is to focus on areas where better HTML results in
maximum practical benefits with minimum cost to publishers.
In this case specifically, I suspect the best way to accomplish that
goal would not be to encourage everyone publishing non-content alt
attributes to change, but rather to encourage everyone publishing
content in alt attributes to insert such content as more accessible
text, and use style sheets to apply more stylized images, which I
think is what Ben was suggesting (see [1]). This solution, I think,
makes better HTML more useful without making microformats any more
difficult to publish for those who aren't up to spec.
[1] http://www.stopdesign.com/articles/replace_text/
--
Scott Reynen
MakeDataMakeSense.com
From regine at regine-heidorn.de Thu Jul 12 04:50:01 2007
From: regine at regine-heidorn.de (Regine Heidorn)
Date: Thu Jul 12 04:47:57 2007
Subject: [uf-new] MicroFormats for (Music-)TopLists, htop-list?
Message-ID: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de>
Hi All,
first let me introduce myself: I'm a 35-year-old web developer living
in Berlin, Germany. I care a lot about semantic markup and CSS, so I
feel rather addicted to this microformats thing.
I haven't used it much so far, but there is an idea I would
like to realize. Over the last few months I have become very interested
in open music: music published under CC licences, and the netlabels
offering it. Since I run a blog, I had the idea of creating a top-5
list of the tracks I like best from the different labels. To make this
more interactive, I thought it would be fun to give the netlabels the
opportunity to grab my top-lists, including the resulting ratings. If
more bloggers or community sites feel like doing this, the labels
could establish something like blogosphere/Web 2.0 community
Billboards.
So I contacted one of the netlabels, they're quite interested in this
idea, so I thought of how to form this idea in the most simple and
effective way and stumbled across the microformat-thing. I looked
around the blog and the wiki to find out if something would match my
idea and what I can see for the moment is the vision of a top-list
microformat (for audio), including the haudio-Format and the hreview-
format to form something like this:
-
rechazamos el ahora
Christian Dittmann
emporio
thinner
thn 092
cc-by-nc-nd
My thoughts up to this point are:
- Did I use the microformats right, did I get the idea properly?
- Since it's a top-list the use of is the semantic correct way,
so to establish the format it should be convention to start with
something like ?
- Is the licence correctly included with an ?
- How can the rating be included? Does it make sense to work with
hreview here?
- A top-list can be seen as a review, might it be better to
straighten the whole thing out to the hreview thing? So it could also
be used for top-lists not regarding audio-tracks. But it also seems
logical to use haudio if it's audio-material. So especially for audio-
top-lists would it be OK to make that clear by the use of - ?
- If a top-list is also a review: should it be extended by the
hreview-possibilities? Like for example if I write a review about a
track like maybe on my blog or somewhere else, it would be
semantically interesting to paste this information together with my
top-list.
Second: how can this information be collected by, let's say, the
netlabels? So far I have the idea of a PHP script writing the data
to a database and thus creating the netlabel top-list from
the data of participating blogs, community sites and so on.
Lots of thoughts; I hope they can be followed at least ;-)
Greez,
Regine Heidorn
From hkraft at gmail.com Fri Jul 13 02:46:05 2007
From: hkraft at gmail.com (Henrik Kraft)
Date: Fri Jul 13 02:46:09 2007
Subject: [uf-new] Microformat for article/document?
Message-ID: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
Hello
I've been looking but can't seem to find a mf for a document.
I think it should contain something like,
Does this make sense to anyone else or have I misunderstood what the
mf should do?
/Henrik
From davidjanes at blogmatrix.com Fri Jul 13 03:14:19 2007
From: davidjanes at blogmatrix.com (David Janes)
Date: Fri Jul 13 03:14:23 2007
Subject: [uf-new] Microformat for article/document?
In-Reply-To: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
Message-ID: <21e523c20707130314h749c8b19ue782598eb98da9a9@mail.gmail.com>
On 7/13/07, Henrik Kraft wrote:
> I've been looking but can't seem to find a mf for a document.
>
> I think it should contain something like,
>
>
Header
>
Text
>
Bodytext
>
H1 and P are pretty good in and of themselves. Another level of
granularity can be provided by hAtom, using, respectively, hentry,
entry-title, entry-summary and entry-content.
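As a rough illustration of the hAtom suggestion, the sketch below pulls those fields out of a marked-up document with Python's standard-library parser. The sample markup and the names HEntryParser and extract_hentry are assumptions made for this example, the class names entry-summary and entry-content are the hAtom spellings of "summary" and "content", and the parser expects well-formed markup with every non-void element closed.

```python
from html.parser import HTMLParser

# hAtom class names for an entry's title, summary and body.
HATOM_FIELDS = ("entry-title", "entry-summary", "entry-content")
# Elements that never have an end tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HEntryParser(HTMLParser):
    """Collect the text found inside each hAtom field element."""

    def __init__(self):
        super().__init__()
        self.open_fields = []  # one item per open element: its field or None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        field = next((f for f in HATOM_FIELDS if f in classes), None)
        self.open_fields.append(field)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.open_fields:
            self.open_fields.pop()

    def handle_data(self, data):
        # Text belongs to every field element that is currently open.
        for field in {f for f in self.open_fields if f is not None}:
            self.fields[field] = self.fields.get(field, "") + data


def extract_hentry(html):
    """Return a dict of hAtom field name to whitespace-normalized text."""
    parser = HEntryParser()
    parser.feed(html)
    parser.close()
    return {name: " ".join(text.split()) for name, text in parser.fields.items()}


# Assumed markup, reusing the placeholder words from the quoted example.
sample = """
<div class="hentry">
  <h1 class="entry-title">Header</h1>
  <p class="entry-summary">Text</p>
  <div class="entry-content"><p>Bodytext</p></div>
</div>
"""
print(extract_hentry(sample))
```

The point of the sketch is that a consumer only needs the class names, not the particular elements, so H1 and P can stay as they are.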
--
David Janes
Founder, BlogMatrix
http://www.blogmatrix.com
http://blogmatrix.blogmatrix.com
From supercanadian at gmail.com Fri Jul 13 13:42:27 2007
From: supercanadian at gmail.com (Charles Iliya Krempeaux)
Date: Fri Jul 13 13:42:32 2007
Subject: [uf-new] Microformat for article/document?
In-Reply-To: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
Message-ID: <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com>
Hello Henrik,
On 7/13/07, Henrik Kraft wrote:
> Hello
> I've been looking but can't seem to find a mf for a document.
>
> I think it should contain something like,
>
> Header
>
Text
>
Bodytext
>
Perhaps I'm missing the point, but... isn't considered to be a document.
And thus is your "article". is your "header". And you
can include some class names on various elements inside of for
your "preamble" and "bodytext".
See ya
--
Charles Iliya Krempeaux, B.Sc.
All the Vlogging News on One Page
http://vlograzor.com/
From andy at pigsonthewing.org.uk Fri Jul 13 16:54:00 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Fri Jul 13 16:55:26 2007
Subject: [uf-new] Microformat for article/document?
In-Reply-To: <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com>
References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com>
<84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com>
Message-ID: <$tZ1COMYCBmGFwHd@pigsonthewing.org.uk>
In message
<84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com>, Charles
Iliya Krempeaux writes
>On 7/13/07, Henrik Kraft wrote:
>> I've been looking but can't seem to find a mf for a document.
>Perhaps I'm missing the point, but... isn't considered to be a document.
>
>And thus is your "article". is your "header". And you
>can include some class names on various elements inside of for
>your "preamble" and "bodytext".
That's one way of looking at it; but in:
for example, the 2006 article (i.e. the whole page) contains and
describes a 1948 article.
Then again, the latter, and the original enquirer's document, could,
perhaps, be wrapped in a "citation" microformat.
--
Andy Mabbett
From msporny at digitalbazaar.com Sat Jul 14 08:18:08 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sat Jul 14 08:18:12 2007
Subject: [uf-new] MicroFormats for (Music-)TopLists, htop-list?
In-Reply-To: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de>
References: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de>
Message-ID: <4698E930.1000502@digitalbazaar.com>
Regine Heidorn wrote:
> - Did I use the microformats right, did I get the idea properly?
Yes, you seem to have grasped and implemented the concept and markup of
hAudio correctly. Nicely done :)
> - Is the licence correctly included with an ?
Yes, it is.
> - How can the rating be included? Does it make sense to work with
> hreview here?
hAudio was intended to be embedded in hReview. So yes, you could put an
hAudio in hReview and add rating to it that way.
Keep in mind that I don't know of anybody that has implemented hAudio +
hReview, so it would be good for the list to see an example and figure
out if it works.
> - A top-list can be seen as a review, might it be better to straighten
> the whole thing out to the hreview thing? So it could also be used for
> top-lists not regarding audio-tracks. But it also seems logical to use
> haudio if it's audio-material. So especially for audio-top-lists would
> it be OK to make that clear by the use of - ?
This isn't that clear... we would have to understand what you mean by
"top-list". You would have to differentiate the following from each
other (here's a hint: Some of them could be viewed as very similar to
one another):
- "top-list"
- "playlist"
- hreview of a playlist
- hreview of an audio collection or audio album
Another way of looking at it is: How is a top-list any different from a
playlist?
> - If a top-list is also a review: should it be extended by the
> hreview-possibilities? Like for example if I write a review about a
> track like maybe on my blog or somewhere else, it would be semantically
> interesting to paste this information together with my top-list.
You will have to elaborate on this as your idea could be interpreted in
a number of different ways.
> Second is: how can those informations be collected by let's say the
> netlabels? I have the idea of a php-script writing the data in
> database and thus creating the netlabel-toplist consisting of the data
> from participating Blogs, community-sites and whatever.
You're talking about crawling, indexing and website implementation.
While this community is interested in this stuff... they are
implementation details that don't really have much to do with creating
the Microformat you are talking about.
> Since it's a top-list the use of
is the semantic correct way, so
> to establish the format it should be convention to start with
> something like ?
Perhaps. What we really need to find out is how prevalent "top-lists"
are on the Internet. I think everybody on here will agree that they do
exist, but it will be your job to gather data to prove that they do
exist. This is one of the first steps in the Microformats process -
demonstrate, using hard data, that your problem exists. You will need to
answer the following questions:
- What problem is 'top-list' attempting to address?
- How many sites have "top-list" type information?
- What kind of information should be placed in a top-list?
- Is there any way that we can combine haudio + hreview to solve the
problem?
-- manu
From scott at makedatamakesense.com Sat Jul 14 08:19:37 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Sat Jul 14 08:19:55 2007
Subject: [uf-new] Fwd: [uf-discuss] Error messages
References: <4698B342.90204@prodromou.name>
Message-ID: <6922977F-26B9-4878-AB46-C3B006D0F1FA@makedatamakesense.com>
Begin forwarded message:
> From: Evan Prodromou
> Date: July 14, 2007 5:28:02 AM MDT
> To: microformats-discuss@microformats.org
> Subject: [uf-discuss] Error messages
> Reply-To: Microformats Discuss
>
> One of the most common HTML patterns in Web applications is error
> messages. We see them all the time on the Web: login errors, form
> validation errors, backend errors and user input errors. But what if
> this common pattern was standardized?
>
> If HTML error messages all followed a similar format, we could have
> browser plugins that recorded and analyzed the errors that come up.
> They
> could either feed back this structured error data when we needed it --
> say, when filing a bug report or talking to a tech support rep --
> or use
> the error data to help us find workarounds or documentation online.
>
> I brought the idea up in the #microformats channel on Freenode, and it
> got a good response, so I took the next step and created a list of
> examples and a brainstorming page on the microformats wiki.
>
> http://microformats.org/wiki/error-message-examples
> http://microformats.org/wiki/error-message-brainstorming
>
> I'd greatly appreciate the help of people on this list in collecting
> error messages from the wild, and hopefully in building up a draft
> microformat.
>
> ~Evan
>
> --
> Evan Prodromou - evan@prodromou.name - http://evan.prodromou.name/
--
Scott Reynen
MakeDataMakeSense.com
From msporny at digitalbazaar.com Sat Jul 14 09:29:37 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sat Jul 14 09:29:40 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <4694E5EE.2060902@digitalbazaar.com>
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
<4694E5EE.2060902@digitalbazaar.com>
Message-ID: <4698F9F1.1060409@digitalbazaar.com>
Manu Sporny wrote:
> As Scott has pointed out, the only way to know this is to start
> gathering real data. I am in the process of writing an image crawler
> (which will hopefully be done by tonight) to gather these statistics.
The first run of the img tag analysis has been completed, here are the
results:
Total websites crawled : 14077
Total img tags analyzed: 224671
The percentages below are the percentages of img tags that contained
non-empty attributes:
src: 99%
height: 66%
width: 66%
alt: 41%
title: 5%
id: 4%
In general, only 41% of 'img' tags list non-empty 'alt' attributes. In
other words - most websites are not using 'alt' attributes for 'img' tags.
The next step of the analysis process will examine how the sites that
ARE using 'alt' tags use them.
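A tally like the one above could be produced along these lines. This is a sketch in Python using only the standard library, not the crawler used for the run; the class and function names are hypothetical, and it additionally separates a missing alt attribute from an empty one, which the percentages above do not.

```python
from collections import Counter
from html.parser import HTMLParser

# Attributes reported in the tally above.
TRACKED = ("src", "height", "width", "alt", "title", "id")


class ImgAttributeCounter(HTMLParser):
    """Count img tags and how many carry a non-empty value per attribute."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.non_empty = Counter()
        self.alt_state = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        self.total += 1
        values = dict(attrs)  # attribute names arrive lower-cased
        for name in TRACKED:
            if (values.get(name) or "").strip():
                self.non_empty[name] += 1
        # Keep "no alt at all" apart from an explicitly empty alt="".
        if "alt" not in values:
            self.alt_state["missing"] += 1
        elif (values["alt"] or "").strip():
            self.alt_state["non-empty"] += 1
        else:
            self.alt_state["empty"] += 1


def img_stats(pages):
    """Return (total img tags, non-empty counts, alt breakdown) for HTML strings."""
    counter = ImgAttributeCounter()
    for html in pages:
        counter.feed(html)
        counter.close()
        counter.reset()  # clears parser state only; the tallies are kept
    return counter.total, dict(counter.non_empty), dict(counter.alt_state)


# Tiny made-up input standing in for crawled pages.
pages = [
    '<img src="a.png" alt="Logo" width="10" height="10">',
    '<p><img src="b.png" alt=""><img src="c.png"></p>',
]
total, non_empty, alt_state = img_stats(pages)
for name in TRACKED:
    print(f"{name}: {100 * non_empty.get(name, 0) // total}%")
print(alt_state)
```

Counting per tag, as here, says nothing directly about how many sites use alt; a per-site figure would need the counts grouped by site first.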
-- manu
From andy at pigsonthewing.org.uk Sat Jul 14 12:05:15 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Sat Jul 14 12:06:40 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <4698F9F1.1060409@digitalbazaar.com>
References: <4692F41E.2010503@digitalbazaar.com>
<003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
<4694E5EE.2060902@digitalbazaar.com>
<4698F9F1.1060409@digitalbazaar.com>
Message-ID:
In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny
writes
>The percentages below are the percentages of img tags that contained
>non-empty attributes:
>
>src: 99%
>height: 66%
>width: 66%
>alt: 41%
>title: 5%
>id: 4%
>
>In general, only 41% of 'img' tags list non-empty 'alt' attributes. In
>other words - most websites are not using 'alt' attributes for 'img'
>tags.
That's a bogus conclusion - empty "alt" attributes are perfectly valid,
and are appropriate in many cases; and you're counting tags but making
conclusions about "most websites".
--
Andy Mabbett
From msporny at digitalbazaar.com Sat Jul 14 12:36:02 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sat Jul 14 12:36:05 2007
Subject: [uf-new] img alt content statistics
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com>
Message-ID: <469925A2.3080902@digitalbazaar.com>
Andy Mabbett wrote:
> In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny
> writes
>
>> The percentages below are the percentages of img tags that contained
>> non-empty attributes:
>>
>> src: 99%
>> height: 66%
>> width: 66%
>> alt: 41%
>> title: 5%
>> id: 4%
>>
>> In general, only 41% of 'img' tags list non-empty 'alt' attributes. In
>> other words - most websites are not using 'alt' attributes for 'img'
>> tags.
>
> That's a bogus conclusion - empty "alt" attributes are perfectly valid,
> and are appropriate in many cases; and you're counting tags but making
> conclusions about "most websites".
I agree with you, Andy... it seems my statement wasn't clear. Perhaps it
should have read:
"In other words - most websites are using empty 'alt' attributes."
or
"59% of most websites are complying with the HTML 4.01 specification
regarding usage of 'alt' with image tags."
I used the terminology "most websites" because the data gathered is,
statistically speaking, overkill. Assuming 125,626,329 websites (per
Netcraft) we would need a sample set of 384 websites to get a 95%
confidence level with an interval of 5%.
So, we needed 384 samples - we got 224,671 across 14,077 websites.
If you want to sift through the data yourself, I'll have it up tomorrow.
I'll also be providing all of the source code to crawl, index and
analyze the data.
-- manu
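
The figure of 384 is consistent with the usual sample-size formula for estimating a proportion. The sketch below assumes z = 1.96 for 95% confidence, the worst-case proportion p = 0.5, and a finite-population correction; the function name `sample_size` is made up for illustration. The formula also assumes a simple random sample of independent units, which a set of img tags clustered within crawled sites may not be.

```python
def sample_size(population, z=1.96, margin=0.05, proportion=0.5):
    """Sample size for estimating a proportion (hypothetical helper).

    n0 = z^2 * p * (1 - p) / e^2, then corrected for a finite population.
    """
    n0 = (z ** 2) * proportion * (1 - proportion) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)


# Population figure taken from the message above (Netcraft count).
n = sample_size(125_626_329)
whole_samples = int(n)  # truncated, matching the figure quoted in the thread
```

With a population this large the correction term is negligible, so the result is essentially 1.96^2 * 0.25 / 0.05^2.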
From derrick at pallas.us Sat Jul 14 13:42:18 2007
From: derrick at pallas.us (Derrick Lyndon Pallas)
Date: Sat Jul 14 13:42:17 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <469925A2.3080902@digitalbazaar.com>
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com>
<469925A2.3080902@digitalbazaar.com>
Message-ID: <4699352A.4090204@pallas.us>
Manu Sporny wrote:
> "59% of most websites are complying with the HTML 4.01 specification
> regarding usage of 'alt' with image tags."
>
> I used the terminology "most websites" because the data gathered is,
> statistically speaking, overkill. Assuming 125,626,329 websites (per
> Netcraft) we would need a sample set of 384 websites to get a 95%
> confidence level with an interval of 5%.
>
> So, we needed 384 samples - we got 224,671 across 14,077 websites.
That's assuming that any given page from a website is representative of
that website. What you really want are examples of usage on the web; the
number of samples you need is based on usages/page * pages/unique site *
unique sites/internet.
For what it's worth, I actually did start an analysis but haven't had
time to do much with the data. I took a random chunk of our archive,
looked for every <a>, storing the content of the anchor so I could look
for lonely <img>s with @alt text.
The proof run found 1.4M <a> on 14k pages. Of these anchors,
* 240k contain at least one <img>
* 228k start with an <img>
* 152k contain at least one <img> with an @alt
* 121k contain at least one <img> with a non-empty @alt
* 25k contain at least one <img> with a @title
* 24k contain at least one <img> with a non-empty @title
A total of 247k <img> were found in anchors. Of these images,
* 151k contain an @alt
* 120k contain a non-empty @alt
* 25k contain a @title
* 23k contain a non-empty @title
* 11k have a garbage phrase (e.g. "click here", "use the right mouse
button to save", etc.) in @alt or @title
Of the 228k starting <img>s,
* 142k contain an @alt
* 114k contain a non-empty @alt
* 24k contain a @title
* 22k contain a non-empty @title
* 11k have a garbage phrase in @alt or @title
The non-proof run is looking at 50x as many pages. All of this was
gleaned from the services at
~ Derrick Pallas
From bhawkeslewis at googlemail.com Sat Jul 14 15:52:57 2007
From: bhawkeslewis at googlemail.com (Benjamin Hawkes-Lewis)
Date: Sat Jul 14 15:53:05 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <469925A2.3080902@digitalbazaar.com>
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com>
<469925A2.3080902@digitalbazaar.com>
Message-ID: <469953C9.6000301@googlemail.com>
I'm increasingly sceptical about non-qualitative statistical exercises
of this sort. They need to be interpreted with great caution. For
example, alt="" may be compliant with the (X)HTML specifications, or it
may not be. You just can't tell without looking at the page in question.
I'm not sure why mass use or abuse of @alt, treating all webpages as
equals, is deterministic for hCard parsing. Doesn't there need to be a
subsample containing only pages with markup that would be interpreted by
a microformat parser as an hCard?
--
Benjamin Hawkes-Lewis
Manu Sporny wrote:
> Andy Mabbett wrote:
>> In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny
>> writes
>>
>>> The percentages below are the percentages of img tags that contained
>>> non-empty attributes:
>>>
>>> src: 99%
>>> height: 66%
>>> width: 66%
>>> alt: 41%
>>> title: 5%
>>> id: 4%
>>>
>>> In general, only 41% of 'img' tags list non-empty 'alt' attributes. In
>>> other words - most websites are not using 'alt' attributes for 'img'
>>> tags.
>> That's a bogus conclusion - empty "alt" attributes are perfectly valid,
>> and are appropriate in many cases; and you're counting tags but making
>> conclusions about "most websites".
>
> I agree with you, Andy... it seems my statement wasn't clear. Perhaps it
> should have read:
>
> "In other words - most websites are using empty 'alt' attributes."
>
> or
>
> "59% of most websites are complying with the HTML 4.01 specification
> regarding usage of 'alt' with image tags."
>
> I used the terminology "most websites" because the data gathered is,
> statistically speaking, overkill. Assuming 125,626,329 websites (per
> Netcraft) we would need a sample set of 384 websites to get a 95%
> confidence level with an interval of 5%.
>
> So, we needed 384 samples - we got 224,671 across 14,077 websites.
>
> If you want to sift through the data yourself, I'll have it up tomorrow.
> I'll also be providing all of the source code to crawl, index and
> analyze the data.
>
> -- manu
From msporny at digitalbazaar.com Sun Jul 15 11:09:26 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 15 11:09:30 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
Message-ID: <469A62D6.9020805@digitalbazaar.com>
I'm starting a new thread as the "*img alt content*" discussion seems to
be getting unfocused. Please familiarize yourself with the following
thread, as this discussion is a more focused continuation of it:
http://microformats.org/discuss/mail/microformats-new/2007-July/000590.html
All of the tools and data that were used for this analysis, including
source code released under the GPL, are available from the following URL:
http://www.zenmachine.org/downloads/microformats/dbuft-0.3.tar.bz2
The Problem
-----------
Sites quite often use an image instead of a text link to present
actions. For example, instead of using the text "Download", they will
use a graphic of a downward-facing arrow pointing at a disk.
In other words, if we have this:
Download:
How do we present this option to a human being in a non-web-page UI?
This problem is applicable to any 'rel-*' pattern. Currently, it is
affecting the implementation of hAudio because Operator does not extract
ALT or TITLE attributes for IMG tags, thus when an image-only rel-* link
is presented to the user, it is blank.
The Argument Thus Far
---------------------
Andy Mabbett proposed that Operator should use the ALT attribute from
the IMG tag, as that is HTML/XHTML compliant[1]. Tantek Çelik raised the
point that web authors often misuse the ALT attribute[2]. Scott Reynen
noted that we would need examples to make an informed decision, as no
data had yet been collected[3].
The Data Collected So Far
-------------------------
The first set of data collected attempted to determine the number of IMG
tags that used 'alt', 'title' and 'id':
Total websites crawled : 14077
Total img tags analyzed: 224671
@alt: 41%
@title: 5%
@id: 4%
The second set of data collected came from Derrick Pallas. We are still
waiting for analysis to be performed by him and that analysis posted to
the mailing list.
The third set of data collected looks at image-only anchors. In other
words, it collects only links that look like the following:
The data was analyzed by a human being to ensure that the ALT text
matched the image. The following criteria were used to categorize images:
Valid @alt - If the ALT text displayed to the user matched the image
displayed, the image was marked as VALID. The ALT text was also marked
as valid if it was blank.
Unknown @alt - If the ALT text was in another language or was in UTF-8
(not displayable), the image was marked as UNKNOWN.
Garbage @alt - If the ALT text was clearly not applicable to the image,
such as "click here", "red ball", or "blog" when the image was a
shopping cart, the image was marked as GARBAGE.
This analysis required human interaction, thus the sample size is small
(but still statistically significant). A small GUI displayed an image to
a person and asked them to select whether the image matched the ALT
text. This is the first time this data is being presented:
Total websites crawled : 1721
Total img-only anchors analyzed: 1166
Valid @alt : 77.3%
Unknown @alt: 5.8%
Garbage @alt: 16.9%
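
As an illustration of the selection step only (not the released tool), here is a minimal sketch that picks out image-only anchors, meaning anchors whose content is one or more img elements and no visible text. The class name `ImageOnlyAnchorFinder`, the sample markup and its URLs are assumptions for illustration. Judging whether the ALT text matches the image still needs a person.

```python
from html.parser import HTMLParser


class ImageOnlyAnchorFinder(HTMLParser):
    """Collects <img> elements found in anchors that contain no text.

    Hypothetical helper; nested anchors are not handled in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.results = []        # one dict per image in an image-only anchor
        self._in_anchor = False
        self._href = None
        self._text = []          # text seen inside the current anchor
        self._imgs = []          # img attribute dicts inside the current anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._in_anchor = True
            self._href = attrs.get("href")
            self._text, self._imgs = [], []
        elif tag == "img" and self._in_anchor:
            self._imgs.append(attrs)

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or not self._in_anchor:
            return
        # Keep the anchor only if it held images and no visible text.
        if self._imgs and not "".join(self._text).strip():
            for img in self._imgs:
                self.results.append(
                    {"href": self._href, "src": img.get("src"), "alt": img.get("alt")}
                )
        self._in_anchor = False


# Made-up markup: the first and third anchors are image-only.
page = (
    '<a href="/get/song.mp3"><img src="arrow.png" alt="Download"></a>'
    '<a href="/about">About <img src="i.png" alt="info"></a>'
    '<a href="/cart"><img src="cart.png"></a>'
)
finder = ImageOnlyAnchorFinder()
finder.feed(page)
finder.close()
```

A missing alt attribute shows up as `None` and an empty one as `""`, so the two cases can be reported separately.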
As mentioned previously, all of the tools and data that were used for
this analysis, including source code, are available from the following URL:
http://www.zenmachine.org/downloads/microformats/dbuft-0.3.tar.bz2
-- manu
[1]http://microformats.org/discuss/mail/microformats-new/2007-July/000594.html
[2]http://microformats.org/discuss/mail/microformats-new/2007-July/000598.html
[3]http://microformats.org/discuss/mail/microformats-new/2007-July/000595.html
From tantek at cs.stanford.edu Sun Jul 15 11:43:52 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 11:44:10 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A62D6.9020805@digitalbazaar.com>
Message-ID:
On 7/15/07 11:09 AM, "Manu Sporny" wrote:
> Tantek Çelik raised the
> point that web authors often mis-use the ALT attribute[2].
To be clear, the conclusion from this is that publishers should be given the
detailed *choice* of whether or not the alt text in their pages is included
in microformats property values (rather than being forced to by *always*
using it in contained properties).
Thus the alt (or src for that matter) attribute of an <img> element is
*only* included on a property value if the property is set directly on the
<img> OR via a class="value" construct.
Our experience with this in practice has been quite good, and in fact, this
is the first that *anyone* has raised any issues with it (in over two years
of it functioning this way - that is it's not that no one's written it down
yet - unlike some of the existing issues), so given experience to date, I
would assert that we have the 80/20 (or far more than even) case covered,
and that new cases regarding this being raised now have the burden of
proof[1].
Thanks,
Tantek
[1] http://microformats.org/wiki/brainstorming#Burden_of_Proof
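
The rule stated in the message above (alt text contributes to a property value only when the property is set directly on the img, or when the img opts in with class="value") can be sketched roughly as follows. This is an illustration under assumptions, not any parser's actual algorithm: the function names, the use of well-formed XHTML fragments, the `fn` class and the name "Jane Doe" are all made up for the example.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if the element's class attribute contains the given token."""
    return name in element.get("class", "").split()


def text_without_images(element):
    # itertext() yields text nodes only, so <img alt="..."> adds nothing.
    return "".join(element.itertext()).strip()


def property_value(element):
    """Value of a property element under the rule described above.

    Hypothetical sketch; nested class="value" elements are not
    treated specially here.
    """
    # Case 1: the property class sits directly on the <img>.
    if element.tag == "img":
        return element.get("alt", "")
    # Case 2: descendants opt in via class="value".
    parts = []
    for node in element.iter():
        if node is not element and has_class(node, "value"):
            if node.tag == "img":
                parts.append(node.get("alt", ""))
            else:
                parts.append(text_without_images(node))
    if parts:
        return "".join(parts)
    # Default: plain text content only; contained alt text is left out.
    return text_without_images(element)


direct = ET.fromstring('<img class="fn" src="me.png" alt="Jane Doe" />')
contained = ET.fromstring(
    '<span class="fn"><img src="me.png" alt="Jane Doe" /></span>'
)
opted_in = ET.fromstring(
    '<span class="fn"><img class="value" src="me.png" alt="Jane Doe" /></span>'
)
values = [property_value(direct), property_value(contained), property_value(opted_in)]
```

In the middle case the property ends up empty, which is the situation an image-only link produces when no opt-in is present.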
From chris at placenamehere.com Sun Jul 15 14:35:56 2007
From: chris at placenamehere.com (Chris Casciano)
Date: Sun Jul 15 14:36:11 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <469953C9.6000301@googlemail.com>
References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com>
<469925A2.3080902@digitalbazaar.com>
<469953C9.6000301@googlemail.com>
Message-ID: <11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com>
On Jul 14, 2007, at 6:52 PM, Benjamin Hawkes-Lewis wrote:
> I'm increasingly sceptical about non-qualitative statistical
> exercises of this sort. They need to be interpreted with great
> caution. For example, alt="" may be compliant with the (X)HTML
> specifications, or it may not be. You just can't tell without
> looking at the page in question.
>
> I'm not sure why mass use or abuse of @alt, treating all webpages
> as equals, is deterministic for hCard parsing. Doesn't there need
> to be a subsample containing only pages with markup that would be
> interpreted by a microformat parser as an hCard?
One thing I hope we don't lose sight of is that while we as a
community should be promoting standards and other best practices in
all web development and design fronts, if the microformat specs take
a hard line on issues such as this where there is some regular use of
a variety of techniques it may hurt both adoption on a case-by-case
basis as well as how the movement as a whole is viewed in terms of
practicality.
Image replacement techniques, bowing to CSS, when an image is
considered "content" or not are ALL areas where reasonable people
have reasonable arguments for pros and cons, and I think it's the job
of the microformats spec writers, /wherever/ possible, to support
common coding practices, because for the most part which technique is
appropriate is determined by one two-word rule: "it depends".
Just my thought on the matter, anyway.
--
[ Chris Casciano ]
[ chris@placenamehere.com ] [ http://placenamehere.com ]
From msporny at digitalbazaar.com Sun Jul 15 14:45:04 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 15 14:45:09 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To:
References:
Message-ID: <469A9560.4020706@digitalbazaar.com>
Tantek Çelik wrote:
> On 7/15/07 11:09 AM, "Manu Sporny" wrote:
>> Tantek Çelik raised the
>> point that web authors often mis-use the ALT attribute[2].
>
> To be clear, the conclusion from this is that publishers should be given the
> detailed *choice* of whether or not the alt text in their pages is included
> in microformats property values (rather than being forced to by *always*
> using it in contained properties).
Tantek, I don't quite follow the logic here. Publishers aren't given the
option on whether or not their ALT text shows up in a text-based
browser. They are also not given the option on whether their ALT text is
read out loud when using a screen reader.
Why, then, are we giving them the option on how ALT will be handled with
regards to Microformats? Or rather, why are we giving them the option to
hide data?
> Thus the alt (or src for that matter) attribute of an <img> element is
> *only* included on a property value if the property is set directly on the
> <img> OR via a class="value" construct.
You don't have the option of setting "rel-*" properties on images. That
is the whole point of this discussion. Your "just set it on the <img>
element" argument doesn't work for "rel-*". rel-* always go on anchor
elements (<a>).
As for class="value", that is a potential solution... thank you for
identifying it. However, I ask again - why are we giving publishers the
choice of violating the HTML specification? Of hiding data? Where are
the real world examples of why we need to provide that option?
> Our experience with this in practice has been quite good, and in fact, this
> is the first that *anyone* has raised any issues with it (in over two years
> of it functioning this way - that is it's not that no one's written it down
> yet - unlike some of the existing issues), so given experience to date, I
> would assert that we have the 80/20 (or far more than even) case covered
Since you are asserting that the community has 80/20, could you please
provide some data to back up that claim? How many people use images
inside hCard/hCalendar/hAtom and hResume? How many of those people have
@alt specified correctly? Incorrectly? How many examples of images used
in rel-* do we have?
We have collected quite a bit of data (and continue to do so) that shows
that mis-use of @alt isn't as wide-spread as previously asserted. In
fact, it falls quite short of the Microformat community's 80/20 rule. If
I wasn't clear about that previously, here's a re-cap:
As of right now, it looks as though roughly 80-90% of websites are using
@alt correctly, either by not specifying a value or by specifying valid
data in the attribute.
If you'd like me to demonstrate that figure further, I would be more
than happy to do so - using hard data that is available to everybody on
this mailing list.
-- manu
From tantek at cs.stanford.edu Sun Jul 15 14:52:07 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 14:52:12 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A9560.4020706@digitalbazaar.com>
Message-ID:
On 7/15/07 2:45 PM, "Manu Sporny" wrote:
> You don't have the option of setting "rel-*" properties on images. That
> is the whole point of this discussion. Your "just set it on the <img>
> element" argument doesn't work for "rel-*". rel-* always go on anchor
> elements (<a>).
rel-* never applies for image content anyway because rel-* semantics are
always between one URL and another URL which you must *hyperlink* to.
It's the HTML 4.01 specification that provides this restriction, not
microformats.
Thus rel-* and <img> is a non-issue.
Tantek
From andy at pigsonthewing.org.uk Sun Jul 15 14:52:30 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Sun Jul 15 14:53:46 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To:
References: <469A62D6.9020805@digitalbazaar.com>
Message-ID:
In message , Tantek Çelik
writes
>Our experience with this in practice has been quite good, and in fact,
>this is the first that *anyone* has raised any issues with it
I've raised the matter previously.
>I would assert
[...]
> that new cases regarding this being raised now have the burden of
>proof
Surely, by your own standards, the burden is on you, to provide evidence
to support your assertions? I note that you have not replied to my
previous suggestion that you do so.
--
Andy Mabbett
From tantek at cs.stanford.edu Sun Jul 15 14:56:07 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 14:56:11 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A9560.4020706@digitalbazaar.com>
Message-ID:
On 7/15/07 2:45 PM, "Manu Sporny" wrote:
> We have collected quite a bit of data (and continue to do so) that shows
> that mis-use of @alt isn't as wide-spread as previously asserted. In
> fact, it falls quite short of the Microformat community's 80/20 rule. If
> I wasn't clear about that previously, here's a re-cap:
>
> As of right now, it looks as though roughly 80-90% of websites are using
> @alt correctly, either by not specifying a value or by specifying valid
> data in the attribute.
Actually no. As several others have pointed out, the methodologies you used
to gather "quite a bit of data" and the conclusions you reached are
seriously flawed for a number of reasons.
You cannot determine that they are "specifying valid data in the attribute"
unless you inspect the value of the attribute and the page itself *by hand*
to determine whether from a human perspective proper semantics are being
followed. I believe other folks (some with accessibility expertise) have
already pointed this out.
Tantek
From tantek at cs.stanford.edu Sun Jul 15 14:58:07 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 14:58:11 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A9560.4020706@digitalbazaar.com>
Message-ID:
On 7/15/07 2:45 PM, "Manu Sporny" wrote:
> However, I ask again - why are we giving publishers the
> choice of violating the HTML specification?
That has never been demonstrated via an actual example with URL and citation
of the clause in the spec with URL that is allegedly being violated, and
reasoning applied to the actual example as such. It's only been asserted in
hand-waving.
> Of hiding data?
No one is advocating that AFAIK.
> Where are
> the real world examples of why we need to provide that option?
Manu, please check the other responses on this thread, there has already
been at least one publisher that has responded and demonstrated as such.
Thanks,
Tantek
From tantek at cs.stanford.edu Sun Jul 15 15:05:17 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 15:05:21 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with
analyzed data)
In-Reply-To: <469A9560.4020706@digitalbazaar.com>
Message-ID:
This thread is quickly repeating itself, dominating the email discussions on
the list, and thus becoming more noise than signal for most.
Thus I'm going to ask folks who have an agenda of pushing change here to
please STOP repeating themselves (especially when those asking for change
are ignoring criticisms brought forth by the community).
In addition this is a general admin request for those mentioned (Manu and
Andy in particular) to STOP posting on this thread in the list for at least
7 days (to reduce list noise) or until they've documented concrete proposals
*and* the criticisms brought up in the email thread using the wiki.
Thanks,
Tantek
From andy at pigsonthewing.org.uk Sun Jul 15 15:52:29 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Sun Jul 15 15:53:47 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with
analyzed data)
In-Reply-To:
References: <469A9560.4020706@digitalbazaar.com>
Message-ID:
In message , Tantek Çelik
writes
>this is a general admin request for those mentioned (Manu and Andy in
>particular) to STOP posting on this thread in the list for at least 7
>days
Is that a request, or an instruction?
--
Andy Mabbett
From joe at andrieu.net Sun Jul 15 15:58:13 2007
From: joe at andrieu.net (Joe Andrieu)
Date: Sun Jul 15 15:57:57 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-*
(with analyzed data)
In-Reply-To:
Message-ID: <000201c7c733$a034e8b0$0501a8c0@andrieuhome>
Tantek Çelik wrote (Sunday, July 15, 2007 3:05 PM)
> This thread is quickly repeating itself, dominating the email
> discussions on the list, and thus becoming more noise than
> signal for most.
>
> Thus I'm going to ask folks who have an agenda of pushing
> change here to please STOP repeating themselves (especially
> when those asking for change are ignoring criticisms brought
> forth by the community).
>
> In addition this is a general admin request for those
> mentioned (Manu and Andy in particular) to STOP posting on
> this thread in the list for at least 7 days (to reduce list
> noise) or until they've documented concrete proposals
> *and* the criticisms brought up in the email thread using the wiki.
Tantek,
It is pretty unsporting of you to cut off all discussion after you post your own points to the list.
I agree the conversation is going in circles... And probably could use some time off. However, requesting as admin that those who
disagree with you should quiet down--after you make your own points--comes across as a heavy-handed way to get the last word in.
I would also like to see some of the back-and-forth move to concrete proposals on the wiki. Including your own points, Tantek. The
most popular uFs, such as hCard and hCalendar, never went through the uF process with the same documentation and rigor that new
proposals face. You yourself have acknowledged the lack of documentation before. As a result, I'd say the burden of proof exists, as
usual, for everyone making a case. And in my opinion, it is even greater for those defending the status quo, if simply because the
incumbents have the benefit of possession of the standard.
[other thoughts presented in the non-admin thread]
-j
--
Joe Andrieu
SwitchBook Software
http://www.switchbook.com
joe@switchbook.com
+1 (805) 705-8651
From joe at andrieu.net Sun Jul 15 16:00:21 2007
From: joe at andrieu.net (Joe Andrieu)
Date: Sun Jul 15 16:00:06 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To:
Message-ID: <000301c7c733$ecd39fe0$0501a8c0@andrieuhome>
Tantek Çelik wrote (Sunday, July 15, 2007 2:58 PM)
> On 7/15/07 2:45 PM, "Manu Sporny" wrote:
>
> > However, I ask again - why are we giving publishers the choice of
> > violating the HTML specification?
>
> That has never been demonstrated via an actual example with
> URL and citation of the clause in the spec with URL that is
> allegedly being violated, and reasoning applied to the actual
> example as such. It's only been asserted in hand-waving.
>
>
> > Of hiding data?
>
> No one is advocating that AFAIK.
>
>
> > Where are
> > the real world examples of why we need to provide that option?
>
> Manu, please check the other responses on this thread, there
> has already been at least one publisher that has responded
> and demonstrated as such.
I don't understand the use of examples in this debate.
If we were to examine contact information in the wild, we wouldn't suggest that class="title" is bad because nobody is using it or
because people use it outside of hcards. We aren't about to go scan every ALT value for semantic data; the suggestion, as I
understand it, is to allow ALT values as a source of data /within/ uFs.
The issue, imo, seems to be whether, /when authoring uFs/, data can or should be placed in alt tags so that authors can specify data
that otherwise might be buried in an image.
That becomes two questions.
1. Is it semantically valid within common HTML usage? Or put another way, images sometimes contain human readable data that are not
(easily) machine readable. Is the alt tag an appropriate way to specify that data in a machine-readable way?
2. Does it break existing uFs? And that means specifications and usage; Whether or not it breaks parsers is a different issue. In
this case, it may make sense to evaluate the use of IMG ALT tags within uFs to see if uF-using authors have adopted widespread
practices that would break. However, evaluating random selections of IMG tags doesn't really help us understand anything about
current uF usage and how this change to the spec might cause problems with existing uFs.
-j
--
Joe Andrieu
SwitchBook Software
http://www.switchbook.com
joe@switchbook.com
+1 (805) 705-8651
From tantek at cs.stanford.edu Sun Jul 15 17:23:24 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 17:23:30 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-*
(with analyzed data)
In-Reply-To:
Message-ID:
On 7/15/07 3:52 PM, "Andy Mabbett" wrote:
> In message , Tantek Çelik
> writes
>
>> this is a general admin request for those mentioned (Manu and Andy in
>> particular) to STOP posting on this thread in the list for at least 7
>> days
>
> Is that a request, or an instruction?
To be clear, a request.
Thanks Andy,
Tantek
From tantek at cs.stanford.edu Sun Jul 15 17:48:59 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 17:49:07 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-*
(with analyzed data)
In-Reply-To: <000201c7c733$a034e8b0$0501a8c0@andrieuhome>
Message-ID:
On 7/15/07 3:58 PM, "Joe Andrieu" wrote:
> Tantek Çelik wrote (Sunday, July 15, 2007 3:05 PM)
>> This thread is quickly repeating itself, dominating the email
>> discussions on the list, and thus becoming more noise than
>> signal for most.
>>
>> Thus I'm going to ask folks who have an agenda of pushing
>> change here to please STOP repeating themselves (especially
>> when those asking for change are ignoring criticisms brought
>> forth by the community).
>>
>> In addition this is a general admin request for those
>> mentioned (Manu and Andy in particular) to STOP posting on
>> this thread in the list for at least 7 days (to reduce list
>> noise) or until they've documented concrete proposals
>> *and* the criticisms brought up in the email thread using the wiki.
>
> Tantek,
>
> It is pretty unsporting of you to cut off all discussion after you post your
> own points to the list.
Joe,
Apologies as I do realize it came across like that.
I realized shortly after posting my own recent emails that I wasn't helping
the problem either and thus put on my admin hat and posted my request
regarding the thread which I as well will stick to until those advocating
changes do the requested work on the wiki.
> I agree the conversation is going in circles... And probably could use some
> time off. However, requesting as admin that those who
> disagree with you should quiet down--after you make your own points--comes
> across as a heavy-handed way to get the last word in.
The admin request applies to the thread as a whole, but especially to those who
have posted most often in the thread as the high-frequency of posting (and
thus apparent noise on the list) is due to a few, not everyone.
> I would also like to see some of the back-and-forth move to concrete proposals
> on the wiki.
Thanks Joe.
> Including your own points, Tantek.
I'm wiki-ing everything I can, in priority order per my to-do list on the
wiki:
http://microformats.org/wiki/to-do#Tantek
I encourage you to add a section for yourself on the to-do page as well for
the things you want to get done in the microformats community.
> The
> most popular uFs, such as hCard and hCalendar never went through the uF
> process with the same documentation and rigor that new
> proposals face.
They did go through various checks and balances similar to those in the
process (in fact, much of the process was written as a result of documenting
the methodology developed *while* developing hCard and hCalendar).
Your request for more specific history is reasonable, and will certainly
benefit both our existing microformats, and those looking to understand the
development of microformats in general. I've added it to my personal to-do
list.
> You yourself have acknowledged the lack of documentation
> before. As a result, I'd say the burden of proof exists, as
> usual, for everyone making a case.
It is not the same for everyone, no. There is what is established and thus
works today, and there are proposals for change. The proposals for change
have burden of proof. The documentation at this point for those that
actually worked on it is a matter of historical documentation, not process.
> And in my opinion, it is even greater for
> those defending the status quo, if simply because the
> incumbents have the benefit of possession of the standard.
We will simply have to choose to disagree on this point then.
The burden of proof is always on those who wish to change or modify what
already "works" to a great extent today. This principle is actually in use
all over microformats, such as re-using existing implied schemas and looking
at existing widely interoperable standards as a basis for vocabulary for
microformats.
Thus it could be said that, as a key principle of microformats in general,
"doing what already works" (i.e. re-use) is greatly valued over "changing
everything and starting from scratch" (i.e. re-invention).
Thanks,
Tantek
From alasdairking at gmail.com Mon Jul 16 00:14:23 2007
From: alasdairking at gmail.com (Alasdair King)
Date: Mon Jul 16 00:14:27 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com>
References: <4692F41E.2010503@digitalbazaar.com>
<003501c7c33c$9b2e10d0$bc08a8c0@nzto22>
<4694E5EE.2060902@digitalbazaar.com>
<4698F9F1.1060409@digitalbazaar.com>
<469925A2.3080902@digitalbazaar.com> <469953C9.6000301@googlemail.com>
<11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com>
Message-ID: <7df2c90b0707160014t2702c65dy7e353319f50573d2@mail.gmail.com>
I develop a free web browser for blind people called WebbIE (
http://www.webbie.org.uk). The use of images in links is a problem for my
users too.
I use the alt tag content as text for the link: if the alt tag is blank or
missing I use the filename of the target, so
http://www.mypage.com/contact.htm
becomes
Link 1: contact.htm
(And on looking at it, it should probably just become "contact"...!)
Offered as a real-world example (insignificant numbers vs IE, significant
numbers vs blind people).
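
The fallback described above can be sketched in a few lines of Python. This is
only an illustration of the rule as stated in this message, not WebbIE's actual
code: the function name link_label and the strip_extension option are
hypothetical, and the URL is the example from the message.

```python
from posixpath import basename, splitext
from urllib.parse import urlparse


def link_label(alt, href, strip_extension=False):
    """Return text for an image link: the alt text, else the target's filename.

    Hypothetical helper illustrating the fallback rule, not WebbIE's code.
    """
    if alt and alt.strip():
        return alt.strip()
    # alt is blank or missing: fall back to the last path segment of the target.
    name = basename(urlparse(href).path)
    if strip_extension:
        name = splitext(name)[0]
    # If the URL has no filename at all (e.g. ends in "/"), show the URL itself.
    return name or href


print(link_label("", "http://www.mypage.com/contact.htm"))    # contact.htm
print(link_label(None, "http://www.mypage.com/contact.htm",
                 strip_extension=True))                       # contact
```

Passing strip_extension=True gives the shorter "contact" form mentioned above.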
--
Alasdair King
WebbIE
http://www.webbie.org.uk
alasdair@webbie.org.uk
On 7/15/07, Chris Casciano wrote:
>
>
> On Jul 14, 2007, at 6:52 PM, Benjamin Hawkes-Lewis wrote:
>
> > I'm increasingly sceptical about non-qualitative statistical
> > exercises of this sort. They need to be interpreted with great
> > caution. For example, alt="" may be compliant with the (X)HTML
> > specifications, or it may not be. You just can't tell without
> > looking at the page in question.
> >
> > I'm not sure why mass use or abuse of @alt, treating all webpages
> > as equals, is deterministic for hCard parsing. Doesn't there need
> > to be a subsample containing only pages with markup that would be
> > interpreted by a microformat parser as an hCard?
>
>
> One thing I hope we don't lose sight of is that while we as a
> community should be promoting standards and other best practices in
> all web development and design fronts, if the microformat specs take
> a hard line on issues such as this where there is some regular use of
> a variety of techniques it may hurt both adoption on a case by case
> basis as well as how the movement as a whole is viewed in terms of
> practicality.
>
>
> Image replacement techniques, bowing to CSS, when an image is
> considered "content" or not are ALL areas where reasonable people
> have reasonable arguments for pros and cons and I think it's the job
> of the microformats spec writers /wherever/ possible to support
> common coding practices, because for the most part which technique is
> appropriate is determined by one two word rule: "it depends".
>
>
> Just my thought on the matter, anyway.
>
> --
> [ Chris Casciano ]
> [ chris@placenamehere.com ] [ http://placenamehere.com ]
>
--
Alasdair King
From msporny at digitalbazaar.com Mon Jul 16 09:29:48 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 16 09:29:51 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with
analyzed data)
In-Reply-To:
References:
Message-ID: <469B9CFC.2090409@digitalbazaar.com>
Tantek Çelik wrote:
> In addition this is a general admin request for those mentioned (Manu and
> Andy in particular) to STOP posting on this thread in the list for at least
> 7 days (to reduce list noise) or until they've documented concrete proposals
> *and* the criticisms brought up in the email thread using the wiki.
Out of respect for the community, I'll stop posting for 7 days. I'll
spend time documenting this argument on the wiki and will point everyone
to that page once it is updated along with a more detailed explanation
as to how the analysis was completed and why the data is pertinent. If
there is any other information that people would like included on the
page, please let me know off-list.
-- manu
From cgriego at gmail.com Mon Jul 16 12:24:31 2007
From: cgriego at gmail.com (Chris Griego)
Date: Mon Jul 16 12:24:33 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To:
References: <469A62D6.9020805@digitalbazaar.com>
Message-ID: <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com>
On 7/15/07, Tantek Çelik wrote:
> Our experience with this in practice has been quite good, and in fact, this
> is the first that *anyone* has raised any issues with it (in over two years
> of it functioning this way - that is it's not that no one's written it down
> yet - unlike some of the existing issues), so given experience to date, I
> would assert that we have the 80/20 (or far more than even) case covered,
> and that new cases regarding this being raised now have the burden of
> proof[1].
I have raised this issue before in IRC directly in conversation with
you, Tantek. It also came up during Twitter's adoption of microformats
because their usage assumed that the alt text was considered part of
the microformat output without specifying anything specific.
--
Chris Griego
From andy at pigsonthewing.org.uk Mon Jul 16 13:19:23 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Mon Jul 16 13:21:03 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com>
References: <469A62D6.9020805@digitalbazaar.com>
<15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com>
Message-ID:
In message
<15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com>, Chris
Griego writes
>On 7/15/07, Tantek Çelik wrote:
>> Our experience with this in practice has been quite good, and in fact, this
>> is the first that *anyone* has raised any issues with it (in over two years
>> of it functioning this way - that is it's not that no one's written it down
>> yet - unlike some of the existing issues), so given experience to date, I
>> would assert that we have the 80/20 (or far more than even) case covered,
>> and that new cases regarding this being raised now have the burden of
>> proof[1].
>
>I have raised this issue before in IRC directly in conversation with
>you, Tantek. It also came up during Twitter's adoption of microformats
>because their usage assumed that the alt text was considered part of
>the microformat output without specifying anything specific.
Also:
(reformatted and edited for readability)
# [22:05:44] one question ...
# [22:05:44] should work?
# [22:06:05] should
# [22:06:12] yes, as long as there is a
around it
and this mailing list thread:
This thread has some useful background:
--
Andy Mabbett
From Leif_Storset at intuit.com Mon Jul 16 15:06:06 2007
From: Leif_Storset at intuit.com (Storset, Leif)
Date: Mon Jul 16 15:06:08 2007
Subject: [uf-new] Receipt microformat
References: <657A9BE009D3504AAE29BD8E8C2DD61E07303B24@SDGEXEVS02.corp.intuit.net>
Message-ID: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net>
Fellow microformat enthusiasts,
I work in Intuit's Technology Innovation Group, which explores new and
emerging technologies and helps Intuit product teams adopt them. (For
those outside North America: Intuit is the leading vendor of financial
and tax software for individuals and small businesses. In America, our
products Quicken, QuickBooks and TurboTax are household names.)
Our group is interested in microformats - specifically the possibility
of a receipt microformat. We believe that a receipt format for online
stores could significantly reduce data entry for our users.
Following the "why a new microformat" process:
The PROBLEM: our users currently enter expenses into our software
manually, even when the information is available in digital form. This
is done in lump sums, which hinders further analysis and categorization.
All this can be automated. (Indeed it is already automated through
screen scraping, but this is unreliable, error-prone and not
future-proof.)
Is there a SIMPLER PROBLEM? Some components of a receipt are simpler
problems that have been solved. Billing address and delivery address are
obviously vCards; price could use the proposed hCurrency. But data such
as the line items (Product X in quantity N at price Y) would not make
sense out of the context of a "receipt". (hProduct and hListing
obviously come close and might possibly be integrated somehow.) In
short, we don't see a simpler problem to solve, since we already have
some microformats in place.
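
To make the shape of the data concrete, here is a rough Python sketch of what a
parser might hand back after reading such a receipt. It is an assumption for
discussion only: no receipt microformat exists yet, and the class and field
names (Receipt, LineItem, seller, currency, billing_address) are hypothetical
and not taken from any draft.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass
class LineItem:
    """One receipt line: Product X in quantity N at price Y (names hypothetical)."""
    description: str
    quantity: int
    unit_price: Decimal

    def subtotal(self):
        return self.unit_price * self.quantity


@dataclass
class Receipt:
    """Hypothetical container; the address would be marked up with an existing format."""
    seller: str
    currency: str          # e.g. an ISO 4217 code such as "USD"
    billing_address: str
    items: list = field(default_factory=list)

    def total(self):
        # Decimal avoids the rounding surprises of binary floats for money.
        return sum((item.subtotal() for item in self.items), Decimal("0"))


receipt = Receipt(seller="Example Store", currency="USD",
                  billing_address="1 Example Street")
receipt.items.append(LineItem("Product X", 2, Decimal("9.99")))
receipt.items.append(LineItem("Product Y", 1, Decimal("5.00")))
print(receipt.total())  # 24.98
```

Keeping line items as separate records, rather than a lump sum, is what would
allow the further analysis and categorization mentioned above.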
Has the problem been SOLVED? As far as we can tell, no. Martin Owens and
Joe Osowski exchanged ideas on this microformat earlier, and we'd like
to build on their work.
(http://microformats.org/discuss/mail/microformats-new/2007-May/000394.html)
In case the purpose of the microformat is not clear, imagine the
following use case: The customer, a Quicken user with the (hypothetical)
Quicken Browser Toolbar, is shopping at Amazon.com and is ready for
checkout. After paying for the purchase, the customer wishes to enter
the data into Quicken. Instead of manually typing everything into
Quicken, the customer selects "Save receipt" from the Quicken Browser
Toolbar, which imports the expense into Quicken. Another possibility is
to use a JavaScript-powered button to copy the receipt to the clipboard
and support pasting the microformat from within Quicken.
We are looking forward to your input and participation. Has the problem
been solved before? Are there other useful microformats already in
existence that could be included in a receipt format?
Thanks,
Leif Arne Storset
Technology Innovation Group, Intuit
From joe at andrieu.net Mon Jul 16 22:29:16 2007
From: joe at andrieu.net (Joe Andrieu)
Date: Mon Jul 16 22:28:57 2007
Subject: [admin] [EoT request] was Re: [uf-new] Use of img in
rel-* (with analyzed data)
In-Reply-To:
Message-ID: <006c01c7c833$6c72f930$0501a8c0@andrieuhome>
Tantek Çelik wrote (Sunday, July 15, 2007 5:49 PM):
> On 7/15/07 3:58 PM, "Joe Andrieu" wrote:
>
> > I would also like to see some of the back-and-forth move to
> concrete
> > proposals on the wiki.
>
> Thanks Joe.
>
>
> > Including your own points, Tantek.
>
> I'm wiking everything I can, in priority order per my to-do
> list on the
> wiki:
>
> http://microformats.org/wiki/to-do#Tantek
>
> I encourage you to add a section for yourself on the to-do
> page as well for the things you want to get done in the
> microformats community.
Tantek,
I have been specifically requested by Rohit /not/ to put my issues on the wiki. And since he has removed them, spoken with me
personally, and promised some sort of progress behind the scenes, I will continue to respect that.
However, until that progress is visible and my concerns about governance, ownership, and IP are resolved satisfactorily, I will
continue to refrain from substantial contributions to the wiki, just as I will be careful about using uF in my own work. Where I
can, I will contribute in email discussions and participate in the community with hopes that it will eventually evolve from a private
cabal of individual uF owners to a true open source community.
> > You yourself have acknowledged the lack of documentation
> before. As a
> > result, I'd say the burden of proof exists, as usual, for everyone
> > making a case.
>
> It is not the same for everyone no. There is what is
> established and thus works today, and there are proposals for
> change. The proposals for change have burden of proof. The
> documentation at this point for those that actually worked on
> it is a matter of historical documentation, not process.
>
> > And in my opinion, it is even greater for
> > those defending the status quo, if simply because the
> incumbents have
> > the benefit of possession of the standard.
>
> We will simply have to choose to disagree on this point then.
>
> The burden of proof is always on those who wish to change or
> modify what already "works" to a great extent today. This
> principle is actually in use all over microformats, such as
> re-using existing implied schemas and looking at existing
> widely interoperable standards as a basis for vocabulary for
> microformats.
>
> Thus it could be said that a key principle of microformats in
> general "doing what already works" (i.e. re-use) is greatly
> valued over "changing everything and starting from scratch"
> (i.e. re-invention).
Respectfully, please avoid hyperbole if you would like to have a constructive conversation. Nobody has suggested "changing
everything and starting from scratch". It would be easier to do that outside of microformats.org if it were appropriate.
Based on my own experience and informal research of significant cultural, historical, and philosophical essays on the issue of
authority, I reject the argument that things should stay the same just "because that's the way we've always done it". Time and
again, questioning the status quo has generated improvements, even when it catalyzed disruptive change.
If there are documented and well-founded reasons for a standing decision, the simple response is to point to those reasons and ask
those who would propose something new to address those reasons explicitly in any new proposals. Clearly articulated and
well argued foundations for decisions can stand the test of time... but that fact should not be taken as license to reject
suggestions out of hand simply because they are new or "not invented here". The foundation of technical authority must lie in the
merit of the technology, not in the legacy of authorship.
Just because a handful of smart guys documented semantic HTML representations of vCard and iCalendar does not make those
specifications "holy" or unchangeable. This community has severe limitations on growth, especially in the area of versioning. As
respectfully as possible, I suggest it is largely because the original authors often react defensively when changes are proposed.
Supporting that emotional dynamic, there is no change control process. Either the original author deems it appropriate and updates
the spec--such as when you added "places" to the semantics of vCard--or the original authors fight tooth and nail until
well-intentioned suggestions are bludgeoned to death. In contrast, new proposals go through a brutal review process where every
last detail is examined and debated. For those proposals that survive the gauntlet, the outcome promises to be a solid, robust
microformat.
Perhaps it would be more constructive if proposed changes to existing standards had some sort of agreed upon process for
documentation, evaluation, and acceptance/rejection.
-j
--
Joe Andrieu
SwitchBook Software
http://www.switchbook.com
joe@switchbook.com
+1 (805) 705-8651
From bhawkeslewis at googlemail.com Tue Jul 17 01:14:55 2007
From: bhawkeslewis at googlemail.com (Benjamin Hawkes-Lewis)
Date: Tue Jul 17 01:15:01 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A62D6.9020805@digitalbazaar.com>
References: <469A62D6.9020805@digitalbazaar.com>
Message-ID: <469C7A7F.8010308@googlemail.com>
Manu Sporny wrote:
> This analysis required human interaction, thus the sample size is small
> (but still statistically significant). A small GUI displayed an image to
> a person and asked them to select if the image matched the ALT tag. This
> is the first time this data is being presented:
I think your analysis has done a good job of showing that @alt usage is
better than I at least would have generally assumed, and it's great that
you actually tested @alt with humans. But I do think there is a
methodological flaw with how you did this. @alt text does not exist in a
vacuum, but in the context of a page. @alt does not match an image, but the
use of an image within a given context. For example:
Help
would be better than no @alt, but would still be misguided. In context,
the correct alternative text would actually be alt="". And such errors
would matter for microformat parsing, e.g.:
Benjamin Hawkes-Lewis
So it would be better to present at least the immediate context of the
image to human testers, not just the image itself.
(Note I /strongly/ agree that microformat parsers should treat @alt text
just like other text as per the HTML specification and WCAG; I'm making
a purely methodological point about your statistical approach here.)
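
A small Python sketch of the parsing concern, assuming a parser that treats
@alt text like any other text. The markup below is hypothetical, chosen to
illustrate the point (it is not the markup from the examples above), and
LinkText and text_of are illustrative names, not part of any microformat
parser.

```python
from html.parser import HTMLParser


class LinkText(HTMLParser):
    """Collect the text of a fragment, counting img alt text as text."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Treat a non-empty alt attribute as if it were inline text.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        if data.strip():
            self.parts.append(data.strip())


def text_of(markup):
    parser = LinkText()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.parts)


# Redundant alt next to visible text: the word is extracted twice.
print(text_of('<a href="/help"><img src="help.png" alt="Help"> Help</a>'))  # Help Help
# Empty alt in the same context: the word is extracted once.
print(text_of('<a href="/help"><img src="help.png" alt=""> Help</a>'))      # Help
```

The same image gets a different "correct" @alt depending on whether the text
beside it already says the same thing, which is why the tester needs to see
the context.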
--
Benjamin Hawkes-Lewis
From Leif_Storset at intuit.com Tue Jul 17 10:08:36 2007
From: Leif_Storset at intuit.com (Storset, Leif)
Date: Tue Jul 17 10:08:43 2007
Subject: [uf-new] Receipt microformat
In-Reply-To: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net>
References: <657A9BE009D3504AAE29BD8E8C2DD61E07303B24@SDGEXEVS02.corp.intuit.net>
<657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net>
Message-ID: <657A9BE009D3504AAE29BD8E8C2DD61E073CBA3F@SDGEXEVS02.corp.intuit.net>
Hello again,
Regarding the receipt microformat, I thought I would link to Joe
Osowski's
(http://microformats.org/discuss/mail/microformats-new/2007-May/000394.html)
and Martin Owens's
(http://microformats.org/discuss/mail/microformats-discuss/2007-January/008033.html)
proposals for your reference.
I have collected a few samples that I will upload as soon as we agree
that a wiki page is warranted.
Looking forward to hearing from you!
Leif Arne Storset
Technology Innovation Group, Intuit
From msporny at digitalbazaar.com Tue Jul 17 10:51:46 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Tue Jul 17 10:51:50 2007
Subject: [uf-new] Receipt microformat
In-Reply-To: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net>
References: <