From arash.amiri at researchstudio.at Wed Jul 4 05:18:38 2007
From: arash.amiri at researchstudio.at (Arash Amiri)
Date: Wed Jul 4 05:19:07 2007
Subject: [uf-new] micorformat for a shopping task
Message-ID: <468B901E.5080607@researchstudio.at>

Hi!

I was wondering if it makes sense to create some microformat for a shopping task. This microformat describes what you want to buy, until when you want to have it (deadline?), and maybe some more things...

The reason why I mention this is to find some "portable description" of things you need. For example, you walk past a supermarket and get reminded that you need some bread. There is probably no format encapsulating this "I need this until then..."-thing.

Any comments (or is that just out of scope of the idea?)

From scott at makedatamakesense.com Wed Jul 4 07:52:36 2007
From: scott at makedatamakesense.com (Scott Reynen)
Date: Wed Jul 4 07:52:41 2007
Subject: [uf-new] micorformat for a shopping task
In-Reply-To: <468B901E.5080607@researchstudio.at>
References: <468B901E.5080607@researchstudio.at>
Message-ID: <5C5B4EA3-B828-4FC5-8079-0A87C6E68DEC@makedatamakesense.com>

On Jul 4, 2007, at 6:18 AM, Arash Amiri wrote:

> I was wondering if it makes sense to create some microformat for a
> shopping task. This microformat describes what you want to buy,
> until when you want to have it (deadline?), and maybe some more
> things...
>
> The reason why I mention this is to find some "portable
> description" of things you need. For example, you walk past a
> supermarket and get reminded that you need some bread. There is
> probably no format encapsulating this "I need this until then..."-
> thing.
>
> any comments (or is that just out of scope of the idea?)

Have you tried applying hListing to this?
http://microformats.org/wiki/hlisting

--
Scott Reynen
MakeDataMakeSense.com

From msporny at digitalbazaar.com Sun Jul 8 13:39:33 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 8 13:39:37 2007
Subject: [uf-new] hAudio implemented on Bitmunk (with one snag)
Message-ID: <46914B85.4040903@digitalbazaar.com>

We've gone through and implemented hAudio on Bitmunk.com (one of our service websites). David Lehn, one of our semantic web guys, has also created an hAudio plug-in for Operator. Mike Kaply, author of Operator, said that he will make it available via the Operator download section within the next week or two. To view some hAudio compliant markup, you can go to the following link:

http://www.bitmunk.com/view/media/6011098

There are over 850,000 songs that have been marked up on the website. We are in the process of talking our partners, colleagues and competitors into using hAudio to mark up their audio content as well. So, good progress is being made in implementing hAudio.

However, we've hit a snag when it comes to usability with hAudio and Operator/Firefox 3.

Problem Description:

Quite often a site uses an image instead of a text link to present actions. For example: Instead of using the text "Download", they will use a graphic image with a downward-facing arrow.

In other words, if we have this:

Download:

How do we present this option to a human being in a non-web-page UI?

How it relates to the Examples:

We (Bitmunk.com) have this problem with 'rel-sample', 'rel-enclosure', and 'rel-payment'. Most of the examples also contain images instead of text for samples, downloads and purchase links. This is a demonstrable, widespread problem.

The problem with Operator and screen readers:

If there is no text to display, then how does one place the item into a menu/display for Operator/Firefox?
Grabbing the image and placing it in a UI is a difficult argument to make - there are a variety of image sizes that might not do well in the Operator UI (or Firefox 3 UI). Proposed solution: We have a fix for Operator that uses the link title text if there is no internal text. This fixes the problem for both Operator menu display, Firefox 3 UI display and for screen readers. Here's how the site author would change the text above: Download: This approach is beneficial for the following reasons: 1. It POSH-ifies the website. 2. It works well with Operator, Firefox 3 and other uF parsers/UIs. 3. It fixes the accessibility/screen reader problem. We need feedback/consensus from the uF community before submitting the patch for inclusion into Operator/Firefox 3. Is there anybody that disagrees with this approach or has a better approach? -- manu From andy at pigsonthewing.org.uk Sun Jul 8 14:10:23 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Sun Jul 8 14:10:30 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: <46914B85.4040903@digitalbazaar.com> References: <46914B85.4040903@digitalbazaar.com> Message-ID: In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny writes >Problem Description: > >It is quite often that a site uses an image instead of a text link to >present actions. For example: Instead of using the text "Download", >they will use a graphic image with a downward-facing arrow. > >In other words, if we have this: > >Download: > > > > >How do we present this option to a human being in a non-web-page UI? The HTML is invalid, lacking the alt attribute which should fix this problem. -- Andy Mabbett From msporny at digitalbazaar.com Sun Jul 8 14:30:12 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Sun Jul 8 14:30:15 2007 Subject: [uf-new] Mapping the hAudio Microformat to hAudio RDFa Message-ID: <46915764.1070900@digitalbazaar.com> What is the process for mapping the hAudio Microformat to RDFa? 
The reason that we need to do this is because the hAlbum/hVideo specs that we've been researching internally have some nasty long-term design issues due to the no-namespace/scope-less approach that uFs have adopted. We'd like to make hAudio a standard across "semantic languages". We don't want to go through the same arduous process that everybody had to go through on here concerning hAudio with another community... it would be a waste of everybody's time. The hard work (research, examples and analysis) was accomplished via the Microformats process. Implementing hAudio using the Microformats approach isn't going to work for complicated/nested audio/video/image structures. We have a very large database that we would like to semantic-ify and we would like to do it in a standards-compliant way. How can we propose an RDFa standard for hAudio that is scrutinized and adopted by this community? In other words - Microformats did a good job with the design. We'd like to give people the option to implement using either hAudio uF or hAudio RDFa. -- manu From microformats at kaply.com Mon Jul 9 05:37:49 2007 From: microformats at kaply.com (Mike Kaply) Date: Mon Jul 9 05:37:53 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: References: <46914B85.4040903@digitalbazaar.com> Message-ID: Actually the alt attribute WON'T fix this problem. Because the microformat attribute is on the anchor tag, not the image. Microformats grab the text in the tag. They only grab the image alt text if the microformat class is on the image itself. Here's a different example: Mike Kaply I realize this is a little contrived, but you get the idea. In this case, the fn is empty. Mike Kaply On 7/8/07, Andy Mabbett wrote: > In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny > writes > > >Problem Description: > > > >It is quite often that a site uses an image instead of a text link to > >present actions. 
For example: Instead of using the text "Download", > >they will use a graphic image with a downward-facing arrow. > > > >In other words, if we have this: > > > >Download: > > > > > > > > > >How do we present this option to a human being in a non-web-page UI? > > The HTML is invalid, lacking the alt attribute which should fix this > problem. > > -- > Andy Mabbett > _______________________________________________ > microformats-new mailing list > microformats-new@microformats.org > http://microformats.org/mailman/listinfo/microformats-new > From andy at pigsonthewing.org.uk Mon Jul 9 07:14:12 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Mon Jul 9 07:14:16 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: References: <46914B85.4040903@digitalbazaar.com> Message-ID: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> On Mon, July 9, 2007 13:37, Mike Kaply wrote: > On 7/8/07, Andy Mabbett wrote: > >> In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny >> writes >>> if we have this: >>> Download: >>> >>> >>> >>> How do we present this option to a human being in a non-web-page UI? >> The HTML is invalid, lacking the alt attribute which should fix this >> problem. > Actually the alt attribute WON'T fix this problem. Because the > microformat attribute is on the anchor tag, not the image. Microformats > grab the text in the tag. They only grab the image alt text if the > microformat class is on the image itself. Here's a different example: > > alt="Mike Kaply"> > > I realize this is a little contrived, but you get the idea. > In this case, the fn is empty. My argument is that the fn should /not/ be empty; the "alt" attribute contains the text equivalent of the image. To discount it as you suggest is to ignore the semantics of the mark-up presented to you. 
-- Andy Mabbett ** via webmail ** From scott at makedatamakesense.com Mon Jul 9 07:37:49 2007 From: scott at makedatamakesense.com (Scott Reynen) Date: Mon Jul 9 07:38:02 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag)) In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote: >> They only grab the image alt text if the >> microformat class is on the image itself. Here's a different example: >> >> > alt="Mike Kaply"> >> >> I realize this is a little contrived, but you get the idea. > >> In this case, the fn is empty. > > My argument is that the fn should /not/ be empty; the "alt" attribute > contains the text equivalent of the image. I agree this matches the semantics of the alt attribute; however, I suspect few publishers are currently using this attribute appropriately, so I think we should do more research into the likely ramifications of such a change before making it. -- Scott Reynen MakeDataMakeSense.com From derrick at pallas.us Mon Jul 9 08:42:58 2007 From: derrick at pallas.us (Derrick Lyndon Pallas) Date: Mon Jul 9 08:42:59 2007 Subject: [uf-new] img alt content In-Reply-To: References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: <46925782.3010006@pallas.us> Actually, I can probably be of help here, having written the Alexa Image Search indexer. While I can't divulge too much about what goes into building the index, I'll see if I can find some time to take a look at the usage of img/@alt inside hcard/fn some time this week. Is there anything specific anyone would like me to look for? ~D Scott Reynen wrote: > On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote: > >>> They only grab the image alt text if the >>> microformat class is on the image itself. 
Here's a different example: >>> >>> >> alt="Mike Kaply"> >>> >>> I realize this is a little contrived, but you get the idea. >> >>> In this case, the fn is empty. >> >> My argument is that the fn should /not/ be empty; the "alt" attribute >> contains the text equivalent of the image. > > I agree this matches the semantics of the alt attribute; however, I > suspect few publishers are currently using this attribute > appropriately, so I think we should do more research into the likely > ramifications of such a change before making it. > > -- > Scott Reynen > MakeDataMakeSense.com > > > _______________________________________________ > microformats-new mailing list > microformats-new@microformats.org > http://microformats.org/mailman/listinfo/microformats-new From bewest at gmail.com Mon Jul 9 12:34:22 2007 From: bewest at gmail.com (Benjamin West) Date: Mon Jul 9 12:34:24 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: <46914B85.4040903@digitalbazaar.com> References: <46914B85.4040903@digitalbazaar.com> Message-ID: <8ad71be30707091234r17640ce3jcf5bd36d3abe6fb9@mail.gmail.com> One possible solution is to use an image replacement technique. Also, you may choose to send content in a and then hide it when it's unnecessary. -Ben On 7/8/07, Manu Sporny wrote: > We've gone through and implemented hAudio on Bitmunk.com (one of our > service websites). David Lehn, one of our semantic web guys, has also > created an hAudio plug-in for Operator. Mike Kaply, author of Operator, > said that he will make it available via the Operator download section > within the next week or two. To view some hAudio compliant markup, you > can go to the following link: > > http://www.bitmunk.com/view/media/6011098 > > There are over 850,000 songs that have been marked up on the website. We > are in the process of talking our partners, colleagues and competitors > into using hAudio to mark up their audio content as well. 
So, good > progress is being made in implementing hAudio. > > However, we've hit a snag when it comes to usability with hAudio and > Operator/Firefox 3. > > Problem Description: > > It is quite often that a site uses an image instead of a text link to > present actions. For example: Instead of using the text "Download", they > will use a graphic image with a downward-facing arrow. > > In other words, if we have this: > > Download: > > > > > How do we present this option to a human being in a non-web-page UI? > > How it relates to the Examples: > > We (Bitmunk.com) has this problem with 'rel-sample', 'rel-enclosure', > and 'rel-payment'. Most of the examples also contain images instead of > text for samples, downloads and purchase links. This is a demonstrable, > widespread problem. > > The problem with Operator and screen readers: > > If there is no text to display, then how does one place the item into a > menu/display for Operator/Firefox? Grabbing the image and placing it in > a UI is a difficult argument to make - there are a variety of image > sizes that might not do well in the Operator UI (or Firefox 3 UI). > > Proposed solution: > > We have a fix for Operator that uses the link title text if there is no > internal text. This fixes the problem for both Operator menu display, > Firefox 3 UI display and for screen readers. Here's how the site author > would change the text above: > > Download: > href="http://my.site.com/download/MySong.mp3"> > > > > This approach is beneficial for the following reasons: > > 1. It POSH-ifies the website. > 2. It works well with Operator, Firefox 3 and other uF parsers/UIs. > 3. It fixes the accessibility/screen reader problem. > > We need feedback/consensus from the uF community before submitting the > patch for inclusion into Operator/Firefox 3. Is there anybody that > disagrees with this approach or has a better approach? 
>
> -- manu

From tantek at cs.stanford.edu Mon Jul 9 12:36:54 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 12:37:03 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To:
Message-ID:

On 7/9/07 7:37 AM, "Scott Reynen" wrote:

> On Jul 9, 2007, at 8:14 AM, Andy Mabbett wrote:
>
>>> They only grab the image alt text if the
>>> microformat class is on the image itself. Here's a different example:
>>>
>>>
>> alt="Mike Kaply">
>>>
>>> I realize this is a little contrived, but you get the idea.
>>
>>> In this case, the fn is empty.
>>
>> My argument is that the fn should /not/ be empty; the "alt" attribute
>> contains the text equivalent of the image.
>
> I agree this matches the semantics of the alt attribute; however, I
> suspect few publishers are currently using this attribute
> appropriately, so I think we should do more research into the likely
> ramifications of such a change before making it.

This was deliberately rejected at the creation of hCard to give publishers more control.

All too often there is "garbage" (or just extra unwanted text) in alt attributes for a variety of publisher reasons.

Thus only if the publisher explicitly *wants* the text from the alt attribute do they add the respective class value to get it.
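
The rule described above can be sketched roughly as follows. This is a minimal Python illustration, assuming the behaviour is exactly as stated in this thread (inner text only, with alt used only when the class sits on the img itself); it is not the hcard-parsing specification. The names PropertyText and property_text, and the sample markup in the usage lines, are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not change nesting depth.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class PropertyText(HTMLParser):
    """Collect the text value of one class-named property (hypothetical helper)."""

    def __init__(self, prop):
        super().__init__()
        self.prop = prop
        self.depth = 0   # > 0 while inside the element carrying the class
        self.parts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self.depth:
            # Nested element inside the property: alt text is deliberately ignored.
            if tag not in VOID_TAGS:
                self.depth += 1
        elif self.prop in (attrs.get("class") or "").split():
            if tag == "img":
                # Class is on the image itself, so the publisher opted in to alt.
                self.parts.append(attrs.get("alt") or "")
            elif tag not in VOID_TAGS:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def property_text(html, prop):
    """Return the whitespace-normalised text of the property in the markup."""
    parser = PropertyText(prop)
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Assumed example markup, in the spirit of the examples discussed in the thread.
on_anchor = '<a class="fn" href="http://example.com/"><img src="foo.jpg" alt="Mike Kaply"></a>'
on_image = '<a href="http://example.com/"><img class="fn" src="foo.jpg" alt="Mike Kaply"></a>'
print(repr(property_text(on_anchor, "fn")))
print(repr(property_text(on_image, "fn")))
```

With the class on the anchor the value comes out empty; with the class on the image the alt text is used, which is the opt-in behaviour described above.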
I've added this to the hCard FAQ as well: http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up Tantek From microformats at kaply.com Mon Jul 9 13:25:16 2007 From: microformats at kaply.com (Mike Kaply) Date: Mon Jul 9 13:25:21 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: OK, let's try a different example: Michael Aaron Kaply Mike Kaply On 7/9/07, Andy Mabbett wrote: > On Mon, July 9, 2007 13:37, Mike Kaply wrote: > > On 7/8/07, Andy Mabbett wrote: > > > >> In message <46914B85.4040903@digitalbazaar.com>, Manu Sporny > >> writes > > > >>> if we have this: > > >>> Download: > >>> > >>> > >>> > > >>> How do we present this option to a human being in a non-web-page UI? > > >> The HTML is invalid, lacking the alt attribute which should fix this > >> problem. > > > Actually the alt attribute WON'T fix this problem. Because the > > microformat attribute is on the anchor tag, not the image. Microformats > > grab the text in the tag. They only grab the image alt text if the > > microformat class is on the image itself. Here's a different example: > > > > > alt="Mike Kaply"> > > > > I realize this is a little contrived, but you get the idea. > > > In this case, the fn is empty. > > My argument is that the fn should /not/ be empty; the "alt" attribute > contains the text equivalent of the image. To discount it as you suggest > is to ignore the semantics of the mark-up presented to you. 
> > -- > Andy Mabbett > ** via webmail ** > > _______________________________________________ > microformats-new mailing list > microformats-new@microformats.org > http://microformats.org/mailman/listinfo/microformats-new > From andy at pigsonthewing.org.uk Mon Jul 9 13:51:37 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Mon Jul 9 13:53:18 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: In message , Mike Kaply writes >OK, let's try a different example: > >Michael src="foo.jpg" alt="Aaron"> Kaply Under what circumstances would "Aaron" be appropriate alt text? What would the picture show? -- Andy Mabbett From chris at placenamehere.com Mon Jul 9 16:41:58 2007 From: chris at placenamehere.com (Chris Casciano) Date: Mon Jul 9 16:42:15 2007 Subject: [uf-new] hAudio implemented on Bitmunk (with one snag) In-Reply-To: References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: On Jul 9, 2007, at 4:51 PM, Andy Mabbett wrote: > In message > , Mike > Kaply writes > >> OK, let's try a different example: >> >> Michael > src="foo.jpg" alt="Aaron"> Kaply > > Under what circumstances would "Aaron" be appropriate alt text? > What would the picture show? how about some stylized first letter if styling a whole name doesn't float your boat Aichael Aaron Kaply [i hope this doesn't start turning into a discussion of image replacement methods] Another case I've run across has been one of listing of vendors or associates, some with logos some without...
  • Company A
  • Company B
So the question becomes: are the above two items functionally equivalent or are they not? And if they are functionally different, does that mean that my CMS or authoring tool or other templating logic needs to be smart enough to move the classes around to different elements depending on the data provided for the entry?

--
[ Chris Casciano ]
[ chris@placenamehere.com ]
[ http://placenamehere.com ]

From msporny at digitalbazaar.com Mon Jul 9 19:06:46 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 9 19:06:49 2007
Subject: [uf-new] img alt content
In-Reply-To: <46925782.3010006@pallas.us>
References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> <46925782.3010006@pallas.us>
Message-ID: <4692E9B6.9020905@digitalbazaar.com>

Derrick Lyndon Pallas wrote:
> Actually, I can probably be of help here, having written the Alexa Image
> Search indexer. While I can't divulge too much about what goes into
> building the index, I'll see if I can find some time to take a look at
> the usage of img/@alt inside hcard/fn some time this week. Is there
> anything specific anyone would like me to look for? ~D

Derrick, thanks for offering to help. It would be a great help if you could give us the stats on the following:

How often is ONLY an image used as the target of a link in hAtom/hReview/hCard/hCalendar? In other words, how often does this happen:

Foo

How often is 'alt' defined for those images?
How often is 'title' defined for those images?
Does the alt/title usually match what the image is depicting?
-- manu From msporny at digitalbazaar.com Mon Jul 9 19:30:38 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Mon Jul 9 19:30:41 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag)) In-Reply-To: <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> Message-ID: <4692EF4E.60709@digitalbazaar.com> Andy Mabbett wrote: > My argument is that the fn should /not/ be empty; the "alt" attribute > contains the text equivalent of the image. To discount it as you suggest > is to ignore the semantics of the mark-up presented to you. I believe Andy and Scott are referring to Section 13.2 of the HTML 4.01 specification: http://www.w3.org/TR/html4/struct/objects.html#h-13.2 alt %Text; #REQUIRED -- short description -- and Section 13.8 of the HTML 4.01 specification: http://www.w3.org/TR/html4/struct/objects.html#h-13.8 Specifically, the sections that state the following concerning alternate text for images: * Do not specify irrelevant alternate text when including images intended to format a page, for instance, alt="red ball" would be inappropriate for an image that adds a red ball for decorating a heading or paragraph. In such cases, the alternate text should be the empty string (""). Authors are in any case advised to avoid using images to format pages; style sheets should be used instead. * Do not specify meaningless alternate text (e.g., "dummy text"). Not only will this frustrate users, it will slow down user agents that must convert text to speech or braille output. I think Andy and Scott have the correct approach to this problem. All one must do is view the following in a text-based browser, such as Lynx... or in Firefox/Opera/etc and the answer becomes much clearer: Test of link with image with alt text Here's a link with an image with alt test: Microformat It! 
The text that is displayed as a link in Lynx is: "Microformat It!"
The text that is displayed as a link in Firefox is: "Microformat It!"

Mike, would it be possible to write a parseTagTextFromImages() function that would extract the 'alt' text from images? Therefore, running it over the following HTML:

Michael Kaply

Would yield the text "Michael Kaply" for 'fn'. Using this approach would also solve the hAudio problem as well as the problems that have been raised thus far in this thread.

-- manu

From tantek at cs.stanford.edu Mon Jul 9 19:40:52 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 19:40:52 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To: <4692EF4E.60709@digitalbazaar.com>
Message-ID:

On 7/9/07 7:30 PM, "Manu Sporny" wrote:

> Mike, would it be possible to write a parseTagTextFromImages() function
> that would extract the 'alt' text from images? Therefore, running it
> over the following HTML:
>
>
> Michael Kaply
>
>
> Would yield the text "Michael Kaply" for 'fn'. Using this approach would
> also solve the hAudio problem as well as the problems that have been
> raised thus far in this thread.

This would be non-compliant with hCard parsing and thus should be AVOIDED.

http://microformats.org/wiki/hcard-parsing

See the recent FAQ for more details.

http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up

Thanks,

Tantek

From msporny at digitalbazaar.com Mon Jul 9 19:51:10 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Mon Jul 9 19:51:15 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To:
References:
Message-ID: <4692F41E.2010503@digitalbazaar.com>

Tantek Çelik wrote:
> This was deliberately rejected at the creation of hCard to give publishers
> more control.
>
> All too often there is "garbage" (or just extra unwanted text) in alt
> attributes for a variety of publisher reasons.
Doesn't doing this go against the HTML 4.01 specification? You aren't supposed to have anything in the 'alt' attribute of the image tag that isn't pertinent:

http://www.w3.org/TR/html4/struct/objects.html#h-13.8

> I've added this to the hCard FAQ as well:
>
> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up

The above link states:

"In addition all too often there is "garbage" (or just extra unwanted text) in alt attributes for a variety of publisher reasons, and that extraneous text would pollute otherwise clean property values in numerous existing sites."

I couldn't find a reference to the analysis that led to this conclusion. What constitutes "garbage"? What reasons would a publisher have to do this? If they're doing this, aren't they quite blatantly violating the HTML 4.01 and XHTML 1.0 specifications?

The link stated above also says:

"Finally, it is simpler and more predictable for publishers if they know that for images and other such URL related elements (a, object, etc.) that whether they are specifying a URL property (like "email", "photo", "url", etc.) or a text property (like "fn", "nickname", etc.) in either case directly specifying the property on the element is the way to do it."

If we were to adopt this approach, I don't see how we could ever get the following chunk of HTML working for hAudio:

Sample Sneaking Sally

Sample, as defined by hAudio is:

rel-sample. optional. sample file/stream using rel-design-pattern with 'sample' as the mf-rel-value.

Rel-patterns are only available on links... thus the "move the URL property such that it is specified directly" approach doesn't work for any Microformat that uses the rel-design-pattern.
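
For reference, the title-based fallback proposed at the start of this thread (use the link's title text when a rel link has no internal text) might look something like the Python sketch below. This is only an illustration under stated assumptions: it is not the Operator patch, the names RelLinkLabels and rel_link_labels are hypothetical, and the sample markup and example.com URLs are invented for the usage lines.

```python
from html.parser import HTMLParser


class RelLinkLabels(HTMLParser):
    """Collect a display label for each <a> carrying a given rel value (hypothetical helper)."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.labels = []
        self._text = None   # list of text chunks while inside a matching link
        self._title = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and self.rel in (attrs.get("rel") or "").split():
            self._text = []
            self._title = (attrs.get("title") or "").strip()

    def handle_data(self, data):
        if self._text is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._text is not None:
            inner = " ".join("".join(self._text).split())
            # Prefer the link's own text; fall back to title only when it is empty.
            self.labels.append(inner or self._title)
            self._text = None


def rel_link_labels(html, rel):
    """Return one label per link with the given rel value, in document order."""
    parser = RelLinkLabels(rel)
    parser.feed(html)
    parser.close()
    return parser.labels


# Assumed markup: an image-only sample link with a title, and a plain text link.
html = (
    '<a rel="sample" title="Sample Sneaking Sally" '
    'href="http://example.com/sample.mp3"><img src="play.png"></a>'
    '<a rel="sample" href="http://example.com/other.mp3">Listen</a>'
)
print(rel_link_labels(html, "sample"))
```

The sketch leaves the label empty when a link has neither text nor a title; whether image alt text should be consulted at that point is the question this thread is debating, so it is intentionally left out.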
-- manu

From tantek at cs.stanford.edu Mon Jul 9 20:11:11 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Mon Jul 9 20:11:11 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To: <4692F41E.2010503@digitalbazaar.com>
Message-ID:

On 7/9/07 7:51 PM, "Manu Sporny" wrote:

> Tantek Çelik wrote:
>> This was deliberately rejected at the creation of hCard to give publishers
>> more control.
>>
>> All too often there is "garbage" (or just extra unwanted text) in alt
>> attributes for a variety of publisher reasons.
>
> Doesn't doing this go against the HTML 4.01 specification? You aren't
> supposed to have anything in the 'alt' attribute of the image tag that
> isn't pertinent:
>
> http://www.w3.org/TR/html4/struct/objects.html#h-13.8

Yes, many publishers go against many aspects of the HTML 4.01 specification, not least by publishing invalid content.

>> I've added this to the hCard FAQ as well:
>>
>> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up
>
> The above link states:
>
> "In addition all too often there is "garbage" (or just extra unwanted
> text) in alt attributes for a variety of publisher reasons, and that
> extraneous text would pollute otherwise clean property values in
> numerous existing sites."
>
> I couldn't find a reference to the analysis that led to this
> conclusion?

We didn't capture it at the time unfortunately, and we're being more thorough now. We did actually try it the other way first (including all nested "alternative" content) and found it worked worse across a variety of existing real world sites, not just 1-2 examples but LOTS.

> What constitutes "garbage"?

In this case things like duplicated text, text for chrome/UI etc.

> What reasons would a publisher
> have to do this? If they're doing this, aren't they quite blatantly
> violating the HTML 4.01 and XHTML 1.0 specification?

Not necessarily.
> The link stated above also says: > > "Finally, it is simpler and more predictable for publishers if they know > that for images and other such URL related elements (a, object, etc.) > that whether they are specifying a URL property (like "email", "photo", > "url", etc.) or a text property (like "fn", "nickname", etc.) in either > case directly specifying the property on the element is the way to do it." > > If we were to adopt this approach, I don't see how we could ever get the > following chunk of HTML working for hAudio: > > > Sample Sneaking Sally > > > Sample, as defined by hAudio is: > > rel-sample. optional. sample file/stream using rel-design-pattern with > 'sample' as the mf-rel-value. > > Rel-patterns are only available on links... thus the "move the URL > property such that it is specified directly" approach doesn't work for > any Microformat that uses the rel-design-pattern. rel does not apply to therefore this is not a problem. Tantek From scott at makedatamakesense.com Mon Jul 9 20:26:39 2007 From: scott at makedatamakesense.com (Scott Reynen) Date: Mon Jul 9 20:26:51 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag)) In-Reply-To: <4692EF4E.60709@digitalbazaar.com> References: <46914B85.4040903@digitalbazaar.com> <18788.80.86.36.97.1183990452.squirrel@www.gradwell.com> <4692EF4E.60709@digitalbazaar.com> Message-ID: On Jul 9, 2007, at 8:30 PM, Manu Sporny wrote: > I think Andy and Scott have the correct approach to this problem. All > one must do is view the following in a text-based browser, such as > Lynx... or in Firefox/Opera/etc and the answer becomes much clearer: Um, that's not really my approach to this problem at all. I suggested more research was required before making any changes, not more hypothetical markup to support a predetermined conclusion. 
And I suggested more research because I suspect that "red ball" section was included in the HTML spec specifically as a result of many publishers using such alt values, which aren't really content. I prefer to follow the semantics defined in the spec, but I do not think we should do that with complete disregard to how people actually use HTML.

--
Scott Reynen
MakeDataMakeSense.com

From andy at pigsonthewing.org.uk Tue Jul 10 00:10:43 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Tue Jul 10 00:11:58 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To:
References: <4692EF4E.60709@digitalbazaar.com>
Message-ID:

In message , Tantek Çelik writes
>On 7/9/07 7:30 PM, "Manu Sporny" wrote:
>
>> Mike, would it be possible to write a parseTagTextFromImages() function
>> that would extract the 'alt' text from images? Therefore, running it
>> over the following HTML:
>>
>>
>> Michael Kaply
>>
>>
>> Would yield the text "Michael Kaply" for 'fn'. Using this approach would
>> also solve the hAudio problem as well as the problems that have been
>> raised thus far in this thread.
>
>This would be non-compliant with hCard parsing and thus should be
>AVOIDED.
>
> http://microformats.org/wiki/hcard-parsing

In other words, the microformat parsing rules are non-compliant with the HTML specification.

I think that's something which should be fixed.

--
Andy Mabbett

From andy at pigsonthewing.org.uk Tue Jul 10 13:16:09 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Tue Jul 10 13:17:44 2007
Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag))
In-Reply-To:
References: <4692F41E.2010503@digitalbazaar.com>
Message-ID:

In message , Tantek Çelik writes
>On 7/9/07 7:51 PM, "Manu Sporny" wrote:
>
>> Tantek Çelik wrote:
>>> This was deliberately rejected at the creation of hCard to give publishers
>>> more control.
>>> >>> All too often there is "garbage" (or just extra unwanted text) in alt >>> attributes for a variety of publisher reasons. >> >> Doesn't doing this go against the HTML 4.01 specification? You aren't >> supposed to have anything in the 'alt' attribute of the image tag that >> isn't pertinent: >> >> http://www.w3.org/TR/html4/struct/objects.html#h-13.8 > >Many publishers go against many aspects of the HTML 4.01 specification >yes, not in the least by publishing invalid content. Is the best way to encourage "POSH" to adhere to standards, or to pander to those who break them? >>> I've added this to the hCard FAQ as well: >>> >>> http://microformats.org/wiki/hcard-faq#Why_is_IMG_alt_not_being_picked_up >> >> The above link states: >> >> "In addition all too often there is "garbage" (or just extra unwanted >> text) in alt attributes for a variety of publisher reasons, and that >> extraneous text would pollute otherwise clean property values in >> numerous existing sites." >> >> I couldn't find a reference to the analysis that lead to this >> conclusion? > >We didn't capture it at the time unfortunately, and we're being more >thorough now. We did actually try it the other way first (including >all nested "alternative" content) and found it worked worse across a >variety of existing real world sites, not just 1-2 examples but LOTS. It is indeed unfortunate that such evidence hasn't been captured; especially given your strong advocacy of evidence-based working and a "scientific" process. Someone cynical might think it hypocritical of you to then assert something without providing evidence for it. Perhaps it would be a good idea if you could provide at least a minimum amount of such evidence; preferably with URLs; per: Use real world examples People often invent completely fictitious (and theoretical) examples in order to try to make a point they are trying to make. Microformats themselves are based on studying real world examples and designing for real world examples. 
Thus arguments based on theoretical examples hold much less weight in microformats discussions and are apt to be ignored. Please avoid posting arguments / questions based solely on theoretical examples. Ask for real world examples If someone discusses or provides arguments based on theoretical examples, ask them to provide a real world example and point them to the above guideline. Use URLs to examples Please provide URLs to real world examples when possible. This helps to validate that such examples truly are "real world" as they are on the public Web, and provides additional context around the example which might be crucial to understanding it or answering questions about it. Ask for URLs to examples When people do not provide a specific URL to a test case or example, then especially as a developer, PLEASE ask them to provide a specific URL (and cite the previous guideline) rather than attempting to work out how an inline snippet of code might work. (which I believe you wrote) to forestall such criticism? >> What constitutes "garbage"? > >In this case things like duplicated text, text for chrome/UI etc. > >> What reasons would a publisher >> have to do this? If they're doing this, aren't they quite blatantly >> violating the HTML 4.01 and XHTML 1.0 specification? > >Not necessarily. Can you give a real world example of someone publishing such "garbage" alt text, pertinent to microformats (and again with URLs as above), which does not violate the HTML specs, please? -- Andy Mabbett From paul_wilkins at xtra.co.nz Tue Jul 10 14:52:26 2007 From: paul_wilkins at xtra.co.nz (Paul Wilkins) Date: Tue Jul 10 14:52:35 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with onesnag)) References: <4692F41E.2010503@digitalbazaar.com> Message-ID: <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> From: "Andy Mabbett" >>> What reasons would a publisher >>> have to do this? 
[garbage in alt attributes] >>> If they're doing this, aren't they quite blatantly >>> violating the HTML 4.01 and XHTML 1.0 specification? >> >>Not necessarily. > > Can you give a real world example of someone publishing such "garbage" > alt text, pertinent to microformats (and again with URLs as above), > which does not violate the HTML specs, please? I can. Our website uses feature pages for our cleints to help improve their visibility to the general public through search engines. One of the ways of doing this is to load the page with specific keywords and phrases for our clients. Images for example would have "Copyright CityLife Auckland. Suite at our Auckland hotel accommodation" A google search for "auckland hotel accommodation" results in their feature page being the third result. http://www.google.co.nz/search?q=auckland+hotel+accommodation&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a In terms of the page and ensuring high visibility, this is the right thing to do, but in terms of microformats and providing the information that's required, using this alt information is the wrong thing to do. As far as my boss is concerned, microformats are a tiny blip on our radar and are not worth his time. I believe that he is wrong there, and am steadily massaging our information so that microformats can be applied as easily as possible when the time comes. However, as a business we have a commitment to our clients to provide them the best results that we can. When the time comes, microformats will need to take such issues into account before we apply them, because they must not reduce the effectiveness of our results. Our alt tags will contain whatever they must to maintain their high search engine placements and anything that interferes with that will get fallen by the wayside. 
-- Paul Wilkins From scott at makedatamakesense.com Tue Jul 10 15:10:14 2007 From: scott at makedatamakesense.com (Scott Reynen) Date: Tue Jul 10 15:10:29 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag)) In-Reply-To: References: <4692F41E.2010503@digitalbazaar.com> Message-ID: <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com> On Jul 10, 2007, at 2:16 PM, Andy Mabbett wrote: >> Many publishers go against many aspects of the HTML 4.01 >> specification >> yes, not in the least by publishing invalid content. > > Is the best way to encourage "POSH" to adhere to standards, or to > pander > to those who break them? If we had more control of web publishing, I would support pushing complete adherence to HTML specs. But we don't, so we have to balance de jure standards with de facto standards. We treat all alt values as content at the risk of discouraging publishers who use alt values for non-content from using microformats. Whether or not that's a worthwhile trade-off depends on how many publishers we're talking about. > Perhaps it > would be a good idea if you could provide at least a minimum amount of > such evidence; preferably with URLs Indeed, we should collect more examples of how alt is used in practice, because that's a very important factor in deciding how we should treat them. But if we're just collecting such examples with an eye toward supporting pre-determined conclusions, there's really no point. > Can you give a real world example of someone publishing such "garbage" > alt text, pertinent to microformats (and again with URLs as above), > which does not violate the HTML specs, please? While the HTML specs are a very important consideration, they are not the only consideration. While encouraging adherence to HTML, we need to recognize that such adherence is quite rare in practice. How many of us even have perfectly valid websites? 
Complete adherence to HTML is simply not a practical criteria to apply without concession on today's web. We should push it where we can and choose those battles carefully. -- Scott Reynen MakeDataMakeSense.com From andy at pigsonthewing.org.uk Wed Jul 11 01:07:26 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Wed Jul 11 01:09:00 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with onesnag)) In-Reply-To: <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> Message-ID: In message <003501c7c33c$9b2e10d0$bc08a8c0@nzto22>, Paul Wilkins writes >> Can you give a real world example of someone publishing such "garbage" >> alt text, pertinent to microformats (and again with URLs as above), >> which does not violate the HTML specs, please? > >I can. > >Our website uses feature pages for our cleints to help improve their >visibility to the general public through search engines. One of the >ways of doing this is to load the page with specific keywords and >phrases for our clients. > >Images for example would have "Copyright CityLife Auckland. Suite at >our Auckland hotel accommodation" Unless that's the graphical content of the image, which seems unlikely, that's an abuse of the alt attribute; such text should be in the title attribute. It *does* violate the HTML specs. And how is it "pertinent to microformats"? >as a business we have a commitment to our clients to provide them the >best results that we can. What about their responsibility to their customers, some of whom will have a visual disability? 
-- Andy Mabbett From andy at pigsonthewing.org.uk Wed Jul 11 01:12:56 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Wed Jul 11 01:17:06 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with one snag)) In-Reply-To: <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com> References: <4692F41E.2010503@digitalbazaar.com> <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com> Message-ID: In message <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com>, Scott Reynen writes >> Can you give a real world example of someone publishing such "garbage" >> alt text, pertinent to microformats (and again with URLs as above), >> which does not violate the HTML specs, please? > >While the HTML specs are a very important consideration, they are not >the only consideration. While encouraging adherence to HTML, we need >to recognize that such adherence is quite rare in practice. How many >of us even have perfectly valid websites? Complete adherence to HTML >is simply not a practical criteria to apply without concession on >today's web. We should push it where we can and choose those battles >carefully. If that's true - which I dispute - then who's going to re-write: The first rule of POSH is that you must validate your POSH. accordingly? -- Andy Mabbett From msporny at digitalbazaar.com Wed Jul 11 07:15:10 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Wed Jul 11 07:15:14 2007 Subject: [uf-new] img alt content (was:hAudio implemented on Bitmunk (with onesnag)) In-Reply-To: References: <4692F41E.2010503@digitalbazaar.com> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> Message-ID: <4694E5EE.2060902@digitalbazaar.com> Andy Mabbett wrote: >> Images for example would have "Copyright CityLife Auckland. Suite at >> our Auckland hotel accommodation" > Unless that's the graphical content of the image, which seems > unlikely, that's an abuse of the alt attribute; such text should be in > the title attribute. 
It *does* violate the HTML specs. And how is it > "pertinent to microformats"?

There seem to be two parts to this discussion:

1. HTML specification violation (misuse of the 'alt' attribute)
2. How the 'alt' attribute is being used in the real world

Andy's line of reasoning is sound regarding the HTML specification violation. I don't think that anybody can dispute that putting text that does not match the graphical content of an image into its 'alt' attribute goes against the HTML specification.

The second part is how the 'alt' attribute is being used in the real world. Tantek has asserted that 'alt' is being misused on a wide scale on the Interwebs. As Scott has pointed out, the only way to know this is to start gathering real data. I am in the process of writing an image crawler (which will hopefully be done by tonight) to gather these statistics.

The crawler will crawl the web for img tags and gather statistics regarding:

- How many img tags have an 'alt' attribute specified.
- How many img tags have a 'title' attribute specified.
- How many img tags have both specified.
- Whether or not the 'alt' text matches the image being displayed (I'll set up a website for all of us to help in analyzing this data)

I'm assuming 125,626,329,000 unique images on the web (125,626,329 unique sites on the web x 1000 unique images per site). Statistically, I think we would only need around 385 unique site samples to get a 95% confidence level with a 5% margin of error (somebody correct me if this is wrong). To be safe, I'll collect 100,000 unique image tags, 1 per site, to get our initial sample set.

Any objections to this method of data collection?
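
As a rough check on the "around 385" figure above, here is a minimal sketch. It assumes Cochran's sample-size formula with the worst-case proportion p = 0.5, z = 1.96 for 95% confidence, and a finite-population correction; the function name sample_size and its defaults are illustrative and are not part of the crawler being described.

```python
import math


def sample_size(population, margin=0.05, z=1.96, p=0.5):
    """Estimate the number of samples needed for a given margin of error.

    Assumptions (not from the thread): Cochran's formula, worst-case
    proportion p = 0.5, z = 1.96 for a 95% confidence level.
    """
    # Sample size for an effectively infinite population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    # Finite-population correction; negligible when the population is huge.
    n = n0 / (1 + (n0 - 1) / population)
    # Round up, since a fraction of a sample cannot be collected.
    return math.ceil(n)


# Site count quoted in the message above.
print(sample_size(125_626_329))  # 385
```

With a population this large the correction barely matters, so the answer is driven almost entirely by the chosen confidence level and margin of error rather than by the number of sites.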
-- manu From scott at makedatamakesense.com Wed Jul 11 07:49:25 2007 From: scott at makedatamakesense.com (Scott Reynen) Date: Wed Jul 11 07:49:42 2007 Subject: [uf-new] img alt content In-Reply-To: References: <4692F41E.2010503@digitalbazaar.com> <531BF2FB-496D-4999-B0D0-2F56471684C6@makedatamakesense.com> Message-ID: On Jul 11, 2007, at 2:12 AM, Andy Mabbett wrote: >> Complete adherence to HTML >> is simply not a practical criteria to apply without concession on >> today's web. > > If that's true - which I dispute - then who's going to re-write: > > > > The first rule of POSH is that you must validate your POSH. > > accordingly? Validation and adherence to the HTML spec are not exactly the same thing. All spec-adherent websites are valid, but not all valid sites are spec-adherent. So full adherence to the spec is more work to ask of publishers than simple validation. Ironically, I think the HTML validator actually encourages poor use of the alt attribute because it returns an error on missing alt attributes, but doesn't make any mention that alt should be empty for non-content images. So publishers who leave out alt on non-content images see this error and end up adding alt attributes with exactly the kind of "red ball" values the HTML spec discourages. I completely agree such publishers should be encouraged to stop doing this; I just doubt whether such encouragement should come from the microformats community. I see our goal as a bit more specific than general encouragement of better HTML: making better HTML publishing more appealing by establishing practical benefits. And I think the best way to do this is to focus on areas where better HTML results in maximum practical benefits with minimum cost to publishers. 
In this case specifically, I suspect the best way to accomplish that goal would not be to encourage everyone publishing non-content alt attributes to change, but rather to encourage everyone publishing content in alt attributes to insert such content as more accessible text, and use style sheets to apply more stylized images, which I think is what Ben was suggesting (see [1]). This solution, I think, makes better HTML more useful without making microformats any more difficult to publish for those who aren't up to spec. [1] http://www.stopdesign.com/articles/replace_text/ -- Scott Reynen MakeDataMakeSense.com
From regine at regine-heidorn.de Thu Jul 12 04:50:01 2007 From: regine at regine-heidorn.de (Regine Heidorn) Date: Thu Jul 12 04:47:57 2007 Subject: [uf-new] MicroFormats for (Music-)TopLists, htop-list? Message-ID: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de> Hi All, first let me introduce myself: I'm a 35-year-old Webdeveloper living in Berlin, Germany. I care a lot about semantic stuff and CSS, so I feel kind of addicted to that MicroFormat-Thing. I didn't use it a lot up to now though, but there is an idea I would like to materialize. In the last months I cared a lot about Open Music, Music published under cc-licence and netlabels offering that stuff. Since I'm running a blog I got the idea of creating a top-5- list of the tracks I like best from the different labels. To make the thing more communicating I thought it funny to give the netlabels the opportunity to grab my top-lists including the resulting ratings.
If more bloggers or community-sites feel like doing this, the labels could establish something like the blogosphere-web2.0-community- BillBoards. So I contacted one of the netlabels, they're quite interested in this idea, so I thought of how to form this idea in the most simple and effective way and stumbled across the microformat-thing. I looked around the blog and the wiki to find out if something would match my idea and what I can see for the moment is the vision of a top-list microformat (for audio), including the haudio-Format and the hreview- format to form something like this:
    1. rechazamos el ahora Christian Dittmann emporio thinner thn 092 cc-by-nc-nd
My thoughts up to this point are:

- Did I use the microformats right, did I get the idea properly?
- Since it's a top-list the use of <ol> is the semantically correct way, so to establish the format it should be convention to start with something like <ol class="htoplist">?
- Is the licence correctly included with an <a rel>?
- How can the rating be included? Does it make sense to work with hreview here?
- A top-list can be seen as a review, might it be better to straighten the whole thing out to the hreview thing? So it could also be used for top-lists not regarding audio-tracks. But it also seems logical to use haudio if it's audio-material. So especially for audio-top-lists would it be OK to make that clear by the use of <li class="haudio">?
- If a top-list is also a review: should it be extended by the hreview-possibilities? Like for example if I write a review about a track like maybe on my blog or somewhere else, it would be semantically interesting to paste this information together with my top-list.

Second is: how can this information be collected by, let's say, the netlabels? As of now I have the idea of a PHP script writing the data to a database and thus creating the netlabel-toplist consisting of the data from participating blogs, community-sites and whatever.

Lots of thoughts, hope one can follow at least ;-)

Greez, Regine Heidorn
From hkraft at gmail.com Fri Jul 13 02:46:05 2007 From: hkraft at gmail.com (Henrik Kraft) Date: Fri Jul 13 02:46:09 2007 Subject: [uf-new] Microformat for article/document? Message-ID: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> Hello I've been looking but can't seem to find a mf for a document. I think it should contain something like,
        Header

        Text

        Bodytext

        Does this makes sense to anyone else or have I misunderstood what the mf should do? /Henrik From davidjanes at blogmatrix.com Fri Jul 13 03:14:19 2007 From: davidjanes at blogmatrix.com (David Janes) Date: Fri Jul 13 03:14:23 2007 Subject: [uf-new] Microformat for article/document? In-Reply-To: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> Message-ID: <21e523c20707130314h749c8b19ue782598eb98da9a9@mail.gmail.com> On 7/13/07, Henrik Kraft wrote: > Ive been looking but cant seem to find a mf for a document. > > I think it should contain something like, > >
        Header >

        Text

        >

        Bodytext

        >
        H1 and P are pretty good in and of themselves. Another level of granularity can be provided by hAtom, using respectively hentry, entry-title, summary & content. -- David Janes Founder, BlogMatrix http://www.blogmatrix.com http://blogmatrix.blogmatrix.com From supercanadian at gmail.com Fri Jul 13 13:42:27 2007 From: supercanadian at gmail.com (Charles Iliya Krempeaux) Date: Fri Jul 13 13:42:32 2007 Subject: [uf-new] Microformat for article/document? In-Reply-To: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> Message-ID: <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com> Hello Henrik, On 7/13/07, Henrik Kraft wrote: > Hello > Ive been looking but cant seem to find a mf for a document. > > I think it should contain something like, > >
        Header >

        Text

        >

        Bodytext

        >
Perhaps I'm missing the point, but... isn't <html> considered to be a document. And thus <html> is your "article". <title> is your "header". And you can include some class names on various elements inside of <body> for your "preamble" and "bodytext". See ya -- Charles Iliya Krempeaux, B.Sc. <http://ChangeLog.ca/> All the Vlogging News on One Page http://vlograzor.com/ From andy at pigsonthewing.org.uk Fri Jul 13 16:54:00 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Fri Jul 13 16:55:26 2007 Subject: [uf-new] Microformat for article/document? In-Reply-To: <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com> References: <68005cb10707130246tfc2929di22fc429950244ea8@mail.gmail.com> <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com> Message-ID: <$tZ1COMYCBmGFwHd@pigsonthewing.org.uk> In message <84ce626f0707131342q1e8068fbn270db513b9fc5cbf@mail.gmail.com>, Charles Iliya Krempeaux <supercanadian@gmail.com> writes >On 7/13/07, Henrik Kraft <hkraft@gmail.com> wrote: >> Ive been looking but cant seem to find a mf for a document. >Perhaps I'm missing the point, but... isn't <html> considered to be a document. > >And thus <html> is your "article". <title> is your "header". And you >can include some class names on various elements inside of <body> for >your "preamble" and "bodytext". That's one way of looking at it; but in: <http://www.westmidlandbirdclub.com/biblio/SuAScene/2-15.htm> for example, the 2006 article (i.e. the whole page) contains and describes a 1948 article. Then again, the latter, and the original enquirer's document, could, perhaps, be wrapped in a "citation" microformat. -- Andy Mabbett From msporny at digitalbazaar.com Sat Jul 14 08:18:08 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Sat Jul 14 08:18:12 2007 Subject: [uf-new] MicroFormats for (Music-)TopLists, htop-list?
In-Reply-To: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de> References: <946BF4B8-E31D-48D3-9A73-CD4E4EBFC89C@regine-heidorn.de> Message-ID: <4698E930.1000502@digitalbazaar.com> Regine Heidorn wrote: > - Did I use the microformats right, did I get the idea properly? Yes, you seem to have grasped and implemented the concept and markup of hAudio correctly. Nicely done :) > - Is the licence correctly included with an <a rel>? Yes, it is. > - How can the rating be included? Does it make sense to work with > hreview here? hAudio was intended to be embedded in hReview. So yes, you could put an hAudio in hReview and add rating to it that way. Keep in mind that I don't know of anybody that has implemented hAudio + hReview, so it would be good for the list to see an example and figure out if it works. > - A top-list can be seen as a review, might it be better to straighten > the whole thing out to the hreview thing? So it could also be used for > top-lists not regarding audio-tracks. But it also seems logical to use > haudio if it's audio-material. So especially for audio-top-lists would > it be OK to make that clear by the use of <li class="haudio">? This isn't that clear... we would have to understand what you mean by "top-list". You would have to differentiate the following from each other (here's a hint: Some of them could be viewed as very similar to one another): - "top-list" - "playlist" - hreview of a playlist - hreview of an audio collection or audio album Another way of looking at it is: How is a top-list any different from a playlist? > - If a top-list is also a review: should it be extended by the > hreview-possibilities? Like for example if I write a review about a > track like maybe on my blog or somewhere else, it would be semantically > interesting to paste this information together with my top-list. You will have to elaborate on this as your idea could be interpreted in a number of different ways. 
> Second is: how can those informations be collected by let's say the > netlabels? I have the idea of a php-script writing the data in > database and thus creating the netlabel-toplist consisting of the data > from participating Blogs, community-sites and whatever. You're talking about crawling, indexing and website implementation. While this community is interested in this stuff... they are implementation details that don't really have much to do with creating the Microformat you are talking about. > Since it's a top-list the use of <ol> is the semantic correct way, so > to establish the format it should be convention to start with > something like <ol class="htoplist">? Perhaps. What we really need to find out is how prevalent "top-lists" are on the Internet. I think everybody on here will agree that they do exist, but it will be your job to gather data to prove that they do exist. This is one of the first steps in the Microformats process - demonstrate, using hard data, that your problem exists. You will need to answer the following questions: - What problem is 'top-list' attempting to address? - How many sites have "top-list" type information? - What kind of information should be placed in a top-list? - Is there any way that we can combine haudio + hreview to solve the problem? -- manu From scott at makedatamakesense.com Sat Jul 14 08:19:37 2007 From: scott at makedatamakesense.com (Scott Reynen) Date: Sat Jul 14 08:19:55 2007 Subject: [uf-new] Fwd: [uf-discuss] Error messages References: <4698B342.90204@prodromou.name> Message-ID: <6922977F-26B9-4878-AB46-C3B006D0F1FA@makedatamakesense.com> Begin forwarded message: > From: Evan Prodromou <evan@prodromou.name> > Date: July 14, 2007 5:28:02 AM MDT > To: microformats-discuss@microformats.org > Subject: [uf-discuss] Error messages > Reply-To: Microformats Discuss <microformats-discuss@microformats.org> > > One of the most common HTML patterns in Web applications is error > messages. 
We see them all the time on the Web: login errors, form > validation errors, backend errors and user input errors. But what if > this common pattern was standardized? > > If HTML error messages all followed a similar format, we could have > browser plugins that recorded and analyzed the errors that come up. > They > could either feed back this structured error data when we needed it -- > say, when filing a bug report or talking to a tech support rep -- > or use > the error data to help us find workarounds or documentation online. > > I brought the idea up in the #microformats channel on Freenode, and it > got a good response, so I took the next step and created a list of > examples and a brainstorming page on the microformats wiki. > > http://microformats.org/wiki/error-message-examples > http://microformats.org/wiki/error-message-brainstorming > > I'd greatly appreciate the help of people on this list in collecting > error messages from the wild, and hopefully in building up a draft > microformat. > > ~Evan > > -- > Evan Prodromou - evan@prodromou.name - http://evan.prodromou.name/ -- Scott Reynen MakeDataMakeSense.com From msporny at digitalbazaar.com Sat Jul 14 09:29:37 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Sat Jul 14 09:29:40 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <4694E5EE.2060902@digitalbazaar.com> References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> Message-ID: <4698F9F1.1060409@digitalbazaar.com> Manu Sporny wrote: > As Scott has pointed out, the only way to know this is to start > gathering real data. I am in the process of writing an image crawler > (which will hopefully be done by tonight) to gather these statistics. 
The first run of the img tag analysis has been completed, here are the results:

Total websites crawled : 14077
Total img tags analyzed: 224671

The percentages below are the percentages of img tags that contained non-empty attributes:

src: 99%
height: 66%
width: 66%
alt: 41%
title: 5%
id: 4%

In general, only 41% of 'img' tags list non-empty 'alt' attributes. In other words - most websites are not using 'alt' attributes for 'img' tags.

The next step of the analysis process will examine how the sites that ARE using 'alt' tags use them.

-- manu
From andy at pigsonthewing.org.uk Sat Jul 14 12:05:15 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Sat Jul 14 12:06:40 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <4698F9F1.1060409@digitalbazaar.com> References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> Message-ID: <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk>

In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny <msporny@digitalbazaar.com> writes

>The percentages below are the percentages of img tags that contained
>non-empty attributes:
>
>src: 99%
>height: 66%
>width: 66%
>alt: 41%
>title: 5%
>id: 4%
>
>In general, only 41% of 'img' tags list non-empty 'alt' attributes. In
>other words - most websites are not using 'alt' attributes for 'img'
>tags.

That's a bogus conclusion - empty "alt" attributes are perfectly valid, and are appropriate in many cases; and you're counting tags but making conclusions about "most websites".
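
The distinction drawn here (an empty 'alt' is not the same as a missing one) can be kept visible in the counts themselves. The following is a minimal sketch using Python's standard html.parser module; the names AltTally and tally_alt and the sample markup are hypothetical and do not come from the crawler discussed in this thread.

```python
from collections import Counter
from html.parser import HTMLParser


class AltTally(HTMLParser):
    """Count img tags by the state of their alt attribute."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag != "img":
            return
        attr_map = dict(attrs)
        if "alt" not in attr_map:
            self.counts["missing"] += 1
        elif not (attr_map["alt"] or "").strip():
            # alt="" or a valueless alt: valid for non-content images.
            self.counts["empty"] += 1
        else:
            self.counts["non-empty"] += 1


def tally_alt(html):
    """Return a dict of alt-attribute states for the img tags in html."""
    parser = AltTally()
    parser.feed(html)
    parser.close()
    return dict(parser.counts)


# Hypothetical sample markup covering the three cases.
sample = '<img src="a.png"><img src="b.png" alt=""><img src="c.png" alt="Logo"/>'
print(tally_alt(sample))
```

A tally like this still counts tags, not sites, so any per-site conclusion would need the counts grouped by the site each page came from.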
-- Andy Mabbett From msporny at digitalbazaar.com Sat Jul 14 12:36:02 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Sat Jul 14 12:36:05 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> Message-ID: <469925A2.3080902@digitalbazaar.com> Andy Mabbett wrote: > In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny > <msporny@digitalbazaar.com> writes > >> The percentages below are the percentages of img tags that contained >> non-empty attributes: >> >> src: 99% >> height: 66% >> width: 66% >> alt: 41% >> title: 5% >> id: 4% >> >> In general, only 41% of 'img' tags list non-empty 'alt' attributes. In >> other words - most websites are not using 'alt' attributes for 'img' >> tags. > > That's a bogus conclusion - empty "alt" attributes are perfectly valid, > and are appropriate in many cases; and you're counting tags but making > conclusions about "most websites". I agree with you, Andy... it seems my statement wasn't clear. Perhaps it should have read: "In other words - most websites are using empty 'alt' attributes." or "59% of most websites are complying with the HTML 4.01 specification regarding usage of 'alt' with image tags." I used the terminology "most websites" because the data gathered is, statistically speaking, overkill. Assuming 125,626,329 websites (per Netcraft) we would need a sample set of 384 websites to get a 95% confidence level with an interval of 5%. So, we needed 384 samples - we got 224,671 across 14,077 websites. If you want to sift through the data yourself, I'll have it up tomorrow. 
I'll also be providing all of the source code to crawl, index and analyze the data. -- manu From derrick at pallas.us Sat Jul 14 13:42:18 2007 From: derrick at pallas.us (Derrick Lyndon Pallas) Date: Sat Jul 14 13:42:17 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <469925A2.3080902@digitalbazaar.com> References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> <469925A2.3080902@digitalbazaar.com> Message-ID: <4699352A.4090204@pallas.us> Manu Sporny wrote: > "59% of most websites are complying with the HTML 4.01 specification > regarding usage of 'alt' with image tags." > > I used the terminology "most websites" because the data gathered is, > statistically speaking, overkill. Assuming 125,626,329 websites (per > Netcraft) we would need a sample set of 384 websites to get a 95% > confidence level with an interval of 5%. > > So, we needed 384 samples - we got 224,671 across 14,077 websites. That's assuming that any given page from a website is representative of that website. What you really want are examples of <img/> usage on the web; the number of samples you need is based on usages/page * pages/unique site * unique sites/internet. For what it's worth, I actually did start an analysis but haven't had time to do much with the data. I took a random chunk of our archive, looked for every <a/>, storing the content of the anchor so I could look for lonely <img/>s with @alt text. The proof run found 1.4M <a/> on 14k pages. 
Of these anchors,
 * 240k contain at least one <img/>
 * 228k start with an <img/>
 * 152k contain at least one <img/> with an @alt
 * 121k contain at least one <img/> with a non-empty @alt
 * 25k contain at least one <img/> with a @title
 * 24k contain at least one <img/> with a non-empty @title

A total of 247k <img/> were found in anchors. Of these images,
 * 151k contain an @alt
 * 120k contain a non-empty @alt
 * 25k contain a @title
 * 23k contain a non-empty @title
 * 11k have a garbage phrase (e.g. "click here", "use the right mouse button to save", etc.) in @alt or @title

Of the 228k starting <img/>s,
 * 142k contain an @alt
 * 114k contain a non-empty @alt
 * 24k contain a @title
 * 22k contain a non-empty @title
 * 11k have a garbage phrase in @alt or @title

The non-proof run is looking at 50x as many pages. All of this was gleaned from the services at <http://tinyurl.com/23czqt>

~ Derrick Pallas

From bhawkeslewis at googlemail.com Sat Jul 14 15:52:57 2007
From: bhawkeslewis at googlemail.com (Benjamin Hawkes-Lewis)
Date: Sat Jul 14 15:53:05 2007
Subject: [uf-new] img alt content statistics
In-Reply-To: <469925A2.3080902@digitalbazaar.com>
References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> <469925A2.3080902@digitalbazaar.com>
Message-ID: <469953C9.6000301@googlemail.com>

I'm increasingly sceptical about non-qualitative statistical exercises of this sort. They need to be interpreted with great caution. For example, alt="" may be compliant with the (X)HTML specifications, or it may not be. You just can't tell without looking at the page in question.

I'm not sure why mass use or abuse of @alt, treating all webpages as equals, is deterministic for hCard parsing.
Doesn't there need to be a subsample containing only pages with markup that would be interpreted by a microformat parser as an hCard? -- Benjamin Hawkes-Lewis Manu Sporny wrote: > Andy Mabbett wrote: >> In message <4698F9F1.1060409@digitalbazaar.com>, Manu Sporny >> <msporny@digitalbazaar.com> writes >> >>> The percentages below are the percentages of img tags that contained >>> non-empty attributes: >>> >>> src: 99% >>> height: 66% >>> width: 66% >>> alt: 41% >>> title: 5% >>> id: 4% >>> >>> In general, only 41% of 'img' tags list non-empty 'alt' attributes. In >>> other words - most websites are not using 'alt' attributes for 'img' >>> tags. >> That's a bogus conclusion - empty "alt" attributes are perfectly valid, >> and are appropriate in many cases; and you're counting tags but making >> conclusions about "most websites". > > I agree with you, Andy... it seems my statement wasn't clear. Perhaps it > should have read: > > "In other words - most websites are using empty 'alt' attributes." > > or > > "59% of most websites are complying with the HTML 4.01 specification > regarding usage of 'alt' with image tags." > > I used the terminology "most websites" because the data gathered is, > statistically speaking, overkill. Assuming 125,626,329 websites (per > Netcraft) we would need a sample set of 384 websites to get a 95% > confidence level with an interval of 5%. > > So, we needed 384 samples - we got 224,671 across 14,077 websites. > > If you want to sift through the data yourself, I'll have it up tomorrow. > I'll also be providing all of the source code to crawl, index and > analyze the data. 
>
> -- manu
> _______________________________________________
> microformats-new mailing list
> microformats-new@microformats.org
> http://microformats.org/mailman/listinfo/microformats-new
>

From msporny at digitalbazaar.com Sun Jul 15 11:09:26 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 15 11:09:30 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
Message-ID: <469A62D6.9020805@digitalbazaar.com>

I'm starting a new thread as the "*img alt content*" discussion seems to be getting unfocused. Please familiarize yourself with the following thread, as this discussion is a more focused continuation of it:

http://microformats.org/discuss/mail/microformats-new/2007-July/000590.html

All of the tools and data that were used for this analysis, including source code released under the GPL, are available from the following URL:

http://www.zenmachine.org/downloads/microformats/dbuft-0.3.tar.bz2

The Problem
-----------

Sites quite often use an image instead of a text link to present actions. For example: Instead of using the text "Download", they will use a graphic image with a downward-facing arrow pointing at a disk. In other words, if we have this:

Download:
<a href="http://my.site.com/download/3847293">
  <img src="/images/cool_download_button.png"/>
</a>

How do we present this option to a human being in a non-web-page UI?

This problem is applicable to any 'rel-*' pattern. Currently, it is affecting the implementation of hAudio because Operator does not extract ALT or TITLE attributes for IMG tags, so when an image-only rel-* link is presented to the user, it is blank.

The Argument Thus Far
---------------------

Andy Mabbett proposed that Operator should use the ALT attribute from the IMG tag, as that is HTML/XHTML compliant[1]. Tantek Çelik raised the point that web authors often mis-use the ALT attribute[2].
Scott Reynen noted that we would need examples to make an informed decision, as no data had been collected yet[3].

The Data Collected So Far
-------------------------

The first set of data collected attempted to determine the number of IMG tags that used 'alt', 'title' and 'id':

Total websites crawled : 14077
Total img tags analyzed: 224671

Percentage of img tags with a non-empty attribute:

@alt:   41%
@title: 5%
@id:    4%

The second set of data collected came from Derrick Pallas. We are still waiting for him to analyze it and post the results to the mailing list.

The third set of data collected looks at image-only anchors. In other words, it collects only links that look like the following:

<a href="http://www.example.com"><img src="example.png" /></a>

The data was analyzed by a human being to ensure that the ALT text matched the image. The following criteria were used to categorize images:

Valid @alt - If the ALT text displayed to the user matched the image displayed, the image was marked as VALID. The ALT text was also marked as valid if it was blank.

Unknown @alt - If the ALT text was in another language or was in UTF-8 (not displayable), the image was marked as UNKNOWN.

Garbage @alt - If the ALT text was clearly not applicable to the image, such as "click here", "red ball", or "blog" when the image was a shopping cart, etc., the image was marked as GARBAGE.

This analysis required human interaction, so the sample size is small (but still statistically significant). A small GUI displayed an image to a person and asked them to select whether the ALT text matched the image.
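A minimal sketch of how image-only anchors of the kind described above could be pulled out of a page, together with their ALT text. This is not the dbuft code from the tarball; it assumes Python's standard `html.parser` module, and the class name `ImageOnlyAnchorFinder` is made up for illustration. It only extracts candidates; judging whether the ALT text matches the image still needs a person.

```python
from html.parser import HTMLParser


class ImageOnlyAnchorFinder(HTMLParser):
    """Collect (href, alt) for anchors whose only content is one <img>."""

    def __init__(self):
        super().__init__()
        self.results = []        # (href, alt) tuples; alt is None if absent
        self._in_anchor = False  # True while inside an <a> element
        self._href = None        # href of the current anchor
        self._alts = []          # alt values of <img>s in the current anchor
        self._has_other = False  # text or other elements seen in the anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self._in_anchor = True
            self._href = attrs.get("href")
            self._alts = []
            self._has_other = False
        elif self._in_anchor:
            if tag == "img":
                self._alts.append(attrs.get("alt"))
            else:
                self._has_other = True

    def handle_data(self, data):
        # Whitespace between tags does not count as link text.
        if self._in_anchor and data.strip():
            self._has_other = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            if len(self._alts) == 1 and not self._has_other:
                self.results.append((self._href, self._alts[0]))
            self._in_anchor = False


finder = ImageOnlyAnchorFinder()
finder.feed('<a href="http://www.example.com"><img src="example.png" /></a>')
finder.feed('<a href="/about">About <img src="icon.png" alt="icon"></a>')
print(finder.results)
```

Only the first link is reported, since the second also carries link text. A missing ALT attribute comes back as `None` and an empty one as `""`, so the two cases can be counted separately.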
This is the first time this data is being presented:

Total websites crawled         : 1721
Total img-only anchors analyzed: 1166

Valid @alt  : 77.3%
Unknown @alt: 5.8%
Garbage @alt: 16.9%

As mentioned previously, all of the tools and data that were used for this analysis, including source code, are available from the following URL:

http://www.zenmachine.org/downloads/microformats/dbuft-0.3.tar.bz2

-- manu

[1]http://microformats.org/discuss/mail/microformats-new/2007-July/000594.html
[2]http://microformats.org/discuss/mail/microformats-new/2007-July/000598.html
[3]http://microformats.org/discuss/mail/microformats-new/2007-July/000595.html

From tantek at cs.stanford.edu Sun Jul 15 11:43:52 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Jul 15 11:44:10 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <469A62D6.9020805@digitalbazaar.com>
Message-ID: <C2BFB8A0.9205A%tantek@cs.stanford.edu>

On 7/15/07 11:09 AM, "Manu Sporny" <msporny@digitalbazaar.com> wrote:

> Tantek Çelik raised the
> point that web authors often mis-use the ALT attribute[2].

To be clear, the conclusion from this is that publishers should be given the detailed *choice* of whether or not the alt text in their pages is included in microformats property values (rather than being forced to by *always* using it in contained properties).

Thus the alt (or src for that matter) attribute of an <img> element is *only* included in a property value if the property is set directly on the <img> OR via a class="value" construct.
Our experience with this in practice has been quite good, and in fact, this is the first that *anyone* has raised any issues with it (in over two years of it functioning this way - that is it's not that no one's written it down yet - unlike some of the existing issues), so given experience to date, I would assert that we have the 80/20 (or far more than even) case covered, and that new cases regarding this being raised now have the burden of proof[1]. Thanks, Tantek [1] http://microformats.org/wiki/brainstorming#Burden_of_Proof From chris at placenamehere.com Sun Jul 15 14:35:56 2007 From: chris at placenamehere.com (Chris Casciano) Date: Sun Jul 15 14:36:11 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <469953C9.6000301@googlemail.com> References: <4692F41E.2010503@digitalbazaar.com> <C2B846A8.91BB3%tantek@cs.stanford.edu> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> <469925A2.3080902@digitalbazaar.com> <469953C9.6000301@googlemail.com> Message-ID: <11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com> On Jul 14, 2007, at 6:52 PM, Benjamin Hawkes-Lewis wrote: > I'm increasingly sceptical about non-qualitative statistical > exercises of this sort. They need to be interpreted with great > caution. For example, alt="" may be compliant with the (X)HTML > specifications, or it may not be. You just can't tell without > looking at the page in question. > > I'm not sure why mass use or abuse of @alt, treating all webpages > as equals, is deterministic for hCard parsing. Doesn't there need > to be a subsample containing only pages with markup that would be > interpreted by a microformat parser as an hCard? 
One thing I hope we don't lose sight of is that while we as a community should be promoting standards and other best practices in all web development and design fronts, if the microformat specs take a hard line on issues such as this where there is some regular use of a variety of techniques, it may hurt both adoption on a case-by-case basis as well as how the movement as a whole is viewed in terms of practicality.

Image replacement techniques, bowing to CSS, when an image is considered "content" or not are ALL areas where reasonable people have reasonable arguments for pros and cons, and I think it's the job of the microformats spec writers, /wherever/ possible, to support common coding practices, because for the most part which technique is appropriate is determined by one two-word rule: "it depends".

Just my thought on the matter, anyway.

--
[ Chris Casciano ]
[ chris@placenamehere.com ] [ http://placenamehere.com ]

From msporny at digitalbazaar.com Sun Jul 15 14:45:04 2007
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Sun Jul 15 14:45:09 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <C2BFB8A0.9205A%tantek@cs.stanford.edu>
References: <C2BFB8A0.9205A%tantek@cs.stanford.edu>
Message-ID: <469A9560.4020706@digitalbazaar.com>

Tantek Çelik wrote:
> On 7/15/07 11:09 AM, "Manu Sporny" <msporny@digitalbazaar.com> wrote:
>> Tantek Çelik raised the
>> point that web authors often mis-use the ALT attribute[2].
>
> To be clear, the conclusion from this is that publishers should be given the
> detailed *choice* of whether or not the alt text in their pages is included
> in microformats property values (rather than being forced to by *always*
> using it in contained properties).

Tantek, I don't quite follow the logic here. Publishers aren't given the option of whether or not their ALT text shows up in a text-based browser. They are also not given the option of whether their ALT text is read out loud when using a screen reader.
Why, then, are we giving them the option on how ALT will be handled with regards to Microformats? Or rather, why are we giving them the option to hide data? > Thus the alt (or src for that matter) attribute of an <img> element is > *only* included on a property value if the property is set directly on the > <img> OR via a class="value" construct. You don't have the option of setting "rel-*" properties on images. That is the whole point of this discussion. Your "just set it on the <img> element" argument doesn't work for "rel-*". rel-* always go on anchor elements (<a>). As for class="value", that is a potential solution... thank you for identifying it. However, I ask again - why are we giving publishers the choice of violating the HTML specification? Of hiding data? Where are the real world examples of why we need to provide that option? > Our experience with this in practice has been quite good, and in fact, this > is the first that *anyone* has raised any issues with it (in over two years > of it functioning this way - that is it's not that no one's written it down > yet - unlike some of the existing issues), so given experience to date, I > would assert that we have the 80/20 (or far more than even) case covered Since you are asserting that the community has 80/20, could you please provide some data to back up that claim? How many people use images inside hCard/hCalendar/hAtom and hResume? How many of those people have @alt specified correctly? Incorrectly? How many examples of images used in rel-* do we have? We have collected quite a bit of data (and continue to do so) that shows that mis-use of @alt isn't as wide-spread as previously asserted. In fact, it falls quite short of the Microformat community's 80/20 rule. If I wasn't clear about that previously, here's a re-cap: As of right now, it looks as though roughly 80-90% of websites are using @alt correctly, either by not specifying a value or by specifying valid data in the attribute. 
If you'd like me to demonstrate that figure further, I would be more than happy to do so - using hard data that is available to everybody on this mailing list. -- manu From tantek at cs.stanford.edu Sun Jul 15 14:52:07 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 14:52:12 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <469A9560.4020706@digitalbazaar.com> Message-ID: <C2BFE509.92077%tantek@cs.stanford.edu> On 7/15/07 2:45 PM, "Manu Sporny" <msporny@digitalbazaar.com> wrote: > You don't have the option of setting "rel-*" properties on images. That > is the whole point of this discussion. Your "just set it on the <img> > element" argument doesn't work for "rel-*". rel-* always go on anchor > elements (<a>). rel-* never applies for image content anyway because rel-* semantics are always between one URL and another URL which you must *hyperlink* to. It's the HTML 4.01 specification that provides this restriction, not microformats. Thus rel-* and <img> is a non-issue. Tantek From andy at pigsonthewing.org.uk Sun Jul 15 14:52:30 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Sun Jul 15 14:53:46 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <C2BFB8A0.9205A%tantek@cs.stanford.edu> References: <469A62D6.9020805@digitalbazaar.com> <C2BFB8A0.9205A%tantek@cs.stanford.edu> Message-ID: <PiYJR2SecpmGFwB6@pigsonthewing.org.uk> In message <C2BFB8A0.9205A%tantek@cs.stanford.edu>, Tantek ?elik <tantek@cs.stanford.edu> writes >Our experience with this in practice has been quite good, and in fact, >this is the first that *anyone* has raised any issues with it I've raised the matter previously. >I would assert [...] > that new cases regarding this being raised now have the burden of >proof Surely, by your own standards, the burden is on you, to provide evidence to support your assertions? I note that you have not replied to my previous suggestion that you do so. 
-- Andy Mabbett From tantek at cs.stanford.edu Sun Jul 15 14:56:07 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 14:56:11 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <469A9560.4020706@digitalbazaar.com> Message-ID: <C2BFE592.9207A%tantek@cs.stanford.edu> On 7/15/07 2:45 PM, "Manu Sporny" <msporny@digitalbazaar.com> wrote: > We have collected quite a bit of data (and continue to do so) that shows > that mis-use of @alt isn't as wide-spread as previously asserted. In > fact, it falls quite short of the Microformat community's 80/20 rule. If > I wasn't clear about that previously, here's a re-cap: > > As of right now, it looks as though roughly 80-90% of websites are using > @alt correctly, either by not specifying a value or by specifying valid > data in the attribute. Actually no. As several others have pointed out, the methodologies you used to gather "quite a bit of data" and the conclusions you reached are seriously flawed for a number of reasons. You cannot determine that they are "specifying valid data in the attribute" unless you inspect the value of the attribute and the page itself *by hand* to determine whether from a human perspective proper semantics are being followed. I believe other folks (some with accessibility expertise) have already pointed this out. Tantek From tantek at cs.stanford.edu Sun Jul 15 14:58:07 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 14:58:11 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <469A9560.4020706@digitalbazaar.com> Message-ID: <C2BFE62D.9207C%tantek@cs.stanford.edu> On 7/15/07 2:45 PM, "Manu Sporny" <msporny@digitalbazaar.com> wrote: > However, I ask again - why are we giving publishers the > choice of violating the HTML specification? 
That has never been demonstrated via an actual example with URL and citation of the clause in the spec with URL that is allegedly being violated, and reasoning applied to the actual example as such. It's only been asserted in hand-waving. > Of hiding data? No one is advocating that AFAIK. > Where are > the real world examples of why we need to provide that option? Manu, please check the other responses on this thread, there has already been at least one publisher that has responded and demonstrated as such. Thanks, Tantek From tantek at cs.stanford.edu Sun Jul 15 15:05:17 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 15:05:21 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <469A9560.4020706@digitalbazaar.com> Message-ID: <C2BFE7BB.92081%tantek@cs.stanford.edu> This thread is quickly repeating itself, dominating the email discussions on the list, and thus becoming more noise than signal for most. Thus I'm going to ask folks who have an agenda of pushing change here to please STOP repeating themselves (especially when those asking for change are ignoring criticisms brought forth by the community). In addition this is a general admin request for those mentioned (Manu and Andy in particular) to STOP posting on this thread in the list for at least 7 days (to reduce list noise) or until they've documented concrete proposals *and* the criticisms brought up in the email thread using the wiki. 
Thanks, Tantek From andy at pigsonthewing.org.uk Sun Jul 15 15:52:29 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Sun Jul 15 15:53:47 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <C2BFE7BB.92081%tantek@cs.stanford.edu> References: <469A9560.4020706@digitalbazaar.com> <C2BFE7BB.92081%tantek@cs.stanford.edu> Message-ID: <jQUidlUtUqmGFw3d@pigsonthewing.org.uk> In message <C2BFE7BB.92081%tantek@cs.stanford.edu>, Tantek ?elik <tantek@cs.stanford.edu> writes >this is a general admin request for those mentioned (Manu and Andy in >particular) to STOP posting on this thread in the list for at least 7 >days Is that a request, or an instruction? -- Andy Mabbett From joe at andrieu.net Sun Jul 15 15:58:13 2007 From: joe at andrieu.net (Joe Andrieu) Date: Sun Jul 15 15:57:57 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (withanalyzed data) In-Reply-To: <C2BFE7BB.92081%tantek@cs.stanford.edu> Message-ID: <000201c7c733$a034e8b0$0501a8c0@andrieuhome> Tantek ? elik wrote (Sunday, July 15, 2007 3:05 PM) > This thread is quickly repeating itself, dominating the email > discussions on the list, and thus becoming more noise than > signal for most. > > Thus I'm going to ask folks who have an agenda of pushing > change here to please STOP repeating themselves (especially > when those asking for change are ignoring criticisms brought > forth by the community). > > In addition this is a general admin request for those > mentioned (Manu and Andy in particular) to STOP posting on > this thread in the list for at least 7 days (to reduce list > noise) or until they've documented concrete proposals > *and* the criticisms brought up in the email thread using the wiki. Tantek, It is pretty unsporting of you to cut off all discussion after you post your own points to the list. I agree the conversation is going in circles... And probably could use some time off. 
However, requesting as admin that those who disagree with you should quiet down--after you make your own points--comes across as a heavy-handed way to get the last word in.

I would also like to see some of the back-and-forth move to concrete proposals on the wiki. Including your own points, Tantek. The most popular uFs, such as hCard and hCalendar, never went through the uF process with the same documentation and rigor that new proposals face. You yourself have acknowledged the lack of documentation before. As a result, I'd say the burden of proof exists, as usual, for everyone making a case. And in my opinion, it is even greater for those defending the status quo, if only because the incumbents have the benefit of possession of the standard.

[other thoughts presented in the non-admin thread]

-j

--
Joe Andrieu
SwitchBook Software
http://www.switchbook.com
joe@switchbook.com
+1 (805) 705-8651

From joe at andrieu.net Sun Jul 15 16:00:21 2007
From: joe at andrieu.net (Joe Andrieu)
Date: Sun Jul 15 16:00:06 2007
Subject: [uf-new] Use of img in rel-* (with analyzed data)
In-Reply-To: <C2BFE62D.9207C%tantek@cs.stanford.edu>
Message-ID: <000301c7c733$ecd39fe0$0501a8c0@andrieuhome>

Tantek Çelik wrote (Sunday, July 15, 2007 2:58 PM)
> On 7/15/07 2:45 PM, "Manu Sporny" <msporny@digitalbazaar.com> wrote:
>
> > However, I ask again - why are we giving publishers the choice of
> > violating the HTML specification?
>
> That has never been demonstrated via an actual example with
> URL and citation of the clause in the spec with URL that is
> allegedly being violated, and reasoning applied to the actual
> example as such. It's only been asserted in hand-waving.
>
>
> > Of hiding data?
>
> No one is advocating that AFAIK.
>
>
> > Where are
> > the real world examples of why we need to provide that option?
>
> Manu, please check the other responses on this thread, there
> has already been at least one publisher that has responded
> and demonstrated as such.
I don't understand the use of examples in this debate. If we were to examine contact information in the wild, we wouldn't suggest that class="title" is bad because nobody is using it or because people use it outside of hcards.

We aren't about to go scan every ALT value for semantic data; the suggestion, as I understand it, is to allow ALT values as a source of data /within/ uFs.

The issue, imo, seems to be whether, /when authoring uFs/, data can or should be placed in alt attributes so that authors can specify data that otherwise might be buried in an image.

That becomes two questions.

1. Is it semantically valid within common HTML usage? Or put another way, images sometimes contain human readable data that are not (easily) machine readable. Is the alt attribute an appropriate way to specify that data in a machine-readable way?

2. Does it break existing uFs? And that means specifications and usage; whether or not it breaks parsers is a different issue. In this case, it may make sense to evaluate the use of IMG ALT attributes within uFs to see if uF-using authors have adopted widespread practices that would break.

However, evaluating random selections of IMG tags doesn't really help us understand anything about current uF usage and how this change to the spec might cause problems with existing uFs.
-j -- Joe Andrieu SwitchBook Software http://www.switchbook.com joe@switchbook.com +1 (805) 705-8651 From tantek at cs.stanford.edu Sun Jul 15 17:23:24 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 17:23:30 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <jQUidlUtUqmGFw3d@pigsonthewing.org.uk> Message-ID: <C2C00830.92098%tantek@cs.stanford.edu> On 7/15/07 3:52 PM, "Andy Mabbett" <andy@pigsonthewing.org.uk> wrote: > In message <C2BFE7BB.92081%tantek@cs.stanford.edu>, Tantek ?elik > <tantek@cs.stanford.edu> writes > >> this is a general admin request for those mentioned (Manu and Andy in >> particular) to STOP posting on this thread in the list for at least 7 >> days > > Is that a request, or an instruction? To be clear, a request. Thanks Andy, Tantek From tantek at cs.stanford.edu Sun Jul 15 17:48:59 2007 From: tantek at cs.stanford.edu (Tantek =?ISO-8859-1?B?xw==?=elik) Date: Sun Jul 15 17:49:07 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (withanalyzed data) In-Reply-To: <000201c7c733$a034e8b0$0501a8c0@andrieuhome> Message-ID: <C2C00E88.920A0%tantek@cs.stanford.edu> On 7/15/07 3:58 PM, "Joe Andrieu" <joe@andrieu.net> wrote: > Tantek ? elik wrote (Sunday, July 15, 2007 3:05 PM) >> This thread is quickly repeating itself, dominating the email >> discussions on the list, and thus becoming more noise than >> signal for most. >> >> Thus I'm going to ask folks who have an agenda of pushing >> change here to please STOP repeating themselves (especially >> when those asking for change are ignoring criticisms brought >> forth by the community). 
>> >> In addition this is a general admin request for those >> mentioned (Manu and Andy in particular) to STOP posting on >> this thread in the list for at least 7 days (to reduce list >> noise) or until they've documented concrete proposals >> *and* the criticisms brought up in the email thread using the wiki. > > Tantek, > > It is pretty unsporting of you to cut off all discussion after you post your > own points to the list. Joe, Apologies as I do realize it came across like that. I realized shortly after posting my own recent emails that I wasn't helping the problem either and thus put on my admin hat and posted my request regarding the thread which I as well will stick to until those advocating changes do the requested work on the wiki. > I agree the conversation is going in circles... And probably could use some > time off. However, requesting as admin that those who > disagree with you should quiet down--after you make your own points--comes > across as a heavy-handed way to get the last word in. The admin request applies to thread as a whole, but especially to those who have posted most often in the thread as the high-frequency of posting (and thus apparent noise on the list) is due to a few, not everyone. > I would also like to see some of the back-and-forth move to concrete proposals > on the wiki. Thanks Joe. > Including your own points, Tantek. I'm wiking everything I can, in priority order per my to-do list on the wiki: http://microformats.org/wiki/to-do#Tantek I encourage you to add a section for yourself on the to-do page as well for the things you want to get done in the microformats community. > The > most popular uFs, such as hCard and hCalendar never went through the uF > process with the same documentation and rigor that new > proposals face. 
They did go through various checks and balances similar to those in the process (in fact, much of the process was written as a result of documenting the methodology developed *while* developing hCard and hCalendar).

Your request for more specific history is reasonable, and will certainly benefit both our existing microformats, and those looking to understand the development of microformats in general. I've added it to my personal to-do list.

> You yourself have acknowledged the lack of documentation
> before. As a result, I'd say the burden of proof exists, as
> usual, for everyone making a case.

It is not the same for everyone, no. There is what is established and thus works today, and there are proposals for change. The proposals for change have burden of proof. The documentation at this point for those that actually worked on it is a matter of historical documentation, not process.

> And in my opinion, it is even greater for
> those defending the status quo, if simply because the
> incumbant have the benefit of possession of the standard.

We will simply have to choose to disagree on this point then. The burden of proof is always on those who wish to change or modify what already "works" to a great extent today. This principle is actually in use all over microformats, such as re-using existing implied schemas and looking at existing widely interoperable standards as a basis for vocabulary for microformats. Thus it could be said that a key principle of microformats in general, "doing what already works" (i.e. re-use), is greatly valued over "changing everything and starting from scratch" (i.e. re-invention).
Thanks, Tantek From alasdairking at gmail.com Mon Jul 16 00:14:23 2007 From: alasdairking at gmail.com (Alasdair King) Date: Mon Jul 16 00:14:27 2007 Subject: [uf-new] img alt content statistics In-Reply-To: <11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com> References: <4692F41E.2010503@digitalbazaar.com> <RF$+ovQJk+kGFwhM@pigsonthewing.org.uk> <003501c7c33c$9b2e10d0$bc08a8c0@nzto22> <YCjMpgV++IlGFwIl@pigsonthewing.org.uk> <4694E5EE.2060902@digitalbazaar.com> <4698F9F1.1060409@digitalbazaar.com> <OOFJKlAr5RmGFwjD@pigsonthewing.org.uk> <469925A2.3080902@digitalbazaar.com> <469953C9.6000301@googlemail.com> <11001486-61EA-4CFC-8DA7-D7BE9D3E663D@placenamehere.com> Message-ID: <7df2c90b0707160014t2702c65dy7e353319f50573d2@mail.gmail.com> I develop a free web browser for blind people called WebbIE ( http://www.webbie.org.uk). The use of images in links is a problem for my users too. I use the alt tag content as text for the link: if the alt tag is blank or missing I use the filename of the target, so http://www.mypage.com/contact.htm becomes Link 1: contact.htm (And on looking at it how it should probably just go to just "contact"...!) Offered as a real-world example (insignificant numbers vs IE, significant numbers vs blind people). -- Alasdair King WebbIE http://www.webbie.org.uk alasdair@webbie.org.uk On 7/15/07, Chris Casciano <chris@placenamehere.com> wrote: > > > On Jul 14, 2007, at 6:52 PM, Benjamin Hawkes-Lewis wrote: > > > I'm increasingly sceptical about non-qualitative statistical > > exercises of this sort. They need to be interpreted with great > > caution. For example, alt="" may be compliant with the (X)HTML > > specifications, or it may not be. You just can't tell without > > looking at the page in question. > > > > I'm not sure why mass use or abuse of @alt, treating all webpages > > as equals, is deterministic for hCard parsing. 
Doesn't there need > > to be a subsample containing only pages with markup that would be > > interpreted by a microformat parser as an hCard? > > > One thing I hope we don't lose sight of is that while we as a > community should be promoting standards and other best practices in > all web development and design fronts, if the microformat specs take > a hard line on issues such as this where there is some regular use of > a variety of techniques it may hurt both adoption on a case by case > basis as well as how the movement as a whole is viewed in terms of > practicality. > > > Image replacement techniques, bowing to CSS, when an image is > considered "content" or not are ALL areas where reasonable people > have reasonable arguments for pros and cons and I think it's the job > of the microformats spec writers, /wherever/ possible, to support > common coding practices, because for the most part which technique is > appropriate is determined by one two word rule: "it depends". > > > Just my thought on the matter, anyway. > > -- > [ Chris Casciano ] > [ chris@placenamehere.com ] [ http://placenamehere.com ] > > _______________________________________________ > microformats-new mailing list > microformats-new@microformats.org > http://microformats.org/mailman/listinfo/microformats-new > -- Alasdair King -------------- next part -------------- An HTML attachment was scrubbed...
URL: http://microformats.org/discuss/mail/microformats-new/attachments/20070716/9def9b91/attachment.html From msporny at digitalbazaar.com Mon Jul 16 09:29:48 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Mon Jul 16 09:29:51 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <C2BFE7BB.92081%tantek@cs.stanford.edu> References: <C2BFE7BB.92081%tantek@cs.stanford.edu> Message-ID: <469B9CFC.2090409@digitalbazaar.com> Tantek Çelik wrote: > In addition this is a general admin request for those mentioned (Manu and > Andy in particular) to STOP posting on this thread in the list for at least > 7 days (to reduce list noise) or until they've documented concrete proposals > *and* the criticisms brought up in the email thread using the wiki. Out of respect for the community, I'll stop posting for 7 days. I'll spend time documenting this argument on the wiki and will point everyone to that page once it is updated along with a more detailed explanation as to how the analysis was completed and why the data is pertinent. If there is any other information that people would like included on the page, please let me know off-list.
-- manu From cgriego at gmail.com Mon Jul 16 12:24:31 2007 From: cgriego at gmail.com (Chris Griego) Date: Mon Jul 16 12:24:33 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <C2BFB8A0.9205A%tantek@cs.stanford.edu> References: <469A62D6.9020805@digitalbazaar.com> <C2BFB8A0.9205A%tantek@cs.stanford.edu> Message-ID: <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com> On 7/15/07, Tantek Çelik <tantek@cs.stanford.edu> wrote: > Our experience with this in practice has been quite good, and in fact, this > is the first that *anyone* has raised any issues with it (in over two years > of it functioning this way - that is it's not that no one's written it down > yet - unlike some of the existing issues), so given experience to date, I > would assert that we have the 80/20 (or far more than even) case covered, > and that new cases regarding this being raised now have the burden of > proof[1]. I have raised this issue before in IRC directly in conversation with you, Tantek. It also came up during Twitter's adoption of microformats because their usage assumed that the alt text was considered part of the microformat output without specifying anything specific.
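For readers following the thread, the behaviour being argued about can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code of Operator, WebbIE, or any microformat parser named here: the class name LinkTextExtractor and the function link_labels are made up for the example. It treats img alt text like any other text inside a link, and, as Alasdair King describes for WebbIE, falls back to the file name of the link target when the alt text is blank or missing.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import posixpath


class LinkTextExtractor(HTMLParser):
    """Collect a display label for every <a href> in a document.

    Hypothetical helper for illustration only.
    """

    def __init__(self):
        super().__init__()
        self.labels = []      # finished labels, in document order
        self._href = None     # href of the link currently open, if any
        self._parts = []      # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "href" in attrs:
            self._href = attrs["href"] or ""
            self._parts = []
        elif tag == "img" and self._href is not None:
            # Assumption: alt text counts as ordinary text inside the link.
            self._parts.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self._href is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the collected text.
            label = " ".join("".join(self._parts).split())
            if not label:
                # Blank or missing alt: fall back to the target's file name.
                label = posixpath.basename(urlparse(self._href).path)
            self.labels.append(label)
            self._href = None


def link_labels(html):
    """Return one label per <a href> found in the given HTML string."""
    parser = LinkTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.labels
```

With this sketch, a link whose only content is an image with alt="Download" is labelled "Download", while a link to http://www.mypage.com/contact.htm whose image has an empty alt is labelled "contact.htm". Whether a microformat parser should behave this way is exactly what the thread is debating.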
-- Chris Griego From andy at pigsonthewing.org.uk Mon Jul 16 13:19:23 2007 From: andy at pigsonthewing.org.uk (Andy Mabbett) Date: Mon Jul 16 13:21:03 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com> References: <469A62D6.9020805@digitalbazaar.com> <C2BFB8A0.9205A%tantek@cs.stanford.edu> <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com> Message-ID: <Pld5ArfLL9mGFwhn@pigsonthewing.org.uk> In message <15996c030707161224j16eb5f18k157f21a434766922@mail.gmail.com>, Chris Griego <cgriego@gmail.com> writes >On 7/15/07, Tantek Çelik <tantek@cs.stanford.edu> wrote: >> Our experience with this in practice has been quite good, and in fact, this >> is the first that *anyone* has raised any issues with it (in over two years >> of it functioning this way - that is it's not that no one's written it down >> yet - unlike some of the existing issues), so given experience to date, I >> would assert that we have the 80/20 (or far more than even) case covered, >> and that new cases regarding this being raised now have the burden of >> proof[1]. > >I have raised this issue before in IRC directly in conversation with >you, Tantek. It also came up during Twitter's adoption of microformats >because their usage assumed that the alt text was considered part of >the microformat output without specifying anything specific. Also: <http://rbach.priv.at/Microformats-IRC/2006-02-27#T220544> (reformatted and edited for readability) # [22:05:44] <RobertBachmann> one question ... <a href="http://www.adobe.com/" class="url fn org"> <img src="...gif" alt="Adobe Systems, Inc." /></a> # [22:05:44] <RobertBachmann> should work?
# [22:06:05] <kingryan> should # [22:06:12] <tantek> yes, as long as there is a <span class="vcard"> around it and this mailing list thread: <http://microformats.org/discuss/mail/microformats-dev/2007-March/000250.html> This thread has some useful background: <http://microformats.org/discuss/mail/microformats-discuss/2005-December/002398.html> -- Andy Mabbett From Leif_Storset at intuit.com Mon Jul 16 15:06:06 2007 From: Leif_Storset at intuit.com (Storset, Leif) Date: Mon Jul 16 15:06:08 2007 Subject: [uf-new] Receipt microformat References: <657A9BE009D3504AAE29BD8E8C2DD61E07303B24@SDGEXEVS02.corp.intuit.net> <A346E6D020642649BB6701C37E8934110133ED89@SDGEXEVS04.corp.intuit.net> Message-ID: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net> Fellow microformat enthusiasts, I work in Intuit's Technology Innovation Group, which explores new and emerging technologies and helps Intuit product teams adopt them. (For those outside North America: Intuit is the leading vendor of financial and tax software for individuals and small business. In America, our products Quicken, QuickBooks and TurboTax are household names.) Our group is interested in microformats - specifically the possibility of a receipt microformat. We believe that a receipt format for online stores could significantly reduce data entry for our users. Following the "why a new microformat" process: The PROBLEM: our users currently enter expenses into our software manually, even when the information is available in digital form. This is done in lump sums, which hinders further analysis and categorization. All this can be automated. (Indeed it is already automated through screen scraping, but this is unreliable, error-prone and not future-proof.) Is there a SIMPLER PROBLEM? Some components of a receipt are simpler problems that have been solved. Billing address and delivery address are obviously vCards; price could use the proposed hCurrency. 
But data such as the line items (Product X in quantity N at price Y) would not make sense out of the context of a "receipt". (hProduct and hListing obviously come close and might possibly be integrated somehow.) In short, we don't see a simpler problem to solve, since we already have some microformats in place. Has the problem been SOLVED? As far as we can tell, no. Martin Owens and Joe Osowski exchanged ideas on this microformat earlier, and we'd like to build on their work. (http://microformats.org/discuss/mail/microformats-new/2007-May/000394.html) In case the purpose of the microformat is not clear, imagine the following use case: The customer, a Quicken user with the (hypothetical) Quicken Browser Toolbar, is shopping at Amazon.com and is ready for checkout. After paying for the purchase, the customer wishes to enter the data into Quicken. Instead of manually typing everything into Quicken, the customer selects "Save receipt" from the Quicken Browser Toolbar, which imports the expense into Quicken. Another possibility is to use a JavaScript-powered button to copy the receipt to the clipboard and support pasting the microformat from within Quicken. We are looking forward to your input and participation. Has the problem been solved before? Are there other useful microformats already in existence that could be included in a receipt format? Thanks, Leif Arne Storset Technology Innovation Group, Intuit -------------- next part -------------- An HTML attachment was scrubbed... URL: http://microformats.org/discuss/mail/microformats-new/attachments/20070716/a13965d7/attachment.html From joe at andrieu.net Mon Jul 16 22:29:16 2007 From: joe at andrieu.net (Joe Andrieu) Date: Mon Jul 16 22:28:57 2007 Subject: [admin] [EoT request] was Re: [uf-new] Use of img in rel-*(withanalyzed data) In-Reply-To: <C2C00E88.920A0%tantek@cs.stanford.edu> Message-ID: <006c01c7c833$6c72f930$0501a8c0@andrieuhome>
Tantek Çelik wrote (Sunday, July 15, 2007 5:49 PM): > On 7/15/07 3:58 PM, "Joe Andrieu" <joe@andrieu.net> wrote: > > > I would also like to see some of the back-and-forth move to > concrete > > proposals on the wiki. > > Thanks Joe. > > > > Including your own points, Tantek. > > I'm wiking everything I can, in priority order per my to-do > list on the > wiki: > > http://microformats.org/wiki/to-do#Tantek > > I encourage you to add a section for yourself on the to-do > page as well for the things you want to get done in the > microformats community. Tantek, I have been specifically requested by Rohit /not/ to put my issues on the wiki. And since he has removed them, spoken with me personally, and promised some sort of progress behind the scenes, I will continue to respect that. However, until that progress is visible and my concerns about governance, ownership, and IP are resolved satisfactorily, I will continue to refrain from substantial contributions to the wiki, just as I will be careful about using uF in my own work. Where I can, I will contribute in email discussions and participate in the community with hopes that it will eventually evolve from a private cabal of individual uF owners to a true open source community. > > You yourself have acknowledged the lack of documentation > before. As a > > result, I'd say the burden of proof exists, as usual, for everyone > > making a case. > > It is not the same for everyone no. There is what is > established and thus works today, and there are proposals for > change. The proposals for change have burden of proof. The > documentation at this point for those that actually worked on > it is a matter of historical documentation, not process. > > > And in my opinion, it is even greater for > > those defending the status quo, if simply because the > incumbent have > > the benefit of possession of the standard. > > We will simply have to choose to disagree on this point then.
> > The burden of proof is always on those who wish to change or > modify what already "works" to a great extent today. This > principle is actually in use all over microformats, such as > re-using existing implied schemas and looking at existing > widely interoperable standards as a basis for vocabulary for > microformats. > > Thus it could be said that a key principle of microformats in > general "doing what already works" (i.e. re-use) is greatly > valued over "changing everything and starting from scratch" > (i.e. re-invention). Respectfully, please avoid hyperbole if you would like to have a constructive conversation. Nobody has suggested "changing everything and starting from scratch". It would be easier to do that outside of microformats.org if it were appropriate. Based on my own experience and informal research of significant cultural, historical, and philosophical essays on the issue of authority, I reject the argument that things should stay the same just "because that's the way we've always done it". Time and again, questioning the status quo has generated improvements, even when it has been a catalyst for disruptive change. If there are documented and well-founded reasons for a standing decision, the simple response is to point to those reasons and ask those who would propose something new to address those reasons explicitly in any new proposals. Clearly articulated and well argued foundations for decisions can stand the test of time... but that fact should not be taken as license to reject suggestions out of hand simply because they are new or "not invented here". The foundation of technical authority must lie in the merit of the technology, not in the legacy of authorship. Just because a handful of smart guys documented semantic HTML representations of vCard and iCalendar does not make those specifications "holy" or unchangeable. This community has severe limitations on growth, especially in the area of versioning.
As respectfully as possible, I suggest it is largely because the original authors often react defensively when changes are proposed. Supporting that emotional dynamic, there is no change control process. Either the original author deems it appropriate and updates the spec--such as when you added "places" to the semantics of vCard--or the original authors fight tooth and nail until well-intentioned suggestions are bludgeoned to death. In contrast, new proposals go through a brutal review process where every last detail is examined and debated. For those proposals that survive the gauntlet, the outcome promises to be a solid, robust microformat. Perhaps it would be more constructive if proposed changes to existing standards had some sort of agreed upon process for documentation, evaluation, and acceptance/rejection. -j -- Joe Andrieu SwitchBook Software http://www.switchbook.com joe@switchbook.com +1 (805) 705-8651 From bhawkeslewis at googlemail.com Tue Jul 17 01:14:55 2007 From: bhawkeslewis at googlemail.com (Benjamin Hawkes-Lewis) Date: Tue Jul 17 01:15:01 2007 Subject: [uf-new] Use of img in rel-* (with analyzed data) In-Reply-To: <469A62D6.9020805@digitalbazaar.com> References: <469A62D6.9020805@digitalbazaar.com> Message-ID: <469C7A7F.8010308@googlemail.com> Manu Sporny wrote: > This analysis required human interaction, thus the sample size is small > (but still statistically significant). A small GUI displayed an image to > a person and asked them to select if the image matched the ALT tag. This > is the first time this data is being presented: I think your analysis has done a good job of showing that @alt usage is better than I at least would have generally assumed, and it's great that you actually tested @alt with humans. But I do think there is a methodological flaw with how you did this. @alt text does not exist in a vacuum, but in the context of a page. @alt does not match image, but the use of an image within a given context. 
For example: <a href="help"><img src="question-mark.gif" alt="Help">Help</a> would be better than no @alt, but would still be misguided. In context, the correct alternative text would actually be alt="". And such errors would matter for microformat parsing, e.g.: <span class="fn"><img src="benjamin-hawkes-lewis.jpg" alt="Benjamin Hawkes-Lewis">Benjamin Hawkes-Lewis</span> So it would be better to present at least the immediate context of the image to human testers, not just the image itself. (Note I /strongly/ agree that microformat parsers should treat @alt text just like other text as per the HTML specification and WCAG; I'm making a purely methodological point about your statistical approach here.) -- Benjamin Hawkes-Lewis From Leif_Storset at intuit.com Tue Jul 17 10:08:36 2007 From: Leif_Storset at intuit.com (Storset, Leif) Date: Tue Jul 17 10:08:43 2007 Subject: [uf-new] Receipt microformat In-Reply-To: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net> References: <657A9BE009D3504AAE29BD8E8C2DD61E07303B24@SDGEXEVS02.corp.intuit.net> <A346E6D020642649BB6701C37E8934110133ED89@SDGEXEVS04.corp.intuit.net> <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net> Message-ID: <657A9BE009D3504AAE29BD8E8C2DD61E073CBA3F@SDGEXEVS02.corp.intuit.net> Hello again, Regarding the receipt microformat, I thought I would link to Joe Osowski's (http://microformats.org/discuss/mail/microformats-new/2007-May/000394.html) and Martin Owens's (http://microformats.org/discuss/mail/microformats-discuss/2007-January/008033.html) proposals for your reference. I have collected a few samples that I will upload as soon as we agree that a wiki page is warranted. Looking forward to hearing from you!
Leif Arne Storset Technology Innovation Group, Intuit -------------- next part -------------- An HTML attachment was scrubbed... URL: http://microformats.org/discuss/mail/microformats-new/attachments/20070717/2626380b/attachment-0001.html From msporny at digitalbazaar.com Tue Jul 17 10:51:46 2007 From: msporny at digitalbazaar.com (Manu Sporny) Date: Tue Jul 17 10:51:50 2007 Subject: [uf-new] Receipt microformat In-Reply-To: <657A9BE009D3504AAE29BD8E8C2DD61E0735809A@SDGEXEVS02.corp.intuit.net> References: <