From glenn.jones at madgex.com  Tue Jul  1 01:28:24 2008
From: glenn.jones at madgex.com (Glenn Jones)
Date: Tue Jul  1 01:28:33 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
Message-ID: <36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>

As the exchange between Ben and Jeremy has shown, what is human readable
is up for debate. Having spent far too much time looking at the ISO date
formats, I find them all readable, but I know that's not the case for
everyone else.

We need to expand the discussion and ask those involved in the
accessibility area what an acceptable human-readable format is. The
format 2008-01-25 is a compromise, and as such we need to ask the other
party if it's an acceptable middle ground. For example, would the BBC
accept 2008-01-25 in the title of an abbr?

For me a good rule of thumb is: as an HTML author, would you be happy
writing out the format in the text of a page for your users to read? I
personally would never write 2008-01-05 in a public document.


My main issue with the "value excerption optimization rule" approach
that Jeremy has been talking about is that it may not work with other
data types:

A <abbr class="duration" title="P2D">2 day</abbr> event
<abbr class="geo" title="37.77;-122.41">Northern California</abbr>
<abbr class="tz" title="-07:00">EST</abbr>
<abbr class="rate" title="4">4 out of 5</abbr>
Etc.

The only way to escape the internationalisation issues is to use
nothing other than numerals and separator characters. Expressing a
duration of "2 weeks and 3 days" in numbers while still keeping it human
readable is a challenge!
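
To make the duration problem concrete, here is a minimal Python sketch that
produces both a numeric ISO 8601 value and an English rendering of the same
duration. The function names are hypothetical and not part of any
microformats specification. It assumes whole weeks are folded into days,
since the basic ISO 8601 duration syntax does not, as far as I know, mix
the week designator with other designators.

```python
def iso_duration(weeks=0, days=0):
    """Return an ISO 8601 duration string, folding weeks into days."""
    total_days = weeks * 7 + days
    return f"P{total_days}D"


def english_duration(weeks=0, days=0):
    """Return an English rendering such as '2 weeks and 3 days'."""
    parts = []
    for count, unit in ((weeks, "week"), (days, "day")):
        if count:
            parts.append(f"{count} {unit}{'' if count == 1 else 's'}")
    return " and ".join(parts) or "0 days"


print(iso_duration(weeks=2, days=3))      # P17D
print(english_duration(weeks=2, days=3))  # 2 weeks and 3 days
```

The numeric form is language neutral but hides the "weeks" the author wrote;
the English form reads well but would need a separate function for every
language, which is exactly the trade-off described above.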

Could we also say that a rate title attribute with a value of "4" does
"provide the full or expanded form of the expression" 4 out of 5?

We do need to resolve this issue globally across all content which
requires machine readability.

Although this option looks attractive at first sight, it is still
problematic.  


Glenn Jones

From xbadosa at gmail.com  Tue Jul  1 05:19:15 2008
From: xbadosa at gmail.com (Xavier Badosa)
Date: Tue Jul  1 05:26:36 2008
Subject: [uf-discuss] Microformat for statistical (tabular) data
Message-ID: <73b889410807010519m48e0853o3dbdf0a31e14d75b@mail.gmail.com>

Is there anyone working on a microformat for statistical information?
Such a microformat could be used to add more semantics to <table>s.
For example, the unit of the data, the time of reference, update time,
etc.

Some existing standards in the field to consider:

SDMX
http://www.sdmx.org

COSSI
http://www.stat.fi/org/tut/dthemes/drafts/cossi_en.html

Also:

DDI
http://www.ddialliance.org

XBRL
http://www.xbrl.org

X.
From scott at randomchaos.com  Tue Jul  1 05:28:03 2008
From: scott at randomchaos.com (Scott Reynen)
Date: Tue Jul  1 05:28:13 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<1214820941.3171.25.camel@localhost.localdomain>
	<61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com>
	<28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk>
	<36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com>
	<FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
Message-ID: <B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>

On Jun 30, at 11:12, Breton Slivka wrote:

> I think you'll find that metadata of any kind is a compromise of the
> "microformats core principles"

What I mean by "metadata" is information about content, which already  
makes up the bulk of microformats, e.g. class names, rel values, tag  
names, none of which is readily visible to humans.  Making content  
visible is a principle; making such metadata visible is not.  The  
difference with ISO dates is we've previously defined them as content;  
I'm suggesting that's a mistaken definition, as these dates don't  
function as content in our reference standard iCalendar.

Peace,
Scott

From lists at ben-ward.co.uk  Tue Jul  1 06:09:12 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Tue Jul  1 06:09:18 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<1214820941.3171.25.camel@localhost.localdomain>
	<61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com>
	<28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk>
	<36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com>
	<FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
Message-ID: <18E0B8A8-096D-479C-AC3D-445A407B48DD@ben-ward.co.uk>

On 1 Jul 2008, at 13:28, Scott Reynen wrote:

> The difference with ISO dates is we've previously defined them as  
> content; I'm suggesting that's a mistaken definition, as these dates  
> don't function as content in our reference standard iCalendar.

In my view, it's not so much that an ISO date isn't content per se,
it's that it's not content for humans, and in this case, the date
content for humans is being published in a different form. In HTML,
visible content is for humans; content for machines is hidden.

This makes for a violation of the DRY principle, but it's the same
violation we're already making, and it applies not just to datetimes,
but also to durations (which have only just been mentioned in this
discussion, and are important not to ignore), hCard telephone types,
geo co-ordinates, and everything else documented on the machine-data
wiki page (http://microformats.org/wiki/machine-data).

As an aside, this is why I favoured, and have done some initial work
on, the empty-element-with-title extension to the
value-excerption-pattern (which I'm also leading the effort to get
properly specified, since it previously has not been). It keeps the
machine content in the HTML, can be specified to keep it in physical
proximity to the human form, but due to the way empty elements are
treated, does not expose that content to humans. It does not violate
DRY any more than we already do, and in relation to the "hidden data"
principle, I argue these are exceptional cases _because_ they are DRY
violations. We are not hiding information, we're hiding an alternate
representation of visible information (issues page:
http://microformats.org/wiki/value-excerption-pattern-issues).

Much of this same line of discussion applies to the class-name data  
embedding that Jake and Frances have discussed.

If there's a semantically acceptable solution to this, which doesn't
violate any principles, or DRY, or the semantics of HTML, doesn't
compromise accessibility or internationalisation, meets publishers'
demands for flexibility and doesn't compromise user experience, then
that would be fantastic. None of the proposals discussed so far seems
to match that.

B
From michael.hausenblas at joanneum.at  Tue Jul  1 07:37:05 2008
From: michael.hausenblas at joanneum.at (Hausenblas, Michael)
Date: Tue Jul  1 08:36:04 2008
Subject: [uf-discuss] Microformat for statistical (tabular) data
In-Reply-To: <73b889410807010519m48e0853o3dbdf0a31e14d75b@mail.gmail.com>
Message-ID: <768DACDC356ED04EA1F1130F97D29852017A16A9@RZJC2EX.jr1.local>


We are not working on a microformat exactly, but you may also want to
look at http://purl.org/NET/scovo (the Statistical Core Vocabulary).

Cheers,
	Michael

----------------------------------------------------------
 Michael Hausenblas, MSc.
 Institute of Information Systems & Information Management
 JOANNEUM RESEARCH Forschungsgesellschaft mbH
  
 http://www.joanneum.at/iis/
----------------------------------------------------------
 

>-----Original Message-----
>From: microformats-discuss-bounces@microformats.org 
>[mailto:microformats-discuss-bounces@microformats.org] On 
>Behalf Of Xavier Badosa
>Sent: Tuesday, July 01, 2008 2:19 PM
>To: microformats-discuss@microformats.org
>Subject: [uf-discuss] Microformat for statistical (tabular) data
>
>Is there anyone working on a microformat for statistical information?
>Such a microformat could be used to add more semantics to <table>s.
>For example, the unit of the data, the time of reference, update time,
>etc.
>
>Some existing standards in the field to consider:
>
>SDMX
>http://www.sdmx.org
>
>COSSI
>http://www.stat.fi/org/tut/dthemes/drafts/cossi_en.html
>
>Also:
>
>DDI
>http://www.ddialliance.org
>
>XBRL
>http://www.xbrl.org
>
>X.

From guillaume at lebleu.org  Tue Jul  1 09:01:33 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Tue Jul  1 09:01:59 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
Message-ID: <486A54DD.2030105@lebleu.org>

Glenn Jones wrote:
> As the exchange between Ben and Jeremy has shown what is human readable
> is up for debate. Having spent far too much time looking at the ISO date
> formats they are all readable to me, but I know that's not the case for
> everyone else.
>
> We need to expand the discussion and ask those involved in the
> accessibility area what is an acceptable human readable format. The
> format 2008-01-25 is a compromise and as such we need to ask the other
> party if it's an acceptable middle ground. For example would the BBC
> accept 2008-01-25 in the title of an abbr.
>   
Since the BBC's request was specifically related to screen readers, we
may want to distinguish "machine-readable", "human-readable" and
"human-hearable". I think there is less debate about what is
"human-hearable" than about what is "human-readable".

IMO, "2008-01-25" is indeed more human-readable than
"2008-01-25T12:00:11", but it is still less "human-hearable" than the
plain old English "January 25th, 2008", which is human-readable and
machine-readable as long as it is written following US English
conventions precisely and the locale can be deduced from a lang attribute
(either global to the HTML document or local to the date).

Moreover, "January 25th, 2008" is indeed an expanded form of, say,
"1/25", so the following is correct HTML:

<abbr title="January 25th, 2008" class="dtstart" lang="en-us">1/25</abbr>
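
As a rough illustration of what "machine-readable as long as it follows US
English conventions" could mean for a parser, here is a minimal Python
sketch. The function name, the regular expression and the choice to accept
only the "Month DDth, YYYY" shape are assumptions made for this example;
no microformats parser is known to work this way.

```python
import re
from datetime import date

MONTHS_EN = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Matches "January 25th, 2008" style dates (US English word order).
DATE_EN_US = re.compile(r"^([A-Za-z]+)\s+(\d{1,2})(?:st|nd|rd|th)?,\s*(\d{4})$")


def parse_plain_date(text, lang):
    """Parse a plain-English date; return None for other locales or shapes."""
    if lang.lower() != "en-us":
        return None  # every other locale would need its own rules
    match = DATE_EN_US.match(text.strip())
    if not match:
        return None
    month = MONTHS_EN.get(match.group(1).lower())
    if month is None:
        return None
    return date(year=int(match.group(3)), month=month, day=int(match.group(2)))


print(parse_plain_date("January 25th, 2008", "en-us").isoformat())  # 2008-01-25
```

Even this narrow case needs a month table and a locale check, which gives a
feel for how much per-language logic a parser would have to carry.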

Guillaume
From xbadosa at gmail.com  Tue Jul  1 09:08:10 2008
From: xbadosa at gmail.com (Xavier Badosa)
Date: Tue Jul  1 09:08:15 2008
Subject: [uf-discuss] Current state of grouping proposal
Message-ID: <73b889410807010908m162c2117v907d9d56b9d228d8@mail.gmail.com>

I'm a little confused about the current state of the grouping
proposal. I'm not even sure whether the uf-community is working on a
general solution (a microformat for grouping any sort of items) (+1
vote) or a particular solution for some of the existing microformats
(0 votes).

I think some sort of grouping is needed in hReview if we want to
follow the principle of adapting to current behaviors and usage
patterns. Usually, webpages include more than one review for a single
item. To solve this, hReview forces us:

1) to repeat an unnecessary hidden item for every hreview (this
somehow violates the hidden (meta)data principle);

or

2) to use the include-pattern (empty anchor, accessibility issues).

A grouping mechanism could come to the rescue. Something like:

<div class="hset">
  <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather II</a></h1>
  <div class="hreview">
     <blockquote class="description">The best!</blockquote>
     <p class="reviewer vcard"><em class="fn">Some guy</em></p>
  </div>
  <div class="hreview">
     <blockquote class="description">Soooooo good!</blockquote>
     <p class="reviewer vcard"><em class="fn">Enthusiastic girl</em></p>
  </div>
</div>

A parser could interpret this to mean that the same item should be
associated with every hreview. In fact, a grouping microformat would
be an alternative (easy to parse) include-pattern mechanism.
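
A minimal Python sketch of such a parser, using only the standard library,
might look like the following. The "hset" class is the hypothetical name
from the markup above, and the class and function names in the code are
made up for this example. It assumes well-formed markup without void
elements such as <br>, which would unbalance the simple stack.

```python
from html.parser import HTMLParser

SAMPLE = """
<div class="hset">
  <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather II</a></h1>
  <div class="hreview">
     <blockquote class="description">The best!</blockquote>
     <p class="reviewer vcard"><em class="fn">Some guy</em></p>
  </div>
  <div class="hreview">
     <blockquote class="description">Soooooo good!</blockquote>
     <p class="reviewer vcard"><em class="fn">Enthusiastic girl</em></p>
  </div>
</div>
"""


class HsetParser(HTMLParser):
    """Collect hreviews and the item they share inside one container."""

    def __init__(self):
        super().__init__()
        self.stack = []      # class lists of the currently open elements
        self.item = {}       # shared item: {"fn": ..., "url": ...}
        self.reviews = []    # one dict per hreview
        self.current = None  # hreview being filled, if any

    def _inside(self, name):
        return any(name in classes for classes in self.stack)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        self.stack.append(classes)
        if "hreview" in classes:
            self.current = {}
            self.reviews.append(self.current)
        elif "url" in classes and self._inside("item") and self.current is None:
            self.item["url"] = attrs.get("href")

    def handle_endtag(self, tag):
        classes = self.stack.pop() if self.stack else []
        if "hreview" in classes:
            self.current = None

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack:
            return
        if self.current is not None:
            if self._inside("description"):
                self.current["description"] = text
            elif self._inside("reviewer") and "fn" in self.stack[-1]:
                self.current["reviewer"] = text
        elif self._inside("item") and "fn" in self.stack[-1]:
            self.item["fn"] = text


def parse_hset(html):
    """Return the hreviews, each with the shared item attached."""
    parser = HsetParser()
    parser.feed(html)
    return [dict(review, item=parser.item) for review in parser.reviews]


for review in parse_hset(SAMPLE):
    print(review["reviewer"], "reviewed", review["item"]["fn"])
```

The only grouping-specific step is the last one, where the item found
outside the hreviews is copied onto each of them.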

X.
From guillaume at lebleu.org  Tue Jul  1 09:27:19 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Tue Jul  1 09:27:23 2008
Subject: [uf-discuss] Plain Old English/French/...,
 human-readable/hearable alternative to ISO date
Message-ID: <486A5AE7.3000705@lebleu.org>

FYI. I've summarized/combined some of the ideas suggested by Glenn 
Jones, myself and others here [1].
I will elaborate on some of the details (ex. time) later.

Guillaume

[1] 
http://microformats.org/wiki/datetime-design-pattern#Plain_Old_English_alternative_to_ISO_date
From lists at ben-ward.co.uk  Tue Jul  1 09:42:48 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Tue Jul  1 10:04:12 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486A54DD.2030105@lebleu.org>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
Message-ID: <61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>

On 1 Jul 2008, at 17:01, Guillaume Lebleu wrote:

> Since the BBC's request was specifically related to screen readers,  
> we may want to distinguish "machine-readable", "human-readable" and  
> "human-hearable". I think there is less debate re: what is "human- 
> hearable" than there is debate re: what is "human-readable"

The BBC complaint directly refers to both screen readers and the  
display of unexpected text in tool-tips. It's not just about aural  
output.

At the core, in breaking with the semantics of an HTML element, we've  
broken the behaviour of technologies using the element correctly and  
intelligently (hence my strong opposition to continuing to stretch  
ABBR outside of textual abbreviations as commonly described by  
dictionaries: "An abbreviation is a shortened form of a word or
phrase." - Wikipedia, Apple OSX Dictionary, Dictionary.com).

B
From danbri at danbri.org  Tue Jul  1 11:16:00 2008
From: danbri at danbri.org (Dan Brickley)
Date: Tue Jul  1 11:16:05 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486A54DD.2030105@lebleu.org>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
Message-ID: <486A7460.1060704@danbri.org>

Guillaume Lebleu wrote:
> Glenn Jones wrote:
>> As the exchange between Ben and Jeremy has shown what is human readable
>> is up for debate. Having spent far too much time looking at the ISO date
>> formats they are all readable to me, but I know that's not the case for
>> everyone else.
>>
>> We need to expand the discussion and ask those involved in the
>> accessibility area what is an acceptable human readable format. The
>> format 2008-01-25 is a compromise and as such we need to ask the other
>> party if it's an acceptable middle ground. For example would the BBC
>> accept 2008-01-25 in the title of an abbr.
>>   
> Since the BBC's request was specifically related to screen readers, we 
> may want to distinguish "machine-readable", "human-readable" and 
> "human-hearable". I think there is less debate re: what is 
> "human-hearable" than there is debate re: what is "human-readable"

This reading is a little narrow: screen readers can also have Braille
output; e.g. see http://www.yourdolphin.com/productdetail.asp?id=5&z=1
and http://en.wikipedia.org/wiki/Refreshable_Braille_display

cheers,

Dan

--
http://danbri.org/

From uf-discuss at cilux.org  Tue Jul  1 13:49:33 2008
From: uf-discuss at cilux.org (Duncan Cragg)
Date: Tue Jul  1 13:49:39 2008
Subject: [uf-discuss] class="tag"
In-Reply-To: <cdc278e10806300103n53e8ad94x51951f731507956a@mail.gmail.com>
References: <04DA6562-7A56-4E11-B05C-D2D2994E9709@tobyinkster.co.uk>	<48679704.6050708@cilux.org>
	<cdc278e10806300103n53e8ad94x51951f731507956a@mail.gmail.com>
Message-ID: <486A985D.2030403@cilux.org>

Ciaran McNulty wrote:
> On Sun, Jun 29, 2008 at 3:07 PM, Duncan Cragg <uf-discuss@cilux.org> wrote:
>   
>> Those of us who favour opaque URLs (actually for practical reasons such as
>> clean separation of concerns, maintainability, etc.) are unhappy with being
>> forced into a semantic URL schema when using rel-tag.
>>     

> Can you go into a bit more detail, or point to a resource explaining
> the benefits of opaque URLs?  It's something I've not come across
> before and I'd be intrigued to see the reasons behind it.
>   

I'll do both. Here's a resource explaining it - I addressed the subject 
in this blog post:

http://duncan-cragg.org/blog/post/content-types-and-uris-rest-dialogues/

That is a very transparent URL (see: I'm not obsessive about it!). 

The trouble with my URL is that it mixes three concerns:

1. making a connection to my server and kicking off HTTP
2. identifying a resource (with a completely opaque string) within HTTP
3. kicking off some Python code with an argument string

It's 1. and 3. that I object to; as far as HTTP itself is concerned
(2.), URLs are already opaque.

As soon as you allow syntax or schema into URLs - as soon as you start
using anything other than long random numbers - you've got a problem of
namespace allocation and schema standardisation. I refer to "Zooko's
Triangle" on my blog's right rail, which discusses the trade-off between
global uniqueness, security and memorability.
_________________________________________

On 1.: Unless you're running fancy P2P algorithms, it's hard to argue 
against putting a big hint in the URL to say where to go to find the 
resource. But don't forget that you needn't go to that server - you 
could ask an intermediary proxy - which is kind of a simplistic P2P 
algorithm... 

However, there is a case for arguing that DNS has been a failure: it
isn't any easier to type a URL when you know you have to be so
precise to avoid scam sites. And it isn't any easier to use it to
identify a site when you have to avoid the likes of
www.yahoo.com.baddies.com or www.google.randomtld. You may as well only
use IP addresses; they are as hard to type and as useless to read. Most
programs come with a copy-paste function to save some typing...

Add to this lack of security (and other security holes) the absurd 
scramble for domain name real estate and such bad behaviour as domain 
squatting, etc., and it's looking like a system that only system admins 
and crooks benefit from. 

Most people (including myself) would type 'acme' into Google instead of 
'acme.com' into the URL bar, to give an extra level of intelligence, 
familiarity, trust and user interface consistency.
_________________________________________

But really it's 3. that bothers me most: using URLs to pass
human-readable strings to an application 'above' HTTP.

A transparent URL string is always a query string (whether it has a '?' 
or not) - in other words, it could potentially be ambiguous and return, 
not definitely one, but zero or many possible results.  We probably get 
zero results when we 'hack' a URL or when the site gets reorganised. We 
gloss over the many-results case by returning a single page that we call 
'query results'. But by allowing in zero or many resources so easily, 
we've loosened the Web by removing the definite 1-1 mapping of URL to 
resource.

Hackable URLs should not be part of a self-respecting website's user 
interface. We would give a better user experience if we took the URL bar 
away and replaced it with a 'jump to first clipboard web link' button, 
for those copy-paste situations. Such a button would intelligently parse 
the text on the clipboard for URLs and jump to the first location 
discovered.  A good information architecture and user interaction design 
makes hackable URLs irrelevant.
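
The parsing half of such a 'jump to first clipboard web link' button could
be sketched in Python as below. The function name and the deliberately
loose URL pattern are assumptions for illustration; reading the clipboard
itself is platform specific and left out.

```python
import re

# Loose pattern: scheme, then anything up to whitespace, quotes or brackets.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def first_url(text):
    """Return the first http(s) URL found in the text, or None."""
    match = URL_PATTERN.search(text)
    if not match:
        return None
    # Drop punctuation that usually belongs to the surrounding sentence.
    return match.group(0).rstrip(".,;:!?)")


print(first_url("See http://tagbeat.com/3720a-993117b, then reply."))
```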

Another problem is when people start using their knowledge of the URL 
structure to generate new URLs - it may be acceptable or encouraged 
(even prescribed in an HTML GET form), but each time it happens, we're 
creating a unique mini-contract - another non-standard schema.  The Web 
thrives on URL proliferation, not on schema proliferation!

The need for URLs to be reliable - to always return what they are 
expected to return each time they're used - means that whatever URL 
schema or namespace you come up with is something you're stuck with - 
people or even programs may depend on it.  But there's no standards body 
or namespace body looking after the bigger picture for you. Your 
mistakes may haunt you for a long time.

Also, query URLs are inherently /not/ reliable - the resource they 
return is /expected/ to change, which again makes their (re)-use less 
desirable.

Clearly, the W3C's unfortunate 'httpRange-14' issue would never have 
occurred with opaque URLs. In other words, opaque, semantics-free HTTP 
URIs are /always/ dereferenceable to 'information resources' and /never/ 
refer to cars! Strings that are part of a car domain model belong inside 
/content/ not in links to content - they belong above HTTP. I'm not 
fully conversant in the Semantic Web domain, but I suspect that there 
are issues in there that are caused by mixing up globally unique 
identifier strings used to build information structures with strings 
that are semantically-meaningful over those structures, and that can 
dereference to sets.

So my main objection to transparent URLs is the way they mix up the 
mechanism for linking up the Web with a mechanism for querying it. The 
Web works fine using HTTP and opaque URLs. We have POST and Content-Type 
and OpenSearch schemas to query the Web.
_________________________________________

Practical examples:

You can return opaque links to time-ordered collections listing the 
latest documents to be tagged 'semweb':

<a class="tag" href="http://tagbeat.com/3720a-993117b">semweb</a>

Keep your URLs opaque (like GUIDs in databases) and put your application 
data and queries in the content (like SQL queries and result sets in 
databases). Give your query content resources a first-class schema - see 
OpenSearch - and even their own URLs. POST these queries to opaque 
collection URLs. Make your result sets transient (returned in the POST 
response, thus no-cache by default). Result sets should only be 
'grounded' (thus linkable and cacheable) if explicitly asked for in the 
query, when you should redirect to a new resource in the POST response.

Of course, you can still surround the UUID/GUID part of your opaque URLs
with human-readable string decorations, as long as they're never used to
dereference the resource but only for mnemonic purposes or for search
engine optimisation.
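
One way to read the "decorated but still opaque" idea is sketched below in
Python. The host name, the 32-character hexadecimal ID and both function
names are assumptions made for this example; the point is only that lookup
uses the ID and ignores the decoration.

```python
import re
import uuid

BASE = "http://example.org/"  # hypothetical host


def make_url(resource_id, label=""):
    """Build an opaque URL, optionally decorated with a mnemonic label."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return f"{BASE}{resource_id}/{slug}" if slug else f"{BASE}{resource_id}"


def resource_id_from(url):
    """Extract only the opaque ID; any decoration is ignored on lookup."""
    match = re.match(re.escape(BASE) + r"([0-9a-f]{32})(?:/.*)?$", url)
    return match.group(1) if match else None


rid = uuid.uuid4().hex
decorated = make_url(rid, "Content types and URIs")
print(decorated)
print(resource_id_from(decorated) == rid)  # True
```

Because the label never takes part in the lookup, it can be changed or
dropped without breaking existing links.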
_________________________________________

I've gone on at length (again!), but hope you have had the patience to 
get my point of view. =0)

Cheers!

Duncan Cragg

PS  I work at the Financial Times over the river from you - but I was a 
URL opacitist /before/ having to wrangle with the FT CMS...!



From brian.suda at gmail.com  Tue Jul  1 14:05:10 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Tue Jul  1 14:05:22 2008
Subject: [uf-discuss] class="tag"
In-Reply-To: <486A985D.2030403@cilux.org>
References: <04DA6562-7A56-4E11-B05C-D2D2994E9709@tobyinkster.co.uk>
	<48679704.6050708@cilux.org>
	<cdc278e10806300103n53e8ad94x51951f731507956a@mail.gmail.com>
	<486A985D.2030403@cilux.org>
Message-ID: <21e770780807011405o601dbab4s54156e6a34ab6431@mail.gmail.com>

On Tue, Jul 1, 2008 at 8:49 PM, Duncan Cragg <uf-discuss@cilux.org> wrote:
> Practical examples..
>
> You can return opaque links to time-ordered collections listing the latest
> documents to be tagged 'semweb':
>
> <a class="tag" href="http://tagbeat.com/3720a-993117b">semweb</a>
--- I think we are trying to re-invent:

<a class="category" href="http://tagbeat.com/3720a-993117b">semweb</a>

Instead of trying to create "tag" as a class value which does the
exact same thing as "category" we should approach the various
microformats and see if they can/should simply include 'category' as
one of the values they recognize rather than trying to re-invent
rel-tag as class-tag.

-brian

-- 
brian suda
http://suda.co.uk
From mdagn at spraci.com  Wed Jul  2 00:49:45 2008
From: mdagn at spraci.com (Michael MD)
Date: Wed Jul  2 00:49:49 2008
Subject: [uf-discuss] Human and machine readable data format
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com><36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
Message-ID: <001401c8dc18$327bf750$116bacca@COMCEN>

> IMO, "2008-01-25" is indeed more human-readable than 
> "2008-01-25T12:00:11", but it is still less "human-hearable" than the 
> plain old English "January 25th, 2008", which is human-readable and 
> machine-readable as long as it is written following precisely English US 
> conventions and the locale can be deduced from a lang attribute (either 
> global to the HTML document or local to the date).



Allowing language conventions for date parsing to be determined by anything 
"global" sounds a bit dangerous to me.

Someone might post on a shared blog/forum site in a different country and 
mark it up in a way that does not match a lang attribute somewhere else on 
the page!

Also, who is to say that all replies to the post, or comments that
might also appear on that same page, are going to follow the same
language rules?

From guillaume at lebleu.org  Wed Jul  2 09:36:30 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Wed Jul  2 09:36:39 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <001401c8dc18$327bf750$116bacca@COMCEN>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com><36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>	<486A54DD.2030105@lebleu.org>
	<001401c8dc18$327bf750$116bacca@COMCEN>
Message-ID: <486BAE8E.90401@lebleu.org>

Michael MD wrote:
> Allowing language conventions for date parsing to be determined by 
> anything "global" sounds a bit dangerous to me.
>
> Someone might post on a shared blog/forum site in a different country 
> and mark it up in a way that does not match a lang attribute somewhere 
> else on the page!
>
> also - who is going to say that all replies to the post or comments 
> that might also appear on that same page are going to follow the same 
> language rules 
Sorry if I didn't express myself clearly. What I meant here was that a 
lang="..." attribute on the element of class vevent or dtstart is 
recommended at all times (to deal with the very ambiguity you are 
referring to), but is optional to comply with DRY. If not present, its 
value may be inferred from the closest containing/ancestor element with 
a lang attribute, for instance a lang attribute value at the level of 
the html element.

In other words, if I want to write my date in French in an en-us html 
document, I'd have to attach lang="fr" to my date or its containing 
content, but if I want to write my date in American English in the same 
document, I don't have to attach lang="en-us", although it wouldn't hurt to.
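
The fallback rule described above can be sketched in a few lines. This is only an illustration, assuming well-formed (XHTML-style) markup so that Python's xml.etree.ElementTree can read it; the function name effective_lang and the sample markup are made up for the example and are not part of any microformats specification.

```python
import xml.etree.ElementTree as ET


def effective_lang(root, target):
    """Return the lang of `target`, or of its closest ancestor that declares one."""
    # ElementTree keeps no parent pointers, so build a child -> parent map first.
    parent = {child: node for node in root.iter() for child in node}
    node = target
    while node is not None:
        lang = node.get("lang")
        if lang:
            return lang.lower()
        node = parent.get(node)  # None once we step above the root
    return None  # no lang attribute anywhere up the tree


# Hypothetical document: en-US globally, one paragraph overridden to French.
doc = ET.fromstring(
    '<html lang="en-US"><body>'
    '<p><abbr class="dtstart" title="July 3rd, 2008">tomorrow</abbr></p>'
    '<p lang="fr"><abbr class="dtstart" title="3 juillet 2008">demain</abbr></p>'
    '</body></html>'
)
english, french = doc.iter("abbr")
print(effective_lang(doc, english), effective_lang(doc, french))
```

The first date inherits its language from the html element, while the second picks up the override on its containing paragraph, so a comment in another language only needs one lang attribute on its container.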

Do you still see this as dangerous practice?

G
From ameer1234567890 at gmail.com  Wed Jul  2 13:04:00 2008
From: ameer1234567890 at gmail.com (Ameer Dawood)
Date: Wed Jul  2 13:04:10 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: GmailId11ae4a7a3727c1b1
Message-ID: <17993138.377781215029040996.JavaMail.flurry@web1>

Hi G,
Internationalization of metadata is a bad and ineffective concept. It would result not only in bloat in parsers, but also in bloat in the data format itself. You are proposing to internationalize the machine-readable date, which is metadata (and not content, in this case). A very clear example: if we were to internationalize CSS, then "border-color" would become "border-colour" in en-uk. It's like proposing this change with a lang attribute/element.

Ameer


From bjonkman at sobac.com  Wed Jul  2 13:24:27 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Wed Jul  2 13:25:14 2008
Subject: [uf-discuss] Current state of grouping proposal
In-Reply-To: <73b889410807010908m162c2117v907d9d56b9d228d8@mail.gmail.com>
References: <73b889410807010908m162c2117v907d9d56b9d228d8@mail.gmail.com>
Message-ID: <486BABBB.11260.1234965@bjonkman.sobac.com>

I think the grouping mechanism can be accomplished with XOXO, 
http://microformats.org/wiki/xoxo

--Bob.

>>> 1 Jul 2008 18:08  Xavier Badosa <microformats-
discuss@microformats.org>  >>>

> I'm a little confused about the current state of the grouping
> proposal. I'm not sure even if the uf-community is working on a
> general solution (a microformat for grouping any sort of items) (+1
> vote) or a particular solution for some of the existing microformats
> (0 votes).
> 
> I think some sort of grouping is needed in hReview if we want to
> follow the principle of adapting to current behaviors and usage
> patterns. Usually, webpages include more than one review for a single
> item. To solve this, hReview forces us:
> 
> 1) to repeat an unnecessary hidden item for every hreview (this
> somehow violates the hidden (meta)data principle);
> 
> or
> 
> 2) to use the include-pattern (empty anchor, accessibility issues).
> 
> A grouping mechanism could come to the rescue. Something like:
> 
> <div class="hset">
>   <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather
>   II</a></h1> <div class="hreview">
>      <blockquote class="description">The best!</blockquote>
>      <p class="reviewer vcard"><em class="fn">Some guy</em></p>
>   </div>
>   <div class="hreview">
>      <blockquote class="description">Soooooo good!</blockquote>
>      <p class="reviewer vcard"><em class="fn">Enthusiastic
>      girl</em></p>
>   </div>
> </div>
> 
> could be interpreted by a parser that the same item should be
> associated with every hreview. In fact, a grouping microformat would
> be an alternative (easy to parse) include-pattern mechanism.
> 
> X.


-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/    
SOBAC Microcomputer Services              Voice: +1-519-669-0388       
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From bjonkman at sobac.com  Wed Jul  2 15:37:05 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Wed Jul  2 16:09:20 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>,
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>,
	<B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
Message-ID: <486BCAD1.20690.19CB840@bjonkman.sobac.com>

>>> 1 Jul 2008 6:28  Scott Reynen <microformats-
discuss@microformats.org>  >>>

>  The difference with ISO dates is we've previously defined them as
> content; I'm suggesting that's a mistaken definition, as these dates
> don't function as content in our reference standard iCalendar. 

I disagree.  In an appointment, the date IS the content.  The metadata 
is the markup that identifies the date and its purpose, eg. 
class="dtstart".

With an <abbr> the date content is represented in two different ways, 
once as prose ("tomorrow at noon"), and once as an expansion.  In 
prosaic HTML it is valid (and appropriate) to write

  <abbr title="Noon, July 3rd, 2008">tomorrow at noon</abbr>

but that's not a suitable machine readable format.

Microformats have properly used <abbr> to expand prosaic dates, but the 
syntax has been friendly to neither screen readers nor title popups.  
So, the compromise is to have an expansion that's friendly to both 
screen readers and title popups, and is also machine readable.  

Splitting dates and time into separate <abbr> chunks accomplishes most 
of that.

  <abbr title="2008-07-03">tomorrow</abbr> <abbr title="12:00">at noon</abbr>


For those who think this violates the semantic intent of <abbr> I'm all 
in favour of a <span title="2008-07-03"> element.  This can be combined 
nicely with <abbr> for the screen reader and popup crowd:


  <div class="vevent">
   <span class="summary">
     Big blowout lunch party
   </span>
   <span class="dtstart">
    <span class="value" title="2008-07-03">
     <abbr title="July 3rd, 2008">
       tomorrow
     </abbr>
    </span>
    <span class="value" title="12:00">
      at noon
    </span>
   </span>
  </div>

(using the newly proposed date and time value excerpts)

I've put <abbr> inside <span> to speak/display the innermost title 
(this needs testing!)
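
A rough sketch of how a parser might consume the markup above, under two assumptions that are mine rather than part of the proposal: the date excerpt precedes the time excerpt, and the pieces are joined with "T" to form an ISO 8601 value. The helper names has_class and dtstart_value are hypothetical, and the markup must be well-formed for xml.etree.ElementTree to read it.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the space-separated class names on the element."""
    return name in element.get("class", "").split()


def dtstart_value(vevent):
    """Join the title of each "value" element inside dtstart into one string."""
    for prop in vevent.iter():
        if has_class(prop, "dtstart"):
            parts = [v.get("title") for v in prop.iter()
                     if has_class(v, "value") and v.get("title")]
            # Assumption: date excerpt first, time excerpt second.
            return "T".join(parts)
    return None  # no dtstart property found


markup = """
<div class="vevent">
 <span class="summary">Big blowout lunch party</span>
 <span class="dtstart">
  <span class="value" title="2008-07-03">
   <abbr title="July 3rd, 2008">tomorrow</abbr>
  </span>
  <span class="value" title="12:00">at noon</span>
 </span>
</div>
"""
event = ET.fromstring(markup.strip())
print(dtstart_value(event))
```

Note that the inner abbr title ("July 3rd, 2008") is never read by the parser; it exists only for screen readers and popups.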

--Bob.


-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/    
SOBAC Microcomputer Services              Voice: +1-519-669-0388       
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From karl at w3.org  Wed Jul  2 17:02:29 2008
From: karl at w3.org (Karl Dubost)
Date: Wed Jul  2 17:02:37 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486BAE8E.90401@lebleu.org>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local><1214820941.3171.25.camel@localhost.localdomain><61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com><28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk><36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com><FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com><AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com><3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com><36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>	<486A54DD.2030105@lebleu.org>
	<001401c8dc18$327bf750$116bacca@COMCEN> <486BAE8E.90401@lebleu.org>
Message-ID: <A8919C28-FE1D-49FB-9D2F-DD88550F4EF2@w3.org>


On 3 Jul 2008, at 01:36, Guillaume Lebleu wrote:
> In other words, if I want to write my date in French in an en-us  
> html document, I'd have to attach lang="fr" to my date or its  
> containing content,
[...]
> Do you still see this as dangerous practice?


Not dangerous, but impractical in the case of editing through web  
forms. Because of the state of the art of browser implementations,  
there is no real, interoperable editing tool in the browser context. I  
guess it's one of the major blows to interesting authoring on the  
Web right now.

-- 
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool







From zen at zenpsycho.com  Wed Jul  2 19:04:44 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Wed Jul  2 19:04:48 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <17993138.377781215029040996.JavaMail.flurry@web1>
References: <17993138.377781215029040996.JavaMail.flurry@web1>
Message-ID: <ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>

This is not internationalization of metadata, this is
internationalization of Data, as in Content. We are talking about a
date format that will be read out to users not just in the US, but
potentially anywhere in the world. I honestly believe the "bloat" to
parsers would be significant. Particularly if our precious parser
authors are not incompetent, and we hope they are not. Please note
that many programs (Excel is one example off the top of my head)
provide exactly this type of date parsing, dependent on locale. Many
more programs and operating systems provide services for the
generation of locale-specific dates. The ECMAScript standard includes
such a facility for both generation and parsing of locale-specific
dates. ECMAScript parsers must be light enough to work on a mobile
device with a browser.

I hope I have been persuasive in demonstrating that more sophisticated
parsers will be necessary if we are to satisfy the "No Information
Hiding" and "Humans First, Machines Second" principles of the
microformat community. I find it frustrating that we still have people
being sensitive about the bloat.

I offer the challenge to those developers: If you sincerely believe
that simple internationalized date parsing is an unsolvable or
difficult problem (which, as I have pointed out has been solved
numerous times already, with two examples), please present your
evidence. Why is avoiding this work more important than Accessibility?
Why is avoiding this work more important than avoiding hidden
metadata?





From zen at zenpsycho.com  Wed Jul  2 19:06:24 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Wed Jul  2 19:06:28 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
References: <17993138.377781215029040996.JavaMail.flurry@web1>
	<ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
Message-ID: <ae2b2ba80807021906m3df386e7x4a1c61ef116f130d@mail.gmail.com>

"I honestly believe the "bloat" to
> parsers would be significant"

sorry, I meant  "I Honestly believe the 'bloat' to parsers would _not_
be significant"
From xbadosa at gmail.com  Thu Jul  3 01:09:31 2008
From: xbadosa at gmail.com (Xavier Badosa)
Date: Thu Jul  3 01:09:36 2008
Subject: [uf-discuss] Current state of grouping proposal
In-Reply-To: <486BABBB.11260.1234965@bjonkman.sobac.com>
References: <73b889410807010908m162c2117v907d9d56b9d228d8@mail.gmail.com>
	<486BABBB.11260.1234965@bjonkman.sobac.com>
Message-ID: <73b889410807030109g8608cd8k7070773ba2a656c1@mail.gmail.com>

> I think the grouping mechanism can be accomplished with XOXO,

XOXO as it is now (or as I understand it) is based on list elements
(ol ul dl), and these are not suited for "grouping" purposes, not in
my sense at least: maybe I should say "classing" instead of "grouping"
because in my meaning the idea of inheritance is important.

List elements don't allow data to be associated with the group itself.
In my previous example,

>> <div class="hset">
>>   <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather
>>   II</a></h1> <div class="hreview">
>>      <blockquote class="description">The best!</blockquote>
>>      <p class="reviewer vcard"><em class="fn">Some guy</em></p>
>>   </div>
>>   <div class="hreview">
>>      <blockquote class="description">Soooooo good!</blockquote>
>>      <p class="reviewer vcard"><em class="fn">Enthusiastic
>>      girl</em></p>
>>   </div>

"item" is associated with the group ("hset"), telling implicitly the
machine that it must be replicated for every member ("hreview") of the
"group" (or "class"). It's a sort of include-pattern mechanism that
happens to follow the principle of adapting to current behaviors in
the publication of reviews on webpages. I think you can't do that with
XOXO.
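
To make the intended parser behaviour concrete, here is a minimal sketch. "hset" is only the class name proposed in this thread, not an existing microformat, and the helper names (has_class, text_of, reviews_in_hset) are invented for the example. It assumes well-formed markup and exactly one item per group.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the space-separated class names on the element."""
    return name in element.get("class", "").split()


def text_of(element):
    """Text content with the whitespace left by line wrapping collapsed."""
    return " ".join("".join(element.itertext()).split())


def reviews_in_hset(hset):
    """Copy the group's single item into every hreview found inside the group."""
    item = next(e for e in hset if has_class(e, "item"))  # direct children only
    reviews = []
    for review in hset.iter():
        if not has_class(review, "hreview"):
            continue
        description = next(e for e in review.iter() if has_class(e, "description"))
        reviewer = next(e for e in review.iter() if has_class(e, "reviewer"))
        reviews.append({
            "item": text_of(item),
            "description": text_of(description),
            "reviewer": text_of(reviewer),
        })
    return reviews


markup = """
<div class="hset">
  <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather
  II</a></h1>
  <div class="hreview">
     <blockquote class="description">The best!</blockquote>
     <p class="reviewer vcard"><em class="fn">Some guy</em></p>
  </div>
  <div class="hreview">
     <blockquote class="description">Soooooo good!</blockquote>
     <p class="reviewer vcard"><em class="fn">Enthusiastic
     girl</em></p>
  </div>
</div>
"""
reviews = reviews_in_hset(ET.fromstring(markup.strip()))
for review in reviews:
    print(review)
```

Each resulting review carries its own copy of the item, which is the replication the grouping is meant to imply, without any hidden or repeated markup in the page.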

X.

On Wed, Jul 2, 2008 at 10:24 PM, Bob Jonkman <bjonkman@sobac.com> wrote:
> I think the grouping mechanism can be accomplished with XOXO,
> http://microformats.org/wiki/xoxo
>
> --Bob.
>
>>>> 1 Jul 2008 18:08  Xavier Badosa <microformats-
> discuss@microformats.org>  >>>
>
>> I'm a little confused about the current state of the grouping
>> proposal. I'm not sure even if the uf-community is working on a
>> general solution (a microformat for grouping any sort of items) (+1
>> vote) or a particular solution for some of the existing microformats
>> (0 votes).
>>
>> I think some sort of grouping is needed in hReview if we want to
>> follow the principle of adapting to current behaviors and usage
>> patterns. Usually, webpages include more than one review for a single
>> item. To solve this, hReview forces us:
>>
>> 1) to repeat an unnecessary hidden item for every hreview (this
>> somehow violates the hidden (meta)data principle);
>>
>> or
>>
>> 2) to use the include-pattern (empty anchor, accessibility issues).
>>
>> A grouping mechanism could come to the rescue. Something like:
>>
>> <div class="hset">
>>   <h1 class="item"><a href="IMDB_URI" class="fn url">The Godfather
>>   II</a></h1> <div class="hreview">
>>      <blockquote class="description">The best!</blockquote>
>>      <p class="reviewer vcard"><em class="fn">Some guy</em></p>
>>   </div>
>>   <div class="hreview">
>>      <blockquote class="description">Soooooo good!</blockquote>
>>      <p class="reviewer vcard"><em class="fn">Enthusiastic
>>      girl</em></p>
>>   </div>
>> </div>
>>
>> could be interpreted by a parser that the same item should be
>> associated with every hreview. In fact, a grouping microformat would
>> be an alternative (easy to parse) include-pattern mechanism.
>>
>> X.
>
>
> -- -- -- --
> Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
> SOBAC Microcomputer Services              Voice: +1-519-669-0388
> 6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
> Software   ---   Office & Business Automation   ---   Consulting
>
>
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss
>
From danbri at danbri.org  Thu Jul  3 02:04:10 2008
From: danbri at danbri.org (Dan Brickley)
Date: Thu Jul  3 02:04:17 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
References: <17993138.377781215029040996.JavaMail.flurry@web1>
	<ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
Message-ID: <486C960A.5080305@danbri.org>

Breton Slivka wrote:

> I offer the challenge to those developers: If you sincerely believe
> that simple internationalized date parsing is an unsolvable or
> difficult problem (which, as I have pointed out has been solved
> numerous times already, with two examples), please present your
> evidence. Why is avoiding this work more important than Accessibility?
> Why is avoiding this work more important than avoiding hidden
> metadata?

The examples you gave (ecmascript, spreadsheets) relate to the 
interpretation of a single simple date string. Much of the discussion 
here has instead been about the interpretation of marked up paragraphs 
of natural language prose where dates are mentioned. The former is a big 
enough job, as you point out. But the latter is a substantially larger 
undertaking.

Imagine the English language permutations of "Tuesday the fourteenth of 
July, next year" in terms of word order. Then allow for all natural 
languages (in all written scripts). And don't forget we use a variety of 
calendars. Big job. In theory it could be attempted; but the culture 
around here is averse to 'theoretical' solutions.

While there is value in minimising "hidden metadata", this isn't an all 
or nothing decision. Having the data within the HTML document itself is 
   already a big win in many cases, compared to putting it in a separate 
XML file. Having the data local to the paragraph within the HTML 
document (rather than in the head section) is also a major achievement 
w.r.t. maintainability. Both of these factors reduce the hiddenness of 
data; putting info in an attribute is not the end of the world. Given 
the other tradeoffs, I think it should be seriously considered.

cheers,

Dan

--
http://danbri.org/
From zen at zenpsycho.com  Thu Jul  3 05:39:32 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Thu Jul  3 05:39:36 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486C960A.5080305@danbri.org>
References: <17993138.377781215029040996.JavaMail.flurry@web1>
	<ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
	<486C960A.5080305@danbri.org>
Message-ID: <ae2b2ba80807030539ga8e615ap43be5fa09776aa56@mail.gmail.com>

On Thu, Jul 3, 2008 at 7:04 PM, Dan Brickley <danbri@danbri.org> wrote:
> Breton Slivka wrote:
>
>> I offer the challenge to those developers: If you sincerely believe
>> that simple internationalized date parsing is an unsolvable or
>> difficult problem (which, as I have pointed out has been solved
>> numerous times already, with two examples), please present your
>> evidence. Why is avoiding this work more important than Accessibility?
>> Why is avoiding this work more important than avoiding hidden
>> metadata?

> Imagine the English language permutations of "Tuesday the forteenth of July,
> next year" in terms of word order. Then allow for all natural languages (in
> all written scripts). And don't forget we use a variety of calendars. Big
> job. In theory it could be attempted; but the culture around here is averse
> to 'theoretical' solutions.
>

Once again this straw man is trotted out. Who is discussing this type
of solution other than to specifically discredit the approach as too
hard?

I certainly am not suggesting this kind of wide-ranging natural
language parser. I haven't seen anyone else seriously suggesting it.
It's a foolish undertaking, and it's obviously a foolish undertaking.
Then WHY OH WHY does this keep being brought up as though it were
being seriously discussed? Where does this idea keep popping out from?

Let me give an example in pseudocode of a parser that would work, and
would be simple to write, and whose format could be read by a screen
reader.

function parser(datestring, locale) {

  var enMonths = ["January", "February", "March", "April", "May", "June",
      "July", "August", "September", "October", "November", "December"];
  var m = null, day, month, year;

  if (locale === "en-us") {
    // e.g. "July 3rd, 2008"
    m = /^([A-Za-z]+) ([1-3]?[0-9])(?:st|nd|rd|th), ([0-9]{1,4})$/.exec(datestring);
    if (m) { month = m[1]; day = m[2]; year = m[3]; }
  }

  if (locale === "en-au" || locale === "en-uk") {
    // e.g. "3rd July, 2008"
    m = /^([1-3]?[0-9])(?:st|nd|rd|th) ([A-Za-z]+), ([0-9]{1,4})$/.exec(datestring);
    if (m) { day = m[1]; month = m[2]; year = m[3]; }
  }

  if (!m) return null;  // unknown locale, or title not in the locale's format

  if (locale.indexOf("en") === 0)
    month = enMonths.indexOf(month) + 1;  // "January" -> 1; 0 if unknown name

  return [Number(year), month, Number(day)];

}


This is a simple example. There are likely better techniques for doing
this than regexes (or not), but the point is that you can make a
human READABLE format without having to cover the whole spectrum of
human expression. Instead, you have ONE precise format for US dates,
ONE precise format for UK dates, ONE precise format for Japanese
dates, etc. You stick this format of date in the title of an
ABBR, and you can say whatever you want about the date, in whatever
language you like, in the contents of the ABBR. The parser shouldn't
care about the contents. It's just looking at the title. It already
is. The only change from the current pattern is that we'd be using a
less geeky and obscure format than ISO 8601. The lang attribute of the
ABBR element provides the format in use.
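
For anyone who wants to try the idea, here is a runnable Python sketch of the "one precise format per language tag" table. Everything in it is illustrative: the FORMATS table and the name title_to_iso are invented, the French entry is my own addition to show a second language, "en-gb" is used as the registered tag for what the pseudocode above calls en-uk, and the comma in the British form is treated as optional. No calendar validation is attempted.

```python
import re

EN_MONTHS = ["january", "february", "march", "april", "may", "june", "july",
             "august", "september", "october", "november", "december"]
FR_MONTHS = ["janvier", "février", "mars", "avril", "mai", "juin", "juillet",
             "août", "septembre", "octobre", "novembre", "décembre"]

# One pattern per language tag, each with named day/month/year groups.
FORMATS = {
    "en-us": (re.compile(
        r"(?P<month>[a-z]+) (?P<day>\d{1,2})(?:st|nd|rd|th), (?P<year>\d{4})"),
        EN_MONTHS),
    "en-gb": (re.compile(
        r"(?P<day>\d{1,2})(?:st|nd|rd|th) (?P<month>[a-z]+),? (?P<year>\d{4})"),
        EN_MONTHS),
    "fr": (re.compile(
        r"(?P<day>\d{1,2})(?:er)? (?P<month>[a-zéû]+) (?P<year>\d{4})"),
        FR_MONTHS),
}


def title_to_iso(title, lang):
    """Convert a title written in the single format for `lang` to YYYY-MM-DD."""
    entry = FORMATS.get(lang.lower())
    if entry is None:
        return None  # no format registered for this language tag
    pattern, months = entry
    match = pattern.fullmatch(title.strip().lower())
    if match is None or match["month"] not in months:
        return None  # title does not follow the format exactly
    month = months.index(match["month"]) + 1
    return f"{int(match['year']):04d}-{month:02d}-{int(match['day']):02d}"


print(title_to_iso("July 3rd, 2008", "en-US"))
print(title_to_iso("3rd July, 2008", "en-GB"))
print(title_to_iso("1er août 2008", "fr"))
```

Adding a language means adding one table row, which is the whole cost being argued about; anything that does not match its row exactly is rejected rather than guessed at.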

Honestly how difficult is it for a parser author to collect one format
for each locale? I've seen far more heroic efforts on simpler things.
How difficult is it for content publishers to learn ONE format? (The
one for their own locale) ?
How difficult is it to ask content authors to learn a format like
this? We're already asking them to learn a more difficult format!

Yes it's more complicated than parsing ISO 8601. But it's not boiling
the ocean. This isn't a binary decision we're facing. It's not a
choice between "I could implement it in an hour" level of simplicity
and "Human level" AI. Compromise has to be made if we are to make any
progress.
From qidydl at gmail.com  Thu Jul  3 07:04:51 2008
From: qidydl at gmail.com (David O)
Date: Thu Jul  3 07:04:55 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807030539ga8e615ap43be5fa09776aa56@mail.gmail.com>
References: <17993138.377781215029040996.JavaMail.flurry@web1>
	<ae2b2ba80807021904q1f89c26amab10eae919684966@mail.gmail.com>
	<486C960A.5080305@danbri.org>
	<ae2b2ba80807030539ga8e615ap43be5fa09776aa56@mail.gmail.com>
Message-ID: <ee0909a60807030704o18c2d81dnc463ecf6635850a8@mail.gmail.com>

On Thu, Jul 3, 2008 at 8:39 AM, Breton Slivka <zen@zenpsycho.com> wrote:
> On Thu, Jul 3, 2008 at 7:04 PM, Dan Brickley <danbri@danbri.org> wrote:
>> Breton Slivka wrote:
>>
>>> I offer the challenge to those developers: If you sincerely believe
>>> that simple internationalized date parsing is an unsolvable or
>>> difficult problem (which, as I have pointed out has been solved
>>> numerous times already, with two examples), please present your
>>> evidence. Why is avoiding this work more important than Accessibility?
>>> Why is avoiding this work more important than avoiding hidden
>>> metadata?

> This is a simple example. There are likely better techniques for doing
> this than regexes, (or not) but the point is, that you can make a
> human READABLE format without having to cover the whole spectrum of
> human expression. Instead, you have ONE precise format for US dates,
> ONE precise format for UK dates, ONE precise format for japanese
> dates, etc, etc.  You stick this format of date in the title of an
> ABBR, and you can say whatever you want about the date in whatever
> language you like in the contents of the ABBR. The parser shouldn't
> care about the contents. IT's just looking at the title. IT already
> is. The only change from the current pattern is that we'd be using a
> less geeky and obscure format than ISO-8601. The lang attribute of the
> ABBR element provides the format in use.

http://en.wikipedia.org/wiki/List_of_languages_by_name
http://en.wikipedia.org/wiki/List_of_ISO_639-2_codes
http://en.wikipedia.org/wiki/ISO_3166-1_alpha-2

Feel free to get started.  I'm sure you can start a wiki page with a
listing of language/region codes and the suggested date format for
each.  Since the current system handles every one of those languages
and countries/regions, it would only be logical to expect the same of
a suggested replacement.
From Scott at randomchaos.com  Thu Jul  3 09:03:08 2008
From: Scott at randomchaos.com (Scott Reynen)
Date: Thu Jul  3 09:03:14 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486BCAD1.20690.19CB840@bjonkman.sobac.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>,
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>,
	<B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
	<486BCAD1.20690.19CB840@bjonkman.sobac.com>
Message-ID: <2708D12A-BE7C-4E75-9028-AA23E2A0B19D@randomchaos.com>

On Jul 2, at 4:37, Bob Jonkman wrote:

>> The difference with ISO dates is we've previously defined them as
>> content; I'm suggesting that's a mistaken definition, as these dates
>> don't function as content in our reference standard iCalendar.
>
> I disagree.  In an appointment, the date IS the content.

*A* date is, but not the ISO date.  I think that's a subtle but  
important distinction we've overlooked too often.  You never see ISO  
dates presented to (nor entered by) people in applications that work  
with iCalendar.  They're only used to *produce* content.  I think HTML  
entities are probably the closest analogy.  The entities themselves  
are not the content; they're merely used to produce the content in  
various contexts (i.e. character sets).  We don't display entities; we  
only display the content they're used (by machines) to produce.  If we  
recognize that ISO dates are the same type of information ("metadata"  
or whatever you want to call it), then not displaying them isn't a  
compromise; it's just the obvious way to treat that type of  
information, the same way it's treated everywhere else.

Peace,
Scott

From guillaume at lebleu.org  Thu Jul  3 09:54:51 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Thu Jul  3 09:54:55 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486BCAD1.20690.19CB840@bjonkman.sobac.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>,
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>,
	<B42A4FFE-FEF8-404E-BA86-CC84CDAFB48A@randomchaos.com>
	<486BCAD1.20690.19CB840@bjonkman.sobac.com>
Message-ID: <486D045B.60009@lebleu.org>

Bob Jonkman wrote:
>    <span class="dtstart">
>     <span class="value" title="2008-07-03">
>      <abbr title="July 3rd, 2008">
>        tomorrow
>      </abbr>
>     </span>
Bob, assuming that screen readers only read out the content of abbr's 
@title, this solution looks promising (I've tried with VoiceOver, but 
the title content is ignored.)

The only problem of course is for human content authors who are 
effectively asked to write the same information 3 times in 3 different 
formats (not very DRY)! I just don't see myself doing that manually. For 
this to work, I'd expect at least an extra button in my HTML editor to 
tag "tomorrow" as 2008-07-03 by selecting a date in a calendar widget, 
or better, for my HTML editor to detect some of these date shortcuts 
automatically for me, and suggest machine data for them, which I can 
confirm before publishing, something similar to [1]. It seems to me that 
it would be a practical way to distribute the CPU-intensive task of 
semantically tagging Web content [3].

BTW, on the use of abbr for dates, I've researched a number of style 
guides such as [2]. It seems that "2/03/2005" is legitimate as an 
abbreviated form of the inline format "February 3, 2005".

So, <abbr title="February 3, 2005">2/03/2005</abbr> seems correct, but 
<abbr title="2005-02-03">February 3, 2005</abbr> isn't (at least 
according to the style guide below).

Guillaume

[1] http://wordpress.org/extend/plugins/yahoo-shortcuts/
[2] http://web.mit.edu/comdor/editguide/style-matters/date_time.html#dates
[3] http://gigaom.com/2008/07/02/the-real-reason-powerset-sold-out/

From jim at eatyourgreens.org.uk  Thu Jul  3 15:03:35 2008
From: jim at eatyourgreens.org.uk (Jim O'Donnell)
Date: Thu Jul  3 15:03:42 2008
Subject: [uf-discuss] hoard.it
Message-ID: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>

Hello,

This might be of interest to members of this group, as it deals with  
extracting data from semantic HTML. Prior to this year's Mashed  
Museum event at the University of Leicester, Dan Zambonini put  
together a prototype which aggregates data by spidering online museum  
catalogues:
http://hoardit.pbwiki.com/
It's a pretty fantastic demo of how information can be extracted from  
well-structured HTML, even before you think of putting microformats  
etc. on top.

In particular, it does a pretty good job of figuring out when an  
object was made:
http://feeds.boxuk.com/museums/object_100yrs.php
The date parser is based on some code Dan & I knocked together at  
Mashed Museum 2007, which  looks at strings like 'late Victorian',  
'early 20th Century', '4th January 1853' and so on, and converts them  
to machine-readable ISO dates.

Our original idea, which we never got round to actually implementing,  
was that this would be useful as a web service - you give it a  
string, it gives you a machine-parsable representation of that  
string. The recent discussion here about dates has made me wonder if  
such a web service would be useful for microformats parsers. What do  
others think?
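
A minimal sketch of the kind of conversion described above, for illustration only. It is not the Mashed Museum code: the function name parse_fuzzy_date, the split of a period into thirds for 'early', 'mid' and 'late', and the single-entry era table are all assumptions made for this example. It returns a (start, end) pair of ISO dates, or None when the string is not recognised.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# Hypothetical era table: only the Victorian era (1837 to 1901) is listed.
ERAS = {"victorian": (1837, 1901)}


def _portion(start, end, part):
    """Narrow a year range to its first, middle or last third (an assumption)."""
    span = end - start
    if part == "early":
        return start, start + span // 3
    if part == "mid":
        return start + span // 3, start + 2 * span // 3
    if part == "late":
        return start + 2 * span // 3, end
    return start, end


def _year_range(start, end):
    """Express a range of whole years as a pair of ISO dates."""
    return date(start, 1, 1).isoformat(), date(end, 12, 31).isoformat()


def parse_fuzzy_date(text):
    """Return (start, end) ISO date strings, or None if unrecognised."""
    s = " ".join(text.lower().split())

    # Exact dates such as '4th January 1853'.
    m = re.fullmatch(r"(\d{1,2})(?:st|nd|rd|th)? ([a-z]+) (\d{4})", s)
    if m and m.group(2) in MONTHS:
        try:
            day = date(int(m.group(3)), MONTHS[m.group(2)], int(m.group(1)))
        except ValueError:  # e.g. '31st February 1853'
            return None
        return day.isoformat(), day.isoformat()

    # Centuries such as 'early 20th Century' (the 20th runs 1901 to 2000).
    m = re.fullmatch(r"(?:(early|mid|late) )?(\d{1,2})(?:st|nd|rd|th) century", s)
    if m:
        first = (int(m.group(2)) - 1) * 100 + 1
        return _year_range(*_portion(first, first + 99, m.group(1)))

    # Named eras such as 'late Victorian'.
    m = re.fullmatch(r"(?:(early|mid|late) )?([a-z]+)", s)
    if m and m.group(2) in ERAS:
        return _year_range(*_portion(*ERAS[m.group(2)], m.group(1)))

    return None
```

Returning a range rather than a single date keeps the vagueness of the original string visible to whatever consumes the result.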

Cheers
Jim

Jim O'Donnell
jim@eatyourgreens.org.uk
http://eatyourgreens.org.uk
http://flickr.com/photos/eatyourgreens



From bjonkman at sobac.com  Sat Jul  5 12:07:30 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Sat Jul  5 12:09:58 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <2708D12A-BE7C-4E75-9028-AA23E2A0B19D@randomchaos.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>,
	<486BCAD1.20690.19CB840@bjonkman.sobac.com>,
	<2708D12A-BE7C-4E75-9028-AA23E2A0B19D@randomchaos.com>
Message-ID: <486F8E32.14417.482D3F8@bjonkman.sobac.com>



On 3 Jul 2008 at 10:03, Scott Reynen wrote:

> On [Jul 2], at [ Jul 2] 4:37 , Bob Jonkman wrote:
>
> > In an appointment, the date IS the content.
> 
> *A* date is, but not the ISO date.  I think that's a subtle but 
> important distinction we've overlooked too often.  You never see ISO 
> dates presented to (nor entered by) people in applications that work 
> with iCalendar.  They're only used to *produce* content.  I think HTML
>  entities are probably the closest analogy.  The entities themselves 
> are not the content; they're merely used to produce the content in 
> various contexts (i.e. character sets).  We don't display entities; we
>  only display the content they're used (by machines) to produce.  If
> we  recognize that ISO dates are the same type of information
> ("metadata"  or whatever you want to call it), then not displaying
> them isn't a  compromise; it's just the obvious way to treat that type
> of  information, the same way it's treated everywhere else.

In that case it should be acceptable to avoid the use of <abbr> 
altogether, so that neither sighted nor hearing people have to put 
up with seeing or hearing the metadata. 

  <span class="dtstart" title="2008-07-06">
    tomorrow
  </span>

The title text still shows a popup in my browser (FF3), but I don't 
believe screen readers speak it.  It also doesn't distract sighted 
users since a <span> element is by default undecorated, while <abbr> 
shows with a dotted underline in FF3.  However, styling is dependent 
on the browser implementation and can always be specified with CSS 
anyway.



I believe that an ISO date is a valid expansion of prosaic dates, so 
that <span> is less semantic than using 

  <abbr class="dtstart" title="2008-07-06">
    tomorrow
  </abbr>

but that debate appears to have no resolution and I'm willing to 
cede just to move along.
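
To illustrate what either markup choice asks of a parser, here is a rough sketch using only Python's standard html.parser module. It is an assumption for this example, not how any existing microformats parser works: it reads the machine date from @title on any element carrying the dtstart class, whether that element is a <span> or an <abbr>. The names DtstartParser and extract_dtstart are made up for the illustration.

```python
from html.parser import HTMLParser


class DtstartParser(HTMLParser):
    """Collect (tag, title) pairs for elements whose class includes 'dtstart'."""

    def __init__(self):
        super().__init__()
        self.dtstart_values = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Attribute values can be None when the attribute has no value.
        classes = (attributes.get("class") or "").split()
        title = attributes.get("title")
        if "dtstart" in classes and title:
            self.dtstart_values.append((tag, title))


def extract_dtstart(html):
    """Return every dtstart title found in the given HTML fragment."""
    parser = DtstartParser()
    parser.feed(html)
    parser.close()
    return parser.dtstart_values
```

Under this assumption the parser treats both forms identically, so the choice between <span> and <abbr> only affects what browsers and screen readers do with the element.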



--Bob.
-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
SOBAC Microcomputer Services              Voice: +1-519-669-0388
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From bjonkman at sobac.com  Sat Jul  5 13:15:44 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Sat Jul  5 13:17:09 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <486D045B.60009@lebleu.org>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>,
	<486BCAD1.20690.19CB840@bjonkman.sobac.com>,
	<486D045B.60009@lebleu.org>
Message-ID: <486F9E30.15175.4C14ADD@bjonkman.sobac.com>



On 3 Jul 2008 at 9:54, Guillaume Lebleu wrote:

> Bob, assuming that screen readers only read out the content of abbr's
> @title, this solution looks promising (I've tried with VoiceOver, but
> the title content is ignored.)
> 
> The only problem of course is for human content authors who are 
> effectively asked to write the same information 3 times in 3 different
> formats (not very DRY)! 

Agreed.  So, based on Scott Reynen's observation that these 'date 
entities' don't need to be displayed (either visually or aurally) I 
propose that we dispense with the <abbr> tag altogether (and, IMHO, 
the semantic value of the date expansion).  We move on, the BBC 
publishes hCalendar again, and someone gets around to developing a 
genealogy microformat now that the date issue is settled.


> BTW, on the use of abbr for dates, I've researched a number of style
> guides such as [2]. It seems that "2/03/2005" is legitimate as an
> abbreviated form of the inline format "February 3, 2005".

> [2]
> http://web.mit.edu/comdor/editguide/style-matters/date_time.html#dates

I'm not sure that any particular style guide is authoritative. I had 
a look around some other sources, and while they mostly agree 
there's enough variation to make any date-parser author shudder in 
fear.  A most disturbing trend is the use of spelled out dates, eg. 
"the sixth of July 2008" [1].



A humorous aside:  I create computer systems validation 
documentation for a European consulting firm.  Oddly enough, they've 
decided on the American date format MM/DD/YY for all their systems 
documentation, not the ISO date standard.  My documents are 
constantly being returned to me for invalid dates -- my first 
inclination is to always write the date as YYYY-MM-DD, and 
DD/MM/YYYY as a second inclination.  Even MM/DD/YYYY gets returned 
as an invalid date.  Participation in the Microformats community 
hasn't helped my professional career :-)

--Bob.


[1] National Geographic Style Manual: DATES 

http://stylemanual.ngs.org/Intranet/styleman.nsf/024cc3c609acdb02852
56648004af446/f0d90cec94e539c78525668a006dacd0?OpenDocument

or http://natgeodatestyle.notlong.com for the word-wrap challenged.

-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
SOBAC Microcomputer Services              Voice: +1-519-669-0388
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From lists at ben-ward.co.uk  Sun Jul  6 06:22:08 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Sun Jul  6 06:22:36 2008
Subject: [uf-discuss] Wiki Documentation of recent date-time discussion
Message-ID: <61AB090A-75B8-45C4-B582-AE461389B50E@ben-ward.co.uk>

Hi all,

Recent discussion of solutions to the datetime issues has been  
massive, and it has become difficult to track the current state of  
issues and counterpoints as threads have become interleaved.

I have *attempted* to document the most recent points on the wiki,  
under the following pages:

   * http://microformats.org/wiki/datetime-design-pattern (most stuff)
   * http://microformats.org/wiki/hcalendar-issues#2008 (for the HTML5  
<time> element)
   * http://microformats.org/wiki/value-excerption-pattern-issues

I've credited the points added with the name of the post author. If  
you feel the part I've quoted doesn't fully represent your view,  
please edit it. I've done the best I can, but these threads have  
become very difficult to follow so I sincerely apologise if I've  
missed something, or lost track of something.

If a point is missing from the wiki threads, please add it so we  
have a reference.

Ideally, wiki discussion should not be a carbon copy of the mailing  
list logs. It should be a concise representation of issues, and each  
issue and its counterpoints should of course only be listed once. It's  
a reference. Someone new to discussions should be able to read the  
wiki and be up to speed on what is currently being discussed on other  
mediums in the community.

My edits today don't achieve that in the optimal way; they're largely  
quotes. It's better than no documentation at all, though.

Something which didn't really happen this week was follow up in  
documenting the key points of discussion on the wiki. When you are an  
advocate of a particular solution, please take the responsibility to  
make sure issues raised against it and the resolutions are accurately  
documented. Otherwise suggestions will be lost without documentation  
and inevitably repeated without reference.

Thank you,

Ben
From codepo8 at gmail.com  Mon Jul  7 03:40:46 2008
From: codepo8 at gmail.com (Christian Heilmann)
Date: Mon Jul  7 03:40:58 2008
Subject: [uf-discuss] Microformats search engine: virel
Message-ID: <4871F2AE.5040401@gmail.com>

I just got several automated emails from http://www.virel.org/index.php 
that they found uF of mine on sites and indexed them.

Does anybody know the people behind it? I am not sure if that is cool or 
creepy :)

Chris

From ameer1234567890 at gmail.com  Mon Jul  7 07:48:23 2008
From: ameer1234567890 at gmail.com (Ameer Dawood)
Date: Mon Jul  7 07:48:26 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <4871F2AE.5040401@gmail.com>
References: <4871F2AE.5040401@gmail.com>
Message-ID: <19abcbf20807070748p262ade3bhcae2ebf57a4131ac@mail.gmail.com>

Hi,

The site looks authentic and useful. I don't know about the emails. I
have added my blog to the site. Let's see what happens next.

Ameer

On Mon, Jul 7, 2008 at 4:40 PM, Christian Heilmann <codepo8@gmail.com> wrote:
> I just got several automated emails from http://www.virel.org/index.php that
> they found uF of mine on sites and indexed them.
>
> Does anybody know the people behind it? I am not sure if that is cool or
> creepy :)
>
> Chris
>
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss
>



-- 
Paul Lynde  - "I sang in the choir for years, even though my family
belonged to another church."
From ameer1234567890 at gmail.com  Mon Jul  7 07:50:52 2008
From: ameer1234567890 at gmail.com (Ameer Dawood)
Date: Mon Jul  7 07:50:56 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <19abcbf20807070748p262ade3bhcae2ebf57a4131ac@mail.gmail.com>
References: <4871F2AE.5040401@gmail.com>
	<19abcbf20807070748p262ade3bhcae2ebf57a4131ac@mail.gmail.com>
Message-ID: <19abcbf20807070750g20e3953cs36b1ca36f68b89fa@mail.gmail.com>

Hi,

Look what happened now. I just got the same kind of email. Looks like
they are sending emails to email addresses found in hCards. It's just
like spam. I just dropped them a mail saying so.

Ameer


On Mon, Jul 7, 2008 at 8:48 PM, Ameer Dawood <ameer1234567890@gmail.com> wrote:
> Hi,
>
> The site looks authentic and useful. I don't know about the emails. I
> have added my blog to the site. Let's see what happens next.
>
> Ameer
>
> On Mon, Jul 7, 2008 at 4:40 PM, Christian Heilmann <codepo8@gmail.com> wrote:
>> I just got several automated emails from http://www.virel.org/index.php that
>> they found uF of mine on sites and indexed them.
>>
>> Does anybody know the people behind it? I am not sure if that is cool or
>> creepy :)
>>
>> Chris
>>
>
>
>
> --
> Paul Lynde  - "I sang in the choir for years, even though my family
> belonged to another church."
>



-- 
George Burns  - "Don't stay in bed, unless you can make money in bed."
From jmyers at visi.com  Mon Jul  7 09:11:11 2008
From: jmyers at visi.com (Jay Myers)
Date: Mon Jul  7 09:11:19 2008
Subject: [uf-discuss] Reviving hProduct
Message-ID: <web-10093556@mailback2.g2host.com>

All,

It seems like the hProduct microformat hasn't seen a lot 
of revisions since its initial brainstorming in 2006 
(feel free to correct me on this if there are current 
efforts taking place :-) ). I'm attempting to revise the 
schema for use in an upcoming August / September project 
release. I have taken the current brainstorming schema and 
added on some new items. I would like to open this up to 
discussion and move this format forward, and any 
assistance the community would be able to provide would be 
helpful.

Altered schema: 
http://jay.beweep.com/hproduct/hproduct-schema.txt
Unstyled HTML example: 
http://jay.beweep.com/hproduct/hproduct-example.html

The altered schema:

hProduct
     * version. optional. text.

     * name. required.

     * image. optional. IMG element or rel='image'. could 
be further refined as image type ( thumb || full, photo || 
illo).

     * description. optional. could be denoted as 
'summary' or 'extended'.

     * brand. text | hCard

     * uri. optional. URI to product page, href could 
contain rel='product'.

     * price. optional. could be further refined as 
specific type (sale || regular || msrp || clearance || 
savings). should follow currency format.

     * p-v. optional. opens up possibilities for custom 
property-value pairs in more complex examples.

             o property. required. property types could 
include:

             - artist

             - author

             - released - hCal event for date of release

             - upc

             - isbn

             - sku

             - sn

             - vin

             - batch

             - size

             - color

             - uid - unique id, item number as provided by 
manufacturer or retailer

             - offer

             - others. possibly around product specs, 
features.

             o value. required. (label may be implied)

     * availability. optional.

     * shipping. optional. shipping messaging.

     * reviews. text | hReview

     * buy. optional. purchase URL.
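
For discussion, the schema above can be sketched as a data structure. This is only an illustration of the required and optional fields as listed. The class names HProduct and PropertyValue, the field name property_name, and the use of plain strings for every value are assumptions, not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PropertyValue:
    """One custom p-v pair; both parts are required by the schema."""
    property_name: str  # e.g. "sku", "isbn", "color"
    value: str


@dataclass
class HProduct:
    """Sketch of the proposed schema: only 'name' is required."""
    name: str
    version: Optional[str] = None
    image: Optional[str] = None        # URL of the product image
    description: Optional[str] = None
    brand: Optional[str] = None        # text, or a reference to an hCard
    uri: Optional[str] = None          # product page
    price: Optional[str] = None
    p_v: List[PropertyValue] = field(default_factory=list)
    availability: Optional[str] = None
    shipping: Optional[str] = None
    reviews: List[str] = field(default_factory=list)  # text or hReview
    buy: Optional[str] = None          # purchase URL
```

For example, HProduct(name="Widget", p_v=[PropertyValue("sku", "123")]) describes a product with one custom property and every other field left empty.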

Thanks,

Jay Myers

(e)jmyers@visi.com

From lists at ben-ward.co.uk  Mon Jul  7 09:37:48 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Mon Jul  7 09:37:56 2008
Subject: [uf-discuss] Re: Reviving hProduct
In-Reply-To: <web-10093556@mailback2.g2host.com>
References: <web-10093556@mailback2.g2host.com>
Message-ID: <F719E4C3-060B-4A90-9921-DA89108BF0BC@ben-ward.co.uk>

Hi Jay,

On 7 Jul 2008, at 17:11, Jay Myers wrote:

> All,
>
> It seems like the hProduct microformat hasn't seen a lot of  
> revisions since its initial brainstorming in 2006 (feel free to  
> correct me on this if there are current efforts taking place :-) ).  
> I'm attempting to revise the schema for use in an upcoming August /  
> September project release. I have taken the current brainstorming  
> schema and added on some new items. I would like to open this up to  
> discussion and move this format forward, and any assistance the  
> community would be able to provide would be helpful.

It's great that you're keen to take a lead on further brainstorming.  
Please conduct work on new microformats on the [uf-new] mailing list,  
rather than discuss. Thanks!

> Altered schema: http://jay.beweep.com/hproduct/hproduct-schema.txt
> Unstyled HTML example: http://jay.beweep.com/hproduct/hproduct-example.html

The old product brainstorm is discarded, so please edit the wiki.  
Either start a fresh brainstorm section on the current page, or work  
with the existing text.

> The altered schema:

For reference, much of the schema you describe there has been rolled  
into hListing, which whilst also technically a proposal, is more  
mature and has been implemented successfully by a number of people.  
That covers the price/merchant side of things.

The documentation for listing, and iterating on it to reach draft,  
also needs doing; I apologise for not following through my intent to  
take a lead on that. Lots of other µf things have come up that always  
seem more urgent.

I'd suggest that product-specific fields focus on the product _item_,  
e.g. where you have .hListing > .item, or .hReview > .item, you could  
insert an "hProduct" there, enhancing the semantics, and achieving  
listing products with prices through the formats being used in  
combination.

Regards,

Ben

(This post has been cross-posted to µf-new. Please reply *only* to
µf-new)


From angus at pobox.com  Mon Jul  7 13:56:49 2008
From: angus at pobox.com (Angus McIntyre)
Date: Mon Jul  7 09:59:49 2008
Subject: [uf-discuss] Microformats search engine: virel
Message-ID: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>



Ameer Dawood wrote:
> look what happened now. I just got the same kind of email. Looks like
> they are sending emails to email addresses found in hCards. It's just
> like spam. I just dropped them a mail saying so.

As a hardliner on this issue, my feeling is that any sentence that reads
"that's {like/almost like/a kind of/close to/etc} spam" can be reduced to
"that's spam" without loss of meaning or accuracy.

The issue of spam and microformats is a dead horse that's already taken a
fair amount of punishment, and I think the words "out of scope" were used
last time the question came up. Still, I wanted to add a couple of
comments.

As far as consumers of microformats are concerned, I think that any system
that generates automated mail to an address included in an hCard has
crossed the line. Outside various rather improbable scenarios, there's no
justification for doing this.

As far as users of microformats are concerned, the choice is (a) include
your address and expect to get spam, (b) leave your address out, or (c)
obscure your address. I currently favor options (b) and (c). For (c), I
actually recommend having a human-intelligible version (e.g. 'myaddress at
example dot com') and then - if you like - having a run-on-document-ready
Javascript function to convert it to a mailto: link for human consumption.

Crawlers - both benign and malign - typically don't execute JS, so they
won't see the actual email address. I don't think that's a bad thing for
reasons indicated above. Tools that actually run in a browser context,
such as Operator, should get the right result (Operator does).
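
To make option (c) concrete, here is a rough sketch of the conversion step. It is written in Python purely for illustration; in a page it would be the JavaScript function run on document ready. The function names deobfuscate and mailto_link are made up for this example, and it only handles the simple 'name at domain dot tld' pattern.

```python
import html
import re


def deobfuscate(text):
    """Turn 'myaddress at example dot com' into 'myaddress@example.com'."""
    address = re.sub(r"\s+dot\s+", ".", text.strip(), flags=re.IGNORECASE)
    # Only the first ' at ' becomes '@'; an address has exactly one.
    address = re.sub(r"\s+at\s+", "@", address, count=1, flags=re.IGNORECASE)
    return address


def mailto_link(text):
    """Build the mailto: anchor a page script would swap in for the text."""
    address = html.escape(deobfuscate(text), quote=True)
    return '<a href="mailto:{0}">{0}</a>'.format(address)
```

The conversion is two substitutions, so the protection rests on crawlers not bothering to apply it rather than on the form being hard to reverse.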

Angus



-- 


From paul.kinlan at gmail.com  Mon Jul  7 10:23:03 2008
From: paul.kinlan at gmail.com (Paul Kinlan)
Date: Mon Jul  7 10:23:05 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <1f8270600807071022h3af4bc3ch3317560d1f4b33c9@mail.gmail.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<1f8270600807071022h3af4bc3ch3317560d1f4b33c9@mail.gmail.com>
Message-ID: <1f8270600807071023ib717620hf938f492587e4a87@mail.gmail.com>

Hi,

Personally, I don't get the use of microformats if the data is
obscured.  I mean, how far would you go to retain your privacy with
hCard?  There has to be a point where it becomes useless.  So, I
suppose that puts me in favour of case A.

Paul


> 2008/7/7 Angus McIntyre <angus@pobox.com>:
>>
>>
>> Ameer Dawood wrote:
>> > look what happened now. I just got the same kind of email. Looks like
>> > they are sending emails to email addresses found in hCards. It's just
>> > like spam. I just dropped them a mail saying so.
>>
>> As a hardliner on this issue, my feeling is that any sentence that reads
>> "that's {like/almost like/a kind of/close to/etc} spam" can be reduced to
>> "that's spam" without loss of meaning or accuracy.
>>
>> The issue of spam and microformats is a dead horse that's already taken a
>> fair amount of punishment, and I think the words "out of scope" were used
>> last time the question came up. Still, I wanted to add a couple of
>> comments.
>>
>> As far as consumers of microformats are concerned, I think that any system
>> that generates automated mail to an address included in an hCard has
>> crossed the line. Outside various rather improbable scenarios, there's no
>> justification for doing this.
>>
>> As far as users of microformats are concerned, the choice is (a) include
>> your address and expect to get spam, (b) leave your address out, or (c)
>> obscure your address. I currently favor options (b) and (c). For (c), I
>> actually recommend having a human-intelligible version (e.g. 'myaddress at
>> example dot com') and then - if you like - having a run-on-document-ready
>> Javascript function to convert it to a mailto: link for human consumption.
>>
>> Crawlers - both benign and malign - typically don't execute JS, so they
>> won't see the actual email address. I don't think that's a bad thing for
>> reasons indicated above. Tools that actually run in a browser context,
>> such as Operator, should get the right result (Operator does).
>>
>> Angus
>>
>>
>>
>> --
>>
>>
>
From hayes at appozite.com  Mon Jul  7 10:27:48 2008
From: hayes at appozite.com (Hayes Davis)
Date: Mon Jul  7 10:27:51 2008
Subject: [uf-discuss] Reviving hProduct
In-Reply-To: <web-10093556@mailback2.g2host.com>
References: <web-10093556@mailback2.g2host.com>
Message-ID: <cd04ae140807071027j5c76bd14xeee154da2d0836d9@mail.gmail.com>

I'm glad to see someone else showing an interest in this. I'd love to
see a product microformat revived as well for an application I'm
working on.

We might want to move this discussion to microformats-dev but I'm not
sure... Anyway, some comments on your proposed changes:
1) The rel="product". Does that really describe the relationship
between this instance/description of the product and the product page?
I'm not sure it does. I get what you're saying though, I think, which
is that this is the "canonical product page".
2) I think the change from an msrp attribute described on the wiki to
a "price" attribute makes sense. I'm not sure about including
"savings" in there. It's not technically the price of the item. It's
more of an "adjustment".
3) What would you see as a value for the availability? I see in your
example this is a text description of how the user can acquire it
(e.g. "in store pickup"). When I initially read that attribute, it
made me think more of an inventory quantity. Might there need to be an
attribute for that as well?
4) As for the URL, I would think that might be better represented as a
rel "purchase" value indicating that the target URI is for purchasing
this item.

Hayes
From codepo8 at gmail.com  Mon Jul  7 10:31:13 2008
From: codepo8 at gmail.com (Christian Heilmann)
Date: Mon Jul  7 10:31:32 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
Message-ID: <487252E1.1060600@gmail.com>


> As far as users of microformats are concerned, the choice is (a) include
> your address and expect to get spam, (b) leave your address out, or (c)
> obscure your address. I currently favor options (b) and (c). For (c), I
> actually recommend having a human-intelligible version (e.g. 'myaddress at
> example dot com') and then - if you like - having a run-on-document-ready
> Javascript function to convert it to a mailto: link for human consumption.
>
> Crawlers - both benign and malign - typically don't execute JS, so they
> won't see the actual email address. I don't think that's a bad thing for
> reasons indicated above. Tools that actually run in a browser context,
> such as Operator, should get the right result (Operator does).
>
> Angus
>
>
>   
That's got nothing to do with microformats but when you really think 
that any obfuscation like bla dot domain is not indexed by spammers then 
you are in for a treat. There is no way to protect emails online without 
hurting usability or accessibility. Don't waste your time with 
JavaScript (de)obfuscation, it is a glass shield or - even closer - a 
pacifier button.

What you put in microformats you should be happy with to be put out 
there to be found, indexed and converted. Obfuscated microformats that 
expect the reader technology to convert it before turning it for example 
into a vcard are just a nuisance for the end user. This is about 
unearthing information we already publish and make easier to access and 
re-use it, which is the opposite of obfuscating.

From angus at pobox.com  Mon Jul  7 15:34:13 2008
From: angus at pobox.com (Angus McIntyre)
Date: Mon Jul  7 11:37:14 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <487252E1.1060600@gmail.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<487252E1.1060600@gmail.com>
Message-ID: <4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>


Christian Heilmann wrote:
> That's got nothing to do with microformats ...

With due respect, I don't completely accept that. A case could be made
that factors that influence people's adoption of microformats are
legitimate topics for discussion. Uneasiness about the 'spammability' of
addresses published in hCard is a deterrent to full adoption of that
microformat for many users. While these considerations don't belong in the
spec, they can usefully be mentioned in texts about the spec, such as
'getting started' guides.

> ... when you really think
> that any obfuscation like bla dot domain is not indexed by spammers then
> you are in for a treat. There is no way to protect emails online without
> hurting usability or accessibility. Don't waste your time with
> JavaScript (de)obfuscation, it is a glass shield or - even closer - a
> pacifier button.

Again, I'm not in complete agreement with you. My experience - and I have
actually tested this, although not as rigorously or extensively as I'd
like - is that very few spammers seem to be doing much de-obfuscation, and
even trivial obfuscations _currently_ offer a good degree of protection.
However, I don't expect that state of affairs to last, so it's a moot
point.

> What you put in microformats you should be happy with to be put out
> there to be found, indexed and converted. Obfuscated microformats that
> expect the reader technology to convert it before turning it for example
> into a vcard are just a nuisance for the end user.

In the Javascript-based approach that I mentioned, the browser takes care
of everything, with no extra work needed by the reader. However, I concede
that that might not extend to screen readers (although choosing a sane,
human-readable representation for the basic form can help here).

> ... This is about unearthing information we already publish and
> make easier to access and re-use it, which is the opposite of
> obfuscating.

OK, so there's an implicit challenge here. For users who are unwilling to
expose their email address through hCard, what alternative mechanisms can
microformats support? Many website owners use mail forms instead of
publishing their email addresses. Is there a need for something like a
simple 'rel=contactform' microformat to signal the availability and
location of a mail contact form?

Angus

From brian.suda at gmail.com  Mon Jul  7 11:52:06 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Mon Jul  7 11:52:10 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<487252E1.1060600@gmail.com>
	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
Message-ID: <21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>

2008/7/7 Angus McIntyre <angus@pobox.com>:
>
> Christian Heilmann wrote:
>> That's got nothing to do with microformats ...
>
> With due respect, I don't completely accept that. A case could be made
> that factors that influence people's adoption of microformats are
> legitimate topics for discussion. Uneasiness about the 'spammability' of
> addresses published in hCard is a deterrent to full adoption of that
> microformat for many users.

--- the argument is orthogonal to microformats because this is not
unique to microformats. Any time you add more semantic information to
your data it potentially increases the 'spammability' of it. This goes
for RDFa, eRDF, RDF, POSH, microformats, RSS and anything else that
might come along in the future.

>> ... This is about unearthing information we already publish and
>> make easier to access and re-use it, which is the opposite of
>> obfuscating.
>
> OK, so there's an implicit challenge here. For users who are unwilling to
> expose their email address through hCard, what alternative mechanisms can
> microformats support? Many website owners use mail forms instead of
> publishing their email addresses. Is there a need for something like a
> simple 'rel=contactform' microformat to signal the availability and
> location of a mail contact form?

You could simply use class="url" with a new rel-value. You can also
mark-up your Chat profiles with their specific protocols, aim: msn:
jabber: etc. Other people only vend the data after someone has
authenticated themselves, so the microformats are NOT available to the
general public, but instead to a white-list of contacts.

-brian

-- 
brian suda
http://suda.co.uk
From codepo8 at gmail.com  Mon Jul  7 11:56:28 2008
From: codepo8 at gmail.com (Christian Heilmann)
Date: Mon Jul  7 11:56:41 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>	<487252E1.1060600@gmail.com>
	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
Message-ID: <487266DC.2090508@gmail.com>

>> ... This is about unearthing information we already publish and
>> make easier to access and re-use it, which is the opposite of
>> obfuscating.
>>     
>
> OK, so there's an implicit challenge here. For users who are unwilling to
> expose their email address through hCard, what alternative mechanisms can
> microformats support? Many website owners use mail forms instead of
> publishing their email addresses. Is there a need for something like a
> simple 'rel=contactform' microformat to signal the availability and
> location of a mail contact form?
>
> Angus
>   
Again: we are marking up content that is already published. If that 
person is taking steps to prevent the email or contact form from being 
available there is nothing microformats (or well, hcard for email) can 
do for that person.

I love people that use email forms to make sure spammers can't get their 
mails, especially those that don't protect their forms against XSS and 
SQL injection and thus become a spam hub themselves :)

Microformats are nothing that needs to be "sold". A person that is 
unhappy to disclose information on the web will certainly not get them 
anyways.



From csarven at gmail.com  Mon Jul  7 12:01:20 2008
From: csarven at gmail.com (Sarven Capadisli)
Date: Mon Jul  7 12:01:25 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<487252E1.1060600@gmail.com>
	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
Message-ID: <d4154bcf0807071201t2ce7dcecu164f903d65f58853@mail.gmail.com>

On Mon, Jul 7, 2008 at 6:34 PM, Angus McIntyre <angus@pobox.com> wrote:
>
> Christian Heilmann wrote:
>> That's got nothing to do with microformats ...
>
> With due respect, I don't completely accept that. A case could be made
> that factors that influence people's adoption of microformats are
> legitimate topics for discussion. Uneasiness about the 'spammability' of
> addresses published in hCard is a deterrent to full adoption of that
> microformat for many users. While these considerations don't belong in the
> spec, they can usefully be mentioned in texts about the spec, such as
> 'getting started' guides.
>
>> ... when you really think
>> that any obfuscation like bla dot domain is not indexed by spammers then
>> you are in for a treat. There is no way to protect emails online without
>> hurting usability or accessibility. Don't waste your time with
>> JavaScript (de)obfuscation, it is a glass shield or - even closer - a
>> pacifier button.
>
> Again, I'm not in complete agreement with you. My experience - and I have
> actually tested this, although not as rigorously or extensively as I'd
> like - is that very few spammers seem to be doing much de-obfuscation, and
> even trivial obfuscations _currently_ offer a good degree of protection.
> However, I don't expect that state of affairs to last, so it's a moot
> point.
>
>> What you put in microformats you should be happy with to be put out
>> there to be found, indexed and converted. Obfuscated microformats that
>> expect the reader technology to convert it before turning it for example
>> into a vcard are just a nuisance for the end user.
>
> In the Javascript-based approach that I mentioned, the browser takes care
> of everything, with no extra work needed by the reader. However, I concede
> that that might not extend to screen readers (although choosing a sane,
> human-readable representation for the basic form can help here).
>

Actually, Christian is bang on with "There is no way to protect emails
online without hurting usability or accessibility."

I've documented a fair number of ways to obfuscate (depends on how you
interpret it) email addresses in source and they all have pros and
cons, and all are dependent on various factors [1].

>> ... This is about unearthing information we already publish and
>> make easier to access and re-use it, which is the opposite of
>> obfuscating.
>
> OK, so there's an implicit challenge here. For users who are unwilling to
> expose their email address through hCard, what alternative mechanisms can
> microformats support? Many website owners use mail forms instead of
> publishing their email addresses. Is there a need for something like a
> simple 'rel=contactform' microformat to signal the availability and
> location of a mail contact form?

I would also stress that this is not a problem that microformats
should be solving. The principle in solving something like this is
also not solely about "emails" but any data for that matter (e.g., do
you want the "bad" guys to know your fn and street-address?)

[1] http://www.csarven.ca/hiding-email-addresses
From andr3.pt at gmail.com  Mon Jul  7 12:06:42 2008
From: andr3.pt at gmail.com (=?ISO-8859-1?Q?Andr=E9_Lu=EDs?=)
Date: Mon Jul  7 12:07:12 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<487252E1.1060600@gmail.com>
	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
	<21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>
Message-ID: <dc1a17860807071206jee7f5d6q19b56b5843900789@mail.gmail.com>

On Mon, Jul 7, 2008 at 7:52 PM, Brian Suda <brian.suda@gmail.com> wrote:
Other people only vend the data after someone has
> authenticated themselves, so the microformats are NOT available to the
> general public, but instead to a white-list of contacts.
>

If you host some service that allows connections between users, you
can use these connections and only reveal sensitive data to connected
users... (if they require authorization by the invitee and you inform
them of this)

--
André Luís

From danbri at danbri.org  Mon Jul  7 12:13:15 2008
From: danbri at danbri.org (Dan Brickley)
Date: Mon Jul  7 12:13:23 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>	<487252E1.1060600@gmail.com>	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
	<21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>
Message-ID: <48726ACB.3080302@danbri.org>

Brian Suda wrote:
> 2008/7/7 Angus McIntyre <angus@pobox.com>:
>> Christian Heilmann wrote:
>>> That's got nothing to do with microformats ...
>> With due respect, I don't completely accept that. A case could be made
>> that factors that influence people's adoption of microformats are
>> legitimate topics for discussion. Uneasiness about the 'spammability' of
>> addresses published in hCard is a deterrent to full adoption of that
>> microformat for many users.
> 
> --- the argument is orthogonal to microformats because this is not
> unique to microformats. Any time you add more semantic information to
> your data it potentially increases the 'spammability' of it. This goes
> for RDFa, eRDF, RDF, POSH, microformats, RSS and anything else might
> come along in the future.

Yup. And we need to get much better (across various of these projects) 
in making clear to users what's going on, including the bad things that 
might happen. If user understanding and consent is handled better, 
downstream sites will know what they can or can't do with the data.

Some examples:

1. tribe.net FOAF was repackaged on ex.plode.us; users freaked out:
	What is ex.plode.us and have we been sold out?
	topic posted Thu, February 28, 2008 - 6:02 PM
http://brainstorm.tribe.net/thread/34fb1a79-351d-4251-8318-829623c1c9cb

Result: tribe.net switched off their FOAF feeds. This could just as 
easily have been microformats.

2. Google Social Graph API (XFN and FOAF)
The Google SGAPI makes it much easier to find out who the owner is of a 
YouTube account. This is currently relevant due to the Viacom/Google 
court case, in which Google have been asked to turn over all YouTube 
viewing logs, including both IP address and usernames. The judge took 
the view that the latter are essentially anonymous, despite the fact 
that the SGAPI makes it rather easy to associate YouTube URIs with FOAF 
and microformat data from elsewhere in the Web.
Details here: http://danbri.org/words/2008/07/03/359

3. identi.ca, twitter-like microblog (opensource as laconi.ca)
This microblogging platform encourages users to attach a Creative 
Commons license to their postings, which should give downstream 
aggregators a clearer sense of what can and can't be done with the data. 
We lack similar practice for FOAF and microformat content.


Where I'd like to see this go, is via some survey of users, figuring out 
how rich an understanding of the situation we can expect of them (not 
much I fear) and some attempt to make a CC-like simplification through 
which they can express their preferences about how their profile data is 
aggregated and re-used. Considering the Tribe case, it would be nice if 
users could've said "no commercial reuse (including banner ads)" unless 
x% of profits go to <http://charityofmychoice.example.com/>. But we're a 
long way from that now. If the only concrete effect on users is spam and 
confusion, we'll find ourselves back with data hidden in GIFs, I fear...

cheers,

Dan


--
http://danbri.org/
From angus at pobox.com  Mon Jul  7 16:43:28 2008
From: angus at pobox.com (Angus McIntyre)
Date: Mon Jul  7 12:46:57 2008
Subject: [uf-discuss] Microformats search engine: virel
In-Reply-To: <48726ACB.3080302@danbri.org>
References: <3967.66.17.182.210.1215464209.squirrel@webmail.nomadcode.com>
	<487252E1.1060600@gmail.com>
	<4883.66.17.182.210.1215470053.squirrel@webmail.nomadcode.com>
	<21e770780807071152q713b4878k616fc09bdaa02e78@mail.gmail.com>
	<48726ACB.3080302@danbri.org>
Message-ID: <1316.66.17.182.210.1215474208.squirrel@webmail.nomadcode.com>


Dan Brickley wrote:
> ... we need to get much better (across various of these projects)
> in making clear to users what's going on, including the bad things that
> might happen.

Agreed.

> Some examples:

These are good examples, not least because they relate to 'good actors'
(rather than 'bad actors', such as spammers, who can be expected to behave
badly). Even well-intentioned (re-)use has implications and consequences.

> 3. identi.ca, twitter-like microblog (opensource as laconi.ca)
> This microblogging platform encourages users to attach a Creative
> Commons license to their postings, which should give downstream
> aggregators a clearer sense of what can and can't be done with the data.
> We lack similar practice for FOAF and microformat content.

I think this is an interesting point. It might be worth reflecting on some
other mechanisms that are used for expressing directives as to how content
can be used.

Of the mechanisms that I've come across, the most obvious are the
CC-licenses that Dan mentioned. Next up is the robots exclusion protocol
[1], and the extensions to it now supported by Google and Yahoo! [2]. The
X-Robots-Tag with its 'noarchive' and 'nosnippet' directives provides
fairly granular control over what may be done with content. Finally,
there's the 'media:restriction' element used in mediaRSS [3]. In the
standard, that's limited to specifying a country and "deny" to indicate
that a given piece of media isn't for distribution to that country.
However, some video hosting services overload it to specify restrictions
on how their content may or may not be aggregated (and by whom).

Possible directives governing use might include:

  individual only - for use by tools like Operator, but not to be crawled
  do not republish - allows automated processing, but not republishing
  non-commercial - only non-commercial republishing allowed
  no-spam - commercial republishing OK, but don't make unsolicited contact
  unrestricted - any legal use permissible

If this actually represents a continuum, then you can make it a principle
that data can only be republished under the same or more restrictive
terms: if A publishes data with 'non-commercial' republishing allowed,
then B may only republish it as 'non-commercial', 'do-not-republish' or
'individual only'.
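
To make the "same or more restrictive" principle concrete, here is a minimal sketch. It assumes the five directives above really do form a single ordered scale; the hyphenated names and the function name are hypothetical spellings of mine, not part of any existing spec.

```python
# Hypothetical ordering of the directives listed above, most restrictive first.
# The names are illustrative spellings, not an agreed vocabulary.
RESTRICTIVENESS = [
    "individual-only",
    "do-not-republish",
    "non-commercial",
    "no-spam",
    "unrestricted",
]


def may_republish_as(source_terms: str, target_terms: str) -> bool:
    """Return True if data published under source_terms may be republished
    under target_terms, i.e. target_terms is the same or more restrictive."""
    return RESTRICTIVENESS.index(target_terms) <= RESTRICTIVENESS.index(source_terms)
```

With this ordering, data published as 'non-commercial' could be passed on as 'non-commercial' or 'do-not-republish', but not as 'no-spam' or 'unrestricted'.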

Angus


[1] http://www.robotstxt.org/

[2]
http://googleblog.blogspot.com/2007/07/robots-exclusion-protocol-now-with-even.html

[3] https://www.google.com/webmasters/tools/video/en/video.html


From bjonkman at sobac.com  Mon Jul  7 22:02:39 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Mon Jul  7 22:04:04 2008
Subject: [uf-discuss] hoard.it
In-Reply-To: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
References: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
Message-ID: <4872BCAF.23695.22CA6E96@bjonkman.sobac.com>

Sounds great!  How does it deal with dates commonly found in 
genealogy, such as "ABT 7 July 1950" or "AFT 25 Dec 2000" or "BEF 
Jan 1925"? or even "ABT 2000"?

--Bob.

On 3 Jul 2008 at 23:03, Jim O'Donnell wrote:

> Hello,
> 
> This might be of interest to members of this group, as it deals with 
> extracting data from semantic HTML. Prior to this year's Mashed 
> Museum event at the University of Leicester, Dan Zambonini put 
> together a prototype which aggregates data by spidering online museum 
> catalogues: http://hoardit.pbwiki.com/ It's a pretty fantastic demo of
> how information can be extracted from  well-structured HTML, even
> before you think of putting microformats  etc. on top.
> 
> In particular, it does a pretty good job of figuring out when an 
> object was made: http://feeds.boxuk.com/museums/object_100yrs.php The
> date parser is based on some code Dan & I knocked together at  Mashed
> Museum 2007, which  looks at strings like 'late Victorian',  'early
> 20th Century', '4th January 1853' and so on, and converts them  to
> machine-readable ISO dates.
> 
> Our original idea, which we never got round to actually implementing, 
> was that this would be useful as a web service - you give it a 
> string, it gives you a machine-parsable representation of that 
> string. The recent discussion here about dates has made me wonder if 
> such a web service would be useful for microformats parsers. What do 
> others think?
> 
> Cheers
> Jim
> 
> Jim O'Donnell
> jim@eatyourgreens.org.uk
> http://eatyourgreens.org.uk
> http://flickr.com/photos/eatyourgreens
> 
> 
> 
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss


-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
SOBAC Microcomputer Services              Voice: +1-519-669-0388
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From guillaume at lebleu.org  Mon Jul  7 22:45:08 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Mon Jul  7 22:45:20 2008
Subject: [uf-discuss] hoard.it
In-Reply-To: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
References: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
Message-ID: <4872FEE4.7050307@lebleu.org>

Jim O'Donnell wrote:
> The recent discussion here about dates has made me wonder if such a 
> web service would be useful for microformats parsers. What do others 
> think?
It seems to me that this type of date extraction might present risks if 
used by uf parsers to extract date/time from published content (and lead 
to the "people showing up on the wrong date" error mentioned in earlier 
posts).

On the other hand, it might be great at the time content is authored, to 
convert ambiguous natural language dates into unambiguous microformats, 
as a way to reduce the pain of micro-formatting content (especially it 
can detect dates in plain text rather than parsing something it knows is 
a date). Authors could confirm the generated microformats before 
publishing in a way similar to how Yahoo! shortcuts Wordpress plugin 
works [1]

Guillaume

[1] http://lebleu.org/blog/2008/02/09/trying-out-yahoo-shortcuts/
From ameer1234567890 at gmail.com  Tue Jul  8 00:41:03 2008
From: ameer1234567890 at gmail.com (Ameer Dawood)
Date: Tue Jul  8 00:41:06 2008
Subject: Fwd: Re: [uf-discuss] Microformats search engine: virel
In-Reply-To: <20080707150749.43001E39017@rex10.flatbooster.com>
References: <20080707150749.43001E39017@rex10.flatbooster.com>
Message-ID: <19abcbf20807080041i31379718hc86a4c07c4cc6907@mail.gmail.com>

Hi all,

I just thought that this is worth sharing.

Ameer


---------- Forwarded message ----------
From:  <contact@virtualart-online.de>
Date: Mon, Jul 7, 2008 at 9:07 PM
Subject: Re: Re: [uf-discuss] Microformats search engine: virel
To: ameer1234567890@gmail.com


Hi Ameer,

I got this e-mail since I'm on the mailing list for uF too.

I'm the author of the virel search engine. You are right about the
e-mails. Right now we automatically send an e-mail to people who have
published their vCards on the internet when a vCard has been found
by virel.

I hope you do not feel stalked by this, since we actually just want to
let people know when their vCards are tracked by us.

The major goal is to push the spread of microformats, especially vcards.

kind regards,

Simon Theophil



On 07.07.2008 you wrote:
> Hi,
>
> look what happened now. I just got the same kind of email. Looks like
> they are sending emails to email addresses found in hCards. It's just
> like spam. I just dropped them a mail saying so.
>
> Ameer
>
>
> On Mon, Jul 7, 2008 at 8:48 PM, Ameer Dawood <ameer1234567890@gmail.com> wrote:
> > Hi,
> >
> > The site looks authentic and useful. I don't know about the emails. I
> > have added my blog to the site. Let's see what happens next.
> >
> > Ameer
> >
> > On Mon, Jul 7, 2008 at 4:40 PM, Christian Heilmann <codepo8@gmail.com> wrote:
> >> I just got several automated emails from http://www.virel.org/index.php that
> >> they found uF of mine on sites and indexed them.
> >>
> >> Does anybody know the people behind it? I am not sure if that is cool or
> >> creepy :)
> >>
> >> Chris
> >>
> >> _______________________________________________
> >> microformats-discuss mailing list
> >> microformats-discuss@microformats.org
> >> http://microformats.org/mailman/listinfo/microformats-discuss
> >>
> >
> >
> >
> > --
> > Paul Lynde  - "I sang in the choir for years, even though my family
> > belonged to another church."
> >
>
>
>
> --
> George Burns  - "Don't stay in bed, unless you can make money in bed."
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss
>
>
>






-- 
Paul Lynde  - "I sang in the choir for years, even though my family
belonged to another church."
From jim at eatyourgreens.org.uk  Wed Jul  9 14:06:55 2008
From: jim at eatyourgreens.org.uk (Jim O'Donnell)
Date: Wed Jul  9 14:06:58 2008
Subject: [uf-discuss] hoard.it
In-Reply-To: <4872BCAF.23695.22CA6E96@bjonkman.sobac.com>
References: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
	<4872BCAF.23695.22CA6E96@bjonkman.sobac.com>
Message-ID: <21F95185-8795-4786-ADF5-5AE111EF1130@eatyourgreens.org.uk>

Thanks. I don't know what Dan did for hoard.it, but our original  
script treated 'about' or 'circa' as the date plus/minus five years.  
So 'circa 1800' would be returned as '1795/1805'. For 'before' or  
'after', you could return a pair of dates with either the first or  
second blank, accordingly. This is assuming we encode time periods as  
per the guidelines in the PNDS application profile:
http://www.ukoln.ac.uk/metadata/pns/pndsdcap/#DctermsTemporalDctermsPeriod
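
As a rough illustration of the rule described above (not the original script, which isn't shown here), a sketch for year-only expressions might look like this. The function name, the accepted qualifiers and the default margin parameter are my own assumptions; full dates such as "ABT 7 July 1950" are deliberately not handled.

```python
import re


def approximate_period(text: str, margin: int = 5) -> str:
    """Turn 'circa 1800', 'BEF 1925' or 'AFT 2000' into a 'start/end' string.

    'about'/'circa' widen the year by +/- margin years; 'before' leaves the
    start blank and 'after' leaves the end blank.
    """
    match = re.fullmatch(
        r"\s*(circa|about|abt|before|bef|after|aft)\.?\s+(\d{4})\s*",
        text,
        re.IGNORECASE,
    )
    if match is None:
        raise ValueError(f"unsupported date expression: {text!r}")
    qualifier = match.group(1).lower()
    year = int(match.group(2))
    if qualifier in ("circa", "about", "abt"):
        return f"{year - margin}/{year + margin}"
    if qualifier in ("before", "bef"):
        return f"/{year}"   # open start: any time up to the given year
    return f"{year}/"       # open end: any time from the given year on
```

Whether an open-ended period should really be written with a blank side is a question for the application profile linked above; the sketch only mirrors the description in this message.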

Jim

On 8 Jul 2008, at 06:02, Bob Jonkman wrote:

> Sounds great!  How does it deal with dates commonly found in
> genealogy, such as "ABT 7 July 1950" or "AFT 25 Dec 2000" or "BEF
> Jan 1925"? or even "ABT 2000"?
>
> --Bob.

Jim O'Donnell
jim@eatyourgreens.org.uk
http://eatyourgreens.org.uk
http://flickr.com/photos/eatyourgreens



From jim at eatyourgreens.org.uk  Wed Jul  9 14:30:26 2008
From: jim at eatyourgreens.org.uk (Jim O'Donnell)
Date: Wed Jul  9 14:30:30 2008
Subject: [uf-discuss] hoard.it
In-Reply-To: <4872FEE4.7050307@lebleu.org>
References: <07843653-3749-4C33-97CF-95A4BAC93710@eatyourgreens.org.uk>
	<4872FEE4.7050307@lebleu.org>
Message-ID: <C3192A5B-1FB4-48F4-A244-91527295C28A@eatyourgreens.org.uk>

On 8 Jul 2008, at 06:45, Guillaume Lebleu wrote:

> Jim O'Donnell wrote:
>> The recent discussion here about dates has made me wonder if such  
>> a web service would be useful for microformats parsers. What do  
>> others think?
> It seems to me that this type of date extraction might present  
> risks if used by uf parsers to extract date/time from published  
> content (and lead to the "people showing up on the wrong date"  
> error mentioned in earlier posts).
>
I don't think it's so risky. The inspiration for this particular work  
was Dan's experience on the 20th century London site:
http://www.20thcenturylondon.org.uk/ which involved parsing and
normalising text dates across four different collections. Granted it's
tedious to analyse all the different patterns that have been used, but
it isn't impossible to extract accurate ISO dates. The fact that the
archive was created from those four collections is a testament to that.

Museum catalogue records always have some sort of absolute date,  
though, which makes things easier for me. If people are marking up  
phrases like 'this Saturday' or '25th June' then I can see that  
extracting a date would be tricky - the parser would need the context  
within which to place the date, in order to get the year or month.
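
To show what I mean by context, here is a small sketch. It assumes the parser already has the day and month from a phrase like '25th June' and knows when the page was published; the function name and the choice of the publication year as the context are illustrative assumptions, and that guess can itself be wrong.

```python
from datetime import date


def resolve_day_month(day: int, month: int, published: date) -> date:
    """Place a year-less date such as '25th June' in the year the page was
    published. The publication date is the assumed context."""
    return date(published.year, month, day)
```

The same phrase resolves to a different date as soon as the context changes, which is exactly why the text alone is not enough.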

That said, I don't know how often people use hCalendar to mark up phrases  
like 'next weekend' vs, say, 'Saturday 19th July 2008'. If we had  
some idea of how microformats are being used to mark up dates in  
real, online text, then we could make some meaningful statements  
about how risky, or even impossible, it might be to extract ISO dates  
automatically.


> On the other hand, it might be great at the time content is  
> authored, to convert ambiguous natural language dates into  
> unambiguous microformats, as a way to reduce the pain of
> micro-formatting content (especially it can detect dates in plain text  
> rather than parsing something it knows is a date). Authors could  
> confirm the generated microformats before publishing in a way  
> similar to how Yahoo! shortcuts Wordpress plugin works [1]
>
Decent authoring tools would be brilliant. Not just for dates but  
locations and possibly other types of microformatted text. For  
instance, I can link a UK street address to Google maps and get back  
a precise point on a map of the UK. So do I really need to manually  
write a lat/long into the HTML to tell a microformats tool how to  
place the address on a map? The text contains all the necessary  
information to perform this operation already.

I think microformats should be relatively easy for a non-technical  
author to add to their text. Decent tools that generate the
machine-readable data would be an enormous aid here.

Jim

Jim O'Donnell
jim@eatyourgreens.org.uk
http://eatyourgreens.org.uk
http://flickr.com/photos/eatyourgreens



From martin.mcevoy at contentstate.co.uk  Thu Jul 10 17:26:11 2008
From: martin.mcevoy at contentstate.co.uk (Martin McEvoy)
Date: Thu Jul 10 17:26:07 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<1214820941.3171.25.camel@localhost.localdomain>
	<61CCA888-066E-4F3E-8023-85F0EFBB37B5@adactio.com>
	<28AAF834-9517-42AF-9FD0-982891C9AA50@ben-ward.co.uk>
	<36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com>
	<FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
Message-ID: <1215735971.4288.24.camel@localhost.localdomain>

Hello Ben

On Tue, 2008-07-01 at 17:42 +0100, Ben Ward wrote:
> At the core, in breaking with the semantics of an HTML element, we've
> broken the behaviour of technologies using the element correctly and
> intelligently (hence my strong opposition to continuing to stretch
> ABBR outside of textual abbreviations as commonly described by
> dictionaries: "An abbreviation is a shortened form of a word or
> phrase." - Wikipedia, Apple OSX Dictionary, Dictionary.com)

I don't believe that *2008-07-11T00:01+0100* belongs anywhere where a
human can read it; the only place I have found where this data sits
nicely is either stuffed in the head of a document or in a class.

I have been "playing around" with the various solutions proposed on both
this and uf-dev over the past few weeks, none of which turned out too
well when it came to parsing (for me).

Anyway, I tried something different by just re-using existing
microformats "item" and "value"...

 <div class="item updated"> 
	<p>Date <span class="value 2008-07-11T00:01+0100">Friday, July the 11th 2008</span></p>
</div>

There are a few more examples of how Item and Value work together in a
demo available here:

http://weborganics.co.uk/demo/machine-and-human-readable-data-format.html


It seems workable to me, but I guess that will be up to the rest of the community ;)

Thanks 

-- 
Martin McEvoy <martin.mcevoy@contentstate.co.uk>
ContentState <http://contentstate.co.uk/>

From paul.m.wilkins at gmail.com  Thu Jul 10 18:47:00 2008
From: paul.m.wilkins at gmail.com (Paul Wilkins)
Date: Thu Jul 10 18:47:04 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <1215735971.4288.24.camel@localhost.localdomain>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com>
	<FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
	<1215735971.4288.24.camel@localhost.localdomain>
Message-ID: <be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>

On Fri, Jul 11, 2008 at 12:26 PM, Martin McEvoy
<martin.mcevoy@contentstate.co.uk> wrote:
>  <div class="item updated">
>        <p>Date <span class="value 2008-07-11T00:01+0100">Friday, July the 11th 2008</span></p>
> </div>

We should leverage the computer's ability to do the hard work for us.

<p>Date <span class="date">Friday, July the 11th 2008</span></p>

The date can be easily parsed by the system, in a number of limited
formats at first but growing in capabilities over time.

-- 
Paul Wilkins
From john at westciv.com  Thu Jul 10 20:51:00 2008
From: john at westciv.com (John Allsopp)
Date: Thu Jul 10 20:51:06 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<36A13CC9-03D2-46B6-AA3E-5DBDAFB7940A@adactio.com>
	<FAFFA1A4-1EF8-4CE0-8DFB-300ECFB9780D@randomchaos.com>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
	<1215735971.4288.24.camel@localhost.localdomain>
	<be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>
Message-ID: <CD807776-46E4-47C5-B020-24445F741CBA@westciv.com>

Paul,


> We should leverage the computer's ability to do the hard work for us.
>
> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
>
> The date can be easily parsed by the system, in a number of limited
> formats at first but growing in capabilities over time.


That date can be easily parsed. But what about "tomorrow", "three  
weeks ago", and so on? Any solution that requires some kind of parsing  
of dates and times will fail in many instances of human to human  
communication.

john



John Allsopp

style master :: css editor :: http://westciv.com/style_master
about me :: http://johnfallsopp.com
Web Directions Conferences :: http://webdirections.org
My Microformats book :: http://microformatique.com/book


From mail at tobyinkster.co.uk  Fri Jul 11 01:17:54 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Fri Jul 11 01:18:22 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <18E049A6-3659-4DEA-B950-1BA5399AE936@tobyinkster.co.uk>

Martin McEvoy wrote:

> <div class="item updated">
> 	<p>Date <span class="value 2008-07-11T00:01+0100">Friday, July the  
> 11th 2008</span></p>
> </div>

There are a couple of problems with this:

Firstly, the class attribute may contain more than two classes - e.g.  
it may contain some others that have been added for styling or  
Javascript purposes. When there are more than two classes, parsers  
will need to have some kind of heuristic to figure out which one to  
parse as the value. This may be pretty easy for dates, but if someone  
wanted to use the pattern for one of the other problematic properties  
that have been identified (e.g "type" in hCard tel, or in hReview),  
this would become harder.

Secondly, and more importantly, it breaks the existing interpretation  
of class="value", which on <span> elements is currently used to mean  
that the textual content of the element should be used as the value.  
Faced with a "value" class, how should parsers know whether to parse  
by the old method (take the value from the element contents) or the new  
method (from the class attribute)? And yes, they will need to  
continue to support the old method because of the existing corpus of  
published data out there.

Frances' proposal with the "data-" prefix can suffer from the first  
problem (if there are two classes with a "data-" prefix), but that is  
easily spec'ed around by saying that in those situations, the longest  
such value is to be used. And it doesn't suffer from the second  
problem at all - the existence of a class with a data-prefix is a  
clear heuristic for parsers to determine whether to use the old  
method or new method.
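
A minimal sketch of that heuristic, under the assumption that the machine value is carried as a class token such as "data-2008-07-11T00:01+0100" (my guess at the syntax, not a quote from the proposal); the function name is hypothetical too:

```python
from typing import Optional


def machine_value(class_attr: str, prefix: str = "data-") -> Optional[str]:
    """Return the machine-readable value from a class attribute, or None.

    If several classes carry the prefix, the longest value wins. None tells
    the caller to fall back to the old method (the element's text content).
    """
    candidates = [
        token[len(prefix):]
        for token in class_attr.split()
        if token.startswith(prefix)
    ]
    if not candidates:
        return None
    return max(candidates, key=len)
```

A plain class="value" span yields None here, so existing published data keeps being parsed the way it is today.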

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>


From mail at tobyinkster.co.uk  Fri Jul 11 01:38:13 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Fri Jul 11 01:38:40 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>

Paul Wilkins wrote:

> We should leverage the computer's ability to do the hard work for us.
> <p>Date <span class="date">Friday, July the 11th 2008</span></p>

As I've said before, although my parser does support dates in this  
format, I strongly recommend *not* allowing these per spec, as it  
will lead to unpredictable and inconsistent results.

Yes, many programming languages do have libraries to do natural  
language parsing of dates, but these all differ subtly in what  
formats they support, how they interpret certain ambiguous dates, and  
how well they internationalise. e.g. I know that Perl's  
DateTime::Format::Natural, while it can perform very sophisticated  
parsing ("Saturday evening 3 months ago" => 2008-05-12T19:00:00,  
"thursday morning last week" => 2008-07-03T09:00:00) only includes  
English in the distributed module (though it has hooks allowing  
support for other languages). PHP's strtotime function is English  
only too, and there are differences in how it interprets some natural  
language dates, not just with Perl, but between different versions of  
PHP.

Natural language parsing is really not the way to go, nor is a  
limited range of date formats that *look* like NLP, because  
publishers will believe them to *be* NLP and start publishing in any  
old date format. ISO8601 is what we must stick with - we just must  
agree a better way of embedding it than <abbr>.
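
By way of contrast, a strict parser is tiny and behaves the same everywhere. This sketch uses Python's standard datetime.strptime and accepts only one ISO 8601 profile (date, hours and minutes, numeric offset); the function name is hypothetical and a real parser would accept more profiles.

```python
from datetime import datetime
from typing import Optional


def parse_iso_datetime(text: str) -> Optional[datetime]:
    """Parse 'YYYY-MM-DDThh:mm+hhmm' strictly; return None for anything else.

    No guessing, no locale, no natural language.
    """
    try:
        return datetime.strptime(text, "%Y-%m-%dT%H:%M%z")
    except ValueError:
        return None
```

Given "2008-07-11T00:01+0100" it returns a timezone-aware datetime; given "Friday, July the 11th 2008" it returns None instead of a guess.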

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>


From danbri at danbri.org  Fri Jul 11 01:47:35 2008
From: danbri at danbri.org (Dan Brickley)
Date: Fri Jul 11 01:47:40 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
References: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
Message-ID: <48771E27.5050709@danbri.org>

Toby A Inkster wrote:
> Paul Wilkins wrote:
> 
>> We should leverage the computer's ability to do the hard work for us.
>> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
> 
> As I've said before, although my parser does support dates in this 
> format, I strongly recommend *not* allowing these per spec, as it will 
> lead to unpredictable and inconsistent results.
> 
> Yes, many programming languages do have libraries to do natural language 
> parsing of dates, but these all differ subtly in what formats they 
> support, how they interpret certain ambiguous dates, and how well they 
> internationalise. e.g. I know that Perl's DateTime::Format::Natural, 
> while it can perform very sophisticated parsing ("Saturday evening 3 
> months ago" => 2008-05-12T19:00:00, "thursday morning last week" => 
> 2008-07-03T09:00:00) only includes English in the distributed module 
> (though it has hooks allowing support for other languages). PHP's 
> strtotime function is English only too, and there are differences in how 
> it interprets some natural language dates, not just with Perl, but 
> between different versions of PHP.
> 
> Natural language parsing is really not the way to go, nor is a limited 
> range of date formats that *look* like NLP, because publishers will 
> believe them to *be* NLP and start publishing in any old date format. 
> ISO8601 is what we must stick with - we just must agree a better way of 
> embedding it than <abbr>.

Thank you for spelling this out so clearly. Please let's not slip into 
treating the non-English-speaking Web as a corner case. ISO8601's the 
thing. And it won't always be what the party reading the page expects 
(either in terms of language, script or even calendar).

cheers,

Dan

-
http://danbri.org/
From ameer1234567890 at gmail.com  Fri Jul 11 07:44:58 2008
From: ameer1234567890 at gmail.com (Ameer Dawood)
Date: Fri Jul 11 07:45:04 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <CD807776-46E4-47C5-B020-24445F741CBA@westciv.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<AEF59CBB-82C7-41BD-B04A-340DDF459372@adactio.com>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
	<1215735971.4288.24.camel@localhost.localdomain>
	<be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>
	<CD807776-46E4-47C5-B020-24445F741CBA@westciv.com>
Message-ID: <19abcbf20807110744r1f8faa9v5d115637bc6e188@mail.gmail.com>

Hi,

Just one more thing to add. Microformats should be designed in such a
way that authors are not obliged to write up a specific date format
for display to users. If we are to follow the idea of a
machine-readable as well as human-readable date format, then authors
would be obliged to use that specific format for users.


Ameer


On Fri, Jul 11, 2008 at 9:51 AM, John Allsopp <john@westciv.com> wrote:
> Paul,
>
>
>> we should leverage the computer's ability to do the hard work for us.
>>
>> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
>>
>> The date can be easily parsed by the system, in a number of limited
>> formats at first but growing in capabilities over time.
>
>
> that date can be easily parsed. But what about "tomorrow", "three weeks
> ago", and so on? Any solution that requires some kind of parsing of dates
> and times will fail in many instances of human to human communication.
>
> john
>
>
>
> John Allsopp
>
> style master :: css editor :: http://westciv.com/style_master
> about me :: http://johnfallsopp.com
> Web Directions Conferences :: http://webdirections.org
> My Microformats book :: http://microformatique.com/book
>
>
> _______________________________________________
> microformats-discuss mailing list
> microformats-discuss@microformats.org
> http://microformats.org/mailman/listinfo/microformats-discuss
>
From tom.hamshere at gmail.com  Fri Jul 11 08:56:17 2008
From: tom.hamshere at gmail.com (Tom)
Date: Fri Jul 11 08:56:21 2008
Subject: [uf-discuss] Microformats to describe a broadcast
Message-ID: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>

Hi,
I'm working for Channel4 on a new programme guide and am interested to
know if there was any resolution made on the discussion the BBC took
part in early last year...
http://microformats.org/discuss/mail/microformats-discuss/2007-January/008129.html

Is anyone aware of any other programme guides which have employed the
hCalendar microformat to date? I note that this doesn't seem to have
been achieved in the current BBC listings.

Regards,
Tom Hamshere
From Michael.Smethurst at bbc.co.uk  Fri Jul 11 09:49:33 2008
From: Michael.Smethurst at bbc.co.uk (Michael Smethurst)
Date: Fri Jul 11 09:49:39 2008
Subject: [uf-discuss] Microformats to describe a broadcast
In-Reply-To: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>
Message-ID: <C49D4DAD.DBBF%Michael.Smethurst@bbc.co.uk>

Hi Tom

They were there

Then they weren't

http://www.bbc.co.uk/blogs/radiolabs/2008/06/removing_microformats_from_bbc.shtml


On 11/7/08 16:56, "Tom" <tom.hamshere@gmail.com> wrote:

> Hi,
> I'm working for Channel4 on a new programme guide and am interested to
> know if there was any resolution made on the discussion the BBC took
> part in early last year...
> http://microformats.org/discuss/mail/microformats-discuss/2007-January/008129.html
> 
> Is anyone aware of any other programme guides which have employed the
> hCalendar microformat to date? I note that this doesn't seem to have
> been achieved in the current BBC listings.
> 
> Regards,
> Tom Hamshere


From lists at ben-ward.co.uk  Fri Jul 11 10:01:13 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Fri Jul 11 10:01:20 2008
Subject: [uf-discuss] Microformats to describe a broadcast
In-Reply-To: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>
References: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>
Message-ID: <EC6E6971-5EAA-4087-A792-F6E4B0BEFCD1@ben-ward.co.uk>

Hi Tom,

> Is anyone aware of any other programme guides which have employed the
> hCalendar microformat to date? I note that this doesn't seem to have
> been achieved in the current BBC listings.


We've got hCalendar on all of Yahoo's UK TV listings:

   - http://uk.tv.yahoo.com/listings/2008-07-11/20-30/
   - http://uk.tv.yahoo.com/listings/2008-07-11/20-30/by-hour/
   - http://uk.tv.yahoo.com/listings/bbc-1/2008-07-11/

We implemented it using a somewhat icky hack (an empty ABBR element  
with the ISO date in the title), but it parses in most parsers whilst  
not exposing the ISO date as a tooltip to browsers. Most recent  
testing suggests that it will still be revealed in screen readers set  
to read ABBR titles though, which is unfortunate.

Oh, there's a bug in our output that's putting a translation string  
(%z) into the ISO date where the timezone should go. That'll be fixed  
shortly!
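
For anyone wanting to see what a consumer of the empty-ABBR pattern has to do, here is a rough Python sketch using only the standard library. The sample markup and the DtstartFinder class are illustrative assumptions, not the actual Yahoo output or any existing parser; the point is only that the machine date is read from the title attribute while the element itself has no text.

```python
from html.parser import HTMLParser


class DtstartFinder(HTMLParser):
    """Collect title values of <abbr> elements whose class includes 'dtstart'."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        classes = (attr.get("class") or "").split()
        if tag == "abbr" and "dtstart" in classes and attr.get("title"):
            self.values.append(attr["title"])


# Hypothetical listing markup: the ABBR is empty, so nothing is rendered,
# but the ISO 8601 value is still available to parsers via @title.
SAMPLE = (
    '<li class="vevent">'
    '<abbr class="dtstart" title="2008-07-11T20:30:00+01:00"></abbr>'
    '<span class="summary">Example Programme</span> at 8:30pm'
    "</li>"
)

finder = DtstartFinder()
finder.feed(SAMPLE)
dtstart_values = finder.values
```

Because the value lives in @title, assistive technology configured to read ABBR titles can still announce it, which is the drawback noted above.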

B
From tom at tommorris.org  Fri Jul 11 16:20:57 2008
From: tom at tommorris.org (Tom Morris)
Date: Fri Jul 11 16:21:01 2008
Subject: [uf-discuss] Microformats to describe a broadcast
In-Reply-To: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>
References: <94165a1c0807110856i76f63520w90767dfc448a5f66@mail.gmail.com>
Message-ID: <d375f00f0807111620s58997c92mf704dc2217286b7a@mail.gmail.com>

On Fri, Jul 11, 2008 at 4:56 PM, Tom <tom.hamshere@gmail.com> wrote:
> Hi,
> I'm working for Channel4 on a new programme guide and am interested to
> know if there was any resolution made on the discussion the BBC took
> part in early last year...
> http://microformats.org/discuss/mail/microformats-discuss/2007-January/008129.html
>
> Is anyone aware of any other programme guides which have employed the
> hCalendar microformat to date? I note that this doesn't seem to have
> been achieved in the current BBC listings.
>

May I point out that you could also think about using the BBC
Programmes Ontology to describe your programmes in RDF:
http://www.bbc.co.uk/ontologies/programmes/2008-02-28.shtml

It'd be great if - once the hCalendar date-time issues are solved - an
HTML to Programmes Ontology mapping could be made that used hCard,
hCalendar and other microformats.

Yours,

-- 
Tom Morris
http://tommorris.org/
From zen at zenpsycho.com  Sat Jul 12 06:50:31 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Sat Jul 12 06:57:06 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <48771E27.5050709@danbri.org>
References: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
	<48771E27.5050709@danbri.org>
Message-ID: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>

On Fri, Jul 11, 2008 at 6:47 PM, Dan Brickley <danbri@danbri.org> wrote:
> Toby A Inkster wrote:
>>
>> Paul Wilkins wrote:
>>
>>> We should leverage the computer's ability to do the hard work for us.
>>> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
>>
>> As I've said before, although my parser does support dates in this format,
>> I strongly recommend *not* allowing these per spec, as it will lead to
>> unpredictable and inconsistent results.
>>
>> Yes, many programming languages do have libraries to do natural language
>> parsing of dates, but these all differ subtly in what formats they support,
>> how they interpret certain ambiguous dates, and how well they
>> internationalise. e.g. I know that Perl's DateTime::Format::Natural, while
>> it can perform very sophisticated parsing ("Saturday evening 3 months ago"
>> => 2008-05-12T19:00:00, "thursday morning last week" => 2008-07-03T09:00:00)
>> only includes English in the distributed module (though it has hooks
>> allowing support for other languages). PHP's strtotime function is English
>> only too, and there are differences in how it interprets some natural
>> language dates, not just with Perl, but between different versions of PHP.
>>
>> Natural language parsing is really not the way to go, nor is a limited
>> range of date formats that *look* like NLP, because publishers will believe
>> them to *be* NLP and start publishing in any old date format. ISO8601 is
>> what we must stick with - we just must agree a better way of embedding it
>> than <abbr>.
>
> Thank you for spelling this out so clearly. Please let's not slip into
> treating the non-English-speaking Web as a corner case. ISO8601's the thing.
> And it won't always be what the party reading the page expects (either in
> terms of language, script or even calendar).
>
> cheers,
>
> Dan
>


In what way is ISO 8601 more friendly to non-English speakers than any
other date format?
Please realise that by insisting that no natural language style will
be a solution, you are essentially saying that there is no solution to
this problem.

1. metadata and information hiding is out of the question
2. putting ISO 8601 style dates ("machine dates") in any place where a
human can see it or have it read to them  is "the problem" that we are
trying to solve, so we can't do that.
3. The date cannot resemble anything a human might want to read.

I find it terribly frustrating how many people cannot see that this
set of constraints yields NO solution. At least, when the constraints
are held to the level of strictness that this community is holding
them to.


>> Natural language parsing is really not the way to go, nor is a limited
>> range of date formats that *look* like NLP, because publishers will believe
>> them to *be* NLP and start publishing in any old date format. ISO8601 is
>> what we must stick with - we just must agree a better way of embedding it
>> than <abbr>.
>
The premise that publishers will pick any old format is merely an
assertion with no evidence. Please show us an example somewhere else
where this has happened, or perhaps a better argument than merely
insisting on the "obvious" truth of it.

The way I see it, if they publish in the wrong format, then the
parsers won't pick up the date. This is what happens with microformats
already. I don't know about anyone else, but when I publish a
microformat, I test whether parsers can read it correctly. I do the
same thing with any html. If a publisher can't take the time to test,
and publish in the correct format then they take the consequences.
It's exactly the same with any other technology. Why should
microformats be any different? Why do you think making a microformat
resemble natural language drastically changes this set of rules?

As to the person who was concerned about forcing a particular format
in a place where a human can read it, I have not seen a single
proposed solution which does not do this, without violating the "no
information hiding" principle.

You may not like it, but too bad. Making a date resemble natural
language is the only way to go. I don't say this because it's my
opinion. This is merely a fact, due to the nature of the problem, and
the constraints that the community has enforced on possible solutions.
Accept it, or doom yourselves to reasoning around in circles some
more, as you have already done.
From jason.karns at gmail.com  Sat Jul 12 10:23:20 2008
From: jason.karns at gmail.com (Jason Karns)
Date: Sat Jul 12 10:23:22 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
References: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
	<48771E27.5050709@danbri.org>
	<ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
Message-ID: <1005d65f0807121023k59843f70o4c716a3490886f87@mail.gmail.com>

> The premise that publishers will pick any old format is merely an
> assertion with no evidence. Please show us an example somewhere else
> where this has happened, or perhaps a better argument than merely
> insisting on the "obvious" truth of it.
>
> The way I see it, if they publish in the wrong format, then the
> parsers won't pick up the date. This is what happens with microformats
> already. I don't know about anyone else, but when I publish a
> microformat, I test whether parsers can read it correctly. I do the
> same thing with any html. If a publisher can't take the time to test,
> and publish in the correct format then they take the consequences.
> it's exactly the same with any other technology. Why should
> microformats be any different? Why do you think making a microformat
> resemble natural language drastically changes this set of rules?
>
The problem is as simple as testing in a parser to verify that the
format is correct.  NLP is too difficult to be easily solved in every
parser.  The outcome will be that different parsers will handle
different levels of NLP, parsing only subsets of accepted 'native
language formats'. This is similar to the way many parsers are now.
(Many parsers handle different portions of the specs. Few handle the
entire spec. Case in point: the include pattern.)  Even assuming the
very extreme case that all parsers handle the same string formats, no
parser will ever handle every possible language permutation.

The only solution that will result in practical parser use will
*require* some amount of data duplication.  Just as you stated:
1. metadata and information hiding is out of the question
2. putting ISO 8601 style dates ("machine dates") in any place where a
human can see it or have it read to them  is "the problem" that we are
trying to solve, so we can't do that.
3. The date cannot resemble anything a human might want to read.

One of the above rules must be broken. #2 is the problem as you said.
#3 will result in a 'spec' that will never be fully implemented in all
parsers and will thus never be practical for publishing. #1 therefore
must be broken.  I don't understand why this is even an argument at
this point. The abbr-pattern was already accepted though it violates
this principle. The only reason it is rejected now is because of the
semantics of the @title attribute. Thus any solution that violates
principle #1 in the same way as the abbr-pattern should also be
acceptable so long as it does not suffer the same accessibility issue.

Any sort of class="data-*" solution seems to be an acceptable
compromise (and a compromise is what is required). It keeps the data
machine-readable without making parsing impractical. It keeps the
machine data out of human-readable context (@title). And it keeps the
duplicate data near the human-readable version for maintenance.
(Though I take exception with the duplicate-data principle as most
publishers use automated tools that easily duplicate data without
causing stale-issues.)
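
As a rough sketch of how cheap the parsing side of a class-based approach could be, here is a small Python example. The exact syntax is an assumption: the thread only mentions class="data-*", so the "data-" prefix followed directly by the ISO value, the sample markup and the DataClassFinder name are all hypothetical.

```python
from html.parser import HTMLParser

PREFIX = "data-"  # hypothetical prefix; no syntax has been agreed on the list


class DataClassFinder(HTMLParser):
    """Pair each ordinary class name with any 'data-' value on the same element."""

    def __init__(self):
        super().__init__()
        self.found = []  # list of (property, machine_value) tuples

    def handle_starttag(self, tag, attrs):
        tokens = (dict(attrs).get("class") or "").split()
        values = [t[len(PREFIX):] for t in tokens if t.startswith(PREFIX)]
        props = [t for t in tokens if not t.startswith(PREFIX)]
        for prop in props:
            for value in values:
                self.found.append((prop, value))


SAMPLE = (
    '<span class="dtstart data-2008-07-12T10:23:20">'
    "Saturday 12 July, 10:23am</span>"
)

finder = DataClassFinder()
finder.feed(SAMPLE)
pairs = finder.found
```

One consequence of this design is that class tokens are whitespace-separated, so any machine value containing a space would need some agreed encoding before it could be carried this way.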

~Jason
From mail at tobyinkster.co.uk  Sat Jul 12 11:29:07 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Sat Jul 12 11:29:45 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <DC685A3E-71FE-4811-9C5F-6BA664B26C3C@tobyinkster.co.uk>

Breton Slivka wrote:

> The premise that publishers will pick any old format is merely an
> assertion with no evidence. Please show us an example somewhere else
> where this has happened, or perhaps a better argument than merely
> insisting on the "obvious" truth of it.

I have previously mentioned the example of RFC 822. This standard  
defined a very specific human-readable date format for use in e- 
mails, and despite the fact that only a handful of people had to deal  
with it (i.e. the people writing mail clients, not the people *using*  
them), it quickly fragmented.

Examples of specific deviations are that the RFC defines years to be  
two digits, whereas implementors quickly started using four digit  
years. (Fair enough I suppose as two digit years were a poor choice  
to begin with. The revised specification RFC 2822 switched to four  
digit years.) It also used strings like "Mon", "Tue", etc for days  
and "Jan", "Feb", etc for months. Despite the fact that these exact  
three letter strings were required by the specification, implementors  
often localised them. Lastly, although it used +/-NNNN for timezones,  
it also defined alphabetic codes like "GMT" and so forth for about a  
dozen commonly used timezones. However, many implementations started  
using other alphanumeric timezones not defined in the spec.
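
The effect of these deviations on a consumer can be sketched with Python's standard email.utils.parsedate_tz, which returns None when it cannot make sense of a date. The describe helper and the sample header values are made up for illustration; the behaviour of other mail software will differ.

```python
from email.utils import parsedate_tz


def describe(header_value):
    """Return (year, month, day, utc_offset_seconds), or None if unparseable."""
    parsed = parsedate_tz(header_value)
    if parsed is None:
        return None
    # Index 9 of the returned tuple is the timezone offset in seconds.
    return parsed[0], parsed[1], parsed[2], parsed[9]


# A date written as the later RFC 2822 expects: English names, numeric zone.
conforming = describe("Sat, 12 Jul 2008 11:29:07 +0100")

# One of the alphabetic zones the specification itself defines.
alphabetic_zone = describe("Sat, 12 Jul 2008 10:29:07 GMT")

# Localised day and month names, as some implementations emitted.
localised = describe("Sam, 12 Juil 2008 11:29:07 +0200")
```

The localised header is perfectly readable to a French speaker, but this particular parser gives up on it, which is the interoperability cost of drifting from the specified strings.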

ISO 8601 is a good, well-defined spec for dates and times, with many  
existing and interoperable implementations. It is clearly the best  
choice to standardise on as a date format. However, it's important  
that we offer publishers the option of hiding the ISO date from their  
visitors and displaying the date in a different format (perhaps even  
in a different calendar!) for them.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>

From zack.carter at gmail.com  Sat Jul 12 11:39:44 2008
From: zack.carter at gmail.com (Zachary Carter)
Date: Sat Jul 12 11:39:46 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <1005d65f0807121023k59843f70o4c716a3490886f87@mail.gmail.com>
References: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
	<48771E27.5050709@danbri.org>
	<ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<1005d65f0807121023k59843f70o4c716a3490886f87@mail.gmail.com>
Message-ID: <f56dbfbd0807121139h2d48b6a5gdec46b66853ac3fe@mail.gmail.com>

+1 for class="data-"
Hidden metadata isn't going away anytime soon. HTML 5 features it,
RDF/RDFa uses it, the empty abbr pattern already does it, and many
others.

Best,

Zach Carter


On Sat, Jul 12, 2008 at 1:23 PM, Jason Karns <jason.karns@gmail.com> wrote:
>> The premise that publishers will pick any old format is merely an
>> assertion with no evidence. Please show us an example somewhere else
>> where this has happened, or perhaps a better argument than merely
>> insisting on the "obvious" truth of it.
>>
>> The way I see it, if they publish in the wrong format, then the
>> parsers won't pick up the date. This is what happens with microformats
>> already. I don't know about anyone else, but when I publish a
>> microformat, I test whether parsers can read it correctly. I do the
>> same thing with any html. If a publisher can't take the time to test,
>> and publish in the correct format then they take the consequences.
>> it's exactly the same with any other technology. Why should
>> microformats be any different? Why do you think making a microformat
>> resemble natural language drastically changes this set of rules?
>>
> The problem is as simple as testing in a parser to verify that the
> format is correct.  NLP is too difficult to be easily solved in every
> parser.  The outcome will be that different parsers will handle
> different levels of NLP, parsing only subsets of accepted 'native
> language formats'. This is similar to the way many parsers are now.
> (Many parsers handle different portions of the specs. Few handle the
> entire spec. Case in point: the include pattern.)  Even assuming the
> very extreme case that all parsers handle the same string formats, no
> parser will ever handle every possible language permutation.
>
> The only solution that will result in practical parser use will
> *require* some amount of data duplication.  Just as you stated:
> 1. metadata and information hiding is out of the question
> 2. putting ISO 8601 style dates ("machine dates") in any place where a
> human can see it or have it read to them  is "the problem" that we are
> trying to solve, so we can't do that.
> 3. The date cannot resemble anything a human might want to read.
>
> One of the above rules must be broken. #2 is the problem as you said.
> #3 will result in a 'spec' that will never be fully implemented in all
> parsers and will thus never be practical for publishing. #1 therefore
> must be broken.  I don't understand why this is even an argument at
> this point. The abbr-pattern was already accepted though it violates
> this principle. The only reason it is rejected now is because of the
> semantics of the @title attribute. Thus any solution that violates
> principle #1 in the same way as the abbr-pattern should also be
> acceptable so long as it does not suffer the same accessibility issue.
>
> Any sort of class="data-*" solution seems to be an acceptable
> compromise (and a compromise is what is required). It keeps the data
> machine-readable without making parsing impractical. It keeps the
> machine data out of human-readable context (@title). And it keeps the
> duplicate data near the human-readable version for maintenance.
> (Though I take exception with the duplicate-data principle as most
> publishers use automated tools that easily duplicate data without
> causing stale-issues.)
>
> ~Jason
From paul.m.wilkins at gmail.com  Sat Jul 12 18:47:23 2008
From: paul.m.wilkins at gmail.com (Paul Wilkins)
Date: Sat Jul 12 18:47:27 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <19abcbf20807110744r1f8faa9v5d115637bc6e188@mail.gmail.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<3BCE3C9D-2E84-4DFF-AD18-891C8CB492FB@randomchaos.com>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
	<1215735971.4288.24.camel@localhost.localdomain>
	<be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>
	<CD807776-46E4-47C5-B020-24445F741CBA@westciv.com>
	<19abcbf20807110744r1f8faa9v5d115637bc6e188@mail.gmail.com>
Message-ID: <be0a36470807121847u15ec55a1keb51ee92f30da38e@mail.gmail.com>

On Sat, Jul 12, 2008 at 2:44 AM, Ameer Dawood <ameer1234567890@gmail.com> wrote:
> Just one more thing to add. Microformats should be designed in such a
> way that authors are not obliged to write up a specific date format
> for display to users. If we are to follow the idea of a
> machine-readable as well as human-readable date format, then authors
> would be obliged to use that specific format for users.

With the current system authors are obliged to write up a specific
date format for computers to parse, as well as one for humans to read.

They should not have to produce both types on every occasion.
If a parser isn't able to work out tomorrow or next week from context,
then that date could be made more explicit in the code until solutions
are devised.

It's not right though to demand content authors to duplicate the dates
they're entering.
Humans first, machines second. That's how it should be.

-- 
Paul Wilkins
From qidydl at gmail.com  Sat Jul 12 19:10:12 2008
From: qidydl at gmail.com (David O)
Date: Sat Jul 12 19:10:17 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <be0a36470807121847u15ec55a1keb51ee92f30da38e@mail.gmail.com>
References: <36A319113CF910438942741C4727ADFF02132B0D@MOBY.Clarence.local>
	<ae2b2ba80806302212o7530876fxdd46708c0ce64dcd@mail.gmail.com>
	<36A319113CF910438942741C4727ADFF02132F1F@MOBY.Clarence.local>
	<486A54DD.2030105@lebleu.org>
	<61F0DF46-CD31-43A3-AF97-1D357E74B431@ben-ward.co.uk>
	<1215735971.4288.24.camel@localhost.localdomain>
	<be0a36470807101847yc37e161jb81cc76f87a68f13@mail.gmail.com>
	<CD807776-46E4-47C5-B020-24445F741CBA@westciv.com>
	<19abcbf20807110744r1f8faa9v5d115637bc6e188@mail.gmail.com>
	<be0a36470807121847u15ec55a1keb51ee92f30da38e@mail.gmail.com>
Message-ID: <ee0909a60807121910p2b65b4d3h2803e085eaacd82e@mail.gmail.com>

On Sat, Jul 12, 2008 at 9:47 PM, Paul Wilkins <paul.m.wilkins@gmail.com> wrote:
> With the current system authors are obliged to write up a specific
> date format for computers to parse, as well as one for humans to read.
>
> They should not have to produce both types on every occasion.
> If a parser isn't able to work out tomorrow or next week from context,
> then that date could be made more explicit in the code until solutions
> are devised.

Unfortunately, natural language processing is very, very hard, even
when you're only focusing on one language, and we're attempting to
create a solution for every single language in which (X)HTML content
is publicly published.  As far as I can tell, the choice here is
between an ideal solution which cannot currently be implemented, or a
kludge that can.  Also, as far as I can tell, one of the basic
principles of the microformats community from the outset was to value
things that actually worked over idealized, perfect solutions,
precisely because such solutions almost never actually get built.

> It's not right though to demand content authors to duplicate the dates
> they're entering.
> Humans first, machines second. That's how it should be.

The key word here is "should."
From lists at ben-ward.co.uk  Sun Jul 13 10:25:05 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Sun Jul 13 10:31:06 2008
Subject: [uf-discuss] Scheduling Wiki Downtime
Message-ID: <544B0D75-1B21-4126-A1B1-F61BD1286E40@ben-ward.co.uk>

Hi everyone,

One of my admin tasks at the moment is to run an upgrade of the  
Microformats Wiki. Critically, this is to upgrade our install of  
MediaWiki to the most recent release, but also taking the opportunity  
to add some new microformat-friendly enhancements to the install, and  
to the theme.

You can see the list of features/improvements I've been hacking on  
here: http://microformats.org/wiki/todo#Wiki_2.0

The work is nearly done, and running the upgrade is possible in  
multiple steps, but it's going to require some - hopefully short -  
periods of downtime to apply the updates. Critically, MediaWiki has to  
change the structure of the database between versions, so the wiki  
will need to be taken offline whilst that process runs.

My intention is to run this upgrade on Sunday 20th July (next  
weekend), sometime in the afternoon GMT. As such, if you're planning  
any work for that afternoon that requires you to refer to microformats  
documentation, please take an offline copy.

The wiki should only be offline for a couple of hours.

Thanks,

Ben
From nirmal at gatech.edu  Sun Jul 13 08:04:13 2008
From: nirmal at gatech.edu (Nirmal Patel)
Date: Sun Jul 13 11:58:45 2008
Subject: [uf-discuss] Generically converting JSON to POSH
Message-ID: <a450e4a80807130804h542c5d32i9e0eda47aea5bb03@mail.gmail.com>

Hello,

I've been experimenting with some ideas on ways of properly turning
JSON[1] objects into POSH. I have posted my first attempt at
http://nirmalpatel.com/json2posh along with further details. Any
feedback is greatly appreciated.

I used the code to convert the JSON object returned by Twitter and
then used CSS to style the results. The CSS can be toggled to see the
output before/after.

I am converting arrays into ordered lists and dictionaries into
unordered lists where each list item has a classname that is equal to
the key in the key/value pair.

Strings are converted to text nodes with the exception that links are
found through regex and automatically converted to anchor nodes.

Numbers, booleans and null are converted to text nodes
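
The mapping described above can be sketched in a few lines of Python. This is not the code behind json2posh; the function names to_posh and linkify, the URL regex and the sample object are assumptions made for illustration only.

```python
import json
import re
from html import escape

# Deliberately simple URL pattern; a real converter would need something sturdier.
URL_RE = re.compile(r'https?://[^\s<>"]+')


def linkify(text):
    """Escape text and wrap anything URL-like in an anchor element."""
    parts, last = [], 0
    for match in URL_RE.finditer(text):
        parts.append(escape(text[last:match.start()]))
        url = escape(match.group(0), quote=True)
        parts.append(f'<a href="{url}">{url}</a>')
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


def to_posh(value):
    """Convert a decoded JSON value into nested HTML lists."""
    if isinstance(value, dict):
        # Dictionaries become unordered lists; the key becomes the class name.
        items = "".join(
            f'<li class="{escape(str(key), quote=True)}">{to_posh(item)}</li>'
            for key, item in value.items()
        )
        return f"<ul>{items}</ul>"
    if isinstance(value, list):
        # Arrays become ordered lists.
        items = "".join(f"<li>{to_posh(item)}</li>" for item in value)
        return f"<ol>{items}</ol>"
    if isinstance(value, str):
        return linkify(value)
    # Numbers, booleans and null become plain text in their JSON spelling.
    return escape(json.dumps(value))


SAMPLE = (
    '{"user": "example", "tags": ["a", "b"], '
    '"url": "see http://www.json.org", "ok": true, "n": null}'
)
posh = to_posh(json.loads(SAMPLE))
```

Escaping both the text and the class value matters here, since the JSON comes from a third party and ends up in markup.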

Thank you for your time.

[1] http://www.json.org

-- 
Nirmal Patel
From mail at tobyinkster.co.uk  Mon Jul 14 01:52:28 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Mon Jul 14 01:52:53 2008
Subject: [uf-discuss] Generically converting JSON to POSH
Message-ID: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>

Nirmal Patel wrote:

> I am converting arrays into ordered lists and dictionaries into
> unordered lists where each list item has a classname that is equal to
> the key in the key/value pair.

Looks quite good. Personally I'd have converted JSON objects into  
definition lists using DT for the key and DD for the value though.  
Then perhaps add the key to the class of each *as well*. So, for  
instance:

{
   "key1" : "value1" ,
   "key2" : "value2"
}

becomes:

<dl class="json-object">
   <dt class="key1">key1</dt>
   <dd class="key1">value1</dd>
   <dt class="key2">key2</dt>
   <dd class="key2">value2</dd>
</dl>

Which should be more flexible with regards to styling, and probably  
more useful when looked at without CSS. If you happen to like the  
unordered list look, then:

dl.json-object dt { display: none; }
dl.json-object dd { display: list-item; list-style-type: circle; }

However, have you considered what happens when the object keys  
contain whitespace characters? e.g.

{
   "some key" : "some value"
}
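
One possible answer to the whitespace question, sketched in Python: since the class attribute is a whitespace-separated list, a key like "some key" would otherwise turn into two classes. The helpers key_to_class and object_to_dl are hypothetical, and replacing whitespace with a hyphen is just one convention among several.

```python
import re
from html import escape


def key_to_class(key):
    """Collapse whitespace runs so the key stays a single class token."""
    return re.sub(r"\s+", "-", key.strip())


def object_to_dl(obj):
    """Render a flat JSON object as a definition list, key in DT and value in DD."""
    rows = []
    for key, value in obj.items():
        cls = escape(key_to_class(key), quote=True)
        rows.append(f'<dt class="{cls}">{escape(key)}</dt>')
        rows.append(f'<dd class="{cls}">{escape(str(value))}</dd>')
    return '<dl class="json-object">' + "".join(rows) + "</dl>"


markup = object_to_dl({"some key": "some value"})
```

The obvious caveat is that the mapping is lossy: "some key" and "some-key" would end up with the same class name, so the original key is only recoverable from the DT text.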

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>



From andr3.pt at gmail.com  Mon Jul 14 02:06:38 2008
From: andr3.pt at gmail.com (=?ISO-8859-1?Q?Andr=E9_Lu=EDs?=)
Date: Mon Jul 14 02:06:41 2008
Subject: [uf-discuss] Generically converting JSON to POSH
In-Reply-To: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
References: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
Message-ID: <dc1a17860807140206l69741cbdqcde25486ae1ed269@mail.gmail.com>

Wouldn't that throw off parsers? If the JSON were an hCard, for
example, you'd have two .fn elements, only one of them holding the
true value.

If you removed the class of the dt, you'd still be able to do:

dl.json-object dt { display: none; }

--
André Luís

On Mon, Jul 14, 2008 at 9:52 AM, Toby A Inkster <mail@tobyinkster.co.uk> wrote:
> Nirmal Patel wrote:
>
>> I am converting arrays into ordered lists and dictionaries into
>> unordered lists where each list item has a classname that is equal to
>> the key in the key/value pair.
>
> Looks quite good. Personally I'd have converted JSON objects into definition
> lists using DT for the key and DD for the value though. Then perhaps add the
> key to the class of each *as well*. So, for instance:
>
> {
>  "key1" : "value1" ,
>  "key2" : "value2"
> }
>
> becomes:
>
> <dl class="json-object">
>  <dt class="key1">key1</dt>
>  <dd class="key1">value1</dd>
>  <dt class="key2">key2</dt>
>  <dd class="key2">value2</dd>
> </dl>
>
> Which should be more flexible with regards to styling, and probably more
> useful when looked at without CSS. If you happen to like the unordered list
> look, then:
>
> dl.json-object dt { display: none; }
> dl.json-object dd { display: list-item; list-style-type: circle; }
>
> However, have you considered what happens when the object keys contain
> whitespace characters? e.g.
>
> {
>  "some key" : "some value"
> }
>
> --
> Toby A Inkster
> <mailto:mail@tobyinkster.co.uk>
> <http://tobyinkster.co.uk>

From mail at ciaranmcnulty.com  Mon Jul 14 02:18:33 2008
From: mail at ciaranmcnulty.com (Ciaran McNulty)
Date: Mon Jul 14 02:18:36 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <f56dbfbd0807121139h2d48b6a5gdec46b66853ac3fe@mail.gmail.com>
References: <B754D0A3-B01F-48AD-8C5D-882DEC535431@tobyinkster.co.uk>
	<48771E27.5050709@danbri.org>
	<ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<1005d65f0807121023k59843f70o4c716a3490886f87@mail.gmail.com>
	<f56dbfbd0807121139h2d48b6a5gdec46b66853ac3fe@mail.gmail.com>
Message-ID: <cdc278e10807140218x79b655e7q3c609c267666538c@mail.gmail.com>

On Sat, Jul 12, 2008 at 7:39 PM, Zachary Carter <zack.carter@gmail.com> wrote:
> +1 for class="data-"
> Hidden metadata isn't going away anytime soon. HTML 5 features it,
> RDF/RDFa uses it, the empty abbr pattern already does it, and many
> others.

I think consensus seems to be that hidden data is ok for machine data,
as long as it's right next to the human-readable date in the HTML (so
that it's less likely to be overlooked when editing).

However, -1 from me for using @class in that way - I think it breaks
the semantics completely.

-Ciaran McNulty
From Michael.Smethurst at bbc.co.uk  Mon Jul 14 02:39:10 2008
From: Michael.Smethurst at bbc.co.uk (Michael Smethurst)
Date: Mon Jul 14 02:39:18 2008
Subject: [uf-discuss] Microformats to describe a broadcast
In-Reply-To: <d375f00f0807111620s58997c92mf704dc2217286b7a@mail.gmail.com>
Message-ID: <C4A0DD4E.DC71%Michael.Smethurst@bbc.co.uk>

Hi Tom + Tom


On 12/7/08 00:20, "Tom Morris" <tom@tommorris.org> wrote:

> On Fri, Jul 11, 2008 at 4:56 PM, Tom <tom.hamshere@gmail.com> wrote:
>> Hi,
>> I'm working for Channel4 on a new programme guide and am interested to
>> know if there was any resolution made on the discussion the BBC took
>> part in early last year...
>> http://microformats.org/discuss/mail/microformats-discuss/2007-January/008129.html
>> 
> 
> May I point out that you could also think about using the BBC
> Programmes Ontology to describe your programmes in RDF:
> http://www.bbc.co.uk/ontologies/programmes/2008-02-28.shtml

We spoke to Orpheus Warr at C4 sometime last year about mapping their
programme data to the Programmes Ontology. Perhaps now's the time to reopen
conversations - or at least a chat and a beer ;-)

Might make Kangaroo [1] integration a little easier if we shared the same
ontology.
> 
> It'd be great if - once the hCalendar date-time issues are solved - an
> HTML to Programmes Ontology mapping could be made that used hCard,
> hCalendar and other microformats.

That indeed would be very cool
> 
> Yours,

[1] http://www.pocket-lint.co.uk/news/news.phtml/11469/12493/BBC-ITV-C4-Project-Kangeroo.phtml


http://www.bbc.co.uk/
This e-mail (and any attachments) is confidential and may contain personal views which are not the views of the BBC unless specifically stated.
If you have received it in error, please delete it from your system.
Do not use, copy or disclose the information in any way nor act in reliance on it and notify the sender immediately.
Please note that the BBC monitors e-mails sent or received.
Further communication will signify your consent to this.
					
From Michael.Smethurst at bbc.co.uk  Mon Jul 14 04:19:27 2008
From: Michael.Smethurst at bbc.co.uk (Michael Smethurst)
Date: Mon Jul 14 04:19:35 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
Message-ID: <C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>

Hello


On 12/7/08 14:50, "Breton Slivka" <zen@zenpsycho.com> wrote:

> On Fri, Jul 11, 2008 at 6:47 PM, Dan Brickley <danbri@danbri.org> wrote:
>> Toby A Inkster wrote:
>>> 
>>> Paul Wilkins wrote:
>>> 
>>>> We should leverage the computers ability to do the hard work for us.
>>>> <p>Date <span class="date">Friday, July the 11th 2008</span></p>
>>> 
>>> As I've said before, although my parser does support dates in this format,
>>> I strongly recommend *not* allowing these per spec, as it will lead to
>>> unpredictable and inconsistent results.
>>> 
>>> Natural language parsing is really not the way to go, nor is a limited
>>> range of date formats that *look* like NLP, because publishers will believe
>>> them to *be* NLP and start publishing in any old date format. ISO8601 is
>>> what we must stick with - we just must agree a better way of embedding it
>>> than <abbr>.
>> 
>> Thank you for spelling this out so clearly. Please let's not slip into
>> treating the non-English-speaking Web as a corner case. ISO8601's the thing.
>> And it won't always be what the party reading the page expects (either in
>> terms of language, script or even calendar).
>> 


Not sure if this thread is only covering datetimes in abbreviations. The
title seems to suggest that it's more general, so I thought I'd chip in with a
thought on geo as an example. How would a parser deal with natural
(non-English) language here? Would it be expected to be able to parse
Manchester or Salford or London or Londres or Londinium?

Whilst it's just about possible to imagine NLP of dates, and trickier to
imagine NLP of multi-language date formats, it's just beyond the realms of
feasibility to consider NLP of place names.


> 
> 
> In what way is ISO 8601 more friendly to non english speakers than any
> other date format?
> Please realise that by insisting that no natural language style will
> be a solution, you are essentially saying that there is no solution to
> this problem.
> 
> 1. metadata and information hiding is out of the question
> 2. putting ISO 8601 style dates ("machine dates") in any place where a
> human can see it or have it read to them  is "the problem" that we are
> trying to solve, so we can't do that.

I thought the problem was any non human readable data where humans can 'see'
it - not confined to datetimes

> 3. The date cannot resemble anything a human might want to read.
> 
> I find it terribly frustrating how many people cannot see that this
> set of constraints yields NO solution. At least, when the constraints
> are held to the level of strictness that this community is holding
> them to.

Seems to me there are 2 solutions:

1. relax the data hiding constraint (tricky because it's fundamental to the
uf design philosophy and its relaxation has been rejected many times)

2. maintain the status quo. Keep the abbreviation design pattern for machine
friendly data and leave it up to publishers to decide if this is an issue
for them - or not. It would probably need the microformats community to
promote the design philosophy and potential issues a little higher than at
present. But the wiki already documents much of this - just a bit more
prominent linking and  some padding out of /about to be a little more
neutral.
> 
> 
>>> Natural language parsing is really not the way to go, nor is a limited
>>> range of date formats that *look* like NLP, because publishers will believe
>>> them to *be* NLP and start publishing in any old date format. ISO8601 is
>>> what we must stick with - we just must agree a better way of embedding it
>>> than <abbr>.
>> 
> 
> You may not like it, but too bad. Making a date resemble natural
> language is the only way to go. I don't say this because it's my
> opinion. This is merely a fact, due to the nature of the problem, and
> the constraints that the community has enforced on possible solutions.
> Accept it, or doom yourselves to reasoning around in circles some
> more, as you have already done.


From zen at zenpsycho.com  Mon Jul 14 05:39:05 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Mon Jul 14 05:39:09 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
Message-ID: <ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>

>
> Not sure if this thread is only covering datetimes in abbreviations. The
> title seems to suggest that it's more general so thought I'd chip in with a
> thought on geo as an example. How would a parser deal with natural
> (non-English) language here? Would it be expected to be able to parse
> Manchester or Salford or London or Londres or Londinium?
>
> Whilst it's just about possible to imagine NLP of dates and trickier to
> imagine NLP of multi-language date formats it's just beyond the realms of
> feasibility to consider NLP of place names
>

I'm confused; I'm afraid I don't understand the point of this thought exercise.

>

> I thought the problem was any non human readable data where humans can 'see'
> it - not confined to datetimes
>

One step at a time.


>> I find it terribly frustrating how many people cannot see that this
>> set of constraints yields NO solution. At least, when the constraints
>> are held to the level of strictness that this community is holding
>> them to.
>
> Seems to me there are 2 solutions:
>
> 1. relax the data hiding constraint (tricky because it's fundamental to the
> uf design philosophy and its relaxation has been rejected many times)
>
> 2. maintain the status quo. Keep the abbreviation design pattern for machine
> friendly data and leave it up to publishers to decide if this is an issue
> for them - or not. It would probably need the microformats community to
> promote the design philosophy and potential issues a little higher than at
> present. But the wiki already documents much of this - just a bit more
> prominent linking and  some padding out of /about to be a little more
> neutral.
>>
>>


There is another solution that I have been trying to advocate, which
is not metadata and is not natural language parsing. It is, quite
simply, to define a strict date format that IS human readable, which
can optionally be used in place of ISO 8601 in the title attribute of
an ABBR tag.  You can keep the perceived benefits of ISO 8601 for
international users, because the current pattern will continue to
work. However, for users in languages with a well defined date format,
a screen reader will not trip up on the date.

Whenever I mention this though, everyone seems to think I'm advocating
natural language processing. Let me just say again that this is not
the case.
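
To make the distinction concrete, here is a minimal sketch of what a
strict format could look like on the parser side. The pattern used here
("July 1, 2007": full English month name, day without ordinal suffix,
comma, four-digit year) and the function name are assumptions picked only
for illustration; no such format has been agreed anywhere. The point is
that exactly one spelling is accepted and everything else is rejected,
which is what separates it from natural language parsing.

```python
import re
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June", "July",
    "August", "September", "October", "November", "December",
]

# Hypothetical strict format: "<Month> <day>, <yyyy>" and nothing else.
STRICT_DATE = re.compile(
    r"(" + "|".join(MONTHS) + r") ([1-9]|[12][0-9]|3[01]), ([0-9]{4})"
)


def strict_to_iso(text):
    """Return the ISO 8601 date for a strictly formatted date, else None."""
    match = STRICT_DATE.fullmatch(text)
    if match is None:
        return None
    month = MONTHS.index(match.group(1)) + 1
    try:
        return date(int(match.group(3)), month, int(match.group(2))).isoformat()
    except ValueError:
        # Well-formed but impossible, e.g. "February 30, 2008".
        return None


print(strict_to_iso("July 1, 2007"))
print(strict_to_iso("July 1st, 2007"))
```

A parser built this way would fall back to the existing ISO 8601 handling
whenever the strict pattern does not match, so current markup would keep
working.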

I'm highly suspicious of the counterargument that such a solution
would need to support every language that ISO 8601 supports. This
argument does not make sense to me for two reasons. The first: ISO
8601 doesn't support ANY language. It is only one date format among
many, based on an anglicised calendar, with the only multilingual
benefit owing to the fact that it happens to be an international
standard. To someone with a different calendar, ISO8601 may make just
as much sense as "July 1st, 2007." that is: very little.

I like ISO 8601, but placing it in the title attribute of the ABBR has
clearly been a failure. If not a practical failure, it has been a
failure for the public image of microformats, and it has ultimately
shown the failure of the microformats community structure to
deal with an issue such as this effectively.

The other reason I'm suspicious of this argument is that such a format
would practically only need to support as many languages as there are
screen readers. Unless a screen reader supports ISO 8601 in a title
attribute specifically, it's going to read out gibberish, and if it
encounters a date written in the wrong language it will read out
gibberish. No difference. However, if, in what I believe is the 80% case,
it reads out a date written in the correct language, then we've just
improved the experience for more people than we were able to
satisfactorily publish to before. What's the counterargument to that?


Another solution is to lobby the screen reader vendors to add explicit
support for ISO 8601 dates. It's a popular pattern for markup, and
adding support for reading them more humanely would provide a clear
benefit for their customers. I personally feel that this solution
would see more success than trying to wrangle the whole of the
microformats community into agreement on this issue.
From nirmal at gatech.edu  Mon Jul 14 06:01:24 2008
From: nirmal at gatech.edu (Nirmal Patel)
Date: Mon Jul 14 06:01:27 2008
Subject: [uf-discuss] Generically converting JSON to POSH
In-Reply-To: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
References: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
Message-ID: <a450e4a80807140601w619406e6tc4843c873e7af406@mail.gmail.com>

On Mon, Jul 14, 2008 at 4:52 AM, Toby A Inkster <mail@tobyinkster.co.uk> wrote:
<snip>
> Looks quite good. Personally I'd have converted JSON objects into definition
> lists using DT for the key and DD for the value though. Then perhaps add the
> key to the class of each *as well*. So, for instance:
>
> {
>  "key1" : "value1" ,
>  "key2" : "value2"
> }
>
> becomes:
>
> <dl class="json-object">
>  <dt class="key1">key1</dt>
>  <dd class="key1">value1</dd>
>  <dt class="key2">key2</dt>
>  <dd class="key2">value2</dd>
> </dl>

I had initially done this but came to the conclusion that this put key1 into
the html as content and that seemed wrong.

>
> Which should be more flexible with regards to styling, and probably more
> useful when looked at without CSS. If you happen to like the unordered list
> look, then:

I never intended to use the results of this library for exploring the JS. I
suppose that this should be made more explicit.

>
> dl.json-object dt { display: none; }
> dl.json-object dd { display: list-item; list-style-type: circle; }

And this was exactly how I styled the result. :)

>
> However, have you considered what happens when the object keys contain
> whitespace characters? e.g.
>
> {
>  "some key" : "some value"
> }
>
> --
> Toby A Inkster
> <mailto:mail@tobyinkster.co.uk>
> <http://tobyinkster.co.uk>



-- 
Nirmal Patel
www.nirmalpatel.com
From kevinmarks at gmail.com  Mon Jul 14 07:24:33 2008
From: kevinmarks at gmail.com (Kevin Marks)
Date: Mon Jul 14 07:24:36 2008
Subject: [uf-discuss] Generically converting JSON to POSH
In-Reply-To: <73766b160807140723g5c7bcd5drdf2cd55871ce85b2@mail.gmail.com>
References: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
	<a450e4a80807140601w619406e6tc4843c873e7af406@mail.gmail.com>
	<73766b160807140723g5c7bcd5drdf2cd55871ce85b2@mail.gmail.com>
Message-ID: <73766b160807140724v7da199b6i30bfdc34b45c48f5@mail.gmail.com>

XOXO is the generic way to turn JSON into HTML (and back) -  see

http://www.mail-archive.com/microformats-discuss@microformats.org/msg06827.html

The problem is knowing what are user-visible keys and what aren't

From nirmal at gatech.edu  Mon Jul 14 07:31:39 2008
From: nirmal at gatech.edu (Nirmal Patel)
Date: Mon Jul 14 07:31:41 2008
Subject: [uf-discuss] Generically converting JSON to POSH
In-Reply-To: <73766b160807140724v7da199b6i30bfdc34b45c48f5@mail.gmail.com>
References: <D1E97365-7972-4AEB-A295-E0ECDED416CB@tobyinkster.co.uk>
	<a450e4a80807140601w619406e6tc4843c873e7af406@mail.gmail.com>
	<73766b160807140723g5c7bcd5drdf2cd55871ce85b2@mail.gmail.com>
	<73766b160807140724v7da199b6i30bfdc34b45c48f5@mail.gmail.com>
Message-ID: <a450e4a80807140731h6d7954fcsfe5618f76540f276@mail.gmail.com>

On Mon, Jul 14, 2008 at 10:24 AM, Kevin Marks <kevinmarks@gmail.com> wrote:
> XOXO is the generic way to turn JSON into HTML (and back) -  see
>
> http://www.mail-archive.com/microformats-discuss@microformats.org/msg06827.html
>
> The problem is knowing what are user-visible keys and what aren't

My current solution is about quickly styling all of the content, so
you can explore your styling options. Maybe for deployment it should be
possible to pass an XPath-style filter to limit which nodes are converted.

I like your server side script (Go Python!) but I wanted to stick to JS because
I wanted something to use with results of JS Badge calls where JSON returns
are the norm.




-- 
Nirmal Patel
www.nirmalpatel.com
From mdagn at spraci.com  Mon Jul 14 07:36:07 2008
From: mdagn at spraci.com (Michael)
Date: Mon Jul 14 07:36:12 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
References: <C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
Message-ID: <487B6457.1020609@spraci.com>


>
> Seems to me there are 2 solutions:
>
> 1. relax the data hiding constraint (tricky because it's fundamental to the
> uf design philosophy and its relaxation has been rejected many times)
>
> 2. maintain the status quo. Keep the abbreviation design pattern for machine
> friendly data and leave it up to publishers to decide if this is an issue
> for them - or not. It would probably need the microformats community to
> promote the design philosophy and potential issues a little higher than at
> present. But the wiki already documents much of this - just a bit more
> prominent linking and  some padding out of /about to be a little more
> neutral.
>   

Actually, the suggestion of splitting the datetime into date, time and
timezone, marked up in separate elements, seems to me like a good compromise.

yyyy-mm-dd would certainly not be as scary for humans as a full datetime
with timezone, and it would avoid needing to hide data and be much easier
to do than trying to cope with lots of different date formats or trying
to do NLP.

In fact, it might even help a human in cases where the "human date" is
ambiguous!
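
On the parser side, reassembling the pieces is cheap. A minimal sketch,
assuming the three values have already been extracted from their separate
elements as plain strings, that the time is hh:mm or hh:mm:ss, and that
the timezone is a numeric offset such as +01:00 (the function name and the
19:30 example time are made up for illustration):

```python
from datetime import date, datetime, time


def combine_parts(date_text, time_text, tz_text=None):
    """Join separately marked-up date, time and UTC offset into ISO 8601."""
    day = date.fromisoformat(date_text)
    # time.fromisoformat accepts an optional trailing +hh:mm / -hh:mm offset.
    clock = time.fromisoformat(time_text + (tz_text or ""))
    return datetime.combine(day, clock).isoformat()


print(combine_parts("2008-07-14", "19:30", "+01:00"))
print(combine_parts("2008-07-14", "19:30"))
```

Malformed pieces raise ValueError here rather than being guessed at, which
keeps the behaviour predictable for publishers.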






From Scott at randomchaos.com  Mon Jul 14 10:05:28 2008
From: Scott at randomchaos.com (Scott Reynen)
Date: Mon Jul 14 10:27:11 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
Message-ID: <1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>

On Jul 14, at 6:39, Breton Slivka wrote:

> To someone with a different calendar, ISO8601 may make just
> as much sense as "July 1st, 2007." that is: very little.


I'm assuming by "different calendar," you mean non-Gregorian?  If so,
what are the use cases for non-Gregorian dates in hCalendar?  The use
cases for Gregorian calendars in hCalendar are well-established:
iCalendar uses Gregorian ISO 8601, so any iCalendar-supporting
applications can make use of these dates.  But I have no idea what the
use cases are for non-Gregorian dates.  Are there many applications
that can use such dates?  The use cases are crucial for evaluating
whether hCalendar should support non-Gregorian dates, and if so, how
that should work.

Peace,
Scott
From mail at tobyinkster.co.uk  Mon Jul 14 13:54:57 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Mon Jul 14 13:55:21 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <FA54BB6C-2AD6-49E8-93C0-2792E407B17F@tobyinkster.co.uk>

Scott Reynen wrote:

> I'm assuming by "different calendar," you mean non-Gregorian?  If so,
> what are the use cases for non-Gregorian dates in hCalendar?

It's not so much the case of wanting to encode non-Gregorian dates in  
hCalendar, but wanting to include non-Gregorian dates on the web page.

   <abbr class="dtstart" title="2008-07-14">11 Rajab 1429</abbr>

Is '2008-07-14' to be considered an appropriate expansion of the  
"abbreviation" '11 Rajab 1429'?

In case anyone is wondering whether non-Gregorian calendars are used  
in practice, the Islamic calendar (used in the example above) is the  
official calendar for Saudi Arabia, and used in religious contexts in  
many other countries; the Julian calendar is still used in religious  
contexts by Orthodox Christian churches, and frequently used by  
historians to refer to many older dates; the Chinese calendar is used  
for various religious and cultural reasons not just in China, but in  
some other Asian countries, but not for any official purposes.

I would cite specific pages that use these calendars, but I don't  
speak Arabic, Russian or Mandarin, so don't know the correct terms to  
Google for.

So there will be cases where people want to publish non-Gregorian  
dates, but for interoperability with iCalendar, they'll need to  
include a machine-readable Gregorian equivalent date. This is an  
example of where you're going to have very significant differences  
between the human and machine-readable representations of the same  
dates.

(It's also interesting to note that automatic translation from the  
Islamic calendar to Gregorian is impossible to perform reliably, as  
it is based on human observation of the movements of the sun and  
moon, not on the actual -- predictable -- movements of the sun and  
the moon. Thus the exact numbering of dates is not usually known very  
far in advance.)

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>



From zen at zenpsycho.com  Mon Jul 14 15:06:53 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Mon Jul 14 15:06:57 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <487B6457.1020609@spraci.com>
References: <C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<487B6457.1020609@spraci.com>
Message-ID: <ae2b2ba80807141506t5dc49a1fq834613647acd0eab@mail.gmail.com>

On Tue, Jul 15, 2008 at 12:36 AM, Michael <mdagn@spraci.com> wrote:
>
>>
>> Seems to me there are 2 solutions:
>>
>> 1. relax the data hiding constraint (tricky because it's fundamental to
>> the
>> uf design philosophy and its relaxation has been rejected many times)
>>
>> 2. maintain the status quo. Keep the abbreviation design pattern for
>> machine
>> friendly data and leave it up to publishers to decide if this is an issue
>> for them - or not. It would probably need the microformats community to
>> promote the design philosophy and potential issues a little higher than at
>> present. But the wiki already documents much of this - just a bit more
>> prominent linking and  some padding out of /about to be a little more
>> neutral.
>>
>
> actually the suggestion of splitting the datetime into date, time and
> timezone marked up in separate elements seems to me like a good compromise.
>
> yyyy-mm-dd would certainly not be as scary for humans as a full datetime
> with timezone
> and it would avoid needing to hide data and be much easier to do than trying
> to cope with lots of different date formats or trying to do NLP.
>
> In fact it might even help a human in cases where the "human date" is
> ambiguous!
>

It might be a good compromise, but does it actually solve the problem?
If they're all in a row, aren't we right back where we started? The
screen reader would read them all in order, would it not? Or would it
add extra pauses by virtue of them being in separate elements, or
having spaces between them?
From john at westciv.com  Mon Jul 14 16:57:07 2008
From: john at westciv.com (John Allsopp)
Date: Mon Jul 14 16:57:16 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
Message-ID: <0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>

Scott,

> But I have no idea what the use cases are for non-Gregorian dates.
> Are there many applications that can use such dates? The use cases
> are crucial for evaluating whether hCalendar should support
> non-Gregorian dates, and if so, how that should work.


I recently learnt that in Japan there are two year numbering systems.
The western style one is more common, but it is far from uncommon to use
the traditional Japanese year numbering system as well.

john

John Allsopp

style master :: css editor :: http://westciv.com/style_master
about me :: http://johnfallsopp.com
Web Directions Conferences :: http://webdirections.org
My Microformats book :: http://microformatique.com/book


From Charles.Belov at sfmta.com  Mon Jul 14 17:25:50 2008
From: Charles.Belov at sfmta.com (Belov, Charles)
Date: Mon Jul 14 17:25:54 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <200807142357.m6ENvhng028068@microformats.org>
References: <200807142357.m6ENvhng028068@microformats.org>
Message-ID: <E17F75B6E86AE842A57B4534F82D0376201199@MTAMAIL.muni.sfgov.org>

See below.  

Hope this helps,
Charles Belov 
SFMTA Webmaster
> 
> ------------------------------
> 
> Message: 4
> Date: Tue, 15 Jul 2008 00:36:07 +1000
> From: Michael <mdagn@spraci.com>
> Subject: Re: [uf-discuss] Human and machine readable data format
> To: Microformats Discuss <microformats-discuss@microformats.org>
> Message-ID: <487B6457.1020609@spraci.com>
> Content-Type: text/plain; charset=US-ASCII; format=flowed
> 
> 
>>   
> 
> actually the suggestion of splitting the datetime into date, time and
> timezone marked up in separate elements seems to me like a good
> compromise.
>
> yyyy-mm-dd would certainly not be as scary for humans as a full
> datetime with timezone

It would still not be pleasant.

Month, day, year, hour, minute, second, time zone, and optional am/pm
could all be split up, removing ambiguity.

> ------------------------------
> 
> Message: 6
> Date: Mon, 14 Jul 2008 21:54:57 +0100
> From: Toby A Inkster <mail@tobyinkster.co.uk>
> Subject: [uf-discuss] Human and machine readable data format
> To: microformats-discuss@microformats.org
> Message-ID: <FA54BB6C-2AD6-49E8-93C0-2792E407B17F@tobyinkster.co.uk>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
> 
> Scott Reynen wrote:
> 
>> I'm assuming by "different calendar," you mean non-Gregorian?  If so,
>> what are the use cases for non-Gregorian dates in hCalendar?
> 
> It's not so much the case of wanting to encode non-Gregorian dates in
> hCalendar, but wanting to include non-Gregorian dates on the web page.
> 
>    <abbr class="dtstart" title="2008-07-14">11 Rajab 1429</abbr>
> 
> Is '2008-07-14' to be considered an appropriate expansion of the  
> "abbreviation" '11 Rajab 1429'?
> 
> In case anyone is wondering whether non-Gregorian calendars are used
> in practice, the Islamic calendar (used in the example above) is the
> official calendar for Saudi Arabia, and used in religious contexts in
> many other countries; the Julian calendar is still used in religious
> contexts by Orthodox Christian churches, and frequently used by
> historians to refer to many older dates; the Chinese calendar is used
> for various religious and cultural reasons not just in China, but in
> some other Asian countries, but not for any official purposes.
>
> I would cite specific pages that use these calendars, but I don't
> speak Arabic, Russian or Mandarin, so don't know the correct terms to
> Google for.
 
> So there will be cases where people want to publish non-Gregorian  
> dates, but for interoperability with iCalendar, they'll need to  
> include a machine-readable Gregorian equivalent date. This is an  
> example of where you're going to have very significant differences  
> between the human and machine-readable representations of the same  
> dates.

Well, gee, if an Arabic screen reading program read out a Gregorian date
where the author was expecting an Arabic date to be read, that could be
pretty insulting.
 
In any case, you seem to be assuming a human entering a non-Gregorian
date (or, for that matter, a Gregorian date) can accurately transform
the human-readable date into a machine-readable date.  I can tell you
right now that I personally am 24-hour-clock challenged.  I usually
get the 12-to-24 hour conversion right, or vice versa, but now and
then...

And I wouldn't want a screen reader to read the time to me using the
24-hour clock on a U.S. website.

I believe machines can do this translation more reliably than humans,
provided they are asked to do so.
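
A minimal sketch of that kind of machine translation, assuming the author
supplies the hour, minute and am/pm marker separately. The name to24Hour is
hypothetical and not taken from any parser.

```javascript
// Hypothetical helper: turn a 12-hour clock reading into an hh:mm:ss string.
function to24Hour(hour12, minute, meridiem) {
  if (hour12 < 1 || hour12 > 12 || minute < 0 || minute > 59) {
    throw new RangeError("not a valid 12-hour clock time");
  }
  var isPm = /^p/i.test(meridiem);
  // 12 am maps to hour 0 and 12 pm stays 12; other afternoon hours shift by 12.
  var hour24 = (hour12 % 12) + (isPm ? 12 : 0);
  function pad(n) {
    return (n < 10 ? "0" : "") + n;
  }
  return pad(hour24) + ":" + pad(minute) + ":00";
}
```

The two cases people most often get wrong by hand, 12 am and 12 pm, are
handled by the modulo step rather than by special-casing.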

In any case, there could be a parameter for alternate calendar.

> (It's also interesting to note that automatic translation from the
> Islamic calendar to Gregorian is impossible to perform reliably, as
> it is based on human observation of the movements of the sun and
> moon, not on the actual -- predictable -- movements of the sun and
> the moon. Thus the exact numbering of dates is not usually known very
> far in advance.)
> 

Then it seems there would be no way to provide a reliable ISO date for
non-impending events; therefore, requiring ISO for the hCalendar record
would prevent use of hCalendar for that event.  (For that matter, you
would need latitude and longitude to eventually resolve the date and
time.)


From scott at randomchaos.com  Mon Jul 14 19:16:32 2008
From: scott at randomchaos.com (Scott Reynen)
Date: Mon Jul 14 19:16:42 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
Message-ID: <18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>

On Jul 14, at 5:57, John Allsopp wrote:

> I recently learnt that in Japan there are two year numbering  
> systems. The western style one is more common, but it is far from  
> uncommon to use the traditional Japanese year numbering system as  
> well.


Do you have any examples of the non-Gregorian dates being published  
online?  Or any examples of applications that can take non-Gregorian  
dates as input?

I think we've established non-Gregorian calendars exist, but most  
countries officially adopted the Gregorian calendar several decades  
before the web existed (e.g. Japan in 1873).  Such adoption wasn't  
exclusive, but it draws into question (for me anyway) whether such  
calendars are common enough on the web and have enough potential use  
cases to warrant modeling in microformats.  I realize it's difficult  
to do such research without belonging to the cultures in which it  
would appear.  Unfortunately that just makes it more necessary to  
avoid mistakes.

Peace,
Scott

From karl at w3.org  Mon Jul 14 19:40:36 2008
From: karl at w3.org (Karl Dubost)
Date: Mon Jul 14 19:40:45 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
Message-ID: <D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>


On 15 Jul 2008, at 11:16, Scott Reynen wrote:
> Do you have any examples of the non-Gregorian dates being published  
> online?  Or any examples of applications that can take non-Gregorian  
> dates as input?

For those who need to understand.
http://en.wikipedia.org/wiki/Japanese_era_name

The era system is very common on paper forms, and on labels in
supermarkets at least (for those I have noticed in my daily life in
Japan). In fact it is a mix; it is not regular. Some forms even have
the possibility to deal with the two systems.

It is mostly used by official organizations like governments.

For example this article in one of the main national newspapers: Yomiuri

???20???????????????????? 
??????
http://home.yomiuri.co.jp/wnews/20080711hg03.htm

平成20年 - this is the year 20 of the Heisei era.
The sentence says the project started at this date. You will notice
that the article also has dates in the Gregorian calendar, so it mixes both.
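
For readers unfamiliar with era years, the arithmetic is simple enough to
sketch. The table and the function name eraYearToGregorian below are
illustrative only; the sketch also ignores the fact that an era starts
partway through a Gregorian year, so the first and last year of an era can
overlap with the neighbouring era.

```javascript
// Illustrative table: Gregorian year in which each modern era began.
var ERA_START = { meiji: 1868, taisho: 1912, showa: 1926, heisei: 1989 };

// Hypothetical helper: convert an era name and era year to a Gregorian year.
function eraYearToGregorian(era, eraYear) {
  var key = String(era).toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(ERA_START, key) || eraYear < 1) {
    throw new RangeError("unknown era or year: " + era + " " + eraYear);
  }
  // Year 1 of an era is the year the era began, hence the minus one.
  return ERA_START[key] + eraYear - 1;
}
```

With this table, Heisei 20 works out to 1989 + 20 - 1 = 2008, which matches
the date of the article above.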



-- 
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool







From karl at w3.org  Mon Jul 14 19:53:13 2008
From: karl at w3.org (Karl Dubost)
Date: Mon Jul 14 19:53:17 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
	<D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>
Message-ID: <B4772AD8-9926-4C10-8060-C4F97A01EDF1@w3.org>


Another example of a form with Japanese Era Calendar
http://urakoma.com/bbs.html

following the character "?" there is a drop down menu where you can  
choose an era or the Gregorian calendar.

<option value="1">??</option>
<option value="2">??</option>
<option value="3" selected>??</option>
<option value="4">??</option>
<option value="5">??19</option>
<option value="6">??20</option>



-- 
Karl Dubost - W3C
http://www.w3.org/QA/
Be Strict To Be Cool







From zen at zenpsycho.com  Mon Jul 14 19:53:38 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Mon Jul 14 19:53:40 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
Message-ID: <ae2b2ba80807141953j1bd9d54ara644e221e256e0ac@mail.gmail.com>

> Do you have any examples of the non-Gregorian dates being published online?
>  Or any examples of applications that can take non-Gregorian dates as input?
>
> I think we've established non-Gregorian calendars exist, but most countries
> officially adopted the Gregorian calendar several decades before the web
> existed (e.g. Japan in 1873).  Such adoption wasn't exclusive, but it draws
> into question (for me anyway) whether such calendars are common enough on
> the web and have enough potential use cases to warrant modeling in
> microformats.  I realize it's difficult to do such research without
> belonging to the cultures in which it would appear.  Unfortunately that just
> makes it more necessary to avoid mistakes.
>
> Peace,
> Scott
>




Just to clarify, the original point I was trying to make wasn't that
we should model every possible language/calendar in the world. Just
that it was unreasonable to expect that from a potential replacement
for ISO 8601, since ISO 8601 itself does not meet that requirement.
This was in response to "David O" who wrote:


>Feel free to get started.  I'm sure you can start a wiki page with a
>listing of language/region codes and the suggested date format for
>each.  Since the current system handles every one of those languages
>and countries/regions, it would only be logical to expect the same of
>a suggested replacement.

I hope I have convinced a few people that David O's logic falls down
at the premise. But this is not to argue that we should make a
replacement format that handles that use case, but rather to consider
replacements that don't, since such a replacement would be no worse
than the current format, but *would* provide benefits that ISO8601
does not.
From zen at zenpsycho.com  Mon Jul 14 19:56:19 2008
From: zen at zenpsycho.com (Breton Slivka)
Date: Mon Jul 14 19:56:21 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807141953j1bd9d54ara644e221e256e0ac@mail.gmail.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
	<ae2b2ba80807141953j1bd9d54ara644e221e256e0ac@mail.gmail.com>
Message-ID: <ae2b2ba80807141956s713f199ftfa86223be3cf8499@mail.gmail.com>

On Tue, Jul 15, 2008 at 12:53 PM, Breton Slivka <zen@zenpsycho.com> wrote:
>> Do you have any examples of the non-Gregorian dates being published online?
>>  Or any examples of applications that can take non-Gregorian dates as input?
>>
>> I think we've established non-Gregorian calendars exist, but most countries
>> officially adopted the Gregorian calendar several decades before the web
>> existed (e.g. Japan in 1873).  Such adoption wasn't exclusive, but it draws
>> into question (for me anyway) whether such calendars are common enough on
>> the web and have enough potential use cases to warrant modeling in
>> microformats.  I realize it's difficult to do such research without
>> belonging to the cultures in which it would appear.  Unfortunately that just
>> makes it more necessary to avoid mistakes.
>>
>> Peace,
>> Scott
>>
>
>
>
>
> Just to clarify, the original point I was trying to make wasn't that
> we should model every possible language/calendar in the world. Just
> that it was unreasonable to expect that from a potential replacement
> for ISO 8601, since ISO 8601 itself does not meet that requirement.
> This was in response to "David O" who wrote:
>
>
>>Feel free to get started.  I'm sure you can start a wiki page with a
>>listing of language/region codes and the suggested date format for
>>each.  Since the current system handles every one of those languages
>>and countries/regions, it would only be logical to expect the same of
>>a suggested replacement.
>
> I hope I have convinced a few people that David O's logic falls down
> at the premise. But this is not to argue that we should make a
> replacement format that handles that usecase, but rather to consider
> replacements that don't, since such a replacement would be no worse
> than the current format, but *would* provide benefits that ISO8601
> does not.
>

And just for the record, I would happily construct such a wiki page,
but I am overcommitted as it is! Perhaps in time, once some things are
calmed down.
From bjonkman at sobac.com  Mon Jul 14 20:51:26 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Mon Jul 14 21:22:23 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <FA54BB6C-2AD6-49E8-93C0-2792E407B17F@tobyinkster.co.uk>
References: <FA54BB6C-2AD6-49E8-93C0-2792E407B17F@tobyinkster.co.uk>
Message-ID: <487BE67E.5085.52C75A7@bjonkman.sobac.com>

On 14 Jul 2008 at 21:54, Toby A Inkster wrote:

> So there will be cases where people want to publish non-Gregorian 
> dates, but for interoperability with iCalendar, they'll need to 
> include a machine-readable Gregorian equivalent date. 

Actually, not necessary.  The iCalendar spec [1] contains a property 
CALSCALE that can be used to specify the "scale" of the calendar.  
I'm not sure if any CALSCALE property values other than "GREGORIAN" 
are defined, but that's the way to use alternate calendars.
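
To show where CALSCALE sits in the output, here is a rough sketch that
serializes one all-day event as iCalendar text. The function name
buildCalendar, its arguments and the PRODID string are all made up for
illustration, and the sketch does no escaping of commas or semicolons in
the summary, so it is not a complete serializer.

```javascript
// Hypothetical helper: emit a minimal iCalendar object with an explicit
// CALSCALE property. Defaults to GREGORIAN when no scale is given.
function buildCalendar(summary, isoDate, calscale) {
  var lines = [
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//example//calscale sketch//EN",
    "CALSCALE:" + (calscale || "GREGORIAN"),
    "BEGIN:VEVENT",
    // iCalendar dates drop the dashes: 2008-07-14 becomes 20080714.
    "DTSTART;VALUE=DATE:" + isoDate.replace(/-/g, ""),
    "SUMMARY:" + summary,
    "END:VEVENT",
    "END:VCALENDAR"
  ];
  // Content lines in iCalendar are terminated by CRLF.
  return lines.join("\r\n") + "\r\n";
}
```

CALSCALE is a calendar-level property, so it appears once before the first
VEVENT rather than on each event.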


Right now CALSCALE is not in the hCalendar property list [2] but it 
could be...

--Bob.


[1] http://tools.ietf.org/html/rfc2445#section-4.7.1

[2] http://microformats.org/wiki/hcalendar#Property_List
-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
SOBAC Microcomputer Services              Voice: +1-519-669-0388
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From bjonkman at sobac.com  Mon Jul 14 20:39:57 2008
From: bjonkman at sobac.com (Bob Jonkman)
Date: Mon Jul 14 21:22:47 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>,
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>,
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
Message-ID: <487BE3CD.29299.521F3F9@bjonkman.sobac.com>

On 14 Jul 2008 at 22:39, Breton Slivka wrote:

> There is another solution that I have been trying to advocate, which
> is not metadata, and it's not natural language parsing. It is quite
> simply, to define a strict date format that IS human readable, 

But there already IS a strict date format, and it IS human readable
without any language barriers.  It's the ISO8601 date format,
YYYY-MM-DD.  Same for the time format, hh:mm:ss

No, it's not the prettiest format, but it exists now and is
universally accessible.  A conversation on IRC some days ago
suggested that just the date or time by itself poses no access
barrier to speech readers (which is why the idea of splitting date
from time is so attractive).


> which can optionally be used in place of ISO 8601 in the title
> attribute of an ABBR tag. 

And that's the OTHER problem people are objecting to, the (alleged) 
mis-use of the <abbr> tag.


> Unless a screen reader supports iso8601 in a title
> attribute specifically, 

...and I believe that Jaws does, as long as the date is separated 
with dashes, eg. 2008-07-14  and the time is separated with colons, 
eg. 22:30:00
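
A small sketch of checking for exactly those two shapes, dashes for the
date and colons for the time. The names isIsoDate and isIsoTime are
hypothetical, and the checks cover only the extended YYYY-MM-DD and
hh:mm:ss forms, not the whole of ISO 8601 (no leap seconds, no 24:00:00,
and years below 0100 are rejected because of how Date.UTC treats them).

```javascript
var ISO_DATE = /^(\d{4})-(\d{2})-(\d{2})$/;
var ISO_TIME = /^([01]\d|2[0-3]):([0-5]\d):([0-5]\d)$/;

// True for a real calendar date written as YYYY-MM-DD.
function isIsoDate(text) {
  var m = ISO_DATE.exec(text);
  if (!m) {
    return false;
  }
  var year = +m[1], month = +m[2], day = +m[3];
  // Date.UTC rolls impossible dates over (Feb 30 becomes Mar 1),
  // so a round trip that changes any field means the input was invalid.
  var d = new Date(Date.UTC(year, month - 1, day));
  return d.getUTCFullYear() === year &&
         d.getUTCMonth() === month - 1 &&
         d.getUTCDate() === day;
}

// True for a time written as hh:mm:ss on the 24-hour clock.
function isIsoTime(text) {
  return ISO_TIME.test(text);
}
```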


So, it's for these reasons that I am not in favour of any prosaic 
date formats.  Besides, the microformats community doesn't exist to 
create date format standards -- it adopts existing standards to make 
existing content more accessible (both for people and programs).

--Bob.


-- -- -- --
Bob Jonkman <bjonkman@sobac.com>         http://sobac.com/sobac/
SOBAC Microcomputer Services              Voice: +1-519-669-0388
6 James Street, Elmira ON  Canada  N3B 1L5  Cel: +1-519-635-9413
Software   ---   Office & Business Automation   ---   Consulting


From mail at ciaranmcnulty.com  Tue Jul 15 04:51:41 2008
From: mail at ciaranmcnulty.com (Ciaran McNulty)
Date: Tue Jul 15 04:51:56 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
	<D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>
Message-ID: <cdc278e10807150451v37a23203k96556e197065a427@mail.gmail.com>

Another example of non-Gregorian calendaring is Saudi Arabia, where
the Arabic calendar is in common usage:

http://www.sama.gov.sa/

(actually clicking the 'english' tab on that page shows the Gregorian dates)

-Ciaran McNulty

On Tue, Jul 15, 2008 at 3:40 AM, Karl Dubost <karl@w3.org> wrote:
>
> On 15 Jul 2008, at 11:16, Scott Reynen wrote:
>>
>> Do you have any examples of the non-Gregorian dates being published
>> online?  Or any examples of applications that can take non-Gregorian dates
>> as input?
>
> For those who need to understand.
> http://en.wikipedia.org/wiki/Japanese_era_name
>
> The era system is very common on paper forms, and on labels in supermarkets at
> least (for those I have noticed in my daily life in Japan). In fact it is a
> mix; it is not regular. Some forms even have the possibility to deal with
> the two systems.
>
> It is mostly used by official organizations like governments.
>
> For example this article in one of the main national newspapers: Yomiuri
>
> ???20??????????????????????????
> http://home.yomiuri.co.jp/wnews/20080711hg03.htm
>
> 平成20年 - this is the year 20 of the Heisei era.
> The sentence says the project started at this date. You will notice that the
> article also has dates in the Gregorian calendar, so it mixes both.
>
>
>
> --
> Karl Dubost - W3C
> http://www.w3.org/QA/
> Be Strict To Be Cool

From mail at tobyinkster.co.uk  Tue Jul 15 06:48:17 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Tue Jul 15 06:48:46 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <389E3548-5B07-474C-9F0D-1601A787E00F@tobyinkster.co.uk>

Bob Jonkman wrote:

> On 14 Jul 2008 at 21:54, Toby A Inkster wrote:
>
> > So there will be cases where people want to publish non-Gregorian
> > dates, but for interoperability with iCalendar, they'll need to
> > include a machine-readable Gregorian equivalent date.
>
> Actually, not necessary.  The iCalendar spec [1] contains a property
> CALSCALE that can be used to specify the "scale" of the calendar.
> I'm not sure if any CALSCALE property values other than "GREGORIAN"
> are defined, but that's the way to use alternate calendars.

For practical purposes it is necessary to include a machine-readable  
Gregorian date. Although the CALSCALE property does exist, the only  
valid value defined for it is "GREGORIAN".

> Right now CALSCALE is not in the hCalendar property list [2] but it
> could be...

Cognition <http://buzzword.org.uk/cognition/> will actually parse  
class="calscale" found within an element with class="vcalendar" (but,  
not within class="vevent"), and will include it in RDF and iCalendar  
output.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>



From scott at randomchaos.com  Tue Jul 15 07:19:38 2008
From: scott at randomchaos.com (Scott Reynen)
Date: Tue Jul 15 07:19:48 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <cdc278e10807150451v37a23203k96556e197065a427@mail.gmail.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
	<D22DDA9E-5628-48EF-87CE-45800C2C164E@w3.org>
	<cdc278e10807150451v37a23203k96556e197065a427@mail.gmail.com>
Message-ID: <E41DED03-76CC-40FF-BAB7-BB5DFAE864BA@randomchaos.com>

On Jul 15, at 5:51, Ciaran McNulty wrote:

> Another example of non-Gregorian calendaring is Saudi Arabia, where
> the arabic calendar is in common usage:
>
> http://www.sama.gov.sa/

Thanks Karl and Ciaran.  I've added these examples to the wiki here:

http://microformats.org/wiki/hcalendar-brainstorming#Non-Gregorian_Calendars

Please add any more examples you find so we can keep the discussion  
focused on what would help publishers.

Peace,
Scott

From jim at eatyourgreens.org.uk  Wed Jul 16 03:25:31 2008
From: jim at eatyourgreens.org.uk (jim@eatyourgreens.org.uk)
Date: Wed Jul 16 03:25:36 2008
Subject: [uf-discuss] Human and machine readable data format
Message-ID: <380-220087316102531346@M2W042.mail2web.com>

Hello,

The English calendar prior to 1752 was a Julian calendar with the start of
the year on 25th March. Samuel Pepys's diary is an example of publishing
that calendar online (I think):
http://www.pepysdiary.com/

I imagine any historical date prior to the 20th Century is potentially a
problem, as the Julian calendar was still in use as late as the 1920s in
some parts of the world.
http://en.wikipedia.org/wiki/Gregorian_calendar has handy timelines and
tables showing when the Gregorian calendar was adopted around the world,
and when 1st January was adopted as the beginning of the year in various
countries.

TEI has some guidance on marking up historical dates, which might be
relevant.
http://www.tei-c.org/Guidelines/P4/html/CO.html#CONADA
I don't know if there are any other online guides to encoding dates and
times from different calendars.

Jim

Original Message:
-----------------
From: Scott Reynen scott@randomchaos.com
Date: Tue, 15 Jul 2008 08:19:38 -0600
To: microformats-discuss@microformats.org
Subject: Re: [uf-discuss] Human and machine readable data format


On Jul 15, at 5:51, Ciaran McNulty wrote:

> Another example of non-Gregorian calendaring is Saudi Arabia, where
> the arabic calendar is in common usage:
>
> http://www.sama.gov.sa/

Thanks Karl and Ciaran.  I've added these examples to the wiki here:

http://microformats.org/wiki/hcalendar-brainstorming#Non-Gregorian_Calendars

Please add any more examples you find so we can keep the discussion  
focused on what would help publishers.

Peace,
Scott




From lucapost at gmail.com  Wed Jul 16 03:28:37 2008
From: lucapost at gmail.com (LucaP)
Date: Wed Jul 16 03:28:42 2008
Subject: [uf-discuss] HTML 5 data- attributes
Message-ID: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>

I believe this new HTML feature found in the HTML 5 draft
specification should be taken into account in here, since it is
relevant to many ongoing discussions...

via John Resig (jQuery main developer)

A new feature being introduced in HTML 5 is the addition of custom
data attributes. This is a, seemingly, bizarre addition to the
specification - but actually provides a number of useful benefits.

Simply, the specification for custom data attributes states that any
attribute that starts with "data-" will be treated as a storage area
for private data (private in the sense that the end user can't see it
- it doesn't affect layout or presentation).

This allows you to write valid HTML markup (passing an HTML 5
validator) while, simultaneously, embedding data within your page. A
quick example:

<li class="user" data-name="John Resig" data-city="Boston"
     data-lang="js" data-food="Bacon">
  <b>John says:</b> <span>Hello, how are you?</span>
</li>

The above will be perfectly valid HTML 5. This should be a welcome
addition to nearly every JavaScript developer. The question of the
best means of attaching raw data to HTML elements - in a valid manner
- has been a long-lingering question. Frameworks have tried to deal
with this in different manners, two solutions being:

Using HTML, but with a custom DTD.
Using XHTML, with a specific namespace.

The addition of this prefix completely routes around both issues
(including any extra markup for validation or needing to be valid
XHTML) with this effective addition.

On top of this a simple JavaScript API is presented to access these
attribute values (in addition to the normal get/setAttribute):

var user = document.getElementsByTagName("li")[0];
var pos = 0, span = user.getElementsByTagName("span")[0];

var phrases = [
  {name: "city", prefix: "I am from "},
  {name: "food", prefix: "I like to eat "},
  {name: "lang", prefix: "I like to program in "}
];

user.addEventListener( "click", function(){
  // Wrap around so clicks after the last phrase start again from the first
  var phrase = phrases[ pos++ % phrases.length ];
  // Use the .dataset property
  span.innerHTML = phrase.prefix + user.dataset[ phrase.name ];
}, false);

The .dataset property behaves very similarly to the .attributes
property (but it only works as a map of key-value pairs). While no
browsers have implemented this exact DOM property, it's not hugely
needed - the above code could be done with the critical line replaced
with:

span.innerHTML = phrase.prefix +
  user.getAttribute("data-" + phrase.name );
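
The fallback just described can be wrapped in a small helper so calling
code does not care whether .dataset exists. The names toDatasetKey and
readData are hypothetical. The sketch assumes .dataset exposes hyphenated
names in camelCase (data-first-name as firstName), which is how the
property is specified in current HTML; the draft linked below may differ.

```javascript
// Convert an attribute suffix such as "first-name" to the camelCase key
// ("firstName") used by .dataset.
function toDatasetKey(name) {
  return name.replace(/-([a-z])/g, function (match, letter) {
    return letter.toUpperCase();
  });
}

// Hypothetical helper: read data-<name>, preferring .dataset when the
// browser provides it and falling back to getAttribute otherwise.
function readData(element, name) {
  var key = toDatasetKey(name);
  if (element.dataset && key in element.dataset) {
    return element.dataset[key];
  }
  return element.getAttribute("data-" + name);
}
```

In the click handler above, the critical line would then read
span.innerHTML = phrase.prefix + readData(user, phrase.name);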

I think what is most enticing about this whole specification is that
you don't have to wait for any browser to implement anything in order
to begin using it. By starting to use data- prefixes on your HTML
metadata today you'll be safe in knowing that it'll continue to work
well into the future. By the time the HTML 5 validator is
integrated into the full W3C validator, your site will already be
compliant (assuming, of course, you're already valid HTML 5 and using
the HTML 5 Doctype).

http://www.w3.org/html/wg/html5/#custom
From andr3.pt at gmail.com  Wed Jul 16 04:29:34 2008
From: andr3.pt at gmail.com (André Luís)
Date: Wed Jul 16 04:29:37 2008
Subject: [uf-discuss] HTML 5 data- attributes
In-Reply-To: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
References: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
Message-ID: <dc1a17860807160429m4faf1b82i24353263ad0b1ed5@mail.gmail.com>

I agree this is a nice solution to solve, for example, the
accessibility problems with the datetime pattern. But not for the
entire set of properties.. it "darkens" the data, makes the author
repeat information, etc...

For the abbr-based design patterns, I totally agree. For the rest, not so much.

A good compromise, IMHO, that has already been suggested here, would be
to port these attributes to classnames (data-*).

Custom DTDs for HTML, adding new namespaces to XHTML... I believe this
is a whole new path for microformats, and we need to assess whether
we actually _need_ to go down it.

--
André Luís

On Wed, Jul 16, 2008 at 11:28 AM, LucaP <lucapost@gmail.com> wrote:
> I believe this new HTML feature found in the HTML 5 draft
> specification should be taken into account in here, since it is
> relevant to many ongoing discussions...
>
> via John Resig (jQuery main developer)
>
> A new feature being introduced in HTML 5 is the addition of custom
> data attributes. This is a, seemingly, bizarre addition to the
> specification - but actually provides a number of useful benefits.
>
> Simply, the specification for custom data attributes states that any
> attribute that starts with "data-" will be treated as a storage area
> for private data (private in the sense that the end user can't see it
> - it doesn't affect layout or presentation).
>
> This allows you to write valid HTML markup (passing an HTML 5
> validator) while, simultaneously, embedding data within your page. A
> quick example:
>
> <li class="user" data-name="John Resig" data-city="Boston"
>     data-lang="js" data-food="Bacon">
>  <b>John says:</b> <span>Hello, how are you?</span>
> </li>
>
> The above will be perfectly valid HTML 5. This should be a welcome
> addition to nearly every JavaScript developer. The question of the
> best means of attaching raw data to HTML elements - in a valid manner
> - has been a long-lingering question. Frameworks have tried to deal
> with this in different manners, two solutions being:
>
> Using HTML, but with a custom DTD.
> Using XHTML, with a specific namespace.
>
> The addition of this prefix completely routes around both issues
> (including any extra markup for validation or needing to be valid
> XHTML) with this effective addition.
>
> On top of this a simple JavaScript API is presented to access these
> attribute values (in addition to the normal get/setAttribute):
>
> var user = document.getElementsByTagName("li")[0];
> var pos = 0, span = user.getElementsByTagName("span")[0];
>
> var phrases = [
>  {name: "city", prefix: "I am from "},
>  {name: "food", prefix: "I like to eat "},
>  {name: "lang", prefix: "I like to program in "}
> ];
>
> user.addEventListener( "click", function(){
>  var phrase = phrases[ pos++ ];
>  // Use the .dataset property
>  span.innerHTML = phrase.prefix + user.dataset[ phrase.name ];
> }, false);
>
> The .dataset property behaves very similarly to the .attributes
> property (but it only works as a map of key-value pairs). While no
> browsers have implemented this exact DOM property, it's not hugely
> needed - the above code could be done with the critical line replaced
> with:
>
> span.innerHTML = phrase.prefix +
>  user.getAttribute("data-" + phrase.name );
>
> I think what is most enticing about this whole specification is that
> you don't have to wait for any browser to implement anything in order
> to begin using it. By starting to use data- prefixes on your HTML
> metadata today you'll be safe in knowing that it'll continue to work
> well into the future. The time at which the HTML 5 validator is
> integrated into the full W3C validator your site will already be
> compliant (assuming, of course, you're already valid HTML 5 and using
> the HTML 5 Doctype).
>
> http://www.w3.org/html/wg/html5/#custom

From fberriman at gmail.com  Wed Jul 16 04:37:33 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Wed Jul 16 04:37:37 2008
Subject: [uf-discuss] HTML 5 data- attributes
In-Reply-To: <dc1a17860807160429m4faf1b82i24353263ad0b1ed5@mail.gmail.com>
References: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
	<dc1a17860807160429m4faf1b82i24353263ad0b1ed5@mail.gmail.com>
Message-ID: <e86992a40807160437l63dbf997tcd5ffb51bd1559f5@mail.gmail.com>

On 16/07/2008, André Luís <andr3.pt@gmail.com> wrote:
> I agree this is a nice solution to solve, for example, the
>  accessibility problems with the datetime pattern. But not for the
>  entire set of properties.. it "darkens" the data, makes the author
>  repeat information, etc...
>
>  For the abbr-based design patterns, I totally agree. For the rest, not so much.
>
>  A good compromise, IMHO, that has already been suggested here, would be
>  to port these attributes to classnames (data-*).
>
>  Custom DTDs for HTML, adding new namespaces to XHTML... I believe this
>  is a whole new path for microformats that needs to be assessed whether
>  we actually _need_ to go.

Well, as of *now* it's not about "need" so much as "want".  Custom
DTDs and new namespaces etc. just aren't in the realm of what
microformats are doing, or should be doing, at the moment.


-- 
Frances Berriman
http://fberriman.com

From brian.suda at gmail.com  Wed Jul 16 04:48:42 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Wed Jul 16 04:48:48 2008
Subject: [uf-discuss] HTML 5 data- attributes
In-Reply-To: <e86992a40807160437l63dbf997tcd5ffb51bd1559f5@mail.gmail.com>
References: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
	<dc1a17860807160429m4faf1b82i24353263ad0b1ed5@mail.gmail.com>
	<e86992a40807160437l63dbf997tcd5ffb51bd1559f5@mail.gmail.com>
Message-ID: <21e770780807160448s358dc177k9347aae8c2fa9426@mail.gmail.com>

2008/7/16 Frances Berriman <fberriman@gmail.com>:
> On 16/07/2008, André Luís <andr3.pt@gmail.com> wrote:
>>  Custom DTDs for HTML, adding new namespaces to XHTML... I believe this
>>  is a whole new path for microformats that needs to be assessed whether
>>  we actually _need_ to go.
>
> Well, as of *now* it's not about "need" so much as "want".  Custom
> DTDs and new namespaces etc. just aren't in the realm of what
> microformats are doing, or should be doing, at the moment.

--- the other reason microformats work so well is that they complement
already existing, widely deployed technologies. We should avoid
hanging any design patterns on a format that as yet has little or no
adoption.  We should focus on a solution that works in HTML 4.0 on the
billions of pages that are already published. If and when other
technologies emerge, we can solve problems then; there is no need to
spend the effort on early adopter edge-cases at the moment.

-brian

-- 
brian suda
http://suda.co.uk

From lists at ben-ward.co.uk  Wed Jul 16 07:20:54 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Wed Jul 16 07:21:02 2008
Subject: [uf-discuss] HTML 5 data- attributes
In-Reply-To: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
References: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
Message-ID: <1ADA97F4-76A9-43F3-8919-E6AE100C23E9@ben-ward.co.uk>

On 16 Jul 2008, at 11:28, LucaP wrote:

> I believe this new HTML feature found in the HTML 5 draft
> specification should be taken into account in here, since it is
> relevant to many ongoing discussions...

>
> http://www.w3.org/html/wg/html5/#custom

In addition to the existing replies, and the fact that microformats  
are currently designed to work in HTML4 and not unstable drafts,  
quoting the specification itself:

> User agents must not derive any implementation behavior from these  
> attributes or values. Specifications intended for user agents must  
> not define these attributes to have any meaningful values.

data- prefixed attributes are, by design and intention, for use by  
individual applications. They are explicitly excluded as a mechanism  
for microformats and the like.

B
From csarven at gmail.com  Wed Jul 16 07:40:38 2008
From: csarven at gmail.com (Sarven Capadisli)
Date: Wed Jul 16 07:40:40 2008
Subject: [uf-discuss] HTML 5 data- attributes
In-Reply-To: <1ADA97F4-76A9-43F3-8919-E6AE100C23E9@ben-ward.co.uk>
References: <f436e9f80807160328q58dfc4bcx97e2a061cfe0a881@mail.gmail.com>
	<1ADA97F4-76A9-43F3-8919-E6AE100C23E9@ben-ward.co.uk>
Message-ID: <d4154bcf0807160740m71737220h14858245a031dc1b@mail.gmail.com>

On Wed, Jul 16, 2008 at 10:20 AM, Ben Ward <lists@ben-ward.co.uk> wrote:
>> User agents must not derive any implementation behavior from these
>> attributes or values. Specifications intended for user agents must not
>> define these attributes to have any meaningful values.
>
> data- prefixed attributes are, by design and intention, for use by individual
> applications. They are explicitly excluded as a mechanism for microformats
> and the like.


Indeed. And:

"Embedding custom non-visible data" goes rather against marking "visible" data.

Brief #whatwg conversation: http://krijnhoetmer.nl/irc-logs/whatwg/20080520#l-70


-Sarven
From mail at tobyinkster.co.uk  Wed Jul 16 15:29:33 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Wed Jul 16 15:37:25 2008
Subject: [uf-discuss] Extending hCard with RDFa
Message-ID: <58D3F0DC-84A5-4CA0-A9A3-61566DB6C6BA@tobyinkster.co.uk>

I mentioned that I was working on an article about extending hCard  
with RDFa a few weeks ago on the Microformats discussion list, but  
then went on holiday and forgot about it for a while. Anyway...

http://tobyinkster.co.uk/blog/2008/07/16/hcard-rdfa/

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>



From lisagoodlin at gmail.com  Thu Jul 17 16:54:02 2008
From: lisagoodlin at gmail.com (Lisa Goodlin)
Date: Thu Jul 17 16:53:16 2008
Subject: [uf-discuss] Appropriate use of hcard
Message-ID: <C4A553DA.C906%lisagoodlin@gmail.com>

I'm a newbie to the list and a newbie to microformats. In the design for a
site I've put the client's name, address, phone numbers, and email in the
footer of every page. On the contact page I've placed the info more
prominently and put an "Add to Address Book" link.

http://mimmt.com/contact

Is it appropriate to place the hCard on every page in the footer, or should
I only do so once? If I should only offer the hCard download once, should I
still mark up the information using microformats?

The reason I ask is because at
http://microformats.org/wiki/hcard-examples-in-wild-with-problems
some of the problem pages listed say that hcard is used on every page or
that they are hidden.

Thanks for your help.



From supercanadian at gmail.com  Thu Jul 17 17:05:51 2008
From: supercanadian at gmail.com (Charles Iliya Krempeaux)
Date: Thu Jul 17 17:05:53 2008
Subject: [uf-discuss] Appropriate use of hcard
In-Reply-To: <C4A553DA.C906%lisagoodlin@gmail.com>
References: <C4A553DA.C906%lisagoodlin@gmail.com>
Message-ID: <84ce626f0807171705q57876cbcx58ad6932e996637f@mail.gmail.com>

Hello Lisa,

On Thu, Jul 17, 2008 at 4:54 PM, Lisa Goodlin <lisagoodlin@gmail.com> wrote:

[...]

> Is it appropriate to place the hcard on every page in the footer, or should
> I only do so once?

I tend to use it all over the place to mark up contact info.

I'm not sure why someone had a problem with having hCards all over the
place.  (Can anyone elaborate?)


--
Charles Iliya Krempeaux, B.Sc.
http://ChangeLog.ca/
From brian.suda at gmail.com  Fri Jul 18 01:11:40 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Fri Jul 18 01:11:42 2008
Subject: [uf-discuss] Appropriate use of hcard
In-Reply-To: <C4A553DA.C906%lisagoodlin@gmail.com>
References: <C4A553DA.C906%lisagoodlin@gmail.com>
Message-ID: <21e770780807180111n74f0808eh4e7826f041beed96@mail.gmail.com>

2008/7/17 Lisa Goodlin <lisagoodlin@gmail.com>:
> Is it appropriate to place the hcard on every page in the footer, or should
> I only do so once? If I should only offer the hcard download once, should I
> still markup the information using microformats?

--- i personally think that you should add it to every instance where
there is data to be marked-up. Visitors to your site might come from
search engine results and therefore not visit your contact page, but
with an hCard on every page they would still get the semantic mark-up
benefit.

> The reason I ask is because at
> http://microformats.org/wiki/hcard-examples-in-wild-with-problems
> some of the problem pages listed say that hcard is used on every page or
> that they are hidden.

--- that is a list of hCards with problems. The term "every page" only
appears twice, and both times it is in connection with other issues,
such as positioning off screen or not displaying the data with CSS.

It is best to keep the contact information visible. If YOU can see the
data, then errors will be corrected quicker than if the data is hidden
away and no one sees it.

As i read over the wiki page, the term "suboptimal" caught my
attention. These are people who are trying to add semantic mark-up,
and while not 100% valid we label them as "suboptimal". This is not
very inviting to new members. Can someone think of a better term,
possibly something like "incomplete" or "needs improvement" or
"missing properties"? We should encourage improvement rather than
berating failure.

Does anyone have an idea of how to improve this term?

-brian
-- 
brian suda
http://suda.co.uk
From lisagoodlin at gmail.com  Fri Jul 18 11:13:53 2008
From: lisagoodlin at gmail.com (Lisa Goodlin)
Date: Fri Jul 18 11:13:08 2008
Subject: [uf-discuss] Appropriate use of hcard
In-Reply-To: <21e770780807180111n74f0808eh4e7826f041beed96@mail.gmail.com>
Message-ID: <C4A655A1.C929%lisagoodlin@gmail.com>

Thanks for the replies. They were very helpful.

On 7/18/08 4:11 AM, "Brian Suda" <brian.suda@gmail.com> wrote:

> As i read over the wiki page, the term "suboptimal" caught my
> attention. These are people who are trying to add semantic mark-up,
> and while not 100% valid we label them as "suboptimal". This is not
> very inviting to new members. Can someone think of a better term,
> possibly something like "incomplete" or "needs improvement" or
> "missing properties". We should encourage improvement rather than
> berating failure.
> 
> Does anyone have an idea of how to improve this term?
> 
> -brian

"Could use improvement" is a bit friendlier.

From john at westciv.com  Tue Jul 15 14:57:54 2008
From: john at westciv.com (John Allsopp)
Date: Sat Jul 19 02:45:46 2008
Subject: [uf-discuss] Human and machine readable data format
In-Reply-To: <18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
References: <ae2b2ba80807120650q5f878e02ped917007f51347c0@mail.gmail.com>
	<C4A0F4CF.DC8F%Michael.Smethurst@bbc.co.uk>
	<ae2b2ba80807140539j15d1f945kfbb4b4b4c65114ad@mail.gmail.com>
	<1D57CC87-0A69-4F82-8CA9-EA941B44CFC7@randomchaos.com>
	<0DFCC49B-6B07-44F6-8E29-35B4100113FD@westciv.com>
	<18650D3B-5223-40FE-8E30-1F0794660949@randomchaos.com>
Message-ID: <393E075E-730F-41E8-9B3D-2A2D7D2E5F4B@westciv.com>

Hi Scott,

> Do you have any examples of the non-Gregorian dates being published  
> online?  Or any examples of applications that can take non-Gregorian  
> dates as input?

I've got some Japanese folks looking into that.

I don't speak Japanese, but last week I was in a very popular Japanese  
business and was required to fill in a form (well, my colleague did it  
for me), and they used the traditional calendar. Which led me to think  
it was quite common (my colleague said that it was not uncommon).

> I think we've established non-Gregorian calendars exist, but most  
> countries officially adopted the Gregorian calendar several decades  
> before the web existed (e.g. Japan in 1873).  Such adoption wasn't  
> exclusive, but it draws into question (for me anyway) whether such  
> calendars are common enough on the web and have enough potential use  
> cases to warrant modeling in microformats.  I realize it's difficult  
> to do such research without belonging to the cultures in which it  
> would appear. Unfortunately that just makes it more necessary to  
> avoid mistakes.

Hopefully I'll have an answer on that soon

john

John Allsopp

style master :: css editor :: http://westciv.com/style_master
about me :: http://johnfallsopp.com
Web Directions Conferences :: http://webdirections.org
My Microformats book :: http://microformatique.com/book


From dland at liveworld.com  Sat Jul 19 23:57:02 2008
From: dland at liveworld.com (Dave Land)
Date: Sat Jul 19 23:57:47 2008
Subject: [uf-discuss] Appropriate use of hcard
In-Reply-To: <C4A655A1.C929%lisagoodlin@gmail.com>
References: <C4A655A1.C929%lisagoodlin@gmail.com>
Message-ID: <65BB82E9-07AC-4D43-AD2C-23B48D2F98C9@liveworld.com>

On Jul 18, 2008, at 11:13 AM, Lisa Goodlin wrote:

> Thanks for the replies. They were very helpful.
>
> On 7/18/08 4:11 AM, "Brian Suda" <brian.suda@gmail.com> wrote:
>
>> As i read over the wiki page, the term "suboptimal" caught my
>> attention. These are people who are trying to add semantic mark-up,
>> and while not 100% valid we label them as "suboptimal". This is not
>> very inviting to new members. Can someone think of a better term,
>> possibly something like "incomplete" or "needs improvement" or
>> "missing properties". We should encourage improvement rather than
>> berating failure.
>>
>> Does anyone have an idea of how to improve this term?
>>
>> -brian
>
> "Could use improvement" is a bit friendlier.

One thing to do would be to describe exactly how they are "suboptimal"
-- missing properties, not entirely human- and machine-readable, and
so forth, to give specific help to those who would "optimize" them.

Dave


From peter at schnitzlers.de  Thu Jul 24 11:06:01 2008
From: peter at schnitzlers.de (Peter Schnitzler)
Date: Thu Jul 24 11:06:13 2008
Subject: [uf-discuss] Picture in better quality
Message-ID: <54DB838D-2124-4A38-A9D5-274EC05A0A3C@schnitzlers.de>

Hi,

my diploma thesis is gonna cover "microformats" as well. Is the picture

http://microformats.org/about/
http://microformats.org/media/2008/micro-diagram.gif


also available in higher resolution?

cheers,
peter
From mail at tobyinkster.co.uk  Thu Jul 24 12:15:20 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Thu Jul 24 12:15:33 2008
Subject: [uf-discuss] Cognition 0.1 alpha11 released!
Message-ID: <16FF5976-4419-4E13-A48E-9A17E9F4BE1F@tobyinkster.co.uk>

Summary: added support for hAudio, hResume, species, hmeasure and  
XEN; improved parsing, especially datetime; more experimental  
features for hCard.

URL: http://buzzword.org.uk/cognition/

Details:

Previous releases have included a lot of repetitive code for  
microformat parsing. This release includes a new centralised function  
which takes care of 95% of microformat parsing. It means that I can  
make improvements to one single function which will then benefit  
parsing for all microformats.

Two improvements I've already made are:

* Improved datetime parsing, including support for HTML 5's <time>  
element; and
* Better handling for nested microformats. For example, whereas  
previously Cognition would only parse an agent hCard if the class  
names "agent" and "vcard" were found on the same element, it is now  
able to cope with the "vcard" element being nested *inside* the  
"agent" element.

I've implemented support for the current hAudio draft
<http://microformats.org/wiki/haudio>. This information may be output
as RDF, and in future versions I expect to be able to output it as a
podcast in RSS or Atom.

I've also implemented support for the most recent draft of hResume  
<http://microformats.org/wiki/hresume>. Output is in RDF, using the  
DOAC (Description of a Career) vocab.

I've implemented support for the current species draft
<http://microformats.org/wiki/species-strawman-01>, in a manner broadly
compatible with the existing user script for Operator. This
information may be output as RDF, and (using extensions) hCard and
iCalendar. Details of the support have been documented
<http://buzzword.org.uk/cognition/uf-plus.html#species>.

Implemented support for the latest hmeasure draft
<http://microformats.org/wiki/measure>, treating units as opaque strings.

Although it was intended as a joke, I've implemented XEN, the XHTML  
Enemies Protocol <http://xen.adactio.com/>. To use it, you MUST  
include the profile URI.

Lastly, I've continued extending Cognition's support for hCard to  
include support for new vCard 4.0 (draft) properties. In this  
revision, this has included changing "lang" from a singular property  
to a plural property, and adding support for the "member" property,  
which may take a URL or an embedded hCard. e.g.

<div class="vcard">
  <p>Members of the <a href="http://microformats.org"
  class="fn org url">Microformats Community</a> include:</p>
  <ul>
   <li class="member vcard">
    <a class="url fn" href="http://tantek.com">Tantek Celik</a>
   </li>
   <li class="member vcard">
    <a class="url fn" href="http://suda.co.uk">Brian Suda</a>
   </li>
   <li>
    <a class="member" href="http://adactio.com/">Jeremy Keith</a>
   </li>
   <li>
    <a class="member" href="mailto:tai@g5n.co.uk">Toby Inkster</a>
   </li>
  </ul>
</div>

This syntax is achieved by extrapolating the existing hCard standard  
from vCard 3.0 to vCard 4.0 draft.
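
For anyone wanting to experiment with the two forms of "member", the
following Python sketch shows one way to tell them apart. It is only an
illustration and not how Cognition does it: MemberFinder and find_members
are hypothetical names, the markup is assumed to be well-formed, and only
"fn" and "url" are read from an embedded hCard.

```python
from html.parser import HTMLParser

# Elements that never have an end tag and so are ignored for depth counting.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class MemberFinder(HTMLParser):
    """Hypothetical helper: collects "member" values as URLs or hCards."""

    def __init__(self):
        super().__init__()
        self.members = []
        self.card = None      # embedded member hCard being read, if any
        self.card_depth = 0   # open elements inside that hCard
        self.fn_depth = 0     # open elements inside its "fn" (0 = none)

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if self.card is None:
            if "member" not in classes:
                return
            if "vcard" not in classes:
                # Plain form: the member is just the link target.
                if attrs.get("href"):
                    self.members.append({"kind": "url", "url": attrs["href"]})
                return
            # Embedded form: start reading a nested hCard.
            self.card = {"kind": "hcard", "fn": "", "url": None}
        self.card_depth += 1
        if self.fn_depth:
            self.fn_depth += 1
        elif "fn" in classes:
            self.fn_depth = 1
        if "url" in classes and attrs.get("href"):
            self.card["url"] = attrs["href"]

    def handle_endtag(self, tag):
        if tag in VOID or self.card is None:
            return
        if self.fn_depth:
            self.fn_depth -= 1
        self.card_depth -= 1
        if self.card_depth == 0:
            # The element that opened the embedded hCard has closed.
            self.card["fn"] = " ".join(self.card["fn"].split())
            self.members.append(self.card)
            self.card = None

    def handle_data(self, data):
        if self.card is not None and self.fn_depth:
            self.card["fn"] += data


def find_members(html):
    finder = MemberFinder()
    finder.feed(html)
    finder.close()
    return finder.members
```

Passing the example markup above to find_members() should give four
entries: two of kind "hcard" (with "fn" and "url" filled in) and two of
kind "url" (one http: link and one mailto: link).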

Feedback is welcomed.

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>



From msporny at digitalbazaar.com  Wed Jul 30 11:33:15 2008
From: msporny at digitalbazaar.com (Manu Sporny)
Date: Wed Jul 30 11:33:21 2008
Subject: [uf-discuss] Audio/Video RDF Vocabulary Screencasts
Message-ID: <4890B3EB.3000902@digitalbazaar.com>

Hi uFers,

Based on work that was done in this community over a year ago, we've
attempted to do a port of hAudio over to the RDFa world. The result is a
set of 4 vocabularies for media, audio, video and commerce. We used a
number of Microformats principles when porting the vocabularies and
re-using pre-existing vocabulary terms. We re-use Dublin Core heavily.
The vocabularies can be found here:

http://purl.org/media/
http://purl.org/media/audio
http://purl.org/media/video
http://purl.org/commerce

A small plugin, named Fuzzbot, has been put together to explore semantic
web UI approaches and two demos are now up regarding the audio and video
vocabularies:

Intro to Fuzzbot and Audio Vocabulary
http://www.youtube.com/watch?v=oPWNgZ4peuI

Intro to Video Vocabulary
http://www.youtube.com/watch?v=PVGD9HQloDI

The intros are very rough, done in 1-2 takes, but hopefully they get the
concept across. The next steps are going to be an attempt to use Firefox
3's Microformats functionality to pull uF metadata into Fuzzbot's RDFa
triple store.

Downloads and source for librdfa and Fuzzbot can be found here:

http://rdfa.digitalbazaar.com/fuzzbot

The Linux version is the only one that is up-to-date. I'll compile the
Mac OS X and Windows versions when I get the time to do so.

Feel free to comment/discuss the videos on here. We're looking for
feedback on what would make the demos more enticing. Right now, it's
just "you can use metadata to construct more accurate searches".

-- manu

PS: I also mis-spoke at one point in the first video and said that
"before Fuzzbot it wasn't possible to do this sort of thing", which is
not correct 'cause Operator has been around for a much longer time.
Apologies to Mike Kaply, since he started blazing this trail some time
ago. :)
From mail at tobyinkster.co.uk  Wed Jul 30 15:53:00 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Wed Jul 30 15:53:30 2008
Subject: [uf-discuss] Audio/Video RDF Vocabulary Screencasts
Message-ID: <852A987C-E2FA-411F-BE1F-CB1A7D8C2F8B@tobyinkster.co.uk>

For those who are interested, here are two copies of a page, one  
marked up with hAudio and the other with Audio RDF:

http://buzzword.org.uk/2008/audio/

-- 
Toby A Inkster
<mailto:mail@tobyinkster.co.uk>
<http://tobyinkster.co.uk>