[uf-discuss] Re: Apple Data Detectors

Guillaume Lebleu guillaume at lebleu.org
Fri Feb 8 08:40:26 PST 2008


Toby A Inkster wrote:
> Guillaume Lebleu wrote:
>
>   
>> What I have been thinking more and more and what this tells me again is
>> that the same way we talk of POSH and microformats, we could talk of
>> plain text or plain old english formats, essentially standardizing how
>> people write dates, addresses, etc on the Web or on their emails. Asking
>> people to write "Tuesday, February 5, 2008" in this order, with the
>> commas, etc. is very likely even simpler for normal people than writing
>> <abbr class="foo" title="2008-02-05">Tuesday, February 5, 2008</abbr>.
>>     
>
> One problem with that is that it will find matches on people who aren't 
> even intending to use your plain-old-english format. They may happen to be 
> including "Tuesday, February 5, 2008" on their pages with a different 
> intended meaning. 2008 could refer to eight minutes past eight PM in 
> military time -- unlikely, but possible. And as you move away from dates, 
> phone numbers and postcodes which have relatively parseable formats, 
> towards locations, people's names and job titles and so on, the likelihood 
> of false matches increases.
>
> The use of explicit tags to mark up information do make microformats 
> slightly harder to use, yes. But the key is that they also make 
> microformats much easier to explicitly not use.
>
>   
Toby,
I understand the challenge of disambiguation and the value microformats
bring: parsers are easier to implement and the information is more
reliable to consume. The difficulty average people have writing
microformats should not be underestimated, though. I strongly believe
that disambiguation is cheapest at publishing time, but this is also
when you are focused on the English content, not the microformats. This
is why, in the second part of the post you cited, I suggested using
Apple Data Detectors-like functionality, not to detect objects in plain
old English (POE) in published content, but to detect them as they are
written and ask the user to disambiguate right then, so that the
underlying microformat markup is generated without the user having to
know the syntax. I'm thinking of this particularly in the context of
writing a blog post: writing an hCard just to say "My friend Joe" is
way too much for normal people. On the other hand, if, as I type this,
I get an IntelliSense-like list of my contacts, then I can just select
Joe from the list and have the microformat markup added for me (just as
WordPress adds a lot of markup that isn't in the visual editor, or as a
wiki converts simplified markup into HTML).
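
To make the idea a bit more concrete, here is a rough sketch in Python
of the two pieces an editor would need: suggesting contacts as the
author types, and emitting the markup once the author has confirmed a
choice. Everything named here is an assumption for illustration only
(the CONTACTS table, the example.org URLs, the function names); the
"foo" class is just the placeholder from my earlier example, and the
confirmation step itself is left to the editor's UI.

```python
import re
from datetime import date
from html import escape

# Hypothetical address book; a real editor would read the user's contacts.
CONTACTS = {
    "Joe Example": "http://example.org/joe",
    "Joan Sample": "http://example.org/joan",
}

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Matches dates written exactly as "Tuesday, February 5, 2008".
DATE_RE = re.compile(
    r"\b(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day, "
    r"(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b"
)


def suggest_contacts(typed):
    """Return contacts with a name part starting with what was typed."""
    prefix = typed.lower()
    return sorted(
        name for name in CONTACTS
        if any(part.lower().startswith(prefix) for part in name.split())
    )


def hcard(full_name):
    """Minimal hCard markup for a contact the author picked from the list."""
    url = escape(CONTACTS[full_name], quote=True)
    return ('<span class="vcard"><a class="fn url" href="%s">%s</a></span>'
            % (url, escape(full_name)))


def mark_up_dates(text):
    """Wrap each detected date in an abbr carrying its ISO 8601 form."""
    def wrap(match):
        month, day, year = match.groups()
        try:
            iso = date(int(year), MONTHS.index(month) + 1, int(day)).isoformat()
        except ValueError:
            # Not a real calendar date (e.g. February 31): leave it alone.
            return match.group(0)
        return '<abbr class="foo" title="%s">%s</abbr>' % (iso, match.group(0))
    return DATE_RE.sub(wrap, text)
```

With this, typing "jo" would offer both Joan and Joe, and picking Joe
would insert hcard("Joe Example") in place of the typed text. The
author never sees the markup, and a match is only marked up after the
author has accepted it, which is where the disambiguation happens.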
Guillaume

