From philipj at opera.com  Mon Feb  1 01:51:24 2010
From: philipj at opera.com (=?utf-8?Q?Philip_J=C3=A4genstedt?=)
Date: Mon Feb  1 01:51:44 2010
Subject: [uf-discuss] Fwd: Removing the FN magic in the vCard microdata
	vocabulary
In-Reply-To: <101535998-1264784845-cardhu_decombobulator_blackberry.rim.net-1587876118-@bda088.bisx.prod.on.blackberry>
References: <101535998-1264784845-cardhu_decombobulator_blackberry.rim.net-1587876118-@bda088.bisx.prod.on.blackberry>
Message-ID: <op.u7f0bynbsr6mfa@sisko.linkoping.osa>

Hi microformateers,

Please see the below forwarded question about removing the guessing of  
names when exporting vCard. Since Hixie wants the microdata vCard  
vocab/extraction to be compatible with microformats, I'm taking it to the  
source...

In short, I think that guessing the names will create problems for
Vietnamese names (family-name given-name given-name), Chinese names
(written without a space), transcribed Chinese names (family-name
given-name) and probably Japanese and Korean names, for the same reasons.

Are there compatibility issues with not outputting an N line at all? If
there are, would there be any issues with simply outputting N:;;;; ?

The current algorithm is used on http://foolip.org/microdatajs/live/ for  
reference.
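
To make the failure concrete, here is a minimal Python sketch of the kind of guess being discussed. It is not the actual hCard or microdata algorithm: the function name guess_n and the "split only a two-word fn" rule are assumptions made purely for illustration.

```python
def guess_n(fn):
    """Guess (family, given) from a formatted name; return None if unsure.

    Simplified assumption: only a two-word fn is split, with the first
    word taken as the given name and the second as the family name.
    """
    words = fn.split()
    if len(words) != 2:
        return None
    given, family = words
    return (family, given)


# Given-name-first order: the guess happens to be right.
print(guess_n("Jeremy Keith"))       # ('Keith', 'Jeremy')
# Family-name-first order: the guess swaps the two parts.
print(guess_n("Zhang Min"))          # ('Min', 'Zhang')
# Three words: this sketch gives up rather than guess.
print(guess_n("Anne van Kesteren"))  # None
```

A template that only knows the full name has no way to tell the first two cases apart, which is the core of the problem.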

-- 
Philip Jägenstedt
Core Developer
Opera Software

------- Forwarded message -------
From: "Tantek Celik" <tantek@cs.stanford.edu>
To: "Ian Hickson" <ian@hixie.ch>, "Philip Jägenstedt"
<philipj@opera.com>, "Tantek Çelik" <tantek@cs.stanford.edu>, "Jeremy
Keith" <jeremy@adactio.com>
Cc: "whatwg@whatwg.org List" <whatwg@whatwg.org>
Subject: Re: Removing the FN magic in the vCard microdata vocabulary (Was:  
[whatwg]Microdata feedback)
Date: Fri, 29 Jan 2010 18:08:33 +0100

There have been several issues filed specifically regarding 'n' and 'fn'
optimizations in hCard, in particular the i18n problem that is mentioned
in this thread, and resolved with errata updates to these algorithms.

This particular issue is documented on the hcard-issues-resolved page on
the microformats wiki.

If there are further problems regarding these property optimizations, I'm
certainly open to seeing (and would like to see) them raised+documented so
that we can fix them in hCard. (There shouldn't be any divergence, and
frankly I'd prefer that vcard microdata simply reference hCard but I
realize that is waiting on hCard 1.0.1).

As I'm editing hCard 1.0.1 now and making changes to address issues just
like this - now is a very good time to give this feedback.

Please either send them to microformats-discuss@microformats.org or feel
free to add them directly to the hCard issues wiki page (preferable):

http://microformats.org/wiki/hcard-issues

And we can follow-up there.

Thanks,

Tantek


------Original Message------
From: Ian Hickson
To: Philip Jägenstedt
To: Tantek Çelik
To: Jeremy Keith
Cc: whatwg@whatwg.org List
Subject: Removing the FN magic in the vCard microdata vocabulary (Was:
[whatwg]Microdata feedback)
Sent: Jan 29, 2010 01:04

On Thu, 21 Jan 2010, Philip Jägenstedt wrote:
> On Mon, 18 Jan 2010 16:24:46 +0100, Jeremy Keith <jeremy@adactio.com>
> wrote:
> > Hixie wrote:
> > > > Finally on vCard, the final part of the extraction algorithm goes
> > > > to great trouble to guess what is the family name and what is the
> > > > given name. This guess will be broken for transliterated east
> > > > Asian names (CJKV that I know of, maybe others too). Just saying.
> > > > Also, why is it important to explicitly add N:;;;; for
> > > > organizations?
> > >
> > > This is intended to be compatible with Microformats vCard, which has
> > > these weird rules. If you think we should remove them, please at
> > > least first speak to Tantek and see what he thinks.
> >
> > The fn optimisation pattern isn't intended to catch 100% of cases,
> > just the situation "Firstname Lastname" or "Firstname Middlename
> > Lastname". So if you just use fn (formatted name) and don't use n
> > (name), the name will be extracted/guessed using the optimisation
> > pattern.
> >
> > In cases where the pattern doesn't work (e.g. "Anne van Kesteren", or
> > east Asian names) you can still explicitly specify the family name and
> > given name, over-riding the fn optimisation pattern. If you do this,
> > you need to explicitly state this is the name (n) as well as the
> > formatted name (fn).
>
> This is going to break badly whenever a template uses vCard microdata
> and its author either doesn't know the family name and given name
> (because the data was never collected) or doesn't even consider that the
> vcard conversion does this funny guesswork. If a social network site or
> similar does this, then Anne van Kesteren and Zhang Min (fictional name)
> will have their names messed up with no way of fixing it. At least I
> haven't seen a site which asks users to both fill in their full name and
> each component, which is what you need to get this right.
>
> > Similarly, for organisations, you don't have to explicitly set n
> > (name) if you apply both fn (formatted name) and org (organisation
> > name) to a string. This time, the optimisation pattern assumes that
> > the fn is the name of the organisation.
> >
> > Technically, the n property is *always* required but if you use either
> > of those two optimisation patterns, the n is inferred from fn.
>
> If this is just a technical problem with some software requiring N to be
> present, would it be OK to just output an empty N like for
> organizations?

That's a good question... As I mentioned above, the rule is here to be
compatible with Microformats. I'd be happy to remove it, but I'd like
confirmation from the Microformats community that it's ok for us to
diverge in this way from their vocabulary, and to find out if they have
any experience regarding how much of a problem generating a blank N in the
output when it's missing would be. Tantek, Jeremy, any opinions?
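
For concreteness, the "blank N" alternative could look roughly like the Python sketch below. This is only an illustration under stated assumptions: the helper name vcard_lines is hypothetical, VERSION:3.0 is assumed, and no escaping of special characters is attempted.

```python
def vcard_lines(fn, n=None):
    """Build minimal vCard lines; n is (family, given) or None.

    Hypothetical helper for illustration: values are not escaped and
    only FN and N are emitted.
    """
    lines = ["BEGIN:VCARD", "VERSION:3.0", "FN:" + fn]
    if n is None:
        # No explicit name parts: emit an empty N rather than guessing.
        lines.append("N:;;;;")
    else:
        family, given = n
        # N has five fields: family;given;additional;prefixes;suffixes
        lines.append("N:{};{};;;".format(family, given))
    lines.append("END:VCARD")
    return lines


print("\n".join(vcard_lines("Zhang Min")))
print("\n".join(vcard_lines("Jeremy Keith", ("Keith", "Jeremy"))))
```

Whether consumers of the vCard cope well with the empty N line is exactly the open compatibility question.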

From tantek at cs.stanford.edu  Mon Feb  8 13:31:09 2010
From: tantek at cs.stanford.edu (=?UTF-8?Q?Tantek_=C3=87elik?=)
Date: Mon Feb  8 13:31:35 2010
Subject: [uf-discuss] geo shorthand in anchor
In-Reply-To: <1263564099.2546.24.camel@csarven-laptop>
References: <1261936306.2543.29.camel@csarven-laptop>
	<21e770780912271046i3bedc485m59aee2df1c469bc0@mail.gmail.com> 
	<1262206249.4426.96.camel@csarven-laptop>
	<21e770780912301354h2b1c638fj7019592fc17f6fb@mail.gmail.com> 
	<1262210946.9728.14.camel@csarven-laptop>
	<21e770780912301430p71b37da9i93ccd7acd658d9c1@mail.gmail.com> 
	<1262259010.4580.28.camel@csarven-laptop>
	<60cb038a0912310710o3287b374h6e48226af499d3b4@mail.gmail.com> 
	<1263564099.2546.24.camel@csarven-laptop>
Message-ID: <60cb038a1002081331k5334d812oaf991672ce294c55@mail.gmail.com>

On Fri, Jan 15, 2010 at 6:01 AM, Sarven Capadisli <info@csarven.ca> wrote:
> I've noted my observations on your observations
> http://microformats.org/wiki/index.php?title=geo-brainstorming&diff=41657&oldid=41586

Thanks Sarven, you raised some good questions - I've followed up on
the wiki as well.

> I see two things there:
>
> 1. changing the problem i.e., intended visible readable text content

In general we should seek to make content more visible when possible.

> 2. "45.5140800" and "-73.6111000" as text values is no more human
> readable and listenable than as "45.5140800;-73.6111000" title value.

But that's not an exact comparison of the renderings; it leaves out the
key difference, the labels:

lat:45.5140800; long:-73.6111000

which is then more readable/listenable/understandable than a pair of
semicolon-separated numbers. It may not be perfect, but it is an
improvement.
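
As a rough illustration of the difference, the Python sketch below turns the semicolon-separated title value into the labelled rendering. The function name label_geo is hypothetical, and the sketch only assumes the "latitude;longitude" ordering used in the example above.

```python
def label_geo(value):
    """Turn 'lat;long' into 'lat:<lat>; long:<long>'."""
    lat, lon = (part.strip() for part in value.split(";"))
    # Check that both parts are numbers, but keep the original digits
    # so that trailing zeros are not lost.
    float(lat)
    float(lon)
    return "lat:{}; long:{}".format(lat, lon)


print(label_geo("45.5140800;-73.6111000"))
# lat:45.5140800; long:-73.6111000
```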

Thanks,

Tantek
From palmisano at fbk.eu  Fri Feb 19 01:59:31 2010
From: palmisano at fbk.eu (Davide Palmisano)
Date: Fri Feb 19 02:00:18 2010
Subject: [uf-discuss] [ANN] any23 v0.2 released
Message-ID: <AAE8384267E68B47BE0785D6E080B0A2D5AFD015@ntmail2.pc.itc.it>

Dear all,

we are proud to announce a new release of any23 -- Anything to Triples.

          http://developers.any23.org/

Any23 is a Java library that parses RDF from a variety of Web document
formats. The currently supported input formats are RDFa, RDF/XML,
Turtle, N3, N-Triples, and a number of Microformats.
Any23 is an open source project that originated from code created
within the Sindice project. It is now used both inside Sindice and in
related projects, e.g. Sig.Ma.

Any23 comes with a handy command-line tool for parsing RDF and
converting between formats.

We have also set up a demo service where you can try any23 online and
use a REST API to convert between different RDF formats, similar in
spirit to triplr.org:

          http://any23.org/

The major new features in this release are:

* Redesigned Java API
   - Input from string, stream, file, or URI
   - Allow choosing which extractors to use
   - Report origin of triples (document/extractor) to client processors
   - Various processors/serializers for extracted triples
* Added flexible command-line tool for easy testing
* Vastly improved website and documentation
* Media type and encoding detection via Apache Tika
* Switched RDF library from Jena to Sesame
* Added Maven build
* Better RDF extraction from Microformats
* Extractors come with an example file documenting typical input and output
* Major refactoring
* Lots and lots of bugfixes

The following people have contributed to this release: Michele
Mostarda and Davide Palmisano (FBK, Trento, Italy, Web of Data Unit
(WED)); Richard Cyganiak and Jürgen Umbrich (DERI, NUI Galway,
Ireland); Michele Catasta (EPFL, Lausanne, Switzerland); Giovanni
Tummarello.

All the best,
Davide Palmisano on behalf of the contributors


Davide Palmisano
Web of Data  Research Unit
Technologist @ Fondazione Bruno Kessler
http://wed.fbk.eu/en/home
---
http://davidepalmisano.wordpress.com
http://twitter.com/dpalmisano
http://www.slideshare.net/dpalmisano

From tantek at cs.stanford.edu  Fri Feb 19 18:01:28 2010
From: tantek at cs.stanford.edu (=?UTF-8?Q?Tantek_=C3=87elik?=)
Date: Fri Feb 19 18:01:52 2010
Subject: [uf-discuss] [ANN] any23 v0.2 released
In-Reply-To: <AAE8384267E68B47BE0785D6E080B0A2D5AFD015@ntmail2.pc.itc.it>
References: <AAE8384267E68B47BE0785D6E080B0A2D5AFD015@ntmail2.pc.itc.it>
Message-ID: <60cb038a1002191801i268bf84ak9c90144cbfc585bf@mail.gmail.com>

On Fri, Feb 19, 2010 at 1:59 AM, Davide Palmisano <palmisano@fbk.eu> wrote:
> Dear all,
>
> we are proud to announce a new release of any23 -- Anything to Triples.
>
>          http://developers.any23.org/

Davide, congratulations on your release!


> Any23 is a Java library that parses RDF from a variety of Web document
> formats. The currently supported input formats are RDFa, RDF/XML,
> Turtle, N3, N-Triples, and a number of Microformats.
> Any23 is an Open Source project originated from the code created
> within the Sindice project and now used both inside sindice and in
> related projects e.g. Sig.Ma
>
> Any23 comes with a handy command-line tool for parsing RDF and
> converting between formats.
>
> We have also set up a demo service where you can try any23 online and
> use a REST API to convert between different RDF formats, similar in
> spirit to triplr.org:
>
>          http://any23.org/
>
> The major new features in this release are:
>
> * Redesigned Java API
>   - Input from string, stream, file, or URI
>   - Allow choosing which extractors to use
>   - Report origin of triples (document/extractor) to client processors
>   - Various processors/serializers for extracted triples
> * Added flexible command-line tool for easy testing
> * Vastly improved website and documentation
> * Media type and encoding detection via Apache Tika
> * Switched RDF library from Jena to Sesame
> * Added Maven build
> * Better RDF extraction from Microformats

This is great to hear.

Tom Morris has already kindly added any23 to the parsers page:

http://microformats.org/wiki/parsers

Could you list the specific microformats that are parsed by any23?

And even better, feel free to add any23 to the *-implementations pages
of the microformats that it supports, e.g. if it supports hCard, add
it to:

http://microformats.org/wiki/hcard-implementations#Open_Source


> The following people have contributed to this release: Michele
> Mostarda and Davide Palmisano (FBK, Trento, Italy, Web of Data Unit
> (WED) ); Richard Cyganiak and Jürgen Umbrich (DERI, NUI Galway,
> Ireland); Michele Catasta (EPFL, Lausanne, Switzerland), Giovanni
> Tummarello
>
> All the best,
> Davide Palmisano on behalf of the contributors

Thanks again for all your excellent work and for contributing to
bettering the interoperability of semantic data on the web.

Tantek

-- 
http://tantek.com/

From palmisano at fbk.eu  Mon Feb 22 01:55:53 2010
From: palmisano at fbk.eu (Davide Palmisano)
Date: Mon Feb 22 02:01:07 2010
Subject: [uf-discuss] [ANN] any23 v0.2 released
In-Reply-To: <60cb038a1002191801i268bf84ak9c90144cbfc585bf@mail.gmail.com>
References: <AAE8384267E68B47BE0785D6E080B0A2D5AFD015@ntmail2.pc.itc.it>,
	<60cb038a1002191801i268bf84ak9c90144cbfc585bf@mail.gmail.com>
Message-ID: <AAE8384267E68B47BE0785D6E080B0A2D5AFD018@ntmail2.pc.itc.it>


________________________________________
From: microformats-discuss-bounces@microformats.org [microformats-discuss-bounces@microformats.org] On Behalf Of Tantek Çelik [tantek@cs.stanford.edu]
Sent: Saturday, February 20, 2010 3:01 AM
To: Microformats Discuss
Subject: Re: [uf-discuss] [ANN] any23 v0.2 released

> On Fri, Feb 19, 2010 at 1:59 AM, Davide Palmisano <palmisano@fbk.eu> wrote:
> > Dear all,
> >
> > we are proud to announce a new release of any23 -- Anything to Triples.
> >
> >          http://developers.any23.org/
>
> Davide, congratulations on your release!

Many thanks Tantek!

> > Any23 is a Java library that parses RDF from a variety of Web document
> > formats. The currently supported input formats are RDFa, RDF/XML,
> > Turtle, N3, N-Triples, and a number of Microformats.
> > Any23 is an Open Source project originated from the code created
> > within the Sindice project and now used both inside sindice and in
> > related projects e.g. Sig.Ma
> >
> > Any23 comes with a handy command-line tool for parsing RDF and
> > converting between formats.
> >
> > We have also set up a demo service where you can try any23 online and
> > use a REST API to convert between different RDF formats, similar in
> > spirit to triplr.org:
> >
> >          http://any23.org/
> >
> > The major new features in this release are:
> >
> > * Redesigned Java API
> >   - Input from string, stream, file, or URI
> >   - Allow choosing which extractors to use
> >   - Report origin of triples (document/extractor) to client processors
> >   - Various processors/serializers for extracted triples
> > * Added flexible command-line tool for easy testing
> > * Vastly improved website and documentation
> > * Media type and encoding detection via Apache Tika
> > * Switched RDF library from Jena to Sesame
> > * Added Maven build
> > * Better RDF extraction from Microformats
>
> This is great to hear.
>
> Tom Morris has already kindly added any23 to the parsers page:
>
> http://microformats.org/wiki/parsers

Wow, this is great!

> Could you list the specific microformats that are parsed by any23?

Of course: Adr, Geo, hCalendar, hCard, hListing, hResume, hReview, License and XFN.

They are also listed here: http://developers.any23.org

> And even better, feel free to add any23 to the *-implementations pages
> of the microformats that it supports, e.g. if it supports hCard, add
> it to:
>
> http://microformats.org/wiki/hcard-implementations#Open_Source

Sure, will do!

> > The following people have contributed to this release: Michele
> > Mostarda and Davide Palmisano (FBK, Trento, Italy, Web of Data Unit
> > (WED) ); Richard Cyganiak and Jürgen Umbrich (DERI, NUI Galway,
> > Ireland); Michele Catasta (EPFL, Lausanne, Switzerland), Giovanni
> > Tummarello
> >
> > All the best,
> > Davide Palmisano on behalf of the contributors
>
> Thanks again for all your excellent work and for contributing to
> bettering the interoperability of semantic data on the web.

Thanks to you for your quick feedback! We are glad to hear
suggestions and improvements from the community.

> Tantek
>
> --
> http://tantek.com/
