[uf-discuss] Re: Microformats in Google Maps

Toby A Inkster mail at tobyinkster.co.uk
Thu Aug 2 08:34:04 PDT 2007

Andy Mabbett wrote:

> <http://microformats.org/wiki/hcard-brainstorming#implied_adr_subproperties>
> which strikes me as unworkable, being overly complex and not suitable
> for internationalisation (not just in non-English speaking countries,
> but outside the USA)

I'm with Andy on this one.

In fact, Tantek's proposed algorithm doesn't even solve the problem of
parsing US addresses. Consider:

	<div xml:lang="fr">
	  <p>Contactez-nous a:</p>
	  <div class="vcard">
	    <div class="org fn">Ambassade de France aux Etats-Unis</div>
	    <div class="adr">
	      4101 Reservoir Road, N.W.<br />
	      Washington D.C. 20007<br />
	      Etats-Unis d'Amerique
	    </div>
	  </div>
	</div>

I recently had to write some code to transfer almost 500,000 addresses
from a loosely formatted list to one which had separate fields for house
name, address, town, county, country and postcode.

Because these were almost entirely UK addresses, and I had a big database
of all UK postal towns and their corresponding postcodes, I was able to get
about 95% accuracy -- but that involved hundreds of lines of code. To cover
a useful number of countries would require tens of thousands of lines of
code.

Requiring the use of heuristics to parse address data raises the barrier to
entry for implementing hCard astronomically.
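
To give a feel for what a single-country heuristic involves, here is a
minimal Python sketch that only pulls a UK-style postcode out of a loose
address string. It is not the code described above: the function name
extract_uk_postcode and the simplified pattern are assumptions made for
illustration, and a real implementation would still need the town
database and many more rules.

```python
import re

# Simplified UK postcode pattern (an assumption for illustration; it
# ignores special cases and does not validate against real postcode areas).
UK_POSTCODE = re.compile(
    r"\b([A-Z]{1,2}[0-9][0-9A-Z]?)\s*([0-9][A-Z]{2})\b",
    re.IGNORECASE,
)


def extract_uk_postcode(address):
    """Return (postcode, remainder); postcode is None when nothing matches."""
    match = UK_POSTCODE.search(address)
    if match is None:
        # Nothing that looks like a UK postcode: hand the text back untouched.
        return None, address.strip()
    # Normalise to upper case with a single space between the two halves.
    postcode = f"{match.group(1)} {match.group(2)}".upper()
    # Cut the postcode out and trim stray separators from the ends.
    remainder = (address[:match.start()] + address[match.end():]).strip(" ,\n")
    return postcode, remainder
```

Even this one field needs a country-specific pattern, and the pattern
finds nothing in the Washington address above, so every further country
adds its own rules.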

Andy's suggestion of defaulting to "extended-address" is better, though
given the semantics of "extended-address", which appears to be for flat
numbers, I'd prefer to default to "street-address".

How about:

	Where "adr" has content not enclosed in any explicit sub-
	properties, parsers MAY attempt to heuristically determine
	the address parts and, if appropriate, MAY ask the user
	to manually separate the address. Failing that, parsers
	MUST assume this content to be the "street-address".
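
The proposed rule could be sketched in Python roughly as follows. This
is an illustration under stated assumptions, not part of any hCard
parser: resolve_loose_adr, heuristic and ask_user are hypothetical
names, both callbacks are optional, and each is assumed to return a
dict of sub-property names to values, or None if it could not split
the address.

```python
def resolve_loose_adr(loose_text, heuristic=None, ask_user=None):
    """Map adr content that has no explicit sub-properties to a dict."""
    # Collapse line breaks and runs of whitespace left over from the markup.
    text = " ".join(loose_text.split())
    if not text:
        return {}
    # Optional steps (the MAYs): try the heuristic first, then the user.
    for strategy in (heuristic, ask_user):
        if strategy is not None:
            parts = strategy(text)
            if parts:
                return dict(parts)
    # Mandatory fallback (the MUST): treat everything as "street-address".
    return {"street-address": text}


# A parser with no heuristics at all still yields a usable result:
embassy = resolve_loose_adr(
    "4101 Reservoir Road, N.W.\nWashington D.C. 20007\nEtats-Unis d'Amerique"
)
```

A parser that implements neither optional step only needs the last
line of the function, which keeps the barrier to entry low.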

Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.12-12mdksmp, up 42 days, 18:43.]

