hcard-brainstorming: Difference between revisions

From Microformats Wiki
Revision as of 09:34, 7 February 2007

hCard Brainstorming

This page is for brainstorming about various uses and details of hCard.

Authors

Contributors

Problems Being Solved

Some of the problems that hCard helps to solve:

  • having to enter business cards that go out of date (subscribe to someone's syndicated hCard instead).
  • annoying "update your contact info" email from various centralized contact info services

Examples

  • See hcard-examples, which provides several illustrative examples, as well as 1:1 hCard examples for each example in RFC 2426.

Using RFC2806 with hCard

RFC 2806 defines the URI schemes "tel:", "fax:" and "modem:" for phone communications, in the same way that "mailto:" is defined for email. They are part of the list of URI schemes registered by IANA: Uniform Resource Identifier (URI) SCHEMES

tel   telephone [RFC2806]
fax   fax       [RFC2806]
modem modem     [RFC2806]

It is practical to mark up your telephone number like this:

<a class="tel"      href="tel:+1-919-555-7878">+1-919-555-7878</a>

or even

<a class="tel"      href="tel:+1-919-555-7878">Mr Smith's phone</a>

You can add support for "tel:" to your desktop and to your browser

On the CSS front, you could for example automatically add an icon. I have marked the properties !important for those who want to add these rules to their own browser stylesheet, so that they can tell the type of a link when browsing.

a[href^="tel:"]:before {
    content: '\260f  ' !important;
    padding-left: 20px !important; }

a[href^="mailto:"]:before {
    content: '\2709  ' !important;
    padding-left: 20px !important; }

Encoding "modern" attributes

Since vCard was first established, various interactive communication technologies and addressing schemes have been widely adopted. Although there aren't specific properties for these technologies / addressing schemes, they can be captured as URLs or email addresses.

This has now been written up for the most part. See:

http://microformats.org/wiki/hcard-examples#New_Types_of_Contact_Info

Still to be addressed:

  • iChat mac.com addresses, simply store "@mac.com" email addresses, e.g.
    • <a class="email" href="mailto:steve@mac.com">...
  • MSN Instant Messenger, you can simply store "@hotmail.com" or "@msn.com" or "@passport.com" email addresses.
  • Internet Relay Chat (IRC), use "irc:" URLs.

CSS Styles

Not only can you create semantics with the hCard values, but you can add CSS styles to them as well. You are free to style the terms in any way you want, but here we can list a few ideas for how to style terms.

If you want to encode hCard data but do NOT want it displayed on the rendered page, you can hide the element with CSS as follows:

<span style="display: none">Hidden Data</span>

Transforming applications will still find the data and use it when converting hCards to vCards.

Auto-Discovery

vCard auto extraction

There is currently a debate over the best way to add an auto discovery link to your HTML to extract the vCard.

On the page with the hCard encoding, the best link would be as follows: <link rel="alternate" type="text/directory" href="..." /> this HTML page is an alternate view of the vCard.

The registered and appropriate type for vCard entities is “text/directory”, as defined in Internet RFC 2425, “A MIME Content-Type for Directory Information”. RFC 2426, “vCard MIME Directory Profile”, specifies the vCard profile for “text/directory” entities, which profile the MIME/HTTP header field “Content-Type” would indicate with a “profile” parameter whose value is “VCARD”.
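
As a rough sketch of what a consumer of such a link might do, the following Python collects the href of every <link> whose rel includes "alternate" and whose type is "text/directory". The class and function names are hypothetical, and dropping any parameters after the media type (such as a "profile" parameter) is an assumption made here, not something the HTML specifications settle.

```python
from html.parser import HTMLParser
from typing import List


class VCardLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate" type="text/directory">."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rels = (attr_map.get("rel") or "").lower().split()
        # Assumption: ignore any parameters after the media type itself.
        media_type = (attr_map.get("type") or "").split(";")[0].strip().lower()
        href = attr_map.get("href")
        if "alternate" in rels and media_type == "text/directory" and href:
            self.hrefs.append(href)


def find_vcard_links(html_text: str) -> List[str]:
    finder = VCardLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hrefs
```

The returned href values may be relative, so a real consumer would still need to resolve them against the page URL.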

It is unclear whether the HTML/XHTML “type” attribute allows values with parameters. On 2004-05-23, Björn Höhrmann sent to the HTML Working Group a request for clarification on the issue.

When on a different page, referencing that encoded page in the href would not be an alternate view of the current page. Therefore rel="alternate" may not be appropriate. The problem of what rel value to use is bigger than links to vCards.

hCard to hCard relationships

There are several types of hCard to hCard relationships, that is, one hCard hyperlinking to another hCard, which would benefit from explicit rel values that describe the specific relationship.

mini hCard to expanded hCard

Perhaps the most common type of hCard to hCard link is a mini hCard, e.g. from a personal home page or blog to the person's contact/about page, perhaps consisting of only a name and URL, that links to an expanded hCard. Examples in the wild:

In this instance, possible rel values might include:

  • rel="expanded"
  • rel="definitive" - the problem with this is that the expanded hCard is not necessarily a definitive version.
  • rel="canonical" - similarly, the expanded hCard is not necessarily at a canonical URL. It may simply be *an* expanded version, not *the* expanded version.

The following rel values have been suggested, but are not a good idea because they imply that a new rel value would have to be added for every new microformat that might have a mini version linking to a more expanded version:

  • rel="author"
  • rel='contact'
  • rel="contactinfo"
  • rel='hcard'
  • rel='person'

Here are some more generic values that have been suggested which perhaps make even less sense:

  • rel='microformat' - this doesn't make any sense when you imagine a world where nearly every web page contains microformats.
  • rel='about' - what does "about" have to do with a person or even authorship?
  • rel="profile" - should be reserved to mean "here is an XMDP profile for the current page".
  • rel='PIM' - not sure about how this makes any sense either.

mini hCard to remote site

Per the instructions in hcard-examples for marking up people in blogrolls, you might have an hCard on your site for another person, which then links to that other person's website. Should there be a rel value that indicates this "mini-hCard" to "person" relationship?

mini hCards and nearby expanded hCard links

Some authors include mini-hCards on their pages of themselves (e.g. in their blog posts), and yet those mini-hCards don't actually point to more expanded versions. However, sometimes they have a separate but nearby link on the same page like "about" or "contact" that does link to an expanded hCard.

E.g. on FactoryCity, blog posts have mini-hCards for "published by", e.g. (white space added for readability):

Published by 
<span class="vcard author">
 <a href="http://factoryjoe.com/blog/author/factoryjoe/" class="url fn">
  Chris Messina
 </a>
</span>

On those same blog pages, there is a link labeled "Contact Information" that links to http://factoryjoe.com/blog/hcard/ which has an hCard with more information like phone number, birthday etc.


Auto-Discovery for XFN

An author will typically publish their XFN information on a specific page rather than on all pages, in particular on a page separate from the home page of their blog. It would therefore be useful to have an explicit rel value to assist in auto-discovery of XFN information.

This was suggested by Jens Alfke on 2005-06-06 at the WWDC blogger's dinner.

geo improvements

These improvements apply to both geo and hCard.

I (Tantek) have seen examples of where there is a human viewable/clickable presentation of a point on a map, and the desire to include the machine readable geo information with the same element, e.g. something like:

<abbr class="geo" title="machine-readable-geo-info">
 human readable/clickable point on a map
</abbr>

But to do this we must specify a syntax for putting both the latitude and longitude into the title attribute as the machine-readable-geo-info.

Fortunately, there already is a syntax for that, in vCard RFC 2426 3.4.2:

   Type value: A single structured value consisting of two float values
   separated by the SEMI-COLON character (ASCII decimal 59).

   Type special notes: This type specifies information related to the
   global position of the object associated with the vCard. The value
   specifies latitude and longitude, in that order (i.e., "LAT LON"
   ordering).

...

   Type example:

        GEO:37.386013;-122.082932

Thus:

<abbr class="geo" title="37.386013;-122.082932">
 Mountain View, CA
</abbr>

I think this is pretty much a no-brainer, because the rules for parsing "geo" are simply altered to:


latitude longitude shorthand

If a "geo" property lacks explicit "latitude" and "longitude" subproperties, then the "geo" property is treated like any other string property (e.g. following rules for parsing <abbr title>, <img alt> etc.), where that string value has the same literal syntax as specified in RFC 2426 section 3.4.2: single structured value consisting of two float values separated by the SEMI-COLON character (ASCII decimal 59), specifying latitude and longitude, in that order.
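
A minimal sketch of how a parser might read this shorthand, assuming the string has already been extracted (e.g. from the title attribute). The function name is hypothetical, and the range check is an extra sanity check added here, not part of RFC 2426.

```python
from typing import Optional, Tuple


def parse_geo_shorthand(value: str) -> Optional[Tuple[float, float]]:
    """Parse "LAT;LON" (RFC 2426 section 3.4.2 ordering) into two floats."""
    parts = value.strip().split(";")
    if len(parts) != 2:
        return None
    try:
        latitude, longitude = float(parts[0]), float(parts[1])
    except ValueError:
        return None
    # Assumption: reject values outside the usual latitude/longitude ranges.
    if not (-90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0):
        return None
    return latitude, longitude
```

Returning None rather than raising lets a caller fall back to treating the value as plain human-readable text.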

geo links

In addition, people may publish Google Maps links like this:

<a href="http://maps.google.com/maps?q=37.386013+-122.082932">this spot</a>

or Yahoo! Maps links like this:

<a href="http://maps.yahoo.com/#lat=37.386013&lon=-122.082932&mag=3">this spot</a>

Is it worth permitting this to be a geo as well?

I'm raising this to make sure it is considered.

However, my first guess is NO for two reasons.

  1. No such examples in the wild have been documented or seen as of yet (I certainly haven't seen any).
  2. It would involve additional parsing requirements which are almost certainly going to be site/domain specific, and encoding a particular site's query parameter syntax into a format seems like a bad idea (against principle of decentralization).

This could be mitigated if mapping services would simply accept the literal vCard GEO syntax "37.386013;-122.082932", e.g. http://maps.google.com/maps?q=37.386013;-122.082932 (which currently doesn't work) then we could make a simple rule such as for hyperlinks, parse the href attribute for a geo value at the end of the href, delimited before the value by a "=" (or perhaps "/" for services that use friendlier URLs).
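
For illustration only, the "simple rule" above could look something like the following sketch. It assumes a mapping service that accepts the literal vCard GEO syntax, which (as noted) is not currently the case; the function name and the exact number pattern are hypothetical.

```python
import re
from typing import Optional, Tuple

# A geo value at the very end of the href, preceded by "=" or "/".
_GEO_TAIL = re.compile(r"[=/]([-+]?\d+(?:\.\d+)?);([-+]?\d+(?:\.\d+)?)$")


def geo_from_href(href: str) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) if the href ends in "LAT;LON"."""
    match = _GEO_TAIL.search(href)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))
```

Note that the site-specific links shown earlier (with "+" or "lat=...&lon=..." separators) deliberately do not match, which is the point of the rule.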

altitude

Some folks have asked for "altitude" as an extension to GEO. Currently we are rejecting all property/value extensions to hCard/vCard.

radius/zoom

Kevin Marks has asked for "radius" or "zoom" as an extension to GEO. Currently we are rejecting all property/value extensions to hCard/vCard.

ISO 19136

When it comes to anything geospatial, any unadorned or simple encoding must remain upwardly compatible with the more sophisticated GML (Geography Markup Language) schema, which is also known as ISO 19136. This is so that all the fundamental nuances underpinning geocoding (different datums, different projections, elevation, etc.) can ultimately (or sooner?) be completely accounted for.

If you don't know or don't supply your Coordinate Reference System (CRS) identifier, your location could fall hundreds of metres away from the intended position, i.e. be plotted in the wrong location on a map. Appendix B of draft ISO/DIS 6709 highlights the variation among three commonly used systems.

ISO/DIS 6709

Draft International Standard ISO/DIS 6709 specifies the standard representation of geographic point location by coordinates. Section 6.3 notes the elements required for geographic point location:

In this International Standard, geographic point location shall be represented by five elements:

  • a coordinate reference system identification;
  • coordinate representing “x” horizontal position such as latitude;
  • coordinate representing “y” horizontal position such as longitude;
  • for three-dimensional point locations, a value representing vertical position through either height or depth;
  • metadata associated with geographic point location(s) (ISO 19115)

Annex H details the ISO standard for text string representation of point location.

H.6 Format

H.6.1 Elements shall be combined in a point location string in the following sequence:

a) Latitude

b) Longitude

c) if represented, height or depth

d) Coordinate Reference System identifier


H.6.2 The number of digits for latitude, longitude and height (depth) shall indicate the precision of available data.


H.6.3 There shall be no separator between the elements for latitude, longitude, height (depth) and CRS. NOTE The use of designators "+", "-" and "CRS" preceding the value part of each element permits the recognition of the start of each element and the termination of the previous one.


H.6.4 The point location string shall be terminated. The terminator character shall be a solidus (/), unless otherwise specified in the documentation associated with interchange.

This notation differs from that of vCard, for example.


If ISO 6709 were used, geo values could probably be written as follows.

Examples:
<abbr class="geo" title="+40-075CRSxxxx/">
 Point represented as Degrees
</abbr>

<abbr class="geo" title="+401213.1-0750015.1+2.79CRSxxxx/">
 Point represented as Degrees, minutes, seconds and decimal seconds, with +2.79 as the height or depth as defined through the CRS.
</abbr>
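
To get a feel for the extra parsing this notation would require, here is a rough sketch based only on the Annex H excerpt quoted above. It assumes latitude is written with 2, 4 or 6 integer digits (degrees, degrees+minutes, degrees+minutes+seconds) and longitude with 3, 5 or 7; all names are hypothetical and the draft standard itself should be consulted for the full rules.

```python
import re
from typing import NamedTuple, Optional


class PointLocation(NamedTuple):
    latitude: float          # decimal degrees, north positive
    longitude: float         # decimal degrees, east positive
    height: Optional[float]  # height or depth, as defined through the CRS
    crs: Optional[str]       # identifier following the "CRS" designator


_POINT = re.compile(
    r"^(?P<lat>[+-]\d{2}(?:\d{2}){0,2}(?:\.\d+)?)"
    r"(?P<lon>[+-]\d{3}(?:\d{2}){0,2}(?:\.\d+)?)"
    r"(?P<height>[+-]\d+(?:\.\d+)?)?"
    r"(?:CRS(?P<crs>[^/]+))?/$"
)


def _to_degrees(text: str, degree_digits: int) -> float:
    """Convert a signed D, DM or DMS string to signed decimal degrees."""
    sign = -1.0 if text[0] == "-" else 1.0
    whole, _, fraction = text[1:].partition(".")
    degrees = int(whole[:degree_digits])
    rest = whole[degree_digits:]  # "", "MM" or "MMSS"
    frac = float("0." + fraction) if fraction else 0.0
    if not rest:
        value = degrees + frac
    elif len(rest) == 2:
        value = degrees + (int(rest) + frac) / 60.0
    else:
        value = degrees + int(rest[:2]) / 60.0 + (int(rest[2:]) + frac) / 3600.0
    return sign * value


def parse_point_location(value: str) -> Optional[PointLocation]:
    """Parse a point location string terminated by a solidus."""
    match = _POINT.match(value.strip())
    if match is None:
        return None
    height = match.group("height")
    return PointLocation(
        latitude=_to_degrees(match.group("lat"), 2),
        longitude=_to_degrees(match.group("lon"), 3),
        height=float(height) if height is not None else None,
        crs=match.group("crs"),
    )
```

Compared with the two-float vCard syntax, this is noticeably more work for a parser, which is worth weighing against the benefit of carrying height and CRS.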


Geo Encodings

It is important that whenever location is described, it is done in the most openly interoperable manner. A relatively small number of encodings is needed to meet the needs of a wide range of information communities and users. At http://www.georss.org/ two relatively simple schemas have been published: one for WGS84 latitude/longitude (termed 'simple'), and another that provides for this and also for coordinate reference systems other than WGS84 latitude/longitude, of which there are a multitude. This is an argument for simple encodings to be upwardly compatible with the more sophisticated GML (Geography Markup Language) schema.

ISO 19115

ISO 19115:2003 defines the schema required for describing geographic information and services. It provides information about the identification, the extent, the quality, the spatial and temporal schema, spatial reference, and distribution of digital geographic data.

Categorising locations

Perhaps categorising locations would enable map mashups of microformatted information? For example, show me a map of the nearest 'place of worship'. This fragment from an application schema illustrates a range of place categories: http://www.linz.govt.nz/resources/esa-appl-schema-v1-9-5/esa-46.html#1804

Issues with vCard Applications

See vcard-implementations.

Open Questions

Q: Since many of the components use CSS class names to encode data, it is possible to MIX two different profiles (e.g. hCard and XFN). There are no real constraints on where or how class names are enforced; these are based on the HTML profile, and it is difficult to associate the text within an attribute with a specific profile.

...
<a href="mailto:joe.smith@example.com" class="fn" rel="met">Joe Smith</a>
...

-- Brian Suda

Q: Preserving White space? Should the transforming applications preserve extra white space characters? For example:

<a href="http://mywebsite.com/" class="fn n">
    <span class="given-name">John</span>
    <span class="other-names">Q.</span>
    <span class="family-name">Public</span>
</a>

When transformed into a vCard, the N property will pick apart the span tags and create the value for N, correctly separated by semicolons. The FN property will take a string and simply display it. There are two possible renderings for FN:

John Q. Public

    John
    Q.
    Public

Either the white-space is preserved or it is not. Which should the transforming applications render?

-- Brian Suda

A: The parsing application should follow the white space collapsing rules of the mime type it retrieves. I.e. if it retrieves a "text/html" document, it should do HTML white space collapsing.
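
As a simplified sketch of what that means for the FN example above: runs of HTML white space characters collapse to a single space. The function name is hypothetical, and trimming the ends of the string is an assumption made here for property values, not a full model of browser rendering.

```python
import re

# Space, tab, line feed, carriage return and form feed.
_HTML_WHITESPACE = re.compile(r"[ \t\n\r\f]+")


def collapse_html_whitespace(text: str) -> str:
    """Collapse runs of HTML white space to one space and trim the ends."""
    return _HTML_WHITESPACE.sub(" ", text).strip(" ")
```

Applied to the text content of the example, this yields the first rendering ("John Q. Public") rather than the multi-line one.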

-- Tantek

Many of the questions and answers are relevant to both hCal and hCard.

Q: Would it be appropriate to wrap the name of the vCard owner with <dfn>? This may give the hCard some added semantic value in the XHTML document.

<span class="agent"> 
 <span class="vcard">
  <span class="email">
   <a class="internet" href="mailto:jfriday@host.com">
    <dfn>
       <span class="fn">Joe Friday</span>
    </dfn>
   </a>
  </span>
  <span class="tel">+1-919-555-7878</span>
  <span class="title">Area Administrator, Assistant</span>
 </span>
</span>

-- Ben Ward

Applications

Applications that are hCard aware or can convert hCard to vCard formats.

Copy hCards favelet(s)

  • I think a favelet would work nicely here. When you find a page that is hCard friendly, you click the favelet and you get yourself a vCard. This is done! See X2V in the implementations section of the hCard spec.

Distributed Commentor Icons

  • See using hCards in your blog for an example of hCards used for comment authors (commentors). The system used there, "Gravatars", is a centralized site that serves commentor icons that requires login etc.

What if we gave each commentor the option of hosting their own icon?

A distributed commentor icon implementation could work like this:

  1. Given the URL of a commentor, look for an <address> element with classname of "vcard" at the commentor's URL. The <address> element is supposed to be the contact information for the page (see hCard FAQ for more info), so this makes sense.
  2. Next, look for the first element inside that hcard that has a classname of "logo".
  3. Hopefully that element is an <img>, and if so, use its src to get the commentor's icon.
  4. Presto. You've got distributed commentor icons!
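
The steps above can be sketched roughly as follows, assuming the HTML at the commentor's URL has already been fetched. The class and function names are hypothetical, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from typing import Optional


class CommentorIconFinder(HTMLParser):
    """Finds the first class="logo" element inside <address class="vcard">."""

    def __init__(self) -> None:
        super().__init__()
        self.address_depth = 0  # > 0 while inside <address class="vcard">
        self.done = False
        self.icon_url: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if self.address_depth == 0:
            # Step 1: look for the <address> element with class "vcard".
            if tag == "address" and "vcard" in classes:
                self.address_depth = 1
            return
        if tag == "address":
            self.address_depth += 1
        if "logo" in classes:
            # Step 2: only the first "logo" element counts.
            self.done = True
            # Step 3: use its src only if it is an <img>.
            if tag == "img":
                self.icon_url = attr_map.get("src")

    def handle_endtag(self, tag):
        if tag == "address" and self.address_depth > 0:
            self.address_depth -= 1


def find_commentor_icon(html_text: str) -> Optional[str]:
    finder = CommentorIconFinder()
    finder.feed(html_text)
    finder.close()
    return finder.icon_url
```

The src may be a relative URL, so a real implementation would still need to resolve it against the commentor's URL before fetching the icon.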

Spam prevention

hCard uses mailto: links, and therefore it automatically "inherits" the disadvantage of mailto: links: these links can be easily detected by email spiders (used by spammers).

Email addresses are picked up like any other link crawled by a search engine, and trustworthy crawlers may be deterred from giving weight to these links while indexing by including rel="nofollow" (see rel-nofollow). However, the email spiders that harvest addresses for spam will likely ignore this attribute.

There are ways to prevent email address detection by simple email spiders, while still retaining full compatibility with (X)HTML applications. One common way is to "encode" the "m" of "mail" and the "@" with character entities. It is unwise, however, to follow a convention of always encoding the same specific characters, because email spiders can pick up on this too:

Example of the original link:

<a class="email" href="mailto:john.smith@example.com">john.smith@example.com</a> 

Example of the "encoded" link (with rel-nofollow added):

<a class="e&#109;ail" rel="nofollow" href="&#109;ailto:john.smith&#064;example.com">john.smith&#064;example.com</a>

Simple email spiders which do not do character entity decoding will therefore not be able to find your email address.

Note: Perhaps there are or will be email spiders which can decode entities, so this technique will only help against some (cheap) email spiders. (See also: http://rbach.priv.at/Misc/2005/EmailSpiderTest)

Other prevention methods to consider

  • Using server-side code to implement character entities randomly
  • Displaying the address in a way thought to be only human readable (thus breaking the link):
    • Using an image instead of text (could still be machine readable using OCR)
    • Using human readable text that conveys the need for editing before use (e.g. PLEASE-NO-SPAM_name@example_NO-SPAM.com)
  • Using JavaScript for client-side decryption of an encrypted address (requires JavaScript to be enabled)
  • Pointing to an email form or other URL instead of an email address
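
As one possible take on the first idea in the list (server-side code that applies character entities randomly), here is a small sketch. The function name and the default probability are hypothetical choices, and, as noted above, this only helps against spiders that do not decode entities.

```python
import html
import random
from typing import Optional


def obfuscate_with_entities(
    text: str, probability: float = 0.5, rng: Optional[random.Random] = None
) -> str:
    """Replace a random subset of characters with numeric character references."""
    if rng is None:
        rng = random.Random()
    pieces = []
    for char in text:
        if rng.random() < probability:
            pieces.append("&#%d;" % ord(char))
        else:
            # Still escape markup-significant characters that are left as-is.
            pieces.append(html.escape(char))
    return "".join(pieces)
```

Because a different subset of characters is encoded on each call, there is no fixed pattern (such as "always encode m and @") for a spider to key on, while any (X)HTML application that decodes entities sees the original address.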

Tutorials

  • How to hCard encode entries in Popular blog software.
  • Good reasons to publish your hCard
    • as a business, get people to put you in their address book so they'll find you later
    • as a business with an email list, get people to add you (with email address) to their address book so that your email list works via whitelisting via the address book.

Parsing

See separate hCard parsing page.

Post vCard additions

Some have found vCard to be limiting in terms of the data/properties/fields they want to express in contact information. Some implementations use vCard extensions to express such information.

This section is for documentation of such suggested additions. Note, we will require empirical evidence of actual *real world* examples on the Web of people publishing this information as part of contact information, before considering such additions/extensions.

  • altitude. From hcard-issues.
    • No evidence provided that contact information on the Web publishes this information.
  • vat-number : for VAT numbers of companies, which are used a lot in Europe and they need to be published on Belgian publications (including websites).

Wikipedia's Persondata

Wikipedia's Persondata aligns very closely with hCard, but has additional date and place of birth & death fields. Andy Mabbett 13:02, 28 Jan 2007 (PST)

TODO

  • The hcard-profile needs verification and perhaps a URL for retrieving the actual XMDP, rather than as <pre> text on a wiki page.
  • Complete translating the examples from the vCard spec into hCard, and place them on a separate hCard examples page.
  • Create a "rich" but realistic hCard example, say for example for a salesperson, who wants to put a whole bunch of contact information on their website in order to be found/contacted easily.
  • Provide examples of how to encode instant messaging (IM) accounts. Figure out what the mailto: or aim: URL in hCard would look like in vCard. And take a look at what vCard applications do today with IM addresses.

References

Normative References

Informative References

Other Implementations/Ideas

  • Representing vCard Objects in RDF/XML This could allow conversion of vCard data from XHTML to RDF and from RDF to XHTML
  • It would also be possible to convert XFN and hCard to FoaF and back.


Ambiguous name components

When automatically publishing hCards from pre-existing data, it's not necessarily possible to tell which words in a name map to which hCard properties. When the structure of a name is unknown, it is hard to ensure an automatically published hCard remains valid.

There's currently no easy answer to this.

One implementation suggestion is a 'best-guess' algorithm, something along the lines of:

  1. If the name is one word, attempt implied nickname optimization
  2. If the name is two words, attempt implied n optimization
  3. For three or more words
    1. Perform a lookup against known sub-name combinations (e.g. 'Sarah Jane', 'Vander Wal')
    2. Apply the grammar "given-name additional-name(s) family-name"

The principle behind this suggestion is that it's better to make a good guess and potentially miscategorize an ambiguous name component than to generate an invalid hCard.
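
A rough sketch of such a 'best-guess' algorithm follows. The lookup table contains only the two combinations mentioned above, the type and function names are hypothetical, and the two-word case assumes "given-name family-name" order, which is a simplification of the implied n rules in the spec.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List

# Hypothetical lookup table of known sub-name combinations.
KNOWN_COMPOUNDS: FrozenSet[str] = frozenset({"Sarah Jane", "Vander Wal"})


@dataclass
class NameGuess:
    given_name: str = ""
    additional_names: List[str] = field(default_factory=list)
    family_name: str = ""
    nickname: str = ""


def _merge_compounds(words: List[str], compounds: FrozenSet[str]) -> List[str]:
    """Join adjacent word pairs that form a known sub-name combination."""
    merged, i = [], 0
    while i < len(words):
        pair = " ".join(words[i:i + 2])
        if i + 1 < len(words) and pair in compounds:
            merged.append(pair)
            i += 2
        else:
            merged.append(words[i])
            i += 1
    return merged


def guess_name_parts(
    name: str, compounds: FrozenSet[str] = KNOWN_COMPOUNDS
) -> NameGuess:
    words = name.split()
    if not words:
        return NameGuess()
    if len(words) == 1:
        # Step 1: treat a single word as a nickname that also implies given-name.
        return NameGuess(given_name=words[0], nickname=words[0])
    if len(words) >= 3:
        # Step 3.1: look up known sub-name combinations first.
        words = _merge_compounds(words, compounds)
    # Steps 2 and 3.2: given-name additional-name(s) family-name.
    return NameGuess(
        given_name=words[0],
        additional_names=words[1:-1],
        family_name=words[-1],
    )
```

For example, "Thomas Vander Wal" is guessed as given-name "Thomas" and family-name "Vander Wal", whereas without the lookup "Vander" would end up as an additional-name.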

Accepted Suggestions

Encoding Company data as a Business Card (proposal)

( Accepted: http://microformats.org/wiki/hcard#Organization_Contact_Info )

In the wild there are several hCards that do not currently validate because they are businesses that have omitted the "fn" property in favor of the "org" property.

Proposal: hCards representing a business or organization MUST set fn AND org to the same value. Parsers may then use this equivalence, if detected, to treat an hCard as the contact info for a business or organization rather than an individual.

Note that Apple Address Book supports this semantic when importing vCards.

See the Technorati Contact Info for an example.
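
On the parser side, detecting this equivalence could be as simple as the sketch below. The function name is hypothetical, and comparing after white space normalization is an assumption, since the proposal only says "the same value".

```python
def is_organization_card(fn: str, org: str) -> bool:
    """True if fn and org are non-empty and equal after white space normalization."""
    normalized_fn = " ".join(fn.split())
    normalized_org = " ".join(org.split())
    return bool(normalized_fn) and normalized_fn == normalized_org
```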

Implied "FN and N" Optimization (proposal)

Right now a parser first looks for an "n" element.

Then, if no "n" is present, it looks for an "fn" element from which to imply an "n" element, per the "implied n property" rules in the spec.

BACKGROUND:

Due to the prevalence of the use of "nicknames" or "handles" on the Web, in actual content published on the Web (e.g. authors of reviews), there has been a discussion about adding a "fn" shortcut to the "n" shortcut that used the "nickname" as a fallback.

PROPOSAL:

We should consider adding one more implied optimization after the steps documented above and that is:

If no "fn" is present either, then look for a "nickname" element to use to imply both the "fn", and the "n/given-name", leaving the "n/family-name" as empty.

This would enable "nickname" only hCards for denoting an individual on a website, which is quite common on blogs and reviews published on the Web.
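
The resulting lookup order could be sketched as follows, with an hCard represented as a flat dictionary of property values. That representation and the function name are hypothetical, and the "implied n" step here handles only the plain two-word "given-name family-name" case rather than the full rules in the spec.

```python
from typing import Dict


def resolve_names(props: Dict[str, str]) -> Dict[str, str]:
    """Fill in implied name properties in the order n, then fn, then nickname."""
    resolved = dict(props)
    if "given-name" in resolved or "family-name" in resolved:
        return resolved  # an explicit "n" is present, nothing to imply
    fn = resolved.get("fn", "").strip()
    if fn:
        words = fn.split()
        if len(words) == 2:
            # Simplified "implied n": first word given-name, second family-name.
            resolved["given-name"], resolved["family-name"] = words
        return resolved
    nickname = resolved.get("nickname", "").strip()
    if nickname:
        # Proposed step: nickname implies fn and n/given-name.
        resolved["fn"] = nickname
        resolved["given-name"] = nickname
        resolved["family-name"] = ""
    return resolved
```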


Rejected Suggestions

Suggestion: The use of class="url" on an <a> tag to represent an hCard URL property is redundant. By virtue of the <a> tag you know this is a URL.

Rejected. This is a bad suggestion because although it appears to reduce redundancy and keep things cleaner, it also creates a few problems. Without an explicit note that a given link is a URL property, every <a> tag within a 'vcard' would be considered a URL, for example:

<span class="vcard">
...
<ul class="categories">
<li><a href="http://w3c.org">W3C</a></li>
</ul>
...
</span>

There is no way to "turn-off" the encoding of the W3C URL, whereas if "url" needed to be explicitly listed in the class attribute list, then by NOT listing it you could effectively turn it off.

Related Pages

The hCard specification is a work in progress. As additional aspects are discussed, understood, and written, they will be added. These thoughts, issues, and questions are kept in separate pages.