xfn-to-foaf: Difference between revisions
Revision as of 14:55, 29 April 2008
XFN → FOAF
by Toby Inkster
A number of people have expressed an interest in extracting RDF-like data from XFN and hCard. The problem is that while XFN is interpreted as representing a relationship between two people, it actually encodes a relationship between two URIs.
This page describes a technique for figuring out which people these URIs represent. It is not an attempt to describe a new specification or standard, but rather, a set of best practices. Two algorithms are described: a "high-bandwidth" version which requires web crawling, and a "low-bandwidth" version which uses only the information found on the initial page.
For most of the examples on this page, the following XFN link will be used:
<a rel="friend met" href="http://bob.example.net">Bob Smith</a>
which has been found on Alice Jones' web page at http://alice.example.net.
Determining the subject
- Find the representative hCard for the current page.
- In the high-bandwidth situation, parsers MAY crawl rel="me" links in order to find a "better" representative hCard, where the meaning of "better" is to be defined by the parser itself.
  - (Parsers SHOULD impose a depth limit for crawling.)
- If no hCard for the subject has been found, the subject is a person represented by the following RDF triples:
<#x> a foaf:Person; foaf:page <http://alice.example.net>.
A parser which understands RDFa or other semantics may use additional techniques to determine the subject of the link, but those are beyond the scope of the Microformats wiki.
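The depth limit recommended above can be sketched as a breadth-first crawl. This is only an illustration, not part of the technique itself: the function name `crawl_me_links`, the `me_links` callback (which is assumed to return the rel="me" targets found on a page) and the default limit of 2 are all hypothetical. Fetching and parsing pages is left to the caller.

```python
from collections import deque
from typing import Callable, Iterable, List


def crawl_me_links(start: str,
                   me_links: Callable[[str], Iterable[str]],
                   max_depth: int = 2) -> List[str]:
    """Return the pages reachable from `start` via rel="me" links,
    in breadth-first order, following at most `max_depth` hops.

    `me_links` is a hypothetical callback supplied by the caller; it
    is assumed to return the rel="me" link targets found on a URL.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if depth >= max_depth:
            continue  # depth limit reached: do not follow links from here
        for target in me_links(url):
            if target not in seen:  # avoid loops between mutually linked pages
                seen.add(target)
                order.append(target)
                queue.append((target, depth + 1))
    return order
```

A parser could then look for a representative hCard on each returned page and pick whichever it considers "better".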
Determining the predicate
This is the easiest step.
For rel="me", the predicate is foaf:page.
For other relationships, the local name of the predicate is the same as the rel value, and the namespace URI is defined as http://gmpg.org/xfn/11#. All non-me XFN values are considered to be refinements of foaf:knows. As an example, the fully qualified URIs for the predicates associated with rel="met friend" are:
http://gmpg.org/xfn/11#met
http://gmpg.org/xfn/11#friend
http://xmlns.com/foaf/0.1/knows
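As a rough sketch of the mapping just described, the following Python function turns a rel attribute into a list of predicate URIs. The function name `predicates_for_rel` is hypothetical. Two behaviours are assumptions of this sketch rather than rules from this page: rel tokens are lower-cased before matching, and tokens that are not XFN 1.1 values are silently skipped.

```python
from typing import List

XFN_NS = "http://gmpg.org/xfn/11#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# The XFN 1.1 relationship values other than "me".
XFN_VALUES = frozenset({
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart",
})


def predicates_for_rel(rel: str) -> List[str]:
    """Map a rel attribute value to fully qualified predicate URIs."""
    # dict.fromkeys removes duplicate tokens while keeping their order.
    tokens = list(dict.fromkeys(rel.lower().split()))
    uris = []
    if "me" in tokens:
        uris.append(FOAF_NS + "page")
    known = [t for t in tokens if t in XFN_VALUES]
    uris.extend(XFN_NS + t for t in known)
    if known:
        # Every non-me XFN value is treated as a refinement of foaf:knows.
        uris.append(FOAF_NS + "knows")
    return uris
```

Calling `predicates_for_rel("met friend")` yields the three URIs listed above, in that order.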
Determining the object
- If the link element is a descendant of an element with class="vcard" which is not the representative hCard for the page, then this hCard is taken to represent the person who is the object of the relationship.
- If a non-representative hCard exists on the page with a UID property which is identical to the target of the link element, then this hCard is taken to represent the object of the relationship.
- In the high-bandwidth situation, parsers MAY follow the link target to look for a representative hCard for the object.
  - (Parsers MAY then further follow rel="me" links from http://bob.example.net in order to find a "better" representative hCard for Bob.)
- If no hCard representing the object of the relationship has been found, then the object is taken to be a foaf:Person with a foaf:name corresponding to the link text and, depending on the kind of link provided in the href attribute, a foaf:mbox (for "mailto:" links), foaf:mbox_sha1sum (for "urn:sha1:" URLs), foaf:img (image links, determined by type attribute or HTTP headers, not by file name) or foaf:page (all other links). Our example would generate the following triples:
  <#y> a foaf:Person;
      foaf:page <http://bob.example.net>;
      foaf:name "Bob Smith".
A parser which understands RDFa or other semantics may use additional techniques to determine the object of the link, but those are beyond the scope of the Microformats wiki.
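The choice of property in the final fallback step could be sketched as follows. The function name `object_property` is hypothetical, and the sketch covers only the low-bandwidth case: it inspects the type attribute but does not fetch HTTP headers. Case-insensitive matching of the URI scheme is an assumption of the sketch.

```python
import re
from typing import Optional

# Same test as the regexp /^image\//i applied to the type attribute.
IMAGE_TYPE = re.compile(r"^image/", re.IGNORECASE)


def object_property(href: str, type_attr: Optional[str] = None) -> str:
    """Pick the FOAF property used to describe the link target."""
    lowered = href.lower()
    if lowered.startswith("mailto:"):
        return "foaf:mbox"
    if lowered.startswith("urn:sha1:"):
        return "foaf:mbox_sha1sum"
    # Deliberately no file-extension guessing: only the declared type counts.
    if type_attr and IMAGE_TYPE.match(type_attr.strip()):
        return "foaf:img"
    return "foaf:page"
```

For the running example, `object_property("http://bob.example.net")` gives foaf:page, which matches the triples shown above.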
Example
The following example is assumed to have been found at http://alice.example.net. For simplicity's sake, we assume the low-bandwidth situation.
<html lang="en">
  <title>Alice Jones</title>
  <div class="vcard">
    <h1 class="fn">Alice Jones</h1>
    <p class="adr">
      <span class="locality">Sydney</span>,
      <span class="country-name">Australia</span>.
    </p>
    <p>
      <a href="http://alice.example.com/blog/" rel="me" class="url">
        Alice's Blog
      </a>
    </p>
  </div>
  <h2>Friends &amp; Contacts</h2>
  <ul>
    <li class="vcard">
      <a class="fn url" href="http://bob.example.net" rel="friend met">
        Bob Smith
      </a>
    </li>
    <li>
      <a href="http://carol.example.net" rel="co-worker met">
        Carol Brown
      </a>
    </li>
    <li>
      <a href="http://dave.example.net" rel="friend neighbor met">
        Dave Wong
      </a>
    </li>
    <li>
      <a href="http://eve.example.net" rel="adversary met">
        Eve Ville
      </a>
    </li>
  </ul>
  <address class="vcard">
    Page maintained by
    <a href="http://eve.example.net" class="url uid">Eve Ville</a>.
    Contact
    <a class="email" href="mailto:eve@example.net">eve@example.net</a>
    for corrections. (I know I'm not the most trustworthy of sources.)
  </address>
</html>
The subject of all the XFN links on the page is the hCard for Alice Jones at the top of the page. This is determined in step two of the representative hCard parsing procedure because it contains rel="me".
The next XFN link is the one labelled "Bob Smith". Because the link is part of an hCard, the person described by the hCard is the object of the link.
For the next two XFN links, there exist no hCards that represent the objects. We can gather some information about them from the link element itself: their foaf:name and foaf:page. (Note that FOAF defines foaf:name very loosely, so it's OK if the link text is a nickname.)
Although at first glance the XFN link for Eve Ville looks similar, there is in fact an hCard later on in the page with a UID matching the XFN link target, so using rule #2 for determining the object, we use this hCard as the object of the XFN relationship. Note that "adversary" is not an XFN rel value, so it is beyond the scope of this document.
Possible RDF Output
The following is a possible RDF/XML representation of the information in the example above.
<?xml version="1.0"?>
<rdf:RDF xmlns="http://xmlns.com/foaf/0.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:xfn="http://gmpg.org/xfn/11#"
         xmlns:hcard="urn:ietf:rfc:2426#">

  <!-- Alice -->
  <Person rdf:about="http://alice.example.net">
    <!-- data from Alice's hCard -->
    <hcard:fn>Alice Jones</hcard:fn>
    <hcard:adr>
      <hcard:locality>Sydney</hcard:locality>
      <hcard:country-name>Australia</hcard:country-name>
    </hcard:adr>
    <hcard:url rdf:resource="http://alice.example.com/blog/" />
    <!-- data from Alice's XFN links -->
    <page rdf:resource="http://alice.example.com/blog/" />
    <knows rdf:resource="#Bob" />
    <xfn:met rdf:resource="#Bob" />
    <xfn:friend rdf:resource="#Bob" />
    <knows rdf:resource="#Carol" />
    <xfn:met rdf:resource="#Carol" />
    <xfn:co-worker rdf:resource="#Carol" />
    <knows rdf:resource="#Dave" />
    <xfn:met rdf:resource="#Dave" />
    <xfn:friend rdf:resource="#Dave" />
    <xfn:neighbor rdf:resource="#Dave" />
    <knows rdf:resource="http://eve.example.net" />
    <xfn:met rdf:resource="http://eve.example.net" />
  </Person>

  <!-- Bob, data from hCard -->
  <Person rdf:ID="Bob">
    <hcard:fn>Bob Smith</hcard:fn>
    <hcard:url rdf:resource="http://bob.example.net" />
  </Person>

  <!-- Carol, implied data -->
  <Person rdf:ID="Carol">
    <name>Carol Brown</name>
    <page rdf:resource="http://carol.example.net" />
  </Person>

  <!-- Dave, implied data -->
  <Person rdf:ID="Dave">
    <name>Dave Wong</name>
    <page rdf:resource="http://dave.example.net" />
  </Person>

  <!-- Eve, data from hCard -->
  <Person rdf:about="http://eve.example.net">
    <hcard:fn>Eve Ville</hcard:fn>
    <hcard:url rdf:resource="http://eve.example.net" />
    <hcard:uid rdf:resource="http://eve.example.net" />
    <hcard:email rdf:resource="mailto:eve@example.net" />
  </Person>

</rdf:RDF>
Note that some personal data for contacts is expressed in the FOAF vocabulary, and some information is expressed in vCard/hCard vocabulary. User agents may use OWL or another technique to draw equivalencies between vocabularies, such as taking hcard:fn to be equivalent to foaf:name.
Organisation hCards and XFN
If either the subject or object hCard represents an organisation (rather than a person), the following relationships are meaningless:
- acquaintance
- friend
- child
- parent
- sibling
- spouse
- kin
- crush
- date
- sweetheart
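One way a parser might act on this list is sketched below. The page only says these relationships are meaningless for organisations; dropping them from the output is an assumption of the sketch, as is the function name `meaningful_rels`. How the parser decides that an hCard represents an organisation is left to the caller.

```python
from typing import Iterable, List

# Relationship values listed above as meaningless for organisations.
PERSON_ONLY = frozenset({
    "acquaintance", "friend", "child", "parent", "sibling",
    "spouse", "kin", "crush", "date", "sweetheart",
})


def meaningful_rels(rel_values: Iterable[str],
                    subject_is_org: bool,
                    object_is_org: bool) -> List[str]:
    """Return the rel values that still make sense for this pair."""
    values = list(rel_values)
    if not (subject_is_org or object_is_org):
        return values  # two people: every value is kept
    # At least one side is an organisation: drop person-only values.
    return [v for v in values if v not in PERSON_ONLY]
```

For example, with an organisation as the object, `meaningful_rels(["friend", "met"], False, True)` keeps only "met".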
Inspiration, References
- Re: XFN is getting smoked by FOAF - Toby Inkster, 2008-03-11
- Re: XFN + hCard - Toby Inkster, 2008-03-12
- Re: A (big) problem with XFN: identity of source and target not findable - Toby Inkster, 2008-03-18
- Re: A (big) problem with XFN: identity of source and target not findable - David Janes, 2008-03-18
- Re: A (big) problem with XFN: identity of source and target not findable - Toby Inkster, 2008-03-18
- Re: A (big) problem with XFN: identity of source and target not findable - Roger L Costello, 2008-03-20
- Re: Coding mbox_sha1sum in XFN - Toby Inkster, 2008-04-26
Ideas
Here are some quick ideas I plan on thinking over, expanding upon and then integrating into the main body of text soonish:
- If the URI linked to begins "mailto:" then it is equivalent to foaf:mbox.
- If the URI linked to begins "urn:sha1:" then it is equivalent to foaf:mbox_sha1sum.
- If the <a> element has a type attribute which matches the regexp /^image\//i then it is equivalent to foaf:img. (Note: don't try to match the URI against common file extensions. That is an antipattern.)