From mail at tobyinkster.co.uk Tue Apr 1 01:20:33 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Tue Apr 1 02:01:26 2008 Subject: [uf-discuss] Re: Reviving =?iso-8859-7?q?=A1This_Week_in_Microformats=A2?= References: Message-ID: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> Ben Ward wrote: > Each week a new Wiki page will be created to live-edit the ?This Week?? > post, and everyone is invited to contribute to it. Has this died out already? -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 5 days, 20:39.] Cognition 0.1 Alpha 6 http://tobyinkster.co.uk/blog/2008/03/29/cognition-alpha6/ From davidjanes at blogmatrix.com Tue Apr 1 02:05:44 2008 From: davidjanes at blogmatrix.com (David Janes) Date: Tue Apr 1 02:05:50 2008 Subject: =?WINDOWS-1252?Q?Re:_[uf-discuss]_Re:_Revivin?= =?WINDOWS-1252?Q?g_=91This_Week_in_Microformats=92?= In-Reply-To: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> References: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> Message-ID: <21e523c20804010305m75009f08q4954f533d3123cf6@mail.gmail.com> I tried to do my bit! 2008/4/1 Toby A Inkster : > > Ben Ward wrote: > > > Each week a new Wiki page will be created to live-edit the 'This Week?' > > post, and everyone is invited to contribute to it. > > Has this died out already? > > -- > Toby A Inkster BSc (Hons) ARCS > [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] > [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 5 days, 20:39.] > > Cognition 0.1 Alpha 6 > http://tobyinkster.co.uk/blog/2008/03/29/cognition-alpha6/ > > > > > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > -- David Janes Founder, BlogMatrix http://www.blogmatrix.com http://www.onaswarm.com http://www.onamine.com From davidjanes at blogmatrix.com Tue Apr 1 02:16:25 2008 From: davidjanes at blogmatrix.com (David Janes) Date: Tue Apr 1 02:16:27 2008 Subject: =?WINDOWS-1252?Q?Re:_[uf-discuss]_Re:_Revivin?= =?WINDOWS-1252?Q?g_=91This_Week_in_Microformats=92?= In-Reply-To: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> References: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> Message-ID: <21e523c20804010316k29550317w259d379c98102f89@mail.gmail.com> 2008/4/1 Toby A Inkster : > Ben Ward wrote: > > > Each week a new Wiki page will be created to live-edit the 'This Week?' > > post, and everyone is invited to contribute to it. > > Has this died out already? May I suggest that this is something that a once-a-week message be sent out, say the day before it's to be published. Regards, etc... David -- David Janes Founder, BlogMatrix http://www.blogmatrix.com http://www.onaswarm.com http://www.onamine.com From lists at ben-ward.co.uk Tue Apr 1 03:00:18 2008 From: lists at ben-ward.co.uk (Ben Ward) Date: Tue Apr 1 03:00:25 2008 Subject: =?WINDOWS-1252?Q?Re:_[uf-discuss]_Re:_Reviving_=91This_Week_in_M?= =?WINDOWS-1252?Q?icroformats=92?= In-Reply-To: <21e523c20804010305m75009f08q4954f533d3123cf6@mail.gmail.com> References: <1cb9c5-mdd.ln1@ophelia.g5n.co.uk> <21e523c20804010305m75009f08q4954f533d3123cf6@mail.gmail.com> Message-ID: <669F09BD-2F61-45BE-83A6-BDE64CF7C34D@ben-ward.co.uk> On 1 Apr 2008, at 11:05, David Janes wrote: > I tried to do my bit! > > 2008/4/1 Toby A Inkster : >> >> Ben Ward wrote: >> >>> Each week a new Wiki page will be created to live-edit the 'This >>> Week?' >>> post, and everyone is invited to contribute to it. >> >> Has this died out already? Argh, sorry! This has been posted now, combined together from the previous two drafts. THANK YOU to everyone who's been putting effort into keeping the drafts alive. I've created a new draft page for next weeks entry at http:// microformats.org/wiki/this-week-2008-03-31 B From gordon at onlinehome.de Tue Apr 1 03:09:08 2008 From: gordon at onlinehome.de (Gordon) Date: Tue Apr 1 03:09:16 2008 Subject: [uf-discuss] hCardMapper v0.96 Message-ID: <47F217D4.1080302@onlinehome.de> Hi everyone, I have released a new version of the hCardMapper script. The hCardMapper now has much better support for Json returned by Optimus, ufXtract and hKit parsers. Demo (with mofo parser) and download at http://lib.omnia-computing.de/hcardmapper Thanks to everyone who sent feedback to me, especially to Matthias Pfefferle, who made the script into a Wordpress plugin, which can be found at http://svn.wp-plugins.org/hcard-commenting/branches/ Unfortunately, all Microformat parsers yield different results when it comes to representing hCards in Json. None follows the jCard standard suggested at http://microjson.org/wiki/JCard. This makes mapping the Json onto the form fields much more difficult than it could be and bloats the script. I was wondering if there is currently any effort to further standardize the representation of Microformats in Json and to have Microformat parsers implement them. Can anyone give me a clue about this please? Thanks and Cheers, Gordon From mail at tobyinkster.co.uk Tue Apr 1 06:50:11 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Tue Apr 1 07:01:23 2008 Subject: [uf-discuss] Re: hCardMapper v0.96 References: <47F217D4.1080302@onlinehome.de> Message-ID: <3mu9c5-eqe.ln1@ophelia.g5n.co.uk> Gordon wrote: > Unfortunately, all Microformat parsers yield different results when it > comes to representing hCards in Json. None follows the jCard standard > suggested at http://microjson.org/wiki/JCard. The suggestion at that page defined new terms for various VCARD properties. For example, "postal-code" becomes "postalCode". (Yes, I do realise that hyphenated names are more difficult to use as JSON keys in Javascript.) There is no pattern to how these new terms are defined. e.g. the example above drops the hyphen and adopts camelCase, but "given-name" apparently becomes "given", and "adr" becomes "address". With these inconsistencies in naming, the only way an author could implement jCard would be if there was a full table mapping between hCard and jCard terms. There is no such table on that page -- authors need to make guesses. Whatsmore, in the example given, "tel" takes a single string as a value, whereas surely it should be an array? People can have multiple phone numbers. Ditto the single string for "email" and the single object for "address". If these issues could be addressed, I'd be happy to work on a jCard output module for Cognition. -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 6 days, 1:55.] Cognition 0.1 Alpha 6 http://tobyinkster.co.uk/blog/2008/03/29/cognition-alpha6/ From pfefferle at gmail.com Tue Apr 1 08:05:32 2008 From: pfefferle at gmail.com (Matthias Pfefferle) Date: Tue Apr 1 08:05:37 2008 Subject: [uf-discuss] hCardMapper v0.96 In-Reply-To: References: Message-ID: ...but withal that wrong naming of the jCard attributes, microJSON is a good idea to solve this given problem. We should use the started wiki page: http://microformats.org/wiki/json to find the best mapping between the HTML- and JSON version of a Microformat. From pfefferle at gmail.com Tue Apr 1 07:55:53 2008 From: pfefferle at gmail.com (Matthias Pfefferle) Date: Tue Apr 1 08:21:04 2008 Subject: [uf-discuss] Re: hCardMapper v0.96 Message-ID: ...but withal that wrong naming of the jCard attributes, microJSON is a good idea to solve this given problem. We should use the started wiki page: http://microformats.org/wiki/json to find the best mapping between the HTML- and JSON version of a Microformat. From gordon at onlinehome.de Tue Apr 1 08:49:54 2008 From: gordon at onlinehome.de (Gordon) Date: Tue Apr 1 08:55:01 2008 Subject: [uf-discuss] Re: hCardMapper v0.96 In-Reply-To: <3mu9c5-eqe.ln1@ophelia.g5n.co.uk> References: <47F217D4.1080302@onlinehome.de> <3mu9c5-eqe.ln1@ophelia.g5n.co.uk> Message-ID: <47F267B2.2070304@onlinehome.de> Toby A Inkster schrieb: > Gordon wrote: > > >> Unfortunately, all Microformat parsers yield different results when it >> comes to representing hCards in Json. None follows the jCard standard >> suggested at http://microjson.org/wiki/JCard. >> > > The suggestion at that page defined new terms for various VCARD > properties. For example, "postal-code" becomes "postalCode". (Yes, I do > realise that hyphenated names are more difficult to use as JSON keys in > Javascript.) > > There is no pattern to how these new terms are defined. e.g. the example > above drops the hyphen and adopts camelCase, but "given-name" apparently > becomes "given", and "adr" becomes "address". With these inconsistencies > in naming, the only way an author could implement jCard would be if there > was a full table mapping between hCard and jCard terms. There is no such > table on that page -- authors need to make guesses. > > Whatsmore, in the example given, "tel" takes a single string as a value, > whereas surely it should be an array? People can have multiple phone > numbers. Ditto the single string for "email" and the single object for > "address". > > If these issues could be addressed, I'd be happy to work on a jCard output > module for Cognition. > > Good call. I'd say the most obvious solution would be to make all properties that can have multiple occurences into plurals, so "nickname" becomes "nicknames", "email" becomes "emails" and so on. Properties that appear multiple times but contain a simple datatype should be Arrays of that datatype, so nicknames is an Array of Strings, while eMails is an Array of Objects, because an email has a type and a value. I'd suggest using an Array to hold this Object even if there is just one Object inside. For singular properties I suggest singular naming, so fn stays fn. Singular properties with a complex datatype, like n hold one corresponding Object. I suggest we camelize any hypenated properties, though I wouldn't mind underscoring them either. I have prepared a quick diagram that might help to headstart this: http://lib.omnia-computing.de/images/JCard.png From mail at tobyinkster.co.uk Tue Apr 1 13:18:17 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Tue Apr 1 13:18:42 2008 Subject: [uf-discuss] Re: hCardMapper v0.96 Message-ID: <29F31AA1-FC21-4188-A4FF-674B60AEEDE3@tobyinkster.co.uk> Gordon wrote: > I have prepared a quick diagram that might help to headstart this: > http://lib.omnia-computing.de/images/JCard.png The "type" subproperties for email, tel and adr should be arrays. Also, note that all the subproperties of "adr" may be plural. http://microformats.org/discuss/mail/microformats-dev/2008-January/ thread.html#421 -- Toby A Inkster From adeklein at gmail.com Tue Apr 1 14:42:46 2008 From: adeklein at gmail.com (Albert de Klein) Date: Tue Apr 1 14:42:50 2008 Subject: [uf-discuss] Monkeyformats - adding microformats to third-party web sites with GreaseMonkey Message-ID: Hi All, For a Dutch online magazine for web workers I've written an article (http://naarvoren.nl/artikel/monkeyformats/) in which I explain how you can add microformats to third-party web sites with GreaseMonkey. These client-side microformats work well with the Operator add-on and let you do microformatish things with websites that do not yet support microformats but might prove very useful as microformat examples. In the Netherlands microformat adoptation by popular sites hasn't gained momentum yet as it has in US. I've put up some example GreaseMonkey scripts on the Userscripts.org repository with the tag monkeyformats (http://userscripts.org/tags/monkeyformats) to make them easily retrievable. Do any of you know any popular English sites that lack microformats but might be great microformat examples that could be added to the monkeyformat collection? Regards, Albert de Klein From gordon at onlinehome.de Tue Apr 1 15:13:59 2008 From: gordon at onlinehome.de (Gordon) Date: Tue Apr 1 15:14:08 2008 Subject: [uf-discuss] Re: hCardMapper v0.96 In-Reply-To: <29F31AA1-FC21-4188-A4FF-674B60AEEDE3@tobyinkster.co.uk> References: <29F31AA1-FC21-4188-A4FF-674B60AEEDE3@tobyinkster.co.uk> Message-ID: <47F2C1B7.9060807@onlinehome.de> Toby A Inkster schrieb: > Gordon wrote: > >> I have prepared a quick diagram that might help to headstart this: >> http://lib.omnia-computing.de/images/JCard.png > > The "type" subproperties for email, tel and adr should be arrays. > > Also, note that all the subproperties of "adr" may be plural. > http://microformats.org/discuss/mail/microformats-dev/2008-January/thread.html#421 > > Thanks for pointing that out. I second Andy Mabbet's concern about allowing multiple instances for fields like e.g. country not making sense. But if this is what was agreed upon, let's stick to it. Some more questions: In one of my earlier drafts for this diagram, I had implemented givenName and familyName as multiple. I am not sure why I did this and cannot find proper reference in the RFC. Can these be multiple? Can geo be multiple? It is listed with the singular properties, but if geo is meant to reference a place, shouldn't geo be part of the Address it references? It's probably more of a vCard to hCard to jCard legacy problem, but any opinions on that? I have updated the diagram. In case anyone wants to edit in on their own, there is also a zip file that contains the diagram in dia, svg and visio format. Diagram: http://lib.omnia-computing.de/images/JCard.png Zip: http://lib.omnia-computing.de/files/jcard.zip From mdagn at spraci.com Tue Apr 1 20:36:06 2008 From: mdagn at spraci.com (Michael MD) Date: Tue Apr 1 20:36:11 2008 Subject: [uf-discuss] hCardMapper v0.96 References: Message-ID: <003b01c8947b$11b59420$116bacca@COMCEN> > ...but withal that wrong naming of the jCard attributes, microJSON is > a good idea to solve this given problem. > > We should use the started wiki page: http://microformats.org/wiki/json > to find the best mapping between the HTML- and JSON version of a > Microformat. interesting ... I was looking ages ago for any work related to representing microformats as data structures wrote a little perl microformat parser for use with user generated html snippets a while back and never quite made up my mind about its output format. one of the versions of it ended up spitting out something quite similar to this except that it always had arrays for the leaf elements. From mail at tobyinkster.co.uk Wed Apr 2 00:08:31 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Wed Apr 2 00:09:05 2008 Subject: [uf-discuss] (no subject) Message-ID: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> Gordon wrote: > I second Andy Mabbet's concern about allowing multiple instances for > fields like e.g. country not making sense. RFC 2426 says, 'Where it makes semantic sense, individual text components can include multiple text values (e.g., a "street" component with multiple lines) separated by the COMMA character (ASCII decimal 44).' It doesn't specify which fields it makes sense for, and which it does not ? that is left to the judgement of the vCard producers and consumers. Here's an example with multiple countries, which makes sense:
The Scottish Parliament,
Edinburgh,
EH99 1SP,
Scotland,
United Kingdom
Other countries that lie within the United Kingdom include England and Wales, and some might argue Northern Ireland and perhaps even Cornwall. The concept of countries within countries could also be applied to the United Arab Emirates, and Bosnia and Herzegovina, but not to say, Australia or the United States where although the member states enjoy similar levels of autonomy, few would seriously describe them as countries. > In one of my earlier drafts for this diagram, I had implemented > givenName and familyName as multiple. I am not sure why I did this and > cannot find proper reference in the RFC. Can these be multiple? Similar situation. I can't think of any reason for including multiple given-names and family-names in a vCard, but perhaps it might be common in some other cultures ? I don't know. Still, there's no harm in supporting multiple instances. The hCard spec says multiple instances are allowed. > Can geo be multiple? Not according to the hCard spec. The RFC isn't particularly explicit. My parser supports multiple geos within a vCard. But that's just me being lazy ? easier to pull them all into an array than to choose one to keep. > It is listed with the singular properties, but if geo is meant to > reference a place, shouldn't geo be part of the Address it references? I can't imagine that many parsers would have a problem with geo being nested within the address, but in vCard terms its a separate property, so once parsed, should be represented separately from the address. > http://lib.omnia-computing.de/images/JCard.png You have "n" as being "0..*". Each hCard should have exactly one "n" property, Though it may be implied rather than explicit, and it may be empty in the case of hCards for organisations. -- Toby A Inkster From gordon at onlinehome.de Wed Apr 2 07:03:00 2008 From: gordon at onlinehome.de (Gordon) Date: Wed Apr 2 07:03:09 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> Message-ID: <47F3A024.5070306@onlinehome.de> Toby A Inkster schrieb: > Gordon wrote: > >> I second Andy Mabbet's concern about allowing multiple instances for >> fields like e.g. country not making sense. > > RFC 2426 says, 'Where it makes semantic sense, individual text > components can include multiple text values (e.g., a "street" > component with multiple lines) separated by the COMMA character (ASCII > decimal 44).' It doesn't specify which fields it makes sense for, and > which it does not ? that is left to the judgement of the vCard > producers and consumers. > > Here's an example with multiple countries, which makes sense: > >
> The Scottish Parliament, >
> Edinburgh,
> EH99 1SP,
> Scotland,
> United Kingdom >
> > Other countries that lie within the United Kingdom include England and > Wales, and some might argue Northern Ireland and perhaps even > Cornwall. The concept of countries within countries could also be > applied to the United Arab Emirates, and Bosnia and Herzegovina, but > not to say, Australia or the United States where although the member > states enjoy similar levels of autonomy, few would seriously describe > them as countries. > >> In one of my earlier drafts for this diagram, I had implemented >> givenName and familyName as multiple. I am not sure why I did this and >> cannot find proper reference in the RFC. Can these be multiple? > > > Similar situation. I can't think of any reason for including multiple > given-names and family-names in a vCard, but perhaps it might be > common in some other cultures ? I don't know. Still, there's no harm > in supporting multiple instances. The hCard spec says multiple > instances are allowed. > >> Can geo be multiple? > > Not according to the hCard spec. The RFC isn't particularly explicit. > My parser supports multiple geos within a vCard. But that's just me > being lazy ? easier to pull them all into an array than to choose one > to keep. > >> It is listed with the singular properties, but if geo is meant to >> reference a place, shouldn't geo be part of the Address it references? > > I can't imagine that many parsers would have a problem with geo being > nested within the address, but in vCard terms its a separate property, > so once parsed, should be represented separately from the address. > > > http://lib.omnia-computing.de/images/JCard.png > > You have "n" as being "0..*". Each hCard should have exactly one "n" > property, Though it may be implied rather than explicit, and it may be > empty in the case of hCards for organisations. > Thanks Toby. I have updated the zip file and the diagram now. One more thing: I am unsure about the plural naming convention I suggested. Apart from indicating if the value of a property can hold multiple values, I don't see any added benefit right now. But I do see an unnecessary level of complexity when it comes to parsing an hCard into a jCard. You would need to know how to properly inflect a property, e.g. you cannot simply add an s to "honorific-prefix". And whats the plural of adr anyway? Same for converting plurals back into singulars. And since we want to represent an hCard anyway, why should we deviate from the property names of an hCard, if it's not adding any worthwhile benefit. So, back to singular? From mail at ciaranmcnulty.com Wed Apr 2 08:38:37 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Wed Apr 2 08:38:47 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: <47F3A024.5070306@onlinehome.de> References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> Message-ID: On Wed, Apr 2, 2008 at 4:03 PM, Gordon wrote: > One more thing: I am unsure about the plural naming convention I suggested. > Apart from indicating if the value of a property can hold multiple values, > I don't see any added benefit right now. But I do see an unnecessary level > of complexity when it comes to parsing an hCard into a jCard. You would need > to know how to properly inflect a property, e.g. you cannot simply add an s > to "honorific-prefix". And whats the plural of adr anyway? Same for > converting plurals back into singulars. And since we want to represent an > hCard anyway, why should we deviate from the property names of an hCard, if > it's not adding any worthwhile benefit. So, back to singular? Totally agreed, the only change to the hCard field names should be from hyphen-separated to camelCase (this transform is used in stylesheet attribute names in DOM so may be formally specified somewhere if it seems non-obvious enough we need to lay it out. There's very little utility in having the field names reflect their plurality - a parser needs to know how to parse each field separately anyhow, for most real-world applications. Keeping vCard names is more useful than changing them for very little gain. -Ciaran McNulty From danbri at danbri.org Wed Apr 2 09:13:23 2008 From: danbri at danbri.org (Dan Brickley) Date: Wed Apr 2 09:13:32 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> Message-ID: <47F3BEB3.2030907@danbri.org> Hi all Ciaran McNulty wrote: > On Wed, Apr 2, 2008 at 4:03 PM, Gordon wrote: > >> One more thing: I am unsure about the plural naming convention I suggested. >> >> > [...] > > There's very little utility in having the field names reflect their > plurality - a parser needs to know how to parse each field separately > anyhow, for most real-world applications. Keeping vCard names is more > useful than changing them for very little gain. > > +1 ...keeping the naming the same (minus punctuation fixes for .js syntax) makes sense. I'm happy to see this discussion, as it helps decouple the implicit abstract data-model of Microformats from the specifics of their use as an HTML notation. This will make interop with FOAF/RDF system easier for all of us, I'm sure. For going from FOAF to Microformats btw, I'm thinking to add annotations into the FOAF schema that indicate the best corresponding Microformat term. More on that next week, hopefully. For another JSON representation of Microformat (and FOAF/RDF) data, don't forget the Google Social Graph API: http://code.google.com/apis/socialgraph/docs/ Also (while we're cataloguing JSON idioms for this stuff) any mapping of Microformats into RDF (eg. using GRDDL) will allow it so show up using the SPARQL resultset format, see http://www.w3.org/TR/rdf-sparql-json-res/ ... this is best thought of by analogy with SQL: it captures tables of query results, with records/rows and named columns for fields. So the basic JSON structure there remains the same, regardless of which domain vocabulary is being used. I'm not advocating for either here, just circulating related work... All the best, Dan -- http://danbri.org/ From gordon at onlinehome.de Wed Apr 2 14:36:17 2008 From: gordon at onlinehome.de (Gordon) Date: Wed Apr 2 14:36:27 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: <47F3BEB3.2030907@danbri.org> References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> <47F3BEB3.2030907@danbri.org> Message-ID: <47F40A61.9010109@onlinehome.de> Dan Brickley schrieb: > +1 ...keeping the naming the same (minus punctuation fixes for .js > syntax) makes sense. > Ok. New Diagram and Zip are online now. To sum it up: 1. A jCard has the same properties as any hCard/vCard. 2. Hyphenated properties from hCard/vCard drop the hyphen and use camel-case instead. Correct: given-name becomes givenName. So much for naming conventions. On to structure. 3. All singular instance properties use only their correspong datatype for value. Correct: n = { givenName: 'John', familyName: 'Doe'} fn = 'John Doe' Wrong: n = [ {givenName: 'John'}, {familyName: 'Doe'} ] fn = {value: 'John Doe'} 4. All properties that may have multiple instances use an Array of their corresponding datatype. Correct: nickname = ['Rio Demonhog', 'Gogo Fiasco'] 5. Properties that are not set must be omitted. Example: email = [ {type: ['pref'], value: 'foo@example.com'}, {value: 'bar@example.com'} ] I am undecided on disallowing or forcing reduction of Arrays and Objects, when they only hold a single value. There is definitely an advantage in disallowing reduction, as it reduces authoring choices and thus complexity. So if we want to keep it simple, I guess disallowing is the right choice. On the other hand, JavaScript is loosely typed and we could come up with a much more compact jCard if we force reduction. This would also make sense in conjunction with 5. The only thing that doesn't make sense to me is allowing both. Would you want either 6a. Enclosing Arrays or Objects must NOT be reduced. Wrong: email = 'bar@example.com' Correct: email = [{value: 'bar@example.com'}] or 6b. Enclosing Arrays or Objects MUST be reduced Wrong: email = [{value: 'bar@example.com'}] Correct: email = 'bar@example.com' and Correct: email = [ {type: ['pref'], value: 'foo@example.com'}, 'foobar@example.com' ] and Wrong: nickname = ['Gogo Fiasco'] Correct: nickname = 'Gogo Fiasco' and last rule 7. A type property must not be the only property of an Object. Wrong: email = [{type: 'pref'}] This would be it. So, what do you say? Cheers, Gordon From thom at ts0.com Wed Apr 2 15:12:07 2008 From: thom at ts0.com (Thom Shannon) Date: Wed Apr 2 15:12:12 2008 Subject: [uf-discuss] Monkeyformats - adding microformats to third-party web sites with GreaseMonkey In-Reply-To: References: Message-ID: <47F412C7.6030009@ts0.com> Here's my contribution! http://userscripts.org/scripts/show/24705 Microformat marked up BT directory searches. Should only have taken me 5 mins but thanks to the crappiness and inconsistencies of BT it took all evening! I mean why does one search type say Tel: and the other Telephone: !?!? Enjoy. Albert de Klein wrote: > Hi All, > > For a Dutch online magazine for web workers I've written an article > (http://naarvoren.nl/artikel/monkeyformats/) in which I explain how > you can add microformats to third-party web sites with GreaseMonkey. > These client-side microformats work well with the Operator add-on and > let you do microformatish things with websites that do not yet support > microformats but might prove very useful as microformat examples. In > the Netherlands microformat adoptation by popular sites hasn't gained > momentum yet as it has in US. > > I've put up some example GreaseMonkey scripts on the Userscripts.org > repository with the tag monkeyformats > (http://userscripts.org/tags/monkeyformats) to make them easily > retrievable. > > Do any of you know any popular English sites that lack microformats > but might be great microformat examples that could be added to the > monkeyformat collection? > > Regards, > Albert de Klein > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > > From dmitry.baranovskiy at gmail.com Wed Apr 2 15:54:26 2008 From: dmitry.baranovskiy at gmail.com (Dmitry Baranovskiy) Date: Wed Apr 2 15:54:37 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: <47F40A61.9010109@onlinehome.de> References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> <47F3BEB3.2030907@danbri.org> <47F40A61.9010109@onlinehome.de> Message-ID: Hi, Very interesting conversation. I hope we will come to some JSON standard at the end of it and promise in forward to implement it in Optimus. It is a fact that JavaScript use camel-case conversion for style properties. Another fact is that it is very BAD and STUPID thing in JavaScript. In addition it is very inconvenient. So, I am really against this. Listen to Douglas Crockford at http://developer.yahoo.com/yui/theater/ if my opinion is not good enough for you. We have a standard for names, so, please, don?t invent the new one. n = {"given-name": "John", "family-name": "Doe"} Consistency. Agreed with everything else. On 03/04/2008, at 9:36 AM, Gordon wrote: > > > 3. All singular instance properties use only their correspong > datatype for value. > > Correct: > n = { givenName: 'John', familyName: 'Doe'} > fn = 'John Doe' > > Wrong: > n = [ {givenName: 'John'}, {familyName: 'Doe'} ] > fn = {value: 'John Doe'} From mail at tobyinkster.co.uk Thu Apr 3 00:06:30 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 3 00:06:42 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 Message-ID: <319C13D4-20CE-45E7-9780-F83F2027F83A@tobyinkster.co.uk> Gordon wrote; > I am undecided on disallowing or forcing reduction of Arrays and > Objects, when they only hold a single value. There is definitely an > advantage in disallowing reduction, as it reduces authoring choices > and > thus complexity. So if we want to keep it simple, I guess > disallowing is > the right choice. On the other hand, JavaScript is loosely typed > and we > could come up with a much more compact jCard if we force reduction. I say disallow. Allowing this would increase complexity at the consumer side. Forcing it increases the complexity at the producer side as well. If people are worried about compactness, they should use "Content-Encoding: gzip". I'm pretty close to an implementation for jCard. So far, missing 'adr', 'org' and 'agent', but have implemented all other properties, plus vCard 'X-GENDER' as 'xGender' in jCard. Right now my JSON violates rule #5: there's plenty of empty arrays. "key": [], "mailer": [], etc I hope to get that fixed too. But otherwise, nearly there! -- Toby A Inkster From bnowack at semsol.com Thu Apr 3 00:39:20 2008 From: bnowack at semsol.com (Benjamin Nowack) Date: Thu Apr 3 00:39:23 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 Message-ID: On Apr 2, 2008, at 15:54:26 PST, Dmitry Baranovskiy wrote: [...] >We have a standard for >names, so, please, don?t invent the new one. > >n = {"given-name": "John", "family-name": "Doe"} > >Consistency. I second that. Based on inspiring conversations at SemanticCamp London I started working on a test suite for microformats. It uses RDF and SPARQL internally, so that pass/fail checks cann be done on a semantic, not a syntactic level. One important requirement we identified back then was that a microformats parser should not have to produce URI-qualified snippets, but rather some intermediate, a bit more simple/compact format. The "microjson" idea sounds ideal for that. Mapping this sort of JSON structure to RDF (and vice-versa) should be straight-forward (no need to learn any new specs). It should be possible to semi-automate the conversion, though, so I'd vote for keeping the names consistent accross all formats, too (although the camel-case conversion could be implemented as well). An agreed-on intermediate format/model for parsed microformats would allow me/us to directly re-use existing query and/or validation machinery for an easy-to-use and extensible test suite. I don't have a strong opinion on the [] vs. "" vs. {} (on RDF's triple level, things get unified anyway). But a predictable JSON structure for "instance data" would really simplify things a lot. Best, Benji -- Benjamin Nowack http://bnode.org/ > >Agreed with everything else. > >On 03/04/2008, at 9:36 AM, Gordon wrote: >> >> >> 3. All singular instance properties use only their correspong >> datatype for value. >> >> Correct: >> n = { givenName: 'John', familyName: 'Doe'} >> fn = 'John Doe' >> >> Wrong: >> n = [ {givenName: 'John'}, {familyName: 'Doe'} ] >> fn = {value: 'John Doe'} > From mail at ciaranmcnulty.com Thu Apr 3 00:48:33 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 3 00:48:35 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> <47F3BEB3.2030907@danbri.org> <47F40A61.9010109@onlinehome.de> Message-ID: It's important to consider who jCard is 'for'. In Microformats we put the publisher first, and shift as much complexity to the parser as we think we can get away with. For jCard, there are precious few people publishing in JSON - this format is either going to be optimised for: a) People writing hCard->jCard converters b) People who want to parse jCard (including parsing hCards via a converter). In my opinion, to promote people using uF more in their projects (I'm thinking things like hCardMapper) we should squarely aim the format at group b, and shift as much complexity as we can into the converter. What I propose is that a lot of our defaulting rules, designed to make things easier for HTML authors, should not be present in jCard. As an example, hCard has rules for parsing fn values into 'n' values - a step designed to promote uptake in HTML authors. Conversely, I would propose that 'n' values are mandatory in jCard to make things easier for parsers, and that converters be responsible for applying the defaulting rules. -Ciaran McNulty From mail at ciaranmcnulty.com Thu Apr 3 00:54:52 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 3 00:54:56 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages Message-ID: As a tangential note from the discussion about a standardised JSON format, it would be useful to be able to represent uF data as datastructures in other programming languages. hKit, I know, can return a serialised PHP object which is very useful for those of us in that world. The simplest way of 'specifying' such a structure would, IMO be to say that it should be the equivalent of the JSON uF format, as if it had been run through the json_decode() [1] function (which by default returns an object but can be made to use an associative array). Do any other of the PHP types have a strong opinion about this? I realise that at the moment it's a side issue. Is there an obvious representation in any other programming languages? -Ciaran McNulty [1] http://uk2.php.net/json From gordon at onlinehome.de Thu Apr 3 02:07:20 2008 From: gordon at onlinehome.de (Gordon) Date: Thu Apr 3 02:07:26 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages In-Reply-To: References: Message-ID: <47F4AC58.2040304@onlinehome.de> Ciaran McNulty schrieb: > Is there an obvious representation in any other programming languages? > Ruby doesn't seem to have dedicated JSON encoding/decoding methods, but Ruby on Rails does[1]. Rails seems to first convert[2] JSON to YAML, so the resulting datatypes are dependant on how YAML[3] gets represented in Ruby. [1] http://api.rubyonrails.org/classes/ActiveSupport/JSON.html [2] http://www.noobkit.com/show/ruby/rails/rails-edge/activesupport-edge/activesupport/json/convert_json_to_yaml.html [3] http://www.ruby-doc.org/stdlib/libdoc/yaml/rdoc/index.html From mail at tobyinkster.co.uk Thu Apr 3 02:16:22 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 3 02:17:04 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in PHP/other languages Message-ID: <7526167F-D4C2-4FD6-8A68-0015AB9D3A65@tobyinkster.co.uk> Ciaran McNulty wrote: > Do any other of the PHP types have a strong opinion about this? I > realise that at the moment it's a side issue. Agree 100%. Of course the json_decode() function was only added to PHP in (IIRC) version 5.1, but that needn't effect code which is simply aiming at outputting a structure that is equivalent to the output from json_decode. > Is there an obvious representation in any other programming languages? Could be done in the same way in Perl, using the "from_json" function. -- Toby A Inkster From danbri at danbri.org Thu Apr 3 02:43:16 2008 From: danbri at danbri.org (Dan Brickley) Date: Thu Apr 3 02:43:22 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: <319C13D4-20CE-45E7-9780-F83F2027F83A@tobyinkster.co.uk> References: <319C13D4-20CE-45E7-9780-F83F2027F83A@tobyinkster.co.uk> Message-ID: <47F4B4C4.5000601@danbri.org> Toby A Inkster wrote: > plus vCard 'X-GENDER' as 'xGender' in jCard. Does it make sense to carry through that 'x'? In http://tools.ietf.org/html/draft-resnick-vcarddav-vcardrev-01#section-5.2.9 I see: [[ 5.2.9. GENDER Purpose: To specify the gender of the object the vCard represents. Value type: A single text value. Special notes: The value "M" stands for male while "F" stands for female. Example: GENDER:F ]] FWIW doesn't seem to rule *out* additional values, which makes it consistent with the FOAF design (although we use 'male' and 'female' as the two most well-known values for foaf:gender). While out of scope for this thread, I'm very interested to have collaboration between FOAF and Microformat approaches on representing gender; both in the vocabulary sense, but also in making nice accessible form controls that allow type-in values as well as a constrained list. Nearby in the Web: https://www.ietf.org/mailman/listinfo/vcarddav "This list is for discussion of revisions to vCard (RFC 2426) and the cardDAV protocol draft-daboo-carddav." cheers, Dan -- http://danbri.org/ From mail at tobyinkster.co.uk Thu Apr 3 03:06:22 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 3 03:06:31 2008 Subject: [uf-discuss] hCard -> jCard implementation Message-ID: <8CDBD445-045A-4523-9D04-C0AFBBCCFD60@tobyinkster.co.uk> Yes, already. This is based on the soon-to-be-released Cognition-0.1- alpha7. Example input: http://examples.tobyinkster.co.uk/hcard Example output: http://srv.buzzword.org.uk/jcard/examples.tobyinkster.co.uk/hcard% 2523jack Notable notes: * Cognition doesn't just support hCard as input ? it will pick up contact data from, say the W3C PIM vocabulary used with RDFa, or FOAF in eRDF, or chunks of RDF/XML in HTML comments. Basically, if you've got some metadata, it will be found. And if the metadata relates to a person, then it will be exported as jCard. The example input above does indeed include some RDFa goodness. * geo (indeed multiple instances thereof) is supported as a descendent of adr. * altitude, reference-frame and body are supported as sub-properties of geo, roughly as documented on the uf geo-extension-strawman and geo-extension-elevation wiki pages. * Because different terms are used in hCard and vCard, categories are duplicated in the jCard output ? there is a "category" array and a "categories" array containing identical information. * When no "type" is given for "adr", "tel" or "email", default types are explicitly added to the output. List of default types are in hCard spec, section 3.15.2. * There is a tiny buglet with nested vCards for agents ? look carefully at the output and you'll see. Yes, http://srv.buzzword.org.uk/jcard/ can be used to test your own hCards, but bear in mind that it is *very* slow. If anyone is willing to donate hosting to the Cognition project, I'd be eternally grateful. (Well, maybe not eternally, but certainly grateful for at least a month.) Note that %2523 is a double-URL-encoded hash symbol. (Not sure why the double-encoding is necessary, but I haven't been able to get %23 to work on its own.) That is, it's targeting a particular hCard on the page using the id attribute. If you leave the fragment identifier out, an array will be returned, containing jCard objects for each card found on the page ? if there's only one hCard on the page, it will still be in an array. e.g. if you want to test then visit . There are other services too: http://srv.buzzword.org.uk -- Toby A Inkster From mail at tobyinkster.co.uk Thu Apr 3 03:40:17 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 3 04:18:25 2008 Subject: [uf-discuss] collaboration between FOAF and Microformat approaches on representing gender Message-ID: <22DA7617-0C00-4E8C-84F0-FE1713A4604B@tobyinkster.co.uk> Dan Brickley wrote: > Does it make sense to carry through that 'x'? > http://tools.ietf.org/html/draft-resnick-vcarddav- vcardrev-01#section-5.2.9 I've not seen that draft. Looks interesting. Right now, Cognition's extensions to hCard are all "x-" prefixed. It will *parse* them without the "x-", but will include the "x-" in output. Cognitions's hCard extensions are documented here: http://buzzword.org.uk/cognition/uf-plus.html#hcard > I'm very interested to have collaboration between FOAF and > Microformat approaches on representing gender The way that Cognition resolves the differences between FOAF/RDF and hCard is this: 1. RDF requires that each triple has a subject URI. For hCard that subject URI is the hCard's "uid" property. If no UID property exists, a fake one is created using the "fn" and the "ldap:" URI scheme. 2. RDF requires that predicates must be a URI. hCard properties are simply namespaced into "urn:ietf:rfc:2426#". That's pretty much it. hCalendar and hAtom are handled in pretty much the same way. All data gleaned from microformats is just internally represented as RDF triples. Because of the intermediate triple-store which is shared by all of Cognition's parsers (microformats, RDFa, eRDF, GRDDL, etc), when output is being generated (e.g. vCard output), Cognition doesn't make any distinction as to where the data came from. So an hCard without any e-mail addresses might actually pick up (from RDFa, etc) some e- mail addresses when output as vCard. I like to call this "gainy" conversion (in contrast to lossy). For an example of gainy conversion take a look at the hCard to jCard implementation I just posted. In the example output, the mobile phone number (amongst other properties) wasn't included in the input hCard. In terms of gender, Cognition basically just accepts any string. The only special handling for gender is that if the subject has an rdf:type of w3cpim:Male or w3cpim:Female, then the gender string is taken to be "Male" or "Female" respectively. w.r.t. inclusiveness, accepting a free-form string is pretty much the only option. -- Toby A Inkster From mail at ciaranmcnulty.com Thu Apr 3 04:18:36 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 3 04:25:44 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in PHP/other languages In-Reply-To: <7526167F-D4C2-4FD6-8A68-0015AB9D3A65@tobyinkster.co.uk> References: <7526167F-D4C2-4FD6-8A68-0015AB9D3A65@tobyinkster.co.uk> Message-ID: On Thu, Apr 3, 2008 at 11:16 AM, Toby A Inkster wrote: > Agree 100%. Of course the json_decode() function was only added to PHP in > (IIRC) version 5.1, but that needn't effect code which is simply aiming at > outputting a structure that is equivalent to the output from json_decode. I hadn't realised it was so recent - according to php.net it was added in 5.2. It is in the mainstream PHP though so is probably a good one to go with - I'm a little wary of basing a specified behaviour on it though - it's not that unheard of for PHP functions to start behaving differently between point releases! There appears to be a Zend::Json package that can decode() JSON - simple testing makes it appear that it parses the JSON output of hKit identically (when told to return an object rather than an array). I've not seen any other widespread JSON parsing packages - I would have thought PEAR would have one but they don't seem to. -Ciaran McNulty From gordon at onlinehome.de Thu Apr 3 04:20:50 2008 From: gordon at onlinehome.de (Gordon Oheim) Date: Thu Apr 3 04:33:54 2008 Subject: [uf-discuss] Standardized Representation of Microformats in JSON / was: (no subject) & hCardMapper v0.96 In-Reply-To: References: <4243A62B-9FBF-4707-9B42-319BF1FF8F39@tobyinkster.co.uk> <47F3A024.5070306@onlinehome.de> <47F3BEB3.2030907@danbri.org> <47F40A61.9010109@onlinehome.de> Message-ID: <47F4CBA2.40807@onlinehome.de> Dmitry Baranovskiy schrieb: > Hi, > Very interesting conversation. I hope we will come to some JSON > standard at the end of it and promise in forward to implement it in > Optimus. That is great news. > > It is a fact that JavaScript use camel-case conversion for style > properties. Another fact is that it is very BAD and STUPID thing in > JavaScript. In addition it is very inconvenient. So, I am really > against this. Listen to Douglas Crockford at > http://developer.yahoo.com/yui/theater/ if my opinion is not good > enough for you. We have a standard for names, so, please, don?t invent > the new one. > > n = {"given-name": "John", "family-name": "Doe"} > I have considered this back and forth now. From a programmer's perspective, I'd prefer being able to access all properties through their identifier. This is much more convenient to type and produces easier to read sourcecode. But if we solely view jCards as a format for data-interchange, then for the sake of consistency with the standard names you are right. JavaScript allows us to use the standardized names, so let's use them. So +1 for "All Objects in a jCard are used as Associative Arrays and must use strings for property names instead of identifiers." From mail at tobyinkster.co.uk Thu Apr 3 04:33:46 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 3 04:33:54 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in PHP/other languages Message-ID: Ciaran McNulty wrote: > I've not seen any other widespread JSON parsing packages - I would > have thought PEAR would have one but they don't seem to. They did, but it may have been removed. It was compatible with the json_decode function too. I've got a copy of it in one of my SVN repositories if you want it: http://demiblog.svn.sourceforge.net/viewvc/demiblog/trunk/blog/PEAR/ Services/ -- Toby A Inkster From derrick at pallas.us Thu Apr 3 07:20:19 2008 From: derrick at pallas.us (Derrick Lyndon Pallas) Date: Thu Apr 3 07:20:29 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages In-Reply-To: References: Message-ID: <47F4F5B3.7030901@pallas.us> Ciaran McNulty wrote: > As a tangential note from the discussion about a standardised JSON > format, it would be useful to be able to represent uF data as > datastructures in other programming languages. > There seems to be a lot of confusion here about the differences between syntax, structure, and semantics. What is the difference between {"given-name": "John", "family-name": "Doe"} {"family-name": "Doe","given-name": "John"} There is a potential structural difference (e.g., in PHP, where associative arrays have order) but the syntax is the same and (depending on the application) the semantics are probably the same. What's the difference between {"given-name": "John", "family-name": "Doe"} { "given-name" => "John", "family-name" => "Doe" } There is only a syntactic difference, in that the former is Javascript and the latter is Ruby. Both the structure and the semantics are identical, that is: create a mapping such that the string "given-name" associates with "John", and the string "family-name" associates with "Doe". The point is that different representations can have the same structure and semantics. In this case, it seems like a mistake to talk about a representational mapping. As far as I understood, microformats is primarily concerned about adding semantic value specifically to HTML. This is done with well-defined structure that translates (as defined by microformats) into the syntax of HTML. So then, what is the difference between {"given-name": "John", "family-name": "Doe"} JohnDoe Primarily a syntactic one. Structurally they are the same and semantically they are both hCard fragments. A more fundamental difference, however, is that the latter is the primary syntax; conversion from HTML to JSON will be lossy. Furthermore, the semantics are now twice filtered: the converter has to be as up-to-snuff on the currently defined classes as the consuming application itself. Finally, you lose many of the benefits of hypertext: the include pattern no longer works, URIs become strings, and it isn't clear how embedded microformats should be handled. The only real way to share microformatted information is to pass it around in an HTML container, directly or as a URL. Defining a generic conversion is a mistake. Instead, we should focus on semantics and let applications define their own internal representations. ~D From derrick at pallas.us Thu Apr 3 07:23:33 2008 From: derrick at pallas.us (Derrick Lyndon Pallas) Date: Thu Apr 3 07:23:35 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in PHP/other languages In-Reply-To: References: Message-ID: <47F4F675.9080108@pallas.us> Toby A Inkster wrote: > Ciaran McNulty wrote: > >> I've not seen any other widespread JSON parsing packages - I would >> have thought PEAR would have one but they don't seem to. > They did, but it may have been removed. It was compatible with the > json_decode function too. I've got a copy of it in one of my SVN > repositories if you want it: There is a PECL package for JSON. This is part of the main distribution in 5.2.0+. ~D From flafortune at praizedmedia.com Thu Apr 3 08:11:52 2008 From: flafortune at praizedmedia.com (Francois Lafortune) Date: Thu Apr 3 08:07:29 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages In-Reply-To: <47F4F5B3.7030901@pallas.us> References: <47F4F5B3.7030901@pallas.us> Message-ID: <47F501C8.5040807@praizedmedia.com> Derrick Lyndon Pallas wrote: > Ciaran McNulty wrote: >> As a tangential note from the discussion about a standardised JSON >> format, it would be useful to be able to represent uF data as >> datastructures in other programming languages. >> > There seems to be a lot of confusion here about the differences > between syntax, structure, and semantics. What is the difference between > > {"given-name": "John", "family-name": "Doe"} > {"family-name": "Doe","given-name": "John"} > > There is a potential structural difference (e.g., in PHP, where > associative arrays have order) but the syntax is the same and > (depending on the application) the semantics are probably the same. > What's the difference between > > {"given-name": "John", "family-name": "Doe"} > { "given-name" => "John", "family-name" => "Doe" } > > There is only a syntactic difference, in that the former is Javascript > and the latter is Ruby. Both the structure and the semantics are > identical, that is: create a mapping such that the string "given-name" > associates with "John", and the string "family-name" associates with > "Doe". > > The point is that different representations can have the same > structure and semantics. In this case, it seems like a mistake to talk > about a representational mapping. As far as I understood, microformats > is primarily concerned about adding semantic value specifically to > HTML. This is done with well-defined structure that translates (as > defined by microformats) into the syntax of HTML. > > So then, what is the difference between > > {"given-name": "John", "family-name": "Doe"} > John class="family-name">Doe > > Primarily a syntactic one. Structurally they are the same and > semantically they are both hCard fragments. A more fundamental > difference, however, is that the latter is the primary syntax; > conversion from HTML to JSON will be lossy. Furthermore, the semantics > are now twice filtered: the converter has to be as up-to-snuff on the > currently defined classes as the consuming application itself. > Finally, you lose many of the benefits of hypertext: the include > pattern no longer works, URIs become strings, and it isn't clear how > embedded microformats should be handled. > > The only real way to share microformatted information is to pass it > around in an HTML container, directly or as a URL. Defining a generic > conversion is a mistake. Instead, we should focus on semantics and let > applications define their own internal representations. > > ~D > > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss I may be pretty quiet on this mailing list, but I second this. I especially like the last two sentences. From glenn.jones at madgex.com Thu Apr 3 08:26:39 2008 From: glenn.jones at madgex.com (Glenn Jones) Date: Thu Apr 3 08:59:11 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats Message-ID: <36A319113CF910438942741C4727ADFF01C359D2@MOBY.Clarence.local> I would really like microformats community to come up with a standardized JSON representation of microformat structures. Over the last few weeks I have asked number of people involved with microformats parsers if they were interested in a common JSON description for output and the answer seems so far has been a strong yes. The main reason is that it could help us build shared test suite and enable comparative testing between the parsers. Just like the move by the IE team to publicly release their test suites so that browser manufactures can start to coalesce around a testable understanding of a specification. We should start to do the same for microformat parsers. What we need first is a single output format to test against. JSON seems to cross both client and server world well, so it would be my choice. It would also help people build applications more easily as they would be able to switch from one parser/service to another and reuse code a little more. I am currently working on a test suite that uses a POSH pattern to express JSON based asserts i.e. NodeValue(vcard[0].url[2]) = http://www.flickr.com/photos/glennjonesnet/ You can see an example here http://lab.backnetwork.com/testsuite/hcard/1.0/hcard1.htm This is ufxtract parsing the test http://lab.backnetwork.com/ufXtract/?url=http%3A%2F%2Flab.backnetwork.co m%2Ftestsuite%2Fhcard%2F1.0%2Fhcard1.htm&format=test-fixture&output=text I am coding a JavaScript test runner using ufxtract to parses the "test-fixture" POSH pattern then I run the asserts and get a pass/fail response. This is all very early prototype work, but hopefully it shows the value of standardized JSON representation of microformat structures. Glenn Jones www.glennjones.net From mail at ciaranmcnulty.com Thu Apr 3 13:12:12 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 3 13:12:33 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages In-Reply-To: <47F4F5B3.7030901@pallas.us> References: <47F4F5B3.7030901@pallas.us> Message-ID: On Thu, Apr 3, 2008 at 4:20 PM, Derrick Lyndon Pallas wrote: > The only real way to share microformatted information is to pass it around > in an HTML container, directly or as a URL. Defining a generic conversion is > a mistake. Instead, we should focus on semantics and let applications define > their own internal representations. I think the point is, services are already providing hCard->JSON conversion and it would be nice if they were all doing it the same way. As I said at the start of this subject, the idea of talking about a representation in other languages is really a side issue (but there are already existant hCard->PHP services, for instance). Basically iff we're going to be talking about JSON representations maybe it's good if other languages can benefit from it. -Ciaran McNulty From gordon at onlinehome.de Thu Apr 3 16:23:41 2008 From: gordon at onlinehome.de (Gordon Oheim) Date: Thu Apr 3 16:28:45 2008 Subject: [uf-discuss] jCard draft Message-ID: <47F5750D.8030708@onlinehome.de> Hi all, I have added a preliminary draft for a possible jCard specification to the wiki at http://microformats.org/wiki/jcard. The content is based on what I read from the discussion list so far. The intention was to have a reference for further discussion and for solidifying a candidate for a jCard standard. Please forgive my poor wiki editing skills and feel free to add to the page. Cheers, Gordon From derrick at pallas.us Thu Apr 3 17:45:30 2008 From: derrick at pallas.us (Derrick Lyndon Pallas) Date: Thu Apr 3 18:25:36 2008 Subject: [uf-discuss] Standardized Representation of Microformats in PHP/other languages In-Reply-To: References: <47F4F5B3.7030901@pallas.us> Message-ID: <47F5883A.3050803@pallas.us> Ciaran McNulty wrote: > I think the point is, services are already providing hCard->JSON conversion and it would be nice if they were all doing it the same way. > > As I said at the start of this subject, the idea of talking about a representation in other languages is really a side issue (but there are already existant hCard->PHP services, for instance). > > Basically iff we're going to be talking about JSON representations maybe it's good if other languages can benefit from it. > This seems unnecessary, especially for a group that is not a standards body. If you want to use uf in Javascript or PHP or LOLCODE, parse the HTML into whatever structure you'd like. If you want to share it, pass along marked up HTML, directly or as a URI. Under some circumstances, it might make sense to use an RDF representation; unlike JSON, that has the benefit that it already has the syntax to preserve the semantics of ufs. But that still seems like it's beyond the scope of microformats.org, which is usually positive in nature. ~D From philip.tellis at gmail.com Thu Apr 3 21:26:55 2008 From: philip.tellis at gmail.com (Philip Tellis) Date: Thu Apr 3 21:33:03 2008 Subject: [uf-discuss] Re: Standardized Representation of Microformats in PHP/other languages In-Reply-To: References: <7526167F-D4C2-4FD6-8A68-0015AB9D3A65@tobyinkster.co.uk> Message-ID: <2e95f9b80804032226ja6eb984xf1daf3944ce9a750@mail.gmail.com> On 03/04/2008, Ciaran McNulty wrote: > There appears to be a Zend::Json package that can decode() JSON - > simple testing makes it appear that it parses the JSON output of hKit > identically (when told to return an object rather than an array). I'm jumping on to this a little late, but a few points from my experience with JSON in php. 1. The php extension for json is significantly faster than the native PHP implementation (sorry, I don't have tests that I can show). 2. AFAIK, the extension has been around since 4.3 at least, and made it into mainstream PHP later. 3. By default (in PHP 5), json_decode will return an Object rather than an associative array. You need to pass in true as the second parameter to this function to get it to return an array. Apart from this, I have seen no differences since 4.3. 4. json_encode/json_decode are almost as fast as php's built in serialize and unserialize functions, and result in smaller serialised representations. From thom at ts0.com Fri Apr 4 01:22:37 2008 From: thom at ts0.com (Thom Shannon) Date: Fri Apr 4 01:22:46 2008 Subject: [uf-discuss] testers wanted, hCard to bluetooth on windows Message-ID: <47F5F35D.7040908@ts0.com> Hi, I've modified the Mac only bluetooth user script for operator to work on windows. It's based on the script on Mike Kaply's site (does anyone know who wrote that? Was it Mike?) It requires the Widcomm bluetooth drivers (they're the most common, if you have c:\Program Files\WIDCOMM then you're good to go). http://www.ts0.com/bluetoothwin.js I've tested it successfully on my Win XP machine to a Nokia 6230 and a Qtek S200 Win Mobile phone (aka SPV M600). A friend tested it on a newer Nokia, it received the file but failed to recognise it as a vcard to import. I've yet to find a way to use the bluetooth business card service, so I'm sending it just as a file transfer and relying on the device importing it. It'll probably require an XPCOM to use the Widcomm API, but this script should work as well as it's Mac counterpart. From thom at ts0.com Fri Apr 4 01:37:36 2008 From: thom at ts0.com (Thom Shannon) Date: Fri Apr 4 01:37:41 2008 Subject: [uf-discuss] jCard draft In-Reply-To: <47F5750D.8030708@onlinehome.de> References: <47F5750D.8030708@onlinehome.de> Message-ID: <47F5F6E0.4020201@ts0.com> > I have added a preliminary draft for a possible jCard specification to > the wiki at http://microformats.org/wiki/jcard. > The content is based on what I read from the discussion list so far. > The intention was to have a reference for further discussion and for > solidifying a candidate for a jCard standard. All the examples except the very first one are invalid JSON. Object keys MUST be surrounded with double quotes. Wrong email = [ {type: ['pref'], value: 'foo at example.com'}, {value: 'bar at example.com'} ] Correct email = [ {"type": ["pref"], "value": "foo at example.com"}, {"value": "bar at example.com"} ] From lists at ben-ward.co.uk Fri Apr 4 03:08:48 2008 From: lists at ben-ward.co.uk (Ben Ward) Date: Fri Apr 4 03:08:53 2008 Subject: [uf-discuss] jCard draft In-Reply-To: <47F5750D.8030708@onlinehome.de> References: <47F5750D.8030708@onlinehome.de> Message-ID: On 4 Apr 2008, at 01:23, Gordon Oheim wrote: > I have added a preliminary draft for a possible jCard specification > to the wiki at http://microformats.org/wiki/jcard. > The content is based on what I read from the discussion list so far. > The intention was to have a reference for further discussion and for > solidifying a candidate for a jCard standard. Hi, This is great work, and it's something that I found a number of developers asking about during South By South West. I think it was Glenn Jones suggesting that we're now at a point with parser maturity that some thought needs to be given to having interoperable JSON structures. I have two points of initial followup, one with my admin hat on, the other without. 1. ADMIN: This discussion should probably take place on the microformats-dev mailing list, rather than -discuss. It should come to the attention of all parser developers that way, and hopefully stay focused on this very parser-centric work. I've cross posted this thread to microformats-dev@microformats.org; please continue the development discussion there. 2. In my view: I'm totally supportive and in favour of this work, I think ?jCard? is a bad name for it; I think this work would be better presented connected to the hCard specification itself ? and future equivalents for the other microformats too. Whether that end up as an ?Object Model? section of the relevant specs, or new documents (e.g. hcard-object-model). It doesn't need it's own, separate format name; it's really further specifying hcard itself. What's more, whilst JSON is the obvious driver technology for this work, I think it would make more sense to produce an implementation- agnostic Object Model that would work in JSON, XML, YML or whatever other transport people might want to implement for. I think it's unlikely we'd want to specify ?jCard?, ?xCard?, ?yCard? and so on?) > Please forgive my poor wiki editing skills and feel free to add to > the page. The page is off to a great start! Keep it up. Thanks, Ben From csarven at gmail.com Fri Apr 4 10:34:58 2008 From: csarven at gmail.com (Sarven Capadisli) Date: Fri Apr 4 10:59:41 2008 Subject: [uf-discuss] entry-title on Message-ID:
foo
Operator 0.9.1 http://tools.weborganics.co.uk/ http://tools.microformatic.com/help/xhtml/hatom/ all grab the @alt value for entry-title. Is this documented anywhere for hAtom parsing? In the case of should entry-title be the @title value since it is more accurate - also more visible - then the use of @alt? If @title is absent, then perhaps @alt may be used? Sarven Capadisli http://www.csarven.ca From scott at randomchaos.com Fri Apr 4 14:23:38 2008 From: scott at randomchaos.com (Scott Reynen) Date: Fri Apr 4 14:23:40 2008 Subject: [uf-discuss] entry-title on In-Reply-To: References: Message-ID: On Apr 4, 2008, at 12:34 PM, Sarven Capadisli wrote: > all grab the @alt value for entry-title. Is this documented anywhere > for hAtom parsing? It's documented here: http://microformats.org/wiki/parsing I just added a link to that from here: http://microformats.org/wiki/hatom-parsing It may be other places, but it seems to be ignored in searches. > In the case of should entry-title be the @title value since it > is more accurate - also more visible - then the use of @alt? If @title > is absent, then perhaps @alt may be used? I don't think so. @alt is alternative content ("this attribute specifies alternate text"), whereas @title is description of that content ("This attribute offers advisory information about the element for which it is set."), so @alt should be preferred for textual properties (e.g. entry-title). Peace, Scott From ggb at tid.es Sun Apr 6 05:58:35 2008 From: ggb at tid.es (Gustavo) Date: Sun Apr 6 10:58:44 2008 Subject: [uf-discuss] microservices In-Reply-To: <1005d65f0803120909t58710b22o5e87ddb268b85497@mail.gmail.com> References: <47D76E8A.1080503@tid.es> <1005d65f0803120909t58710b22o5e87ddb268b85497@mail.gmail.com> Message-ID: <47F8C8FB.202@tid.es> You can find a prototype implementation of this idea in this thread: https://labs.mozilla.com/forum/index.php/topic,654.0.html The idea behind this concept is for a website to be able to offer a service that could be applied to a microformat just publishing an xml, instead of developing and installing scripts/extensions for the firefox operator addon (in the same way that a website nowadays offers opensearch services) Any feedback on the interest of this topic would be very appreciated in order to continue this work in a possible specification and also in the implementation. BR, G. Jason Karns wrote: > On Wed, Mar 12, 2008 at 1:47 AM, Gustavo wrote: >> > I may be wrong - in which case, it's probably a good idea if we see if >> > Microsoft's OpenService stuff gets implemented anywhere outside of >> > Internet Explorer 8. > > Mike Kaply has produced another microformats-related extension for > Firefox that uses IE8's Activities. > http://www.kaply.com/weblog/2008/03/07/microsoft-activities-for-firefox-new-version/ > > ~Jason > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss From csarven at gmail.com Sun Apr 6 09:38:53 2008 From: csarven at gmail.com (Sarven Capadisli) Date: Sun Apr 6 12:11:56 2008 Subject: [uf-discuss] entry-title on In-Reply-To: References: Message-ID: On Fri, Apr 4, 2008 at 6:23 PM, Scott Reynen wrote: > It's documented here: > > http://microformats.org/wiki/parsing > > I just added a link to that from here: > > http://microformats.org/wiki/hatom-parsing Thank you. > > In the case of should entry-title be the @title value since it > > is more accurate - also more visible - then the use of @alt? If @title > > is absent, then perhaps @alt may be used? > > > > I don't think so. @alt is alternative content ("this attribute specifies > alternate text"), whereas @title is description of that content ("This > attribute offers advisory information about the element for which it is > set."), so @alt should be preferred for textual properties (e.g. > entry-title). @alt is meant for "alternative text" in cases where can't be experienced visually. It is otherwise invisible. @title is meant to be visible alongside the . entry-title value should be visible. Sarven Capadisli http://www.csarven.ca From mail at ciaranmcnulty.com Mon Apr 7 00:47:56 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Mon Apr 7 02:27:50 2008 Subject: [uf-discuss] entry-title on In-Reply-To: References: Message-ID: On Sun, Apr 6, 2008 at 5:38 PM, Sarven Capadisli wrote: > @alt is meant for "alternative text" in cases where can't be > experienced visually. It is otherwise invisible. > @title is meant to be visible alongside the . > > entry-title value should be visible. Quite, but entry-title should only be on an image when that image contains the text of the title (bad practice admittedly). So in this case the @alt should contain the text in the image. -Ciaran McNulty From msporny at digitalbazaar.com Mon Apr 7 13:41:22 2008 From: msporny at digitalbazaar.com (Manu Sporny) Date: Mon Apr 7 15:25:56 2008 Subject: [uf-discuss] Fuzzbot - An embedded semantic data viewer Message-ID: <47FA86F2.6070304@digitalbazaar.com> Fuzzbot is designed to detect RDFa and other semantic data formats and display them to the person browsing. RDFa is a way to embed machine-readable data into web pages, which helps computers help you interact with web pages in a smarter way. For example, Fuzzbot can show you information about people that it has found on a web page - helping you view only the data in which you're interested. The goal of Fuzzbot is to integrate Microformats and RDFa into a common format (JSON/RDF) which authors can then write Actions and UIs against. What this means is that Fuzzbot will deal with higher-level semantic concepts (People, Places, Events, Audio, Video, etc.), rather than formats (hCard, FOAF, etc.). Fuzzbot is primarily a test bed for UI concepts and is not a replacement for Operator - ideally, some of what we learn from Fuzzbot could be integrated into Operator. http://rdfa.digitalbazaar.com/fuzzbot/ Screenshots are available here, for those that don't want to install the plugin: http://rdfa.digitalbazaar.com/fuzzbot#screenshots Some of you might note that the primary UI concept behind Fuzzbot is based on Dmitri Glazkov's Margin Marks concept[1] that he posted to this mailing list about 3-4 months ago. It is a visual approach to displaying semantic data. The current release (v0.7.5) is a very preliminary version of the software. There will be UI bugs and perhaps some operational bugs (For example, parsing Digg.com is very slow). Firefox XPIs are available for both Linux and Windows, here: http://rdfa.digitalbazaar.com/fuzzbot/download/ All librdfa source code is released under LGPL v3, and the Fuzzbot plugin is released under the Mozilla Public License. Source code is available via GIT: git clone http://rdfa.digitalbazaar.com/fuzzbot.git git clone http://rdfa.digitalbazaar.com/librdfa.git If you have any thoughts or questions on the direction of this project, or where you'd like to see it headed, please discuss on the list and we'll try to work it into the project plan. -- manu PS: We're also looking into creating a native C library to do Microformats parsing, but wanted to make sure there wasn't anybody that had already done this. Is there anybody on here that has created a native C library for parsing Microformats? [1] http://glazkov.com/blog/margin-marks/ From angus at pobox.com Tue Apr 8 05:20:56 2008 From: angus at pobox.com (Angus McIntyre) Date: Tue Apr 8 05:47:46 2008 Subject: [uf-discuss] Appropriate microformats for journal listings? Message-ID: I'm editing a page that lists editions of a journal, each entry having a form something like: Title Journalname 1 (2003) - downloadlink - Article 1 Author1, Author2 (Affiliation) Article 2 Author3 (Affiliation) and so on. Are there microformats that would make sense to use here? I toyed with the idea of making the author name lines use vCard, but that runs into problems where you have multiple authors belonging to the same organization. hAtom? Any suggestions would be welcome, thanks, Angus From lists at ben-ward.co.uk Tue Apr 8 06:04:50 2008 From: lists at ben-ward.co.uk (Ben Ward) Date: Tue Apr 8 06:05:02 2008 Subject: [uf-discuss] Appropriate microformats for journal listings? In-Reply-To: References: Message-ID: <3B60DC69-113D-4DDD-BC8A-B6B4441A5DB1@ben-ward.co.uk> Hi Angus, On 8 Apr 2008, at 13:20, Angus McIntyre wrote: > I'm editing a page that lists editions of a journal, each entry > having a form something like: > > Title > Journalname 1 (2003) > - downloadlink - > > Article 1 > Author1, Author2 (Affiliation) > > Article 2 > Author3 (Affiliation) > > and so on. > > Are there microformats that would make sense to use here? I toyed > with the idea of making the author name lines use vCard, but that > runs into problems where you have multiple authors belonging to the > same organization. hAtom? Seems like a very good fit with hAtom, the only complication being the presence of multiple authors which is unhandled in hAtom, and? untested? in hCard. VCARD has a concept of ?AGENTS?, which effectively nests vcards within each other. They're unhandled in desktop software, so demand to work out parsing rules in hCard has been low. If my understanding of AGENT is current, though, this seems like an appropriate place to use them. It should look something like this:
? ?
My understanding is that both named people should be ?agents? of the organisation. As to how this parses? that could be more fun and needs input from parser developers when they get a moment! Ben From julian_bond at voidstar.com Tue Apr 8 05:10:35 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Tue Apr 8 06:16:13 2008 Subject: [uf-discuss] Parsing XFN in PHP Message-ID: I need some advice about reading rel="me" tags in arbitrary web pages using PHP. I'm intending to use this to help build a lifestream style function. The basic intent is to cut down the amount of data entry the user has to do. When they give me a MyBlogLog, Friendfeed, Plaxo Pulse page that has lists of links to their profile pages I should be able to avoid having to ask them for all of them again. So:- - User gives me a URL for one of their profile pages - Use Curl to collect the source - Parse the source looking for links with a rel="me" - Extract an array of Link URL - Link Text - Do something useful with the array. (???? followed by Profit!) I've been searching this morning for a PHP library to do the parsing and link extraction or PHP examples or example regex to use in PREG_MATCH_ALL or something/anything, without success. Since the source data is probably badly written and broken html, I don't think I can use XML methods as all the XML unserialising code I've used barfs on badly formed XML. One possibility I suppose is to run it though HTML-Tidy first but I run the (admittedly small) chance of html-tidy wiping out some of the links. So what do people use to consume XFN with PHP? -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat Not Tested On Animals From mail at ciaranmcnulty.com Tue Apr 8 06:38:37 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Tue Apr 8 06:38:56 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: On Tue, Apr 8, 2008 at 1:10 PM, Julian Bond wrote: > I need some advice about reading rel="me" tags in arbitrary web pages using > PHP. I'm intending to use this to help build a lifestream style function. > The basic intent is to cut down the amount of data entry the user has to do. > When they give me a MyBlogLog, Friendfeed, Plaxo Pulse page that has lists > of links to their profile pages I should be able to avoid having to ask them > for all of them again. So:- > > - User gives me a URL for one of their profile pages > - Use Curl to collect the source > - Parse the source looking for links with a rel="me" > - Extract an array of Link URL - Link Text > - Do something useful with the array. (???? followed by Profit!) Have a look at the Google Social Graph API [1] - it doesn't query things 'live', but because it's Google they can return all the results in one response to your query, and it saves you spidering the site yourself and worrying about all the complexity that would involve. Alternatively, if you want to parse uFs in PHP, I believe hKit by Drew McLellan [2] may have some @rel=me support? -Ciaran McNulty [1] http://code.google.com/apis/socialgraph/ [2] http://code.google.com/p/hkit/ From andr3.pt at gmail.com Tue Apr 8 06:52:28 2008 From: andr3.pt at gmail.com (=?ISO-8859-1?Q?Andr=E9_Lu=EDs?=) Date: Tue Apr 8 06:52:32 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: Hi Julian, You can either use hkit ( http://code.google.com/p/hkit/ ) or the SocialGraph API, by Google (http://code.google.com/apis/socialgraph/). Cheers, Andr? On Tue, Apr 8, 2008 at 1:10 PM, Julian Bond wrote: > I need some advice about reading rel="me" tags in arbitrary web pages using > PHP. I'm intending to use this to help build a lifestream style function. > The basic intent is to cut down the amount of data entry the user has to do. > When they give me a MyBlogLog, Friendfeed, Plaxo Pulse page that has lists > of links to their profile pages I should be able to avoid having to ask them > for all of them again. So:- > > - User gives me a URL for one of their profile pages > - Use Curl to collect the source > - Parse the source looking for links with a rel="me" > - Extract an array of Link URL - Link Text > - Do something useful with the array. (???? followed by Profit!) > > I've been searching this morning for a PHP library to do the parsing and > link extraction or PHP examples or example regex to use in PREG_MATCH_ALL or > something/anything, without success. Since the source data is probably badly > written and broken html, I don't think I can use XML methods as all the XML > unserialising code I've used barfs on badly formed XML. One possibility I > suppose is to run it though HTML-Tidy first but I run the (admittedly small) > chance of html-tidy wiping out some of the links. > > So what do people use to consume XFN with PHP? > > -- > Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 > Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 > Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat > Not Tested On Animals > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > From julian_bond at voidstar.com Tue Apr 8 07:21:37 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Tue Apr 8 07:25:35 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: Ciaran McNulty Tue, 8 Apr 2008 14:38:37 >Have a look at the Google Social Graph API [1] - it doesn't query >things 'live', but because it's Google they can return all the results >in one response to your query, and it saves you spidering the site >yourself and worrying about all the complexity that would involve. I'm really looking forward to the SG-API becoming useful, but right now it's pretty flaky. There's a lot of pages you'd expect to be in there that aren't and the result you get back aren't what you'd expect. >Alternatively, if you want to parse uFs in PHP, I believe hKit by Drew >McLellan [2] may have some @rel=me support? I'll take a look. Thanks. -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat Not Tested On Animals From daniel.oconnor at gmail.com Tue Apr 8 07:09:03 2008 From: daniel.oconnor at gmail.com (Daniel O'Connor) Date: Tue Apr 8 07:33:54 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: <106cc1200804080709w58dd566cyc09fe64878bf0a78@mail.gmail.com> On Tue, Apr 8, 2008 at 9:40 PM, Julian Bond wrote: > I need some advice about reading rel="me" tags in arbitrary web pages using > PHP. I'm intending to use this to help build a lifestream style function. > The basic intent is to cut down the amount of data entry the user has to do. > When they give me a MyBlogLog, Friendfeed, Plaxo Pulse page that has lists > of links to their profile pages I should be able to avoid having to ask them > for all of them again. So:- See also http://code.google.com/p/xmlgrddl/ Do: //Load a GRDDL engine $grddl = XML_GRDDL::factory('xsl'); $xml = $grddl->fetch($url); //Look for GRDDL transformations to extract out any data at those URLs $stylesheets = $grddl->inspect($url); $stylesheets[] = 'http://www.w3.org/2003/12/rdf-in-xhtml-xslts/grokXFN.xsl'; //Force XFN to apply $rdf_xml = array(); foreach ($stylesheets as $stylesheet) { $rdf_xml[] = $grddl->transform($xml, $stylesheet); } //Produce One True RDF/XML document $result = array_reduce($rdf_xml, array($grddl, 'merge')); $document = simplexml_load_string($file); $document->registerNameSpace('vcard', 'http://www.w3.org/2006/vcard/ns#'); $links = $document->xpath('//rdf:homepage'); //Present this list of links to the user for selection ("hey, those are my links' or "that's my friend's link") print_r($links); A little verbose, and a little fragile, but it should work From mail at tobyinkster.co.uk Tue Apr 8 08:44:58 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Tue Apr 8 08:45:54 2008 Subject: [uf-discuss] Re: Appropriate microformats for journal listings? References: <3B60DC69-113D-4DDD-BC8A-B6B4441A5DB1@ben-ward.co.uk> Message-ID: Ben Ward wrote: > VCARD has a concept of ?AGENTS?, which effectively nests vcards within > each other. They're unhandled in desktop software, so demand to work out > parsing rules in hCard has been low. For what it's worth, Cognition has successfully parsed agents since the first alpha release. It supports agents as an embedded vCard:
John Citizen ...
or as a plain string:
John Citizen
Cognition can be used as a command-line tool, through its web interface at or as a Perl module. An example hCard which includes a couple of agents -- one of each form -- is at . -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 13 days, 2:50.] Tagliatelle with Fennel and Asparagus http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ From julian_bond at voidstar.com Tue Apr 8 23:41:10 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Wed Apr 9 00:14:58 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: Let me expand on that. Julian Bond Tue, 8 Apr 2008 15:21:37 >I'm really looking forward to the SG-API becoming useful, but right now >it's pretty flaky. There's a lot of pages you'd expect to be in there >that aren't and the result you get back aren't what you'd expect. SG-API actually worked very well for my purposes. I'm looking for outward edges and they came back in a pretty convenient form. However, it's dependent on the underlying index, not reading the pages in real time. And several friendfeed pages I tried had no data or incomplete data because they'd been created since the last time the spider called. So it looks to me like SG-API is a useful research tool, but not a useful data import tool. >>Alternatively, if you want to parse uFs in PHP, I believe hKit by Drew >>McLellan [2] may have some @rel=me support? Not yet. It seems to be extensible but there's only an extension for hCard at the moment. Reading between the lines, hKit is using Tidy to turn the html into well formed xhtml and then simpleXml to parse out the uFs. So going down that route or one like it seems to be the best option. It would be good if there were actually some solid libraries to read all the uFs and especially XFN in PHP. A format that's easy to write but hard to read isn't terribly useful. :( -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat Not Tested On Animals From adam.craven at fourshapes.com Wed Apr 9 04:07:52 2008 From: adam.craven at fourshapes.com (Adam Craven - Four Shapes) Date: Wed Apr 9 04:08:00 2008 Subject: [uf-discuss] Operator 0.9.1 In-Reply-To: <942A2E6C-E8DC-4AF4-9D34-8E5D8EB1D337@eatyourgreens.org.uk> References: <1e76a9d60803200805k33d17116t1af98501150e92fd@mail.gmail.com> <1c69ccdc0803201313y5f3a03efo3238e70ab3712009@mail.gmail.com> <1e76a9d60803201345k3c9638d5hd1afc2fd8f0c682f@mail.gmail.com> <1e76a9d60803210629w40512ba9l60952c29c7f12ee1@mail.gmail.com> <942A2E6C-E8DC-4AF4-9D34-8E5D8EB1D337@eatyourgreens.org.uk> Message-ID: <93E240AC-8C9C-4549-B2A0-719754214B8B@fourshapes.com> Another issue with operator; it's very slow with JavaScript combined with onmouseover and DOM manipulation - such as tooltips. This is with a default installation. Anyone else having the problem? Does the author happen to frequent this mailing list? On 21 Mar 2008, at 15:01, Jim O'Donnell wrote: > I can confirm this - Operator finds the address regardless, but only > finds the name 'Andrew Jaswa' if 'show hidden microformats' is ticked. > > Jim > > On 21 Mar 2008, at 13:29, Andrew Jaswa wrote: > >>> Once I find out some more I'll share. >> >> >> Alright so here is what I found out: Op. 0.9.1 does not find the >> hCard >> when the nested divs are floated. >> I've also noticed that when the "Show hidden Microformats" is checked >> Op will find the hCard. >> >> >> My sample file is found at: http://gotkicked.net/research/uf/opbug.html >> >> Can someone else please confirm this? >> >> >> Andrew > > Jim O'Donnell > jim@eatyourgreens.org.uk > http://eatyourgreens.org.uk > http://flickr.com/photos/eatyourgreens > > > > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss From angus at pobox.com Wed Apr 9 04:52:21 2008 From: angus at pobox.com (Angus McIntyre) Date: Wed Apr 9 05:32:55 2008 Subject: [uf-discuss] Appropriate microformats for journal listings? In-Reply-To: <3B60DC69-113D-4DDD-BC8A-B6B4441A5DB1@ben-ward.co.uk> References: <3B60DC69-113D-4DDD-BC8A-B6B4441A5DB1@ben-ward.co.uk> Message-ID: At 2:04 PM +0100 4/8/08, Ben Ward wrote: >Seems like a very good fit with hAtom, the only >complication being the presence of multiple >authors which is unhandled in hAtom, and? >untested? in hCard. > >VCARD has a concept of 'AGENTS', which effectively nests vcards >within each other. They're unhandled in desktop software, so demand >to work out parsing rules in hCard has been low. Thank you for the suggestions and examples. That's very helpful. Does anyone know how well these are handled by current software? i.e. will anything that tries to read multi-author hAtoms, or hCard with agents: a. Degrade gracefully, or b. Ignore loftily, or c. Fail catastrophically? Incidentally, with regard to hAtom use, would you recommend making each journal an individual hFeed, with the articles represented by an hEntry ... or should the page as a whole be an hFeed, with each journal edition being an hEntry? Thanks, Angus From geraldbauer2007 at gmail.com Wed Apr 9 08:40:51 2008 From: geraldbauer2007 at gmail.com (Gerald Bauer) Date: Wed Apr 9 08:40:55 2008 Subject: [uf-discuss] Microformats Mania in Vancouver, B.C - Events Aplenty Join Us - Open Web, VanDev, SocialCamp Message-ID: <7e7cb8940804090840m10fb92b0v6778ee076442a78c@mail.gmail.com> Hello, Join us for the Open Web Vancouver 2008 two-day conference on Apr 14+15 in Vancouver, B.C on Canada's West Coast that will include talks on Microformats such as: o Microformats and Distributed Social Networks w/ Chris Messina o Microformats Past, Present, Future w / Ryan King More @ http://www.openwebvancouver.ca On Apr 16 I volunteer for a free talk at the University of British Columbia (UBC) downtown campus in Vancouver to talk on "Microformats - Adding Semantics to Your Web Site - Web 3.0 in Action". Join us. More @ http://is.gd/4YM Finally, *you* are invited to speak and sign-up for a lightning talk (10-15min) at the upcoming SocialCamp part of the Open Web Vancouver 2008 event. More @ http://barcamp.org/SocialCampVancouver Cheers. -- Gerald Bauer - Internet Professional - http://geraldbauer.wordpress.com From kevinmarks at gmail.com Wed Apr 9 10:58:59 2008 From: kevinmarks at gmail.com (Kevin Marks) Date: Wed Apr 9 10:59:02 2008 Subject: [uf-discuss] Microformats Mania in Vancouver, B.C - Events Aplenty Join Us - Open Web, VanDev, SocialCamp In-Reply-To: <7e7cb8940804090840m10fb92b0v6778ee076442a78c@mail.gmail.com> References: <7e7cb8940804090840m10fb92b0v6778ee076442a78c@mail.gmail.com> Message-ID: <73766b160804091058h58aa04ebkfcfaaaf23ff03509@mail.gmail.com> As part of my talk on the Social web I'll be discussing Google's social Graph API, which is a cache of the distributed social graph created by XFN On Wed, Apr 9, 2008 at 8:40 AM, Gerald Bauer wrote: > Hello, > > Join us for the Open Web Vancouver 2008 two-day conference on Apr > 14+15 in Vancouver, B.C on Canada's West Coast that will include talks > on Microformats such as: > > o Microformats and Distributed Social Networks w/ Chris Messina > o Microformats Past, Present, Future w / Ryan King > > More @ http://www.openwebvancouver.ca > > On Apr 16 I volunteer for a free talk at the University of British > Columbia (UBC) downtown campus in Vancouver to talk on "Microformats - > Adding Semantics to Your Web Site - Web 3.0 in Action". Join us. > > More @ http://is.gd/4YM > > Finally, *you* are invited to speak and sign-up for a lightning > talk (10-15min) at the upcoming SocialCamp part of the Open Web > Vancouver 2008 event. > > More @ http://barcamp.org/SocialCampVancouver > > Cheers. > > -- > Gerald Bauer - Internet Professional - http://geraldbauer.wordpress.com > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > From kevinmarks at gmail.com Wed Apr 9 11:18:24 2008 From: kevinmarks at gmail.com (Kevin Marks) Date: Wed Apr 9 11:18:28 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> On Tue, Apr 8, 2008 at 11:41 PM, Julian Bond wrote: > Let me expand on that. > > Julian Bond Tue, 8 Apr 2008 15:21:37 > > > > I'm really looking forward to the SG-API becoming useful, but right now > it's pretty flaky. There's a lot of pages you'd expect to be in there that > aren't and the result you get back aren't what you'd expect. > > > > SG-API actually worked very well for my purposes. I'm looking for outward > edges and they came back in a pretty convenient form. However, it's > dependent on the underlying index, not reading the pages in real time. And > several friendfeed pages I tried had no data or incomplete data because > they'd been created since the last time the spider called. So it looks to me > like SG-API is a useful research tool, but not a useful data import tool. We expect to crawl more often soon; one thing that you can do is use the test parser as described here: http://groups.google.com/group/social-graph-api/browse_thread/thread/c2deffae0bba09dc and here: http://code.google.com/apis/socialgraph/docs/testparse.html to parse pages that are missing from the index (though I wouldn't recommend doing this for huge numbers of pages, it coudl help as a stopgap, and also as a way to validate your own local parsing. From dmitry at baranovskiy.com Wed Apr 9 20:56:08 2008 From: dmitry at baranovskiy.com (Dmitry Baranovskiy) Date: Wed Apr 9 20:56:20 2008 Subject: [uf-discuss] Optimus 0.5.1 Message-ID: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> Hi, I did some massive update to Optimus* (microformats transformer): ? support of nested microformats ? support of multiple includes ? support of nested includes ? support of anchor (you can use URL like http://example.com#my-vcard to narrow the target) ? support of @couldbe attribute (internal feature, now item in hreview, for example, could be vcard or vevent) ? hfeed now is optional ? fix for text spacing ? fix for empty tags in output ? hListing support ? hAudio support ? general performance improvement ? add RSS as an output format ? rewrite validator from scratch ? better UTF-8 support Enjoy. As always feedback is highly appreciated. _________ * http://www.microformatique.com/optimus/ -- Best regards, Dmitry Baranovskiy http://dmitry.baranovskiy.com From mdagn at spraci.com Wed Apr 9 23:07:37 2008 From: mdagn at spraci.com (Michael MD) Date: Wed Apr 9 23:07:45 2008 Subject: [uf-discuss] Optimus 0.5.1 References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> Message-ID: <001b01c89ad1$2e3867b0$116bacca@COMCEN> > ? support of nested includes nested includes? ! my guess is that anyone doing that would be asking for trouble! From dmitry.baranovskiy at gmail.com Wed Apr 9 23:39:42 2008 From: dmitry.baranovskiy at gmail.com (Dmitry Baranovskiy) Date: Wed Apr 9 23:39:52 2008 Subject: [uf-discuss] Optimus 0.5.1 In-Reply-To: <001b01c89ad1$2e3867b0$116bacca@COMCEN> References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> <001b01c89ad1$2e3867b0$116bacca@COMCEN> Message-ID: Example: Header: company name + company logo, footer: company address content: apart from some text, company news. So, header is hCard, footer is included, company news is hAtom where author is a company. So we include header as an author. Ta-da! Nested inclusion. And after all, nothing should stop people from using include pattern heavily. On 10/04/2008, at 4:07 PM, Michael MD wrote: >> ? support of nested includes > > nested includes? ! > > my guess is that anyone doing that would be asking for trouble! From julian_bond at voidstar.com Thu Apr 10 00:30:16 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Thu Apr 10 00:31:20 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> Message-ID: Kevin Marks Wed, 9 Apr 2008 11:18:24 >We expect to crawl more often soon; one thing that you can do is use >the test parser as described here: > >http://groups.google.com/group/social-graph-api/browse_thread/thread/c2d >effae0bba09dc > >and here: > >http://code.google.com/apis/socialgraph/docs/testparse.html > >to parse pages that are missing from the index (though I wouldn't >recommend doing this for huge numbers of pages, it coudl help as a >stopgap, and also as a way to validate your own local parsing. Hmmm. Now that's an interesting idea. Some thoughts:- - Any chance of open-sourcing the parser? I presume it's python? - A variation of the parser that used GET and took just two parameters, a url and a urlFormat would be useful. Of course it could be built from outside using the existing test parser. - In fact that variation would make a great production service that would really benefit the uF community. As an aside, hKit could really use - Support for all uFs and not just hCard - Modifications to reduce dependencies and just possibly work with PHP4 Any chance of that happening? Are there any uF projects to build parser libraries and uF validation tools? -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat It's Got To Be Good From mail at tobyinkster.co.uk Thu Apr 10 01:53:26 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 10 02:01:24 2008 Subject: [uf-discuss] Re: Optimus 0.5.1 References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> <001b01c89ad1$2e3867b0$116bacca@COMCEN> Message-ID: <6511d5-nel.ln1@ophelia.g5n.co.uk> Michael MD wrote: > nested includes? ! > my guess is that anyone doing that would be asking for trouble! This is another thing that Cognition has supported since alpha1. (Though until the latest release it supported including a node's own parent, which really *is* asking for trouble!) The only thing you've got to be careful of (from a parser's POV) is making sure that you don't get stuck in an infinite loop. The solution is to write your inclusion code to *not* support nested includes, and then simply call the function a few times. (The first call will handle includes, then the second call will handle includes within includes, etc.) My policy is to allow two levels of includes for adr and geo, 4 levels for hCalendar valarm and hCalendar vfreebusy, and 6 for hCard, hAtom entries, hCalendar vevent and hCalendar todo. I also use an optimisation such that each call of the function actually checks to see if any changes have been made. If no changes have been made, then the loop is ended prematurely -- this prevents the inclusion code (which is computationally expensive) from being called when there are no instances of class="include" left, or those instances are invalid (e.g. attempts to include an ancestor node). -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 14 days, 19:59.] Tagliatelle with Fennel and Asparagus http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ From mail at ciaranmcnulty.com Thu Apr 10 02:03:16 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 10 02:03:19 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> Message-ID: On Thu, Apr 10, 2008 at 8:30 AM, Julian Bond wrote: > - Modifications to reduce dependencies and just possibly work with PHP4 -1 from me, PHP5 is approaching 4 years old and PHP6 is just around the corner, and the gains from using PHP5's object syntax are almost immeasurable IMO. -Ciaran McNulty From mail at tobyinkster.co.uk Thu Apr 10 02:08:19 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 10 02:46:32 2008 Subject: [uf-discuss] Re: Optimus 0.5.1 References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> Message-ID: <3121d5-nel.ln1@ophelia.g5n.co.uk> Dmitry Baranovskiy wrote: > As always feedback is highly appreciated. Just a few thoughts... * Parsing issues a few warnings related to use of XHTML with namespaces. You might want to think about turning down PHP's error reporting. e.g. error_reporting(E_ERROR); * Parsing same page with output as JSON I see (line break added): tel: [{, "type": "work""value": "+1 (310) 597 3781 work"}, {, "type": "work""value": "+1 (310) 597 3781 work"}]} this is clearly garbled. * There are other JSON output errors: e.g. not all strings are quoted. I might have some more feedback when I've gone through your source code. -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 14 days, 20:17.] Tagliatelle with Fennel and Asparagus http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ From mail at ciaranmcnulty.com Thu Apr 10 02:48:13 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 10 02:48:16 2008 Subject: [uf-discuss] Optimus 0.5.1 In-Reply-To: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> Message-ID: On Thu, Apr 10, 2008 at 4:56 AM, Dmitry Baranovskiy wrote: > I did some massive update to Optimus* (microformats transformer): Dmitry, all looks great! The only problem I can see is that it doesn't handle invalid HTML that well (an example would be http://ciaranmcnulty.livejournal.com/). -Ciaran McNulty From mail at tobyinkster.co.uk Thu Apr 10 02:19:19 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 10 02:50:06 2008 Subject: [uf-discuss] Re: Parsing XFN in PHP References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> Message-ID: Julian Bond wrote: > - Modifications to reduce dependencies and just possibly work with PHP4 PHP4 has been dead since the beginning of January. There will be no further releases apart from the odd security fix. For projects looking to expand beyond PHP5, PHP6 is a more useful option. -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 14 days, 20:34.] Tagliatelle with Fennel and Asparagus http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ From mail at ciaranmcnulty.com Thu Apr 10 03:21:29 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 10 03:21:31 2008 Subject: [uf-discuss] Optimus 0.5.1 In-Reply-To: References: <8a52ddad0804092056u7e12718dy76eadd06f32e0b08@mail.gmail.com> Message-ID: On Thu, Apr 10, 2008 at 10:48 AM, Ciaran McNulty wrote: > The only problem I can see is that it doesn't handle invalid HTML that > well (an example would be http://ciaranmcnulty.livejournal.com/). In fact it does just look like you need to turn down error reporting to do that (or precede your loadHtml() call with an @). You do also need to check the return value of your fopen() - if I enter a URL that 404s, your code appears to keep trying to read from that resource, meaning I get returned a few hundred Mb of 'error in fread - file resource is not valid' type messages. Send me an email off-list of you want me to take a look at those bits for you. -Ciaran McNulty From mdagn at spraci.com Thu Apr 10 04:04:55 2008 From: mdagn at spraci.com (Michael MD) Date: Thu Apr 10 04:05:10 2008 Subject: [uf-discuss] Parsing XFN in PHP References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> Message-ID: <006f01c89afa$b5afadb0$116bacca@COMCEN> > On Thu, Apr 10, 2008 at 8:30 AM, Julian Bond > wrote: >> - Modifications to reduce dependencies and just possibly work with PHP4 > > -1 from me, PHP5 is approaching 4 years old and PHP6 is just around > the corner, and the gains from using PHP5's object syntax are almost > immeasurable IMO. There are still lots of people stuck with php4 on shared servers ... and there is lots of legacy code out there in the real world that still needs to work (forget about finding time to rewrite it all!) What did they expect? ... changing a language in ways that breaks existing code is hardly the way to encourage people to upgrade to a new version! From julian_bond at voidstar.com Thu Apr 10 04:39:14 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Thu Apr 10 04:40:21 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> Message-ID: <4kcWV9Bixf$HFA7E@jblaptop.voidstar.com> Ciaran McNulty Thu, 10 Apr 2008 10:03:16 >On Thu, Apr 10, 2008 at 8:30 AM, Julian Bond wrote: >> - Modifications to reduce dependencies and just possibly work with PHP4 > >-1 from me, PHP5 is approaching 4 years old and PHP6 is just around >the corner, and the gains from using PHP5's object syntax are almost >immeasurable IMO. Yes, yes and yes. comma but, there's still an awful lot of PHP hosting out there that is out of date. Hence the efforts in many standards communities and application dev communities to reduce dependencies and produce code that will run on PHP4. I now it's horrible. I know it's easier and better to write to PHP5. But it's also desirable that Microformats are readable by the widest possible audience. -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat Tastes Like Milk From julian_bond at voidstar.com Thu Apr 10 05:05:05 2008 From: julian_bond at voidstar.com (Julian Bond) Date: Thu Apr 10 05:05:45 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: <006f01c89afa$b5afadb0$116bacca@COMCEN> References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: Michael MD Thu, 10 Apr 2008 21:04:55 >There are still lots of people stuck with php4 on shared servers >... and there is lots of legacy code out there in the real world that >still needs to work (forget about finding time to rewrite it all!) Interesting as this is, isn't it besides the point? Which is the lack of a PHP library of whatever flavour for parsing out XFN and other uFs. -- Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat Tastes Like Milk From mark at markng.me.uk Thu Apr 10 05:40:35 2008 From: mark at markng.me.uk (Mark Ng) Date: Thu Apr 10 05:40:46 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: On 10/04/2008, Julian Bond wrote: > Interesting as this is, isn't it besides the point? Which is the lack of a > PHP library of whatever flavour for parsing out XFN and other uFs. XFN itself is fairly easy to deal with by just throwing pages through tidy and using DOM/SAX/xPath, surely ? I made a rudimentary parser to do this some time ago. The code is a little ugly to publish, but I don't mind sharing privately. Mark From mail at ciaranmcnulty.com Thu Apr 10 06:01:05 2008 From: mail at ciaranmcnulty.com (Ciaran McNulty) Date: Thu Apr 10 06:01:10 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: On Thu, Apr 10, 2008 at 1:40 PM, Mark Ng wrote: > XFN itself is fairly easy to deal with by just throwing pages through > tidy and using DOM/SAX/xPath, surely ? I made a rudimentary parser to > do this some time ago. The code is a little ugly to publish, but I > don't mind sharing privately. Here's a *very* hacky code example from when I just wanted to check my 'me' links - I include it here just to demonstrate how simple XFN can be and hopefully it's apparent how easy it would be to work up into a nice objecty system for spidering: loadHtml($html)){ $xpath = new DomXpath($dom); if($nodes = $xpath->query("//a[contains(concat(' ', normalize-space(@rel), ' '),' me ')]")){ foreach($nodes as $node){ echo $node->getAttribute('href'), PHP_EOL; } } } else{ echo 'Could not parse HTML', PHP_EOL; } } else{ echo 'Could not fetch file', PHP_EOL; } ?> From mail at tobyinkster.co.uk Thu Apr 10 05:45:05 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Thu Apr 10 06:02:30 2008 Subject: [uf-discuss] Re: Parsing XFN in PHP References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: Michael MD wrote: > There are still lots of people stuck with php4 on shared servers ... I suspect that won't be the case much longer after security updates for PHP4 are discontinued in August. > What did they expect? ... changing a language in ways that breaks > existing code is hardly the way to encourage people to upgrade to a new > version! There are virtually no changes in PHP5 that break existing PHP4 code. Some things like register_globals and auto-escaping of incoming variables are turned off by default in PHP5 (they were on by default in PHP 4.0 IIRC) but can be switched on in php.ini or a .htaccess file in a matter of seconds. PHP6 will be a more painful switch for those with legacy code. (But still fairly painless for those updating from PHP5-style code.) -- Toby A Inkster BSc (Hons) ARCS [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 14 days, 23:58.] Tagliatelle with Fennel and Asparagus http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ From gareth at morethanseven.net Thu Apr 10 06:44:12 2008 From: gareth at morethanseven.net (gareth rushgrove) Date: Thu Apr 10 06:44:14 2008 Subject: [uf-discuss] Re: Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: <9011f7c70804100644k2f960282s9567ff18282a90a@mail.gmail.com> On Thu, Apr 10, 2008 at 1:45 PM, Toby A Inkster wrote: > Michael MD wrote: > > > There are still lots of people stuck with php4 on shared servers ... > > I suspect that won't be the case much longer after security updates for > PHP4 are discontinued in August. > > > > What did they expect? ... changing a language in ways that breaks > > existing code is hardly the way to encourage people to upgrade to a new > > version! > > There are virtually no changes in PHP5 that break existing PHP4 code. > But their are new parts in PHP5 that if you use will render your code incompatible with PHP4 - in hKits case the use of SimpleXML. Although I agree PHP4 support wouldn't interest me. Drew releases hKit under an open source licence. I'd recommend joining the mailing list and checking out the source if you fancy. http://groups.google.com/group/hkit-discuss http://code.google.com/p/hkit/ With regards microformat support, their are actually a few profiles floating around for hKit as well as hcard, although they are scattered around the web. If I get a chance I'll see if I can get a list together somewhere - although Drew may already have one. G > Some things like register_globals and auto-escaping of incoming variables > are turned off by default in PHP5 (they were on by default in PHP 4.0 > IIRC) but can be switched on in php.ini or a .htaccess file in a matter of > seconds. > > PHP6 will be a more painful switch for those with legacy code. (But still > fairly painless for those updating from PHP5-style code.) > > > -- > Toby A Inkster BSc (Hons) ARCS > [Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux] > [OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 14 days, 23:58.] > > > Tagliatelle with Fennel and Asparagus > http://tobyinkster.co.uk/blog/2008/04/06/tagliatelle-fennel-asparagus/ > > _______________________________________________ > > > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > -- Gareth Rushgrove garethrushgrove.com morethanseven.net isitbirthday.com From gareth at morethanseven.net Thu Apr 10 06:48:08 2008 From: gareth at morethanseven.net (gareth rushgrove) Date: Thu Apr 10 06:48:12 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: Message-ID: <9011f7c70804100648p2851c2feq990acccaa24ac692@mail.gmail.com> On Tue, Apr 8, 2008 at 1:10 PM, Julian Bond wrote: > I need some advice about reading rel="me" tags in arbitrary web pages using > PHP. I'm intending to use this to help build a lifestream style function. > The basic intent is to cut down the amount of data entry the user has to do. > When they give me a MyBlogLog, Friendfeed, Plaxo Pulse page that has lists > of links to their profile pages I should be able to avoid having to ask them > for all of them again. So:- > > - User gives me a URL for one of their profile pages > - Use Curl to collect the source > - Parse the source looking for links with a rel="me" > - Extract an array of Link URL - Link Text > - Do something useful with the array. (???? followed by Profit!) > > I've been searching this morning for a PHP library to do the parsing and > link extraction or PHP examples or example regex to use in PREG_MATCH_ALL or > something/anything, without success. Since the source data is probably badly > written and broken html, I don't think I can use XML methods as all the XML > unserialising code I've used barfs on badly formed XML. One possibility I > suppose is to run it though HTML-Tidy first but I run the (admittedly small) > chance of html-tidy wiping out some of the links. > > So what do people use to consume XFN with PHP? Another approach is to use an external service based parser and simply send it requests. Depends on your exact needs but uFXtract might be worth a look. Supports lots of formats plus a couple of interesting concepts (paged datasets, some basic spidering): http://lab.backnetwork.com/ufXtract/ Then just use your favourite http request tool in php to make requests of the service and parse the response (XML or JSON as you prefer) > > -- > Julian Bond E&MSN: julian_bond at voidstar.com M: +44 (0)77 5907 2173 > Webmaster: http://www.ecademy.com/ T: +44 (0)192 0412 433 > Personal WebLog: http://www.voidstar.com/ skype:julian.bond?chat > Not Tested On Animals > _______________________________________________ > microformats-discuss mailing list > microformats-discuss@microformats.org > http://microformats.org/mailman/listinfo/microformats-discuss > -- Gareth Rushgrove garethrushgrove.com morethanseven.net isitbirthday.com From ryan.lists.warpshare at gmail.com Thu Apr 10 08:25:06 2008 From: ryan.lists.warpshare at gmail.com (Ryan Parman) Date: Thu Apr 10 08:25:29 2008 Subject: [uf-discuss] Re: Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com> <006f01c89afa$b5afadb0$116bacca@COMCEN> Message-ID: <6A1AB78F-FD0A-42C2-9119-B3D29F042393@gmail.com> On Apr 10, 2008, at 5:45 AM, Toby A Inkster wrote: > Michael MD wrote: > >> There are still lots of people stuck with php4 on shared servers ... > > I suspect that won't be the case much longer after security updates > for > PHP4 are discontinued in August. > >> What did they expect? ... changing a language in ways that breaks >> existing code is hardly the way to encourage people to upgrade to a >> new >> version! > > There are virtually no changes in PHP5 that break existing PHP4 code. > > Some things like register_globals and auto-escaping of incoming > variables > are turned off by default in PHP5 (they were on by default in PHP 4.0 > IIRC) but can be switched on in php.ini or a .htaccess file in a > matter of > seconds. 1) PHP 4.x users/developers have also had *years* to upgrade outdated code, or change to similar software that's been updated this century. 2) There are things that can be done faster, more efficiently, with less code in PHP 5.x than in 4.x. As a developer, I favor less code. 3) If your shared hosting provider only has PHP 4.x support, change your host. There are lots and lots of PHP5-capable hosts out there for cheap. (Make sure the host has at least 5.1.x though. 5.0.x was ridiculously buggy.) 4) If this were an existing project with PHP 4.x support, then sure, maintain support if the cost is reasonable. But for any new project, I'd say to start on a 5.x codebase. -- Ryan Parman From ryan.lists.warpshare at gmail.com Thu Apr 10 09:05:47 2008 From: ryan.lists.warpshare at gmail.com (Ryan Parman) Date: Thu Apr 10 09:05:58 2008 Subject: [uf-discuss] Parsing XFN in PHP In-Reply-To: References: <73766b160804091118t1c5ad3bbof0bc5456898c2d1a@mail.gmail.com>