From lists at ben-ward.co.uk Fri Jun 6 11:37:20 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Fri Jun 6 16:08:08 2008
Subject: [uf-dev] Defining and Extending Value Excepting
In-Reply-To: <36A319113CF910438942741C4727ADFF01E97814@MOBY.Clarence.local>
References: <25D51B07-201C-48F3-A1F6-8B2909B88B15@ben-ward.co.uk> <36A319113CF910438942741C4727ADFF01E97814@MOBY.Clarence.local>
Message-ID: <8AA2B099-6FAA-43FC-B33E-C81326A95DE5@ben-ward.co.uk>

Hey guys,

I've tried to move this on a bit. I've clarified the "no nesting value inside value" discussion under the parsing bullet points, see http://microformats.org/wiki/value-excerption-pattern

I've also moved the "parsing to-do" section off that page and pushed it onto a proper -issues page for the pattern. I've restructured that from the discussions, so we've something to focus on there now.

Since it's better organised now, and by extension better organised for wider feedback, I'm going to publicise the existence of these pages on uf-discuss and invite the wider community to raise other issues.

Concerning the current open issues, I'd like to draw your attention to my most recent notes on them; see what you think.

* Excluded Fields (http://microformats.org/wiki/value-excerption-pattern-issues#Excluded_Fields)

We've been thinking in terms of excluding particular fields from being used with value excerpting. What if we flipped it? Make it opt-in for particular fields? Have each spec clarify "this field _may_ be used with value excerpting". That way, large fields like hAtom's entry-title, where value-excerpting has no (ahem) "value", won't be affected by it _and_ this would actually allow many of the problems with nesting microformats to be avoided without need for an "mfo"-like class processing instruction.

* Depth of Parsing (http://microformats.org/wiki/value-excerption-pattern-issues#Depth_of_Parsing)

Currently, parsing all descendants can cause the nested-microformat-value-overwriting-potential-world-of-pain issue, and MKaply seemed to think he'd seen documentation restricting value excerpting to children only. The options are: the mfo processing instruction proposal, which I dislike because it adds a processing instruction into an element which should be about the semantics of the content, not "how to parse this"; restricting it to children only, which I suspect could break lots of hCards' TEL fields; or perhaps the third option I've added today, which is to specify parsing children only as the default behaviour, but allow individual properties to override this to all descendants. Properties like TEL, where it's reasonable that it will be at the outer edge of a DOM tree, could permit all descendants to be parsed as value.

Both excluding fields and parsing grandchildren add optionality to particular fields; from a parsing POV I see it as a set of switches when parsing each field, which can at least be clearly defined. Does that seem reasonable?

As always, feedback most welcome and requested!

B

From lists at ben-ward.co.uk Wed Jun 11 13:26:26 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Wed Jun 11 14:10:46 2008
Subject: [uf-dev] How do we (want to) document parsing?
Message-ID:

Parser devs,

I've been carrying on work on speccing value-excerpting. I'm keen that we set a good example of specifying parsing rules with this, with a view to requiring a higher standard in future and also going back to better spec the other patterns and microformats.

To be honest, I'm underqualified for this. Actually, wait, that's not true, I'm amply qualified but haven't applied any of my knowledge of representing processes and so forth in the real world. Anyway, digression.

I have, for better, worse or more likely embarrassment, put together a shoddy flow chart of how parsing of the value-excerption-pattern could work, factoring in the open issue of parsing @titles from empty elements (I'm working on the issues one at a time).

We don't have uploading enabled on the wiki, so it's here:
-ward.co.uk/ microformats/value-excerption-pattern/ValueExcerptionParseFlowChart.png

My question is simple. In creating it I came across one open issue with the parsing flow, so it's been useful to do, but I need to know: is it actually useful documentation in itself? Would you refer to something diagrammatic when implementing a parser? Or is there some other, better (perhaps more Wiki compatible) means of representing parsing rules and method branching that we should adopt? Would pseudo-code be sufficient?

I know test cases are also a big thing, and I'll produce some of those as well as I work through the issue log.

Thanks,

Ben

From aconbere at gmail.com Wed Jun 11 14:31:29 2008
From: aconbere at gmail.com (anders conbere)
Date: Wed Jun 11 15:28:11 2008
Subject: [uf-dev] How do we (want to) document parsing?
In-Reply-To:
References:
Message-ID: <8ca3fbe80806111431j53544bdek5620f7ad87f3394a@mail.gmail.com>

On Wed, Jun 11, 2008 at 1:26 PM, Ben Ward wrote:
> Parser devs,
>
> I've been carrying on work on speccing value-excerpting, I'm keen that we
> set a good example of specifying parsing rules with this, with a view to
> requiring a higher standard in future and also going back to better spec the
> other patterns and microformats.
>
> To be honest, I'm underqualified for this. Actually, wait, that's not true,
> I'm amply qualified but haven't applied any of my knowledge of representing
> processes and so forth in the real world. Anyway, digression.
>
> I have, for better, worse or more likely embarrassment, put together a
> shoddy flow chart of how parsing of the value-excerption-pattern could work,
> factoring in the open issue of parsing @titles from empty elements (I'm
> working on the issues one at a time).
>
> We don't have uploading enabled on the wiki, so it's here:
> -ward.co.uk/microformats/value-excerption-pattern/ValueExcerptionParseFlowChart.png

I'm not getting an image back here.

>
> My question is simple, in creating it I came across one open issue with the
> parsing flow, so it's been useful to do, but I need to know is it actually
> useful documentation in itself? Would you refer to something diagrammatic
> when implementing a parser? Or is there some other, better (perhaps more
> Wiki compatible) means of representing parsing rules and method branching
> that we should adopt? Would pseudo-code be sufficient?

I've been a big fan of representing the parsing rules in terms of claims or triples. This is how RDF describes its parsing rules, and it allows for easily codified tests.

~ Anders

>
> I know test cases are also a big thing, and I'll produce some of those as
> well as I work through the issue log.
>
> Thanks,
>
> Ben
> _______________________________________________
> microformats-dev mailing list
> microformats-dev@microformats.org
> http://microformats.org/mailman/listinfo/microformats-dev
>

From scott at randomchaos.com Wed Jun 11 17:58:53 2008
From: scott at randomchaos.com (Scott Reynen)
Date: Wed Jun 11 17:59:04 2008
Subject: [uf-dev] How do we (want to) document parsing?
In-Reply-To: <8ca3fbe80806111431j53544bdek5620f7ad87f3394a@mail.gmail.com>
References: <8ca3fbe80806111431j53544bdek5620f7ad87f3394a@mail.gmail.com>
Message-ID: <701EEC62-84BD-4F4B-9559-2C40504A599D@randomchaos.com>

On [Jun 11], at [ Jun 11] 3:31 , anders conbere wrote:

>> We don't have uploading enabled on the wiki, so it's here:
>> -ward.co.uk/microformats/value-excerption-pattern/
>> ValueExcerptionParseFlowChart.png
>
> I'm not getting an image back here.

I believe it's here:

http://ben-ward.co.uk/microformats/value-excerption-pattern/ValueExcerptionParseFlowChart.png

> I've been a big fan of representing the parsing rules in terms of
> claim or triples. This is how rdf describes it's parsing rules, and
> allows for easily codified tests.

I'm not clear on how that would work with microformats. I can see how triples could be used for testing, as the result a parser should get, but I'm not clear on how they could be used for describing the process by which a parser should arrive at that result, which seems to be what Ben is seeking. If you still think that would work after looking at Ben's flow chart, could you maybe translate it into triples as a demonstration?

Peace,
Scott

From brian.suda at gmail.com Thu Jun 12 01:21:08 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Thu Jun 12 01:21:15 2008
Subject: [uf-dev] How do we (want to) document parsing?
In-Reply-To: <701EEC62-84BD-4F4B-9559-2C40504A599D@randomchaos.com>
References: <8ca3fbe80806111431j53544bdek5620f7ad87f3394a@mail.gmail.com> <701EEC62-84BD-4F4B-9559-2C40504A599D@randomchaos.com>
Message-ID: <21e770780806120121l57de8fcagf190d68cf31be257@mail.gmail.com>

On Thu, Jun 12, 2008 at 12:58 AM, Scott Reynen wrote:
> http://ben-ward.co.uk/microformats/value-excerption-pattern/ValueExcerptionParseFlowChart.png

--- i had a look at the flow chart and found a few things that i think should be fixed and a few that i disagree with. (maybe we should number these nodes so it is easier to reference?)

1) I don't think values should be concatenated with a unicode char 0020 (a space). If there was intention to add white-space then that should be part of the value. We should not introduce additional information that was not explicitly marked-up.

2) If the value contains no inner-text, then use the @title. I think this was a proposal, but until we get more feedback it probably should not be part of our parsing rules. What would be the semantics in that? I know this is an attempt at a workaround, but i don't think it should be included in these parsing rules until we discuss it further.

TIDY still has bugs (or maybe it is a feature) with empty nodes.

Also, i don't know if this chart can handle or should handle nested values? Did we make a decision that nested value properties were to be ignored?

Great work Ben, this is much easier for people to understand than a series of bullet points.

Thanks,
-brian

--
brian suda
http://suda.co.uk

From lists at ben-ward.co.uk Thu Jun 12 02:36:52 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Thu Jun 12 02:36:59 2008
Subject: Value Excerption Pattern Parsing (was: [uf-dev] How do we (want to) document parsing?)
In-Reply-To: <21e770780806120121l57de8fcagf190d68cf31be257@mail.gmail.com>
References: <8ca3fbe80806111431j53544bdek5620f7ad87f3394a@mail.gmail.com> <701EEC62-84BD-4F4B-9559-2C40504A599D@randomchaos.com> <21e770780806120121l57de8fcagf190d68cf31be257@mail.gmail.com>
Message-ID:

On 12 Jun 2008, at 09:21, Brian Suda wrote:

> On Thu, Jun 12, 2008 at 12:58 AM, Scott Reynen
> wrote:
>> http://ben-ward.co.uk/microformats/value-excerption-pattern/ValueExcerptionParseFlowChart.png
>
> --- i had a look at the flow chart and found a few things that i think
> should be fixed and a few that i disagree with.

Disagreement is fine and very welcome.
This is all draft, in-progress work :-) Fundamentally, I'm keen to establish _how_ we represent this sort of process going forward, with the complete understanding that the detail of this current diagram can and will change.

> (maybe we should number these nodes so it is easier to reference?)

Could do, although http://microformats.org/wiki/value-excerption-pattern-issues provides numbering of sorts so perhaps refer to those for now?

> 1) I don't think values should be concatenated with a unicode char
> 0020 (a space). If there was intention to add white-space then those
> should be part of the value. We should not introduce additional
> information that was not explicitly marked-up.

The open issue is:
http://microformats.org/wiki/value-excerption-pattern-issues#White-space_behaviour_when_concatenating_value_nodes

Seems reasonable. The default case I was thinking of at the time was actually somewhat muddled with concatenating repeat properties: e.g. additional-name properties in hCard, which would want to be space-separated. For value, I now lean toward agreeing with you, in so far as regardless of the number of segments, we're still marking up a single µf property, rather than multiple occurrences of the same µf property.

> 2) If the value contains no inner-text, then use the @title. I think
> this was a proposal, but until we get more feedback it probably should
> not be part of our paring rules. What would be the semantics in that?
> I know this is an attempt at a worker-a-round, but i don't think it
> should be included in these parsing rules until we discuss it further.

This (http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements) is the open issue I'm currently working on, and building the diagram was a development exercise to clarify how it could be parsed.

The semantics are a little tricky, because we're working with the fact that HTML does not have a native means of doing this.
I think it's definable, though, so will have a go later.

> TIDY still has bugs (or maybe it is a feature) with empty nodes.

It does, and for some reason dropping empty elements is not a feature that can be switched off at the command line like other behaviours. However, I've found it's trivial to compile a version of tidy with "don't drop empty elements with class names" behaviour added, and will submit it as a patch when I get time.

That said, even without the patch making it back into the Tidy trunk any time soon, the fact is that Tidy can be made to work with an empty element technique. I've documented that on the -issues page (http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements).

> Also, i don't know if this chart can handle or should handle nested
> values? did we make a decision that nested value properties were to be
> ignored?

The reaction was negative, and you pointed out that from a publisher point-of-view nesting value in value was unnecessary; there seems to be no reason to do it. So (http://microformats.org/wiki/value-excerption-pattern-issues#Nested_value) I closed that issue and intend that we spec the pattern not to act recursively.

> Great work Ben, this is much easier for people to understand than a
> series of bullet points.

That's my intention. I think there's a lot of potential to explore better ways of documenting parsing rules.

B

From mkaply at us.ibm.com Thu Jun 12 07:30:50 2008
From: mkaply at us.ibm.com (Michael Kaply)
Date: Thu Jun 12 07:55:12 2008
Subject: Value Excerption Pattern Parsing (was: [uf-dev] How do we (want to) document parsing?)
In-Reply-To:
Message-ID:

Do we want to do a final collapsing of whitespace after everything is concatenated?

Like if the values are:

Michael Kaply

It would end up as:

<>Michael Kaply<>

Note that

MichaelKaply

would be

<>MichaelKaply<>

Because there is no whitespace between Michael and Kaply.

This would be similar to how we clean up whitespace in other properties.

Michael Kaply
Firefox Advocate
mkaply@us.ibm.com
http://www.kaply.com/weblog/ (External Blog)
http://blogs.tap.ibm.com/weblogs/page/mkaply@us.ibm.com (Internal Blog)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://microformats.org/discuss/mail/microformats-dev/attachments/20080612/bcc290ad/attachment.html

From brian.suda at gmail.com Fri Jun 13 01:14:31 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Fri Jun 13 01:14:42 2008
Subject: Value Excerption Pattern Parsing (was: [uf-dev] How do we (want to) document parsing?)
In-Reply-To:
References:
Message-ID: <21e770780806130114s51a11e5sbc3f716018a74d06@mail.gmail.com>

On 6/12/08, Michael Kaply wrote:
> Do we want to do a final collapsing of whitespace after everything is
> concatenated?

--- that's a good question? right now this is what i do (not to say it is the best way)

> Like if the values are:
>
> Michael Kaply
>
>
>
>
>
> It would end up as:
>
> <>Michael Kaply<>

--- in my case i would take the full first value " Michael " and concatenate that with " Kaply " then do a trim on the result. So i would drop the leading and trailing white-space, but preserve the double space in the middle of the name. My result would be "Michael  Kaply". It is debatable if that is correct or not.

> Note that
>
> MichaelKaply
>
> would be
>
> <>MichaelKaply<>

--- correct. I don't personally like the idea of adding a space by default. I tend to use value for things like phone numbers or email

123-ABCD(5678)
briando_not_spam_me@suda.co.uk

adding a space by default would be incorrect in these instances.

> This would be similar to how we clean up whitespace in other properties.

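The two whitespace behaviours under discussion can be compared with a minimal sketch. It assumes the value fragments have already been extracted from the markup as plain strings; the function names are hypothetical and are not taken from any existing parser.

```python
import re


def concat_and_trim(fragments):
    """Join the fragments exactly as written, then strip only the two ends."""
    return "".join(fragments).strip()


def concat_and_collapse(fragments):
    """Join the fragments, then also collapse internal whitespace runs."""
    return re.sub(r"\s+", " ", "".join(fragments)).strip()


name = [" Michael ", " Kaply "]
print(repr(concat_and_trim(name)))      # 'Michael  Kaply' (double space kept)
print(repr(concat_and_collapse(name)))  # 'Michael Kaply'

# Neither variant inserts a separator, so adjacent fragments stay joined.
print(concat_and_trim(["123-", "ABCD", "(5678)"]))  # 123-ABCD(5678)
```

Neither function adds a space between fragments, which matches the concern that phone numbers and email addresses split across several value elements must not gain whitespace.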
--- most of my value output is concatenated, then something like trim() is applied, which removes only the leading and trailing white-space but preserves any internal white-space. If the person is explicitly adding the spacing into the values, then we should probably honor that.

-brian

--
brian suda
http://suda.co.uk

From rff.rff at gmail.com Mon Jun 16 06:27:51 2008
From: rff.rff at gmail.com (gabriele renzi)
Date: Mon Jun 16 06:27:55 2008
Subject: [uf-dev] badly formatted hCard? (elements in children of class)
Message-ID: <828083e70806160627n464bb71l3338373854e58e94@mail.gmail.com>

Hi everyone,

given this code

div class="vcard"> Melanie Kl?? Ippendorfer Weg. 24
53127 Bonn
found online[1], am I correct in assuming that it is badly formatted? Specifically, the "url" property is on a SPAN element, so I would use "Melanie Kl??" as its value, but I take it that the author wanted me to use the child element, so the href value of the A tag.

Other uF parsers seem to handle this the same way that I do: look at the element with that class, not at its children, thus extracting a name as a url value.

Is this behaviour correct, or shall I do it differently (and report bugs to other uf-parser authors)?

Thanks in advance.

[1] http://weblog.netzgeschaedigt.de/?p=763

--
goto 10: http://www.goto10.it
blog it: http://riffraff.blogsome.com
blog en: http://www.riffraff.info

From brian.suda at gmail.com Mon Jun 16 08:47:40 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Mon Jun 16 08:47:46 2008
Subject: [uf-dev] badly formatted hCard? (elements in children of class)
In-Reply-To: <828083e70806160627n464bb71l3338373854e58e94@mail.gmail.com>
References: <828083e70806160627n464bb71l3338373854e58e94@mail.gmail.com>
Message-ID: <21e770780806160847h374e18cblbff2c882a280b381@mail.gmail.com>

On 6/16/08, gabriele renzi wrote:
> given this code [..] found online[1], am I correct in asuming that it is badly formatted?
> Specifically, the "url" property is in a SPAN element, so I would use
> "Melanie Kl??" as its value, but I take it that the author wanted me
> to use the child element, so the href value of the A tag.

--- you are correct in your parsing rules: if the class="url" is on the span, then it would use "Melanie Kl??", and if you did want the HTTP value, you would need to move the class="url" onto the 'a' element.

> Is this behaviour correct, or shall I do it differently (and report
> bugs to other uf-parser authors) ?

--- your parsing and the parsing of other parsers is correct. It seems to be an issue on the website[1]; it would be best to report the issue to them and help get it corrected.

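The rule confirmed in this exchange can be sketched as follows. This is only an illustration built on Python's standard html.parser; the class name UrlPropertyParser and the example markup are hypothetical (the original markup was stripped from the archive), and only the behaviour described above is modelled: class="url" on an a element yields its href, while class="url" on any other element yields that element's text.

```python
from html.parser import HTMLParser

# Elements with no closing tag; they must not affect the nesting counter.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class UrlPropertyParser(HTMLParser):
    """Collect the value of every class="url" property in a document."""

    def __init__(self):
        super().__init__()
        self.values = []
        self._depth = 0   # > 0 while inside a non-link element with class="url"
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._depth:
            self._depth += 1          # a child of the property element
            return
        attrs = dict(attrs)
        if "url" in (attrs.get("class") or "").split():
            if tag == "a" and "href" in attrs:
                self.values.append(attrs["href"])   # link: use the href
            else:
                self._depth = 1                     # otherwise: use the text
                self._text = []

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.values.append("".join(self._text).strip())

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)


on_span = UrlPropertyParser()
on_span.feed('<span class="fn url"><a href="http://example.org/">Melanie</a></span>')
print(on_span.values)   # ['Melanie']

on_link = UrlPropertyParser()
on_link.feed('<a class="fn url" href="http://example.org/">Melanie</a>')
print(on_link.values)   # ['http://example.org/']
```

The first case reproduces the reported page: the parser looks at the element carrying the class, not at its children, so the name comes back as the url value.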
Thanks for spotting that one,
-brian

> [1] http://weblog.netzgeschaedigt.de/?p=763

--
brian suda
http://suda.co.uk

From rff.rff at gmail.com Mon Jun 16 09:05:43 2008
From: rff.rff at gmail.com (gabriele renzi)
Date: Mon Jun 16 09:05:50 2008
Subject: [uf-dev] badly formatted hCard? (elements in children of class)
In-Reply-To: <21e770780806160847h374e18cblbff2c882a280b381@mail.gmail.com>
References: <828083e70806160627n464bb71l3338373854e58e94@mail.gmail.com> <21e770780806160847h374e18cblbff2c882a280b381@mail.gmail.com>
Message-ID: <828083e70806160905w2d399908gdac11d7b01f03681@mail.gmail.com>

On Mon, Jun 16, 2008 at 4:47 PM, Brian Suda wrote:
> On 6/16/08, gabriele renzi wrote:
>> given this code [..] found online[1], am I correct in asuming that it is badly formatted?
>> Specifically, the "url" property is in a SPAN element, so I would use
>> "Melanie Kl??" as its value, but I take it that the author wanted me
>> to use the child element, so the href value of the A tag.
>
> --- you are correct in you parsing rules, if the class="url" is on the
> span, then it would use "Melanie Kl??" and if you did want the HTTP
> value, you would need to move the class="url" onto the 'a' element.

yay :D

Would it make sense to add something along the lines of this to the test suite? I can provide a patch if needed, since I already have it in my test suite.

>> Is this behaviour correct, or shall I do it differently (and report
>> bugs to other uf-parser authors) ?
>
> --- you parsing and the parsing of other parsers is correct, it seems
> to be an issue on the website[1], it would be best to report the issue
> to them and help get it corrected.
>

I will, thanks again.

--
goto 10: http://www.goto10.it
blog it: http://riffraff.blogsome.com
blog en: http://www.riffraff.info

From dangiankit at gmail.com Thu Jun 19 02:30:37 2008
From: dangiankit at gmail.com (Ankit Dangi)
Date: Thu Jun 19 02:30:59 2008
Subject: [uf-dev] XFN needs a tool for recommending users to help link to blog posts
Message-ID:

Hi XFN Mates,

As I understand XFN, it allows the owner of a blog to link to his/her friends' blogs. Probably, that's what XFN stands for too. It seems to be a manual task, and also not foolproof. My reasons are given below.

There are three things which I see in connection with XFN: a blog, blog posts, and a blog roll. As my understanding has developed over the years, a blog is a container for blog posts and a blog roll. On the blog, the user links to his/her friend's blog via the blog roll, but might also link to a specific blog post of a friend from within his/her own blog post.

We add XFN to the blog roll and NOT to the blog post, though of course we can! (Refer: http://gmpg.org/xfn/faq). But if we seriously want the Friends Network to be strong enough, then I see a need for a tool which detects the user's friends' URLs (by matching against the ones in the blog roll) in the user's blog posts, and recommends that the user update those links and add XFN accordingly. The user then has the choice to add XFN to those links too, which otherwise have a high chance of going unnoticed.

*The key idea is to recommend to the user (that XFN is applicable, and useful) for any link that he/she is making to his/her friend's blog post.* Using this, similar blog posts from a group of friends could probably be identified, and the friends might be able to collaborate in a better way, adding true meaning to the XHTML Friends Network (XFN).

Cross-posted to the microformats-discuss mailing list.

--
Ankit Dangi
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://microformats.org/discuss/mail/microformats-dev/attachments/20080619/08cd78da/attachment.html

From fberriman at gmail.com Fri Jun 20 03:40:40 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Fri Jun 20 03:40:42 2008
Subject: [uf-dev] Using class for non-human data
Message-ID:

Hey all,

Firstly, with my BBC hat on, I wanted to point out that our Standards and Guidelines group have recently added a few additional clauses to our semantic markup standards. They are as follows (I don't think the most recent document is available yet, but I'll certainly link through to it when it's available):

--

5.1. Title attributes MUST contain human-readable data.

--

8.1. You MAY use microformats on your site where there are agreed specifications (refer to the Microformats community wiki site for details) with the exception of those that use the title attribute of HTML's abbr element.

8.1.1. Some microformats use the abbr element to conceal machine-readable data; for example, date-times and geographical coordinates. For screen-reader users that expand abbreviations they will hear the full date-time or coordinate; for example 2008-05-15T19:30:00+01:00 instead of 19:30.

8.1.2. If you want to use microformats in the abbr element you MUST first discuss this with the Editor, Standards and Guidelines.

8.2. If you do use microformats, you MUST ensure that the title attribute contains human-readable data. See also Title attributes above.

--

Consequently, we've been looking at the machine-data proposals in the hope that we'll be able to keep using things like hCalendar.

After having a chat with Ben about that document, we (myself and colleagues) have these additional concerns with the proposed solution:

* The empty tag causes potential problems in CMS implementations (i.e. some of our tools, for example, will publish instead of the desired empty element).
* Using two elements for one job.
* The data is not discretely associated with what it *should* be surrounding.
* Future proof? What if screen readers did start to implement always expanding title attributes, even on empty elements?

Additionally, we were concerned that using empty elements could encourage bad practices, and with our new (but not necessarily irreversible) guidelines about the contents of title attributes, we're a little stuck.

Being that our main concerns centre around the questionable use of "title", we've been looking at the idea of using "class" instead. Something along the lines of:

10 o'clock on the 10th

The pros to this would be that it's non-harmful and the HTML spec does suggest that user-agent data may be stored in class. On the downside, the semantics again could be questionable (but arguably less so than the semantics of title).

What I'm interested in talking about is what other problems arise from using "class". Can it be used, should it be used and what problems could there be from the parser's point of view? Have we missed something fundamental about why we don't already use the class attribute more often?

I'll create a wiki page for this shortly (any preference where you'd like this to live, anyone?). Just wanted to get this out there.

Cheers :)

F

--
Frances Berriman
http://fberriman.com

From brian.suda at gmail.com Fri Jun 20 04:23:06 2008
From: brian.suda at gmail.com (Brian Suda)
Date: Fri Jun 20 04:23:09 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To:
References:
Message-ID: <21e770780806200423v7104791sdc476037c3a260a3@mail.gmail.com>

On 6/20/08, Frances Berriman wrote:
> Being that our main concerns centre around the questionable use of
> "title", we've been looking at the idea of using "class" instead.
> Something along the lines of:
>
> 10 o'clock on the 10th
>
> I'll create a wiki page for this shortly (any preference where you'd
> like this to live, anyone?). Just wanted to get this out there.

---- much of this discussion has already happened and is documented here:
http://microformats.org/wiki/datetime-design-pattern
We can add, rebut, expand on what is there.

-brian

--
brian suda
http://suda.co.uk

From fberriman at gmail.com Sat Jun 21 05:56:19 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Sat Jun 21 05:56:22 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To: <21e770780806200423v7104791sdc476037c3a260a3@mail.gmail.com>
References: <21e770780806200423v7104791sdc476037c3a260a3@mail.gmail.com>
Message-ID:

2008/6/20 Brian Suda :
> On 6/20/08, Frances Berriman wrote:
>> Being that our main concerns centre around the questionable use of
>> "title", we've been looking at the idea of using "class" instead.
>> Something along the lines of:
>>
>> 10 o'clock on the 10th
>>
>> I'll create a wiki page for this shortly (any preference where you'd
>> like this to live, anyone?). Just wanted to get this out there.
>
> ---- much of this discussion has already happened and is documented here:
> http://microformats.org/wiki/datetime-design-pattern
> We can add, rebut, expand on what is there.
>
> -brian

Cool - started a new section.

http://microformats.org/wiki/datetime-design-pattern#Machine-data_in_class

--
Frances Berriman
http://fberriman.com

From lists at ben-ward.co.uk Sat Jun 21 13:43:04 2008
From: lists at ben-ward.co.uk (Ben Ward)
Date: Sat Jun 21 13:43:36 2008
Subject: [uf-dev] [value-excerption-pattern] Resolve Depth of Parsing
Message-ID: <90CF2E70-3192-4166-832F-30ADDF86FCC6@ben-ward.co.uk>

Hi devs,

I'm back on the value-excerption-pattern issues list, and working on the open Depth of Parsing issue. I'll quote it here for your convenience, but it's live on the wiki here:

>
>

Party on Sunday!

>
Tuesday class="value">2008-06-17
>

We're having a party on > Sunday, at 7pm! > 2008-06-22T19:00:00+0100 > . > Please bring your friends! >

>
Where the parsing rules for value-excerption-pattern parse all descendants by default, that results in the following hAtom structure:

> ENTRY
> ENTRY-TITLE=Party on Sunday!
> UPDATED=2008-06-17
> PUBLISHED=2008-06-17
> ENTRY-CONTENT=2008-06-22T19:00:00+0100

Note, this is not a case of one microformat embedded within another, which alone could be resolved by including the "mfo" pattern in this spec (assuming it were seen as a good idea, which would be debated in itself).

Instead, I propose the following parsing behaviour. It would solve this issue, and would not introduce additional processing instructions to the class attribute for properties (or root nodes). So:

* Specify that by default, parsers only parse *children* of the parent element and not all descendants
* Ideally that would be it, but Toby I expressed that children-only is too restrictive, so also make provision for individual properties to override the child-only default, and instead parse *all* descendants (where we do not feel a child will contain other properties)

This would result in a parse-depth flag on all fields, with some getting overridden to parse all descendants, which can be a well-structured and documented solution. Property name dictionaries in parsers would have to include the depth flag with the affected properties.

I think this is a better solution than adding mfo and an equivalent property-level processing flag (which is a lot of publishing complexity), and I think it makes most sense to default to the more conservative model (children only), with overriding to a liberal descendants parse for properties where it is required.

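The proposed parse-depth flag could be sketched roughly as below. This is an illustration under stated assumptions, not a spec: the PARSE_DEPTH table, the choice of "tel" as an opted-in property, the fallback to the element's full text, and the example markup are all hypothetical, and the markup is assumed to be well-formed so that the standard xml.etree.ElementTree module can read it.

```python
import xml.etree.ElementTree as ET

# Hypothetical property dictionary: children-only unless a property opts in.
DEFAULT_DEPTH = "children"
PARSE_DEPTH = {"tel": "descendants"}


def has_class(element, name):
    return name in element.get("class", "").split()


def excerpt_value(prop_el, prop_name, depth_table=PARSE_DEPTH):
    """Return the excerpted value of a property element, honouring its depth flag."""
    depth = depth_table.get(prop_name, DEFAULT_DEPTH)
    if depth == "descendants":
        candidates = [e for e in prop_el.iter() if e is not prop_el]
    else:
        candidates = list(prop_el)          # direct children only
    parts = ["".join(e.itertext()) for e in candidates if has_class(e, "value")]
    if parts:
        return "".join(parts)
    return "".join(prop_el.itertext())      # no value found: whole text


entry = ET.fromstring(
    '<div class="entry-content">We are having a party on '
    '<span class="dtstart">Sunday, at 7pm! '
    '<span class="value">2008-06-22T19:00:00+0100</span></span>. '
    'Please bring your friends!</div>'
)

# Children-only default: the nested dtstart value is not picked up.
print(excerpt_value(entry, "entry-content"))

# All-descendants parsing reproduces the overwriting problem described above.
print(excerpt_value(entry, "entry-content", {"entry-content": "descendants"}))
```

With the default table the first call returns the whole entry text, while forcing a descendants parse returns only the date, which is the ENTRY-CONTENT result shown in the structure above. The sketch does not guard against value nested inside value.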
Feedback on this behaviour would be greatly appreciated,

Thanks,

Ben

From fberriman at gmail.com Mon Jun 23 03:12:53 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Mon Jun 23 03:12:56 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To:
References: <21e770780806200423v7104791sdc476037c3a260a3@mail.gmail.com>
Message-ID:

On 21/06/2008, Frances Berriman wrote:
> 2008/6/20 Brian Suda :
>
> > On 6/20/08, Frances Berriman wrote:
> >> Being that our main concerns centre around the questionable use of
> >> "title", we've been looking at the idea of using "class" instead.
> >> Something along the lines of:
> >>
> >> 10 o'clock on the 10th
> >>
> >> I'll create a wiki page for this shortly (any preference where you'd
> >> like this to live, anyone?). Just wanted to get this out there.
> >
> > ---- much of this discussion has already happened and is documented here:
> > http://microformats.org/wiki/datetime-design-pattern
> > We can add, rebut, expand on what is there.
> >
> > -brian
>
>
> Cool - started a new section.
>
>
> http://microformats.org/wiki/datetime-design-pattern#Machine-data_in_class
>

Again, more information pertaining to this. The Programmes team have just announced their upcoming removal of hCal from /programmes to backstage.bbc.

http://www.bbc.co.uk/blogs/radiolabs/2008/06/removing_microformats_from_bbc.shtml

--
Frances Berriman
http://fberriman.com

From glenn.jones at madgex.com Mon Jun 23 08:12:24 2008
From: glenn.jones at madgex.com (Glenn Jones)
Date: Mon Jun 23 08:12:40 2008
Subject: [uf-dev] Using class for non-human data
Message-ID: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local>

>> Again, more information pertaining to this. The Programmes team have just announced their upcoming removal of hCal from /programmes
>> to backstage.bbc.
>> http://www.bbc.co.uk/blogs/radiolabs/2008/06/removing_microformats_from_bbc.shtml

I must say that although I am equally frustrated that there has not been a resolution to the abbreviation design pattern accessibility issue, the BBC response seems like a heavy-handed ploy to force things.

I sort of like the suggestion that Frances put forward; as Toby said on the wiki, "least harmful solution proposed so far". It should not take too much to add this to UfXtract.

I would have only used this pattern for data types meant as machine alternatives which remove human ambiguity:

Datetime
Durations
Timezones
Geo

These formats do not use spaces, which resolves some of the parsing issues Toby raised on the wiki page.

The following may have been easier for authors to understand?

10 o'clock
The use of {} for data is becoming more popular with OpenSearch etc. It directly links the property and value. Glenn Jones From fberriman at gmail.com Mon Jun 23 08:42:59 2008 From: fberriman at gmail.com (Frances Berriman) Date: Mon Jun 23 08:43:03 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> Message-ID: On 23/06/2008, Glenn Jones wrote: > The following may of been easier for authors to understand? > > 10 o'clock > > The use of {} for data is becoming more popular with OpenSearch etc. It > directly links the property and value. Merging it like that wouldn't be ideal for styling (we did toy with the idea of dstart-2005-10-10T10:10:10-0100, for example). The data- prefix would make the same data available to more than one attribute too - rather than having to repeat the same data more than once if it happens to be in the same element. I'm not sure if it's really easier to understand, to be honest. -- Frances Berriman http://fberriman.com From csarven at gmail.com Mon Jun 23 08:57:33 2008 From: csarven at gmail.com (Sarven Capadisli) Date: Mon Jun 23 08:57:38 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> Message-ID: Earlier this year, Andy Mabbett proposed a clear use of the "data" prefix here: http://microformats.org/discuss/mail/microformats-discuss/2008-February/011583.html Doesn't conflict with styling. -Sarven On Mon, Jun 23, 2008 at 10:42 AM, Frances Berriman wrote: > On 23/06/2008, Glenn Jones wrote: > >> The following may of been easier for authors to understand? >> >> 10 o'clock >> >> The use of {} for data is becoming more popular with OpenSearch etc. It >> directly links the property and value. 
> > Merging it like that wouldn't be ideal for styling (we did toy with > the idea of dstart-2005-10-10T10:10:10-0100, for example). > > The data- prefix would make the same data available to more than one > attribute too - rather than having to repeat the same data more than > once if it happens to be in the same element. > > I'm not sure if it's really easier to understand, to be honest. > > > > -- > Frances Berriman > http://fberriman.com > _______________________________________________ > microformats-dev mailing list > microformats-dev@microformats.org > http://microformats.org/mailman/listinfo/microformats-dev > From mkaply at us.ibm.com Mon Jun 23 08:57:57 2008 From: mkaply at us.ibm.com (Michael Kaply) Date: Mon Jun 23 08:58:23 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> Message-ID: > > The following may of been easier for authors to understand? > > 10 o'clock > > The use of {} for data is becoming more popular with OpenSearch etc. It > directly links the property and value. But how would you detect this in a parser? Currently we look for a class of dtstart. how would you do a getElementsByClassName? I personally don't like the BBC suggestion at all. Hiding data in the class tag just seems like a hack. Especially since I have to look at every class attribute to decide if it is data for the microformat. I'd almost rather use a non standard attribute. Michael Kaply Firefox Advocate mkaply@us.ibm.com http://www.kaply.com/weblog/ (External Blog) http://blogs.tap.ibm.com/weblogs/page/mkaply@us.ibm.com (Internal Blog) -------------- next part -------------- An HTML attachment was scrubbed... 
From mail at tobyinkster.co.uk Mon Jun 23 09:01:10 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Mon Jun 23 09:01:37 2008
Subject: [uf-dev] Using class for non-human data
Message-ID: <37129E32-A67B-44A3-BB57-4C3C1FE456BE@tobyinkster.co.uk>

Of course the other approach is to say "to hell with validity" and embrace RDFa's "content" attribute, which can be introduced in a very easy and straight-forward manner without using the rest of RDFa:

Today

(Of course, this *can* be made to be valid by using a custom DTD, or indeed the RDFa DTD.)

--
Toby A Inkster

From jaffathecake at gmail.com Mon Jun 23 09:43:15 2008
From: jaffathecake at gmail.com (Jake Archibald)
Date: Mon Jun 23 09:43:20 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To:
References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local>
Message-ID: <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com>

2008/6/23 Michael Kaply:
> But how would you detect this in a parser? Currently we look for a class of
> dtstart. how would you do a getElementsByClassName?
>
> I personally don't like the BBC suggestion at all. Hiding data in the class
> tag just seems like a hack. Especially since I have to look at every class
> attribute to decide if it is data for the microformat.
>
> I'd almost rather use a non standard attribute.

It is a hack, but so is using title. I find using class less hacky because the data doesn't end up in a human readable space (as title does). "For general purpose processing by user agents" is what the HTML spec says of the class attribute.

But yes, the dtstart class should remain, followed by a separate data class.

In implementations and standards, the class attribute has always been for machine data. This is not true of title.

Jake.
From guillaume at lebleu.org Mon Jun 23 14:09:51 2008
From: guillaume at lebleu.org (Guillaume Lebleu)
Date: Mon Jun 23 14:10:13 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To: <37129E32-A67B-44A3-BB57-4C3C1FE456BE@tobyinkster.co.uk>
References: <37129E32-A67B-44A3-BB57-4C3C1FE456BE@tobyinkster.co.uk>
Message-ID: <4860111F.5010104@lebleu.org>

Toby A Inkster wrote:
> Of course the other approach is to say "to hell with validity" and
> embrace RDFa's "content" attribute, which can be introduced in a very
> easy and straight-forward manner without using the rest of RDFa

Having followed the discussions on this matter for some time, it seems to me that we are indeed reaching a limit here, in terms of both staying compliant with XHTML semantics and adhering to an (unwritten?) principle that microformats should not influence how the human-readable content is written in the first place.

For those implementations not willing to say "to hell with validity", could they get away with machine-readable content for dates that gets formatted in a human friendly way in JavaScript for display to humans?

For instance, the HTML would be 2005-10-10T10:10:10-0100, but by way of a "data pretty printer" (something like http://ejohn.org/blog/javascript-pretty-date/), it would be displayed as "10:10am on October 10th 2005".

Is this a heresy? What do you think?
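
To make the pretty-printing step concrete, here is a minimal sketch of the idea. It is written in Python rather than the client-side JavaScript the message has in mind, and it is not based on the linked script: the names `ordinal` and `pretty_date` are hypothetical, and it assumes the machine-readable value always has the `YYYY-MM-DDTHH:MM:SS` plus `+HHMM`/`-HHMM` offset form shown above.

```python
from datetime import datetime

# Month names are spelled out here so the output does not depend on the locale.
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def ordinal(day):
    """Return the day of the month with an English ordinal suffix (1st, 2nd, ...)."""
    if 11 <= day % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")
    return f"{day}{suffix}"


def pretty_date(iso_text):
    """Format an ISO 8601 datetime (assumed form: 2005-10-10T10:10:10-0100) for humans.

    The time is shown as written; no timezone conversion is attempted.
    """
    moment = datetime.strptime(iso_text, "%Y-%m-%dT%H:%M:%S%z")
    hour12 = moment.hour % 12 or 12
    meridiem = "am" if moment.hour < 12 else "pm"
    month = MONTHS[moment.month - 1]
    return f"{hour12}:{moment.minute:02d}{meridiem} on {month} {ordinal(moment.day)} {moment.year}"


print(pretty_date("2005-10-10T10:10:10-0100"))  # 10:10am on October 10th 2005
```

Whatever language does the formatting, the readable text only exists once a script has run, which is the dependency on scripting that the question above is really asking about.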
Guillaume From norm at cackhanded.net Mon Jun 23 14:21:46 2008 From: norm at cackhanded.net (Mark Norman Francis) Date: Mon Jun 23 14:21:51 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> Message-ID: <79025A92-1F49-4863-AB58-D4792C8DA6BC@cackhanded.net> > I must say that although I am equal frustrate that there has not > been a > resolve the abbreviation design pattern accessible issue, the BBC > response seems like a heavy handed ploy to force things. I just want to say as a little aside, I didn't take it as a ploy nor as heavy-handed myself. Although this could be a side-effect of my having made the same decision for what is probably very similar reasons in my job at Y!. I just didn't blog about it openly, whereas the BBC did. It's a matter of priorities -- and it would seem that, for the people who set the semantic standards at the BBC, accessibility and clarity of content for humans takes priority over encoding data to be machine readable. -- Norm. From brady.k at gmail.com Mon Jun 23 14:31:17 2008 From: brady.k at gmail.com (Kyle Brady) Date: Mon Jun 23 14:31:21 2008 Subject: [uf-dev] Implementation Question Message-ID: Hi, I've recently started working on a project that I've dubbed "mySocialBlog" ( http://code.google.com/p/my-social-blog), and was wondering if anyone would be interested in being the "microformats expert" on this project? Basically, I'm trying to create a way for people to implement the same social information they might put on Facebook on their blog, using CSV [exported spreadsheets, for now], and microformat it. In essence, an "all about me" network... not really social, but the profile aspect of social networks. Anyways, I want to make sure I'm doing it right, and could use some help. 
If you want to check out the progress so far, see it on my blog ( http://www.kyle-brady.com/my-library is a good example)... the code release is coming soon. Thanks, and hope to hear from some of you! -- Kyle Brady 750 Miller St., Apt. 404 San Jose, California 95110 408-828-3861 My Business: http://www.int-ind.com My OneSwirl: http://www.oneswirl.com/KyleBrady [all contact methods available at OneSwirl] -------------- next part -------------- An HTML attachment was scrubbed... URL: http://microformats.org/discuss/mail/microformats-dev/attachments/20080623/11362065/attachment-0001.html From aconbere at gmail.com Mon Jun 23 14:58:10 2008 From: aconbere at gmail.com (anders conbere) Date: Mon Jun 23 14:58:12 2008 Subject: [uf-dev] Implementation Question In-Reply-To: References: Message-ID: <8ca3fbe80806231458s60273f49i21856ad0d3975dea@mail.gmail.com> On Mon, Jun 23, 2008 at 2:31 PM, Kyle Brady wrote: > Hi, > > I've recently started working on a project that I've dubbed "mySocialBlog" > (http://code.google.com/p/my-social-blog), and was wondering if anyone would > be interested in being the "microformats expert" on this project? > > Basically, I'm trying to create a way for people to implement the same > social information they might put on Facebook on their blog, using CSV > [exported spreadsheets, for now], and microformat it. In essence, an "all > about me" network... not really social, but the profile aspect of social > networks. > > Anyways, I want to make sure I'm doing it right, and could use some help. > If you want to check out the progress so far, see it on my blog > (http://www.kyle-brady.com/my-library is a good example)... the code release > is coming soon. Not sure if you've talked to them, but it might be interesting for you to talk with the Diso project. ~ Anders > > Thanks, and hope to hear from some of you! > > -- > Kyle Brady > 750 Miller St., Apt. 
404 > San Jose, California 95110 > 408-828-3861 > > My Business: http://www.int-ind.com > My OneSwirl: http://www.oneswirl.com/KyleBrady > > [all contact methods available at OneSwirl] > _______________________________________________ > microformats-dev mailing list > microformats-dev@microformats.org > http://microformats.org/mailman/listinfo/microformats-dev > > From danbri at danbri.org Mon Jun 23 14:58:45 2008 From: danbri at danbri.org (Dan Brickley) Date: Mon Jun 23 14:58:50 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> Message-ID: <48601C95.2080205@danbri.org> Jake Archibald wrote: > 2008/6/23 Michael Kaply >: > > > > But how would you detect this in a parser? Currently we look for a > class of dtstart. how would you do a getElementsByClassName? > > > > I personally don't like the BBC suggestion at all. Hiding data in > the class tag just seems like a hack. Especially since I have to > look at every class attribute to decide if it is data for the > microformat. > > I'd almost rather use a non standard attribute. > > > It is a hack, but so is using title. I find using class less hacky > because the data doesn't end up in a human readable space (as title > does). "For general purpose processing by user agents" is what the HTML > spec says of the class attribute. > > But yes, the dtstart class should remain, followed by a separate data class. > > In implementations and standards, the class attribute has always been > for machine data. This is not true of title. That's my reading too; 'class' seems a home worth investigating for this data... 
Dan -- http://danbri.org/ From brady.k at gmail.com Mon Jun 23 15:14:33 2008 From: brady.k at gmail.com (Kyle Brady) Date: Mon Jun 23 15:14:37 2008 Subject: [uf-dev] Implementation Question In-Reply-To: <8ca3fbe80806231458s60273f49i21856ad0d3975dea@mail.gmail.com> References: <8ca3fbe80806231458s60273f49i21856ad0d3975dea@mail.gmail.com> Message-ID: I've heard of them, only just recently actually, but are you saying they would be interested in helping me with the microformats check? Or that they are doing something like I am? Thanks -- Kyle Brady 750 Miller St., Apt. 404 San Jose, California 95110 408-828-3861 My Business: http://www.int-ind.com My Blog: http://www.kyle-brady.com -------------- next part -------------- An HTML attachment was scrubbed... URL: http://microformats.org/discuss/mail/microformats-dev/attachments/20080623/c4bee9d0/attachment.html From jaffathecake at gmail.com Tue Jun 24 00:00:38 2008 From: jaffathecake at gmail.com (Jake Archibald) Date: Tue Jun 24 00:00:42 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <4860111F.5010104@lebleu.org> References: <37129E32-A67B-44A3-BB57-4C3C1FE456BE@tobyinkster.co.uk> <4860111F.5010104@lebleu.org> Message-ID: <3be0bf100806240000q2b34da95lc895c1016f7707c8@mail.gmail.com> Any solution which requires CSS, JavaScript, prevents HTML4 / XHTML validation, or puts machine data in a human readable place isn't really an option for the BBC (and sites with a similar range of users). For me, the great thing about microformats is they don't break validation and *shouldn't* impact on usability & accessibility. 
On 6/23/08, Guillaume Lebleu wrote: > Toby A Inkster wrote: >> Of course the other approach is to say "to hell with validity" and >> embrace RDFa's "content" attribute, which can be introduced in a very >> easy and straight-forward manner without using the rest of RDFa > Having followed the discussions on this matter for some time, it seems > to me that we are indeed reaching a limit here, in terms of keeping both > compliant with XHTML semantics and adhering to a (unwritten?) principle > that microformats should not influence how the human-readable content is > written in the first place. > > For those implementations not willing to say "to hell with validity", > could they get away with a machine-readable content for dates that gets > formatted in a human friendly way in JavaScript for display to humans? > > For instance, the HTML would be 2005-10-10T10:10:10-0100, but by way of a "data pretty > printer" (something like http://ejohn.org/blog/javascript-pretty-date/), > it would be displayed as "10:10am on October 10th 2005". > > Is this a heresy? What do you think? > > Guillaume > > > _______________________________________________ > microformats-dev mailing list > microformats-dev@microformats.org > http://microformats.org/mailman/listinfo/microformats-dev > -- Sent from Google Mail for mobile | mobile.google.com From andr3.pt at gmail.com Tue Jun 24 04:52:23 2008 From: andr3.pt at gmail.com (=?ISO-8859-1?Q?Andr=E9_Lu=EDs?=) Date: Tue Jun 24 04:52:27 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <48601C95.2080205@danbri.org> References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org> Message-ID: Since the problems are arising from machine-data ending up on human-readable attributes, why can't we "compromise" and accept to have machine-data-values on non-human-readable attributes? 
Also, extending the document with namespaces limits the usage to XHTML, and according to POSH principles, we don't want that.

Like you guys mentioned, leaving the dtstart but adding an extra value... would it be too much of a hassle for parsers?

Today

1. grab elementByClassName( dtstart )
2. get classnames as array
3. grab classname after dtstart (i.e., i+1, i being the index of dtstart), does it match /data{[^}]*}/ ?
4. if yes, use it as value.

What's so wrong with this approach? Isn't it widely accepted that this is the Achilles' heel of all design patterns used by microformats?

We must start accepting the fact that without extending HTML we don't have many attributes to choose from...

--
André Luís

On Mon, Jun 23, 2008 at 10:58 PM, Dan Brickley wrote:
> Jake Archibald wrote:
>>
>> 2008/6/23 Michael Kaply:
>>
>> But how would you detect this in a parser? Currently we look for a
>> class of dtstart. how would you do a getElementsByClassName?
>>
>> I personally don't like the BBC suggestion at all. Hiding data in
>> the class tag just seems like a hack. Especially since I have to
>> look at every class attribute to decide if it is data for the
>> microformat.
>>
>> I'd almost rather use a non standard attribute.
>>
>> It is a hack, but so is using title. I find using class less hacky because
>> the data doesn't end up in a human readable space (as title does). "For
>> general purpose processing by user agents" is what the HTML spec says of the
>> class attribute.
>>
>> But yes, the dtstart class should remain, followed by a separate data
>> class.
>>
>> In implementations and standards, the class attribute has always been for
>> machine data. This is not true of title.
>
> That's my reading too; 'class' seems a home worth investigating for this
> data...
> > Dan > > -- > http://danbri.org/ > _______________________________________________ > microformats-dev mailing list > microformats-dev@microformats.org > http://microformats.org/mailman/listinfo/microformats-dev > From eivindu at ifi.uio.no Tue Jun 24 05:13:57 2008 From: eivindu at ifi.uio.no (Eivind Uggedal) Date: Tue Jun 24 05:14:01 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org> Message-ID: <824b51d00806240513x36e9009cof64436ff7550c811@mail.gmail.com> > Today > > 1. grab elementByClassName( dtstart ) > 2. get classnames as array > 3. grab classname after dtstart(ie, i+1, i being the index of > dtstart), does it match /data{[^}]*}/ ? It would be potentially dangerous to have assumptions of the ordering of class names. Another level of unneeded complexity. This snippet should also be parseable: Today -- Cheers, Eivind Uggedal Engineer, Faculty of Social Science, MSc Computer Science, University of Oslo From andr3.pt at gmail.com Tue Jun 24 05:24:07 2008 From: andr3.pt at gmail.com (=?ISO-8859-1?Q?Andr=E9_Lu=EDs?=) Date: Tue Jun 24 05:24:10 2008 Subject: [uf-dev] Using class for non-human data In-Reply-To: <824b51d00806240513x36e9009cof64436ff7550c811@mail.gmail.com> References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org> <824b51d00806240513x36e9009cof64436ff7550c811@mail.gmail.com> Message-ID: On Tue, Jun 24, 2008 at 1:13 PM, Eivind Uggedal wrote: >> Today >> >> 1. grab elementByClassName( dtstart ) >> 2. get classnames as array >> 3. grab classname after dtstart(ie, i+1, i being the index of >> dtstart), does it match /data{[^}]*}/ ? > > It would be potentially dangerous to have assumptions of the ordering > of class names. Another level of unneeded complexity. 
This snippet
> should also be parseable:
>
> Today
>
> --

Eivind, I understand that. I was trying to provide an example that allowed multiple classnames + associated values within the same element. I agree it's added complexity... if you never really need to add extra classnames to the same element and specify their data values, it's perfectly fine using whatever data{.*} you find (first?). :)

Oh one thing I haven't seen mentioned is... this doesn't have to _replace_ abbr design pattern, does it? If parsers added this way to parse values, authors with accessibility concerns could use this instead, while avoiding breaking current deployments of abbr DP.

--
André

From scott at randomchaos.com Tue Jun 24 05:37:59 2008
From: scott at randomchaos.com (Scott Reynen)
Date: Tue Jun 24 05:38:11 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To:
References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org>
Message-ID: <85F3290F-AA48-4A70-9FEB-00046724A2CB@randomchaos.com>

On Jun 24, at 5:52, André Luís wrote:
> Today abbr>
>
> 1. grab elementByClassName( dtstart )
> 2. get classnames as array
> 3. grab classname after dtstart(ie, i+1, i being the index of
> dtstart), does it match /data{[^}]*}/ ?
> 4. if yes, use it as value.
>
> What's so wrong with this approach?

I'd say there's nothing "so" wrong about it, but there are problems. Specifically, "data{2008-06-23}" doesn't seem to be an actual classification of "Today." It doesn't make much sense to say "Today belongs to the class data{2008-06-23}." But the HTML spec says "the element may be said to belong to these classes." Unfortunately we may not see the practical implications of such a seemingly insignificant deviation from the spec until after a decision is made, as happened with the abbr pattern.
Another seemingly small issue: this solution binds us to machine-readable data formats that have no spaces. These may not be reasons to discard this solution, but I hope they're at least reasons to more thoroughly research potential problems so we don't make the same type of mistake again.

Peace,
Scott

From fberriman at gmail.com Tue Jun 24 05:56:06 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Tue Jun 24 05:56:21 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To:
References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org> <824b51d00806240513x36e9009cof64436ff7550c811@mail.gmail.com>
Message-ID:

On 24/06/2008, André Luís wrote:
> Oh one thing I haven't seen mentioned is... this doesn't have to
> _replace_ abbr design pattern, does it? If parsers added this way to
> parse values, authors with accessibility concerns could use this
> instead, while avoding breaking current deployments of abbr DP.
>

No - I don't think it should replace it. If an author feels the abbr is the correct option, they should still be able to use that.

--
Frances Berriman
http://fberriman.com

From mail at tobyinkster.co.uk Tue Jun 24 06:40:58 2008
From: mail at tobyinkster.co.uk (Toby A Inkster)
Date: Tue Jun 24 06:41:29 2008
Subject: [uf-dev] Using class for non-human data
Message-ID: <1EB717A2-C738-4D1C-A166-DFBEEF55CA61@tobyinkster.co.uk>

Scott Reynen wrote:
> Another seemingly small issue: this solution
> binds us to machine-readable data formats that have no spaces.

If you take a look at the Wiki section for this proposal, you'll see details of my experimental implementation of this pattern. It allows publishers to percent-encode characters such as spaces, which can't occur in class names. For example:

UK

(Though of course, in the example above, the design pattern is perfectly accessible.)
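
The percent-encoding idea can be sketched roughly as follows. This is only an illustration, not the experimental implementation described on the wiki: it assumes the `data-` prefix variant mentioned earlier in the thread, the function name `machine_value` and the sample class values are hypothetical, and it deliberately ignores the order of class names, since relying on ordering was raised as a concern.

```python
from urllib.parse import unquote

# Assumed convention (hypothetical): a class token of the form "data-<value>"
# carries the machine-readable value, percent-encoded where needed.
DATA_PREFIX = "data-"


def machine_value(class_attr, prop):
    """Return the decoded machine value for property `prop`, or None.

    `class_attr` is the raw value of an element's class attribute.
    The element must carry the property class (e.g. "dtstart"); the first
    "data-" token found is used, wherever it sits in the class list.
    """
    tokens = class_attr.split()
    if prop not in tokens:
        return None
    for token in tokens:
        if token.startswith(DATA_PREFIX) and len(token) > len(DATA_PREFIX):
            # Percent-decoding restores characters such as spaces, which
            # cannot appear inside a single class name.
            return unquote(token[len(DATA_PREFIX):])
    return None


print(machine_value("dtstart data-2008-06-23", "dtstart"))
print(machine_value("country-name data-United%20Kingdom", "country-name"))
```

A sketch like this supports only one data value per element; carrying values for several properties on the same element would need some way of pairing each value with its property, which is where the ordering question comes back in.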
http://microformats.org/wiki/datetime-design-pattern#Machine-data_in_class

--
Toby A Inkster

From jaffathecake at gmail.com Tue Jun 24 08:54:42 2008
From: jaffathecake at gmail.com (Jake Archibald)
Date: Tue Jun 24 08:54:46 2008
Subject: [uf-dev] Using class for non-human data
In-Reply-To: <85F3290F-AA48-4A70-9FEB-00046724A2CB@randomchaos.com>
References: <36A319113CF910438942741C4727ADFF020DF7E0@MOBY.Clarence.local> <3be0bf100806230943i69524869w654648f77633a750@mail.gmail.com> <48601C95.2080205@danbri.org> <85F3290F-AA48-4A70-9FEB-00046724A2CB@randomchaos.com>
Message-ID: <3be0bf100806240854x4eacb35epec1590b4771d8b2e@mail.gmail.com>

One possible issue with the data{blah} pattern, if you were to point at that with a css selector, you'd need to escape the curly braces.

span.dtstart.data\{20080101\} { color:red; }

obviously the above wouldn't work at all in IE6, but you see what I'm getting at. This wouldn't be an issue with data-blah.

On 6/24/08, Scott Reynen wrote:
> On Jun 24, at 5:52, André Luís wrote:
>
>> Today abbr>
>>
>> 1. grab elementByClassName( dtstart )
>> 2. get classnames as array
>> 3. grab classname after dtstart(ie, i+1, i being the index of
>> dtstart), does it match /data{[^}]*}/ ?
>> 4. if yes, use it as value.
>>
>> What's so wrong with this approach?
>
> I'd say there's nothing "so" wrong about it, but there are problems.
> Specifically, "data{2008-06-23}" doesn't seem to be an actual
> classification of "Today." It doesn't make much sense to say "Today
> belongs to the class data{2008-06-23}." But the HTML spec says " the
> element may be said to belong to these classes." Unfortunately we may
> not see the practical implications of such a seemingly insignificant
> deviation from the spec until after a decision is made, as happened
> with the abbr pattern. Another seemingly small issue: this solution
> binds us to machine-readable data formats that have no spaces.
These > may not be reasons to discard this solution, but I hope they're at > least reasons to more thoroughly research potential problems so we > don't make the same type of mistake again. > > Peace, > Scott > > > _______________________________________________ > microformats-dev mailing list > microformats-dev@microformats.org > http://microformats.org/mailman/listinfo/microformats-dev > -- Sent from Google Mail for mobile | mobile.google.com From guillaume at lebleu.org Wed Jun 25 13:13:47 2008 From: guillaume at lebleu.org (Guillaume Lebleu) Date: Wed Jun 25 13:13:54 2008 Subject: [uf-dev] impact of new vCard on hCard Message-ID: <4862A6FB.8060901@lebleu.org> I noticed that the latest vCard draft specification [1] requires the content of the TEL property to be of type URI with tel scheme. Operator supports this and hCard allows it, but if I understand correctly, that would mean that an hCard compliant with this new spec would require the phone number to always be represented with an HTML anchor: +1 (415) 407 5856 Thoughts? should I add this to hCard issues on the wiki? Guillaume --- [1] http://www.ietf.org/internet-drafts/draft-ietf-vcarddav-vcardrev-02.txt Excerpt: 7.4.1. TEL Purpose: To specify the telephone number for telephony communication with the object the vCard represents. Value type: A single URI value. It is expected that the URI scheme will be "tel", as specified in [RFC3966], but other schemes MAY be used. From mail at tobyinkster.co.uk Wed Jun 25 14:09:20 2008 From: mail at tobyinkster.co.uk (Toby A Inkster) Date: Wed Jun 25 14:09:59 2008 Subject: [uf-dev] impact of new vCard on hCard Message-ID: <9D659279-6A24-4A47-BA63-046A101A0D57@tobyinkster.co.uk> Guillaume Lebleu wrote: > I noticed that the latest vCard draft specification [1] requires the > content of the TEL property to be of type URI with tel scheme. 
>
> Operator supports this and hCard allows it, but if I understand
> correctly, that would mean that an hCard compliant with this new spec
> would require the phone number to always be represented with an
> HTML anchor:
>
> +1 (415) 407 5856
>
> Thoughts? should I add this to hCard issues on the wiki?

I'm not sure why this would be an issue.

Firstly, hCard normatively references version *3.0* of vCard. The draft spec is for vCard 4.0.

Secondly, just because hCard re-uses vCard's terms and ideas, it does not follow that hCard re-uses vCard's syntax. For example, address components in vCard need to be separated by semicolons -- but they do not need to be separated by semicolons in hCard. As another example, the "N" property in vCard is always presented in family name, given name, additional name, honorific prefix, honorific suffix order, but the sub-properties of "n" in hCard may be given in any order. A change in vCard syntax does not need to carry over to hCard, as they already have entirely different syntaxes.

Some of the more interesting implications of a new version of vCard are the new properties available. For example, people using hCard for genealogy purposes may have been frustrated that although hCard offers "bday" for marking up a date of birth, it does not offer a corresponding property for date of death. Now that vCard has added a DDAY property for marking up a contact's date of death, it is fairly safe to say that if hCard ever does include a property for marking up dates of death, then it will almost certainly be called "dday". Genealogists can start replacing their own custom class="date-of-death", class="died", etc. markup with class="dday".

Since April, Cognition has included additional support for the following vCard 4.0 properties:

- kind (e.g.
"individual", "org", etc)
- gender
- birth (place of birth)
- dday
- death (place of death)
- impp (like "url", but for instant messaging)
- lang (preferred spoken/written languages)

This support is documented here: http://buzzword.org.uk/cognition/uf-plus.html#hcard

I see that as of today, they've also added "related" and "member". The former will be especially interesting to fans of XFN. I'll look into implementing them in Cognition too.

--
Toby A Inkster

From fberriman at gmail.com Thu Jun 26 09:17:50 2008
From: fberriman at gmail.com (Frances Berriman)
Date: Thu Jun 26 09:17:52 2008
Subject: [uf-dev] Re: Using class for non-human data
In-Reply-To:
References:
Message-ID:

On 20/06/2008, Frances Berriman wrote:
> Hey all,
>
> Firstly, with my BBC hat on, I wanted to point out that our Standards
> and Guidelines group have recently added a few additional clauses to
> our semantic markup standards. They are as follows (I don't think the
> most recent document is available yet, but I'll certainly link through
> to it when it's available):

As promised: http://www.bbc.co.uk/guidelines/newmedia/technical/semantic_markup.shtml#microformats

--
Frances Berriman
http://fberriman.com

From glenn.jones at madgex.com Sun Jun 29 07:17:17 2008
From: glenn.jones at madgex.com (Glenn Jones)
Date: Sun Jun 29 07:17:24 2008
Subject: [uf-dev] Human and machine readable data format
Message-ID: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local>

As we turn around on the spot over the machine data issue, the question of Natural Language Processing (NLP) has come up again. The main problem with any form of NLP is that there are too many ambiguities in reading dates or any other form of freeform human-written text. I don't want us to go down this path; it is unworkable with currently available technologies.

Against this we have statements like Tantek's.

"I'm vehemently opposed to putting data in the class attribute. We must find better alternatives.
We must not go down the path of invisible (dark) (meta)data - IMHO that principle is inviolable for microformats."

So I have tried to look at this again and reconcile the two opposing drivers above. Each time it makes me think of a mixed-mode human and machine readable format: a date format which is human readable but has a very strict format which can be parsed.

So rather than talk about it I have built a little prototype which demos the idea.

http://ufxtract.com/experimental/hm-readable-date.htm

This approach is not without its own problems, but it would provide a semantic use of the abbr pattern which does not raise any accessibility concerns.

Jan 25 08

On the down side we would have to re-invent the wheel with yet another date format. This approach would make parsers a lot heavier. Authors would have to understand the strict nature of the extended format using the abbr title. etc

I thought I would put this forward - to get shot down ;-)

This concept could be extended to the other data formats:

Date: 25 January 2008
Date: 25 January 2008 at 15:30
Date: 25 January 2008 at 15:30, Time zone +1:30
Duration: 3 minutes, 47 seconds
Location: latitude 37.77, longitude -122.41
Time zone: +1:30
Rated 1 out of 5

Glenn Jones

From danny.ayers at gmail.com Sun Jun 29 10:35:56 2008
From: danny.ayers at gmail.com (Danny Ayers)
Date: Sun Jun 29 10:36:00 2008
Subject: [uf-dev] Human and machine readable data format
In-Reply-To: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local>
References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local>
Message-ID: <1f2ed5cd0806291035t556ec1datcbbe2340d3244a97@mail.gmail.com>

2008/6/29 Glenn Jones:
> As we turnaround on the spot about machine data issue, the question of
> Natural Language Processing (NPL) has come up again. The main problem
> with any form of NLP is there are too many ambiguities in reading dates
> or any other form of freeform human written text.
I don't want us to go > down this path it is unworkable with currently available technologies. I'm sure others are more capable than I of giving good responses to your date format suggestions. But I find it interesting you should bring NLP up over here. I'm afraid I can't resist chipping in on that ;-) So the basic scenario is presumably the producer(s) wish to convey information to the consumer(s). [Either of which may be human or largely automated systems] * With an isolated Plain Old Semantic HTML document, the majority of the information is encoded in human-readable text, enhanced with markup elements (e.g. for emphasis). * With HTML+HTTP, we get extra semantics through linking - even if it's just pageA is somehow related to pageB * With microformats there can be communication of machine-readable data embedded in the HTML - caveat: as generally found in the wild, interpretation of the message from producer to consumer relies on them both having prior knowledge of the conventions of microformats.org - effectively a registry of keywords (though only discoverable with manual intervention - Google etc) - however where @profile URIs are provided, the consumer can "follow their nose" to these other resources to discover the semantics intended by the producer, * Other languages are available (notably RDF, in this context especially RDFa and microformats used in concert with GRDDL) where there is, thanks to the 'follow your nose' discovery of URIs/HTTP, a more direct route to machine-interpretability In all these cases, at the end of the chain (of authority) there will be a human element - the folks that designed the super-duper furniture ontology may have their own world view that differs from those of others in the furniture trade. They may simply have got stuff wrong. 
Fortunately use of URIs allows potentially conflicting statements (in data, as in Web documents) to coexist, and it's up to the consumer to apply their own judgement on what to trust (based on provenance etc). Now in the case of NLP, consumer-side heuristics will be applied to extract something from text which *may* correspond to the producer's intended message. So now not only do you have issues of provenance/trust, there's also the margin of error of the heuristics to be factored in. Overall, this seems to be a situation with a range of communication possibilities - from lo-fidelity tag soup markup up to generally unambiguous hi-fidelity communication thanks to data expressed as microformats with @profile URIs, or (more or less equivalently) using web data oriented languages such as RDF. Going back to the "extra semantics through linking" remark above, in whichever of the above approaches the data is expressed and/or interpreted, the value of that data can be significantly increased through using linked data techniques. Yeah, I had to get that in. http://en.wikipedia.org/wiki/Linked_Data Bottom line is that the Web is a vastly broad church, and ideally we should be maximising the benefit from all these approaches in as interoperable a fashion as possible - something like the old "think global act local" slogan. Cheers, Danny. -- http://dannyayers.com ~ http://blogs.talis.com/nodalities/this_weeks_semantic_web/
From norm at cackhanded.net Mon Jun 30 01:58:09 2008 From: norm at cackhanded.net (Mark Norman Francis) Date: Mon Jun 30 01:58:14 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> Message-ID: On 29 Jun 2008, at 15:17, Glenn Jones wrote: > Glenn, in this page you state: > MUST format must follow the pattern order i.e.in English date, > month year, time, timezone That is an internationalisation no-no. It's no longer "human readable" if the language in question demands a different order and you break it for the sake of easier machine parsing. -- Norm. From glenn.jones at madgex.com Mon Jun 30 02:37:24 2008 From: glenn.jones at madgex.com (Glenn Jones) Date: Mon Jun 30 02:37:29 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> Message-ID: <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> On 29 Jun 2008, Norm wrote: >That is an internationalisation no-no. It's not longer "human >readable" if the language is question demands a different order and >you break it for the sake of easier machine parsing. That's only an example for English. At the bottom of the page http://ufxtract.com/experimental/hm-readable-date.htm you will find the language descriptions which will be needed to configure parsers. There is a pattern property which allows for different orders. The Simplified Chinese data format has a different order to the others. The idea is that the parser reads the lang attribute on the abbr and applies the correct language description.
It will be a pain to build up all the international descriptions needed, but it's the only way if we wish to have human readable dates that can be parsed by machines. Glenn From danbri at danbri.org Mon Jun 30 02:52:29 2008 From: danbri at danbri.org (Dan Brickley) Date: Mon Jun 30 02:52:34 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> Message-ID: <4868ACDD.6090102@danbri.org> Glenn Jones wrote: > On 29 Jun 2008, Norm wrote: >> That is an internationalisation no-no. It's not longer "human >> readable" if the language is question demands a different order and >> you break it for the sake of easier machine parsing. > > That's only an example for English. At the bottom of the page > > http://ufxtract.com/experimental/hm-readable-date.htm > > you will find a language descriptions which will be need to configure > parser's. There is a pattern property which allow for different orders.
> The Simplified Chinese data format has a different order to the others. > > The idea is that the parsers read the lang attribute on the abbr and > applies the correct language description. It will be a pain to build up > all the international descriptions needed, but it's the only way if we > wish to have human readable date's that can be parsed by machines. That's a very interesting approach. But do you think this can reasonably extend to use of other calendars, rather than just other scripts / natural languages? cheers, Dan -- http://danbri.org/ From norm at cackhanded.net Mon Jun 30 03:04:32 2008 From: norm at cackhanded.net (Mark Norman Francis) Date: Mon Jun 30 03:04:36 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> Message-ID: <5C71CA7E-7E22-493E-8C3D-AC0E23AA10A3@cackhanded.net> > The idea is that the parsers read the lang attribute on the abbr and > applies the correct language description. It will be a pain to build > up > all the international descriptions needed, but it's the only way if we > wish to have human readable date's that can be parsed by machines. Ah, missed that. My apologies, then. -- Norm. From mdagn at spraci.com Mon Jun 30 03:12:25 2008 From: mdagn at spraci.com (Michael MD) Date: Mon Jun 30 03:12:28 2008 Subject: [uf-dev] Human and machine readable data format References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> Message-ID: <004801c8da99$cb631960$116bacca@COMCEN> > The idea is that the parsers read the lang attribute on the abbr and > applies the correct language description. 
It will be a pain to build up > all the international descriptions needed, but it's the only way if we > wish to have human readable date's that can be parsed by machines. > and what do we do about people who write something like "25th January" when they really mean "25th January 2008" ? I think we have opened a nasty can of worms here! Some libraries for parsing dates will assume that it is this year.... which is VERY bad ... it should be rejected as being ambiguous. From danbri at danbri.org Mon Jun 30 03:24:04 2008 From: danbri at danbri.org (Dan Brickley) Date: Mon Jun 30 03:24:08 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <004801c8da99$cb631960$116bacca@COMCEN> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local> <36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> <004801c8da99$cb631960$116bacca@COMCEN> Message-ID: <4868B444.8030102@danbri.org> Michael MD wrote: >> The idea is that the parsers read the lang attribute on the abbr and >> applies the correct language description. It will be a pain to build up >> all the international descriptions needed, but it's the only way if we >> wish to have human readable date's that can be parsed by machines. >> > > and what do we do about people who write something like "25th January" > when they really mean "25th January 2008" ? > > I think we have opened a nasty can of worms here! > > Some libraries for parsing dates will assume that it is this year.... > which is VERY bad > ... it should be rejected as being ambiguous. Similarly, times of day without specifying a reference timezone... 
cheers, Dan -- http://danbri.org/ From glenn.jones at madgex.com Mon Jun 30 03:47:51 2008 From: glenn.jones at madgex.com (Glenn Jones) Date: Mon Jun 30 03:47:57 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <004801c8da99$cb631960$116bacca@COMCEN> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local><36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> <004801c8da99$cb631960$116bacca@COMCEN> Message-ID: <36A319113CF910438942741C4727ADFF02132C71@MOBY.Clarence.local> What I was suggesting is that the date in the title of the abbr be in a fixed format, but also human readable. The text of the abbr tag could be in any format the author wanted. If a date in the title did not comply with the fixed format in any way, it would be completely rejected by the parser. The format I suggested has enough data to not be ambiguous. Date: 25 January 2008 Date: 25 January 2008 at 15:30 Date: 25 January 2008 at 15:30, Time zone +1:30 Under this pattern, if someone created the following Jan 25 08 Two weeks Monday they would be converted into 2008-01-25 If they got the format wrong in the title attribute Jan 25 08 It would be rejected. We would have to internationalise the scheme so Jan 25 08 Would also parse correctly What I am suggesting is exchanging the title attribute from ISO format to a human readable format, not freeform text. Glenn
From gulopine at gamemusic.org Mon Jun 30 07:28:27 2008 From: gulopine at gamemusic.org (Marty Alchin) Date: Mon Jun 30 07:28:30 2008 Subject: [uf-dev] A sensible alternative for representing dates Message-ID: <7e8d40920806300728k3456aaa1ubd86b8ffae7569d5@mail.gmail.com> This is my first foray into the microformats community, so I apologize if I'm missing some necessary past history on this topic. I'm sure it's been discussed before, I know it's being discussed now, and I'd just like to add another option to the discussion. Also, yes I realize that by using the word "sensible" in the subject of this email, I'm introducing a likelihood of wild tangents regarding the subjectivity of such a word. I'll just try to stem it by saying that yes, I do realize it's subjective, and it's my opinion that what I'm proposing is sensible. Enough said. Since the BBC announcement, I keep seeing discussions about how to make the abbr's title attribute more accessible, and I keep wondering, why are we so stuck on using abbr at all? I read the justification for it, and it makes sense, but it's hardly the only way to go, so I'd like to take a different approach and see what you make of it. Many sites I've seen include daily archives, whether they be of events, blog posts, new links, whatever. Pages including lists of such events, or just the detail of a single event, will usually link to that daily archive.
The key is that that URL for the daily archive is typically in just one of the two following formats: * /2008/06/30/ * /2008/jun/30/ Call me crazy, but that looks as much like a machine-readable date format as any I've ever seen. Better yet, the first form is completely internationalized already, so it doesn't rely on NLP or anything. The second form is also common, though, so it seems like an allowable alternative (but maybe that's just because I use that form myself). Links of this form could either use a class, as is currently done for the abbr tag, or use something like rel="date", since it's fairly similar to the rel-tag format. Also like rel-tag, it would look at the *end* of the URL only. If someone had a link like /weblog/2008/jun/30/, that would work just the same as /corporate/public/events/2008/06/30/. If microformat dates used URLs in links, rather than the titles of abbreviations, the data would be just as visible as the rel-tag pattern, wouldn't have "machine data" presented to users as standard content, allows for flexible human-readable presentation (since it could just be ignored), and has the added benefit of encouraging daily archives on those sites that might not currently implement them. (And yes, I realize that the "benefit" of daily archives is debatable. Please don't bother, since that debate is irrelevant to this discussion.) Of course, that still leaves the issue of time, but times are much easier to parse automatically than dates, as data formats are far fewer and much more recognizable. This is the one area of the link's content that I'd suggest to be parsed, so that times become part of the link. Essentially, there are two dominant formats for time: * 13:23 * 1:23 pm Given the need for internationalization, I'd suggest that a 24-hour time be assumed, if no suffix is given. The "pm" we use in English is common, but isn't necessarily so throughout the world, so it should be an available alternative, but not the assumed standard. 
It should be acceptable with or without periods, and perhaps it could even look only at the first letter, so even just "a" or "p" could be allowable. Of course, that means that anyone who publishes a 12-hour time without a suffix will cause all of their events between noon and midnight to be misinterpreted, but that could happen even without microformats. In summary, I'd like to offer a couple of options for displaying dates, though I'm not sure which one is "best". Feel free to discuss, comment on, improve, or outright deny them. Consider some possible representations of the following: June 30, 2008 at 1:00 pm. * June 30, 2008 at 1:00 pm * 13:00 To add a bit more flexibility, I think it might be better still to include a new rel-date pattern, which would specify the date as a URL, as well as a class which defines the entire date/time combination. Such an approach would allow for a more usable markup structure, such as: June 30, from 1:00p to 2:00p If nested formats aren't allowed, I'll concede that, but I think there's value in allowing the time to be separate from the link, since the destination won't be tied to a particular time, but rather just the day. Also, note that in the event that a dtend doesn't specify a date, it should be assumed to be the same as dtstart, which must always specify a date. Also, before I get accused of not thinking about it, I don't know yet how best to deal with time zones. On one hand, I think they could be parsed as part of the time format above, but I don't know if it's any more accessible to include "-0500" in the content, or if the names or abbreviations of time zones are standardized enough (especially internationally) to work well. It's also possible to store just the time zone as an offset in some meta tag on the page itself, providing a hint for microformat parsers on how to process times within that page. I have no suggestion on that issue, but I do acknowledge that it's left undefined at the moment.
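A minimal sketch of the two parsing rules described above, offered only as an illustration and not as anything the list has agreed on. It assumes the date is taken from the end of the link's URL path (numeric or three-letter English month), and that a time with no suffix is read as 24-hour while an "a"/"p" suffix (with or without "m" and periods) marks a 12-hour time. The function names date_from_url and time_from_text are hypothetical, and time zones are left out, as they are in the proposal.

```python
import re
from datetime import date

# Three-letter English month names, as in the /2008/jun/30/ URL form.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Like rel-tag, only the *end* of the URL path is examined.
_DATE_AT_END = re.compile(r"/(\d{4})/(\d{2}|[a-z]{3})/(\d{2})/?$", re.IGNORECASE)

# HH:MM with an optional a/p suffix, e.g. "13:23", "1:23 pm", "1:00p", "1:00 p.m."
_TIME = re.compile(r"^\s*(\d{1,2}):(\d{2})\s*(?:([ap])\.?(?:m\.?)?)?\s*$", re.IGNORECASE)


def date_from_url(url):
    """Return an ISO date taken from the end of the URL path, or None."""
    match = _DATE_AT_END.search(url)
    if match is None:
        return None
    year, month, day = match.groups()
    if month.isdigit():
        month_number = int(month)
    elif month.lower() in MONTHS:
        month_number = MONTHS.index(month.lower()) + 1
    else:
        return None
    try:
        return date(int(year), month_number, int(day)).isoformat()
    except ValueError:  # e.g. month 13 or day 32
        return None


def time_from_text(text):
    """Return HH:MM in 24-hour form, or None if the text is not a time."""
    match = _TIME.match(text)
    if match is None:
        return None
    hour, minute, suffix = int(match.group(1)), int(match.group(2)), match.group(3)
    if minute > 59:
        return None
    if suffix is None:
        # No suffix: assume a 24-hour time.
        if hour > 23:
            return None
    else:
        # Suffix present: 12-hour time, where 12 am is midnight and 12 pm is noon.
        if not 1 <= hour <= 12:
            return None
        hour = hour % 12 + (12 if suffix.lower() == "p" else 0)
    return "%02d:%02d" % (hour, minute)
```

With these assumptions, /weblog/2008/jun/30/ and /corporate/public/events/2008/06/30/ both give the same date, and "1:00p" and "13:00" both give the same time. A 12-hour time published without a suffix is still read as a morning time, which is the limitation already noted above.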
Suggestions are welcome. -Gul From guillaume at lebleu.org Mon Jun 30 07:37:37 2008 From: guillaume at lebleu.org (Guillaume Lebleu) Date: Mon Jun 30 07:37:41 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <36A319113CF910438942741C4727ADFF02132C71@MOBY.Clarence.local> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local><36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> <004801c8da99$cb631960$116bacca@COMCEN> <36A319113CF910438942741C4727ADFF02132C71@MOBY.Clarence.local> Message-ID: <4868EFB1.7010807@lebleu.org> Glenn Jones wrote: > What I was suggesting is that the date in the title of abbr be in a > fixed format, but also human readable. The text of the abbr tag could > any format the author wanted. > > If a date in the did not comply to the fix format in any way it would be > completely rejected by the parser > > The format I suggested, has enough data to not be ambiguous. > Date: 25 January 2008 > Date: 25 January 2008 at 15:30 > Date: 25 January 2008 at 15:30, Time zone +1:30 That looks to me like a possible good compromise. A couple of questions: * What is the purpose of "Date:"? Couldn't this be moved to the class attribute, or in the hCalendar context be inferred from class="dtstart"? * What do you think of my earlier suggestion to base the human and machine-readable format on official writing practices in each locale (ex. in en-us: "January 25, 2008")? * What do you think of the idea of making title optional if the date/time is already written in official writing practices in each locale?
Guillaume From jaffathecake at gmail.com Mon Jun 30 09:29:57 2008 From: jaffathecake at gmail.com (Jake Archibald) Date: Mon Jun 30 09:30:00 2008 Subject: [uf-dev] A sensible alternative for representing dates In-Reply-To: <7e8d40920806300728k3456aaa1ubd86b8ffae7569d5@mail.gmail.com> References: <7e8d40920806300728k3456aaa1ubd86b8ffae7569d5@mail.gmail.com> Message-ID: <3be0bf100806300929y57b8bd5cy4c9dcfd9a64a663c@mail.gmail.com> 2008/6/30 Marty Alchin : > If microformat dates used URLs in links, rather than the titles of > abbreviations, the data would be just as visible as the rel-tag > pattern, wouldn't have "machine data" presented to users as standard > content, allows for flexible human-readable presentation (since it > could just be ignored) > * June 30, 2008 at 1:00 pm > * 13:00 > The problem with this is it requires an anchor, and requires the author to build a meaningful page at that address. Jake. From glenn.jones at madgex.com Mon Jun 30 21:59:13 2008 From: glenn.jones at madgex.com (Glenn Jones) Date: Mon Jun 30 21:59:21 2008 Subject: [uf-dev] Human and machine readable data format In-Reply-To: <4868EFB1.7010807@lebleu.org> References: <36A319113CF910438942741C4727ADFF02132AFF@MOBY.Clarence.local><36A319113CF910438942741C4727ADFF02132BD8@MOBY.Clarence.local> <004801c8da99$cb631960$116bacca@COMCEN><36A319113CF910438942741C4727ADFF02132C71@MOBY.Clarence.local> <4868EFB1.7010807@lebleu.org> Message-ID: <36A319113CF910438942741C4727ADFF02132EAD@MOBY.Clarence.local> Guillaume Lebleu wrote > * What is the purpose of "Date:". Couldn't this be moved to the > class attribute? or in the hCalendar context be inferred from > class="dstart"?
I added a prefix which describes the data type, in this case "Date:", to help parser developers test the format by using a string startsWith function. Whatever solution we come up with, ISO dates will most likely be kept for backwards compatibility. It may be possible to drop the prefix. The ISO duration is a hard format to test for without a prefix. > * What do you think of my earlier suggestion to base the human and > machine-readable on official writing practices in each locale (ex. > in en-us: "January 25, 2008") The language descriptions that I suggested are flexible enough to allow for language and culture/locale differences. I.e. we could use a British format "25 January 2008" { "language-name" : "English", "language-codes" : ["en-gb"], "pattern": "date,month,year,time,timezone", "scrub-terms": ["Date:", "at", ",", "Time zone"], "month-names": ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"] } I.e. we could use a US format "January 25, 2008" { "language-name" : "English", "language-codes" : ["en-us"], "pattern": "month,date,year,time,timezone", "scrub-terms": ["Date:", "at", ",", "Time zone"], "month-names": ["January", "February", "March", "April", "May", "June", "July", "August", "September", "October", "November", "December"] } The pattern property allows for a reordering of the elements. Working out the fall back to just language code "en" would be fun. > * What do you think of the idea of making title optional if the > date/time is already written in official writing practices in each > locale. That rule already exists as part of how the parsers work today. 2008-01-25 The above is valid, this would naturally be extended to any new format. Glenn
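To make the language-description idea above concrete, here is a minimal sketch of how a parser might apply one. It is not the ufXtract implementation. It assumes the description has the keys shown above, that the date, month and year are required while the trailing time and timezone fields are optional, and that anything which does not fit the pattern is rejected outright rather than guessed at. The function name parse_strict_date and the helper _scrub are hypothetical.

```python
import re
from datetime import date

MONTH_NAMES = ["January", "February", "March", "April", "May", "June", "July",
               "August", "September", "October", "November", "December"]

# Language descriptions modelled on the two examples in the message above.
EN_GB = {
    "language-codes": ["en-gb"],
    "pattern": "date,month,year,time,timezone",
    "scrub-terms": ["Date:", "at", ",", "Time zone"],
    "month-names": MONTH_NAMES,
}
EN_US = dict(EN_GB, **{"language-codes": ["en-us"],
                       "pattern": "month,date,year,time,timezone"})

# Shape each numeric field must have; month is checked against month-names.
FIELD_RE = {
    "date": r"\d{1,2}",
    "year": r"\d{4}",
    "time": r"\d{1,2}:\d{2}",
    "timezone": r"[+-]\d{1,2}:\d{2}",
}


def _scrub(text, terms):
    """Remove the scrub terms (longest first) and return the remaining tokens."""
    for term in sorted(terms, key=len, reverse=True):
        # Only match whole words, so "at" is not stripped out of another word.
        prefix = r"(?<!\w)" if term[0].isalnum() else ""
        suffix = r"(?!\w)" if term[-1].isalnum() else ""
        text = re.sub(prefix + re.escape(term) + suffix, " ", text)
    return text.split()


def parse_strict_date(text, description):
    """Return an ISO 8601 string, or None if the text does not fit the pattern."""
    tokens = _scrub(text, description["scrub-terms"])
    fields = description["pattern"].split(",")
    if not 3 <= len(tokens) <= len(fields):
        return None  # e.g. "25 January" has no year, so it is rejected
    values = dict(zip(fields, tokens))
    if not all(name in values for name in ("date", "month", "year")):
        return None
    months = [name.lower() for name in description["month-names"]]
    if values["month"].lower() not in months:
        return None
    for name, pattern in FIELD_RE.items():
        if name in values and not re.fullmatch(pattern, values[name]):
            return None
    try:
        day = date(int(values["year"]),
                   months.index(values["month"].lower()) + 1,
                   int(values["date"]))
    except ValueError:  # e.g. 31 February
        return None
    iso = day.isoformat()
    if "time" in values:
        hour, minute = values["time"].split(":")
        if int(hour) > 23 or int(minute) > 59:
            return None
        iso += "T%02d:%s" % (int(hour), minute)
    if "timezone" in values:
        sign = values["timezone"][0]
        hour, minute = values["timezone"][1:].split(":")
        iso += "%s%02d:%s" % (sign, int(hour), minute)
    return iso
```

Under these assumptions "25 January 2008" with the en-gb description and "January 25, 2008" with the en-us description both come out as 2008-01-25, while "Jan 25 08" and "25 January" are rejected. Falling back from "en-gb" to plain "en" is not handled here.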