Value Excerption Pattern Issues
Open issues concerning the parsing of the value excerption pattern.
These issues are awaiting resolution and reflection in the specification, but may not be blockers on the implementation of the specification.
There seem to be some properties within which value excerpting is NOT allowed (or should not be allowed!) e.g. "type" in hCard. TobyInk 07:38, 22 May 2008 (PDT)
- You mean type as a sub-property of tel? That's one of the identified Machine Data in Microformats items that needs a means of including the publisher's choice text along with the microformat specified one. Not to conflate two separate issues, but just noting that separation of type value needs to be handled somewhere, and value-excerption-pattern could be considered as part of the solution. BenWard 07:54, 22 May 2008 (PDT)
- Some fields make sense to exclude this, as it seems unintuitive, and excluding them can be used to avoid many of the nested-microformat problems, which may avoid a messier mfo pattern. E.g. entry-content in hAtom 0.1, both could very feasibly have nested formats of any kind, but it doesn't strike me as useful to segregate into "value" at all. BenWard 10:41, 6 Jun 2008 (PDT)
- Total other alternative, make value-excerption opt-in. Would need a bit of effort to go through all the specs and clarify, but actually might make more sense. It's a useful pattern for some properties (especially those with data patterns). BenWard 10:41, 6 Jun 2008 (PDT)
- Setting the rules over depth-of-parsing (see below) to children-only would obviate the remaining need for this issue.
White-space behaviour when concatenating value nodes.
We specify that no characters get inserted between concatenated occurrences of ‘value’. Need to audit all properties to ensure that this behaviour would be correct in all cases.
Possibly specify that individual properties can override this behaviour, specifying a separator character. Possibly specify that this should be a provision of parsing implementations, so as to maintain flexibility for future publishing.
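The concatenation rule and the proposed separator override can be sketched in a few lines of Python. This is only an illustration, not a reference parser: the function names, the `separator` parameter, the whitespace trimming of each value, and the sample tel markup are all assumptions, and the markup is assumed to be well-formed enough to load with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in element.get("class", "").split()


def excerpt_value(prop, separator=""):
    """Join the text of every 'value' descendant of `prop`.

    The default separator is the empty string, matching the current rule that
    no characters are inserted between concatenated values. Trimming each
    value is an assumption of this sketch. When no 'value' element exists,
    the property's whole text is returned.
    """
    parts = [
        "".join(node.itertext()).strip()
        for node in prop.iter()
        if node is not prop and has_class(node, "value")
    ]
    if not parts:
        return "".join(prop.itertext()).strip()
    return separator.join(parts)


# Hypothetical markup: one tel property split across three value nodes.
tel = ET.fromstring(
    '<span class="tel"><span class="type">home</span> '
    '<span class="value">+1</span> <span class="value">555</span> '
    '<span class="value">0100</span></span>'
)

print(excerpt_value(tel))       # +15550100
print(excerpt_value(tel, " "))  # +1 555 0100
```

Whether a separator such as a space is ever the right answer is exactly what the property audit would need to establish; the sketch only shows that an override is cheap to support in a parser.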
Depth of Parsing
Currently any descendant is parsed, which causes issues if a microformat field using the value-excerption-pattern is nested within another.
<code>
<div class="hentry vevent">
  <h1 class="entry-title summary">Party on Sunday!</h1>
  <div class="updated published">Tuesday <span class="value">2008-06-17</span></div>
  <p class="entry-content description">We're having a party on <span class="dtstart">Sunday, at 7pm! <span class="value">2008-06-22T19:00:00+0100</span></span>. Please bring your friends!</p>
</div>
</code>
In this example, hAtom and hCalendar are interleaved. The DTSTART property of the event is contained within the entry-content of the hAtom entry, using the value-excerption-pattern to include the machine-data datetime. However, with full descendant parsing, the hAtom model will come out as the following:
ENTRY
ENTRY-TITLE=Party on Sunday!
UPDATED=2008-06-17
PUBLISHED=2008-06-17
ENTRY-CONTENT=2008-06-22T19:00:00+0100
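The result above can be reproduced with a rough Python sketch of full descendant parsing. The helper names are hypothetical, the markup is loaded with `xml.etree.ElementTree` on the assumption that it is well-formed, and trimming whitespace from each value is an assumption of the sketch rather than something the specification states.

```python
import xml.etree.ElementTree as ET

MARKUP = """
<div class="hentry vevent">
  <h1 class="entry-title summary">Party on Sunday!</h1>
  <div class="updated published">Tuesday <span class="value">2008-06-17</span></div>
  <p class="entry-content description">We're having a party on
    <span class="dtstart">Sunday, at 7pm!
      <span class="value">2008-06-22T19:00:00+0100</span></span>.
    Please bring your friends!</p>
</div>
""".strip()


def has_class(element, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in element.get("class", "").split()


def find_property(root, name):
    """Return the first element carrying class `name`, or None."""
    for node in root.iter():
        if has_class(node, name):
            return node
    return None


def descendant_value(prop):
    """Value excerption that looks at every descendant of `prop`."""
    parts = [
        "".join(node.itertext()).strip()
        for node in prop.iter()
        if node is not prop and has_class(node, "value")
    ]
    # Concatenate with no separator; use the whole text if no value exists.
    return "".join(parts) if parts else "".join(prop.itertext()).strip()


root = ET.fromstring(MARKUP)
for name in ("entry-title", "updated", "published", "entry-content"):
    print(name.upper() + "=" + descendant_value(find_property(root, name)))
```

The last line printed is `ENTRY-CONTENT=2008-06-22T19:00:00+0100`: the value belonging to the nested dtstart is picked up as the entry content, and the surrounding prose is lost.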
- e.g. an hCalendar 1.0 vevent nested inside hAtom 0.1 entry-content must not result in
- e.g. hCalendar 1.0 defines organizer, which may be an hCard 1.0, which may have a tel property containing a sub-property value. Under these parsing rules, the entire organizer field would be parsed as the telephone number.
- Cognition copes with this OK -- the organizer is parsed as a full contact with an hCard - not just a number. TobyInk 07:38, 22 May 2008 (PDT)
- Specify the mfo (‘microformat object’) class be used when nesting microformats, as a processing instruction to parsers not to parse unrelated nested items
- Specify that value must only be read from children, not from all descendants. Restrictive, unlikely to work for existing hCard TEL usage.
- Specify the above (parse children, not all descendants), but allow individual properties (such as tel) to override and parse all descendants. This would result in a parse-depth flag on all fields, and many getting overridden for all descendants, but again, seems to be a well structured solution. Property name dictionaries in parsers would have to include the depth flag with the property.
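The last option, a parse-depth flag stored with each property name, might look roughly like the following in a parser. Everything here is a hypothetical sketch: the `PARSE_DEPTH` dictionary, the flag names, the choice of children-only as the default, and the sample tel markup are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical property dictionary: each property carries a parse-depth flag.
CHILDREN, DESCENDANTS = "children", "descendants"
PARSE_DEPTH = {"entry-content": CHILDREN, "dtstart": CHILDREN, "tel": DESCENDANTS}


def has_class(element, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in element.get("class", "").split()


def text_of(element):
    """All inner text, with runs of whitespace collapsed to single spaces."""
    return " ".join("".join(element.itertext()).split())


def excerpt_value(prop, name):
    """Read 'value' nodes at the depth configured for property `name`."""
    if PARSE_DEPTH.get(name, CHILDREN) == DESCENDANTS:
        candidates = [node for node in prop.iter() if node is not prop]
    else:
        candidates = list(prop)  # direct children only
    parts = [text_of(node) for node in candidates if has_class(node, "value")]
    # No separator between concatenated values; whole text when none found.
    return "".join(parts) if parts else text_of(prop)


entry_content = ET.fromstring("""
<p class="entry-content description">We're having a party on
  <span class="dtstart">Sunday, at 7pm!
    <span class="value">2008-06-22T19:00:00+0100</span></span>.
  Please bring your friends!</p>
""".strip())
dtstart = entry_content[0]
tel = ET.fromstring(
    '<div class="tel"><dl><dt class="type">Home</dt>'
    '<dd class="value">+1 555 0100</dd></dl></div>'
)

print(excerpt_value(entry_content, "entry-content"))  # full prose, not the datetime
print(excerpt_value(dtstart, "dtstart"))  # 2008-06-22T19:00:00+0100
print(excerpt_value(tel, "tel"))          # +1 555 0100
```

With children-only as the default, entry-content keeps its full text because the value is a grandchild, while tel still finds a value buried inside other markup because its flag is overridden.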
title from Empty
As a solution to the invisible data requirements sometimes presented by Machine Data in Microformats, a parsing rule is proposed: where the value element is empty (contains no non-whitespace characters), the title attribute is parsed instead.
<span class="dtstart">Tuesday the 24th at 6pm <span class="value" title="20080624T180000+1000"></span></span>
- This is parsable, should be specced.
- Suggest restricting to instances where a single value element exists, e.g. disallow concatenation of multiple embedded values, and disallow embedded values from being appended to visible data. This pattern exists to solve the machine data problem, and restricting it more will discourage it being used for hiding other, useful data.
- Standard builds of HTMLTidy drop empty elements, which is unfortunate. However, it is trivial to compile tidy with a patch to not drop empty elements which have class attributes (See tidy-microformats.zip). The linked file contains an intel binary, and a diff for patching against the HTML Tidy source.
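The proposed rule, together with the suggested single-value restriction, could be sketched as below. This is an assumption-laden illustration rather than specified behaviour: the function names are hypothetical, the fallback paths for non-empty or multiple values follow the concatenation rule discussed earlier on this page, and the markup is assumed to load with `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the element's space-separated class names."""
    return name in element.get("class", "").split()


def excerpt_value(prop):
    """Value excerption with the proposed title fallback for an empty value."""
    values = [
        node for node in prop.iter()
        if node is not prop and has_class(node, "value")
    ]
    # Suggested restriction: only a lone value element may use its title.
    if len(values) == 1:
        only = values[0]
        is_empty = not "".join(only.itertext()).strip()
        if is_empty and only.get("title"):
            return only.get("title")
    if values:
        return "".join("".join(v.itertext()).strip() for v in values)
    return "".join(prop.itertext()).strip()


dtstart = ET.fromstring(
    '<span class="dtstart">Tuesday the 24th at 6pm '
    '<span class="value" title="20080624T180000+1000"></span></span>'
)
print(excerpt_value(dtstart))  # 20080624T180000+1000
```

Because the title is only consulted when exactly one value element exists and it is empty, an empty value cannot be used to append hidden data to visible values.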
These issues are closed, and either dismissed with reason, or the specification has been updated in resolution.
<span class="value">Foo <span class="value">Bar</span></span> parse as
foo bar or
value elements be allowed to be nested within
Resolution: Disallowed. Deemed complex to parse, and unnecessary when publishing.