value-class-pattern-issues
Value Excerption Pattern Issues
Open issues concerning the parsing of the value excerption pattern.
Open Issues
Excluded Fields
There seem to be some properties within which value excerpting is NOT allowed (or should not be allowed!) e.g. "type" in hCard. TobyInk 07:38, 22 May 2008 (PDT)
- You mean "type" as a sub-property of "tel"? That's one of the identified machine-data items that needs a means of including the publisher's choice text along with the microformat specified one. Not to conflate two separate issues, but just noting that separation of "type" text and "type value" needs to be handled somewhere, and value-excerption-pattern could be considered as part of the solution. BenWard 07:54, 22 May 2008 (PDT)
- Some fields make sense to exclude from this, as it seems unintuitive there, and excluding them can avoid many of the nested-microformat problems and may avoid the need for a messier mfo pattern. E.g. "entry-summary" and "entry-content" in hAtom: both could very feasibly have nested formats of any kind, but it doesn't strike me as useful to segregate them into "value" at all. BenWard 10:41, 6 Jun 2008 (PDT)
- Total other alternative: make value-excerption opt-in. Would need a bit of effort to go through all the specs and clarify, but actually might make more sense. It's a useful pattern for some properties (especially those with data patterns). BenWard 10:41, 6 Jun 2008 (PDT)
White-space behaviour when concatenating value nodes.
Currently we specify that a single space character (U+0020) separates each concatenated value. The appropriateness of this varies between fields: telephone numbers should drop the white-space, whilst textual items should stay separated. At present this behaviour is left to parsers to work out case by case, but we need to document the exceptions and clarify how future spec properties should opt into no-whitespace behaviour.
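The difference between the two joining behaviours can be sketched as follows. This is only an illustration: the function name `join_values` and the `NO_WHITESPACE_PROPERTIES` set are hypothetical, and which properties belong in that set is exactly what this issue leaves undecided ("tel" is used here because it is the example given above).

```python
# Hypothetical registry of properties that opt into no-whitespace joining.
# The actual list is the open question; "tel" is only the example from the text.
NO_WHITESPACE_PROPERTIES = {"tel"}


def join_values(property_name, values):
    """Concatenate the excerpted value strings for one property."""
    # Trim each fragment and drop fragments that are empty after trimming.
    cleaned = [v.strip() for v in values if v.strip()]
    # Default rule: a single U+0020 between values; opted-in properties use none.
    separator = "" if property_name in NO_WHITESPACE_PROPERTIES else " "
    return separator.join(cleaned)


print(join_values("tel", ["+44", "1234", "567890"]))  # +441234567890
print(join_values("fn", ["Jane", "Doe"]))             # Jane Doe
```

Keeping the exceptions in one table, rather than scattered through parser code, is one way a spec could let future properties opt in.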
Depth of Parsing
Currently any descendant is parsed, which causes issues if a microformat field using the value-excerption-pattern is nested within another.
- e.g. an hCalendar "vevent" nested inside hAtom "entry-content" must not result in "entry-content" parsing as "20080627T12:34:00+100".
- e.g. hCalendar defines "organizer", which may be an hCard, which may have a "tel" property containing a sub-property "value". Under these parsing rules, the entire "organizer" field would be parsed as the telephone number.
- Cognition copes with this OK -- the organizer is parsed as a full contact with an hCard - not just a number. TobyInk 07:38, 22 May 2008 (PDT)
Possible resolutions:
- Specify that the "mfo" (‘microformat object’) class be used when nesting microformats, as a processing instruction to parsers not to parse unrelated nested items.
- Specify that "value" must only be read from children, not from all descendants. Restrictive, but workable.
- Specify the above (parse children, not all descendants), but allow individual properties (such as "tel") to override and parse all descendants.
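The children-only rule with a per-property override (the third resolution above) might look roughly like this. It is a sketch under assumptions, not a proposed algorithm: `DEEP_PROPERTIES`, `excerpt` and the sample markup are hypothetical, the input is assumed to be well-formed enough for Python's `xml.etree.ElementTree`, and falling back to the element's full text when no "value" is found is an assumption made for the demonstration.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: properties allowed to search all descendants.
DEEP_PROPERTIES = {"tel"}


def has_class(element, name):
    """True if `name` is one of the element's space-separated classes."""
    return name in element.get("class", "").split()


def text_of(element):
    """All text inside the element, with outer white-space trimmed."""
    return "".join(element.itertext()).strip()


def excerpt(property_element, property_name):
    """Return the excerpted value, or the full text if no value is found."""
    if property_name in DEEP_PROPERTIES:
        # iter() also yields the element itself, so skip it.
        candidates = [e for e in property_element.iter() if e is not property_element]
    else:
        candidates = list(property_element)  # direct children only
    values = [e for e in candidates if has_class(e, "value")]
    if not values:
        return text_of(property_element)
    return " ".join(text_of(v) for v in values)


organizer = ET.fromstring(
    '<div class="organizer vcard">'
    '<span class="fn">Jane Doe</span> '
    '<span class="tel"><span class="type">work</span> '
    '<span class="value">555 0100</span></span>'
    '</div>'
)
tel = organizer.find("span[@class='tel']")

print(excerpt(organizer, "organizer"))  # Jane Doe work 555 0100
print(excerpt(tel, "tel"))              # 555 0100
```

With children-only parsing the "organizer" is no longer reduced to the telephone number, while "tel" still finds its nested "value". Adding "organizer" to the hypothetical `DEEP_PROPERTIES` set reproduces the problem described above.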
Parsing title from Empty value Elements
As a solution to the invisible data requirements sometimes presented by machine-data in microformats, a parsing rule is proposed: where the "value" element is empty (contains no non-whitespace characters), the "title" attribute is parsed instead.
e.g. <span class="dtstart">Tuesday the 24th at 6pm <span class="value" title="20080624T180000+1000"></span></span>
Possible resolutions:
- This is parsable, should be specced.
- Suggest restricting to instances where a single "value" element exists, i.e. disallow concatenation of multiple embedded values, and disallow embedded values from being appended to visible data. This pattern exists to solve the machine data problem, and restricting it more will discourage it being used for hiding other, useful data.
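A minimal sketch of the proposed rule, including the suggested single-"value" restriction, is shown below. The function name `excerpt_with_title_fallback` is hypothetical, "value" is read from direct children only (an assumption), and the markup is assumed to be well-formed enough for `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the element's space-separated classes."""
    return name in element.get("class", "").split()


def text_of(element):
    """All text inside the element, with outer white-space trimmed."""
    return "".join(element.itertext()).strip()


def excerpt_with_title_fallback(property_element):
    """Excerpt a property value, reading `title` from a lone empty value."""
    values = [e for e in property_element if has_class(e, "value")]
    # Restricted form of the rule: only a single value element qualifies.
    if len(values) == 1:
        only = values[0]
        if not text_of(only) and only.get("title"):
            return only.get("title")
    if values:
        # Ordinary excerption: join the non-empty value texts with a space.
        return " ".join(t for t in (text_of(v) for v in values) if t)
    return text_of(property_element)


dtstart = ET.fromstring(
    '<span class="dtstart">Tuesday the 24th at 6pm '
    '<span class="value" title="20080624T180000+1000"></span></span>'
)
print(excerpt_with_title_fallback(dtstart))  # 20080624T180000+1000
```

Because the fallback only fires for a lone, empty "value" element, a "title" on a "value" that has visible text is ignored, which matches the intent of not appending embedded data to visible data.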
Closed Issues
Nested value
Should <span class="value">Foo <span class="value">Bar</span></span> parse as "Foo Bar" or "Bar"? Should "value" elements be allowed to be nested within "value" elements?
Resolution: Disallowed. Deemed complex to parse, and unnecessary when publishing.
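Since nesting is disallowed, a publisher-side check could flag it. The sketch below only detects the pattern; the function name `has_nested_value` is hypothetical, and what a parser should do on encountering nested "value" elements is not stated in the resolution, so nothing is assumed about that here.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    """True if `name` is one of the element's space-separated classes."""
    return name in element.get("class", "").split()


def has_nested_value(element, inside_value=False):
    """True if any "value" element sits inside another "value" element."""
    is_value = has_class(element, "value")
    if is_value and inside_value:
        return True
    # Recurse, remembering whether an ancestor (or this element) is a value.
    return any(
        has_nested_value(child, inside_value or is_value) for child in element
    )


nested = ET.fromstring(
    '<span class="value">Foo <span class="value">Bar</span></span>'
)
flat = ET.fromstring(
    '<span class="tel"><span class="value">555</span> '
    '<span class="value">0100</span></span>'
)
print(has_nested_value(nested))  # True
print(has_nested_value(flat))    # False
```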