[uf-discuss] Appeal for Issues: Empty spans in
value-excerption-pattern
Ben Ward
lists at ben-ward.co.uk
Thu Nov 6 01:53:23 PST 2008
Hi everyone.
So, a few months ago I was working on the ongoing value-excerption-
pattern specification. Then I moved to San Francisco and my work went
a little stagnant, but I'm trying to pick it up again.
The value-excerption-pattern is an attempt to fully spec the
class="value" behaviour from "tel" in hCard, which has since been
supported globally in some parsers for a while, and has proved
somewhat useful. In addition to fully spec'ing the behaviour for
parsing class="value" elements for visible data, I've been working on
additional specification to handle inclusion of machine-centric data
alongside human forms (http://microformats.org/wiki/machine-data).
It's this machine-centric portion that I'm trying to nail down at the
moment, since it would provide an in-demand solution for various
recurring complaints (abbr-pattern dependencies, for example).
Also, note that recent brainstorming regarding patterns derived from
the semantics of the <object> element and value excerption has shown
that current, in-use browsers (Microsoft Internet Explorer and
Apple's Safari 2) do not handle object acceptably for inline content
(http://microformats.org/wiki/value-excerption-pattern-brainstorming#object_param_handling).
So we're definitely stuck with needing to spec this pattern using
generic mark-up.
Since it's been a while, this mail serves to summarise the current
state of this spec and proposed resolutions to open issues. PLEASE, if
you have additional issues to raise, add them to the wiki page
(http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements).
Couple of Examples:
----------------------------
<span class="dtstart"><span class="value"
title="2008-08-27T23:25:00-0700"></span> 11:25pm, August 27th 2008</span>
<p class="tel">
<span class="type"><span class="value" title="cell"></span>
Mobile</span>
<span class="value">415-123-4567</span>
</p>
Purpose
-----------
This pattern allows you to embed fixed format content — such as the
telephone type enumeration and parser-required data formats —
alongside the visible format of the publisher's choice.
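To make the intent concrete, here is a rough sketch of how a consumer might read both forms out of the two examples above. It is not a reference parser: the function name read_property is made up for illustration, it assumes each property arrives as a small well-formed fragment (so the standard library XML parser is enough), it treats a whitespace-only span as empty, and it does not enforce any rule about where the empty span sits.

```python
import xml.etree.ElementTree as ET


def read_property(fragment):
    """Return (machine_form, human_form) for one property element.

    Hypothetical helper: assumes `fragment` is well-formed markup.
    """
    prop = ET.fromstring(fragment)
    machine = None
    for child in prop:
        is_value = "value" in child.get("class", "").split()
        # Assumption: a span with only whitespace inside counts as empty.
        is_empty = len(child) == 0 and not (child.text or "").strip()
        if is_value and is_empty and child.get("title") is not None:
            machine = child.get("title")
            break
    # The human form is whatever visible text the publisher chose.
    human = "".join(prop.itertext()).strip()
    return machine, human


dtstart = ('<span class="dtstart"><span class="value" '
           'title="2008-08-27T23:25:00-0700"></span>'
           ' 11:25pm, August 27th 2008</span>')
tel_type = ('<span class="type"><span class="value" title="cell"></span>'
            ' Mobile</span>')

print(read_property(dtstart))
print(read_property(tel_type))
```

The machine form comes only from the title attribute, and the visible text is left exactly as the publisher wrote it.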
Responses to Issues so Far
--------------------------------------
1. DRY Violation worse than current ABBR-pattern. DRY is a problem
when data is repeated in a document and risks one copy of the data not
being maintained in sync with another. Maintenance of the document
results in broken data.
Resolution: To address this, the empty-span part of the value
excerption pattern will specify that the empty-span MUST be the first,
non-whitespace-text-node child of the property element. Thus, this
will parse:
<span class="dtstart"><span class="value" title="2008-11-04"></span>4th November</span>
But this will fail:
<span class="dtstart">On 4th November 2008 Barack Obama was elected
the first African American president of the United States of America.
He was really pleased about it. <span class="value"
title="2008-11-04"></span> </span>
The first pattern keeps the code distance small between the data form
(class=value) and the property name (class=dtstart). It disallows the
machine-data portion from being separated from the property.
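The position rule can be sketched roughly as follows. This is only an illustration of the proposed resolution, not spec text: the function name leading_machine_value is hypothetical, the input is assumed to be a well-formed fragment, and counting a whitespace-only span as empty is my own assumption.

```python
import xml.etree.ElementTree as ET


def leading_machine_value(fragment):
    """Return the title of a leading empty class="value" span, else None.

    Hypothetical helper sketching the proposed "first child" rule.
    """
    prop = ET.fromstring(fragment)
    # Non-whitespace text before the first child element breaks the rule.
    if prop.text and prop.text.strip():
        return None
    children = list(prop)
    if not children:
        return None
    first = children[0]
    is_value = "value" in first.get("class", "").split()
    # Assumption: a span with only whitespace inside counts as empty.
    is_empty = len(first) == 0 and not (first.text or "").strip()
    if is_value and is_empty:
        return first.get("title")
    return None


passes = ('<span class="dtstart"><span class="value" '
          'title="2008-11-04"></span>4th November</span>')
fails = ('<span class="dtstart">On 4th November 2008 something happened. '
         '<span class="value" title="2008-11-04"></span> </span>')

print(leading_machine_value(passes))
print(leading_machine_value(fails))
```

Only leading whitespace is tolerated before the empty span, so the machine form can never drift away from the start of the property it belongs to.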
Furthermore, the spec should encourage conformance checking tools to
attempt to verify the machine-data form against the human form and
warn the user if the data does not match.
2. Violating the principle of visible data
Resolution: Microformats maintain a principle of marking up visible
data. However, we have exceptional circumstances where the data
required for parsing is not the data that publishers wish to display.
Whilst parsers are a lower priority than publishers, the cost and
complexity of parsing unstructured dates, or translated terms, is
accepted as too high. Therefore it is necessary to violate DRY to
include explicit representations for machines.
Currently authors may use CSS to hide the machine-form of dates.
Microformats exist only in the HTML layer, and must not depend on CSS
to meet publisher requirements.
The specification may also restrict this part of the pattern to
certain properties where a machine-data form is required, as a means
to discourage abuse.
3. Broken parsers drop empty elements
There are some broken but widespread HTML parsers which discard empty
elements, resulting in the empty-span-value element being removed from
documents (e.g. HTML Tidy). HTML Tidy is easily patched not to do this,
but unpatched versions may already be built into publishing platforms.
Resolution: Without numbers, we don't know how many publishing systems
would be affected by this. It's a problem for which the only
resolution is to use a completely different pattern. As such, this
proposal must put legacy broken parsers down as an accepted loss.
CMSs locked to old versions of HTML Tidy would not be able to use
this pattern without modification.
So, there aren't many issues against this part of the pattern, and the
rules for it are coming together. There's likely some feeling about
matters of taste as to how to achieve this function. This is my
favoured version, but a lot of the issues resolved here would apply
equally to other patterns too, so I'd appreciate further input to see
if this pattern can be thoroughly specified.
Please, if you have problems to raise with this proposal, add them to
the -issues page on the wiki at:
http://microformats.org/wiki/value-excerption-pattern-issues#Parsing_title_from_Empty_value_Elements
Thank you,
Ben