Value Excerption Pattern: Parsing 'value' from an empty element

From Microformats Wiki


This page was part of the development of the value-class-pattern and is here for historical purposes.

Please see the value-class-pattern page instead.

This page is targeted at those already experienced with microformats.

Please carefully note, this page is about a pre-draft, experimental and unfinished microformats proposal. You cannot use this pattern on your live pages: it is not supported by any stable parser, and you should not assume that this pattern will be finalized as-is! We're just asking for help in testing this thoroughly. Thank you.

This is a special page to introduce, and gather results from, widespread testing of a proposed extension to the value-excerption pattern. See value excerption pattern brainstorming: value-title for the specific proposal.

This pattern can be used to resolve some long-standing issues with including machine-data in microformats; it's imperative we test thoroughly before adding it to any pattern specification. Following are a number of example tests. Please try them out.

The pattern we're testing looks a little something like this. Those experienced with microformats should immediately see what we're trying to do:

<p class='tel'>
    <span class='type'>
        <span class='value-title' title='cell'></span>
        mobile
    </span>
    <span class='value'>+44 7773 000 000</span>
</p>
<p class='dtstart'>
    <span class='value-title' title='2009-01-06T22:54:00-0800'></span>
    January 6th, in the evening
</p>

It allows you to include machine-form data alongside the human form, without polluting visible formatted content with undesired machine form data.

This covers cases where a microformat uses a fixed format of data that is either inappropriate for visible inclusion in a page (such as a full date-time and timezone string), or where an American-English keyword is needed — such as cell instead of ‘mobile’ in a British English page, or any number of non-English translations.

This pattern is based on rendering behavior in browsers whereby an empty element — that is one containing no text-nodes or other child elements — remains in the DOM tree (for parsing) but is not rendered visibly to a page. This allows an element to be included in the document with a title attribute (as in the example), but without a tooltip being exposed to users, and without the data being read out by screen readers.

You can use value-title on non-empty elements as well; whatever makes most sense to your publishing scenario. This page is dedicated to the empty-element version though, since that offers up the consumption unknowns.

Based on everything we know up to this point, we believe this pattern will work. But it's wide ranging and the web is broad, and we want to be sure. Please help us out by testing this pattern proposal. Example tests are below; please push them or your own variants into publishing systems, content management systems, editor applications and tools. Check that the content comes out the other side with the data intact, and exposed (or hidden) as expected: render it in desktop browsers, mobile browsers, screen readers, in braille… anything you can test, we want to know about! We need to see any quirks, oddities and so on.

Also, by all means provide thoughts on the publishing flow for this. An empty element is an uncommon structure outside of forms and scripts, but the reasoning is as follows: ‘Machine formatted data’ is not metadata, it is content. Therefore, it's structurally appropriate to have it as a sibling to the human-formatted content.

Note that valid HTML is a cornerstone of microformats. Inventing new attributes, depending on unstable drafts of HTML5, and using non-standard DOCTYPEs or XML extensions are not applicable options. We're trying to achieve something as gracefully as we can within the limitations of HTML4, and without harming user experience.

The proposed parsing rules

The current, likely incomplete, parsing rules and restrictions for this pattern are as follows:

  • Only one value-title element may be included as a child of a property. No splitting or concatenation, no combining with other value-excerption elements.
  • An empty value-title element must be the first-child of the property (not including any preceding whitespace). To alleviate the negative impact of non-visible data, the value should be as near as possible to declaring the property.
  • The machine-data value must represent the same data as the visible text; the parent property must not contain arbitrary data. Validator tools will be encouraged to verify this where possible (for example, some programming languages have access to powerful date parsing algorithms that can compare human dates to the ISO form).
  • The empty element can be any element, but a generic span is most appropriate. You could use b if you want to save bytes, or an input type=hidden if it makes sense to you. That choice will not matter to parsers. You are in complete control of that publishing decision. As per usual µf documentation, span will be used for generic examples.
  • The value-title element does not have to be empty. If you do want a tool-tip to expose a useful data-form, you can. e.g. <span class='value-title' title='2008'>last year</span> is valid too.
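
The rules above can be sketched as a small parser. This is only an illustration under stated assumptions, not a reference implementation: it uses Python's standard html.parser module, the names Node, TreeBuilder, value_title and parse_property are hypothetical, and it checks only the structural rules (a single value-title child, first-child position for the empty form, non-empty form allowed). It does not attempt to verify that the machine data matches the visible text.

```python
from html.parser import HTMLParser

# Elements that never have an end tag in HTML (partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class Node:
    """Minimal element node; children are Node objects or text strings."""

    def __init__(self, tag, attrs, parent=None):
        self.tag = tag
        self.attrs = dict(attrs)
        self.children = []
        self.parent = parent

    def classes(self):
        return (self.attrs.get("class") or "").split()


class TreeBuilder(HTMLParser):
    """Builds a very small DOM-like tree (hypothetical helper, not a full DOM)."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root", [])
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax: record the element but do not descend into it.
        self.current.children.append(Node(tag, attrs, self.current))

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def walk(node):
    """Yield every descendant element of node, depth first."""
    for child in node.children:
        if isinstance(child, Node):
            yield child
            yield from walk(child)


def value_title(prop):
    """Return the machine value for a property element, or None if the rules fail."""
    # Ignore whitespace-only text when deciding which child comes first.
    significant = [c for c in prop.children if isinstance(c, Node) or c.strip()]
    candidates = [c for c in prop.children
                  if isinstance(c, Node) and "value-title" in c.classes()]
    if len(candidates) != 1:
        return None  # only one value-title child is allowed
    vt = candidates[0]
    if not vt.children and significant[0] is not vt:
        return None  # an empty value-title must be the first child
    return vt.attrs.get("title")


def parse_property(html, prop_class):
    """Return the value-title result for each element carrying prop_class."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return [value_title(n) for n in walk(builder.root) if prop_class in n.classes()]


SAMPLE = """<p class='dtstart'>
    <span class='value-title' title='2009-01-06T22:54:00-0800'></span>
    January 6th, in the evening
</p>"""
print(parse_property(SAMPLE, "dtstart"))  # ['2009-01-06T22:54:00-0800']
```

A property whose empty value-title is not the first child, or which has more than one value-title child, yields None in this sketch; whether a real parser should fall back to the visible text in that case is not specified on this page.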

Example Tests

The following snippets are example tests for the new pattern. You can use them as is, or use them as a base for your own tests with your own content. If you write your own tests, please document them under ‘additional test cases’ so that any failing tests can be checked for validity.

hAtom#1: An hAtom published/updated Property

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hAtom#1</title>
    </head>
    <body>
        <div class="hentry">
            <h1 class="entry-title">An introduction to Microformats</h1>
            <p>
                Published on <span class="published updated">
                <span class="value-title" title="2009-01-09T11:33:00-0800"></span>
                January 9th, around lunchtime</span>
                by <span class="author vcard">
                  <a class="url fn" href="http://example.com">
                      Joe Blogger</a></span>.
            </p>
            <p class="entry-content">Wow, microformats are really useful! You can
                learn loads about them on the 
                <a href="http://microformats.org/wiki">microformats wiki</a>.
            </p>
        </div>
    </body>
</html>

hCal#1: An hCalendar dtstart

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hCal#1</title>
    </head>
    <body>
        <div class="vevent">
            <h1 class="summary">Value Excerption Test Day!</h1>
            <p class="description">Come help <span class="organizer vcard">
                <a class="fn url org" href="http://microformats.org">microformats.org</a>
                </span> test a new value-excerption pattern for sanity and 
                robustness!
            </p>
            <p>Help out by running some tests at 
              <span class="dtstart">
                <span class="value-title" title="2009-01-12T12:00:00-0800"></span>
                midday on Monday January 12th</span>.
            </p>
            <p>See <a class="url" href="http://microformats.org/wiki/value-excerption-pattern-issues/empty-value-element-test">the
                wiki</a> for more details!
            </p>
        </div>
    </body>
</html>

hCard#1: An hCard bday

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hCard#1</title>
    </head>
    <body>
        <!-- Behind Test -->
        <div class="vcard">
            <h1 class="fn">Ben Ward</h1>
            <p>Ben Ward's birthday is 
                <span class="bday">
                    <span class="value-title" title="1984-02-09"></span>
                    February 9th
                </span>.
                You should throw him a party! Or call his <span class="tel">
                <span class="type"><span class="value-title" title="cell"></span>mobile</span>
                on <span class="value">415.123.123</span></span> to wish him well!
            </p>
        </div>
        <!-- End Test -->
    </body>
</html>

hAudio#1: An hAudio duration

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hAudio#1</title>
    </head>
    <body>
        <h1>Song of the year?</h1>
        <!-- Behind Test -->
        <p class="haudio">Did you hear ‘<span class="fn">Heavy Water</span>’ on
            <span class="contributor">Foals</span>’ <span class="album">Antidotes</span> record 
            <span class="published">
                <span class="value-title" title="2008"></span>
                last year
            </span>? It's
            <span class="duration">
                <span class="value-title" title="PT04M32S"></span>
                4 and a half minutes long
            </span>, you should make time to hear it!</p>
        <!-- End Test -->
    </body>
</html>

If you believe there is an error in any of these tests, or in any others that people contribute, please post on the mailing list.

Evil Tests

If you want to give existing microformat parsers a good run out, construct ‘evil’ tests using nesting, combination and interpolation of different microformats.

hAtom + hCalendar

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hAtomhCalEvil#1</title>
    </head>
    <body>
        <div class="hentry vevent">
            <h1 class="entry-title summary">An introduction to Microformats</h1>
            <p>
                Published on <span class="published updated">
                <span class="value-title" title="2009-01-09T11:33:00-0800"></span>
                January 11th, late afternoon</span>
                by <span class="author organizer vcard">
                  <a class="url fn" href="http://example.com">
                      Joe Blogger</a></span>.
            </p>
            <p class="entry-content description">
                <span class="dtstart">
                    <span class="value-title" title="2009-01-14T19:00:00"></span>
                    this coming Wednesday at 7
                </span> is not the date of a completely fictional microformats
                event. If it existed, it would promise to be informative and get
                you up to speed on microformats.org for 2009! Now you've 
                learned to work with microformats a little, why not attend and
                get involved! Why not? Because this event is a test case, not 
                for real.
            </p>
        </div>
    </body>
</html>

Second Phase Test

Following the first wave of example tests (above), we had a handful of failures in publishing tools caused by the requirement to use an empty span element. Some tools (including the widespread HTML-Tidy) drop the empty span, throwing away the data. We therefore have a second test that we'd like confirmed, please. This version includes a single whitespace (space) character inside the value-title span. With that change, the publishing tools that failed the first test pass. We now need to confirm that the single whitespace character also collapses when rendered (we believe it does). Here's a rewrite of the first hAtom test from above, with the whitespace variation:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
    "http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
    <head>
        <title>Value Excerption Pattern Test hAtom#1</title>
    </head>
    <body>
        <div class="hentry">
            <h1 class="entry-title">An introduction to Microformats</h1>
            <p>
                Published on <span class="published updated">
                <span class="value-title" title="2009-01-09T11:33:00-0800"> </span>
                January 9th, around lunchtime</span>
                by <span class="author vcard">
                  <a class="url fn" href="http://example.com">
                      Joe Blogger</a></span>.
            </p>
            <p class="entry-content">Wow, microformats are really useful! You can
                learn loads about them on the 
                <a href="http://microformats.org/wiki">microformats wiki</a>.
            </p>
        </div>
    </body>
</html>
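
A parser consuming this variant has to decide whether a value-title element that contains only whitespace still counts as empty. The following is a minimal sketch of that classification, assuming Python's standard html.parser module; the class ValueTitleContent and the function classify are hypothetical names, and treating whitespace-only content as a distinct category is an assumption made here for testing purposes, not a rule from the proposal.

```python
from html.parser import HTMLParser

# Elements that never have an end tag in HTML (partial list, enough for a sketch).
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ValueTitleContent(HTMLParser):
    """Records the title and inner content of each value-title element."""

    def __init__(self):
        super().__init__()
        self.found = []   # one dict per value-title element
        self._depth = 0   # greater than 0 while inside a value-title element

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if self._depth:
            # Any nested element makes the value-title non-empty.
            self.found[-1]["has_child"] = True
            if tag not in VOID_TAGS:
                self._depth += 1
        elif "value-title" in (attr.get("class") or "").split():
            self.found.append({"title": attr.get("title"),
                               "text": "", "has_child": False})
            if tag not in VOID_TAGS:
                self._depth = 1

    def handle_endtag(self, tag):
        if self._depth and tag not in VOID_TAGS:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.found[-1]["text"] += data


def classify(html):
    """Return (title, kind) pairs; kind is 'empty', 'whitespace-only' or 'non-empty'."""
    parser = ValueTitleContent()
    parser.feed(html)
    parser.close()
    results = []
    for item in parser.found:
        if item["has_child"] or item["text"].strip():
            kind = "non-empty"
        elif item["text"]:
            kind = "whitespace-only"
        else:
            kind = "empty"
        results.append((item["title"], kind))
    return results


print(classify('<span class="value-title" title="2009-01-09T11:33:00-0800"> </span>'))
# [('2009-01-09T11:33:00-0800', 'whitespace-only')]
```

Running the same function over the first-phase hAtom test reports 'empty' for the same element, which makes it easy to tell which variant a publishing tool actually emitted.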

The WYSIWYG editors can also publish the original pattern using the input type=hidden element rather than a span, but we're keen to avoid prescribing mark-up to any publisher (especially mark-up of ‘elaborate’ semantics).

A second column has been added to the results to confirm each item that also passes with the inclusion of whitespace. Failures should be documented in the same place, please.

This test is also hosted on Ben Ward's domain, so you can run it right in your browser by going to http://ben-ward.co.uk/microformats/value-excerption-pattern/hAtom2.html.

Verifying the tests

To verify a passing test, you need to check for the following:

In consumers (browsers, screen readers, etc.)

  • The empty element should appear in the page DOM
  • When hovering over and near the visible data, a tooltip displaying the machine-form must not be displayed.
    • You can doubly verify this by opening your browser's DOM Inspector and confirming that the value-title element has a width of 0 (or 0 px).
  • When rendered to speech using assistive technology, the machine-form data must not be read aloud.
    • Any variance in this behavior with different verbosity settings should be noted too, please.

In publishing tools

  • The tool allows you to create an empty span element
    • The tool allows you to add value-title to the class attribute of this empty element.
    • The tool allows you to add the corresponding data to the title attribute of this empty element.
  • The element remains available in the editor whilst other edits are made
  • When the content is ‘published’ to the web, the empty element is present in the page output, and therefore in the DOM for the document.
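
The last publishing check can be partly automated by comparing the markup you entered with the markup the tool published. The sketch below is one way to do that, assuming Python's standard html.parser module; ValueTitleCollector, value_titles and lost_in_publishing are hypothetical names, and the check only looks at whether each value-title title attribute survived, not at where the element ended up in the document.

```python
from html.parser import HTMLParser


class ValueTitleCollector(HTMLParser):
    """Collects the title attribute of every element with class value-title."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if "value-title" in (attr.get("class") or "").split():
            self.titles.append(attr.get("title"))


def value_titles(html):
    """Return the value-title titles found in html, in document order."""
    collector = ValueTitleCollector()
    collector.feed(html)
    collector.close()
    return collector.titles


def lost_in_publishing(source_html, published_html):
    """Return titles present in the source but missing from the published output."""
    remaining = value_titles(published_html)
    lost = []
    for title in value_titles(source_html):
        if title in remaining:
            remaining.remove(title)  # match each published element only once
        else:
            lost.append(title)
    return lost


source = ('<span class="published updated">'
          '<span class="value-title" title="2009-01-09T11:33:00-0800"></span>'
          'January 9th, around lunchtime</span>')
# Simulated output of a tool that strips empty elements.
stripped = '<span class="published updated">January 9th, around lunchtime</span>'

print(lost_in_publishing(source, source))    # []
print(lost_in_publishing(source, stripped))  # ['2009-01-09T11:33:00-0800']
```

A non-empty result means the tool dropped machine data somewhere between editing and publishing, which is worth documenting as a failed test.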

Response

  • Don't like the empty element? Don't like the use of the title attribute? Please file general issues concerning the proposed pattern on the main value excerption brainstorming page, or discuss them on the mailing list.
  • Add results of tests and responses to these tests themselves on this page.

Misplaced responses will be moved, and having to do so will make Ben growly, so, y'know, please try to keep the wiki tidy.

Please Also Try

Please also try the value-excerption-dt-separation-test. This is not either/or. Ideally both will work and can be carried forward.

Successful Tests

List successfully tested environments here. Add new environments as new list items, and expand existing list items with your name and platform variants to indicate verified successes.

Results of tests across various publishing/rendering environments
| Product | P/C? (Publishing or Consuming) | Platforms | Test By | Notes | T2 (second phase, with whitespace) |
|---|---|---|---|---|---|
| MediaWiki/Linux | Publishing | Safari 3.2.1 (Mac OSX 10.5) | User:BenWard | The empty span elements are maintained in MW output. Note that a elements in the tests get escaped by this MediaWiki install. | |
| TinyMCE 3.2.1.1 | Publishing | Safari 3.2.1 (Mac OSX 10.5) | User:BenWard | Fails when trying to publish empty element (see below) | Pass |
| FCKEditor 2.6.4 Beta | Publishing | Safari 3.2.1 (Mac OSX 10.5) | User:BenWard | Fails when trying to publish empty element (see below) | Pass |
| Safari 2.0.4 | Consuming | Mac OSX 10.5 | User:GeorgeBrock | Empty-span remains in DOM. No tooltip. | Pass |
| Safari 3.0.3 | Consuming | Windows XP (SP3) | User:EmilyLewis | No tooltip | |
| Safari 3.2.1 | Consuming | Mac OSX 10.5 | User:BenWard, User:EmilyLewis | Empty-span remains in DOM. Web Inspector reports the element has width and height of ‘0px’. No tooltip due to zero-dimensions. | Pass |
| Firefox 2.0 | Consuming | Mac OSX 10.5 | User:GeorgeBrock, User:BenWard | Empty-span remains in DOM. Firebug reports the element has width of '0px' and height of ‘16px’. No tooltip due to zero-width. | Pass |
| Firefox 3.0.x | Consuming | Mac OSX 10.5 | User:BenWard, User:EmilyLewis | Empty-span remains in DOM. Firebug reports the element has width of '0px' and height of ‘16px’. No tooltip due to zero-width. | Pass |
| Firefox 3.0.x | Consuming | Windows XP SP3 | User:EmilyLewis | No tooltip | |
| Firefox 3.1ß1 | Consuming | Mac OSX 10.5 | User:BenWard | Empty-span remains in DOM. No tooltip. | |
| Opera 9.6 | Consuming | Mac OSX 10.5.6 | User:EmilyLewis | No tooltip | |
| Opera 9.62 | Consuming | Mac OSX 10.5, Windows XP SP3 | User:BenWard, User:EmilyLewis | Empty-span remains in DOM. Dragonfly reports the element has width of '0px' and height of ‘0px’. No tooltip due to zero-dimensions. | Pass |
| Opera 10 Alpha | Consuming | Mac OSX 10.5 | User:BenWard | Empty-span remains in DOM. Dragonfly reports the element has width of '0px' and height of ‘0px’. No tooltip due to zero-dimensions. | Pass |
| Internet Explorer 5.2 | Consuming | Mac OSX 10.5 | User:BenWard | No tooltip. | Pass |
| Internet Explorer 6.0 | Consuming | Windows XP (SP3) | User:GeorgeBrock, User:EmilyLewis | Empty-span remains in DOM (visible to Web Developer Toolbar, accessible from Javascript). No tooltip. | Pass |
| Internet Explorer 7.0 | Consuming | Windows XP (SP3) | User:GeorgeBrock, User:EmilyLewis | Empty-span remains in DOM (visible to Web Developer Toolbar, accessible from Javascript). No tooltip. | Pass |
| Internet Explorer 8.0 beta 2 | Consuming | Windows XP (SP3) | User:GeorgeBrock | Empty-span remains in DOM. Developer Tools (built in to IE8b2) report the element has width of '0px' and height of '19px'. No tooltip due to zero-width. | Pass |
| Camino 1.6.5 | Consuming | Mac OSX 10.5.6 | User:EmilyLewis | No tooltip | |
| Camino 1.6.6 | Consuming | Mac OSX 10.5.6 | User:GeorgeBrock | Empty-span remains in DOM. Firebug Lite reports width of '0px' and height of '16px'. No tooltip due to zero-width. | Pass |
| Flock 2.0.2 | Consuming | Mac OSX 10.5.6 | User:EmilyLewis | No tooltip | |
| Flock 2.0.2 | Consuming | Windows XP (SP3) | User:EmilyLewis | No tooltip | |
| Chrome 1.0.154.43 | Consuming | Windows XP (SP3) | User:EmilyLewis | No tooltip | |
| BlackBerry Browser | Consuming | BlackBerry Storm OS 4.0.7.85 (leaked, non-official OS build from Verizon) | User:EmilyLewis | Tested in ‘BlackBerry mode’. No tooltip | |
| MobileSafari 2.2 | Consuming | iPhone OS 2.2 | User:EmilyLewis, User:BenWard | No tooltip | Pass |
| IE Mobile 7.11 | Consuming | Sprint HTC Mogul, OS: Windows Mobile 6.1 | User:EmilyLewis | No tooltip | |
| JAWS 10.0.512 (demo mode) | Consuming | Windows XP (SP3) | User:EmilyLewis | No voice | |
| Fire Vox (Firefox 3.0.5) | Consuming | Mac OSX 10.5.6 | User:EmilyLewis | No voice | |
| WebbIE (Thunder 1.43) | Consuming | Windows XP (SP3) | User:EmilyLewis | No voice | |
| VoiceOver (Safari 3.2.1) | Consuming | Mac OSX 10.5.6 | User:EmilyLewis, User:BenWard | No voice. When interacting with elements, cannot focus on the empty-span. | Pass |

Failed Tests

For failures, please provide as much information as you can: the precise impact of the error, whether the behavior could be regarded as a bug in the software you're testing, whether it works in subsequent releases, whether you changed any settings in the software to produce the result, and, if so, whether enabling/disabling that setting should be regarded as a showstopper if this pattern were certified.

Since we want more detail, please expand failures into headed sections rather than cramming into a table.

For example, take this entirely plausible scenario as a template:

Example: Fake Publisher 3.1ß

Platform
Windows Vista
Test By
User:BenWard
Description
When trying to enter an empty span in my HTML editor, which I wrote myself whilst I was high, the application immediately crashes, performs rm -rf / on all UNIX boxes connected to my local network (which also appears to cause Android phones within Bluetooth range to do the same…), and then causes all attached peripherals to combust. I was not able to reproduce, as my house was now on fire. I think using a self closing XHTML tag instead might work-around the problem because as we know, it's been proved by Real Scienticians that XML is always better than HTML. Alternatively, it may be a bug in the beta software.
Notes
This is a beta release, and a bug has been filed.
This product has a known history of flammability bugs.
The user must explicitly enable the ‘Endanger My Life’ checkbox under the ‘Advanced Mislabelled Checkboxes’ tab of the ‘Complicated Preferences’ preferences pane.
You get the idea.


Browser WYSIWYG editor: TinyMCE 3.2.1.1

Platform
Firefox 3.0 on Mac OSX 10.5
Test By
User:GeorgeBrock
Description
Empty spans entered using TinyMCE's source editor are removed when switching back to the default WYSIWYG view.
This is not a bug in TinyMCE; the editor is designed to remove empty instances of most elements by default. The handling of empty elements can be changed by modifying the valid_elements setting (e.g. change -span to span and empty spans will no longer be removed). However, settings can only be changed by modifying the source code of the page that contains the TinyMCE instance, so it is likely that some users will not be able to apply this fix.
Notes
This behavior can be easily reproduced using the online TinyMCE examples.
TinyMCE is the default WYSIWYG editor in WordPress[1].
With a single whitespace character (‘space’), the span is preserved, yet it still seems to collapse in Safari. Will write a follow-up test.--BenWard 23:11, 20 January 2009 (UTC)
You can publish this pattern using the input type='hidden' class='value-title' title='2009-01-20' hinted above. Semantically odd, although it's been suggested on µf lists before. Publishing workaround exists through element agnosticism. --BenWard 23:11, 20 January 2009 (UTC)

Browser WYSIWYG editor: FCKEditor 2.6.4 Beta

Platform
Firefox 3.0 on Mac OSX 10.5
Test By
User:GeorgeBrock
Description
Empty spans entered using FCKEditor's source editor are removed when switching back to the default WYSIWYG view or submitting the form that contains the FCKEditor instance.
Notes
This behavior can be easily reproduced using the online FCKEditor demo.
FCKEditor is used by various content management systems, frameworks and applications[2].
With a single whitespace character (‘space’), the span is preserved, yet it still seems to collapse in Safari. Will write a follow-up test.--BenWard 23:12, 20 January 2009 (UTC)
You can publish this pattern using the input type='hidden' class='value-title' title='2009-01-20' mentioned above. --BenWard 23:12, 20 January 2009 (UTC)
This issue does not affect FCKEditor instances that use the Placeholders plugin (this plugin comes with FCKEditor; enable it using FCKConfig.Plugins.Add('placeholder')). Examining the Placeholders plugin code may yield an easier work-around than including the plugin (I had a quick look but didn't see anything that was obviously responsible for allowing empty spans) -GeorgeBrock 23:20, 20 January 2009 (UTC)


General Test Feedback

  • Any general feedback you have on this test is most welcome. However, if you have issues with the pattern or alternate suggestions, please file them on the main value-excerption-pattern-issues page. Also, please remember to sign your comments with —~~~~ —BenWard 00:12, 9 January 2009 (UTC)

Related Pages