[uf-dev] Proper use of value

Manu Sporny msporny at digitalbazaar.com
Tue Apr 22 12:37:56 PDT 2008


Michael Kaply wrote:
> OK, how about this.
> 
> When retrieving individual values from the document, if there is any
> whitespace, it is collapsed into one space, and leading and trailing
> whitespace is NOT removed.

Just my $0.02 on this - we had a very involved discussion (lasting
several months) when tackling this problem at the W3C with regard to
how to do whitespace canonicalization in RDFa. In the end, we stated
that the parser should keep the original text as is (including all
whitespace), and that it's up to the application to normalize spaces in
a way that makes sense to the application.

Note that we make a strong distinction between the parser (e.g.
librdfa[1]) and the application using the parser (Firefox + Fuzzbot[2]).
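
To illustrate the split, here is a rough sketch of two whitespace
policies an application might apply to the raw text a parser hands
back. The function names and the sample string are made up for
illustration; neither librdfa nor Fuzzbot is claimed to expose anything
like this. The first policy is my reading of Michael's proposal
(collapse runs, keep the edges); the second also trims.

```python
import re


def collapse_whitespace(raw: str) -> str:
    """Collapse each run of whitespace into one space.

    Leading and trailing whitespace is collapsed too, but not removed.
    (Hypothetical helper, not part of any parser API.)
    """
    return re.sub(r"\s+", " ", raw)


def collapse_and_trim(raw: str) -> str:
    """Collapse runs of whitespace and also drop leading/trailing space.

    (Hypothetical helper showing a second, equally plausible policy.)
    """
    return " ".join(raw.split())


# Raw text exactly as a whitespace-preserving parser would return it.
raw = "  Manu\n\t Sporny  "

print(repr(collapse_whitespace(raw)))  # ' Manu Sporny '
print(repr(collapse_and_trim(raw)))    # 'Manu Sporny'
```

Both policies work from the same raw string, and that only holds if
the parser has not already thrown the whitespace away.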

The primary reasoning for this is that several people wanted to
canonicalize whitespace in different ways, and at the end of the day
we didn't want to force application writers into a certain method of
whitespace canonicalization. Here's the actual text that we settled upon
at the W3C with regard to whitespace canonicalization:

PLAIN LITERAL (aka: basic text) CANONICALIZATION:

"The actual literal is ... a string created by concatenating the text
content of each of the descendant elements of the [current element] in
document order."

This means that all new lines, tabs, spaces and other whitespace
characters are preserved for processing at a later time by the
application that is using the parser.
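
As a minimal sketch of what that rule amounts to, the snippet below
joins every piece of text inside an element in document order without
touching the whitespace. It uses Python's standard ElementTree purely
for illustration. The plain_literal() helper and the sample markup are
assumptions of mine, not librdfa's API, and I am treating text sitting
directly inside the element as part of the literal, the way DOM
textContent does.

```python
import xml.etree.ElementTree as ET


def plain_literal(element: ET.Element) -> str:
    """Concatenate all text inside `element` in document order.

    itertext() yields the element's own text, each descendant's text,
    and the text that follows each descendant, with no normalization.
    (Hypothetical helper, for illustration only.)
    """
    return "".join(element.itertext())


# Made-up sample markup with a newline, indentation, and a tab.
markup = '<span property="name">\n  Manu <b>Sporny</b>\t\n</span>'
literal = plain_literal(ET.fromstring(markup))

print(repr(literal))  # '\n  Manu Sporny\t\n'
```

The newline, the indentation and the tab all survive, so whichever
normalization the application prefers can still be applied afterwards.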

I think the above is the proper approach - otherwise you end up with the
issues that we had with whitespace canonicalization and Internet
Explorer 6. IE6 assumes that you want the whitespace canonicalized in a
certain way, thus the non-canonicalized whitespace isn't available in
the DOM accessed via JavaScript. When you choose to perform whitespace
canonicalization in a certain way, you're bound to tick off a subset
of developers/authors. :)

Does this approach sound like a better one to take?

-- manu

[1] http://rdfa.digitalbazaar.com/librdfa/
[2] http://rdfa.digitalbazaar.com/fuzzbot/

-- 
Manu Sporny
President/CEO - Digital Bazaar, Inc.
blog: RDFa Basics in 8 minutes (video)
http://blog.digitalbazaar.com/2008/01/07/rdfa-basics/

