[uf-discuss] multi-word tags
Andy Mabbett
andy at pigsonthewing.org.uk
Tue Sep 25 06:12:01 PDT 2007
The rel-tag spec says:
Spaces can be encoded either as + or %20.
(I think that "can" ought to be an RFC2119 "SHOULD", BTW; other specs
may similarly need to have RFC2119 applied more rigorously).
However, I see sites in the wild using dashes (hyphens), and others
which encode spaces as underscores (e.g. del.icio.us; Wikipedia and
other MediaWiki sites). See previously compiled evidence at:
<http://microformats.org/wiki/rel-tag-spaces>
Indeed, pages on the microformats wiki use hyphens in URLs which would
seem suitable for use as tag spaces:
<a
rel="tag"
href="http://microformats.org/wiki/existing-rel-values"
>
existing rel values
</a>
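
For anyone wanting to experiment, here is a rough sketch of how the tag could be pulled out of such a link, assuming (per rel-tag) that the tag is the last path segment of the href. The function name tag_from_href is mine, purely for illustration, and is not taken from any existing parser:

```python
from urllib.parse import urlsplit


def tag_from_href(href: str) -> str:
    """Return the raw, still-encoded tag: the last path segment of the URL.

    Hypothetical helper for illustration only; query strings and
    fragments are ignored, as is any trailing slash.
    """
    path = urlsplit(href).path
    return path.rstrip("/").rsplit("/", 1)[-1]


print(tag_from_href("http://microformats.org/wiki/existing-rel-values"))
```

For the wiki link above this yields the hyphenated string "existing-rel-values", not the link text "existing rel values".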
Operator, for example, regards:
West+Midland+Bird+Club
West-Midland-Bird-Club
and:
West_Midland_Bird_Club
as three distinct tags, and does not discard them as duplicates (see
test page at <http://www.westmidlandbirdclub.com/tag-test.htm> ).
What do other parsers and implementations do? Should the spec be
altered, so that the above, and:
West%20Midland%20Bird%20Club
are all deemed equal?
Likewise, for that matter:
West+Midland-Bird_Club
This might be achieved by saying that spaces in tags SHOULD be encoded
as either + or %20, but that parsers MUST treat dashes and
underscores as spaces.
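
To make that concrete, here is a minimal sketch of what such parser behaviour might look like. It is only an illustration of the proposal, not what Operator or any other implementation actually does, and the name normalize_tag is my own invention:

```python
from urllib.parse import unquote_plus


def normalize_tag(raw: str) -> str:
    """Fold the four space spellings (+, %20, -, _) into a single space.

    Hypothetical helper sketching the proposed rule; case is left alone.
    """
    # unquote_plus turns '+' into a space and decodes %20 (and other escapes).
    decoded = unquote_plus(raw)
    # Treat dashes and underscores as spaces too, as proposed.
    for separator in "-_":
        decoded = decoded.replace(separator, " ")
    # Collapse any runs of whitespace so the variants compare equal.
    return " ".join(decoded.split())


variants = [
    "West+Midland+Bird+Club",
    "West-Midland-Bird-Club",
    "West_Midland_Bird_Club",
    "West%20Midland%20Bird%20Club",
    "West+Midland-Bird_Club",
]
# All five spellings reduce to one normalized tag.
print({normalize_tag(v) for v in variants})
```

One obvious cost of this rule: a tag that genuinely contains a hyphen or underscore (say, "rel-tag" itself) could no longer be told apart from its spaced form.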
Or we could simply say that spaces in tags SHOULD be encoded
as +, %20, - or _.
--
Andy Mabbett