[uf-discuss] Tentative proposal for "What's New" listings
David Osolkowski
qidydl at gmail.com
Wed Sep 27 21:17:43 PDT 2006
On 9/26/06, Andy Mabbett <andy at pigsonthewing.org.uk> wrote:
> >there instead of " would
> >be perfectly legal and solve the problem, the escaped ampersand is my
> >code escaping out your HTML entities, which the validator then finds
> >bad because there should be no entities in a <title>).
>
> it seems reasonable to me that, if the HTML in question contains "&"
> then the corresponding title component of the feed should contain
> "&". Why is that not the case?
Unfortunately, escaping special characters in RSS feeds is almost
entirely unspecified. They can be unescaped, single-escaped,
double-escaped, even triple-escaped, and there is no single method
that everyone has standardized on. This is one of the big reasons the
Atom format was developed in the first place. So if the HTML *source*
contains "&amp;" (for the sake of playing nice), converting that to
RSS could produce any of "&", "&amp;", or "&amp;amp;" and each one
would be considered valid by different people and software. I believe
this is also why the feed validator prints a warning; it honestly
doesn't know whether this will work or not.
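
To make the ambiguity concrete, here is a rough sketch in Python of
what the different escaping levels look like and how a reader might
guess its way back to plain text. The helper names (escape_levels,
unescape_fully) and the sample title are made up for illustration;
this is not how any particular feed reader or the validator works.

```python
from html import escape, unescape


def escape_levels(text, max_level=3):
    """Return the text escaped 0, 1, ..., max_level times."""
    levels = [text]
    for _ in range(max_level):
        # quote=False: only &, < and > are escaped, quotes are left alone
        levels.append(escape(levels[-1], quote=False))
    return levels


def unescape_fully(text, limit=10):
    """Unescape repeatedly until nothing changes.

    Returns (decoded_text, number_of_passes). This is a guess, not a
    rule: a title that really is meant to display a literal entity
    would be decoded too far.
    """
    passes = 0
    while passes < limit:
        decoded = unescape(text)
        if decoded == text:
            break
        text = decoded
        passes += 1
    return text, passes


title = "Tom & Jerry"  # hypothetical title as the reader should see it
variants = escape_levels(title)
for level, variant in enumerate(variants):
    print(level, variant)
```

All four variants decode to the same string under unescape_fully,
which is exactly the problem: a reader that decodes only once, or not
at all, will display something different for each of them, and nothing
in the feed says which level the publisher intended.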
http://weblog.philringnalda.com/2005/12/18/who-knows-a-title-from-a-hole-in-the-ground
illustrates some of the variety in support for handling different
methods of escaping even when using a format with well-defined rules.
If possible, it makes things easier to just not use any special
characters in your title at all.
- David