[uf-discuss] a question about concatenation and hAtom entry content

Ben Wiley Sittler bsittler at gmail.com
Wed May 30 10:49:39 PDT 2007

On 5/30/07, David Janes <davidjanes at blogmatrix.com> wrote:
> On 5/24/07, Ben Wiley Sittler <bsittler at gmail.com> wrote:
> > excerpted from http://microformats.org/wiki/hAtom#Entry_Content :
> > > an Entry MAY have 0 or more Entry Content elements. The "logical Entry Content" of an Entry is the
> > > concatenation, in order of appearance, of all the Entry Contents within the Entry
> > >
> > >  Many weblogs split content into multiple sections with a "Read More" link and javascript tricks. This
> > >  is also needed in cases where Entry Titles are coded inline and are considered part of the content.
> >
> > so if an hAtom entry contains
> >
> > > <p class="entry-content">Content</p>
> > > <!-- ad --><p><a href="http://mozilla.com"><img src="chrome://branding/content/about.png" alt="Get Firefox!" /></a></p>
> > > <p class="entry-content">More Content</p>
> >
> > is the logical entry content
> > 1. "ContentMore Content" (concatenation with no intervening space),
> > 2. "Content More Content" (concatenation with space),
> > 3. "Content
> > More Content" (concatenation with newline), or
> > 4. something else entirely?
>
> I need to think more about this, though I'm fairly certain the answer
> should be (1).

i lean more toward "newline if at least one element is block-level,
nothing otherwise" since this preserves the html semantics better, i
think. this is approximately equivalent to taking all the individual
html elements, putting them inside a <div>...</div>, and calculating
the (nonportable) innerText. this also means that you can concatenate
the elements inside a <div>...</div> and namespace it as XHTML and
create your atom feed without ever converting to plain text, e.g.

<atom:content type="xhtml" xmlns:atom="http://www.w3.org/2005/Atom">
 <div xmlns="http://www.w3.org/1999/xhtml"><p>Content</p><p>More Content</p></div>
</atom:content>

(this is equivalent to concatenating the xhtml with no intervening
whitespace, although the class names have been removed for clarity)

the no-intervening-whitespace approach leaves us with

<atom:content type="xhtml" xmlns:atom="http://www.w3.org/2005/Atom">
 <div xmlns="http://www.w3.org/1999/xhtml">ContentMore Content</div>
</atom:content>

(this is equivalent to concatenating the innerText with no intervening
whitespace)

it seems to me that the former is closer to what the author wrote (and
intended), but i would like to hear other viewpoints and rationales.
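
to make the "newline if at least one element is block-level, nothing
otherwise" rule concrete, here is a rough sketch in python. it is only
an illustration, not anything hAtom specifies: the names
EntryContentCollector and logical_entry_content are made up for this
example, the BLOCK_LEVEL set is a small hand-picked assumption rather
than a complete list, and it only approximates innerText.

```python
from html.parser import HTMLParser

# assumption: a small, non-exhaustive set of block-level element names
BLOCK_LEVEL = {"p", "div", "blockquote", "ul", "ol", "li", "pre", "table",
               "h1", "h2", "h3", "h4", "h5", "h6"}
# void elements never get a matching end tag, so skip them when tracking depth
VOID = {"img", "br", "hr", "input", "meta", "link"}


class EntryContentCollector(HTMLParser):
    """collect (tag, text) for each top-level element with class entry-content"""

    def __init__(self):
        super().__init__()
        self.parts = []    # (tag name, text content) in order of appearance
        self._depth = 0    # nesting depth inside the current entry-content element
        self._tag = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "entry-content" in classes:
            self._depth = 1
            self._tag = tag
            self._text = []

    def handle_endtag(self, tag):
        if tag in VOID or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.parts.append((self._tag, "".join(self._text)))

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)


def logical_entry_content(html):
    """join entry contents with a newline if any of them is block-level"""
    collector = EntryContentCollector()
    collector.feed(html)
    collector.close()
    any_block = any(tag in BLOCK_LEVEL for tag, _ in collector.parts)
    separator = "\n" if any_block else ""
    return separator.join(text for _, text in collector.parts)
```

for the three-paragraph example quoted above (two <p class="entry-content">
elements with an ad paragraph in between) this gives option 3, "Content"
and "More Content" separated by a newline. if both entry contents were
<span> elements instead, it would give option 1, "ContentMore Content".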

