[uf-discuss] Nested Microformats

Brian Suda brian.suda at gmail.com
Sat Dec 9 08:35:50 PST 2006

On 12/9/06, David Janes <davidjanes at blogmatrix.com> wrote:
> The issue that I've been trying to solve in my mind (and I'm sure
> we're all on the same page here) is given an attribute A nested in
> microformats M, N and P (from inner to outer), is "what does A belong
> to". If the answer is "all of them" then there seems to be a
> potential conflict between "consistent meaning" and "same meaning".
> Consider this nesting:
> <body>
> <div class="hentry">
>   ...
>   <div class="published">8 December 2006</div>
> </div>
> <div class="published">9 December 2006</div>
> </body>

--- Correct. The one rule we do have for some of these issues is the
"singleton property" rule: "published" is only extracted/decoded from
the first instance. So, if the page-level "published" comes before the
hentry one, everything works out fine. The first instance of
"published" (the page-level one) is associated with the whole page,
and any subsequent "published" values are ignored.

When parsing just the hentry, the parser only "starts" looking for
microformatted data "below"/"nested in" the hentry instance. The
page-level "published" is outside the hentry, so it is NOT the first
instance in relation to the hentry; the first "published" nested
inside the hentry is used as the hentry's published date. (I hope that
makes sense. I re-read it and I think it is clear; if not, I can write
out a full example.)
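
A minimal sketch of the scoped "first instance" rule described above,
in Python. It assumes well-formed markup that the standard library
XML parser can read; the helper names (has_class, first_instance) are
made up for illustration and are not part of any microformats parser.

```python
import xml.etree.ElementTree as ET

# Page-level "published" placed BEFORE the hentry, as the reply suggests.
PAGE_FIRST = """<body>
<div class="published">9 December 2006</div>
<div class="hentry">
  <div class="published">8 December 2006</div>
</div>
</body>"""

# The ordering from the quoted message: hentry first, page-level date last.
HENTRY_FIRST = """<body>
<div class="hentry">
  ...
  <div class="published">8 December 2006</div>
</div>
<div class="published">9 December 2006</div>
</body>"""


def has_class(el, name):
    """True if `name` is one of the space-separated class values of `el`."""
    return name in el.get("class", "").split()


def first_instance(scope, prop):
    """Text of the first element below `scope` (document order) with class `prop`."""
    for el in scope.iter():
        if el is not scope and has_class(el, prop):
            return (el.text or "").strip()
    return None


body = ET.fromstring(PAGE_FIRST)
hentry = next(el for el in body.iter() if has_class(el, "hentry"))

page_published = first_instance(body, "published")     # search starts at the page
entry_published = first_instance(hentry, "published")  # search starts at the hentry

# With the quoted ordering, a page-level search meets the hentry's date first.
quoted_page_published = first_instance(ET.fromstring(HENTRY_FIRST), "published")
```

With the page-level date first, the page gets "9 December 2006" and the
hentry gets "8 December 2006". With the ordering from the quoted
message, the same page-level search returns the hentry's date instead,
which is why the source order matters for this rule.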

> In this example, I'm reusing "published" to mean "the date of
> publication of a microformatted object"; in one case, a blog entry and
> in the other case, the page itself. This reuses the "published" class
> from hAtom to a new microformat for describing the publication date of
> the page (some research has happened on this in the past). If we ask
> the parser for "give me the publication date of the page", then
> obviously it has to sort out which to use. We could define a whole
> new class for describing the publication date of the page, but then we
> have multiple classes meaning more or less the same thing.

--- Yes, for better or worse, we have sometimes created several
properties that mean much the same thing but have different names. In
hAtom we have published and updated, with hCard we have REV, and with
hCalendar we have last-updated. So we SORT OF have 4 values with
similar semantics. If this is an issue we could start a discussion
about collapsing some of this data, or we could just not worry about
it until we hit real-world issues (which is what I would prefer).

We could ALSO make new parsing rules. If we do the proper research and
find that every page publishing a "last-updated" timestamp puts it at
the bottom, then it is not inconceivable that the rule for parsing
last-page-update-date-time would be "last instance". That too has its
pros and cons...
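
As a rough illustration of that trade-off, here is a small Python
sketch comparing a "first instance" and a "last instance" rule on the
markup from the quoted message. The instances() helper is hypothetical,
and "last instance" is only the possible rule floated above, not an
agreed parsing rule.

```python
import xml.etree.ElementTree as ET

# Markup from the quoted message: hentry first, page-level date at the bottom.
PAGE = """<body>
<div class="hentry">
  ...
  <div class="published">8 December 2006</div>
</div>
<div class="published">9 December 2006</div>
</body>"""


def instances(scope, prop):
    """All values below `scope` carrying class `prop`, in document order."""
    return [
        (el.text or "").strip()
        for el in scope.iter()
        if el is not scope and prop in el.get("class", "").split()
    ]


found = instances(ET.fromstring(PAGE), "published")

first_rule = found[0]   # current singleton behaviour: first instance wins
last_rule = found[-1]   # hypothetical page-level rule: last instance wins
```

Here the "last instance" rule picks the page-level date at the bottom
of the page, while the "first instance" rule picks the hentry's date.
The con is the mirror image: if an hentry follows the page-level date,
"last instance" would pick the entry's date instead.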

> I don't have a happy solution for this and maybe it just comes down to
> "work it out case by case". However, I potentially see it to be very
> useful to reuse things like "fn" in nested microformats.

--- There was a discussion a LONG time ago about an "MSO" value, which
(I think) stood for "microformats something/simple object". Basically
it was a "don't look in here" marker (if I remember correctly). It
never really went anywhere because the "only use the first instance"
rule solved all of the known issues, but I'll let someone who knows
more about MSO elaborate on it.


brian suda
