Microformat Objects Representing an Entire Page

BenWard
(Initial representative-object page, brainstorming how to get the more representative microformat objects from a page.)
Revision as of 06:39, 25 June 2009

Pages may (and often do) contain multiple microformats: objects of different vocabularies, multiple objects from the same vocabulary, and objects from other sources of structured data. Some use cases (such as search engine results) want to use the microformats in a page to better represent that page, but must first work out which object is the most important: the one that really represents the page, as opposed to an incidental piece of data.

This page is for brainstorming methods of ranking all the structured data objects in a page, prioritizing and de-prioritizing them according to conditions, such that a consumer tool could pick the item ranked highest to confidently represent the page.
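As a rough illustration of the ranking idea described above, here is a minimal Python sketch. It assumes a parser has already produced a list of objects in document order; the `MfObject` shape, the `Rule` type, and the `rank` and `representative` helpers are all hypothetical names made up for this sketch, not part of any microformats parser.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class MfObject:
    """A parsed microformat object (hypothetical shape, for illustration only)."""
    vocabulary: str      # e.g. "hcard", "hentry", "vevent"
    score: float = 0.0   # filled in by rank()


# A rule inspects one object and returns a score adjustment:
# positive to prioritize, negative to deprioritize.
Rule = Callable[[MfObject], float]


def rank(objects: List[MfObject], rules: List[Rule]) -> List[MfObject]:
    """Score every object with every rule and return them best-first."""
    for obj in objects:
        obj.score = sum(rule(obj) for rule in rules)
    # sorted() is stable, so objects with equal scores keep document order.
    return sorted(objects, key=lambda o: o.score, reverse=True)


def representative(objects: List[MfObject], rules: List[Rule]) -> Optional[MfObject]:
    """Return the highest-ranked object, or None for a page with no objects."""
    ranked = rank(objects, rules)
    return ranked[0] if ranked else None
```

Each prioritisation method proposed on this page could then be expressed as one rule function, with the weights left open for discussion.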


Examples of Problems

Note that the similarity of some of these problems highlights how subtle the distinctions can be.

  1. A personal blog homepage contains an hCard for the author and hAtom blog entries. You would represent the page using the hCard.
  2. A group blog contains hCards for the authors and hAtom entries. You would represent the page with information about the feed.

Please add more

Methods of Prioritisation

When you document a possible technique for analyzing or prioritizing objects, please give it a new heading and sign your contribution, following the pattern of the entries below.

Deprioritize Compound Objects

Many microformats, including hCalendar, hAtom, and hReview, include sub-properties that are themselves microformats (author, organizer, agent, etc.). Although these can also be parsed as standalone microformats, when they are used directly as a component of another microformat they should be deprioritized. --BenWard 06:39, 25 June 2009 (UTC)
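One possible reading of this rule, sketched in Python. It assumes the parser records which property of a parent microformat (if any) an object fills; the `parent_property` field and the penalty value are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed weight; the actual value is open to discussion.
COMPOUND_PENALTY = -1.0


@dataclass
class MfObject:
    """Hypothetical parsed object, for illustration only."""
    vocabulary: str
    # Name of the parent property this object fills, e.g. "author" when an
    # hCard is the author of an hentry; None for a standalone object.
    parent_property: Optional[str] = None


def compound_adjustment(obj: MfObject) -> float:
    """Penalize objects that are used as a property of another microformat."""
    return COMPOUND_PENALTY if obj.parent_property is not None else 0.0
```

With this sketch, an hCard used as the author of an hentry would receive the penalty, while the same hCard published on its own would not.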

Deprioritize Objects Contained in hAtom Entries

Since hAtom entries represent articles, the content of each hentry may contain other microformat objects. Blog posts about an event or another person, for example, may contain hCalendar and hCard microformats.

In a blogging context these entries are chronological content: they pass through the page as more content is written. As such, microformats nested inside entries could be deprioritized. --BenWard 06:39, 25 June 2009 (UTC)
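A minimal sketch of this check in Python, assuming each parsed object keeps a reference to the microformat object that encloses it. The `parent` field, the helper names, and the penalty value are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed weight; the actual value is open to discussion.
NESTED_IN_ENTRY_PENALTY = -1.0


@dataclass
class MfObject:
    """Hypothetical parsed object, for illustration only."""
    vocabulary: str
    parent: Optional["MfObject"] = None  # enclosing microformat object, if any


def is_inside_hentry(obj: MfObject) -> bool:
    """Walk up the chain of enclosing objects looking for an hentry."""
    node = obj.parent
    while node is not None:
        if node.vocabulary == "hentry":
            return True
        node = node.parent
    return False


def hentry_adjustment(obj: MfObject) -> float:
    """Penalize objects found anywhere inside an hAtom entry."""
    return NESTED_IN_ENTRY_PENALTY if is_inside_hentry(obj) else 0.0
```

The walk covers objects nested at any depth, so an hCard inside an hCalendar event inside an entry is penalized as well.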

Prioritize hCards with rel=me or any object with a uid property on the same domain

See also representative-hcard for ways of working out which hCard is representative of a page when compared to others (such as a blog author's hCard in relation to the hCards of article commenters).

Where an object in a page has a uid property pointing to the same domain as the current page, and/or (for hCards only) makes a rel=me link to the same domain, that object should be weighted in its favour.

Some weight could also be given to any object that identifies the same domain in its url property. --BenWard 06:39, 25 June 2009 (UTC)
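The domain checks above might look something like the following Python sketch. The object shape and all three weights are assumptions; "same domain" is interpreted here as an exact host name match, which is only one of several possible interpretations.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import urlsplit

# Assumed weights; the actual values are open to discussion.
UID_WEIGHT = 2.0
REL_ME_WEIGHT = 2.0
URL_WEIGHT = 0.5


@dataclass
class MfObject:
    """Hypothetical parsed object, for illustration only."""
    vocabulary: str
    uid: Optional[str] = None
    url: Optional[str] = None
    rel_me: List[str] = field(default_factory=list)  # hrefs of rel=me links


def same_domain(candidate: Optional[str], page_url: str) -> bool:
    """True if candidate is a URL whose host equals the page's host."""
    if not candidate:
        return False
    host = urlsplit(candidate).hostname  # None for non-URL values
    return host is not None and host == urlsplit(page_url).hostname


def domain_adjustment(obj: MfObject, page_url: str) -> float:
    """Add weight for uid, rel=me (hCards only) and url on the page's domain."""
    score = 0.0
    if same_domain(obj.uid, page_url):
        score += UID_WEIGHT
    if obj.vocabulary == "hcard" and any(
        same_domain(href, page_url) for href in obj.rel_me
    ):
        score += REL_ME_WEIGHT
    if same_domain(obj.url, page_url):
        score += URL_WEIGHT
    return score
```

The url match is given a smaller weight than uid or rel=me here, reflecting the weaker wording ("some weight") used for it above.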

Prioritize hCards nested inside or containing address elements

Since the address element may only be used for the contact information of the person or organisation responsible for the page, the presence of this element on or within an hCard should add weight to that hCard. --BenWard 06:39, 25 June 2009 (UTC)
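To make the address check concrete, here is a rough Python sketch built on the standard library's `html.parser`. It reports, for each element with the `vcard` class, whether it is an address element, sits inside one, or contains one. The class and function names are hypothetical, and the handling of unclosed tags is deliberately simplistic.

```python
from html.parser import HTMLParser
from typing import List

# Elements that never have a closing tag, so they are not pushed on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class AddressHCardFinder(HTMLParser):
    """Collects one boolean per hCard root, in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.stack: List[dict] = []    # currently open elements
        self.results: List[bool] = []  # True if the hCard is tied to an address

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        entry = {"tag": tag, "vcard_index": None}
        if "vcard" in classes:
            # hCard on an address element, or nested inside one.
            tied = tag == "address" or any(e["tag"] == "address" for e in self.stack)
            entry["vcard_index"] = len(self.results)
            self.results.append(tied)
        if tag == "address":
            # An address inside open hCards: those hCards contain it.
            for e in self.stack:
                if e["vcard_index"] is not None:
                    self.results[e["vcard_index"]] = True
        self.stack.append(entry)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, dropping unclosed children.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break


def hcards_with_address(html: str) -> List[bool]:
    finder = AddressHCardFinder()
    finder.feed(html)
    finder.close()
    return finder.results
```

A consumer could add a fixed bonus to every hCard flagged `True`; the size of that bonus is left open here.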

Create a new microformat for authors to explicitly publish the representative object for their page

If authors could add a representative class name (or functional equivalent) to any microformat on a page, it could indicate to the parser that the object described is the definitive object over all others, allowing the parser to bypass heuristics. --BenWard 06:39, 25 June 2009 (UTC)
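A sketch of how a consumer might honour such a marker, in Python. The literal class name `representative` is taken from the proposal above and is not an existing microformat; the object shape and the `pick` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Proposed class name from the idea above; not part of any existing microformat.
REPRESENTATIVE_CLASS = "representative"


@dataclass
class MfObject:
    """Hypothetical parsed object, for illustration only."""
    vocabulary: str
    classes: List[str] = field(default_factory=list)  # class names on the root element
    score: float = 0.0                                # result of any heuristics


def pick(objects: List[MfObject]) -> Optional[MfObject]:
    """Prefer an explicitly marked object; otherwise fall back to the scores."""
    for obj in objects:
        if REPRESENTATIVE_CLASS in obj.classes:
            return obj  # the author's explicit choice bypasses heuristics
    return max(objects, key=lambda o: o.score, default=None)
```

If more than one object carried the marker, this sketch would simply take the first in document order; how to treat that case is an open question.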
