microformats2-implied-properties: Difference between revisions

From Microformats Wiki
(→‎additional markup patterns to consider: provide finer grouping / explanation, and add root element with one child and text based on Barnaby Walters's example in the wild)
Revision as of 20:31, 1 September 2012

<entry-title>microformats 2 implied properties</entry-title>

summary

As part of the further simplifications in microformats 2, the following generic properties are automatically implied for microformats that have only a root class name and use certain common semantic markup patterns.

  • p-name on elements in general
  • u-url on hyperlink (<a href>) elements
  • u-photo on image (<img src>) elements

and combinations thereof.

root class only and name property

Web pages with microformats nearly always mark up proper nouns, which have a primary label of some sort, commonly known as a name.

Thus if a microformat has only a root class name and no properties, then the entire text content of the element is parsed as the p-name property of the microformat.

Markup example:

Simple person reference:

<span class="h-card">Frances Berriman</span>

Parsed JSON with implied name property:

{
  "type": ["h-card"],
  "properties": {
    "name": ["Frances Berriman"] 
  }
}
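
The rule above can be sketched in a few lines of Python. This is only an illustration, not a reference parser: the function name `parse_root_only` is hypothetical, and it assumes the markup is a well-formed XML fragment so that the standard library's `xml.etree.ElementTree` can read it (a real parser would need an HTML parser).

```python
import xml.etree.ElementTree as ET


def parse_root_only(markup: str) -> dict:
    """Hypothetical helper: imply p-name for a root-class-only element."""
    root = ET.fromstring(markup)
    # Root class names are the class tokens that start with "h-".
    types = [c for c in root.get("class", "").split() if c.startswith("h-")]
    # With no explicit properties, the entire text content is the name.
    name = "".join(root.itertext())
    return {"type": types, "properties": {"name": [name]}}


print(parse_root_only('<span class="h-card">Frances Berriman</span>'))
```

The returned dictionary has the same shape as the parsed JSON shown above.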

to-do:

  • try examples with organizations, events, products (however note that those are almost always capitalized thus implying proper noun / name semantics)

use-cases:

  • every single proper noun on any web page.

issues:

  • may not work for all microformats, e.g. how would adr or geo work with this?

aside name vs fn

In short, why use 'name' instead of the well established 'fn' from existing microformats? (per naming-principle etc.) Per feedback:

  • from microformats mailing-lists over the years
    • numerous individuals preferring 'name'
    • asking why 'fn' repeatedly despite FAQ
  • Google Rich Snippets renaming 'fn' as 'name' in RDFa and microdata variants of vocabularies
  • ...

Conclusion: use 'name' as the generic term moving forward instead of 'fn'; the benefits outweigh the costs.

hyperlink and url property

A significant proportion of proper noun references on web pages are a hyperlink to a page about that proper noun, AKA a URL for a proper noun.

Thus if a microformat is on a hyperlink element with only a root class name and no properties, then in addition to implying the name from its text contents, the href of the element is parsed as the u-url property of the microformat.

Markup example:

Simple hyperlinked person reference

<a class="h-card" href="http://benward.me">Ben Ward</a>

Parsed JSON:

{ 
  "type": ["h-card"],
  "properties": {
    "name": ["Ben Ward"],
    "url": ["http://benward.me"]
  }
}
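
As a rough sketch of the hyperlink case, the following Python adds the implied url on top of the implied name. The function name `parse_root_only_link` is made up for this example, and the code again assumes well-formed XML input handled by the standard library's `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def parse_root_only_link(markup: str) -> dict:
    """Hypothetical helper: imply p-name and, on <a href>, u-url."""
    root = ET.fromstring(markup)
    types = [c for c in root.get("class", "").split() if c.startswith("h-")]
    # The name is still implied from the text content of the element.
    properties = {"name": ["".join(root.itertext())]}
    href = root.get("href")
    # Only a hyperlink element with an href implies a url.
    if root.tag == "a" and href is not None:
        properties["url"] = [href]
    return {"type": types, "properties": properties}


print(parse_root_only_link('<a class="h-card" href="http://benward.me">Ben Ward</a>'))
```

A non-hyperlink element passed to the same function yields only the name, which matches the root class only case.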

to-do:

  • try examples with organizations, events, products

use-cases:

  • nearly every link to a person, e.g. in blog posts, blog rolls etc.; all the same data/use-cases that informed XFN.

image and name photo properties

Many proper noun references on web pages are simply an embedded image of that proper noun, typically a representative photo of the proper noun, often with its name as alt text.

Thus if a microformat is on an image element (<img src>) with only a root class name and no properties, then the src of the element is parsed as the u-photo property of the microformat, and if present, the alt of the element is parsed as the p-name property of the microformat.

Markup example:

<img class="h-card" src="http://example.org/pic.jpg" alt="Chris Messina" />

Parsed JSON:

{ 
  "type": ["h-card"],
  "properties": {
    "name": ["Chris Messina"],
    "photo": ["http://example.org/pic.jpg"]
  }
}
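
The image case might look like the following Python sketch. The name `parse_root_only_img` is hypothetical, and the snippet assumes the `<img>` is written in self-closing, well-formed XML form so that `xml.etree.ElementTree` accepts it.

```python
import xml.etree.ElementTree as ET


def parse_root_only_img(markup: str) -> dict:
    """Hypothetical helper: imply u-photo from src and p-name from alt."""
    root = ET.fromstring(markup)
    types = [c for c in root.get("class", "").split() if c.startswith("h-")]
    properties = {}
    if root.tag == "img":
        # The name is implied only if alt text is present and non-empty.
        if root.get("alt"):
            properties["name"] = [root.get("alt")]
        if root.get("src") is not None:
            properties["photo"] = [root.get("src")]
    return {"type": types, "properties": properties}


print(parse_root_only_img(
    '<img class="h-card" src="http://example.org/pic.jpg" alt="Chris Messina" />'
))
```

Treating an empty alt as "not present" is an assumption of this sketch; the text above only says the alt is used if present.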

to-do:

  • try examples with organizations, events, products

use-cases:

  • nearly every social networking site that shows a grid of people without text already does so with img tags and their names in the alt attribute. Most of those are linked too, which brings us to the next markup structure which implies properties:

hyperlinked image and name photo url properties

Often proper noun references on web pages are a representative image of the proper noun that is hyperlinked to a page about that proper noun.

Thus if a microformat is on a hyperlink element with only a root class name and no properties, and that element has but one child element that is an image, then in addition to implying the URL from its href, the src of the child image element is parsed as the u-photo property of the microformat, and if present, its alt attribute is parsed as the p-name property of the microformat.

Markup example:

<a class="h-card" href="http://rohit.khare.org/">
 <img alt="Rohit Khare"
      src="https://s3.amazonaws.com/twitter_production/profile_images/53307499/180px-Rohit-sq_bigger.jpg" />
</a>

Parsed JSON:

{ 
  "type": ["h-card"],
  "properties": {
    "name": ["Rohit Khare"],
    "url": ["http://rohit.khare.org/"],
    "photo": ["https://s3.amazonaws.com/twitter_production/profile_images/53307499/180px-Rohit-sq_bigger.jpg"]
  }
}
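
A minimal Python sketch of the hyperlinked image case follows. The function name `parse_linked_img` and the photo URL `http://example.org/rohit.jpg` are placeholders invented for this example, and the markup is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def parse_linked_img(markup: str) -> dict:
    """Hypothetical helper: <a href> root whose only child element is an <img>."""
    root = ET.fromstring(markup)
    types = [c for c in root.get("class", "").split() if c.startswith("h-")]
    properties = {}
    children = list(root)  # direct child elements only
    if root.tag == "a" and root.get("href") is not None:
        properties["url"] = [root.get("href")]
        # Exactly one child element, and it must be an image.
        if len(children) == 1 and children[0].tag == "img":
            img = children[0]
            if img.get("alt"):
                properties["name"] = [img.get("alt")]
            if img.get("src") is not None:
                properties["photo"] = [img.get("src")]
    return {"type": types, "properties": properties}


markup = (
    '<a class="h-card" href="http://rohit.khare.org/">'
    ' <img alt="Rohit Khare" src="http://example.org/rohit.jpg" />'
    '</a>'
)
print(parse_linked_img(markup))
```

The sketch copies the href verbatim into url; whether a parser should normalize URLs is outside what this page describes.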

to-do:

  • try examples with organizations, events, products

use-cases:

  • nearly every social networking site that shows a grid of people without text already does so with hyperlinked img tags and their names in the alt attribute.


additional markup patterns to consider

Document more use-cases from actual publishing examples in the wild of simple items, e.g. from Wikipedia.

Also consider some patterns per the principle of least surprise: flesh out the existing patterns with additional combinations of elements that follow the same basic pattern(s).

root class on one element

Let's consider root class only on:

  • <abbr class="h-card" title> - imply p-name from title
  • <object class="h-card" type="image/..." data="..."> - imply u-photo from data and p-name from text contents as with elements in general.
  • ...
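
To make the two proposals above concrete, here is a Python sketch of how they could behave if adopted. These patterns are only under consideration, so this is speculative: the function name `imply_from_single_element` and the example URLs are hypothetical, and well-formed XML input is assumed.

```python
import xml.etree.ElementTree as ET


def imply_from_single_element(markup: str) -> dict:
    """Hypothetical helper for the proposed <abbr title> and <object data> cases."""
    root = ET.fromstring(markup)
    properties = {}
    if root.tag == "abbr" and root.get("title"):
        # Proposed: the title attribute supplies the name.
        properties["name"] = [root.get("title")]
    elif root.tag == "object" and root.get("type", "").startswith("image/"):
        # Proposed: data supplies the photo, text content supplies the name.
        if root.get("data") is not None:
            properties["photo"] = [root.get("data")]
        properties["name"] = ["".join(root.itertext())]
    return properties


print(imply_from_single_element('<abbr class="h-card" title="Ace Brown">AB</abbr>'))
print(imply_from_single_element(
    '<object class="h-card" type="image/png" data="http://example.com/ace.png">Ace Brown</object>'
))
```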

root element and one child

  • <a class="h-card" href><object type="image/..." data="..."> - imply u-url from href and imply u-photo from the object element's data and p-name from text contents as with elements in general.
  • <a class="h-card" href><abbr title> - imply u-url from href and imply p-name from the abbr element's title
  • <span class="h-card"><a href="http://example.com">Ace Brown</a></span> - treat single hyperlink child as existing single hyperlink case. I.e. imply u-url from its href. This markup is generated in MediaWiki installs (e.g. Wikipedia) when you surround a MediaWiki link like [URL label] or [[Wiki page name]] with a <span> with a microformats root class name. Real world examples:
    • http://www.w3.org/wiki/TPAC2011-Committee
  • <span class="h-card"><a href="http://example.com"><img src alt="Ace Brown"/></a></span> ...similarly with single hyperlink child itself with single img child etc.
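
The nested single-child proposals in this list could be sketched as below. This is one possible reading, not settled behavior: `imply_via_single_link_child` is a hypothetical name, the image URL is a placeholder, and the input is assumed to be well-formed XML.

```python
import xml.etree.ElementTree as ET


def imply_via_single_link_child(markup: str) -> dict:
    """Hypothetical helper: root whose only child element is an <a href>."""
    root = ET.fromstring(markup)
    # Default: name from the text content, as with elements in general.
    properties = {"name": ["".join(root.itertext())]}
    children = list(root)
    if len(children) == 1 and children[0].tag == "a" and children[0].get("href"):
        link = children[0]
        # Treat the single hyperlink child like the root hyperlink case.
        properties["url"] = [link.get("href")]
        grandchildren = list(link)
        # Similarly, a single <img> inside that link supplies photo and name.
        if len(grandchildren) == 1 and grandchildren[0].tag == "img":
            img = grandchildren[0]
            if img.get("src"):
                properties["photo"] = [img.get("src")]
            if img.get("alt"):
                properties["name"] = [img.get("alt")]
    return properties


print(imply_via_single_link_child(
    '<span class="h-card"><a href="http://example.com">Ace Brown</a></span>'
))
```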

root element with one child and text

  • <a class="h-card" href><img alt="" src/> Ace Brown</a> - a variant of the hyperlinked image pattern in which the name is provided as visible text rather than as alt text, so an explicitly empty alt="" is used to avoid duplicate/redundant information in non-visual renderings (e.g. speech). Imply u-url from href, u-photo from src, and p-name from the text content of the root element, with leading/trailing white-space removed. Real world examples:
    • http://test.waterpigs.co.uk/activity/ - the element with class name "microcard"
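
A short Python sketch of this pattern, showing the white-space trimming in particular. The function name `imply_img_and_text` and the URLs are invented for the example, and well-formed XML input is assumed.

```python
import xml.etree.ElementTree as ET


def imply_img_and_text(markup: str) -> dict:
    """Hypothetical helper: <a href> with one <img alt=""> child plus visible text."""
    root = ET.fromstring(markup)
    properties = {}
    if root.tag == "a" and root.get("href") is not None:
        properties["url"] = [root.get("href")]
    children = list(root)
    if len(children) == 1 and children[0].tag == "img" and children[0].get("src"):
        properties["photo"] = [children[0].get("src")]
    # itertext() includes the text that follows the <img>; strip() removes
    # the leading/trailing white-space around the visible name.
    properties["name"] = ["".join(root.itertext()).strip()]
    return properties


print(imply_img_and_text(
    '<a class="h-card" href="http://example.com/">'
    '<img alt="" src="http://example.com/ace.jpg"/> Ace Brown</a>'
))
```

Because the alt is explicitly empty, the sketch never consults it and takes the name from the text alone.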

rejected root only markup patterns

Is there anything that should be implied semantically from root class only on:

  • <audio src> - no, because there is no common practice of simple audio element usage for proper nouns, it's nearly always nested in a broader context.
  • <time datetime> - no, because there are no microformats that are just a datetime
  • <video src> - no, because there is no common practice of simple video element usage, it's nearly always nested in a broader context.

Use-case counter-examples welcome.

see also