microformats2-brainstorming: Difference between revisions

From Microformats Wiki
(make implied properties consistent with microformats 2 property naming/prefixing conventions)
(→‎adopt itemref: Adds quote about itemref from whatwg html5 spec.)
 
(18 intermediate revisions by 4 users not shown)
Line 1: Line 1:
{{stub}}
Brainstorming experimental / undeveloped / rejected ideas for [[microformats-2]].
 
For the original brainstorming of microformats2 itself, see:
* [[microformats2-origins]]


Brainstorming experimental / undeveloped / rejected ideas for [[microformats-2]].


== further simplifications ==
== brainstorms to consider ==
=== document url ===
Since [[microformats2-parsing]] produces JSON with "rels" and "rel-urls", it may be useful to have a root context object which provides the URL of the document itself.


=== more on allow root class name only ===
=== adopt itemref ===
There are many existing real-world use-cases where either:
* several microformats in a page want to share some common data without repeating it.
** e.g. a page about a product with multiple reviews of that product (very common, products sites, Amazon/CNET et al, review aggregators, Yelp et al)
** e.g. representing the author of multiple hAtom entries on a page. Currently this is possible with the <code>&lt;address class="hcard"></code> optimisation, which would be rendered obsolete by the proposed new generic parsing rules.
* a microformat in a page needs to incorporate information spread across different parts of a page, without assigning the entire page to that microformat


==== name default property on all ====
The [[include-pattern]] provides the necessary functionality for existing microformats (1.0).
* pick 'p-name' (per feedback from microformats mailing-lists over the years, Google Rich Snippets renaming, etc. as better to use 'name' instead of 'fn' - benefits outweigh costs) as the single required and thus implied property of *every* microformat. that is, make this part of the syntax - we're marking up proper nouns. [[User:Tantek|Tantek]] 14:50, 9 June 2011 (UTC)
** may not work for all microformats, e.g. how would [[adr]] or [[geo]] work with this?


use-cases: every single proper noun on any web page.
For 2.0 it may be reasonable to simply re-use the nice <code>itemref</code> attribute from microdata, with identical/analogous functionality.


to-do: try examples with organizations, events, products (however note that those are almost always capitalized thus implying proper noun / name semantics)
That is, when present on the root element of a microformat, the <code>itemref</code> attribute provides a space separated list of ids of elements in the document which are then incorporated as children of the microformat, before its actual children in the document. This is a simple coarse summary of course, and the actual itemref inclusion algorithm should be followed.


==== imply url property from root only a href ====
Questions and possible issues:
* imply 'u-url' property from a root class name only on an &lt;a href&gt;
<div class="discussion">
E.g.
* Does use of 'itemref' mean requiring [[HTML5]]?
<source lang=html4strict>
** No, <code>itemref</code> is not part of the HTML5 specification; it is currently only part of the [[microdata]] ''last call'' working draft. Thus we would be adding a "new" attribute to HTML above and beyond HTML5, though one that is already specified, and validated by current HTML validators.
<a class="h-card" href="http://chrismessina.me/">Chris Messina</a>
* Is 'itemref' documented as a stable draft?
</source>
** 'itemref' is defined in the Last Call Working Draft of microdata. Being "last call" it has some amount of stability, but could still change before it goes to candidate recommendation (CR).
* Doesn't microformats try to avoid introducing new attributes? (e.g. from RDFa in the past).
** Yes, in general microformats try to avoid introducing new attributes. It may be ok for the set of use-cases that need "itemref". That is, they *are* a minority of actual use-cases, and thus making them use a new attribute is probably ok.
* Wouldn't it be dangerous to adopt features of separate technologies that are unstable, may change, or may disappear?
** Indeed any time we consider adopting anything from technologies in working drafts we should consider their stability and dependability on a case-by-case basis. When we do decide to re-use such technologies, we should be sure to ''copy'' their definition/functionality and provide a non-normative reference to the source, rather than normatively depend on anything that could change or disappear. In the case of 'itemref', it's been stable for a while, and if we believe the [[schema.org]] implementation announcements, there are multiple real-world implementations that surface it in common ([[search]]) user interfaces.
* Wasn't RDFa a ''stable'' augmentation of HTML, and yet we resisted incorporating attributes from it?
** In practice, no. RDFa has continued to change and evolve (which is good) in response to market feedback about its complexity. There was very little real-world use (and thus exercising) of RDFa until Google provided it as a Rich Snippets alternative syntax in 2009. At this point, it may be reasonable to also consider attributes from RDFa (e.g. 'vocab' instead of 'profile'); however, for this particular purpose (providing inclusion functionality), the 'itemref' attribute/feature makes the most sense.


parses as:
Note that itemref is only sort of part of microdata. According to the [https://html.spec.whatwg.org/multipage/microdata.html#attr-itemref whatwg html spec]:
* microformat: h-card
<blockquote cite="https://html.spec.whatwg.org/multipage/microdata.html#attr-itemref">
* implied 'name' from root element contents: Chris Messina
The itemref attribute is not part of the microdata data model. It is merely a syntactic construct to aid authors in adding annotations to pages where the data to be annotated does not follow a convenient tree structure. For example, it allows authors to mark up data in a table so that each column defines a separate item, while keeping the properties in the cells.
* implied 'url' from a href: http://chrismessina.me/
</blockquote>
</div>


use-cases - nearly every link to a person ever like on blog posts, blog rolls etc. all the same data/use-cases that informed [[XFN]].
=== hReview item backward compatibility ===
It may be necessary (pending research/evidence) to add backward compatible parsing for the class name "item" inside the backward compatible parsing for the root class name "hreview". If so, here are some notes on how to add that to the [[microformats2]] spec on v2 vocabularies.


to-do: try examples with organizations, events, products
==== h-item ====
...


==== imply name and photo properties from root only img ====
For backward compatibility, microformats 2 parsers {{should}}, when parsing an "hreview" for backwards compatibility, detect the following root class name and property names. A microformats 2 parser may use existing microformats [[parsers]] to extract these properties. If an "h-item" is found, don't look for an "item" on the same element.
* imply 'u-photo' property from a root class name only on an &lt;img src alt&gt;
E.g.
<source lang=html4strict>
<img class="h-card" src="http://example.org/pic.jpg" alt="Chris Messina" />
</source>


parses as:
compat root class name: <code id="item">item</code><br/>
* microformat: h-card
properties: (parsed as '''p-''' plain text unless otherwise specified)
* implied 'p-name' from img alt: Chris Messina
* <code>fn</code> - parse as '''<code>p-name</code>'''
* implied 'u-photo' from img src: <nowiki>http://example.org/pic.jpg</nowiki>
* <code>photo</code> - parse as '''<code>u-photo</code>'''
* <code>url</code> - parse as '''<code>u-url</code>'''


use-cases - nearly every social networking site that shows a grid of people without text already does so with img tags and their names in the alt attribute. most of those are linked too, which brings me to the next imply:
Note: we should analyze [[hreview-examples-in-wild]] to see whether any actually depend on parsing for "item", or whether simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If the latter is sufficient, we should '''DROP''' this backward-compat parsing for the class name "item", as it may otherwise produce too many false positives ("item" is a fairly common term).


to-do: try examples with organizations, events, products
==== h-review ====
...
* <code>item</code> - including compat root vcard|vevent in the absence of h-card|h-event
...


==== imply name and photo property from img only child of a href ====
Note: we should analyze [[hreview-examples-in-wild]] to see whether any actually depend on parsing for "item", or whether simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If the latter is sufficient, we should '''DROP''' the backward-compat parsing for the class name "item", as it may otherwise produce too many false positives ("item" is a fairly common term).
Combining the above:
<source lang=html4strict>
<a class="h-card" href="http://chrismessina.me/">
<img src="http://example.org/pic.jpg" alt="Chris Messina" />
</a>
</source>


parses as:
=== register a mime type? ===
* microformat: h-card
See [[microformats2-mime-type]].
* implied 'u-url' from a href: http://chrismessina.me/
* implied 'p-name' from img alt: Chris Messina
* implied 'u-photo' from img src: <nowiki>http://example.org/pic.jpg</nowiki>


use-cases - nearly every social networking site that shows a grid of people without text already does so with hyperlinked img tags and their names in the alt attribute.


to-do: try examples with organizations, events, products
== accepted ideas ==
=== more on allow root class name only ===
This has been stable for a while, see:
* [[microformats-2-implied-properties]]


== rejected ideas ==
== rejected ideas ==
=== n prefix for multiple numbers ===
=== n prefix for multiple numbers ===
Idea:
Idea:
Line 73: Line 80:


Rejected because while this *might* work for some properties in *English* it will NOT localize/internationalize well (orders of numbers in phrases change in different languages), and it will also limit the human expressivity of the plain text.  Thanks to Ben Ward for this feedback at the 2011-06-02 microformats dinner. [[User:Tantek|Tantek]] 14:25, 9 June 2011 (UTC)
Rejected because while this *might* work for some properties in *English* it will NOT localize/internationalize well (orders of numbers in phrases change in different languages), and it will also limit the human expressivity of the plain text.  Thanks to Ben Ward for this feedback at the 2011-06-02 microformats dinner. [[User:Tantek|Tantek]] 14:25, 9 June 2011 (UTC)
=== incorporate rel ===
Update: requiring rel with class tends to confuse web authors (lots of anecdotal experience here), thus [[microformats2]] itself does not require any use of rel by publishers, and keeps it only as part of backcompat handling of a small handful of classic microformats (e.g. [[rel-tag]] in [[hAtom]], [[hCard]]).
The rest of this proposal (of formally adding rel-* handling inside of microformats root class names) was rejected long ago.
==== profile h-card rel-me ====
Many sites have profile pages (personal home pages, and social networks), marked up with hCard, and permitting one or more rel-me values to other profiles, e.g. if this content were on <nowiki>http://tantek.com/</nowiki> :
Tantek Çelik ([https://twitter.com/t @t], [http://github.com/tantek github.com/tantek])
with source:
<source lang=html4strict>
<span class="h-card">
  <span class="p-name">Tantek Çelik</span>
(<a class="u-url" rel="me"
    href="https://twitter.com/t"
    >@t</a>,
  <a class="u-url" rel="me"
    href="http://github.com/tantek"
    >github.com/tantek</a>)
</span>
</source>
Parsed JSON per microformats2 properties:
<source lang=javascript>
{
  "items": [{
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t",
              "http://github.com/tantek"]
    }
  }]
}
</source>
We could incorporate rel property parsing either as just another property (like a 'u-' property) scoped to the microformat:
<source lang=javascript>
{
  "items": [{
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t",
              "http://github.com/tantek"],
      "rel-me": ["https://twitter.com/t",
              "http://github.com/tantek"]
    }
  }]
}
</source>
Or as always within global scope (closer to HTML5's currently defined scoping for 'rel' attributed links in a document):
<source lang=javascript>
{
  "items": [{
    "url": "http://tantek.com/",
    "rel-stylesheet": ["...", "..."],
    "rel-me": ["https://twitter.com/t",
              "http://github.com/tantek"]
    },
    {
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t",
              "http://github.com/tantek"]
    }
  }]
}
</source>
Note: "url" in that root level object is the URL of the document itself, which is necessary for:
* 'rel' semantics: which URL the rels apply from, and which other URL they apply to.
* vCard .vcf export "SOURCE:" property (URL that the vCards were derived from).


== see also ==
== see also ==
* [[microformats-2]]
* [[microformats2]]
* [[microformats2-parsing]]
* [[microformats2-brainstorming]]
* [[microformats2-experimental-properties]]
* [[microformats2-prefixes]]
* [[microformats2-implied-properties]]
* [[microformats2-faq]]
* [[html-stripping-examples]]
* [[microformats2-origins]]

Latest revision as of 23:41, 14 December 2020

Brainstorming experimental / undeveloped / rejected ideas for microformats-2.

For the original brainstorming of microformats2 itself, see microformats2-origins.


brainstorms to consider

document url

Since microformats2-parsing produces JSON with "rels" and "rel-urls", it may be useful to have a root context object which provides the URL of the document itself.
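As a rough illustration of the idea above, the following sketch wraps an already parsed result in a root object that also carries the document URL. The key name "url" and the helper name withDocumentUrl are assumptions made for this example; nothing here is defined by microformats2-parsing.

```javascript
// Hypothetical helper: returns a copy of the parsed result with the document
// URL added at the root. The "url" key name is an assumption, not part of
// any spec.
function withDocumentUrl(parsed, documentUrl) {
  // new URL() throws on anything that is not an absolute URL, which is a
  // cheap sanity check for the root context.
  return { url: new URL(documentUrl).href, ...parsed };
}

// Example input in the general shape produced by a microformats2 parser.
const parsed = {
  items: [],
  rels: { me: ["https://twitter.com/t"] },
  "rel-urls": { "https://twitter.com/t": { rels: ["me"], text: "@t" } }
};

const withRoot = withDocumentUrl(parsed, "http://tantek.com/");
console.log(Object.keys(withRoot));
```

A consumer could then read the document URL from the same object it reads "rels" from, instead of having to carry it separately.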

adopt itemref

There are many existing real-world use-cases where either:

  • several microformats in a page want to share some common data without repeating it.
    • e.g. a page about a product with multiple reviews of that product (very common, products sites, Amazon/CNET et al, review aggregators, Yelp et al)
    • e.g. representing the author of multiple hAtom entries on a page. Currently this is possible with the <address class="hcard"> optimisation, which would be rendered obsolete by the proposed new generic parsing rules.
  • a microformat in a page needs to incorporate information spread across different parts of a page, without assigning the entire page to that microformat

The include-pattern provides the necessary functionality for existing microformats (1.0).

For 2.0 it may be reasonable to simply re-use the nice itemref attribute from microdata, with identical/analogous functionality.

That is, when present on the root element of a microformat, the itemref attribute provides a space separated list of ids of elements in the document which are then incorporated as children of the microformat, before its actual children in the document. This is a simple coarse summary of course, and the actual itemref inclusion algorithm should be followed.
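The coarse summary above can be sketched as follows. This is only an illustration under simplifying assumptions: elements are modelled as plain objects of the form { id, itemref, className, children } rather than DOM nodes, the function names are hypothetical, and the real microdata itemref algorithm (which has additional rules) is not reproduced here.

```javascript
// Build an id -> node index over the simplified tree. The first element
// with a given id wins, as with document.getElementById.
function indexById(node, index = new Map()) {
  if (node.id && !index.has(node.id)) index.set(node.id, node);
  for (const child of node.children || []) indexById(child, index);
  return index;
}

// Effective children of a microformat root: the elements named in itemref
// (in the order listed) first, then the root's own children.
function effectiveChildren(root, documentNode) {
  const index = indexById(documentNode);
  const ids = (root.itemref || "").split(/\s+/).filter(Boolean);
  const seen = new Set();
  const included = [];
  for (const id of ids) {
    const target = index.get(id);
    // Skip unknown ids, repeated ids, and a reference to the root itself.
    if (!target || target === root || seen.has(id)) continue;
    seen.add(id);
    included.push(target);
  }
  return [...included, ...(root.children || [])];
}

// A product described once and shared by a review on the same page.
const product = { id: "product", className: "p-item h-product", children: [] };
const review = {
  className: "h-review",
  itemref: "product",
  children: [{ className: "p-rating", children: [] }]
};
const page = { children: [product, review] };

// The referenced element comes first, then the review's own children.
console.log(effectiveChildren(review, page).map((n) => n.className));
```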

Questions and possible issues:

  • Does use of 'itemref' mean requiring HTML5?
    • No, itemref is not part of the HTML5 specification; it is currently only part of the microdata last call working draft. Thus we would be adding a "new" attribute to HTML above and beyond HTML5, though one that is already specified, and validated by current HTML validators.
  • Is 'itemref' documented as a stable draft?
    • 'itemref' is defined in the Last Call Working Draft of microdata. Being "last call" it has some amount of stability, but could still change before it goes to candidate recommendation (CR).
  • Doesn't microformats try to avoid introducing new attributes? (e.g. from RDFa in the past).
    • Yes, in general microformats try to avoid introducing new attributes. It may be ok for the set of use-cases that need "itemref". That is, they *are* a minority of actual use-cases, and thus making them use a new attribute is probably ok.
  • Wouldn't it be dangerous to adopt features of separate technologies that are unstable, may change, or may disappear?
    • Indeed any time we consider adopting anything from technologies in working drafts we should consider their stability and dependability on a case-by-case basis. When we do decide to re-use such technologies, we should be sure to copy their definition/functionality and provide a non-normative reference to the source, rather than normatively depend on anything that could change or disappear. In the case of 'itemref', it's been stable for a while, and if we believe the schema.org implementation announcements, there are multiple real-world implementations that surface it in common (search) user interfaces.
  • Wasn't RDFa a stable augmentation of HTML, and yet we resisted incorporating attributes from it?
    • In practice, no. RDFa has continued to change and evolve (which is good) in response to market feedback about its complexity. There was very little real-world use (and thus exercising) of RDFa until Google provided it as a Rich Snippets alternative syntax in 2009. At this point, it may be reasonable to also consider attributes from RDFa (e.g. 'vocab' instead of 'profile'); however, for this particular purpose (providing inclusion functionality), the 'itemref' attribute/feature makes the most sense.

Note that itemref is only sort of part of microdata. According to the whatwg html spec:

The itemref attribute is not part of the microdata data model. It is merely a syntactic construct to aid authors in adding annotations to pages where the data to be annotated does not follow a convenient tree structure. For example, it allows authors to mark up data in a table so that each column defines a separate item, while keeping the properties in the cells.

hReview item backward compatibility

It may be necessary (pending research/evidence) to add backward compatible parsing for the class name "item" inside the backward compatible parsing for the root class name "hreview". If so, here are some notes on how to add that to the microformats2 spec on v2 vocabularies.

h-item

...

For backward compatibility, microformats 2 parsers SHOULD, when parsing an "hreview" for backwards compatibility, detect the following root class name and property names. A microformats 2 parser may use existing microformats parsers to extract these properties. If an "h-item" is found, don't look for an "item" on the same element.

compat root class name: item
properties: (parsed as p- plain text unless otherwise specified)

  • fn - parse as p-name
  • photo - parse as u-photo
  • url - parse as u-url
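The mapping above, together with the rule that "h-item" takes precedence over "item" on the same element, could be sketched like this. The function and constant names are hypothetical, and the sketch works on class attribute strings only; it is not taken from any existing parser.

```javascript
// Backcompat property mapping for the classic "item" class name, as listed
// above.
const ITEM_BACKCOMPAT = new Map([
  ["fn", "p-name"],
  ["photo", "u-photo"],
  ["url", "u-url"]
]);

// Split a class attribute value into individual class names.
function classNames(classAttr) {
  return classAttr.split(/\s+/).filter(Boolean);
}

// Decide how an element inside a backcompat "hreview" should be treated.
// If "h-item" is present, "item" on the same element is not looked for.
function itemRootKind(classAttr) {
  const classes = classNames(classAttr);
  if (classes.includes("h-item")) return "h-item";
  if (classes.includes("item")) return "item";
  return null;
}

// Translate classic property class names to their prefixed equivalents,
// ignoring any class names that are not in the mapping.
function mapItemProperties(classAttr) {
  return classNames(classAttr)
    .filter((c) => ITEM_BACKCOMPAT.has(c))
    .map((c) => ITEM_BACKCOMPAT.get(c));
}

console.log(itemRootKind("item h-item"));
console.log(mapItemProperties("fn url sidebar"));
```

Matching whole class names (rather than substrings) matters here, since "item" is a common word and also appears inside unrelated class names such as "items".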

Note: we should analyze hreview-examples-in-wild to see whether any actually depend on parsing for "item", or whether simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If the latter is sufficient, we should DROP this backward-compat parsing for the class name "item", as it may otherwise produce too many false positives ("item" is a fairly common term).

h-review

...

  • item - including compat root vcard|vevent in the absence of h-card|h-event

...

Note: we should analyze hreview-examples-in-wild to see whether any actually depend on parsing for "item", or whether simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If the latter is sufficient, we should DROP the backward-compat parsing for the class name "item", as it may otherwise produce too many false positives ("item" is a fairly common term).

register a mime type?

See microformats2-mime-type.


accepted ideas

more on allow root class name only

This has been stable for a while, see microformats-2-implied-properties.

rejected ideas

n prefix for multiple numbers

Idea:

  • "n-*" for (one or more) numbers, e.g. "n-rating", "n-geo", leaving the semantics of more than one number up to specific format. e.g. for an "n-rating" inside an "h-review", the first number would presumably be the rating value, when only two numbers the second would be the "best" value (e.g. rated <span class="n-rating">3 out of 4</span>), when three numbers the second would be the "worst" and the third would be the "best" (e.g. <span class="n-rating">7.5 out of 1 to 10</span>). similarly "n-geo" would specify the first number to be the latitude and the second to be the longitude.

Rejected because while this *might* work for some properties in *English* it will NOT localize/internationalize well (orders of numbers in phrases change in different languages), and it will also limit the human expressivity of the plain text. Thanks to Ben Ward for this feedback at the 2011-06-02 microformats dinner. Tantek 14:25, 9 June 2011 (UTC)
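To make the objection concrete, here is a small sketch of the positional reading the rejected "n-rating" idea would have required. The function names are hypothetical and the number extraction is deliberately naive (ASCII digits only). The rephrased sentence in the last call is an invented illustration, not taken from a real page.

```javascript
// Pull every number out of a piece of plain text, in document order.
function extractNumbers(text) {
  return (text.match(/\d+(?:\.\d+)?/g) || []).map(Number);
}

// Positional interpretation from the idea above: the first number is the
// rating value; with two numbers the second is "best"; with three numbers
// the second is "worst" and the third is "best".
function readRating(text) {
  const [value, ...rest] = extractNumbers(text);
  if (value === undefined) return null;
  if (rest.length === 0) return { value };
  if (rest.length === 1) return { value, best: rest[0] };
  return { value, worst: rest[0], best: rest[1] };
}

// Works for the phrasings the idea was designed around.
console.log(readRating("3 out of 4"));
console.log(readRating("7.5 out of 1 to 10"));

// Misreads a perfectly natural rephrasing: the scale comes first here, so
// the positional rule reports a value of 4 and a best of 3.
console.log(readRating("out of 4 stars, this gets 3"));
```

The same kind of misreading would occur in any language or phrasing that places the scale before the rating, which is the localization concern behind the rejection.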

incorporate rel

Update: requiring rel with class tends to confuse web authors (lots of anecdotal experience here), thus microformats2 itself does not require any use of rel by publishers, and keeps it only as part of backcompat handling of a small handful of classic microformats (e.g. rel-tag in hAtom, hCard).

The rest of this proposal (of formally adding rel-* handling inside of microformats root class names) was rejected long ago.

profile h-card rel-me

Many sites have profile pages (personal home pages, and social networks), marked up with hCard, and permitting one or more rel-me values to other profiles, e.g. if this content were on http://tantek.com/ :

Tantek Çelik (@t, github.com/tantek)

with source:

<span class="h-card">
  <span class="p-name">Tantek Çelik</span>
 (<a class="u-url" rel="me"
     href="https://twitter.com/t"
     >@t</a>, 
  <a class="u-url" rel="me"
     href="http://github.com/tantek"
     >github.com/tantek</a>)
</span>

Parsed JSON per microformats2 properties:

{
  "items": [{ 
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t", 
              "http://github.com/tantek"]
    }
  }]
}

We could incorporate rel property parsing either as just another property (like a 'u-' property) scoped to the microformat:

{
  "items": [{ 
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t", 
              "http://github.com/tantek"],
      "rel-me": ["https://twitter.com/t", 
              "http://github.com/tantek"]
    }
  }]
}

Or as always within global scope (closer to HTML5's currently defined scoping for 'rel' attributed links in a document):

{
  "items": [{
    "url": "http://tantek.com/",
    "rel-stylesheet": ["...", "..."],
    "rel-me": ["https://twitter.com/t", 
               "http://github.com/tantek"]
    },
    {
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t",
              "http://github.com/tantek"]
    }
  }]
}

Note: "url" in that root level object is the URL of the document itself, which is necessary for:

  • 'rel' semantics: which URL the rels apply from, and which other URL they apply to.
  • vCard .vcf export "SOURCE:" property (URL that the vCards were derived from).
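A minimal sketch of why the document URL is needed for 'rel' semantics: each rel link is a statement from one URL to another, and relative hrefs can only be resolved against the document URL. The relTriples function, its input shape, and the "/about" path are assumptions made for this illustration.

```javascript
// Turn a list of { rel, href } links found in a document into
// (from, rel, to) statements. Relative hrefs are resolved against the
// document URL using the standard URL constructor.
function relTriples(documentUrl, links) {
  return links.flatMap(({ rel, href }) => {
    const to = new URL(href, documentUrl).href;
    // A rel attribute may hold several space-separated values.
    return rel
      .split(/\s+/)
      .filter(Boolean)
      .map((r) => ({ from: documentUrl, rel: r, to }));
  });
}

const triples = relTriples("http://tantek.com/", [
  { rel: "me", href: "https://twitter.com/t" },
  { rel: "me", href: "http://github.com/tantek" },
  { rel: "me author", href: "/about" } // hypothetical relative link
]);
console.log(triples);
```

Without the "from" side, the two rel-me values in the example above would not say whose profiles they are.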


see also