microformats2-brainstorming: Difference between revisions

From Microformats Wiki
(add items hash wrappers to JSON examples that should have been there ages ago)
(→‎adopt itemref: Adds quote about itemref from whatwg html5 spec.)
 
(6 intermediate revisions by 3 users not shown)
Brainstorming experimental / undeveloped / rejected ideas for [[microformats-2]].


== further simplifications ==
For the original brainstorming of microformats2 itself, see:
* [[microformats2-origins]]


== brainstorms to consider ==
=== document url ===
Since [[microformats2-parsing]] produces JSON with "rels" and "rel-urls", it may be useful to have a root context object which provides the URL of the document itself.
=== adopt itemref ===
There are many existing real-world use-cases where either:
* several microformats in a page want to share some common data without repeating it.
** e.g. a page about a product with multiple reviews of that product (very common, products sites, Amazon/CNET et al, review aggregators, Yelp et al)
** e.g. representing the author of multiple hAtom entries on a page. Currently this is possible with the <code>&lt;address class="vcard"></code> optimisation, which would be rendered obsolete by the proposed new generic parsing rules.
* a microformat in a page needs to incorporate information spread across different parts of a page, without assigning the entire page to that microformat
The [[include-pattern]] provides the necessary functionality for existing microformats (1.0).
For 2.0 it may be reasonable to simply re-use the nice <code>itemref</code> attribute from microdata, with identical/analogous functionality.
That is, when present on the root element of a microformat, the <code>itemref</code> attribute provides a space separated list of ids of elements in the document which are then incorporated as children of the microformat, before its actual children in the document. This is a simple coarse summary of course, and the actual itemref inclusion algorithm should be followed.
Questions and possible issues:
<div class="discussion">
* Does use of 'itemref' mean requiring [[HTML5]]?
** No, <code>itemref</code> is not part of the HTML5 specification, it is currently only part of the [[microdata]] ''last call'' working draft. Thus we would be adding a "new" attribute to HTML above and beyond HTML5, though one that is already specified, and validated by current HTML validators.
* Is 'itemref' documented as a stable draft?
** 'itemref' is defined in the Last Call Working Draft of microdata. Being "last call" it has some amount of stability, but could still change before it goes to candidate recommendation (CR).
* Doesn't microformats try to avoid introducing new attributes? (e.g. from RDFa in the past).
** Yes, in general microformats try to avoid introducing new attributes. It may be ok for the set of use-cases that need "itemref". That is, they *are* a minority of actual use-cases, and thus making them use a new attribute is probably ok.
* Wouldn't it be dangerous to adopt features of separate technologies that are unstable, may change, or may disappear?
** Indeed any time we consider adopting anything from technologies in working drafts we should consider their stability and dependability on a case-by-case basis. When we do decide to re-use such technologies, we should be sure to ''copy'' their definition/functionality and provide a non-normative reference to the source, rather than normatively depend on anything that could change or disappear. In the case of 'itemref', it's been stable for a while, and if we believe the [[schema.org]] implementation announcements, there are multiple real-world implementations that surface it in common ([[search]]) user interfaces.
* Wasn't RDFa a ''stable'' augmentation of HTML, and yet we resisted incorporating attributes from it?
** In practice, no, RDFa has continued to change and evolve (which is good) in response to market feedback about its complexity. There was very little real-world use (and thus exercising) of RDFa until Google provided it as a Rich Snippets alternative syntax in 2009. At this point, it may be reasonable to also consider attributes from RDFa (e.g. 'vocab' instead of 'profile'); however, for this particular purpose (providing inclusion functionality), the 'itemref' attribute/feature makes the most sense.
Note that itemref is only sort of part of microdata. According to the [https://html.spec.whatwg.org/multipage/microdata.html#attr-itemref whatwg html spec]:
<blockquote cite="https://html.spec.whatwg.org/multipage/microdata.html#attr-itemref">
The itemref attribute is not part of the microdata data model. It is merely a syntactic construct to aid authors in adding annotations to pages where the data to be annotated does not follow a convenient tree structure. For example, it allows authors to mark up data in a table so that each column defines a separate item, while keeping the properties in the cells.
</blockquote>
</div>
=== hReview item backward compatibility ===
It may be necessary (pending research/evidence) to add backward compatible parsing for the class name "item" inside the backward compatible parsing for the root class name "hreview". If so, here are some notes on how to add that to the [[microformats2]] spec on v2 vocabularies.
==== h-item ====
...
For backward compatibility, microformats 2 parsers {{should}}, when parsing an "hreview" for backwards compatibility, detect the following root class name and property names. A microformats 2 parser may use existing microformats [[parsers]] to extract these properties. If an "h-item" is found, don't look for an "item" on the same element.
compat root class name: <code id="item">item</code><br/>
properties: (parsed as '''p-''' plain text unless otherwise specified)
* <code>fn</code> - parse as '''<code>p-name</code>'''
* <code>photo</code> - parse as '''<code>u-photo</code>'''
* <code>url</code> - parse as '''<code>u-url</code>'''
Note: we should analyze [[hreview-examples-in-wild]] to see if there are any that actually depend on parsing for "item", or if simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If so, we should '''DROP''' this backward-compat parsing for the class name "item" as it may otherwise produce too many false positives ("item" is a fairly common term).
==== h-review ====
...
* <code>item</code> - including compat root vcard|vevent in the absence of h-card|h-event
...
Note: we should analyze [[hreview-examples-in-wild]] to see if there are any that actually depend on parsing for "item", or if simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If so, we should '''DROP''' the backward-compat parsing for the class name "item" as it may otherwise produce too many false positives ("item" is a fairly common term).
=== register a mime type? ===
See [[microformats2-mime-type]].
== accepted ideas ==
=== more on allow root class name only ===
This has been stable for a while, see:
* [[microformats-2-implied-properties]]


== incorporate rel ==
== rejected ideas ==
Many microformats use 'rel' attribute values, e.g. [[rel-tag]], [[rel-license]], [[rel-me]]. It would be good to continue supporting 'rel' values explicitly in the microformats2 model. Here are some existing use-cases and ways we would support them.
=== n prefix for multiple numbers ===
Idea:
 
* '''"n-*" for (one or more) numbers''', e.g. "n-rating", "n-geo", leaving the semantics of more than one number up to specific format. e.g. for an "n-rating" inside an "h-review", the first number would presumably be the rating value, when only two numbers the second would be the "best" value (e.g. rated <code>&lt;span class="n-rating"&gt;3 out of 4&lt;/span&gt;</code>), when three numbers the second would be the "worst" and the third would be the "best" (e.g. <code>&lt;span class="n-rating"&gt;7.5 out of 1 to 10&lt;/span&gt;</code>).  similarly "n-geo" would specify the first number to be the latitude and the second to be the longitude.
 
Rejected because while this *might* work for some properties in *English* it will NOT localize/internationalize well (orders of numbers in phrases change in different languages), and it will also limit the human expressivity of the plain text.  Thanks to Ben Ward for this feedback at the 2011-06-02 microformats dinner. [[User:Tantek|Tantek]] 14:25, 9 June 2011 (UTC)
 
=== incorporate rel ===
Update: requiring rel with class tends to confuse web authors (lots of anecdotal experience here), thus [[microformats2]] itself does not require any use of rel by publishers, and keeps it only as part of backcompat handling of a small handful of classic microformats (e.g. [[rel-tag]] in [[hAtom]], [[hCard]]).
 
The rest of this proposal (of formally adding rel-* handling inside of microformats root class names) was rejected long ago.


=== profile h-card rel-me ===
==== profile h-card rel-me ====
Many sites have profile pages (personal home pages, and social networks), marked up with hCard, and permitting one or more rel-me values to other profiles, e.g. if this content were on <nowiki>http://tantek.com/</nowiki> :


* vCard .vcf export "SOURCE:" property (URL that the vCards were derived from).


== document url ==
As described above in the rel section, it's useful to have a root context object which provides the URL of the document itself.


== see also ==
* [[microformats-2]]
* [[microformats2]]
* [[microformats-2-brainstorming]]
* [[microformats2-parsing]]
* [[microformats-2-prefixes]]
* [[microformats2-brainstorming]]
* [[microformats-2-implied-properties]]
* [[microformats2-experimental-properties]]
* [[microformats-2-faq]]
* [[microformats2-prefixes]]
* [[microformats2-implied-properties]]
* [[microformats2-faq]]
* [[html-stripping-examples]]
* [[microformats2-origins]]

Latest revision as of 23:41, 14 December 2020

Brainstorming experimental / undeveloped / rejected ideas for microformats-2.

For the original brainstorming of microformats2 itself, see:

  • microformats2-origins


brainstorms to consider

document url

Since microformats2-parsing produces JSON with "rels" and "rel-urls", it may be useful to have a root context object which provides the URL of the document itself.
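
As a rough illustration of why the document URL matters, the sketch below attaches it to a parsed result and uses it to resolve relative rel URLs. The top-level "url" key, the function name add_document_context, and the sample input are assumptions made for this example; none of them is defined by microformats2-parsing.

```python
from urllib.parse import urljoin


def add_document_context(parsed, document_url):
    """Return a copy of a parsed result with a hypothetical top-level "url"
    key, and with every rel URL resolved against that document URL."""
    rels = {
        rel: [urljoin(document_url, href) for href in hrefs]
        for rel, hrefs in parsed.get("rels", {}).items()
    }
    return {**parsed, "url": document_url, "rels": rels}


# Hypothetical parser output containing one relative and one absolute URL.
result = add_document_context(
    {"items": [], "rels": {"me": ["/about", "https://twitter.com/t"]}},
    "http://tantek.com/",
)
print(result["rels"]["me"])
```

Without the document URL a consumer could not tell what "/about" refers to, which is the case this brainstorm is trying to cover.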

adopt itemref

There are many existing real-world use-cases where either:

  • several microformats in a page want to share some common data without repeating it.
    • e.g. a page about a product with multiple reviews of that product (very common, products sites, Amazon/CNET et al, review aggregators, Yelp et al)
    • e.g. representing the author of multiple hAtom entries on a page. Currently this is possible with the <address class="vcard"> optimisation, which would be rendered obsolete by the proposed new generic parsing rules.
  • a microformat in a page needs to incorporate information spread across different parts of a page, without assigning the entire page to that microformat

The include-pattern provides the necessary functionality for existing microformats (1.0).

For 2.0 it may be reasonable to simply re-use the nice itemref attribute from microdata, with identical/analogous functionality.

That is, when present on the root element of a microformat, the itemref attribute provides a space separated list of ids of elements in the document which are then incorporated as children of the microformat, before its actual children in the document. This is a simple coarse summary of course, and the actual itemref inclusion algorithm should be followed.
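
The coarse summary above could be sketched as follows. This is only an illustration under simplifying assumptions: it uses Python's xml.etree.ElementTree on a well-formed snippet, the helper name effective_children and the sample markup are made up for this example, and it does not implement the full itemref inclusion algorithm (for instance, it only guards against an element referencing itself, not its ancestors).

```python
import xml.etree.ElementTree as ET


def effective_children(root_el, document):
    """Return the elements a parser would treat as children of root_el:
    itemref targets first (in the order listed), then the real children."""
    by_id = {el.get("id"): el for el in document.iter() if el.get("id")}
    referenced = []
    for ref in (root_el.get("itemref") or "").split():
        target = by_id.get(ref)
        # Skip unknown ids and self-references (a simplification).
        if target is not None and target is not root_el:
            referenced.append(target)
    return referenced + list(root_el)


# Hypothetical page: two reviews sharing one product name via itemref.
DOC = ET.fromstring("""
<div>
  <span id="product" class="p-item">Widget 3000</span>
  <div class="h-review" itemref="product"><span class="p-rating">4</span></div>
  <div class="h-review" itemref="product"><span class="p-rating">2</span></div>
</div>
""")

reviews = [el for el in DOC.iter("div") if el.get("class") == "h-review"]
first = [child.get("class") for child in effective_children(reviews[0], DOC)]
print(first)
```

Both reviews pick up the shared product element without it being repeated in the markup, which is the first use-case listed above.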

Questions and possible issues:

  • Does use of 'itemref' mean requiring HTML5?
    • No, itemref is not part of the HTML5 specification, it is currently only part of the microdata last call working draft. Thus we would be adding a "new" attribute to HTML above and beyond HTML5, though one that is already specified, and validated by current HTML validators.
  • Is 'itemref' documented as a stable draft?
    • 'itemref' is defined in the Last Call Working Draft of microdata. Being "last call" it has some amount of stability, but could still change before it goes to candidate recommendation (CR).
  • Doesn't microformats try to avoid introducing new attributes? (e.g. from RDFa in the past).
    • Yes, in general microformats try to avoid introducing new attributes. It may be ok for the set of use-cases that need "itemref". That is, they *are* a minority of actual use-cases, and thus making them use a new attribute is probably ok.
  • Wouldn't it be dangerous to adopt features of separate technologies that are unstable, may change, or may disappear?
    • Indeed any time we consider adopting anything from technologies in working drafts we should consider their stability and dependability on a case-by-case basis. When we do decide to re-use such technologies, we should be sure to copy their definition/functionality and provide a non-normative reference to the source, rather than normatively depend on anything that could change or disappear. In the case of 'itemref', it's been stable for a while, and if we believe the schema.org implementation announcements, there are multiple real-world implementations that surface it in common (search) user interfaces.
  • Wasn't RDFa a stable augmentation of HTML, and yet we resisted incorporating attributes from it?
    • In practice, no, RDFa has continued to change and evolve (which is good) in response to market feedback about its complexity. There was very little real-world use (and thus exercising) of RDFa until Google provided it as a Rich Snippets alternative syntax in 2009. At this point, it may be reasonable to also consider attributes from RDFa (e.g. 'vocab' instead of 'profile'); however, for this particular purpose (providing inclusion functionality), the 'itemref' attribute/feature makes the most sense.

Note that itemref is only sort of part of microdata. According to the whatwg html spec:

The itemref attribute is not part of the microdata data model. It is merely a syntactic construct to aid authors in adding annotations to pages where the data to be annotated does not follow a convenient tree structure. For example, it allows authors to mark up data in a table so that each column defines a separate item, while keeping the properties in the cells.

hReview item backward compatibility

It may be necessary (pending research/evidence) to add backward compatible parsing for the class name "item" inside the backward compatible parsing for the root class name "hreview". If so, here are some notes on how to add that to the microformats2 spec on v2 vocabularies.

h-item

...

For backward compatibility, microformats 2 parsers SHOULD, when parsing an "hreview" for backwards compatibility, detect the following root class name and property names. A microformats 2 parser may use existing microformats parsers to extract these properties. If an "h-item" is found, don't look for an "item" on the same element.

compat root class name: item
properties: (parsed as p- plain text unless otherwise specified)

  • fn - parse as p-name
  • photo - parse as u-photo
  • url - parse as u-url
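
A minimal sketch of the mapping above, assuming class attributes are handled as plain space-separated strings. The names ITEM_BACKCOMPAT_PROPERTIES, item_root and backcompat_properties are hypothetical and not taken from any existing parser.

```python
# Legacy "item" property class names and the microformats2 properties they
# would be parsed as, copied from the list above.
ITEM_BACKCOMPAT_PROPERTIES = {"fn": "p-name", "photo": "u-photo", "url": "u-url"}


def item_root(class_attr):
    """Classify an element found while backcompat-parsing an "hreview"."""
    classes = class_attr.split()
    if "h-item" in classes:
        return "h-item"  # v2 root wins; do not also look for "item"
    if "item" in classes:
        return "item"    # fall back to backcompat parsing
    return None


def backcompat_properties(class_attr):
    """Translate legacy property class names into their v2 equivalents."""
    return [
        ITEM_BACKCOMPAT_PROPERTIES[name]
        for name in class_attr.split()
        if name in ITEM_BACKCOMPAT_PROPERTIES
    ]


print(item_root("item h-item"), backcompat_properties("fn url"))
```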

Note: we should analyze hreview-examples-in-wild to see if there are any that actually depend on parsing for "item", or if simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If so, we should DROP this backward-compat parsing for the class name "item" as it may otherwise produce too many false positives ("item" is a fairly common term).

h-review

...

  • item - including compat root vcard|vevent in the absence of h-card|h-event

...

Note: we should analyze hreview-examples-in-wild to see if there are any that actually depend on parsing for "item", or if simply looking for "fn", "photo", and "url" directly inside a root class name of "hreview" is sufficient. If so, we should DROP the backward-compat parsing for the class name "item" as it may otherwise produce too many false positives ("item" is a fairly common term).

register a mime type?

See microformats2-mime-type.


accepted ideas

more on allow root class name only

This has been stable for a while, see:

  • microformats-2-implied-properties

rejected ideas

n prefix for multiple numbers

Idea:

  • "n-*" for (one or more) numbers, e.g. "n-rating", "n-geo", leaving the semantics of more than one number up to specific format. e.g. for an "n-rating" inside an "h-review", the first number would presumably be the rating value, when only two numbers the second would be the "best" value (e.g. rated <span class="n-rating">3 out of 4</span>), when three numbers the second would be the "worst" and the third would be the "best" (e.g. <span class="n-rating">7.5 out of 1 to 10</span>). similarly "n-geo" would specify the first number to be the latitude and the second to be the longitude.

Rejected because while this *might* work for some properties in *English* it will NOT localize/internationalize well (orders of numbers in phrases change in different languages), and it will also limit the human expressivity of the plain text. Thanks to Ben Ward for this feedback at the 2011-06-02 microformats dinner. Tantek 14:25, 9 June 2011 (UTC)
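
To make the objection concrete, here is a small sketch of the positional rule described in the idea. The function name parse_n_rating and the reordered sample sentence are assumptions for this example; the point is only that the result depends entirely on the order in which numbers appear in the text.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_n_rating(text):
    """Assign meaning to numbers purely by position, as the idea proposed."""
    nums = [float(n) for n in NUMBER.findall(text)]
    if len(nums) == 1:
        return {"rating": nums[0]}
    if len(nums) == 2:
        return {"rating": nums[0], "best": nums[1]}
    if len(nums) == 3:
        return {"rating": nums[0], "worst": nums[1], "best": nums[2]}
    return {}


print(parse_n_rating("3 out of 4"))
print(parse_n_rating("7.5 out of 1 to 10"))
# Same meaning as the first example, but the numbers come in another order,
# so the positional rule swaps the rating and the best value.
print(parse_n_rating("Out of 4 stars, I give it 3"))
```

The last call reports a rating of 4 with a best value of 3, which is the kind of failure the rejection anticipates for languages and phrasings that order the numbers differently.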

incorporate rel

Update: requiring rel with class tends to confuse web authors (lots of anecdotal experience here), thus microformats2 itself does not require any use of rel by publishers, and keeps it only as part of backcompat handling of a small handful of classic microformats (e.g. rel-tag in hAtom, hCard).

The rest of this proposal (of formally adding rel-* handling inside of microformats root class names) was rejected long ago.

profile h-card rel-me

Many sites have profile pages (personal home pages, and social networks), marked up with hCard, and permitting one or more rel-me values to other profiles, e.g. if this content were on http://tantek.com/ :

Tantek Çelik (@t, github.com/tantek)

with source:

<span class="h-card">
  <span class="p-name">Tantek Çelik</span>
 (<a class="u-url" rel="me"
     href="https://twitter.com/t"
     >@t</a>, 
  <a class="u-url" rel="me"
     href="http://github.com/tantek"
     >github.com/tantek</a>)
</span>

Parsed JSON per microformats2 properties:

{
  "items": [{ 
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t", 
              "http://github.com/tantek"]
    }
  }]
}

We could incorporate rel property parsing either as just another property (like a 'u-' property) scoped to the microformat:

{
  "items": [{ 
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t", 
              "http://github.com/tantek"],
      "rel-me": ["https://twitter.com/t", 
              "http://github.com/tantek"]
    }
  }]
}

Or as always within global scope (closer to HTML5's currently defined scoping for 'rel' attributed links in a document):

{
  "items": [{
    "url": "http://tantek.com/",
    "rel-stylesheet": ["...", "..."],
    "rel-me": ["https://twitter.com/t", 
               "http://github.com/tantek"]
    },
    { 
    "type": ["h-card"],
    "properties": {
      "name": ["Tantek Çelik"],
      "url": ["https://twitter.com/t", 
              "http://github.com/tantek"]
    }
  }]
}

Note: "url" in that root level object is the URL of the document itself, which is necessary for:

  • 'rel' semantics: identifying which URL each rel link is from, and which other URL it points to.
  • vCard .vcf export "SOURCE:" property (URL that the vCards were derived from).
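
For illustration only, the globally scoped variant could be produced along these lines using Python's standard html.parser. The class name RelCollector and the "rel-" key naming simply mirror the JSON sketch above; this is not how any particular microformats2 parser is implemented, and the document URL is supplied by hand here.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Collect href values of links, keyed by "rel-" plus each rel value."""

    def __init__(self):
        super().__init__()
        self.rels = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("a", "link", "area") and attrs.get("rel") and attrs.get("href"):
            for rel in attrs["rel"].split():
                self.rels.setdefault("rel-" + rel, []).append(attrs["href"])


SOURCE = """
<span class="h-card">
  <span class="p-name">Tantek Çelik</span>
 (<a class="u-url" rel="me"
     href="https://twitter.com/t"
     >@t</a>,
  <a class="u-url" rel="me"
     href="http://github.com/tantek"
     >github.com/tantek</a>)
</span>
"""

collector = RelCollector()
collector.feed(SOURCE)

# Root-level object: the document URL plus every rel found in the document.
root_context = {"url": "http://tantek.com/", **collector.rels}
print(root_context)
```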


see also