Machine Data in Microformats
Latest revision as of 16:28, 18 July 2020
Microformats are designed to mark up human-consumable information as it is commonly found in the wild. In a number of exceptional cases, however, it has been necessary to specify precise data formats for particular properties. Formats for dates, times and locations are standardised in a way that doesn't always match the way the information is visibly published; this is necessary to make the data understandable to parsers. Similarly, some keywords must be written in English (the telephone ‘type’ in hCard, for example).
These data formats must be fixed to make the data parsable by machines. The cost for a parser of supporting every commonly published date-time format in the world (including approximations like ‘five minutes ago’) is too high, as is the cost of handling international translation (for example, a mobile telephone is ‘cell’ in US English but ‘mobile’ in British English).
In some cases, the human version of the data can be semantically described as an abbreviated form of the machine data, and the machine data may also be human consumable. For example, the date-design-pattern uses HTML's abbr element to expand a human-readable date into its ISO 8601 form: ‘January 1st’ is an abbreviated form of ‘2008-01-01’. The latter is also legible to humans (and can be exposed to them through tool-tips and assistive screen readers).
In other cases, the machine data is not legible to humans. In hAudio, the duration property uses ISO 8601, resulting in machine data such as PT3M23S, which is not understandable to humans and is therefore not a valid expansion of ‘three minutes and twenty-three seconds’.
Cases of Fixed Data Formats in Microformats
The following are all current uses of fixed format machine data required by the various microformats.
hCalendar
- Uses ISO 8601 for dtstart, dtend, duration, rdate and exdate
- Uses an enumerated value for the role subproperty of the attendee property. Example documented in hCalendar brainstorming: hCard attendees
hCard
- Telephone type keywords: voice, home, msg, work, pref, fax, cell, video, pager, bbs, modem, car, isdn, pcs.
- Address type keywords: INTL, POSTAL, PARCEL, WORK, dom, home, pref.
- Email type keywords: INTERNET, x400, pref.
- Uses ISO 8601 date for bday
- ISO 8601 time zone for tz
- Telephone numbers require a numerical form, whilst they may be published in alphanumeric form, e.g. +1-555-FORMATS
hReview
- Uses an ISO 8601 date-time for dtreviewed
hAtom
- Uses ISO 8601 date-time for updated
- Uses ISO 8601 date-time for published
hResume
- Uses ISO 8601 date for individual experience items.
- Uses ISO 8601 date for individual education items.
Geo
- Requires latitude and longitude in decimal form (1.23232;-2.343535), but these may be published in degrees: N 37° 24.491, W 122° 08.313
- Locations are most often published just as place names (not abbreviated co-ordinates)
hAudio
- Uses ISO 8601 for track duration, e.g. PT3M23S
Misconceptions of Fixed Data Formats in Microformats
There are also cases (at least one) of apparent fixed data formats in microformats that should not require a separate machine value. It is useful to document these to clear up the misconceptions.
hReview
- Uses fixed-point integer values from 0-5 for rating (publishers may, for example, display a percentage rating)
There are several misconceptions here.
- The default rating values in hReview are from 1.0-5.0 (not 0-5)
- hReview permits the author to state their own 'worst' to 'best' range for any given 'rating'.
Thus a publisher that wants to display a percentage rating can do so by simply specifying a 'worst' value for a rating of 0, and a 'best' value for a rating of 100. Then the actual percentage rating can simply be marked up inline and no separate machine value is necessary.
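The percentage case described above might be marked up roughly as follows. This is a minimal sketch, assuming the value, worst and best class names that hReview uses inside rating; the figure of 85 is purely illustrative.

```html
<!-- Sketch only: a percentage rating published inline, with the
     author-declared range (worst = 0, best = 100) visible in the text.
     The class names are assumed from hReview; 85 is a made-up value. -->
<span class="rating">
  <span class="value">85</span>%
  (on a scale from <span class="worst">0</span> to <span class="best">100</span>)
</span>
```

Because the range is stated alongside the value, the visible text doubles as the machine data and nothing needs to be hidden in an attribute.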
Embedding Fixed Data Formats in Microformats
There are currently three supported methods of including these fixed data formats in a microformatted document.
As Visible Page Content
You may use the standard class-design-pattern to mark up the data visibly in the page.
Ben was born on <span class="bday">1984-02-09</span>.
We're meeting up on Northumberland Avenue (<span class="geo">51.507033,-0.126343</span>).
As An Abbreviation
In some cases, the data formats specified make valid expansions of common human forms, such as dates in an hCard birthday field:
Ben was born on <abbr class="bday" title="1984-02-09">9th February</abbr>
Note, however, that not all data formats are valid expansions. In HTML, the abbr element is working semantically at a text level, not a data level. Both the abbreviated form (the inner text) and the expanded form (the title) need to be consumable by humans.
This means that in hAudio, using an abbreviation for duration is incorrect:
<abbr class="duration" title="PT3M23S">3 minutes, 23 seconds</abbr>
Whilst the data ‘PT3M23S’ is an expanded form of ‘3 minutes, 23 seconds’, the text is not; ‘PT3M23S’ is nonsense to most human beings. abbr is an element that describes the text, not the data. HTML4 has no way to mark up arbitrary data.
Using the value-class-pattern
See value-class-pattern.
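As a rough illustration, the hAudio duration that could not be expressed as an abbreviation might be written with the value-title form of that pattern. This is a sketch under the assumption that the value-title class and its empty-element placement work as described on the value-class-pattern page; check that page for the authoritative parsing rules.

```html
<!-- Sketch only: the machine value sits in the title attribute of an
     otherwise empty first child, leaving the human-readable text untouched.
     value-title is assumed from the value-class-pattern. -->
<span class="duration"><span class="value-title" title="PT3M23S"> </span>3 minutes, 23 seconds</span>
```

Unlike the abbr form, this does not claim that ‘PT3M23S’ is a textual expansion of the visible phrase; the span carries no such text-level semantics.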
Related Pages
- value-excerption-pattern
- class-design-pattern
- abbr-design-pattern
- date-design-pattern
- datetime-design-pattern
- HTML 4.01 definition of the <abbr> element