measure

From Microformats Wiki
Revision as of 12:26, 24 March 2008 by TobyInk (talk | contribs) (Build on current work, bring in some ideas from the brainstorming page.)

Measure microformat

This microformat is currently at the exploratory stage. Contributions should focus on real examples from the Web and on existing formats and encodings of measures.

The problem

Measures (e.g. weights, sizes, temperatures) occur frequently on the Web. Each consists of a value, a unit of measure and, in scientific and technical contexts, an experimental uncertainty. These three elements should be marked up consistently across websites so that they can be easily identified and acted upon (exported, computed, converted) in collaborative distributed applications.

Units of measure differ from locale to locale (e.g. Fahrenheit versus Celsius, pound versus kilogram), making comparison and matching of offerings difficult.

The measure microformat will enable unambiguous description of physical quantities and thus provide a solid foundation for data sharing and automation in many areas.

Draft Schema

Rationale: The names "value" and "type" are taken from hCard; "item" is taken from hReview. Should we include "tolerance" as well?

Standard Measure Schema

  • hmeasure
    • value {1} (numeric)
    • unit {1} (unit)
    • item? (text | hCard | hCalendar)
    • type? (text, e.g. "height", "width", "weight")

Angular Measure Schema

  • hmeasure
    • value {1} (degree)
    • item? (text | hCard | hCalendar)
    • type? (text, e.g. "angle of elevation")

Value

Arbitrary white space MAY be included in the value to improve readability. Parsers MUST strip out all white space before further processing.

In the standard schema, the value MUST be a number, formatted according to the following EBNF pattern:

non-zero-digit = "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ;
digit          = "0" | non-zero-digit ;
natural        = non-zero-digit , {digit} ;
integer        = "0" | [ "-" ] , natural ;
dot-decimal    = integer , "." , {digit} ;
comma-decimal  = integer , "," , {digit} ;
e-sign         = "e" | "E" ;
mantissa       = dot-decimal | comma-decimal | integer ;
sci-number     = mantissa , e-sign , integer ;
number         = dot-decimal | comma-decimal | integer | sci-number ;

This roughly corresponds to a subset of the C syntax for floating-point and integer literals, excluding octal and hexadecimal representations. Note, however, that both commas and full stops may be used as decimal points.

The Unicode minus sign (U+2212) and ASCII-compatible hyphen-minus (U+002D) MUST both be treated as acceptable indicators of a negative number. In addition, the symbols ¼ (U+00BC), ½ (U+00BD) and ¾ (U+00BE) SHOULD be supported as aliases for 0.25, 0.5 and 0.75 respectively.
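As an illustration, the value rules above might be implemented along the following lines. This is a minimal Python sketch rather than part of the draft: the function name parse_value and the regular expressions are assumptions, obtained by transcribing the EBNF by hand.

```python
import re

# Hand transcription of the EBNF above (illustrative, not normative).
_INTEGER = r"(?:0|-?[1-9][0-9]*)"
_MANTISSA = rf"{_INTEGER}(?:[.,][0-9]*)?"
_NUMBER = re.compile(rf"{_MANTISSA}(?:[eE]{_INTEGER})?")

# Vulgar fractions that SHOULD be accepted as aliases.
_FRACTIONS = {"\u00bc": 0.25, "\u00bd": 0.5, "\u00be": 0.75}


def parse_value(text: str) -> float:
    """Parse a standard-schema value; raise ValueError if it is invalid."""
    # Strip all white space first, as parsers MUST do.
    compact = "".join(text.split())
    if compact in _FRACTIONS:
        return _FRACTIONS[compact]
    # Treat U+2212 MINUS SIGN like the ASCII hyphen-minus.
    compact = compact.replace("\u2212", "-")
    if not _NUMBER.fullmatch(compact):
        raise ValueError(f"not a valid measure value: {text!r}")
    # A comma is a decimal point here, never a thousands separator.
    return float(compact.replace(",", "."))
```

Read literally, the grammar does not admit "-0.5", because "integer" only allows a minus sign in front of a natural number. The sketch follows the grammar as written and rejects such input.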

In the angular measure schema, a measure is expressed as a combination of up to three numeric components: degrees, minutes and seconds. Any combination of these components may be used, except that minutes MUST be present whenever both degrees and seconds are given. The components MUST appear in the order degrees, minutes, seconds. Each component must match the production rule for "mantissa" above, with the following additional constraints:

  • Only the first component can bear a minus sign. Subsequent components "inherit" the negativity (or lack thereof) from their predecessors.
  • All components except the last must match the production rule for "integer".

The numeric components MUST be indicated by appending a suffix to each component. Valid suffixes are:

  • degree: "deg", U+00B0 degree symbol (°)
  • minute: "min", straight single quote ('), U+2032 prime (′)
  • second: "sec", straight double quote ("), U+2033 double prime (″)

Examples

  • 1729 (the smallest number that can be expressed as the sum of two cubes in two different ways)
  • 1.61803399 (the golden ratio)
  • 2,99792458e8 (the speed of light in a vacuum, measured in metres per second)
  • -40 (the value at which the Celsius and Fahrenheit scales are equal)
  • 1,000,000,000 (Invalid: commas may be used as decimal points, but not for grouping thousands.)
  • 57.2958 deg (1 radian, in degrees)
  • -57° 17′ 45.1″ (-1 radian, in degrees, minutes and seconds)
  • 4° 30″ (Invalid: no minutes)
  • 4° -30′ (Invalid: only first component may be negative)
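The angular rules and examples above can be checked with a short Python sketch. It is only one possible reading of the draft: the name parse_angle, the choice to return decimal degrees, and the regular expressions are all assumptions made for illustration.

```python
import re

# "integer" and "mantissa" from the EBNF, transcribed by hand.
_INT = "(?:0|-?[1-9][0-9]*)"
_MANT = f"{_INT}(?:[.,][0-9]*)?"
# Each component is optional, but the order is fixed: degrees, minutes, seconds.
_ANGLE = re.compile(
    f"(?:(?P<deg>{_MANT})(?:deg|\u00b0))?"
    f"(?:(?P<min>{_MANT})(?:min|'|\u2032))?"
    f'(?:(?P<sec>{_MANT})(?:sec|"|\u2033))?'
)


def parse_angle(text: str) -> float:
    """Return an angular-schema value in decimal degrees (sketch)."""
    compact = "".join(text.split()).replace("\u2212", "-")
    match = _ANGLE.fullmatch(compact)
    if not compact or match is None:
        raise ValueError(f"not a valid angle: {text!r}")
    parts = [match.group(name) for name in ("deg", "min", "sec")]
    if parts[0] is not None and parts[2] is not None and parts[1] is None:
        raise ValueError("minutes are required between degrees and seconds")
    present = [p for p in parts if p is not None]
    # Only the first component may carry a minus sign.
    if any(p.startswith("-") for p in present[1:]):
        raise ValueError("only the first component may be negative")
    # Every component except the last must be an integer.
    if any("." in p or "," in p for p in present[:-1]):
        raise ValueError("only the last component may have a decimal part")
    total = sum(
        abs(float(p.replace(",", "."))) / scale
        for p, scale in zip(parts, (1, 60, 3600))
        if p is not None
    )
    # Later components inherit the sign of the first one.
    return -total if present[0].startswith("-") else total
```

Summing the absolute values and applying the sign once at the end is what makes the later components "inherit" the negativity of the first.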

Unit

The "unit" class is defined as an arbitrary string. Any unit may be used, but authors SHOULD attempt to use official SI units of measurement where appropriate. Parsers MUST recognise the following case-sensitive list of units, derived from the SI base units and officially recognised derived units, with the addition of bits and bytes (which are commonly used on web pages), litres and radians. (Note that gram appears in this table instead of kilogram. This is deliberate.)

Unit | Aliases
metre | meter, m
gram | gramme, g
second | sec, s
ampere | amp, A
candela | cd
mole | mol
kelvin | K
newton | N
pascal | Pa
joule | J
watt | W
coulomb | C
volt | V
ohm | Ω
siemens | S
farad | F
weber | Wb
henry | H
tesla | T
hertz | Hz
byte | B
bit | b
litre | liter, L
radian | rad

The following SI prefixes MUST be supported (table taken from Wikipedia; it needs rewriting to remove irrelevant columns). "u" MUST be treated as an alias for μ.

SI prefixes
1000^n | 10^n | Prefix | Symbol | Since [1] | Short scale | Long scale | Decimal equivalent in SI writing style
1000^8 | 10^24 | yotta- | Y | 1991 | Septillion | Quadrillion | 1 000 000 000 000 000 000 000 000
1000^7 | 10^21 | zetta- | Z | 1991 | Sextillion | Trilliard (thousand trillion) | 1 000 000 000 000 000 000 000
1000^6 | 10^18 | exa- | E | 1975 | Quintillion | Trillion | 1 000 000 000 000 000 000
1000^5 | 10^15 | peta- | P | 1975 | Quadrillion | Billiard (thousand billion) | 1 000 000 000 000 000
1000^4 | 10^12 | tera- | T | 1960 | Trillion | Billion | 1 000 000 000 000
1000^3 | 10^9 | giga- | G | 1960 | Billion | Milliard (thousand million) | 1 000 000 000
1000^2 | 10^6 | mega- | M | 1960 | Million | Million | 1 000 000
1000^1 | 10^3 | kilo- | k | 1795 | Thousand | Thousand | 1 000
1000^(2/3) | 10^2 | hecto- | h | 1795 | Hundred | Hundred | 100
1000^(1/3) | 10^1 | deca- | da | 1795 | Ten | Ten | 10
1000^0 | 10^0 | (none) | (none) | NA | One | One | 1
1000^(−1/3) | 10^−1 | deci- | d | 1795 | Tenth | Tenth | 0.1
1000^(−2/3) | 10^−2 | centi- | c | 1795 | Hundredth | Hundredth | 0.01
1000^−1 | 10^−3 | milli- | m | 1795 | Thousandth | Thousandth | 0.001
1000^−2 | 10^−6 | micro- | µ | 1960 [2] | Millionth | Millionth | 0.000 001
1000^−3 | 10^−9 | nano- | n | 1960 | Billionth | Milliardth | 0.000 000 001
1000^−4 | 10^−12 | pico- | p | 1960 | Trillionth | Billionth | 0.000 000 000 001
1000^−5 | 10^−15 | femto- | f | 1964 | Quadrillionth | Billiardth | 0.000 000 000 000 001
1000^−6 | 10^−18 | atto- | a | 1964 | Quintillionth | Trillionth | 0.000 000 000 000 000 001
1000^−7 | 10^−21 | zepto- | z | 1991 | Sextillionth | Trilliardth | 0.000 000 000 000 000 000 001
1000^−8 | 10^−24 | yocto- | y | 1991 | Septillionth | Quadrillionth | 0.000 000 000 000 000 000 000 001
Notes:
1. The 1795 dates identify prefixes in use since the metric system was introduced. The other dates are not necessarily dates of first use, but rather the date of recognition by a resolution of the CGPM, which first met in 1889.
2. The micron was earlier recognized by the CGPM in 1948; that decision was abrogated in 1967-68.
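One way a parser could combine the unit list with the prefix table is sketched below in Python. The draft does not say how prefixes and aliases interact, so the lookup order (exact unit first, then longest matching prefix), the function name resolve_unit and the returned pair of canonical name and power of ten are all assumptions. The sketch uses the standard spelling "tesla", and accepts both U+00B5 and U+03BC for micro and both U+03A9 and U+2126 for ohm.

```python
# Canonical unit name -> aliases, following the list of units above.
UNITS = {
    "metre": ("meter", "m"), "gram": ("gramme", "g"), "second": ("sec", "s"),
    "ampere": ("amp", "A"), "candela": ("cd",), "mole": ("mol",),
    "kelvin": ("K",), "newton": ("N",), "pascal": ("Pa",), "joule": ("J",),
    "watt": ("W",), "coulomb": ("C",), "volt": ("V",),
    "ohm": ("\u03a9", "\u2126"), "siemens": ("S",), "farad": ("F",),
    "weber": ("Wb",), "henry": ("H",), "tesla": ("T",), "hertz": ("Hz",),
    "byte": ("B",), "bit": ("b",), "litre": ("liter", "L"), "radian": ("rad",),
}

# Prefix name -> (symbol, power of ten), following the SI prefix table.
PREFIXES = {
    "yotta": ("Y", 24), "zetta": ("Z", 21), "exa": ("E", 18), "peta": ("P", 15),
    "tera": ("T", 12), "giga": ("G", 9), "mega": ("M", 6), "kilo": ("k", 3),
    "hecto": ("h", 2), "deca": ("da", 1), "deci": ("d", -1), "centi": ("c", -2),
    "milli": ("m", -3), "micro": ("\u00b5", -6), "nano": ("n", -9),
    "pico": ("p", -12), "femto": ("f", -15), "atto": ("a", -18),
    "zepto": ("z", -21), "yocto": ("y", -24),
}

_UNIT_LOOKUP = {}
for _name, _aliases in UNITS.items():
    for _spelling in (_name, *_aliases):
        _UNIT_LOOKUP[_spelling] = _name

_PREFIX_LOOKUP = {}
for _name, (_symbol, _power) in PREFIXES.items():
    _PREFIX_LOOKUP[_name] = _power
    _PREFIX_LOOKUP[_symbol] = _power
_PREFIX_LOOKUP["\u03bc"] = -6  # Greek small letter mu
_PREFIX_LOOKUP["u"] = -6       # ASCII alias that MUST be accepted


def resolve_unit(token: str) -> tuple[str, int]:
    """Return (canonical unit, power of ten) for a single unit token."""
    # An exact match wins, so "m" is metre and "cd" is candela.
    if token in _UNIT_LOOKUP:
        return _UNIT_LOOKUP[token], 0
    # Otherwise try prefixes, longest first, so "dam" reads as deca + m.
    for prefix in sorted(_PREFIX_LOOKUP, key=len, reverse=True):
        rest = token[len(prefix):]
        if token.startswith(prefix) and rest in _UNIT_LOOKUP:
            return _UNIT_LOOKUP[rest], _PREFIX_LOOKUP[prefix]
    raise ValueError(f"unknown unit: {token!r}")
```

The sketch does not check that a prefix symbol is paired with a unit symbol, so a mixed form such as "kmetre" is accepted; a stricter parser might reject it.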

Combining units

Units may be multiplied by separating them with white space, or divided using a slash (/) or U+2215 division slash (∕). Units may be raised to an integer power using a caret character (^). The Unicode superscript numerals 2 to 9 (U+00B2, U+00B3, U+2074 to U+2079) MUST be supported as aliases for raising to the corresponding integer powers. Multiplication takes precedence over division: in "kg m / s", kg and m are multiplied together before the product is divided by s.

Examples:

  • <span class="unit">kg m / s</span>
  • <span class="unit">m/s^2</span>
  • <span class="unit">meter³</span>
  • <abbr class="unit" title="μm">micron</abbr>
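The combination rules can be sketched in Python as a function that reduces a unit string to a map of unit tokens and exponents. The name parse_unit is hypothetical. Accepting a minus sign after the caret, treating repeated slashes as repeated division, and leaving tokens unresolved (no alias or prefix handling) are assumptions of this sketch, not requirements of the draft.

```python
import re

# Superscript numerals 2 to 9 and the powers they stand for.
_SUPERSCRIPTS = {
    "\u00b2": 2,
    "\u00b3": 3,
    **{chr(0x2070 + n): n for n in range(4, 10)},  # U+2074 to U+2079
}


def _split_power(factor: str) -> tuple[str, int]:
    """Split one factor such as "s^2" or "meter³" into (unit, power)."""
    if "^" in factor:
        unit, _, power = factor.partition("^")
        result = unit, int(power)  # int() raises ValueError on bad input
    elif factor[-1] in _SUPERSCRIPTS:
        result = factor[:-1], _SUPERSCRIPTS[factor[-1]]
    else:
        result = factor, 1
    if not result[0]:
        raise ValueError(f"missing unit in {factor!r}")
    return result


def parse_unit(expression: str) -> dict[str, int]:
    """Reduce a unit expression to a {unit token: exponent} mapping."""
    exponents: dict[str, int] = {}
    # Splitting on slashes first makes multiplication bind more tightly:
    # every white-space-separated factor after a slash is a divisor.
    groups = re.split("[/\u2215]", expression)
    for index, group in enumerate(groups):
        factors = group.split()
        if not factors:
            raise ValueError(f"empty term in {expression!r}")
        sign = 1 if index == 0 else -1
        for factor in factors:
            unit, power = _split_power(factor)
            exponents[unit] = exponents.get(unit, 0) + sign * power
    # Drop units that cancel out entirely.
    return {unit: power for unit, power in exponents.items() if power != 0}
```

For example, parse_unit("m/s^2") yields {"m": 1, "s": -2}.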

Angular units

Units MUST NOT be given for measurements expressed in the angular measure schema: the degree itself is the unit.

If the standard schema is used, units may be given in radians (rad).

Non-SI Units

Authors MAY specify units other than those defined above, but SHOULD NOT assume that parsers will be able to interpret them. Authors using other units SHOULD provide a rel=glossary link to a page that defines the units.

Item

An hCard, hCalendar event or textual description of the item being measured may be supplied.

<p class="hmeasure">
  <span class="item vcard">The <span class="fn">Great Wall</span> of
  <span class="adr"><span class="country-name">China</span></span></span>
  is about <span class="value">6 700</span> <abbr class="unit" title="metre">metres</abbr>
  <abbr title="length" class="type">long</abbr>.
</p>

The item is optional.

Type

The type specifies the dimension being measured. A measurement in, say, metres may be ambiguous because it could refer to a depth, a height, a length or a width. The optional type parameter allows you to specify a human-readable dimension.
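To show how the value, unit, item and type properties might be pulled out of markup, here is a rough Python sketch built on the standard library's html.parser module. The class and function names are hypothetical. Reading the title attribute of an abbr element in preference to its text is an assumption based on the examples on this page, and the sketch expects the fragment to contain a single hmeasure.

```python
from html.parser import HTMLParser

PROPERTIES = {"value", "unit", "item", "type"}
ROOT = "hmeasure"


class HMeasureParser(HTMLParser):
    """Collect the properties of one hmeasure element (sketch)."""

    def __init__(self):
        super().__init__()
        self.stack = []      # one (tag, properties) entry per open element
        self.inside = False  # True while inside the hmeasure root
        self.found = {}      # property name -> collected text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        props = set()
        if self.inside:
            props = classes & PROPERTIES
            # Assumption: on <abbr>, the title holds the machine-readable form.
            if tag == "abbr" and attrs.get("title"):
                for name in props:
                    self.found[name] = attrs["title"]
                props = set()
        elif ROOT in classes:
            self.inside = True
            props = {ROOT}
        self.stack.append((tag, props))

    def handle_endtag(self, tag):
        if tag not in [open_tag for open_tag, _ in self.stack]:
            return  # stray end tag: ignore it
        # Pop back to the matching start tag; this also discards void
        # elements such as <br>, which never receive an end tag.
        while self.stack:
            open_tag, props = self.stack.pop()
            if ROOT in props:
                self.inside = False
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Text belongs to every enclosing property, so nested markup
        # (such as an hCard inside "item") is flattened to plain text.
        for _, props in self.stack:
            for name in props - {ROOT}:
                self.found[name] = self.found.get(name, "") + data


def extract_hmeasure(html: str) -> dict[str, str]:
    parser = HMeasureParser()
    parser.feed(html)
    parser.close()
    # Collapse runs of white space left over from the source formatting.
    return {name: " ".join(text.split()) for name, text in parser.found.items()}
```

The value is returned as raw text ("6 700" in the Great Wall example) so that number parsing can be applied as a separate step.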

Related microformats

  • hcalendar can provide a complete quantitative description of a natural event (for example, an earthquake) occurring at a specified time (dtstart/dtend) and location (embedded geo), simply by embedding measured physical quantities in the 'description' span.
  • job-listing can use a time measure to specify the period that a salary covers.
  • hlisting can use measures for product dimensions, weight/mass and time periods (as above).
  • directions-examples can use a length measure for mileage and a time measure for the journey from one point to the next.
  • recipe-examples can use weight, volume and time measures for ingredients and preparation time.
  • currency can be viewed as a measurement unit, or as a component of a measurement unit, as in $ per hour.

Contributors

References

See also