rest/datatypes

(Difference between revisions)

(Proposal)
Line 59: Line 59:
* boolean (0,1)
* base64
 +
Let's call this 'binary', as the encoding is in the data: URL, and DRY applies.
* dateTime[.iso8601]
While not perfect, these certainly cover the 80% case, and are reasonably well-defined.  That said, there are a number of open questions about how to use them:
-
# should 'string' also be explicitly specified, or can it be assumed?
+
# should 'string' also be explicitly specified, or can it be assumed?  
 +
Assumed, and also defined as utf-8. [[User:Kevin Marks|Kevin Marks]] 16:39, 13 Feb 2006 (PST)
# does 'int' always mean 32-bits?
##  If so, what should be used for 64-bit integers or cryptographic (256-bit+) numbers?
Line 70: Line 72:
###SQL's "decimal", perhaps?
##  If not, how should conforming implementations react to longer integers than they can handle?
 +
I think integer is fine - we don't have an explicit constraint here. Do you want to define +Inf, -Inf, and NaN behaviour? Certainly include these when building test cases and examples.
 +
# Is it worth deviating from the standard to allow "dateTime" as an alias? (the one case where XML Schema is actually simpler)
Line 77: Line 81:
* the name 'long' MAY be used for 64-bit or longer integers
* for 'dateTime'
 +
Can we make this 'datetime'? [[User:Kevin Marks|Kevin Marks]] 16:39, 13 Feb 2006 (PST)
** the trailing '.iso8601' MUST be omitted, as '.' is not (always?) valid in CSS class names
** date/time formats SHOULD follow the [http://www.w3.org/TR/NOTE-datetime W3C profile] of [http://en.wikipedia.org/wiki/ISO_8601 ISO 8601]
Line 82: Line 87:
* binary data SHOULD be encoded in a [http://en.wikipedia.org/wiki/Data:_URI_scheme data: URI], with an explicit [http://www.htmlhelp.com/reference/html40/special/a.html ContentType] and a human-readable description as the body of the anchor.
* if no datatype is specified, an implementation MAY either attempt to infer a datatype from the syntax of the value, or simply assert that the value is a string.  Thus, conforming implementations SHOULD always explicitly label strings.
 +
Disagree - either we are labelling datatypes, and thus labelling string is redundant, or we are trying to guess from syntax. If the latter, this whole spec is unnecessary. [[User:Kevin Marks|Kevin Marks]] 16:39, 13 Feb 2006 (PST)
 +
To indicate that a particular microformat uses typed values, precede that microformat with the class name 'typed', as in:
Line 98: Line 105:
   <dt>data</dt><dd class="base64"><a href="data:;base64,sdcfo2JTiXE=" type="image/jpg">my image</a></dd>
   <dt>data</dt><dd class="base64"><a href="data:;base64,sdcfo2JTiXE=" type="image/jpg">my image</a></dd>
   </dl>
   </dl>
 +
 +
Example revised with above suggestions:
 +
 +
  <dl class="typed xoxo">
 +
  <dt>key</dt><dd>value</dd>
 +
  <dt>integer</dt><dd class="int">137</dd>
 +
  <dt>real</dt><dd class="double">3.14159265</dd>
 +
  <dt>date</dt><dd class="datetime">1994-11-05T13:15:30Z</dd>
 +
  <dt>date(abbr)</dt><dd class="datetime"><abbr title="1994-11-05">November 5, 1994</abbr></dd>
 +
  <dt>true</dt><dd class="boolean">1</dd>
 +
  <dt>false</dt><dd class="boolean">0</dd>
 +
  <dt>data</dt><dd class="binary"><a href="data:;base64,sdcfo2JTiXE=" type="image/jpg">my image</a></dd>
 +
  </dl>
 +
== References ==

Revision as of 00:39, 14 February 2006


Datatypes in HTML

One of the challenges of using HTML as a data transport is that everything, by default, is a string. This page explores ways to use microformats -- specifically, class names -- to encode data type information, e.g., for use with xoxo and rest/ahah, in order to allow lossless import/export from various languages. These could also be used with forms to provide rest/descriptions of the type of data expected.

Examples

These are the primary datatypes in a range of different languages and formats. Note that we are only concerned with "primitive" datatypes (loosely defined), as structured datatypes (list/array, hash/dictionary) are handled by xoxo.

Datatype comparison table

| Language/format | string | float | integer | boolean | data | date/time | null |
|---|---|---|---|---|---|---|---|
| XML Schema | string | float, double | decimal, integer, etc. | boolean | hexBinary, base64Binary | duration, dateTime, date, time | nil |
| XML-RPC | string | double | i4, int | boolean | base64 | dateTime.iso8601 | nil |
| Mac OS X plists | string | real | integer | true, false | data | date | nil |
| JSON (JavaScript) | string | number | number | true, false | N/A | Date | null |
| YAML tags | str | float | int | bool | null (base 64) | N/A | null |
| SQL (JDBC) | char, varchar | float, double, real | decimal, numeric | bit | binary | date, time, timestamp | ? |
| C | char[] | float, double | int, long, short | bool, int | char[] | N/A | (void*)0 |
| Java | char, String | float, double | int, long, short, byte | boolean | N/A | util.Date | null |
| PHP | string | float (double) | integer | boolean | array | N/A | NULL |
| Perl | array | scalar | scalar | scalar | array | N/A | |
| Python | str | float, complex | int, long | bool | binascii, base64 | time, datetime | |
| Ruby + lib | String | Float | Fixnum, Bignum | TrueClass, FalseClass | Hash | Date | NilClass |
| REBOL | string! | decimal! | integer! | logic! | binary! | date!, time! | none! |

Analysis

The most common set of datatypes appears to be those represented by XML-RPC, which (perhaps fortunately) also has historical precedent on the web:

Let's call base64 'binary', as the encoding is already stated in the data: URL, and DRY applies.

While not perfect, these certainly cover the 80% case, and are reasonably well-defined. That said, there are a number of open questions about how to use them:

  1. Should 'string' also be explicitly specified, or can it be assumed?

Assumed, and also defined as UTF-8. Kevin Marks 16:39, 13 Feb 2006 (PST)

  2. Does 'int' always mean 32 bits?
    1. If so, what should be used for 64-bit integers or cryptographic (256-bit+) numbers?
      1. Python's 'long' is simple, but ambiguous.
      2. Ruby's Bignum is clear but much less common.
      3. XML Schema has so many types it is hard to say.
      4. SQL's "decimal", perhaps?
    2. If not, how should conforming implementations react to longer integers than they can handle?

I think integer is fine - we don't have an explicit constraint here. Do you want to define +Inf, -Inf, and NaN behaviour? Certainly include these when building test cases and examples.

  3. Is it worth deviating from the standard to allow "dateTime" as an alias? (the one case where XML Schema is actually simpler)
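
The integer-width and Inf/NaN questions above can be made concrete with a small sketch. This is only one possible reading, not part of the proposal: the function names parse_int and parse_double, the bits parameter, and the allow_special flag are all hypothetical. It treats 'int' as unbounded by default, lets an implementation with a fixed width reject values it cannot hold, and makes the handling of +Inf, -Inf, and NaN an explicit choice.

```python
import math


def parse_int(text, bits=None):
    """Parse an 'int' value.

    With bits=None the integer is unbounded (the "integer is fine" reading).
    With bits=32 or bits=64 a value outside the signed range is rejected,
    which is one way an implementation could react to integers it cannot handle.
    """
    value = int(text.strip())
    if bits is not None:
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if not low <= value <= high:
            raise OverflowError(f"{value} does not fit in {bits} bits")
    return value


def parse_double(text, allow_special=False):
    """Parse a 'double' value, optionally accepting +Inf, -Inf, and NaN."""
    value = float(text.strip())
    if not allow_special and not math.isfinite(value):
        raise ValueError(f"non-finite value not allowed: {text!r}")
    return value
```

Whether out-of-range integers and non-finite doubles should raise an error, as here, or be passed through as strings is exactly the kind of behaviour the open questions leave undecided.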

Proposal

The proposal is to adopt XML-RPC scalar values as the class names for typed microformats, with the following caveats:

Can we make the 'dateTime' class name 'datetime'? Kevin Marks 16:39, 13 Feb 2006 (PST)

Disagree - either we are labelling datatypes, and thus labelling string is redundant, or we are trying to guess from syntax. If the latter, this whole spec is unnecessary. Kevin Marks 16:39, 13 Feb 2006 (PST)
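
The two positions in this disagreement can be compared side by side. The sketch below is an illustration only: resolve_type, KNOWN_TYPES, and the regular expressions are hypothetical and not defined anywhere in the proposal. With infer=False an unlabelled value is simply a string (so labelling strings is redundant); with infer=True the type is guessed from the syntax of the value.

```python
import re

# Class names from the proposal plus the suggested renames; an assumed set.
KNOWN_TYPES = {"string", "int", "double", "boolean",
               "dateTime", "datetime", "base64", "binary"}

_INT = re.compile(r"[+-]?\d+")
_DOUBLE = re.compile(r"[+-]?(\d+\.\d*|\.\d+|\d+)([eE][+-]?\d+)?")
_DATETIME = re.compile(
    r"\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2})?(Z|[+-]\d{2}:\d{2}))?")


def resolve_type(value, class_names=(), infer=False):
    """Return the datatype name for a value.

    An explicit class name always wins. Without one, the value is a
    string unless infer=True, in which case the syntax is inspected.
    """
    for name in class_names:
        if name in KNOWN_TYPES:
            return name
    if not infer:
        return "string"
    if _INT.fullmatch(value):
        return "int"
    if _DOUBLE.fullmatch(value):
        return "double"
    if _DATETIME.fullmatch(value):
        return "datetime"
    return "string"
```

Note that inference cannot tell a boolean 1 from an integer 1, or the string "137" from the integer 137, which is one reason to prefer explicit labels with string as the default.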


To indicate that a particular microformat uses typed values, precede that microformat with the class name 'typed', as in:

<div class="typed xoxo">

Example

 <dl class="typed xoxo">
  <dt>key</dt><dd class="string">value</dd>
  <dt>integer</dt><dd class="int">137</dd>
  <dt>real</dt><dd class="double">3.14159265</dd>
  <dt>date</dt><dd class="dateTime">1994-11-05T13:15:30Z</dd>
  <dt>date(abbr)</dt><dd class="dateTime"><abbr title="1994-11-05">November 5, 1994</abbr></dd>
  <dt>true</dt><dd class="boolean">1</dd>
  <dt>false</dt><dd class="boolean">0</dd>
  <dt>data</dt><dd class="base64"><a href="data:;base64,sdcfo2JTiXE=" type="image/jpg">my image</a></dd>
 </dl>

Example revised with above suggestions:

 <dl class="typed xoxo">
  <dt>key</dt><dd>value</dd>
  <dt>integer</dt><dd class="int">137</dd>
  <dt>real</dt><dd class="double">3.14159265</dd>
  <dt>date</dt><dd class="datetime">1994-11-05T13:15:30Z</dd>
  <dt>date(abbr)</dt><dd class="datetime"><abbr title="1994-11-05">November 5, 1994</abbr></dd>
  <dt>true</dt><dd class="boolean">1</dd>
  <dt>false</dt><dd class="boolean">0</dd>
  <dt>data</dt><dd class="binary"><a href="data:;base64,sdcfo2JTiXE=" type="image/jpg">my image</a></dd>
 </dl>
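
To show what "lossless import" could look like for the revised example, here is a minimal sketch of a reader in Python, using only the standard library. Everything in it is an assumption made for illustration: the names TypedXoxoParser, convert, and parse_typed_xoxo are hypothetical, unlabelled values are treated as strings, the title of an abbr is preferred over its text, and binary data is read from a base64 data: URI in the href of the anchor.

```python
import base64
from datetime import date, datetime
from html.parser import HTMLParser

TYPE_NAMES = {"string", "int", "double", "boolean",
              "datetime", "dateTime", "binary", "base64"}


def convert(type_name, raw):
    """Convert the raw text of a <dd> to a native value."""
    if type_name == "int":
        return int(raw)
    if type_name == "double":
        return float(raw)
    if type_name == "boolean":
        if raw not in ("0", "1"):
            raise ValueError(f"boolean must be 0 or 1, got {raw!r}")
        return raw == "1"
    if type_name in ("datetime", "dateTime"):
        if "T" not in raw:
            return date.fromisoformat(raw)
        # Spell out a trailing 'Z' as an offset so older Pythons accept it.
        return datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if type_name in ("binary", "base64"):
        header, _, payload = raw.partition(",")
        if not (header.startswith("data:") and header.endswith(";base64")):
            raise ValueError("expected a base64 data: URI")
        return base64.b64decode(payload)
    return raw  # 'string' or unlabelled


class TypedXoxoParser(HTMLParser):
    """Collect <dt>/<dd> pairs into a dict of typed values."""

    def __init__(self):
        super().__init__()
        self.items = {}
        self._key = None
        self._type = "string"
        self._inside = None      # "dt", "dd", or None
        self._text = []
        self._override = None    # abbr title or data: URI, if present

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "dt":
            self._inside, self._text = "dt", []
        elif tag == "dd":
            self._inside, self._text, self._override = "dd", [], None
            classes = (attrs.get("class") or "").split()
            known = [c for c in classes if c in TYPE_NAMES]
            self._type = known[0] if known else "string"
        elif self._inside == "dd" and tag == "abbr" and attrs.get("title"):
            self._override = attrs["title"]
        elif (self._inside == "dd" and tag == "a" and attrs.get("href")
              and self._type in ("binary", "base64")):
            self._override = attrs["href"]

    def handle_data(self, data):
        if self._inside:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "dt":
            self._key = "".join(self._text).strip()
            self._inside = None
        elif tag == "dd":
            text = "".join(self._text).strip()
            raw = self._override if self._override is not None else text
            self.items[self._key] = convert(self._type, raw)
            self._inside = None


def parse_typed_xoxo(html):
    """Parse a typed XOXO definition list and return a dict."""
    parser = TypedXoxoParser()
    parser.feed(html)
    parser.close()
    return parser.items
```

Calling parse_typed_xoxo on the revised example yields a dictionary keyed by the dt text, with an integer, a float, a timezone-aware datetime, a date, two booleans, and a bytes value. The sketch ignores nesting and the 'typed' marker on the container, both of which a real implementation would need to handle.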


