chat-brainstorming

From Microformats Wiki
Latest revision as of 20:06, 5 March 2014

Requirements

Scope

Should a microformat matching these requirements be capable of representing only transcripts of existing IM protocols, or should it also be able to serve as an exchange format itself? This might be useful for simple AJAX IM platforms, although I suspect XMPP is by definition a better choice for such purposes. --BigSmoke 13:09, 21 Jun 2006 (PDT)

- Anything that can be parsed can be used in AJAX, so we don't need to consider this in developing a microformat. --Scott Reynen

How much data, and what kind, is going to be in the file? The work done by the Unified Logging Format WG (http://purl.org/NET/ULF/SPEC) has a pretty good overview of the various types of things that an IM client would want to log. Don't make too much of a bikeshed about it -- I'm mostly linking it because it's a good overview of the sorts of general element types (message, event, status) we probably want to use. --Colin Barrett 05:33, 22 Aug 2006 (PDT)

There has been debate on the mailing list recently as to whether on-disk data storage or presentation should be the focus of this microformat. Here is one thread: http://microformats.org/discuss/mail/microformats-discuss/2006-October/006869.html. Pipermail is a bit braindead, so it may be useful to look at the whole October page: http://microformats.org/discuss/mail/microformats-discuss/2006-October/thread.html#6869

Chat rooms

Is it useful for this microformat to support the representation of "chat rooms", such as IRC channels? --BigSmoke

- Location is a problem that can be clearly separated from chats. We should stick to solving the smallest problem possible, so we can more easily combine microformats later to solve larger problems. --Scott Reynen

- On chat-formats and chat-examples, IRC logs are used. I would say we should include IRC logs in our spec -- it just makes sense to design for multi-user chat, because one-to-one messaging is just a special case of that. --Colin Barrett 05:33, 22 Aug 2006 (PDT)

Example playground

<div class="hchat-log">
  <p class="hchat-msg">
    <abbr class="time" title="YYYY-MM-DDTHH:MM:SS">HH:MM:SS</abbr>
    <!-- Please, fill me in -->
  </p>
</div>

hChatLog Strawman Proposal

Initially compiled by BenWard.

Following a post to uf-discuss, Colin Barrett requested I write up my hChatLog example in more detail, so I present it here as a straw man proposal to see if chat can gain any traction.

It's based around the initial brainstorming, but was initially led by my own preferences for mark-up. The class names introduced are reused or derived from other existing microformats, and new class names are based on the Unified Logging Format, as the element names in ULF are already human-friendly.

hChatLog Structure

  • hChatLog
    • message
      • dt (or time)
      • sender (which MUST be an hCard and SHOULD be a CITE element)
      • 'quoted message content' (which MUST be a Q or BLOCKQUOTE element)
    • status
      • dt (or time)
      • sender (which MUST be an hCard and SHOULD be a CITE element)
      • type (from a predefined list of values)
      • message (optional, the custom message the user assigns to a status, e.g. an 'away message')

Notes about the above structure

  • message, sender, status and type are all taken from ULF. Note that 'type' also corresponds with tel > type and adr > type in hCard.
  • 'dt' is proposed as a derivative of 'dtstart' and 'dtend' as used in hCalendar; however, 'time' (again from ULF) may be preferable.
  • The message text does not have a class name, and is instead identified as quoted text within a message, marked up with Q (for common single-line messages) or BLOCKQUOTE (for multiline or otherwise complex messages).

List of status 'type' values

This should be a single-word list of the common status types from current IM implementations, namely:

  • Online
  • Offline
  • Away
  • Busy
  • BRB (Be Right Back)
  • Lunch (Out To Lunch)
  • Phone (On The Phone)

Example

Note that this example uses an OL as the container, with LI elements for each message/status. This mark-up scheme for dialogue is subject to opinions of how strictly an Ordered List should be specified. For this reason OL/LI mark-up is not specified as 'SHOULD' or 'MUST' in the structure above. The WHATWG list recently included a discussion about tightening the definition of OL for HTML5 (though no definitive resolution was reached).

<!-- ‘hChatLog’ straw man by Ben Ward -->
<ol class="hChatLog">
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:22:00+0100">1:22am</abbr>:
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=BenWardcouk">Ben</a></cite>
    <blockquote>
      <p>Hello Hanni</p>
      <p>How're you today?</p>
    </blockquote>
  </li>
  <li class="status"> <!-- not 'event' -->
    <abbr class="dt" title="2006-10-26T01:27:00+0100">1:27am</abbr>:
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dt" title="2006-10-26T01:28:00+0100">1:28am</abbr>:
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a>
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>

In this example, two messages are exchanged followed by two status changes. Each message contains an hCard, identifying the user's nickname (which should map to the user's 'display name' or 'screen name') and uses the form proposed in hcard-examples for New Types of Contact Info to identify the AIM usernames.

Messages are quoted: single-line messages go in Q elements and multiline messages in BLOCKQUOTE. There's no reason to limit each message to contain only one Q or BLOCKQUOTE, as depending on the precision of the timestamps being used, it may be appropriate to allow messages from the same individual to be placed together.
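As a rough sketch of that last point, a log with minute-precision timestamps might group two consecutive lines from the same sender into a single message. The mark-up reuses the class names from the example above; whether a parser should treat the two Q elements as one message or two is an open question that the proposal does not settle.

```html
<!-- Sketch only: two lines from one sender sharing a single timestamp -->
<li class="message">
  <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
  <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=BenWardcouk">Ben</a></cite>
  <q>Hello Hanni</q>
  <q>How're you today?</q>
</li>
```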

Status messages contain the DT, SENDER and then a TYPE. Since we're presenting humans-first information here, note that while the first status change ('Hanni went away') uses the exact status type, the second ('Hanni came back') uses the abbr-pattern to embed the 'online' status type name.
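The structure above also allows an optional 'message' on a status (for instance an 'away message'), which the example does not show. One possible rendering is sketched here. Placing the 'message' class directly on a Q element is an assumption made for illustration, as is the away text itself; the proposal only says that the property exists.

```html
<!-- Sketch only: a status change carrying a custom away message.
     The class="message" on Q is a hypothetical placement. -->
<li class="status">
  <abbr class="dt" title="2006-10-26T01:27:00+0100">1:27am</abbr>:
  <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a> went <span class="type">away</span></span>
  <q class="message">Back in five minutes</q>
</li>
```

Keeping the custom text in a Q element stays consistent with how ordinary message text is quoted elsewhere in the log.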

It may be necessary to integrate the IM service URL patterns into the 'hChatLog' microformat itself, depending on whether they are treated as a formal part of hCard yet. Either way, a means to identify service usernames with less reliance on implication is needed: for example, MSN accounts are identified by @hotmail, @msn or @passport domains, which excludes MSN Messenger users with their own domains, and other IM services may not have a URI scheme at all.

Example of 'chat-username' class, extending hCard

Whilst fitting this into the process needs to be clarified, it would be clearest to introduce a 'chat-username' class within hCards in hChatLog, to identify usernames.

<li class="message">
  <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>: 
  <cite class="sender vcard"><a class="url" href="aim:goim?screenname=BenWardcouk"><abbr class="chat-username" title="BenWardcouk"><span class="fn nickname">Ben</span></abbr></a></cite>
  <q>Hello</q>
</li>

This fits into existing patterns OK, but loses one very important piece of information, the service provider. This could be spec'd as a prefix (<abbr class="chat-username" title="aim:BenWardcouk">…</abbr>) or could be a separate property altogether (chat-service).
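The two options can be put side by side. This is only a sketch: the 'chat-service' name comes from the paragraph above, but its placement inside the hCard, the use of the abbr-pattern for it, and the value "aim" are assumptions made for illustration.

```html
<!-- Option 1: service as a prefix on the chat-username value -->
<cite class="sender vcard">
  <abbr class="chat-username" title="aim:BenWardcouk"><span class="fn nickname">Ben</span></abbr>
</cite>

<!-- Option 2: a separate, hypothetical chat-service property -->
<cite class="sender vcard">
  <abbr class="chat-service" title="aim">AIM</abbr>:
  <abbr class="chat-username" title="BenWardcouk"><span class="fn nickname">Ben</span></abbr>
</cite>
```

The prefix form keeps the mark-up shorter but requires parsers to split the value; the separate property is more verbose but needs no value parsing.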

Example with include-pattern

Introducing two new properties to each message hCard would create considerable clutter in the mark-up, and the repetition of hCards is already sub-optimal. The include-pattern in hCard can be used to keep the format cleaner, as demonstrated below.

Note: In this example, I'm still using the aim: URL form of representing usernames.

<html> <!-- … -->
<body>
<h1>Chat between Ben and Hanni – Friday October 26th</h1>

<h2>Participants</h2>
<ul>
  <li id="benwardcouk">
    <h3><a class="fn url nickname" href="aim:goim?screenname=benwardcouk">Ben</a></h3>
    <p><a class="url" href="http://ben-ward.co.uk">Homepage</a></p>
    <p class="description">Ben is a 22 year old web application developer in Birmingham, England</p>
  </li>
  <li id="hanni">
    <h3><a class="fn url nickname" href="aim:goim?screenname=hanniusername">Hanni</a></h3>
    <p><a class="url" href="http://hanniross.com">Homepage</a></p>
  </li>
</ul>

<ol class="hChatLog">
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:22:00+0100">1:22am</abbr>:
    <cite class="sender vcard"><a class="include" href="#hanni">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
    <cite class="sender vcard"><a class="include" href="#benwardcouk">Ben</a></cite>
    <blockquote><p>Hello Hanni</p>
    <p>How're you today?</p></blockquote>
  </li>
  <li class="status">
    <abbr class="dt" title="2006-10-26T01:27:00+0100">1:27am</abbr>:
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dt" title="2006-10-26T01:28:00+0100">1:28am</abbr>:
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a>
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>
</body>
</html>

In this last example, the repeated hCards are declared once at the top of the document, with additional detail, and can then be parsed into the log itself using the include-pattern. The A/@class=include pattern is used, rather than OBJECT, as it allows the user's display name to be repeated in each message as text; that text is replaced by the included hCard when parsed.

Note: The VCARD class name is on the CITE elements in each message/status, rather than on the LIs where the hCards are declared in full. This is because the include-pattern currently only works inside an hCard. It would be tidier to have the VCARD class on the LIs in the participants list, but unless the include-pattern can be extended to apply natively inside hChatLog, the hCards must be arranged like this.
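For comparison, the OBJECT form of the include-pattern mentioned above would look roughly like this. It is a sketch of the alternative rather than part of the proposal; the id "hanni" is taken from the participants list in the example.

```html
<!-- Sketch only: the OBJECT variant of the include-pattern -->
<li class="message">
  <abbr class="dt" title="2006-10-26T01:22:00+0100">1:22am</abbr>:
  <cite class="sender vcard"><object class="include" data="#hanni"></object></cite>
  <q>Hello Ben</q>
</li>
```

Here the sender's name does not appear as text in the message itself, which is the drawback the A form avoids.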

Ideas

Using paragraphs to represent chat messages

I think that individual messages in a chat log should be formatted as XHTML paragraphs (<p>), because this is how conversations are commonly formatted. From the examples I gather that this is also what the ILRT Logger Bot currently does. --BigSmoke 13:09, 21 Jun 2006 (PDT)

- We can't assume all paragraphs are chat messages, so we'll need a class name to identify a chat message. Once a class name is identifying something as a message, what is the advantage of applying the additional stipulation of a specific HTML tag? It doesn't appear to aid parsing, and it only constrains publishers. --Scott Reynen

- I'm not convinced that ‘messages are paragraphs’ is an overly fair assumption: Lots of chat is extremely fragmented into sentences (or even partial sentences). I'd be nervous about generalising the P element any further than it already is.

I have a lot of love for Anne van Kesteren's chat mark-up (using Q elements for single line text, and BLOCKQUOTE > P for multiline messages, where the presence of newlines seems a more concrete basis on which to describe paragraph).

As far as block level element construction goes, AvK's mark-up again highlights the capability of raw HTML: OL is certainly correct, as is CITE and Q/BLOCKQUOTE. Paragraphs might not always be correct.

--BenWard 12:29, 24 Sep 2006 (PDT)

- No sooner do I say ‘OL is certainly correct’ than something comes up to question it. Those interested in developing hChat might also like to keep an eye on the WHATWG list where there's been some questioning of using OL for dialogue. Additionally, there's a fresh discussion on dialogue mark-up at Eric Meyer's blog.

--BenWard 03:41, 24 Oct 2006 (PDT)