chat-brainstorming
laerchic
Requirements
Scope
Should a microformat matching these requirements be capable of representing only transcripts of existing IM protocols or should it also be able to serve as an exchange format itself. This might be useful for simple AJAX IM platforms, although I doubt if XMPP is not per definition a better choise for such purposes. --BigSmoke 13:09, 21 Jun 2006 (PDT)
- Anything that can be parsed can be used in AJAX, so we don't need to consider this in developing a microformat. --Scott Reynen
How much, and what kind of data is going to be in the file? The work done by the Unified Logging Format WG has a pretty good overview of the various types of things that an IM client would want to log. Don't make too much of a bikeshed about it -- I'm mostly linking it beacuse it's a good overview of the sorts of general element types (message, event, status) we probably want to use. --Colin Barrett 05:33, 22 Aug 2006 (PDT)
There has been debate on the mailing list recently as to wether on-disk data storage or presentation should be the focus of this microformat. here is one thread. pipermail is a bit braindead so it may be useful to look at the whole october page
Chat rooms
Is it useful for this microformat to support the representation of "chat rooms", such as IRC channels? --BigSmoke
- Location is a problem that can be clearly separated from chats. We should stick to solving the smallest problem possible, so we can more easily combine microformats later to solve larger problems. --Scott Reynen
- On chat-formats and chat-examples, IRC logs are used. I would say we should include IRC logs in our spec -- it just makes sense to design for mult-user chat, because one-to-one messaging is just a special case of that. --Colin Barrett 05:33, 22 Aug 2006 (PDT)
Example playground
<div class="hchat-log">
  <p class="hchat-msg">
    <abbr class="time" title="YYYY-MM-DDTHH:MM:SS">HH:MM:SS</abbr>
    <!-- Please, fill me in -->
  </p>
</div>
hChatLog Strawman Proposal
Initially compiled by BenWard.
Following a post to uf-discuss, Colin Barrett requested I write up my hChatLog example in more detail, so I present it here as a straw man proposal to see if chat can gain any traction.
It's based around the initial brainstorming, but was initially led by my own preferences for mark-up. The class names introduced are reused or derived from other existing microformats, and new class names are based on the Unified Logging Format, as the element names in ULF are already human-friendly.
hChatLog Structure
- hChatLog
- message
- dt (or time)
- sender (which MUST be an hCard and SHOULD be a CITE element)
- 'quoted message content' (which MUST be a Q or BLOCKQUOTE element)
 
- status
- dt (or time)
- sender (which MUST be an hCard and SHOULD be a CITE element)
- type (from a predefined list of values)
- message (optional, the custom message the user assigns to a status, e.g. an 'away message')
 
 
- message
Notes about the above structure
- message, sender, status and type are all taken from ULF. Note that 'type' also corresponds with tel > type and adr > type in hCard.
- 'dt' is proposed as a derivative of 'dtstart' and 'dtend' as used in hCalendar, however 'time' (again from ULF) may be preferable
- The message text does not have a class name, and is instead identified as from quoted text within a message, marked up with Q (for common single line messages) or BLOCKQUOTE (for multiline or otherwise complex messages).
List of status 'type' values
This should be a single-word list of the common status types from current IM implementations, namely:
- Online
- Offline
- Away
- Busy
- BRB (Be Right Back)
- Lunch (Out To Lunch)
- Phone (On The Phone)
Example
Note that this example uses an OL as the container, with LI elements for each message/status. This mark-up scheme for dialogue is subject to opinions of how strictly an Ordered List should be specified. For this reason OL/LI mark-up as is not specified as 'SHOULD' or 'MUST' in the structure above. The WHATWG list was recently included a discussion about tightening of the definition of OL for HTML5 (no definitive resolution was made though).
<!-- ‘hChatLog’ straw man by Ben Ward -->
<ol class="hChatLog">
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:22:00+0100">1:22am</abbr>: 
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>: 
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=BenWardcouk">Ben</a></cite>
    <blockquote>
			<p>Hello Hanni</p>
    	<p>How're you today?</p>
		</blockquote>
  </li>
  <li class="status"> <!-- not 'event' -->
    <abbr class="dt" title="2006-08-08T01:27:00+0100">1:27am</abbr>: 
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dt" title="2006-08-08T01:28:00+0100">1:28am</abbr>: 
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a>
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>
In this example, two messages are exchanged followed by two status changes. Each message contains an hCard, identifying the user's nickname (which should map to the user's 'display name' or 'screen name') and uses the form proposed in hcard-examples for New Types of Contact Info to identify the AIM usernames.
Messages are quoted. Single line messages in Q elements and multiline messages in BLOCKQUOTE. There's no reason to limit each message to contain only one Q or BLOCKQUOTE, as depending on the precision of the timestamps being used, it may be appropriate to allow messages from the same individual to be placed together.
Status messages contain the DT, SENDER and then a TYPE. Since we're presenting humans-first information here, note that the first status change ('Hanni went away') uses the exact status type, the second ('Hanni came back') uses the abbr-pattern to embed the 'online' status type name.
It may be necessary to integrate the IM service URL patterns into the 'hChatLog' microformat itself, depending on whether they are treated as a formal part of hCard yet, but a means to identify service usernames with less reliance on implications is needed (for example, MSN accounts are identified by @hotmail, @msn or @passport domains, which is not inclusive of MSN Messenger users with their own domains, while other IM services may not have a URI scheme at all).
Example of 'chat-username' class, extending hCard
Whilst fitting this into the process needs to be clarified, it would be clearest to introduce a 'chat-username' class within hCards in hChatLog, to identify usernames.
<li class="message"> <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>: <cite class="sender vcard"><a class="url" href="aim:goim?screenname=BenWardcouk"><abbr class="chat-username" title="BenWardcouk"><span class="fn nickname">Ben</span></abbr></a></cite> <q>Hello</q> </li>
This fits into existing patterns OK, but loses one very important piece of information, the service provider. This could be spec'd as a prefix (<abbr class="chat-username" title="aim:BenWardcouk">â¦</abbr>) or could be a separate property altogether (chat-service).
Example with include-pattern
Introducing two new properties to each message hCard would create much clutter in the mark-up, and the repeition of hCards is already sub-optimal. The include-pattern in hCard can be used to keep the format cleaner, as demonstrated below.
Note: In this example, I'm still using the aim: URL form of representing usernames.
<html> <!-- … -->
<body>
<h1>Chat between Ben and Hanni – Friday October 26th</h1>
<h2>Participants</h2>
<ul>
  <li id="benwardcouk">
    <h3><a class="fn url nickname" href="aim:goaim?benwardcouk">Ben</a></h3>
		<p><a class="url" href="http://ben-ward.co.uk">Homepage</a></p>
		<p class="description">Ben is a 22 year old web application developer in Birmingham, England</p> 
  </li>
  <li id="hanni">
    <h3><a class="fn url nickname" href="aim:goaim?hanniusername">Hanni</a></h3>
		<p><a class="url" href="http://hanniross.com">Homepage</a></p>
  </li>
</ul>
<ol class="hChatLog">
  <li class="message">
    <abbr class="dtstart" title="2006-08-08T01:22:00+0100">1:22am</abbr>: 
    <cite class="sender vcard"><a class="include" href="#hanni">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dtstart" title="2006-08-08T01:25:00+0100">1:25am</abbr>: 
    <cite class="sender vcard"><a class="include" href="#benward">Ben</a></cite>
    <blockquote><p>Hello Hanni</p>
    <p>How're you today?</p></blockquote>
  </li>
  <li class="status">
    <abbr class="dtstart" title="2006-08-08T01:27:00+0100">1:27am</abbr>: 
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dtstart" title="2006-08-08T01:28:00+0100">1:28am</abbr>: 
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a> 
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>
In this last example, the repeated hCards are declared once at the top of the document, with additional detail, and then can be parsed into the log itself using the include-pattern. The A/@class=include pattern is used, rather than OBJECT, as it allows the user display name to be repeated in each message as text, but will be replaced when parsed.
Note: The VCARD class name is on the CITE elements in each message/status, rather than on the LIs where the hCards are declared in full. This is because the include-pattern currently only to works inside an hCard. It would be tidier to have the VCARD class on the LIs in the participants list, but unless the include-pattern can be extended to apply natively inside hChatLog, the hCards must be arranged like this.
Ideas
Using paragraphs to represent chat messages
I think that individual messages in a chat log should be formatted as XHTML paragraphs (<p>), because this is how conversations are commonly formatted. From the examples I gather that this is also what the ILRT Logger Bot currently does. --BigSmoke 13:09, 21 Jun 2006 (PDT)
- We can't assume all paragraphs are chat messages, so we'll need a class name to identify a chat message. Once a class name is identifying something as a message, what is the advantage of applying the additional stipulation of a specific HTML tag? It doesn't appear to aid parsing, and it only constrains publishers. --Scott Reynen
- I'm not convinced that âmessages are paragraphsâ is an overly fair assumption: Lots of chat is extremely fragmented into sentences (or even partial sentences). I'd be nervous about generalising the P element any further than it all ready.
I have a lot of love for Anne van Kesteren's chat mark-up (using Q elements for single line text, and BLOCKQUOTE > P for multiline messages, where the presence of newlines seems a more concrete basis on which to describe paragraph).
As far as block level element construction goes, AvK's mark-up again highlights the capability of raw HTML: OL is certainly correct, as is CITE and Q/BLOCKQUOTE. Paragraphs might not always be correct.
--BenWard 12:29, 24 Sep 2006 (PDT)
- No sooner do I say âOL is certainly correctâ but something comes up to question it. Those interested in developing hChat might also like to keep an eye on the WHATWG list where there's been some questioning of using OL for dialogue. Additionally, there's a fresh discussion on dialogue mark-up at Eric Meyer's blog.
--BenWard 03:41, 24 Oct 2006 (PDT)