chat-brainstorming
Requirements
Scope
Should a microformat matching these requirements only be capable of representing transcripts of existing IM protocols, or should it also be able to serve as an exchange format itself? This might be useful for simple AJAX IM platforms, although I suspect XMPP is by definition a better choice for such purposes. --BigSmoke 13:09, 21 Jun 2006 (PDT)
- Anything that can be parsed can be used in AJAX, so we don't need to consider this in developing a microformat. --Scott Reynen
How much, and what kind of data is going to be in the file? The work done by the Unified Logging Format WG has a pretty good overview of the various types of things that an IM client would want to log. Don't make too much of a bikeshed about it -- I'm mostly linking it because it's a good overview of the sorts of general element types (message, event, status) we probably want to use. --Colin Barrett 05:33, 22 Aug 2006 (PDT)
Chat rooms
Is it useful for this microformat to support the representation of "chat rooms", such as IRC channels? --BigSmoke
- Location is a problem that can be clearly separated from chats. We should stick to solving the smallest problem possible, so we can more easily combine microformats later to solve larger problems. --Scott Reynen
- On chat-formats and chat-examples, IRC logs are used. I would say we should include IRC logs in our spec -- it just makes sense to design for multi-user chat, because one-to-one messaging is just a special case of that. --Colin Barrett 05:33, 22 Aug 2006 (PDT)
Example playground
<div class="hchat-log">
  <p class="hchat-msg">
    <abbr class="time" title="YYYY-MM-DDTHH:MM:SS">HH:MM:SS</abbr>
    <!-- Please, fill me in -->
  </p>
</div>
hChatLog Strawman Proposal
Initially compiled by Ben_Ward. (Please note: There are likely to be a large number of slightly random typos in this first version of the document, this is entirely the fault of my keyboard which intermittently spews extra characters into sentences when I press Option or Command. Also, it's 1am. I will attempt to revise it when I have access to more reliable hardware!)
Following a vaguely related post to uf-discuss, Colin Barrett requested I write up my hChatLog example in more detail, so I present it here as a straw man proposal to see if chat can gain any traction. It's based off reading around the initial brainstorming, but is initially led by my own preferences for mark-up. Class names, however, have where possible been reused or derived from other existing Microformats, and new class names are generally based on the Unified Logging Format, simply because the element names in ULF are very much human-friendly.
hChatLog Structure
- hChatLog
  - message
    - dt (or time)
    - sender (which MUST be an hCard and SHOULD be a CITE element)
    - 'quoted message content' (which MUST be a Q or BLOCKQUOTE element)
  - status
    - dt (or time)
    - sender (which MUST be an hCard and SHOULD be a CITE element)
    - type (from a predefined list of values)
  - message
Notes about the above structure
- message, sender, status and type are all from ULF.
- 'dt' is proposed based on 'dtstart' and 'dtend' as used in hCalendar; however, 'time' (again from ULF) may be preferable.
- The message text does not have a class name, and is instead identified as being quoted text within a message, marked up with Q or BLOCKQUOTE.
- An alternative, pre-ULF draft name for 'status' was 'event'; however, this is bad as it conflicts too closely with hCalendar.
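As a sketch of the 'time' alternative mentioned in the notes above, a message could carry the ULF-derived class name on its timestamp instead of 'dt'. Nothing else in the entry changes. The screen name, timestamp and message text below are placeholder values, not taken from a real log.

```html
<!-- Sketch only: 'time' used in place of 'dt'; all values are illustrative -->
<ol class="hChatLog">
  <li class="message">
    <abbr class="time" title="2006-10-26T01:22:00+0100">1:22am</abbr>:
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
</ol>
```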
List of status type values
This should be a single-word list of the common status types from current IM implementations, namely:
- Online
- Offline
- Away
- Busy
- BRB (Be Right Back)
- Lunch (Out To Lunch)
The structure is better represented with a mark-up example. Note that this example uses an OL as the container, with LI elements for each message/status. This mark-up scheme for dialogue is oft debated and subject to opinions of how strictly an Ordered List should be specified, hence OL/LI mark-up is not listed as SHOULD or MUST in the structure above. It is worth watching the WHATWG list, as a tightening of the definition has recently been proposed for HTML5.
Example
<!-- ‘hChatLog’ straw man by Ben Ward -->
<ol class="hChatLog">
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:22:00+0100">1:22am</abbr>: 
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>: 
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=BenWardcouk">Ben</a></cite>
    <blockquote>
      <p>Hello Hanni</p>
      <p>How're you today?</p>
    </blockquote>
  </li>
  <li class="status"> <!-- not 'event' -->
    <abbr class="dt" title="2006-10-26T01:27:00+0100">1:27am</abbr>:
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dt" title="2006-10-26T01:28:00+0100">1:28am</abbr>:
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a>
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>
In this example, two messages are exchanged, followed by two status changes. Each message contains an hCard identifying the user's nickname (which should be the user's 'display name' or 'screen name') and uses the form proposed in hcard-examples for New Types of Contact Info to identify the AIM usernames.
Messages are quoted: single-line messages in Q elements and multiline messages in BLOCKQUOTE. There's no reason to limit each message to only one Q or BLOCKQUOTE; depending on the precision of the timestamps being used, it may be appropriate to allow messages from the same individual to be placed together.
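To illustrate the point about grouping, here is a sketch of one message entry holding two consecutive single-line messages from the same sender under a single minute-precision timestamp. This grouping is an assumption about how a publisher might apply the idea; the proposal does not spell it out.

```html
<ol class="hChatLog">
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
    <cite class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=BenWardcouk">Ben</a></cite>
    <!-- Two lines sent within the same minute, kept as separate quotations -->
    <q>Hello Hanni</q>
    <q>How're you today?</q>
  </li>
</ol>
```

A parser could then treat each Q or BLOCKQUOTE inside the entry as a separate message sharing the entry's timestamp and sender.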
Status messages contain the timestamp, sender and then a TYPE. Since we're presenting humans-first information here, note that while the first status change ('Hanni went away') uses the exact status type, the second ('Hanni came back') uses the abbr-pattern to embed the 'online' status type.
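The same abbr-pattern could carry the other values from the status type list. A sketch, assuming lowercase single-word values (here 'lunch') in the title attribute, which the proposal has not fixed; the timestamp and wording are invented:

```html
<ol class="hChatLog">
  <li class="status">
    <abbr class="dt" title="2006-10-26T12:30:00+0100">12:30pm</abbr>:
    <span class="sender vcard"><a class="fn nickname url" href="aim:goim?screenname=HanniUsername">Hanni</a>
    <!-- Human-readable text in the element, machine-readable type in the title -->
    <abbr class="type" title="lunch">went out to lunch</abbr></span>
  </li>
</ol>
```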
It may be necessary to first integrate the IM service URLs into any hChatLog microformat, and also provide a means to identify service usernames with less reliance on implications (for example, MSN accounts are identified by hotmail, msn or passport domains, which is not inclusive of MSN Messenger users with their own domains).
Example of 'chat-username' class, extending hCard
Whilst fitting this into the process needs to be clarified, it would be clearest perhaps to introduce a 'chat-username' class within hCards in hChatLog, to make usernames more explicit.
<li class="message">
  <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
  <cite class="sender vcard"><a class="url" href="aim:goim?screenname=BenWardcouk"><abbr class="chat-username" title="BenWardcouk"><span class="fn nickname">Ben</span></abbr></a></cite>
  <q>Hello</q>
</li>
This fits into existing patterns OK, but loses one very important piece of information, the service provider. This could be spec'd as a prefix (<abbr class="chat-username" title="aim:BenWardcouk">…</abbr>) or could be a separate property altogether.
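The two options could look roughly as follows. The 'aim:' prefix form is taken from the paragraph above; the 'chat-service' class name in the second entry is purely hypothetical, invented here only to show what a separate property might look like.

```html
<ol class="hChatLog">
  <!-- Option 1: service given as a prefix in the chat-username title -->
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:25:00+0100">1:25am</abbr>:
    <cite class="sender vcard">
      <a class="url" href="aim:goim?screenname=BenWardcouk"><abbr class="chat-username" title="aim:BenWardcouk"><span class="fn nickname">Ben</span></abbr></a>
    </cite>
    <q>Hello</q>
  </li>
  <!-- Option 2: service as a separate property; 'chat-service' is a made-up name -->
  <li class="message">
    <abbr class="dt" title="2006-10-26T01:26:00+0100">1:26am</abbr>:
    <cite class="sender vcard">
      <a class="url" href="aim:goim?screenname=BenWardcouk"><abbr class="chat-username" title="BenWardcouk"><span class="fn nickname">Ben</span></abbr></a>
      (<abbr class="chat-service" title="aim">AIM</abbr>)
    </cite>
    <q>How're you today?</q>
  </li>
</ol>
```

The prefix keeps everything in one property but asks parsers to split the title value; the separate property avoids that at the cost of more mark-up per sender.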
Example with include-pattern
Introducing two new properties to each message sender would create some clutter in the mark-up, as (arguably) the repetition of hCards already does. We can use the include-pattern in hCard to clean this up, as demonstrated in the following HTML document.
In this example, I'm back to using the aim: URL form of representing usernames.
<html> <!-- … -->
<body>
<h1>Chat between Ben and Hanni – Friday October 26th</h1>
<h2>Participants</h2>
<ul>
  <!-- These are, of course, hCards in their own right -->
  <li id="benwardcouk">
    <h3><a class="fn url nickname" href="aim:goim?screenname=benwardcouk">Ben</a></h3>
    <p><a class="url" href="http://ben-ward.co.uk">Homepage</a></p>
    <p class="description">Ben is a 22 year old web application developer in Birmingham, England</p>
  </li>
  <li id="hanni">
    <h3><a class="fn url nickname" href="aim:goim?screenname=hanniusername">Hanni</a></h3>
    <p><a class="url" href="http://hanniross.com">Homepage</a></p>
  </li>
</ul>
<!-- HTML semantics suggest an Ordered List is best for messages -->
<ol class="hChatLog">
  <!-- Class names here are lifted from the Unified Logging Format -->
  <li class="message">
    <abbr class="dtstart" title="2006-08-08T01:22:00+0100">1:22am</abbr>: 
    <cite class="sender vcard"><a class="include" href="#hanni">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
  <li class="message">
    <abbr class="dtstart" title="2006-08-08T01:25:00+0100">1:25am</abbr>: 
    <cite class="sender vcard"><a class="include" href="#benwardcouk">Ben</a></cite>
    <blockquote><p>Hello Hanni</p>
    <p>How're you today?</p></blockquote>
  </li>
  <li class="status">
    <abbr class="dtstart" title="2006-08-08T01:27:00+0100">1:27am</abbr>: 
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a> went <span class="type">away</span></span>
  </li>
  <li class="status">
    <abbr class="dtstart" title="2006-08-08T01:28:00+0100">1:28am</abbr>: 
    <span class="sender vcard"><a class="include" href="#hanni">Hanni</a> 
    <abbr class="type" title="online">came back</abbr></span>
  </li>
</ol>
</body>
</html>
So in this last example, the repeated hCards are declared once at the top of the document, with extended details, and then can be parsed into the log itself using the include pattern.
Note: technically, in the above the vcard class name is on the CITE elements rather than the LIs where the hCards are declared in full. This is because the include-pattern is currently specified only to work inside an hCard. It would be tidier to have the vcard class on the LIs for the participants list, but unless the include-pattern can be extended to apply natively inside hChatLog, the hCards must be arranged like this.
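For comparison, the tidier arrangement described in the note might look like the sketch below, with the vcard class on the participant LI and a bare include inside the sender. This assumes the include-pattern were extended to work directly inside hChatLog, which is not the case as currently specified, so it should be read as a what-if rather than as mark-up that parsers handle today.

```html
<ul>
  <!-- vcard declared once, on the participant entry itself -->
  <li class="vcard" id="hanni">
    <h3><a class="fn url nickname" href="aim:goim?screenname=hanniusername">Hanni</a></h3>
  </li>
</ul>
<ol class="hChatLog">
  <li class="message">
    <abbr class="dtstart" title="2006-08-08T01:22:00+0100">1:22am</abbr>:
    <!-- Hypothetical: include resolved by hChatLog, not by an enclosing hCard -->
    <cite class="sender"><a class="include" href="#hanni">Hanni</a></cite>
    <q>Hello Ben</q>
  </li>
</ol>
```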
Ideas
Using paragraphs to represent chat messages
I think that individual messages in a chat log should be formatted as XHTML paragraphs (<p>), because this is how conversations are commonly formatted. From the examples I gather that this is also what the ILRT Logger Bot currently does. --BigSmoke 13:09, 21 Jun 2006 (PDT)
- We can't assume all paragraphs are chat messages, so we'll need a class name to identify a chat message. Once a class name is identifying something as a message, what is the advantage of applying the additional stipulation of a specific HTML tag? It doesn't appear to aid parsing, and it only constrains publishers. --Scott Reynen
- I'm not convinced that ‘messages are paragraphs’ is an overly fair assumption: lots of chat is extremely fragmented into sentences (or even partial sentences). I'd be nervous about generalising the P element any further than it already is.
I have a lot of love for Anne van Kesteren's chat mark-up (using Q elements for single-line text, and BLOCKQUOTE > P for multiline messages, where the presence of newlines seems a more concrete basis on which to describe a paragraph).
As far as block level element construction goes, AvK's mark-up again highlights the capability of raw HTML: OL is certainly correct, as is CITE and Q/BLOCKQUOTE. Paragraphs might not always be correct.
--BenWard 12:29, 24 Sep 2006 (PDT)
- No sooner do I say ‘OL is certainly correct’ than something comes up to question it. Those interested in developing hChat might also like to keep an eye on the WHATWG list, where there's been some questioning of using OL for dialogue. Additionally, there's a fresh discussion on dialogue mark-up at Eric Meyer's blog.
--BenWard 03:41, 24 Oct 2006 (PDT)