[uf-discuss] [chat] Microformats are not for data storage

Colin Barrett timber at lava.net
Mon Oct 30 06:17:53 PST 2006


On Oct 30, 2006, at 3:39 AM, Greg Elin wrote:

> Woohoo! I'm excited by this thread as well and have had more than a
> couple of conversations with Kevin regarding the need for a chat
> microformat. I'm jumping in with 3 points and a disclosure of my own.
> Thanks for piping up Colin - I love Adium.

You're welcome! :) Glad to see this thread generating interest. I was  
afraid it would go without comment. :)

> Point 1: Chat is not just for chat anymore.
> -------------------------------------------------------------
> Continuous pressure to multi-task, work with people geographically
> dispersed (or silently with others in the same room), and innovate
> quickly, plus the very human need for back-channel signaling and
> communication, is making chat one of the dominant channels of
> communication. The need for a better abstract chat
> storage/exchange/display format is being driven by more than instant
> messaging:

Definitely agree with your premise here -- I see stuff like this
happening all the time.

>  - growing use of chat as a back-channel at live events

Agreed.

>  - growing behavior of cutting and pasting chat snippets

Hugely agree.

>  - lack of ability to search multiple chat logs online as
> conversations the way we can
>    blogposts and even forum postings to some extent

Hmm, I'm not so sure. Adium's search works pretty well for me. Once
1.0 comes out we'll be integrating OS-wide with Spotlight. It would be
nice if I could get Spotlight to search Google as well, or work in a
Google-like way, but that's a different problem. The alternative is,
of course, to plug Google into the chat logs, but I would rather not
have all of my chat transcripts indexed by GOOG ;)

>  - continuing need for IRC-to-Web-based chat gateways

In the "view only" direction, yes, definitely (rw is basically a web
IRC client, and isn't a very interesting problem). Especially with a
way to mark which events should be pushed through the gateway and
which should not.

>  - ability to use chat-stream for tracking attention-oriented events
> (Attention Stream)

Could you expound a little more here, please?

> Point 2 - XML vs. XHTML vs. Microformats
> ---------------------------------------------------------------
> It strikes me there are two major threads Colin and Chris are
> discussing and it might be helpful to separate them a bit, at least
> for me.
>
> One theme is a microformat for chat. The other is a larger debate as
> to whether or not XML eventually collapses into something microformats
> or microformats-like. That is, can I express any given sequence of XML
> elements
>
> <item><subitem> data </subitem> </item>
>
> as a sequence of properly classed divs (e.g., XHTML)
>
>  <div class='item'><div class='subitem'> data </div> </div> ?
>
> Am I right that there are two threads here? If so, can someone catch
> me up on the state of discussion of the second thread? (There also
> seems to be a related thread of display v. storage formats.) Pulling
> apart the discussion -- if it can be -- would be helpful in catching
> me up. (For me, this has hit home with Fotonotes since it is necessary
> to translate any storage/exchange format (either database or XML) into
> classed XHTML in order for the increasingly universal render engine of
> the Browser+Javascript to manipulate it.  Should I define my own XML
> and respective elements, shoe-horn functionality into an existing
> format like Atom, or just $&%@-it and store it as XHTML with classes?)

There is in fact a second (actual) thread, oddly enough. I started one  
(or replied to one) a couple months earlier -- It's not in my personal  
archives, and I haven't gone through and searched the pipermail  
interface yet. It's somewhere back there, though. That's kind of what  
Chris is referring to. I don't think it's horribly interesting from  
what I remember.

To go back to your concept of threads of conversation (and not threads
of mail messages): I think you *can* render any XML document as a uF,
for several reasons, the most obvious being that XHTML is already XML
(the X in XHTML). The question is really *do you want to*. Assuming
you can only use SAX, it's more straightforward to parse a plain XML
document than an XHTML one, but not that much harder. The real trouble
is that uFs can exist in HTML as well, which SAX will throw up on.
Then you need to invoke an HTML parser, and since you've gone that
far, you'll probably just use JS. When you're indexing thousands of
logs, some of which may be multiple megabytes in size, things like
this become important to think about.
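
To make the XML-versus-classed-divs comparison concrete, here is a
minimal Python sketch (an illustration only, not anything Adium
ships). It rewrites the <item>/<subitem> example as classed divs,
reads the result back with SAX, and shows SAX rejecting markup that is
legal HTML but not well-formed XML. The names to_classed_divs,
ClassCollector and sax_accepts are made up for this example.

```python
import xml.etree.ElementTree as ET
import xml.sax


def to_classed_divs(element):
    """Recursively copy an XML element into <div class="tagname"> form."""
    div = ET.Element("div", {"class": element.tag})
    div.text = element.text
    div.tail = element.tail
    for child in element:
        div.append(to_classed_divs(child))
    return div


class ClassCollector(xml.sax.ContentHandler):
    """SAX handler that records the class attribute of every div it sees."""

    def __init__(self):
        super().__init__()
        self.classes = []

    def startElement(self, name, attrs):
        if name == "div" and "class" in attrs:
            self.classes.append(attrs["class"])


def sax_accepts(markup):
    """Return True if a SAX parser can read the markup without error."""
    try:
        xml.sax.parseString(markup.encode("utf-8"), xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


# The example from the quoted message, converted to classed divs.
source = "<item><subitem> data </subitem> </item>"
xhtml = ET.tostring(to_classed_divs(ET.fromstring(source)), encoding="unicode")

# Well-formed XHTML goes through SAX like any other XML document.
collector = ClassCollector()
xml.sax.parseString(xhtml.encode("utf-8"), collector)
print(xhtml)
print(collector.classes)

# Tag-soup HTML (unquoted attribute, unclosed <br>) is where SAX gives up.
print(sax_accepts("<div class=item><br></div>"))
```

The conversion itself is mechanical; the cost shows up in the last
line, where anything that is HTML but not XML needs a different
parser.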

Display v. Storage is definitely something I think Chris and I differ
on. Which makes sense -- Chris is very interested in and passionate
about uFs, so it makes sense that he will initially propose a uF to
solve a problem. I wouldn't call myself "passionate" about XML, but
a) it does the trick, b) there are loads of tools for parsing and
working with it, and c) it is fairly easy to get others to adopt -- in
fact the Gaim people were already going to implement it, heard about
our effort, and jumped on board. FWIW, they too are experimenting with
an SQL-based log format.

Finally, the problem you're talking about at the end of the above
paragraph is exactly what we were talking about with Adium. We ended
up deciding to make an XML format and standardize it across as many
projects as we could. :)

>
>
> Point 3 - Chat as Command Line Prompt for People
> ----------------------------------------------------------------------------
> The more I use both chat and the command line, the more I experience
> the command line as a means of IM'ing my computer and chat as a way of
> responding to the people who have 'accounts' with me and need my
> functionality and processing power (and a way of getting status and
> queuing tasks for people to whom I am logged in).

That makes a *lot* of sense. *thinks about making a terminal service  
type for Adium*.

> You can only type in one text box at a time.  And I've begun to see
> more and more interface usage -- what I actually do at the computer --
> as typing one line at a time and more interfaces collapsing into the
> chat window. Email on my Blackberry, SMS messages, IRC, Chat, editing
> subetheredit or writely document, responding to gmail, adding an
> annotation to a photo, blog comments, AJAX-enabled in-line editing,
> forums, etc.

Well, that makes sense -- the keyboard is our primary input mechanism  
for text.

> Everywhere I turn I see a textbox with an long vertical tail of text
> trailing above it.

Almost sounds like one of those old arguments against GUIs ;)

> In every application I look at, I increasingly see friendlier "command
> lines". AJAX is a BIG DEAL b/c it makes web pages truly interactive in
> a way Web 1.0 pages never were. (Web 1.0 pages are like computer punch
> cards with prettier interfaces: submit a batch of instructions at one
> time and wait for the computer to print you back a page of fixed
> output. Repeat.)
>
> Instead of dashboard at the bottom of my Apple desktop I want
> "textbox" -- a simple area I could always type and with a few controls
> (or ultra simple /commands) do my work.

Tried Quicksilver[1] yet? Example run: ctrl-space 'killall Dock r  
<return>

That's equivalent to running killall Dock on the command line (which
is needed more often than you'd think...).

> Like Adium's wonderfully
> tabbed chat interface, or Flock's context aware reader/posting
> functionality...my 'textbox' would enable me to type in my box and
> direct my output to the context I wanted -- or several simultaneously:
> a chat to my colleague, a tagged note or todo to myself, a post to my
> blog, save a tagged url to delicious, an SMS to twitter, a google
> search, or a command line for the Internet. (The location bar is kind
> of a nascent command line for the Internet if you think about it,
> accepting an increasingly broad range of input. Non-linear visual
> applications like Photoshop and Final Cut are obviously not
> command-line oriented...but look again at Photoshop's non-destructive
> action tracker or Apple's use of core graphics and video and you will
> see a tracked sequence of commands.)

Yeah, you're basically describing QS. It's freaking fantastic. I
highly, highly suggest you, and any other Mac users, try it out.

> Chat, in the sense of IRC and IM, is a special case of a "live
> interactions + log" where most of the messages are status queries and
> responses between people. It is going to be hugely beneficial to keep
> in mind the more abstract representation of this type of interaction when
> thinking about formats for representing Chat. Even at the level of IRC
> and IM, we need a microformats structure for chat that recognizes chat
> as something larger than chat...as a sequence of discrete messages, or
> as "packets of conversation". And each packet can have value in
> contexts *outside* of the specific conversation of which it is a part.

I kind of agree? This point is a bit nebulous.

> For example, why do I need to leave my chat window to open my
> calendar to note an appointment? Or, why doesn't an appropriate
> modification to a document my colleague IMs me show up as a note I
> can accept inside my Word document?  I would love -- just love -- if
> my chat tools, like Adium, allowed me to chat with myself ...
> including my future and past selves ... as easily as it does other
> people. Give me the chat interface where I can type to myself and tag
> it to find it later. (Yes, I know a few apps of the endless journal
> variety are exploring this. That is precisely my point.)

http://www.apple.com/macosx/leopard/mail.html

The new notes feature there actually has a public API any app on the  
OS can access. We plan to use that in Adium in some way :)

> Each snippet has an originating source and context...and a target
> source and context. A message *is* as well as *is about*. Right now,
> our Chat applications are very stupid and just pass messages depending
> upon the humans at either end (or the bots) to recognize the context.
> That is a very good thing. Yet, a few more hooks -- like tagging a
> message or recognizing the creator and recipient(s) operate in
> different yet overlapping contexts -- would enable the computer to
> handle certain functionality more easily and enable new functionality.
> The microformat for chat needs to represent more than a message
> with a time stamp.

Definitely agree that a microformat would need more than just a
message and a time stamp. Still, I'm not convinced that an on-disk
storage system for a chat application needs more than that. Address
Book apps can be polled to get information to generate an hCard for a
contact, for example. That would be really useful for Adium. Want to
email someone you're chatting with? Click on a contact's name and an
hCard pops up, then you click on the mailto: URL and BAM you're in
your mail client with the last three or so messages included in the
mail in plain text (of course ;) as a reminder context.
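
A rough sketch of that hCard idea in Python, assuming the contact has
already been pulled out of an address book into a plain dict. The real
Address Book API is not shown, and contact_to_hcard and
mailto_with_context are hypothetical names, not Adium functions. Only
the hCard class names (vcard, fn, email) come from the hCard format
itself.

```python
from html import escape
from urllib.parse import quote


def contact_to_hcard(contact):
    """Render a contact dict as a minimal hCard fragment."""
    parts = ['<span class="fn">{}</span>'.format(escape(contact["name"]))]
    email = contact.get("email")
    if email:
        safe = escape(email)
        parts.append('<a class="email" href="mailto:{0}">{0}</a>'.format(safe))
    return '<div class="vcard">' + " ".join(parts) + "</div>"


def mailto_with_context(email, messages, count=3):
    """Build a mailto: URL whose body holds the last few plain-text messages."""
    body = "\r\n".join(messages[-count:])
    return "mailto:{}?body={}".format(quote(email, safe="@"), quote(body))


# Hypothetical contact record and chat history, for illustration only.
contact = {"name": "Greg Elin", "email": "greg@example.com"}
history = ["one", "two", "three", "four"]

print(contact_to_hcard(contact))
print(mailto_with_context(contact["email"], history))
```

The log on disk still only needs the message and the time stamp; the
hCard and the mailto: link are generated at display time from data
that lives elsewhere.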

> Disclosures
> ------------------
> I got sucked into chat when David Isenberg asked me to help create an
> "in-room chat" system in 2001 that included making the chat readable
> on a large screen. That experience enlarged my thinking about the
> nature of chat and how small changes in the UI had large social
> effects. I've gotten to work with Manuel Kiessling, creator of A Really
> Simple Chat, one of the first smooth web-based chats before Ajax took
> hold, and together we experimented with using the chat engine/metaphor
> for an "Attention Stream" at Etech. I've done a bit of other hacking
> to see if I could extend the chat metaphor to the kind of generic UI
> described above.  Jerry Michalski has tried to promote a
> more-context-flexible interface called "linkido" where a simple data
> entry is provided, the output of which can be targeted (and adjusted)
> for different messaging contexts: IM, word document, blog post, email,
> or simultaneous combinations of contexts.

Very interesting stuff! Could you hook me up with URLs to those? It
sounds like you have a lot of interesting stuff to offer to the group
in general.

-Colin

[1] http://quicksilver.blacktree.com/

> Greg Elin
>
>
> On 10/30/06, Colin Barrett <timber at lava.net> wrote:
>> On Oct 29, 2006, at 8:39 PM, Chris Messina wrote:
>>
>> > Whohoo! A point of real debate!
>>
>> :)
>>
>> > Well, as I did when we first spoke months ago, I firmly disagree,
>> > and over time, I think you will be proven wrong. There's simply no
>> > point in having multiple instantiations of the same data in a
>> > text-based format (I'm exempting relational databases).
>>
>> Mmm, I think you're confusing the issues of immediate presentation to
>> the user and long term storage (one of which is a *human* problem and
>> the other is a *machine* problem). In many cases -- specifically
>> Adium's case -- the on-disk logs are going to be read by things that
>> aren't even *close* to a web browser -- Spotlight, for instance.
>> Gaim's log viewer is another example.
>>
>> Also, think about an AJAXy web app (like Meebo) that constantly is
>> modifying its DOM tree. I don't think that that markup and all of its
>> included JS (and also the fact that it might be necessary to have a
>> server connection) is suitable for long term storage and parsing.
>> This is another reason against using a uF for long term storage --
>> you can have uFs in web pages with all sorts of extraneous
>> information on them.
>>
>> I think it's very telling that you exempt relational databases. I
>> could imagine a uF for specifying a table (<table> anyone?) and the
>> relations therein -- this could of course be easily parsed and
>> modified by JS. The way I see it, you're already acknowledging the
>> need for machine-friendly data that is separate from human-friendly
>> data (uFs).
>>
>> > In any case -- I know where you're coming from and I appreciate you
>> > making a public statement about your position. This will allow us  
>> to
>> > finally have this conversation out in daylight.
>>
>> Thanks! I agree, it's important to have this out in the open. On that
>> note, I'm curious if anything you've worked on might provide
>> additional examples for discussion?
>>
>> > Your point has been that straight XML is easier to parse than  
>> XHTML --
>> > which could include linked CSS for use in presentation, or JS to
>> > add/modify behavior.
>>
>> A nitpicky, off-topic point: XML can include CSS and JS and can be
>> styled.
>>
>> In any case, my real point is that it's not the XHTML-nature of uFs
>> that makes them harder to parse. Rather, it's something about
>> microformats.
>> Microformats have to be much more flexible to interact with the other
>> kinds of markup and presentation issues that exist on the web. A good
>> example that popped up on the list recently was hCard's powerful (but
>> tricky to parse) value class. A log that exists for machines to parse
>> definitely doesn't need something like that.
>>
>>
>> > Furthermore, as new chat programs emerge, the ability to move data
>> > between them in a format that works in *any browser* seems like an
>> > investment in the future, as opposed to the present and recent  
>> past.
>>
>> My main problem with this is the word "browser". Adium is not a web
>> browser. It is a user agent, in that it displays (X)HTML, but it does
>> no browsing of the web. As mentioned above, this is another instance
>> of a non-browser working with HTML. See above for another example of
>> the need for something that is easily read by a non-browser.
>>
>> As far as in-browser elements like Google's Gtalk web chat, those  
>> logs
>> are stored on-disk by the server and then displayed to the user.
>> Storing the entire web page (or a section thereof) for the log file
>> that will be shown to the user is impossible -- Gmail is a highly
>> dynamic web page. Thus, a uF offers no real advantage in this
>> case: the file is going to have to be read in and parsed anyway, so
>> why not use something easier for machines (like XML).
>>
>> > Lastly, the unknown uses and mashability of XHTML is one of the  
>> most
>> > important elements of using XHTML as opposed to custom XML.
>>
>> Another nitpicky, offtopic issue: ULF isn't exactly custom, being  
>> that
>> it's a published standard and has multiple implementations.
>>
>> Otherwise, this is a good point, and one of the reasons why I'm not
>> discounting a uF for chat all together -- I'm just arguing about the
>> specific *focus* the format will take.
>>
>> > These are a few of the basic assumptions that I work under. I  
>> would be
>> > very happy if you would discount them, one at a time. ;)
>>
>> Hopefully I've done a satisfactory job. Anyone else can feel free to
>> jump in, btw.
>>
>> -Colin
>>
>> > On 10/29/06, Colin Barrett <timber at lava.net> wrote:
>> >> Particularly, I'm going to be talking about "hChat".
>> >>
>> >> Personally, I see the primary use case of a chat microformat to be
>> >> for
>> >> displaying of chat contents in a live (or semi-live) way. There  
>> are
>> >> quite a few examples of this in the wild -- web-based IM clients  
>> like
>> >> Meebo, and Adium and Kopete's message view, both of which are HTML
>> >> based.
>> >>
>> >> Contrarily, most data formats are not HTML based (XML based,
>> >> usually),
>> >> or if they are, do not use modern HTML (i.e. AOL's HTML log  
>> format).
>> >>
>> >> I think the chat group should alter its focus to providing ways to
>> >> semantically talk about the structure of a chat and a particular
>> >> message entry. One of the most obvious benefits in this area is  
>> much
>> >> better clipboard support -- copying things out of Meebo and Adium/
>> >> Kopete's HTML view usually results in an un-useful mess.
>> >>
>> >> Another important use case, that I think there has been some  
>> research
>> >> towards already, is "snippets". That is, small snippets of an IM
>> >> conversation in a blog post. This, conveniently, goes along with  
>> my
>> >> earlier point about copy-paste.
>> >>
>> >> Just a couple of other points that have been percolating:
>> >>   - the use of hCard to mark up people's names
>> >>   - Time zone information is important -- particularly wrt. the
>> >> time zone of the sender and the time zone of the receiver.
>> >>
>> >> This email has been a long time coming -- I've waited for a  
>> couple of
>> >> months, just lurking and trying to understand the process. Should
>> >> people agree that this is the direction research should go, I'll
>> >> start
>> >> working on getting links on the wiki to examples of
>> >> documentation of
>> >> chat representations. I think that the idea of chat *logging* is a
>> >> well-solved problem by XML and microformats shouldn't be butting
>> >> their
>> >> heads in -- especially since most of the time full logs aren't
>> >> available on the web (the uF log bot being a notable, and
>> >> interesting,
>> >> exception).
>> >>
>> >> In the interest of full disclosure: I'm one of the developers of
>> >> Adium
>> >> -- we're looking into improving our message view, and copy/paste  
>> has
>> >> been something that's plagued us for a while. I'm also one of the
>> >> primary authors of an XML based log format that's been adopted by
>> >> Adium, Gaim and Kopete.
>> >>
>> >> --
>> >> Colin Barrett
>> >> Developer, Adium
>> >> http://adiumx.com
>> >> _______________________________________________
>> >> microformats-discuss mailing list
>> >> microformats-discuss at microformats.org
>> >> http://microformats.org/mailman/listinfo/microformats-discuss
>> >>
>> >
>> >
>> > --
>> > Chris Messina
>> > Citizen Provocateur &
>> >  Open Source Ambassador-at-Large
>> > Work: http://citizenagency.com
>> > Blog: http://factoryjoe.com/blog
>> > Cell: 412 225-1051
>> > Skype: factoryjoe
>> > This email is:   [ ] bloggable    [X] ask first   [ ] private
>>
>>
>
>
> -- 
> Greg Elin
> http://fotonotes.net - Because photos have stories (TM)
> http://duhblog.com
