[uf-discuss] [chat] Microformats are not for data storage

Colin Barrett timber at lava.net
Mon Oct 30 01:37:32 PST 2006


On Oct 29, 2006, at 8:39 PM, Chris Messina wrote:

> Whohoo! A point of real debate!

:)

> Well, as I did when we first spoke months ago, I firmly disagree, and
> over time, I think you will be proven wrong. There's simply no point
> in having multiple instantiations of the same data in a text-based
> format (I'm exempting relational databases).

Mmm, I think you're confusing the issues of immediate presentation to  
the user and long term storage (one of which is a *human* problem and  
the other is a *machine* problem). In many cases -- specifically  
Adium's case -- the on-disk logs are going to be read by things that  
aren't even *close* to a web browser -- Spotlight, for instance.  
Gaim's log viewer is another example.

Also, think about an AJAXy web app (like Meebo) that is constantly
modifying its DOM tree. I don't think that markup, with all of its
included JS (and the fact that it might need a server connection), is
suitable for long term storage and parsing. This is another reason
against using a uF for long term storage -- you can have uFs in web
pages with all sorts of extraneous information on them.

I think it's very telling that you exempt relational databases. I  
could imagine a uF for specifying a table (<table> anyone?) and the  
relations therein -- this could of course be easily parsed and  
modified by JS. The way I see it, you're already acknowledging the  
need for machine-friendly data that is separate from human-friendly  
data (uFs).

> In any case -- I know where you're coming from and I appreciate you
> making a public statement about your position. This will allow us to
> finally have this conversation out in daylight.

Thanks! I agree, it's important to have this out in the open. On that  
note, I'm curious if anything you've worked on might provide  
additional examples for discussion?

> Your point has been that straight XML is easier to parse than XHTML --
> which could include linked CSS for use in presentation, or JS to
> add/modify behavior.

A nitpicky, off-topic point: XML can include CSS and JS and can be  
styled.

In any case, my real point is that it's not the XHTML-nature of uFs
that makes them harder to parse. Rather, it's the nature of
microformats themselves: they have to be much more flexible to interact
with the other kinds of markup and presentation issues that exist on
the web. A good
example that popped up on the list recently was hCard's powerful (but  
tricky to parse) value class. A log that exists for machines to parse  
definitely doesn't need something like that.
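
To make the value class point concrete, here is a rough Python sketch of what a consumer has to do to pull a phone number out of an hCard-style snippet. The markup, the helper names (has_class, find_property, property_value) and the whitespace handling are my own simplifications for illustration, not the hCard parsing rules in full; nested "value" elements, for instance, are not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical hCard-style snippet (well-formed XHTML so ElementTree can read it).
HCARD = """
<div class="vcard">
  <span class="fn">Example Person</span>
  <span class="tel">
    <span class="type">home</span>:
    <span class="value">+1</span> <span class="value">555</span>
    <span class="value">0100</span>
  </span>
</div>
"""

def has_class(elem, name):
    """True if `name` is one of the space-separated class names on elem."""
    return name in elem.get("class", "").split()

def find_property(root, name):
    """Return the first element carrying the given class name, or None."""
    for elem in root.iter():
        if has_class(elem, name):
            return elem
    return None

def property_value(prop):
    """Simplified value excerpting: if any descendant has class "value",
    concatenate those pieces; otherwise use the element's whole text."""
    parts = [
        "".join(e.itertext()).strip()
        for e in prop.iter()
        if e is not prop and has_class(e, "value")
    ]
    if parts:
        return "".join(parts)
    # No "value" descendants: collapse whitespace in the full text content.
    return " ".join("".join(prop.itertext()).split())

root = ET.fromstring(HCARD)
tel = property_value(find_property(root, "tel"))
fn = property_value(find_property(root, "fn"))
```

Even in this cut-down form the reader has to walk the tree, test class lists, and decide which text counts. A log written for machines could just carry the number in a single attribute or element.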


> Furthermore, as new chat programs emerge, the ability to move data
> between them in a format that works in *any browser* seems like an
> investment in the future, as opposed to the present and recent past.

My main problem with this is the word "browser". Adium is not a web
browser. It is a user agent, in that it displays (X)HTML, but it does
no browsing of the web. It is another instance of a non-browser working
with HTML; the Spotlight and Gaim examples above show the same need for
something that is easily read by a non-browser.

As far as in-browser elements like Google's Gtalk web chat, those logs  
are stored on-disk by the server and then displayed to the user.  
Storing the entire web page (or a section thereof) for the log file  
that will be shown to the user is impossible -- Gmail is a highly  
dynamic web page. Thus, a uF offers no real advantage in this  
case: the file is going to have to be read in and parsed anyway, so  
why not use something easier for machines (like XML)?
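
As a rough illustration of "easier for machines", here is a Python sketch that reads a small XML chat log. The element and attribute names (chat, message, sender, time) and the sample data are made up for this example; this is not the actual ULF schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML chat log (NOT the real ULF schema).
LOG = """
<chat account="alice@example.com" service="ExampleIM">
  <message sender="alice@example.com" time="2006-10-30T01:37:32-08:00">Hi there</message>
  <message sender="bob@example.com" time="2006-10-30T01:38:05-08:00">Hello!</message>
</chat>
"""

def read_messages(xml_text):
    """Return a list of (sender, time, text) tuples, one per message element."""
    root = ET.fromstring(xml_text)
    return [
        (msg.get("sender"), msg.get("time"), msg.text or "")
        for msg in root.findall("message")
    ]

messages = read_messages(LOG)
for sender, time, text in messages:
    print(f"[{time}] {sender}: {text}")
```

Every field sits in one fixed place, so the reader needs no class-name matching and no knowledge of presentation markup.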

> Lastly, the unknown uses and mashability of XHTML is one of the most
> important elements of using XHTML as opposed to custom XML.

Another nitpicky, off-topic issue: ULF isn't exactly custom, given that
it's a published standard and has multiple implementations.

Otherwise, this is a good point, and one of the reasons why I'm not  
discounting a uF for chat altogether -- I'm just arguing about the  
specific *focus* the format will take.

> These are a few of the basic assumptions that I work under. I would be
> very happy if you would discount them, one at a time. ;)

Hopefully I've done a satisfactory job. Anyone else can feel free to  
jump in, btw.

-Colin

> On 10/29/06, Colin Barrett <timber at lava.net> wrote:
>> Particularly, I'm going to be talking about "hChat".
>>
>> Personally, I see the primary use case of a chat microformat to be
>> displaying chat contents in a live (or semi-live) way. There are
>> quite a few examples of this in the wild -- web-based IM clients like
>> Meebo, and Adium and Kopete's message view, both of which are HTML
>> based.
>>
>> In contrast, most data formats are not HTML based (XML based, usually),
>> or if they are, do not use modern HTML (e.g. AOL's HTML log format).
>>
>> I think the chat group should alter its focus to providing ways to
>> semantically talk about the structure of a chat and a particular
>> message entry. One of the most obvious benefits in this area is much
>> better clipboard support -- copying things out of Meebo and Adium/
>> Kopete's HTML view usually results in an un-useful mess.
>>
>> Another important use case, that I think there has been some research
>> towards already, is "snippets". That is, small snippets of an IM
>> conversation in a blog post. This, conveniently, goes along with my
>> earlier point about copy-paste.
>>
>> Just a couple of other points that have been percolating:
>>   - the use of hCard to mark up people's names
>>   - Time zone information is important -- particularly wrt. the time
>> zone of the sender and the time zone of the receiver.
>>
>> This email has been a long time coming -- I've waited for a couple of
>> months, just lurking and trying to understand the process. Should
>> people agree that this is the direction research should go, I'll start
>> working on getting links on the wiki to examples of documentation of
>> chat representations. I think that chat *logging* is a problem
>> already well solved by XML, and microformats shouldn't be butting their
>> heads in -- especially since most of the time full logs aren't
>> available on the web (the uF log bot being a notable, and interesting,
>> exception).
>>
>> In the interest of full disclosure: I'm one of the developers of Adium
>> -- we're looking into improving our message view, and copy/paste has
>> been something that's plagued us for a while. I'm also one of the
>> primary authors of an XML based log format that's been adopted by
>> Adium, Gaim and Kopete.
>>
>> --
>> Colin Barrett
>> Developer, Adium
>> http://adiumx.com
>> _______________________________________________
>> microformats-discuss mailing list
>> microformats-discuss at microformats.org
>> http://microformats.org/mailman/listinfo/microformats-discuss
>>
>
>
> -- 
> Chris Messina
> Citizen Provocateur &
>  Open Source Ambassador-at-Large
> Work: http://citizenagency.com
> Blog: http://factoryjoe.com/blog
> Cell: 412 225-1051
> Skype: factoryjoe
> This email is:   [ ] bloggable    [X] ask first   [ ] private