[uf-new] an equation/MathML/TeX microformat?

Sat Oct 27 06:24:33 PDT 2007

On Thu, 2007-10-25 at 15:39 -0700, Paul Topping wrote:

> I'm aware of the effort to support MathML in HTML5 but this effort seems
> unlikely to bear fruit. Besides, I'm looking to create something that
> will work in "plain old HTML" which, as I understand it, is part of the
> microformat philosophy. As I stated, MathML was originally intended to
> be implemented in browsers but the actuality leaves something to be
> desired. With HTML5, I would simply be waiting for yet another thing to
> be implemented in browsers. Exactly what I want to avoid.

Personally, I believe wanting to use MathML is one of the rare
use-cases that justifies choosing XHTML over HTML 4.01, in which case
you can inline MathML directly into your markup.

Can you provide some more details about the problem you're trying to
solve, in particular:

1. What are the deficiencies of existing techniques for using
mathematical markup in conjunction with HTML that you are realistically
expecting to overcome?

2. What are the key existing consuming agents you'd like the solution to
those deficiencies to work with? e.g. What platforms, browsers, and
assistive technology?

Currently, your company's MathPlayer ActiveX plugin can already speak
maths and has MSAA support for assistive technology:

http://www.dessci.com/en/products/mathplayer/tech/accessibility.htm

Gecko has built-in support for a subset of MathML; the FireVox extension
turns Firefox into a self-voicing browser on all three major platforms
and can read MathML using Abraham Nemeth's Mathspeak rules:

http://www.firevox.clcworld.net/features.html

The next version of Opera will include support for a subset of MathML
and restore (some level of) support for assistive technologies on
Windows and Mac: 

http://dev.opera.com/articles/view/can-kestrels-do-math-mathml-support-in/

http://my.opera.com/desktopteam/blog/2007/08/31/focus-areas-during-kestrel-development

The only major engine missing support for MathML at all is WebKit:

http://bugs.webkit.org/show_bug.cgi?id=3251

This situation could be a /lot/ better. 

Your problem seems very similar to the problem of inlining RDF in HTML:

http://infomesh.net/2002/rdfinhtml/

http://esw.w3.org/topic/EmbeddingRDFinHTML

The suggested approach of dumping XML markup inside comments was
actually used by Creative Commons and Movable Type's TrackBack system:

http://www.xml.com/pub/a/2003/01/15/creative.html

But I can't really see how embedding MathML markup inside HTML comments
is supposed to improve things with existing (or even future) consuming
agents. You did say this:

> Also, I want to put the MathML or TeX in the page, not in separate
> documents. Typical pages with math in them might have dozens of
> equations. Having their representation in separate files is
> inefficient but perhaps the biggest problem is that it makes authoring
> a lot more tedious as lots of small files have to be managed. 

The web stack tends to work by stitching together lots of different
resources to make up the final presentation the user experiences when
she visits a URI: multiple HTML documents, scripts, stylesheets, images,
Flash movies, and so on all get bound together by a master document. I'm
generally sceptical about attempts to fight against this model in the
name of efficiency without changing the HTML specification itself. 

There's also an issue of efficiency for consuming agents that don't
support MathML and would have to waste bandwidth on MathML embedded in
comments.

Having said that, additional requests are more expensive than additional
bytes. But even if you have to externalize MathML outside the main HTML
document, you could (I guess) cut down the additional requests to just
one by using fragment identifiers to pinpoint individual equations in an
XML document containing a series of equations. You could even make the
referenced document an XHTML version of the HTML document, and use a
<link rel="alternate" type="application/xhtml+xml"> element to point to
the XHTML version from the HEAD of the HTML version, and use the HTML
HREF and LONGDESC attributes to point to fragment identifiers for
particular equations. HTTP content negotiation could perhaps be used to
send supporting consuming agents to the XHTML version in the first
place.
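
A minimal sketch of the fragment-identifier idea, in Python for
illustration only. The document layout, the equation ids, and the
example.org URI are hypothetical, and matching on a plain "id" attribute
is an assumption (without a DTD, XML does not formally treat it as an
ID):

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical external document holding every equation used by one page.
EQUATIONS_XML = """\
<equations>
  <math xmlns="http://www.w3.org/1998/Math/MathML" id="eq-pythagoras">
    <msup><mi>a</mi><mn>2</mn></msup><mo>+</mo>
    <msup><mi>b</mi><mn>2</mn></msup><mo>=</mo>
    <msup><mi>c</mi><mn>2</mn></msup>
  </math>
  <math xmlns="http://www.w3.org/1998/Math/MathML" id="eq-circle">
    <mi>A</mi><mo>=</mo><mi>&#960;</mi><msup><mi>r</mi><mn>2</mn></msup>
  </math>
</equations>
"""


def find_equation(xml_text, uri):
    """Return the <math> element named by the URI's fragment, or None."""
    _, fragment = urldefrag(uri)
    if not fragment:
        return None
    root = ET.fromstring(xml_text)
    # Walk every MathML <math> element and compare its id to the fragment.
    for element in root.iter("{%s}math" % MATHML_NS):
        if element.get("id") == fragment:
            return element
    return None


equation = find_equation(EQUATIONS_XML,
                         "http://example.org/equations.xml#eq-circle")
```

One request fetches the whole equations document; each HREF or LONGDESC
in the HTML version then differs only in its fragment.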

For very short equations, you might be able to do away with the external
document altogether by using data URIs instead of referencing an actual
external document:

http://www.w3.org/TR/html401/struct/objects.html#adef-data

http://en.wikipedia.org/wiki/Data:_URI_scheme
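
As a rough illustration, again in Python, a short equation could be
packed into a data URI and referenced from an OBJECT element's DATA
attribute. The helper names are hypothetical, and whether a given
consuming agent actually renders MathML delivered this way is an
assumption that would need testing:

```python
from html import escape
from urllib.parse import quote

MATHML_TYPE = "application/mathml+xml"


def mathml_data_uri(mathml):
    """Percent-encode a MathML string into a data: URI."""
    # safe="" encodes every reserved character, so the result can sit
    # inside a double-quoted HTML attribute without further escaping.
    return "data:%s,%s" % (MATHML_TYPE, quote(mathml, safe=""))


def object_markup(mathml, fallback):
    """Build an OBJECT element whose content is plain-text fallback."""
    return '<object type="%s" data="%s">%s</object>' % (
        MATHML_TYPE, mathml_data_uri(mathml), escape(fallback))


EQUATION = ('<math xmlns="http://www.w3.org/1998/Math/MathML">'
            '<mi>x</mi><mo>+</mo><mn>1</mn></math>')
markup = object_markup(EQUATION, "x + 1")
```

Percent-encoding roughly triples the size of each markup character, which
is one reason this only looks attractive for very short equations.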

If authoring tools make creating master documents hard, I think that
highlights a flaw in the authoring tools in question, and the authoring
tools are likely to be a little easier to "fix" than consuming agents.
An OpenDocument file is an archive of multiple resources, but the user
experience of editing one in OpenOffice.org is of editing a seamless
document.

Using HTML comments to hide stuff is generally a bad idea since comments
should be disposable and sometimes are discarded. There are cases where
that doesn't matter. For example, conditional comments used to serve
different styles to IE are okay because, if content has been kept
separate from presentation, it should be possible to use the browser's
default CSS to achieve a usable presentation of the document. Now there
could be cases where using comments is worth the
risk even for data that is important to understanding a document, but I
don't really see sufficient advantages in this particular case to offset
the costs.
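
To make the risk concrete, here is a deliberately naive Python stand-in
for any intermediary that drops comments (a minifier, sanitizer, or
proxy). The regex and the sample page are hypothetical and not taken
from any real tool:

```python
import re

# Naive comment matcher; real tools vary, but the effect is the same.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def strip_comments(html):
    """Remove every HTML comment, as a minifier or sanitizer might."""
    return COMMENT.sub("", html)


# Hypothetical page: an image for display, with MathML hidden in a comment.
PAGE = """<p>The area of a circle is
<img src="circle-area.png" alt="A = pi r^2">
<!-- <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>A</mi><mo>=</mo>
<mi>&pi;</mi><msup><mi>r</mi><mn>2</mn></msup></math> -->
</p>"""

stripped = strip_comments(PAGE)
```

After stripping, the image and its ALT text survive but the MathML is
gone, so anything downstream that relied on it gets nothing.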

--
Benjamin Hawkes-Lewis