[uf-new] an equation/MathML/TeX microformat?
James Howison
james at howison.name
Sat Oct 27 10:46:46 PDT 2007
Since you are focusing on image files always being present ("Equation
images are 100% reliable"), would another option be to encode the
MathML or TeX into the image file, then use a class name to indicate
that this image should be parsed to look for the data?
E.g., stick it into the Description or Comment field (or make your own
field keyword) in a PNG file:
http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.Anc-text
That could be added to some of the commonly used MathML->PNG or
TeX->PNG tools, e.g.:
http://redsymbol.net/software/l2p/dist/l2p-doc.html
Then you could use something like:
<img class="parse-image-for-mathml" alt="natural language description
of formula" src="image-with-embedded-MathML.png" />
Of course this would need a tool to read the text out from the image
file, but those are widely available. Client side applications that
supported MathML could even read the MathML out of the image and
replace the PNG with the preferred MathML content.
Yeah, it wouldn't be in the HTML file, but you'd almost always also
have to download the image representation, no?
It would have the advantage of being able to be easily emailed too :)
--J
On Oct 27, 2007, at 11:46 AM, Paul Topping wrote:
> For an explanation of why I'm looking to embed MathML and/or TeX in
> HTML, I refer you to my original post for the details. The short answer
> is that MathML is not a universal enough solution. I am aware of most of
> what you mention below (especially the stuff about my own company's
> products -- thanks!) and agree with it. Perhaps I should come at this
> from a slightly different angle.
>
> Instead of addressing the successes and failures of MathML, let's look
> at the many "solutions" to the equation display in a web page problem.
> There are many websites that represent equations as images. They do this
> because of the universal browser compatibility of HTML with equation
> images. MathML is not a solution as it is not close to being universally
> supported in browsers. This is a big issue in education which is usually
> not in a position to dictate browsers and, perhaps more importantly,
> doesn't want to embrace any solution that might require the user to
> download plugins and/or fonts. Equation images are 100% reliable.
>
> Pretty much all the equation images are produced by rendering TeX or
> MathML using a variety of tools. In the case of TeX at least, this
> textual representation is also potentially useful in the browser so some
> math websites embed the TeX in alt text or as a comment. Many don't
> expose this textual representation at all.
>
> If math websites embedded the TeX or MathML used to produce each
> equation image in a standardized manner, it would enable client-side
> software to provide math accessibility and interoperability. And, if it
> could be done in a way that didn't interfere with other HTML features,
> such as commandeering alt text away from its intended purpose, that
> would be a win. Finally, it needs to be done in a rigorous way. For
> example, if a certain version of LaTeX is used for the representation,
> that should be declared so the s/w that processes the notation should
> not have to sniff and guess.
>
> Paul Topping
> Design Science, Inc.
> www.dessci.com
>
>> -----Original Message-----
>> From: microformats-new-bounces at microformats.org
>> [mailto:microformats-new-bounces at microformats.org] On Behalf Of
>> Benjamin Hawkes-Lewis
>> Sent: Saturday, October 27, 2007 6:25 AM
>> To: For discussion of new microformats.
>> Subject: RE: [uf-new] an equation/MathML/TeX microformat?
>>
>>
>> On Thu, 2007-10-25 at 15:39 -0700, Paul Topping wrote:
>>
>>> I'm aware of the effort to support MathML in HTML5 but this effort seems
>>> unlikely to bear fruit. Besides, I'm looking to create something that
>>> will work in "plain old HTML" which, as I understand it, is part of the
>>> microformat philosophy. As I stated, MathML was originally intended to
>>> be implemented in browsers but the actuality leaves something to be
>>> desired. With HTML5, I would simply be waiting for yet another thing to
>>> be implemented in browsers. Exactly what I want to avoid.
>>
>> Personally, I believe wanting to use MathML is one of the rare
>> use-cases that justifies choosing XHTML over HTML 4.01, in which case
>> you can inline MathML directly into your markup.
>>
>> Can you provide some more details about the problem you're trying to
>> solve, in particular:
>>
>> 1. What are the deficiencies of existing techniques for using
>> mathematical markup in conjunction with HTML that you are realistically
>> expecting to overcome?
>>
>> 2. What are the key existing consuming agents you'd like the solution to
>> those deficiencies to work with? e.g. What platforms, browsers, and
>> assistive technology?
>>
>> Currently, your company's MathPlayer ActiveX plugin can already speak
>> maths and has MSAA support for assistive technology:
>>
>> http://www.dessci.com/en/products/mathplayer/tech/accessibility.htm
>>
>> Gecko has built-in support for a subset of MathML; the FireVox extension
>> turns Firefox into a self-voicing browser on all three major platforms
>> and can read MathML using Abraham Nemeth's Mathspeak rules:
>>
>> http://www.firevox.clcworld.net/features.html
>>
>> The next version of Opera will include support for a subset of MathML
>> and restore (some level of) support for assistive technologies on
>> Windows and Mac:
>>
>> http://dev.opera.com/articles/view/can-kestrels-do-math-mathml-support-in/
>>
>> http://my.opera.com/desktopteam/blog/2007/08/31/focus-areas-during-kestrel-development
>>
>> The only major engine missing support for MathML at all is WebKit:
>>
>> http://bugs.webkit.org/show_bug.cgi?id=3251
>>
>> This situation could be a /lot/ better.
>>
>> Your problem seems very similar to the problem of inlining RDF in HTML:
>>
>> http://infomesh.net/2002/rdfinhtml/
>>
>> http://esw.w3.org/topic/EmbeddingRDFinHTML
>>
>> The suggested approach of dumping XML markup inside comments was
>> actually used by Creative Commons and Movable Type's Trackback system:
>>
>> http://www.xml.com/pub/a/2003/01/15/creative.html
>>
>> But I can't really see how embedding MathML markup inside HTML comments
>> is supposed to improve things with existing (or even future) consuming
>> agents. You did say this:
>>
>>> Also, I want to put the MathML or TeX in the page, not in separate
>>> documents. Typical pages with math in them might have dozens of
>>> equations. Having their representation in separate files is
>>> inefficient but perhaps the biggest problem is that it makes authoring
>>> a lot more tedious as lots of small files have to be managed.
>>
>> The web stack tends to work by stitching together lots of different
>> resources to make up the final presentation the user experiences when
>> she visits a URI: multiple HTML documents, scripts, stylesheets, images,
>> Flash movies, and so on all get bound together by a master document. I'm
>> generally sceptical about attempts to fight against this model in the
>> name of efficiency without changing the HTML specification itself.
>>
>> There's also an issue of efficiency for consuming agents that don't
>> support MathML and would have to waste bandwidth on MathML embedded in
>> comments.
>>
>> Having said that, additional requests are more expensive than additional
>> bytes. But even if you have to externalize MathML outside the main HTML
>> document, you could (I guess) cut down the additional requests to just
>> one by using fragment identifiers to pinpoint individual equations in an
>> XML document containing a series of equations. You could even make the
>> referenced document an XHTML version of the HTML document, and use a
>> <link rel="alternate" type="application/xhtml+xml"> element to point to
>> the XHTML version from the HEAD of the HTML version, and use the HTML
>> HREF and LONGDESC attributes to point to fragment identifiers for
>> particular equations. HTTP content negotiation could perhaps be used to
>> send supporting consuming agents to the XHTML version in the first
>> place.
>>
>> For very short equations, you might be able to do away with the external
>> document altogether by using data URIs instead of referencing an actual
>> external document:
>>
>> http://www.w3.org/TR/html401/struct/objects.html#adef-data
>>
>> http://en.wikipedia.org/wiki/Data:_URI_scheme
>>
>> If authoring tools make creating master documents hard, I think that
>> highlights a flaw in the authoring tools in question, and the authoring
>> tools are likely to be a little easier to "fix" than consuming agents.
>> OpenDocuments consist of an archive of multiple resources, but the user
>> experience of editing an OpenDocument in OpenOffice.org is of editing a
>> seamless document.
>>
>> Using HTML comments to hide stuff is generally a bad idea since comments
>> should be disposable and sometimes are discarded. There are cases where
>> that doesn't matter. For example, it doesn't matter when conditional
>> comments are used to serve different styles to IE because, if
>> content has been kept separate from presentation, it should be possible
>> to use the browser's default CSS to achieve a usable presentation of the
>> document. Now there could be cases where using comments is worth the
>> risk even for data that is important to understanding a document, but I
>> don't really see sufficient advantages in this particular case to offset
>> the costs.
>>
>> --
>> Benjamin Hawkes-Lewis
>>
>> _______________________________________________
>> microformats-new mailing list
>> microformats-new at microformats.org
>> http://microformats.org/mailman/listinfo/microformats-new