plain-old-xml-considered-harmful: Difference between revisions

From Microformats Wiki
(Cleaner grammar)
(Added issues with draconian error handling, character encoding complexities)

Revision as of 17:41, 7 September 2006

plain old xml considered harmful

(This article is a stub, feel free to expand upon it)

The plain old XML approach has already been tried by *numerous* others since 1998 and has failed on the Web.

http://blog.davidjanes.com/:entry:davidjanes-2005-10-04-0000/

On the other hand, XHTML + semantic-class-names has seen widespread adoption among the web authoring/design/IA/publishing community. Microformats is leveraging the approach that is both working better and, frankly, dominating in practice on the Web.

http://microformats.org/blog/2006/01/09/tim-bray-on-creating-xml-dialects/

See also namespaces-considered-harmful.

XML elements are limited to only one "name" and thus only one meaning, whereas the class attribute is a space-separated set of names and can thus capture multiple meanings. This gives authors a much more flexible semantic structure and greatly aids in following DRY (don't repeat yourself).
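As a rough illustration of the point about class names, the sketch below reads the class attribute of each element and splits it on whitespace, so one element yields several names. The helper names (`ClassCollector`, `class_names`) and the sample markup are assumptions made for this example and are not part of any microformats tooling.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Records (tag, [class names]) for every start tag that has a class attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                # The class attribute is a space-separated set of names.
                self.found.append((tag, value.split()))


def class_names(markup):
    """Return a list of (tag, class-name list) pairs found in the markup."""
    collector = ClassCollector()
    collector.feed(markup)
    collector.close()
    return collector.found


# Hypothetical sample: a single element carrying two meanings at once.
SAMPLE = '<span class="fn org">Example Corp</span>'

if __name__ == "__main__":
    print(class_names(SAMPLE))
```

A plain XML element in the same position could carry only one of those two names as its element name; the second meaning would need an extra wrapper element or a duplicated value.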

There are thousands more web authors and developers who write and understand (X)HTML + semantic class names + CSS than there are people who write and understand either plain or namespaced XML.

It's the publishers that matter, not the programmers. To put it another way, programmers can solve a problem once and share the solution as open source. Publishers have to keep solving markup and publishing problems for content and design over and over, and have much less chance of being able to share their solutions. That, plus the fact that there are many more web designers than programmers, plus simple economics, means the best solution is to optimize for ease of publishing, and let iterative open source solve the programming problems.

XML also has the disadvantage that a conforming XML processor is required to stop normal processing when it encounters a well-formedness error, so a single unescaped ampersand can make an entire XML document unreadable. This is hardly appropriate for an end-user application, so many people ignore the requirement; in doing so they break the spec and are no longer actually using XML. Furthermore, serving XML over HTTP is difficult: there are all kinds of complicated issues dealing with character encodings. Start with RFC 3023 (http://www.ietf.org/rfc/rfc3023.txt).
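The effect of a single unescaped ampersand can be sketched with Python's standard library. This is only an illustration under assumptions: the sample string and the helper names (`parse_as_xml`, `parse_as_html`, `TextCollector`) are invented for this example, and Python's `html.parser` stands in for a forgiving HTML consumer in general.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical input: the ampersand should have been written as &amp;
BROKEN = "<p>Fish & chips</p>"


def parse_as_xml(text):
    """Return the root element's text, or None if the XML parser rejects the input."""
    try:
        return ET.fromstring(text).text
    except ET.ParseError:
        # The parser stops at the first well-formedness error; no content is recovered.
        return None


class TextCollector(HTMLParser):
    """Collects character data and ignores the tags around it."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_as_html(text):
    """Return all character data found by the lenient HTML parser."""
    collector = TextCollector()
    collector.feed(text)
    collector.close()
    return "".join(collector.chunks)


if __name__ == "__main__":
    print("XML parser: ", parse_as_xml(BROKEN))
    print("HTML parser:", parse_as_html(BROKEN))
```

The XML parser yields nothing for the broken input, while the HTML parser still returns the text. Escaping the ampersand as `&amp;` lets both succeed.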

See Also