namespaces-considered-harmful: Difference between revisions

From Microformats Wiki
<h1> namespaces considered harmful </h1>
{{DISPLAYTITLE: namespaces considered harmful }}
{{TOC-right}}


In particular, namespaces for '''content''' are considered harmful (e.g. XML namespaces, QNames in attributes, etc.).  Namespaces for code are outside the scope of this page.


It's been tried by numerous groups, before microformats, and after.  It's even been tried in the context of RSS and RDF, and in practice people write
scrapers that look for namespace prefixes as if they are part of the element name, or perform literal string matches on common namespace prefix uses (e.g. [http://google.com/codesearch/p?hl=en#J7osgojbryc/src/java/org/web3d/x3d/jaxp/X3DSAVAdapter.java&q=X3DSAVAdapter&l=854 1]), not as mere shorthands for namespace URIs.


If you want to carry on a theoretical discussion of namespaces, please do so elsewhere, for in practice, discussing them is a waste of time, and
* [http://www.xml.com/pub/a/2004/07/21/dive.html XML on the Web has Failed by Mark Pilgrim]
* [http://microformats.org/blog/2006/01/09/tim-bray-on-creating-xml-dialects/ Tim Bray on creating XML dialects]
=== implementation experience lacks beneficial anecdotes ===
As [http://krijnhoetmer.nl/irc-logs/whatwg/20080801#l-153 hsivonen noted in IRC 2008-08-01]: <blockquote><p>I had dinner with friends who write software. It seems to me that when people who have had to deal with Namespaces in XML can talk freely, they never have anecdotes about how Namespaces have helped them. Instead, they have negative comments. OTOH, devil's advocate scenarios where Namespaces could help come from people who don't have to deal with Namespaces as part of their work.</p></blockquote>


== namespaces for content are a negative ==
From the #whatwg IRC channel on irc.freenode.net [http://krijnhoetmer.nl/irc-logs/whatwg/20071025#l-148 on 2007-10-25]: <blockquote><p># [15:43] &lt;<cite>hsivonen</cite>&gt; I wonder how many hours in my life has been wasted looking up namespace URIs for copying and pasting</p></blockquote>


=== example of fundamental software engineering error ===
As [http://krijnhoetmer.nl/irc-logs/whatwg/20080801#l-160 othermaciej observed in IRC 2008-08-01]: <blockquote><p>Namespaces are an example of the Fundamental Software Engineering Error, which is that something too terrible to actually use can be fixed by adding a level of indirection. Sometimes that is true but software engineers try to do it even when it clearly is not.</p></blockquote>
 
== bound prefixes are an anti-pattern ==
Ian Hickson describes numerous problems with bound prefixes in his post:
* [http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Aug/0035.html Why bound prefixes are an anti-pattern in language design]
 
Some excerpted points:
* '''Copy-and-paste brittleness.''' Copy-and-paste of the source becomes very brittle when two separate parts of a document are needed to make sense of the content.
* '''Hard for authors.''' Prefixes are notoriously hard for authors to understand.
* '''More indirection adds difficulty.''' Fundamentally, prefixes are an indirection model. Indirection models are very, very hard for people to understand.
* '''Hard for implementors.''' Prefixes are notoriously hard for implementors to get right.
* '''Hard for dynamic content situations.''' Prefixes in dynamically changing content are even worse because they require that an observing software agent track not only the value it is concerned about, but also all possible ways for the value's prefixes to change meaning.
 
Basically, adding bound prefixes to any language or format design adds fragility, difficulty of use, difficulty of implementation, and in general must be avoided.
 
== namespaces unnecessary in practice ==
In practical use cases for marking up visible data (and others), namespaces, bound prefixes etc. have never been necessary. E.g.[http://lists.w3.org/Archives/Public/public-webapps/2011JulSep/1584.html]:
* rel="" values in HTML, e.g. [[rel-me]]. More: [[rel-values]] define them for [[HTML5]]
* element names in HTML, e.g. <code>&lt;p&gt;</code>
* MIME type names, e.g. "image/png"
* scheme names, e.g. "http", "https"
 
== namespaces cause dogmatic noise ==
It appears the desire for abstract bound (or literal) prefix namespaces causes a small, strongly outspoken minority, especially on email lists, to frequently post long messages in apparent attempted support for namespaces as a concept, technology, add-on, feature etc., without basis in real world use cases. They often state namespaces themselves as the real world use case, which is, of course, tautological. For example, see archives of public-html on w3.org in 2009.
 
It's not clear what about namespaces incites this primarily religious/dogmatic/tautological behavior among some individuals, but the behavior itself is quite unproductive (noise in community communications channels), and thus undesirable and [[mailing-lists#Bad_topics_for_discussion|explicitly discouraged]].
 
== non-namespaced techniques have been succeeding ==
On the other hand, XHTML + [[semantic-class-names]] (aka [[POSH]]) has seen widespread adoption among the web authoring/design/IA/publishing community.  Microformats is leveraging the approach that is both working better and frankly dominating in practice on the Web.


== more ==
=== Well, what about hAtom? ===
[[hAtom]] appears to use namespaces. In particular:
* entry-summary


It doesn't use namespaces because "entry-" is just part of the name, rather than a prefix that is associated with a URL.
 
In this case the prefix "entry-" means nothing more than "entry-".  Each of those three specific terms ''is so specific to the problem domain'' that we invented names specifically for them. For example,  "entry-title" isn't any old title, it's specifically the Atom concept of a title.  
 
Each of those three terms is defined on its own, fully spelled out with the "entry-" part of their name, in the [[hAtom]] spec.
 
You can't pair "entry-" with some arbitrary other word (even an existing microformats property name) and have it work or even mean anything.  Thus "entry-" is not a stand-alone prefix, and has no defined meaning or function on its own, unlike anything even resembling a namespace.
 
Update 2013:
 
We've fixed this fully in [[microformats2]] [[h-entry]]:
 
* "name" replaces entry-title
* "content" replaces entry-content
* "summary" replaces entry-summary
 
Once again, we found that even prefixing for this specific case was unnecessary in practice.
 
== misconceptions ==
=== fairly solid namespace ===
''(x) is a fairly solid namespace.''
 
There's no such thing as a "solid" namespace. Perhaps only "solid" in the sense of a wall that gets in the way of interoperability. See silos above.
 
=== namespaces good for scalability ===
''Decentralization using namespaces is also good for scalability.''
 
Namespaces just enable scalability of divergence - which is contrary / anathema to standardization and communication.
 
Though they're probably ok for experiments, and scalability in the number of different experiments being performed.
 
E.g. <abbr><dfn>XML</dfn></abbr> would have made more sense if it was thought of as "eXperiment Markup Language".
 
== more articles ==
Here are some more articles and posts that discuss additional practical problems with namespaces, with specific examples:
* <span class="hentry"><span class="published">2009-03-06</span> <span class="entry-summary">public-html list: <cite class="entry-title">[http://lists.w3.org/Archives/Public/public-html/2009Mar/0163.html Re: @rel syntax in RDFa (relevant to ISSUE-60 discussion), was: Using XMLNS in link/@rel]</cite> by <span class="author vcard"><span class="fn">Henri Sivonen</span></span> explains more problems with xmlns, and of communities burdening each other.</span></span> Some quotes: <blockquote><p>I think it's a problem for collaboration where the subcommunities interact at the W3C if one subcommunity wants things and another bears the cost. </p><p> There a previous example in the form of inflicting permanent complexity (i.e. cost) onto an adjacent subcommunity: Namespaces in XML wasn't something that the SGML documentation community (turned into XML community) needed. Instead, Namespace in XML were a requirement posed by the Semantic Web community's RDF/XML. This requirement permanently complicated the processing model for the SGML community turned into XML community: http://www.flightlab.com/~joe/sgml/sanity.txt</p></blockquote>
* <span class="hentry"><span class="published">2008-08-24</span> <span class="entry-summary"><cite class="entry-title"><nowiki>[whatwg]</nowiki> [http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-August/015941.html RDFa]</cite> by <span class="author vcard"><span class="fn">Henri Sivonen</span></span> explains problems with: xmlns attributes, DOM consistency, cruftiness in language design, ccREL, indirection via prefixes</span></span>. Some choice quotes: <blockquote><p>Metadata about a file travels best inside the file, and common Web formats have native facilities on the key-value level.</p></blockquote> ... <blockquote><p>The most common case of CC metadata fits the simple key-value case, and licensing is already too hard. Addressing the more complex cases by making the common case complex seems like a bad idea to me.</p></blockquote> see related [[licensing-brainstorming#ccREL_issues|ccREL issues]]. <blockquote><p>I think URIs (directly or through indirection) are more clumsy as identifiers than short names. Since RDF vocabularies use URIs as identifiers, I find creating more microformats (even if they need more one-off speccing) a more appealing way forward from the language usage point of view than importing RDF vocabularies in a generically mappable way. (I can't see how generic mapping can be had without using URIs as identifiers.)</p></blockquote> ... <blockquote><p>I think indirection via prefixes is bad, because experience with Namespaces in XML shows that it confused people a lot. Both full URIs and short prefixed names where the fixed prefix doesn't expand into anything are better.</p></blockquote> ... <blockquote><p>we should refuse a mechanism that can reasonably be expected to have a relatively high failure probability. As the toolchain becomes longer, the probability of failure increases, since it takes only one tool in the chain to fail for the whole chain to fail.</p></blockquote>
* <span class="hentry"><span class="published">2002-04-05</span> <span class="entry-summary"><nowiki>[xml-dev]</nowiki> <cite class="entry-title">[http://www.flightlab.com/~joe/sgml/sanity.txt A plea for Sanity]</cite> by <span class="author vcard"><span class="fn">Joe English</span></span> </span> </span>
 


== see also ==
* [[plain-old-xml-considered-harmful]]
* [[microformats-easier-than-xml]]
* [[namespaced-attributes-considered-harmful]]
* [http://en.wikipedia.org/wiki/Namespace Wikipedia on namespaces]
* [http://wiki.whatwg.org/wiki/Namespace_confusion Namespace confusion on the WHATWG wiki]

Latest revision as of 16:30, 18 July 2020


In particular, namespaces for content are considered harmful (e.g. XML namespaces, QNames in attributes, etc.). Namespaces for code are outside the scope of this page.

Author/Editor: Tantek Çelik

namespaced content has failed

Namespaced content on the Web has failed.

It's been tried by numerous groups, before microformats, and after. It's even been tried in the context of RSS and RDF, and in practice people write scrapers that look for namespace prefixes as if they are part of the element name, or perform literal string matches on common namespace prefix uses (e.g. 1), not as mere shorthands for namespace URIs.
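
The gap between prefix matching and namespace-URI matching can be sketched roughly as follows. This is only an illustration, assuming Python's standard `xml.etree.ElementTree`; the two sample feeds and the function names `scrape_by_prefix` and `find_by_namespace` are made up for this example and are not taken from any real scraper.

```python
import xml.etree.ElementTree as ET

# Namespace URI used for illustration (the Dublin Core elements namespace).
DC = "http://purl.org/dc/elements/1.1/"

# Two hypothetical feeds that are equivalent under Namespaces in XML:
# the same namespace URI, bound to two different prefixes.
FEED_A = f'<item xmlns:dc="{DC}"><dc:creator>Alice</dc:creator></item>'
FEED_B = f'<item xmlns:dublin="{DC}"><dublin:creator>Alice</dublin:creator></item>'


def scrape_by_prefix(xml_text):
    """Literal string match on the conventional prefix, as described above."""
    return "<dc:creator>" in xml_text


def find_by_namespace(xml_text):
    """Namespace-aware lookup: match on the namespace URI, not the prefix."""
    root = ET.fromstring(xml_text)
    node = root.find(f"{{{DC}}}creator")
    return node.text if node is not None else None


if __name__ == "__main__":
    for name, feed in (("A", FEED_A), ("B", FEED_B)):
        print(name, scrape_by_prefix(feed), find_by_namespace(feed))
```

The prefix-matching function only recognizes the feed that happens to use the conventional `dc` prefix, which is exactly the behavior that treats the prefix as part of the element name.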

If you want to carry on a theoretical discussion of namespaces, please do so elsewhere, for in practice, discussing them is a waste of time, and off-topic for microformats lists.

namespaced content is not well supported

Namespaces are actually *not* well supported across enough modern browsers, nor by enough W3C technologies or test suites, compared to (X)HTML + semantic-class-names + CSS.

articles documenting the failure of namespaced content

The mixed namespace approach has already been tried by *numerous* others since 1998 and has failed on the Web.

implementation experience lacks beneficial anecdotes

As hsivonen noted in IRC 2008-08-01:

I had dinner with friends who write software. It seems to me that when people who have had to deal with Namespaces in XML can talk freely, they never have anecdotes about how Namespaces have helped them. Instead, they have negative comments. OTOH, devil's advocate scenarios where Namespaces could help come from people who don't have to deal with Namespaces as part of their work.

namespaces for content are a negative

Namespaces are actually a *huge* negative. Search for:

namespaced content discourages interoperability of data

Namespaces encourage people to seclude themselves in their own namespace and invent their own schema rather than reusing existing elements in existing formats. This hurts interoperability because a dozen different namespaces can all have their own slightly different semantics for the same element. See BuildOrBuy for support for this argument, specifically

Use somebody elses rather than making aliases on purpose. It's one thing to make your own and then discover that there's something equivalent out there. It's quite another to willfully clutter the semantic web with aliases; the latter increases the burden on the community of consuming your data, so it's anti-social.

If you start thinking about the web in terms of OOP and polymorphism, namespaces break the polymorphic model that allows you to handle widely varied data structures using the same methods.

using namespaces costs a lot of time

From the #whatwg IRC channel on irc.freenode.net on 2007-10-25:

# [15:43] <hsivonen> I wonder how many hours in my life has been wasted looking up namespace URIs for copying and pasting

example of fundamental software engineering error

As othermaciej observed in IRC 2008-08-01:

Namespaces are an example of the Fundamental Software Engineering Error, which is that something too terrible to actually use can be fixed by adding a level of indirection. Sometimes that is true but software engineers try to do it even when it clearly is not.

bound prefixes are an anti-pattern

Ian Hickson describes numerous problems with bound prefixes in his post:

Some excerpted points:

  • Copy-and-paste brittleness. Copy-and-paste of the source becomes very brittle when two separate parts of a document are needed to make sense of the content.
  • Hard for authors. Prefixes are notoriously hard for authors to understand.
  • More indirection adds difficulty. Fundamentally, prefixes are an indirection model. Indirection models are very, very hard for people to understand.
  • Hard for implementors. Prefixes are notoriously hard for implementors to get right.
  • Hard for dynamic content situations. Prefixes in dynamically changing content are even worse because they require that an observing software agent track not only the value it is concerned about, but also all possible ways for the value's prefixes to change meaning.

Basically, adding bound prefixes to any language or format design adds fragility, difficulty of use, difficulty of implementation, and in general must be avoided.
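
The copy-and-paste brittleness point can be illustrated with a small sketch, assuming Python's standard `xml.etree.ElementTree`. The documents, the `ex` prefix, the `http://example.org/ns` URI, and the helper `parses` are all hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# A hypothetical document: the prefix "ex" is bound once, on the root element.
DOCUMENT = (
    '<doc xmlns:ex="http://example.org/ns">'
    '<section><ex:note>remember the milk</ex:note></section>'
    '</doc>'
)

# The fragment an author might copy out of the middle of DOCUMENT.
# The binding for "ex" lives elsewhere, so it does not travel with the copy.
PREFIXED_FRAGMENT = '<section><ex:note>remember the milk</ex:note></section>'

# The same content marked up with a plain class name instead of a bound prefix.
CLASS_FRAGMENT = '<section><span class="note">remember the milk</span></section>'


def parses(xml_text):
    """Return True if the text parses as namespace-well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(parses(DOCUMENT), parses(PREFIXED_FRAGMENT), parses(CLASS_FRAGMENT))
```

The prefixed fragment fails on its own because two separate parts of the document are needed to make sense of it, while the class-based fragment is self-contained.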

namespaces unnecessary in practice

In practical use cases for marking up visible data (and others), namespaces, bound prefixes etc. have never been necessary. E.g.[1]:

  • rel="" values in HTML, e.g. rel-me. More: rel-values define them for HTML5
  • element names in HTML, e.g. <p>
  • MIME type names, e.g. "image/png"
  • scheme names, e.g. "http", "https"
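
As a rough illustration of how little machinery flat names need, here is a sketch that collects rel values from HTML using Python's standard `html.parser`. The class `RelCollector`, the function `rel_links`, and the sample page are assumptions made for this example, not part of any microformats parser.

```python
from html.parser import HTMLParser


class RelCollector(HTMLParser):
    """Collect (rel token, href) pairs from <a> and <link> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel, href = attrs.get("rel"), attrs.get("href")
        if tag in ("a", "link") and rel and href:
            # rel is a space-separated set of plain tokens, compared
            # case-insensitively; no prefix binding has to be resolved.
            for token in rel.split():
                self.links.append((token.lower(), href))


def rel_links(html_text, rel):
    """Return the hrefs of all links carrying the given rel token."""
    collector = RelCollector()
    collector.feed(html_text)
    return [href for token, href in collector.links if token == rel]


# Hypothetical sample page for the example.
PAGE = (
    '<a rel="me" href="https://example.com/alice">Alice</a> '
    '<a rel="nofollow ME" href="https://example.org/a">elsewhere</a> '
    '<a href="/other">other</a>'
)

if __name__ == "__main__":
    print(rel_links(PAGE, "me"))
```

The lookup is a plain token comparison: the value "me" means the same thing wherever it is pasted.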

namespaces cause dogmatic noise

It appears the desire for abstract bound (or literal) prefix namespaces causes a small, strongly outspoken minority, especially on email lists, to frequently post long messages in apparent attempted support for namespaces as a concept, technology, add-on, feature etc., without basis in real world use cases. They often state namespaces themselves as the real world use case, which is, of course, tautological. For example, see archives of public-html on w3.org in 2009.

It's not clear what about namespaces incites this primarily religious/dogmatic/tautological behavior among some individuals, but the behavior itself is quite unproductive (noise in community communications channels), and thus undesirable and explicitly discouraged.

non-namespaced techniques have been succeeding

On the other hand, XHTML + semantic-class-names (aka POSH) has seen widespread adoption among the web authoring/design/IA/publishing community. Microformats is leveraging the approach that is both working better and frankly dominating in practice on the Web.

more

Well, what about hAtom?

hAtom appears to use namespaces. In particular:

  • entry-title
  • entry-content
  • entry-summary

It doesn't use namespaces because "entry-" is just part of the name, rather than a prefix that is associated with a URL.

In this case the prefix "entry-" means nothing more than "entry-". Each of those three specific terms is so specific to the problem domain that we invented names specifically for them. For example, "entry-title" isn't any old title, it's specifically the Atom concept of a title.

Each of those three terms is defined on its own, fully spelled out with the "entry-" part of their name, in the hAtom spec.

You can't pair "entry-" with some arbitrary other word (even an existing microformats property name) and have it work or even mean anything. Thus "entry-" is not a stand-alone prefix, and has no defined meaning or function on its own, unlike anything even resembling a namespace.

Update 2013:

We've fixed this fully in microformats2 h-entry:

  • "name" replaces entry-title
  • "content" replaces entry-content
  • "summary" replaces entry-summary

Once again, we found that even prefixing for this specific case was unnecessary in practice.
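
The point that "entry-" is part of each name rather than a prefix can be sketched as a whole-token lookup. This is only an illustration: the table `HATOM_TO_H_ENTRY` restates the three replacements listed above, and the function name `h_entry_properties` and the sample class attributes are hypothetical.

```python
# The three hAtom names, each looked up as a complete token.
# The values are the h-entry replacements listed above.
HATOM_TO_H_ENTRY = {
    "entry-title": "name",
    "entry-content": "content",
    "entry-summary": "summary",
}


def h_entry_properties(class_attr):
    """Map the hAtom class names in a class attribute to h-entry names.

    Tokens are matched whole. "entry-" is never split off and combined
    with another word, so something like "entry-fn" is simply unknown.
    """
    return [
        HATOM_TO_H_ENTRY[token]
        for token in class_attr.split()
        if token in HATOM_TO_H_ENTRY
    ]


if __name__ == "__main__":
    print(h_entry_properties("entry-title fn"))
    print(h_entry_properties("entry-fn"))
```

Because nothing is ever expanded or resolved, there is no binding that could go missing or change meaning.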

misconceptions

fairly solid namespace

(x) is a fairly solid namespace.

There's no such thing as a "solid" namespace. Perhaps only "solid" in the sense of a wall that gets in the way of interoperability. See silos above.

namespaces good for scalability

Decentralization using namespaces is also good for scalability.

Namespaces just enable scalability of divergence - which is contrary / anathema to standardization and communication.

Though they're probably ok for experiments, and scalability in the number of different experiments being performed.

E.g. XML would have made more sense if it was thought of as "eXperiment Markup Language".

more articles

Here are some more articles and posts that discuss additional practical problems with namespaces, with specific examples:

  • 2009-03-06 public-html list: Re: @rel syntax in RDFa (relevant to ISSUE-60 discussion), was: Using XMLNS in link/@rel by Henri Sivonen explains more problems with xmlns, and of communities burdening each other. Some quotes:

    I think it's a problem for collaboration where the subcommunities interact at the W3C if one subcommunity wants things and another bears the cost.

    There a previous example in the form of inflicting permanent complexity (i.e. cost) onto an adjacent subcommunity: Namespaces in XML wasn't something that the SGML documentation community (turned into XML community) needed. Instead, Namespace in XML were a requirement posed by the Semantic Web community's RDF/XML. This requirement permanently complicated the processing model for the SGML community turned into XML community: http://www.flightlab.com/~joe/sgml/sanity.txt

  • 2008-08-24 [whatwg] RDFa by Henri Sivonen explains problems with: xmlns attributes, DOM consistency, cruftiness in language design, ccREL, indirection via prefixes. Some choice quotes:

    Metadata about a file travels best inside the file, and common Web formats have native facilities on the key-value level.

    ...

    The most common case of CC metadata fits the simple key-value case, and licensing is already too hard. Addressing the more complex cases by making the common case complex seems like a bad idea to me.

    see related ccREL issues.

    I think URIs (directly or through indirection) are more clumsy as identifiers than short names. Since RDF vocabularies use URIs as identifiers, I find creating more microformats (even if they need more one-off speccing) a more appealing way forward from the language usage point of view than importing RDF vocabularies in a generically mappable way. (I can't see how generic mapping can be had without using URIs as identifiers.)

    ...

    I think indirection via prefixes is bad, because experience with Namespaces in XML shows that it confused people a lot. Both full URIs and short prefixed names where the fixed prefix doesn't expand into anything are better.

    ...

    we should refuse a mechanism that can reasonably be expected to have a relatively high failure probability. As the toolchain becomes longer, the probability of failure increases, since it takes only one tool in the chain to fail for the whole chain to fail.

  • 2002-04-05 [xml-dev] A plea for Sanity by Joe English
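
The toolchain argument in the 2008-08-24 quote above can be made concrete with a small calculation. This sketch assumes, purely for illustration, that every tool fails independently with the same probability; neither the assumption nor the numbers come from the quoted post, and the function name `chain_failure_probability` is hypothetical.

```python
def chain_failure_probability(per_tool_failure, tools):
    """Probability that at least one of `tools` independent tools fails.

    The whole chain succeeds only if every tool succeeds, so the chain
    fails with probability 1 - (1 - p) ** n.
    """
    return 1 - (1 - per_tool_failure) ** tools


if __name__ == "__main__":
    # Illustrative numbers only: a 5% chance that any single tool mishandles prefixes.
    for n in (1, 3, 10):
        print(n, round(chain_failure_probability(0.05, n), 3))
```

Under these assumed numbers, a ten-tool chain fails roughly 40% of the time even though each tool on its own fails only 5% of the time.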


see also