en-US FAQ

From Microformats Wiki


Frequently asked questions regarding the use of en-US for property/keyword/token names (in microformats and other formats and protocols).

First, familiarize yourself with:

questions

why not use other spellings and languages for properties

Q: Why not use other spellings and languages for properties, values, and other keywords (in microformats or other formats and protocols)? This was briefly discussed on the microformats-discuss list as "Language Maps", but it had been raised before that and seems to come up about once a year.

A(1): Daniel Glazman:

Why don't you also ask for localized c++ tokens or javascript tokens ? Why limit yourself to HTML and CSS ? Hmmm ?

 pour (variable i = 0; i < monTableau.longueur; i++)
 {
   s += parserFlottant(monTableau[i]);
 }

Wonderful :-) Oh sorry, I should say Merveilleux :-)

A(2): See minimal-vocabulary, an aspect of several microformats principles.

why not create multiple aliases for properties in other spellings and languages

Q: Why not create multiple aliases (variants) for properties (or keywords, or tokens in general) in other spellings and languages (in microformats or other formats and protocols)?

A(1): In short, Tower of Babel problem. Tab Atkins Jr.:

Localizing your tokens automatically cuts you off from the vast majority of code in the wild. If you're a French speaker who knows no English, and you learn French-token CSS, you can't use *any* of the vast, vast quantities of CSS help on the web. You can't copy-paste code (unless the mapping preserves the original English tokens as well - hope there's no conflicts, especially if you have people from multiple languages working together!). You can't even *read* code written by the majority of the world (and they can't read yours).

As much as humanly/technically possible, we of course want to support the diversity of languages on our planet. But in some cases it is advantageous *to the speakers of non-English languages* to purposely ignore their language, and use the dominant one (English, currently). Programming language tokens are one such area.

A(2): See minimal-vocabulary, an aspect of several microformats principles.
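The conflict risk raised in A(1) can be made concrete with a small sketch. The alias tables, the canonical tokens `length` and `width`, and the function name `find_alias_conflicts` are all hypothetical, invented for illustration; no format or protocol defines them. The sketch only shows that one word can end up aliased to different canonical tokens depending on the language.

```python
from collections import defaultdict

# Hypothetical alias tables, invented for illustration only.
# The same word, "largo", is mapped to different canonical tokens.
ALIAS_TABLES = {
    "es": {"largo": "length", "ancho": "width"},
    "it": {"largo": "width", "lunghezza": "length"},
}


def find_alias_conflicts(alias_tables):
    """Return {alias: {language: canonical}} for aliases whose
    canonical token differs from one language table to another."""
    seen = defaultdict(dict)
    for language, table in alias_tables.items():
        for alias, canonical in table.items():
            seen[alias][language] = canonical
    # Keep only aliases that resolve to more than one canonical token.
    return {
        alias: by_language
        for alias, by_language in seen.items()
        if len(set(by_language.values())) > 1
    }


conflicts = find_alias_conflicts(ALIAS_TABLES)
```

With these assumed tables, a parser that accepted both alias sets could not tell which canonical token an author writing `largo` intended.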

follow-up regarding Anglo-centric bias

Q: (more like issue) Unless we seriously discuss how this Anglo-centric bias can be eliminated, all we are doing is continuing to perpetuate a highly undesirable state of affairs.

A(1): Local communities should solve any such translations and aliases for themselves before involving the attention of the broader community. Bjoern Hoehrmann explains specifically regarding CSS and the "color" property, but his response applies similarly to any format or protocol:

The time to discuss allowing alternate names for properties is when CSS users have built and deployed successful tools that allow doing this locally, like a web server module that maps "fr-css" to "en-css", or source code editors that display "fr-css" but save to "en-css". We could then discuss whether to standardize some such mappings, or make features available that allow arbitrary user-defined mappings.

If local communities can but do not solve a particular problem, there is usually insufficient reason for the broader community to "solve" it for them.
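As a rough idea of what such a local tool might look like, here is a minimal sketch of an "fr-css" to "en-css" mapping step. The French property names, the table `FR_TO_EN_PROPERTIES`, and the function `fr_css_to_en_css` are assumptions made up for this example, not part of CSS or of any existing tool. The matching is deliberately naive: it does not handle comments, strings, or values.

```python
import re

# Hypothetical French-to-English property map; these French names are
# invented for illustration and are not part of any standard.
FR_TO_EN_PROPERTIES = {
    "couleur": "color",
    "marge": "margin",
    "largeur": "width",
}

# Matches a property name at the start of a declaration: after "{" or ";",
# optional whitespace, then the name, then a colon.
_DECLARATION = re.compile(
    r"(?P<lead>[{;]\s*)(?P<name>[A-Za-z-]+)(?P<colon>\s*:)"
)


def fr_css_to_en_css(source: str) -> str:
    """Rewrite known French property names to their English CSS names.

    Unknown names pass through unchanged, so plain English CSS is left as-is.
    """
    def replace(match):
        name = match.group("name")
        english = FR_TO_EN_PROPERTIES.get(name.lower(), name)
        return match.group("lead") + english + match.group("colon")

    return _DECLARATION.sub(replace, source)


translated = fr_css_to_en_css("p { couleur: red; marge: 0; }")
```

Under these assumptions, the stylesheet saved or served is ordinary English-token CSS, so the rest of the toolchain never sees the localized names.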

A(2): The bias is a historical accident, inevitable, and desirable. Tab Atkins Jr.:

The problem is that it's only an Anglo-centric bias by accident (English speakers invented programming languages), but *some* kind of bias is both inevitable and desirable, for the reason I gave in my email. We simply don't *want* a computer language to use wildly different identifiers for the exact same concepts, because it splinters the possible community of help/examples/sample code one can use.

I mean, if Iceland had become the world's Silicon Valley in the 1980's, we'd all be programming in Fjölnir[1]. I have no idea what the various language tokens actually mean, but I can decipher the programs, and could get along just fine in the language, assuming we had English tutorials for it. I would *prefer* an English-based programming language, certainly, but the benefit of being able to use all the Fjölnir code lying around on the web would be large enough for me to not care that much.

The situation would be slightly different if it was a language more significantly removed from the latin script, like Farsi or Korean. In that case, I doubt I could use the programming language without first gaining at least a passing familiarity with the host language, at least enough to intuitively distinguish between the characters. Of course, whatever language it was would *be* the language of the web like English is in the real world, so I'd likely be in a situation that many people across the world are in - I would learn and speak English at home, but I'd learn the other language to communicate in the global marketplace.

[1]: http://en.wikipedia.org/wiki/Fj%C3%B6lnir_(programming_language)

A(3): However, we must be aware of the impact of Anglo-centric bias on the design of standard vocabularies, and seek to minimize internationalization problems that may not be apparent to Anglo-centric designers and developers, such as seemingly harmless changes to or aliasing of vocabularies, AKA the Anglo-centric renaming anti-pattern.

see also