From Microformats Wiki

Multilingual blogs and sites

The boundaries on the web are linguistic. A growing number of people run multilingual websites and blogs. However, existing blog software, although localizable, is designed with a monolingual author and reader in mind, and the HTML specifications are written mainly for monolingual web pages.

  • How should similar content in different languages (whether translated, re-phrased, abstracted) be organised and related?
  • How should blogging software make this possible?
  • Three levels of difficulty (or subproblems):
    • Markup
    • Interface for the reader
    • Authoring process
    • (And a fourth: integration in specific blogging tools)

The microformat approach to markup can clarify the problem space and lead to simpler work in the interface, authoring and integration stages.

Readers' perspective

Many web authors have a multilingual readership: people who are monolingual in language A, people who are monolingual in language B, people who are perfectly bilingual, and the whole range of language proficiency in between. The usual solution for "multilingual" content is to create "mirror" versions of a site in different languages. This works for sites which are static or are maintained by a large team of people. It is not viable for a blog, or for other forms of publication that encourage people to express themselves online by making it easy to publish, although having others translate posts can spread the work.

A multilingual site is a site containing more than one language. How can it be made friendly for all types of readers, from monolingual to perfectly bilingual, including monolingual people who know enough of another language to trudge through an article in it if it sounds interesting enough? How can we semantically mark up pages containing more than one language, and create logical links between pieces of content in different languages that vary in how closely they follow the original?

Successive refinement of meaning

Indicating language of sub-sections

Pre-existing HTML standards already cover this: use lang="xx" on the containing element. This:

  • enables CSS-based show/hide, or a visual indication of the language
  • allows a parser to determine which language each section is written in
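
As a minimal sketch of the points above, a post with one English and one French section might be marked up as follows. The hide-fr class name, the page content and the styling are assumptions made for illustration only; they are not part of any specification.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Bilingual post (illustrative)</title>
  <style>
    /* Visual language indication: prefix every element inside the
       article that carries an explicit lang attribute with its code. */
    article [lang]::before {
      content: "[" attr(lang) "] ";
      font-size: smaller;
    }
    /* Show/hide: "hide-fr" is a hypothetical class that a language
       switcher could set on <body> to hide the French sections. */
    body.hide-fr :lang(fr) {
      display: none;
    }
  </style>
</head>
<body>
  <article>
    <h1>Multilingual blogging</h1>
    <div lang="en"><p>An English summary of the post.</p></div>
    <div lang="fr"><p>Un résumé du billet en français.</p></div>
  </article>
</body>
</html>
```

The attribute selector [lang] matches only elements that carry the attribute themselves, whereas the :lang() pseudo-class also matches their descendants, which is why the first rule labels each section once and the second hides a section together with everything inside it.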

Indicating alternative versions in other languages

Pre-existing HTML standards cover this as well: use rel="alternate" together with hreflang="xx" on the link to the other version.
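
A short sketch of what this could look like on an English post that has a French counterpart. The example.org URLs and the link text are placeholders, not real pages.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>My post (illustrative)</title>
  <!-- Document-level pointer to the French version (placeholder URL). -->
  <link rel="alternate" hreflang="fr" href="https://example.org/fr/billet">
</head>
<body>
  <p>The English text of the post.</p>
  <!-- The same relationship as a visible link for readers.
       hreflang describes the language of the target page;
       lang describes the language of the link text itself. -->
  <p>
    <a rel="alternate" hreflang="fr" lang="fr"
       href="https://example.org/fr/billet">Version française</a>
  </p>
</body>
</html>
```

Using both forms is a design choice rather than a requirement: the link element is convenient for tools that only read the head of the page, while the anchor gives bilingual readers something to click.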

Making meaningful distinctions between the various kinds of alternatives

These may be translations, abstracts, or paraphrases. If they are meant to be faithful translations, it is useful to indicate which version is the original.

  • consider a rel="original" on links (or rev="original" to point to known translations of the current page)
    • could combine with votelinks for approval/disapproval of the translation
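
To make the proposal concrete, the two directions might look like the fragment below. Note that "original" is only the value suggested on this page, not a registered or standardized link type, and that the rev attribute is obsolete in current HTML, so this is a sketch of the idea rather than established practice. The example.org URLs are placeholders.

```html
<!-- On the French translation: rel="original" says that the linked
     page is the original of this one. -->
<p lang="fr">
  Traduit de
  <a rel="original" hreflang="en" lang="en"
     href="https://example.org/en/post">the English original</a>.
</p>

<!-- On the English original: rev="original" reverses the relation,
     saying that this page is the original of the linked page. -->
<p lang="en">
  Also available in
  <a rev="original" hreflang="fr" lang="fr"
     href="https://example.org/fr/billet">français</a>.
</p>
```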

If there are separately written posts in different languages on the same topic, neither is the original of the other, so the more general rel="alternate" applies.

Examples of current approaches

We can start with a multilingual blog safari: multilingual-examples

Previous discussions

Links to posts which have already reflected on this question or tried to find a solution:

Related documents: