wiki-formats

From Microformats Wiki

wiki formats

Authors

  • Tantek Çelik
  • Ben West

Intro

Ian Hickson recently lamented to me that:

"I have yet to find a wiki that has both a nice syntax (i.e. one that looks 
like text/plain as opposed to one that looks like just another obscure 
markup language -- if you're going to use markup, why not just use HTML 
in the first place), and that produces semantic markup (as opposed to 
having tags for "bold" and "italics")."

And I have to kind of agree with him. My experience with current wiki formats is that they haven't done that good a job of "paving the cowpaths", that is, taking what people write in plain text documents, and interpreting them as structure, rather than inventing new text conventions (e.g. equal signs for headings?!?) and getting people to learn them.

This page is an attempt to catalog/document current wiki and wiki-like text formats to see if there is any chance of solving this problem.

Technically a wiki format would not be a microformat because it is not expressed in XHTML building blocks. However, many of the other principles of microformats can be applied to perhaps come up with a better solution than what wikis use today (since they all seem to use their own variant formats anyway).


wiki software

MediaWiki

What you're using now.

  • paragraphs
    • blank line creates a new paragraph
  • unordered lists
    • start a line with "* " and it will put it into an unordered list.
    • use multiple "*", e.g. "** " for 2nd level, for nested unordered lists.
  • ordered lists
    • start a line with "# " and it will put it into an ordered list.
    • use multiple "#", e.g. "## " for 2nd level, for nested ordered lists.
  • headings
    • prefix and suffix with "=" for level 1 heading, "==" for level 2 heading etc.
  • literal
    • use <pre> ... </pre> tags
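The list rules above can be sketched in a few lines of Python. This is only an illustration of the convention as described here, not MediaWiki's actual parser; the function name parse_list_line and the (kind, depth, text) result shape are assumptions made for the example.

```python
import re

# A leading run of "*" (unordered) or "#" (ordered), then the item text.
LIST_ITEM = re.compile(r"^([*#]+)\s*(.*)$")


def parse_list_line(line):
    """Return (kind, depth, text) for a list line, or None for any other line."""
    match = LIST_ITEM.match(line)
    if match is None:
        return None
    markers, text = match.groups()
    # The last marker decides the list type at this nesting level.
    kind = "ul" if markers[-1] == "*" else "ol"
    # The number of markers is the nesting depth.
    return kind, len(markers), text


print(parse_list_line("* first item"))
print(parse_list_line("## nested numbered item"))
print(parse_list_line("an ordinary paragraph"))
```

The depth is simply the count of marker characters, which is what makes nesting predictable in this syntax.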

MoinMoin

What the Technorati Developer's Wiki uses.

Kwiki

Tiki Wiki

TikiWiki Syntax Reference and Formatting Guide

Important Syntax:

  • Lists
    • * Creates an unordered list.
    • # Creates a numbered list.
    • ;term:definition creates a term and definition list.
    • Features include nesting in a predictable manner, sections that can hide/display with a +/- symbol, and line continuation after breaks.
  • Links
    • JoinedWords indicate an internal wiki link.
    • ((Words|Description)) inside double parentheses also indicates an internal wiki link and can include spaces and non-standard wiki link conventions. A pipe delimits the text to be used for the link.
    • ))JoinedWords(( can escape the link parsing.
    • External links go inside square brackets [ ] with the same convention regarding the descriptive text. Many features of this wiki also allow options to be passed, e.g. nocache, after a pipe.
  • Images
    • {img src= width= height= align= desc= link= }
  • Text formatting
    • Bolding is done by placing text in between a pair of double underscores: __bolded text__
    • Text is centered by placing text in between two colons: ::centered text::
    • Text is colored by delimiting the color name and the text with a colon surrounded by a pair of double tildes. ~~blue:text~~
    • Text is italicized by surrounding the text with a double pair of single quotes: ''italicized''.
    • The syntax for monospaced/teletype text is: -+monospaced text+-
    • Underlined text is indicated with 3 equal signs: ===underlined text===
    • Text can be put in a simple box by surrounding it with the caret: ^boxed text^
  • Headings
    • Headings are indicated by the presence of an exclamation mark at the beginning of the line: !My heading. Subheadings and the level of nesting are indicated by the number of exclamation marks (the same way that lists nest). This does carry semantic purpose in the TikiWiki documentation, and the maketoc module uses this feature to make tables of contents.
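As a rough illustration of the heading and text formatting rules listed above, the following Python sketch renders a single line to HTML. It is not TikiWiki's implementation: the name render_line, the choice of HTML tags, and the cap at h6 are assumptions for the example, and it does no HTML escaping.

```python
import re

# (pattern, replacement) pairs for a few of the inline rules listed above.
INLINE_RULES = [
    (re.compile(r"__(.+?)__"), r"<b>\1</b>"),      # __bolded text__
    (re.compile(r"''(.+?)''"), r"<i>\1</i>"),      # ''italicized''
    (re.compile(r"===(.+?)==="), r"<u>\1</u>"),    # ===underlined text===
    (re.compile(r"-\+(.+?)\+-"), r"<tt>\1</tt>"),  # -+monospaced text+-
]

# One or more leading "!" marks a heading; the count gives the level.
HEADING = re.compile(r"^(!+)\s*(.*)$")


def render_line(line):
    """Render one line as a heading or a paragraph (hypothetical helper)."""
    heading = HEADING.match(line)
    if heading:
        level = min(len(heading.group(1)), 6)  # assumption: cap at h6
        tag, body = f"h{level}", heading.group(2)
    else:
        tag, body = "p", line
    for pattern, replacement in INLINE_RULES:
        body = pattern.sub(replacement, body)
    return f"<{tag}>{body}</{tag}>"


print(render_line("!My heading"))
print(render_line("!!Sub heading with __bold__ text"))
```

Note that the output uses presentational tags such as b and i, which is exactly the kind of non-semantic markup the intro complains about.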

phpwiki

Introduction and Syntax Rules.

  • Formatting (copied from http://phpwiki.sourceforge.net/phpwiki/TextFormattingRules, edited to make a list)
    • Emphasis: _ for italics, * for bold, _* for both, = for fixed width.
    • Lists: * for bullet lists, # for numbered lists, Term:<new-line> definition for definition lists.
    • Preformatted text: Enclose text in <pre></pre> or <verbatim></verbatim>.
    • Indented text: Indent the paragraph with whitespaces.
    • References: JoinCapitalizedWords or use square brackets for a [page link] or URL [http://example.com].
    • Preventing linking: Prefix with "~": ~DoNotHyperlink, name links like [text | URL].
    • Misc: "!", "!!", "!!!" make headings, "%%%" or "<br>" makes a linebreak, "----" makes a horizontal rule.
    • Allowed HTML tags: b big i small tt em strong abbr acronym cite code dfn kbd samp var sup sub


Midgard Wiki (net.nemein.wiki)

Other Resources

Should plain text formats from other non-wiki systems be included in this exploration? What about phpbb codes? Or certain blogging tools? What about Almost Free Text ( Syntax Overview ) and other plain text processing tools? There is a breed of hybrid wiki-blog systems like http://www.backpackit.com and http://www.basecamphq.com both by 37signals.

Extra-wiki Formatting Conventions

Live Chats

This includes IRC and sundry chat services such as AIM. There are several popular conventions for indicating a low level of formatting in plain text in various chat services. Text between a pair of asterisks is understood to be either an emotion or action (e.g. *grin*) or *emphasized* (perhaps equivalent to bolding). Text between a pair of /forward slashes/ is often understood to carry an italicized meaning. Text between a pair of _underscores_ is understood to be underlined.
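A minimal Python sketch of these chat conventions follows. It treats every *starred* span as emphasis (it cannot tell an action like *grin* from emphasis), and the mapping to strong, em, and u, as well as the name chat_to_html, are assumptions for illustration. The lookarounds are there so that slashes inside URLs are left alone.

```python
import re

# An opening marker not glued to a word or URL, a span that starts and ends
# with a non-space character, then the same marker again.
CHAT_MARKUP = re.compile(r"(?<![\w/:*_])([*/_])(\S(?:.*?\S)?)\1(?![\w/*_])")

# Assumed mapping from chat marker to HTML tag.
TAGS = {"*": "strong", "/": "em", "_": "u"}


def chat_to_html(message):
    """Convert *stars*, /slashes/ and _underscores_ in a single pass."""
    def replace(match):
        tag = TAGS[match.group(1)]
        return f"<{tag}>{match.group(2)}</{tag}>"
    return CHAT_MARKUP.sub(replace, message)


print(chat_to_html("that is *really* /quite/ _important_"))
print(chat_to_html("see http://example.com/a/b now"))
```

Doing all three substitutions in one pass avoids re-scanning the slashes in the closing tags that an earlier substitution produced.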

For example, the syntax used by the Yarr (☠) in-Wiki chat system.

Other Standards Efforts

Summary

Apparently most wikis use a * to indicate bulleted lists. Nesting works intuitively. New paragraphs are often indicated with newlines. Several schemes use capitalized JoinedWords to indicate an internal link, and square brackets [ ] to indicate an external link. Common problems include unexpected failure to handle nesting within certain syntax, competing formatting rules, varying degrees of semantic meaning, and arbitrary formatting codes.

Asterisks to handle unordered lists and pound signs for numbered lists probably work pretty nicely. It's common to use asterisks for lists in plain text formatting, and using a pound sign typically means "a number", and lets the user know that the system will automatically enumerate the following points. However, indicating that the following line should match the indentation of the preceding line involves strange notation. Unfortunately, arbitrarily blocked elements such as a simple box will break the nesting and continued parsing of list items in several wikis.

Likewise, although one doesn't often see exclamation points used to convey that a given line is a heading, this might work nicely as well. An exclamation point indicates importance and emphasis; having it at the beginning of the line is rare, makes the interface to the nesting behavior monotonous because it is the same as the lists, and seems just as natural (to this writer) as filling the succeeding line with dashes or equals. It also makes a lot more sense than surrounding the headline text with equal signs.

Square brackets are used in most wikis to indicate a link of some kind. However, some wikis split links into external and internal, creating a modal interface to publishing links. Furthermore, despite the standard JoinedCapitalizedWords convention for creating an internal link (and/or a new page), wiki systems freely let users ignore the convention by offering varied alternate linking methods. An additional failing of internal linking schemes is that wikis are often part of a larger content management system, and full "external" links are required anyway in order to reach components of the site. In plain text documents, it is more common to see a full url accompanied by some explanatory text. The wikis that allow a natural rendering of urls as links also provide a specialized convention for substituting text that points to the url. Perhaps a future solution would abolish the internal/external modality, parse in-line urls, and include a simple option for text substitution. For example: "http://www.google.com(Google) is a great search engine." would show up as: "[http://www.google.com Google] is a great search engine."
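The suggested url(text) convention could be parsed along these lines. This is a sketch of the idea only, rendering to an HTML link rather than to wiki bracket syntax; the name link_urls and the exact url pattern are assumptions, and a real parser would need to deal with trailing punctuation and parentheses inside urls.

```python
import re
from html import escape

# A url, optionally followed immediately by "(link text)".
URL_WITH_LABEL = re.compile(r"(https?://[^\s()]+)(?:\(([^()]*)\))?")


def link_urls(text):
    """Render bare urls as links, using the parenthesized text when present."""
    def replace(match):
        url, label = match.group(1), match.group(2)
        # Fall back to the url itself when no substitute text is given.
        return f'<a href="{escape(url)}">{escape(label or url)}</a>'
    return URL_WITH_LABEL.sub(replace, text)


print(link_urls("http://www.google.com(Google) is a great search engine."))
print(link_urls("see http://example.com/page now"))
```

Because every link is a full url, there is no internal/external distinction for the writer to learn.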

wiki formats

straw proposals

What Ian uses in his text/plain documents:

  • h1:
first level heading - followed by a line starting with equal signs "="
=============================================
  • h2:
second level heading - followed by a line starting with dashes "-"
--------
  • h3:
THIRD LEVEL HEADING - ALL CAPS ON A LINE
  • p:
    • a blank line to start and finish
  • ol / li
    • a line starting with space then a number followed immediately by a period, e.g.
 1. Here is one ordered list item
    • note that such list items may be separated by blank lines.
    • note that paragraphs within a list item will be indented as much as the text after the list item marker.
    • list is terminated by a non-blank line that *doesn't* start with space then a number then a period, and is outdented from where list item paragraphs are.
  • ul / li
    • a line starting with space then an asterisk then at least one space, e.g.
 * Here is an unordered list item
    • same notes apply respectively as those for ordered list items above.
    • nested unordered list items are similar, except that their marker is further indented, and in addition to "*", other list item markers may be used such as "+" and "-".
  • pre / code
    • text indented with some amount of whitespace. It's not clear what type of code it is (e.g. HTML or CSS).
  • em
    • text surrounded by a single adjacent underline on both sides, e.g.
_at the moment_
  • blockquote and cite attribute
    • a set of lines that begin with "| ", and after the last one, a line that starts with " -- ", followed by the citation URL, e.g.:
| This is a quote
| and a second line
 -- http://example.com/quotation/
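To see whether these conventions are mechanically recoverable, here is a small Python sketch that recognizes the three heading levels and the blockquote with its citation. It is not Ian's tooling: the names Block and parse_blocks are made up for the example, lists and multi-line paragraphs are not handled, and every remaining non-blank line is simply tagged "text".

```python
from typing import NamedTuple, Optional


class Block(NamedTuple):
    tag: str
    text: str
    cite: Optional[str] = None


def parse_blocks(source):
    """Recognize h1/h2/h3 and blockquote blocks in a text/plain document."""
    lines = source.splitlines()
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        following = lines[i + 1] if i + 1 < len(lines) else ""
        if not line.strip():
            i += 1
        elif line.startswith("| "):
            # Collect the quoted lines, then an optional " -- URL" citation.
            quote = []
            while i < len(lines) and lines[i].startswith("| "):
                quote.append(lines[i][2:])
                i += 1
            cite = None
            if i < len(lines) and lines[i].startswith(" -- "):
                cite = lines[i][4:].strip()
                i += 1
            blocks.append(Block("blockquote", " ".join(quote), cite))
        elif following.startswith("="):
            blocks.append(Block("h1", line.strip()))
            i += 2
        elif following.startswith("-"):
            blocks.append(Block("h2", line.strip()))
            i += 2
        elif line.isupper():
            blocks.append(Block("h3", line.strip()))
            i += 1
        else:
            blocks.append(Block("text", line.strip()))
            i += 1
    return blocks


sample = "Title\n=====\n\nSection\n-------\n\nDETAILS\n\n| This is a quote\n| and a second line\n -- http://example.com/quotation/\n"
for block in parse_blocks(sample):
    print(block)
```

The citation line starts with a space, so it is never mistaken for the dashed underline of a second level heading.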

Open issues:

  • What's this?
    • "-*- Mode: text; -*-": It's the Emacs mode line. Just ignore anything starting with one or more spaces and then having the form -*- ... -*-
  • how do you encode in text/plain the semantics of:
    • strong: Use *stars* instead of _underscores_.
    • dfn
    • dl/dt/dd
    • h4, h5, h6: There is no H4 in this format, only H1-H3, just as HTML has no H7 and is limited to H1-H6.
    • table / thead, tbody, tfoot, caption / tr / td, th: I have some pages that do tables; you just do an actual ASCII art table with proper ASCII art lines.
    • hyperlinked text: text/plain has no hyperlinks, so I always put them on the next line (pre/code style).
    • hyperlink relationships (rel attribute on hyperlinked text)
    • address (possibly the "Author: " line?)
    • inline code