Difference between revisions of "naming-principles"

From Microformats Wiki
(entry title, short URL)
(→‎See Also: cardinality)
Line 133:
  == See Also ==
  * [[naming-principles-faq]]
+ * [[cardinality]]
  * [[existing-classes]] for class names already in use
  * [[class-design-pattern]] to see how class names are used in microformats
  * [[semantic-xhtml-design-principles]]
  * [[naming-conventions]] for microformats.org wiki pages

Revision as of 01:54, 3 March 2012

<entry-title>Naming Principles</entry-title> One of the key microformats principles is re-use, and in particular, re-use of names of objects, properties, and values from existing formats and standards when possible.

Author
Tantek Çelik
short URL
http://tr.im/naming

Introduction

One of the key microformats principles is re-use, and in particular, re-use of names of objects, properties, and values from existing formats and standards when possible. -Tantek

I explicitly created this principle in response to the anti-patterns that I saw in many (most?) existing standards efforts such as:

  • Making up names from thin air
  • Ignoring all earlier work
  • Actual hostility towards using names/terms from other standards
  • Using others' names to mean different things
  • Using new names to mean the same thing (often in a mistaken effort to re-use semantics but rename vocabulary to something "more understandable".)
  • Endlessly debating and "name-smithing" in order to come up with a slightly more perfect name

human nature

Perhaps it is human nature to want to create new names, or name new things. Certainly there is some amount of ego involved in the creation of a new thing which you can then claim to have invented or named. Some of these tendencies are also a form of "Not Invented Here" (NIH) syndrome which unfortunately is quite common among software engineers.

novelty hurts interoperability

Unfortunately, such desire for novelty is bad for standards, and certainly bad for interoperability, which depends on the same name always meaning the same thing.

novelty hurts communication

It's also bad for language and communication among humans (e.g. see the Anglo-centric renaming anti-pattern). Even though humans can deal with some ambiguity and overloading of terms (using context to disambiguate), it's easier for humans as well when there is less ambiguity and less overloading.

documenting principles helps

We're not going to be able to fully eliminate such "Tower of Babel" tendencies, but at least we can minimize them, especially when they are bad for standards and interoperability.

With the experience of developing new microformats such as xFolk, hReview, and hAtom, it has become quite clear that we need to explicitly document some of the specific design principles that went into naming the objects and properties of some of the early established microformats like hCard, hCalendar, and hReview 0.4 (in progress). That is the purpose of this document.

Naming Principles

Semantic XHTML Design Principles

First, it is important to note the naming principles which have been defined and explicitly referenced in (most of) the above-mentioned microformats.

Note: the Semantic XHTML Design Principles were written primarily within the context of developing hCard and hCalendar, thus it may be easier to understand these principles in the context of the hCard design methodology (i.e. read that first). -Tantek

XHTML is built on XML, and thus XHTML based formats can be used not only for convenient display presentation, but also for general purpose data exchange. In many ways, XHTML based formats exemplify the best of both HTML and XML worlds. However, when building XHTML based formats, it helps to have a guiding set of principles.

  1. Reuse the schema (names, objects, properties, values, types, hierarchies, constraints) as much as possible from pre-existing, established, well-supported standards by reference. Avoid restating constraints expressed in the source standard. Informative mentions are ok.
    1. For types with multiple components, use nested elements with class names equivalent to the names of the components.
    2. Plural components are made singular, and thus multiple nested elements are used to represent multiple text values that are comma-delimited.
  2. Use the most accurately precise semantic XHTML building block for each object etc.
  3. Otherwise use a generic structural element (e.g. <span> or <div>), or the appropriate contextual element (e.g. an <li> inside a <ul> or <ol>).
  4. Use class names based on names from the original schema, unless the semantic XHTML building block precisely represents that part of the original schema. If names in the source schema are case-insensitive, then use an all lowercase equivalent. Component names implicit in prose (rather than explicit in the defined schema) should also use lowercase equivalents for ease of use. Spaces in component names become dash '-' characters.
  5. Finally, if the format of the data according to the original schema is too long and/or not human-friendly, use <abbr> instead of a generic structural element, and place the literal data into the 'title' attribute (where abbr expansions go), and the more brief and human readable equivalent into the element itself. Further informative explanation of this use of <abbr>: Human vs. ISO8601 dates problem solved
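As a rough illustration of principles 1, 4, and 5 above, the following Python sketch derives class names from source-schema names, emits one nested element per comma-delimited value, and builds the <abbr> markup for a machine-readable date. The helper names (class_name_from_schema, spans_for_values, abbr_datetime) are hypothetical and made up for this example; they are not part of any microformats tool, and the singular class name is assumed to be supplied by the caller.

```python
from html import escape
from typing import List


def class_name_from_schema(name: str) -> str:
    """Lowercase a source-schema name and turn spaces into dashes (principle 4)."""
    return "-".join(name.strip().lower().split())


def spans_for_values(singular_class: str, comma_delimited: str) -> List[str]:
    """Emit one nested element per value of a comma-delimited list (principle 1.2).

    The caller supplies the singular class name; this sketch does not
    attempt to singularize names automatically.
    """
    values = [v.strip() for v in comma_delimited.split(",") if v.strip()]
    return [
        '<span class="{}">{}</span>'.format(escape(singular_class, quote=True), escape(v))
        for v in values
    ]


def abbr_datetime(class_name: str, iso_value: str, human_text: str) -> str:
    """Put the literal data in 'title' and the readable form in the element (principle 5)."""
    return '<abbr class="{}" title="{}">{}</abbr>'.format(
        escape(class_name, quote=True),
        escape(iso_value, quote=True),
        escape(human_text),
    )


print(class_name_from_schema("DTSTART"))      # dtstart
print(class_name_from_schema("postal code"))  # postal-code
print(spans_for_values("category", "Work, Friends"))
print(abbr_datetime("dtstart", "2012-03-03T01:54", "3 March 2012"))
```

Escaping the attribute and text values is an extra precaution added for this sketch; the principles themselves only describe where each value goes.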

Some Details

  • hyphen-separated-lowercase-words. W3C CSS (Cascading Style Sheets) introduced the convention of lowercasing all property/value names (identifiers) and separating words with hyphen "-" characters for reasons of better human readability as compared to other approaches like CamelCase (or even camelCase). Microformats property names strictly adopt this approach as well.
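One way to check a candidate property name against the hyphen-separated-lowercase-words convention is a small pattern match. The exact pattern below (a leading letter, digits allowed after it, single hyphens only) is an assumption made for this sketch, as is the function name is_hyphenated_lowercase; the convention as described above does not spell out those details.

```python
import re

# Assumed pattern: lowercase words joined by single hyphens, starting with a letter.
PROPERTY_NAME = re.compile(r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*")


def is_hyphenated_lowercase(name: str) -> bool:
    """Return True if the whole name follows the hyphen-separated lowercase style."""
    return PROPERTY_NAME.fullmatch(name) is not None


print(is_hyphenated_lowercase("street-address"))  # True
print(is_hyphenated_lowercase("streetAddress"))   # False
print(is_hyphenated_lowercase("street_address"))  # False
```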

Unique Root Class Names

I've also written a bit about the design principles that went into the *root* class names in the microformats (which require somewhat different treatment than property class names), but this is currently described on the hcard-parsing page:

http://microformats.org/wiki/hcard-parsing#root_class_name

To do: copy some of that text here and make it non-hCard-specific.

Minimal Vocabulary

Main article: minimal vocabulary

Use as few terms as possible, and in particular use as few new terms as possible. The principle of "minimal vocabulary" is actually directly derived from the principle of "start as simple as possible".

  • minimal vocabulary. We try to introduce as few new microformat terms as possible. See minimal vocabulary for more detail and reasons.

Reuse

Reuse microformats first, other standards second.

This is actually outlined quite clearly in the microformats principles, but it deserves explicit repetition here, with strong emphasis.

The key here is that this principle is not only about reusing whole microformats (e.g. don't invent a new person property for your microformat, just reuse hCard), but also about where to get names for properties.

In particular, if you find that your new microformat has a property which means the same thing as a property in an existing microformat, you SHOULD (maybe I should make this a MUST) reuse the class name from that existing microformat. This practice also follows the principle of minimal vocabulary, and of re-using the same name to mean the same thing (instead of using two names to mean the same thing).
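The "microformats first, other standards second" order can be sketched as a two-step lookup. Everything named here is hypothetical: the function reuse_name, the concept keys, and the sample vocabularies are illustrative stand-ins, not real registries of names.

```python
from typing import Dict, Optional


def reuse_name(
    concept: str,
    microformat_names: Dict[str, str],
    other_standard_names: Dict[str, str],
) -> Optional[str]:
    """Return an existing name for a concept, checking microformats first.

    Returns None when neither source has a name, which is the only case
    in which a new name would even be considered.
    """
    if concept in microformat_names:
        return microformat_names[concept]
    return other_standard_names.get(concept)


# Hypothetical sample data: concept description -> existing name.
microformats = {"longer text about the item": "description"}
other_standards = {
    "longer text about the item": "extended",
    "hypothetical concept": "hypothetical-name",
}

print(reuse_name("longer text about the item", microformats, other_standards))  # description
print(reuse_name("hypothetical concept", microformats, other_standards))        # hypothetical-name
print(reuse_name("unnamed concept", microformats, other_standards))             # None
```

Deciding whether two properties really "mean the same thing" remains a human judgment; the lookup only encodes the order of preference once that judgment has been made.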

For Other Standards, Prefer Older to Newer

If there is no microformat name for a property, and we are reusing names based upon research of existing formats, then often there is more than one format with more than one name for the particular concept.

Often, new standards are developed which (most often needlessly) rename terms from older standards. Thus, to repair such naming drift, all other things being equal (e.g. both standards have been widely and interoperably implemented), we prefer the older name over the newer name.
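A minimal sketch of the "older over newer" tie-break follows. It assumes the candidates have already been filtered so that "all other things" are equal (for instance, each standard is widely implemented); the Candidate type, the prefer_older function, and the sample standards and years are all hypothetical.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str       # the name the standard uses for the concept
    standard: str   # which standard the name comes from
    year: int       # year the standard was published


def prefer_older(candidates: List[Candidate]) -> Candidate:
    """Pick the candidate from the oldest standard; earlier list entries win ties."""
    if not candidates:
        raise ValueError("no candidate names to choose from")
    return min(candidates, key=lambda c: c.year)


# Hypothetical data: two formats naming the same concept differently.
choices = [
    Candidate("title", "newer-format", 2005),
    Candidate("summary", "older-format", 1998),
]
print(prefer_older(choices).name)  # summary
```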

Examples of Following the Naming Principles

We've followed these naming principles from the start, and made changes to microformats in development as a result. For example, xFolk was changed from v0.4 to v1RC. xFolk dropped the new class name "extended" in preference for re-using the existing "description" class name. See Changes since xFolk 0.4 for details.

Naming Patterns Under Consideration as Principles

A few patterns have arisen in the naming of class names for microformats, and while these patterns are not conventions (yet), it may be worth considering them.

dt properties

So far, all datetime class names start with "dt", and all class names that start with "dt" are ISO8601 datetime properties. E.g.

Note that "dt" is also under consideration for type XOXO.

Undefined: dtstamp - hCalendar
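The "dt" pattern described in this section runs in both directions (datetime properties start with "dt", and "dt" names are datetime properties), so a check can flag any name where the two disagree. This is only a sketch: the function name dt_pattern_exceptions is hypothetical, and the sample classification of each property as datetime or not is illustrative input supplied by the caller, not an authoritative list.

```python
from typing import Dict, List


def dt_pattern_exceptions(properties: Dict[str, bool]) -> List[str]:
    """Return names where having a "dt" prefix and being a datetime disagree.

    `properties` maps a class name to True if it is an ISO 8601 datetime
    property and False otherwise.
    """
    return sorted(
        name
        for name, is_datetime in properties.items()
        if name.startswith("dt") != is_datetime
    )


# Illustrative input only.
sample = {
    "dtstart": True,
    "dtend": True,
    "summary": False,
    "published": True,  # a datetime property without the "dt" prefix
}
print(dt_pattern_exceptions(sample))  # ['published']
```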

exceptions to dt prefix

However, some proposed/under-development microformats currently have class names for datetime properties without the "dt" prefix:

Draft:

Proposed:

h word

So far, all uses of a single "h" prefix in a property name apply to (potential) root elements. But not all (potential) root elements start with "h" (which is ok).

E.g.:

Should we enforce the rule that only (potential) root elements may begin with an "h" prefix?

Non-h-prefixed root elements:

Anti-Patterns

Here are things not to do when creating names:

Namespaces

Avoid namespaces or anything resembling namespaces, such as prefixes (i.e. class names of the form "microformat-key"); read namespaces considered harmful. The problem, briefly stated, is that namespacing or prefixing encourages silo formats (instead of modular formats, one of the principles) that neither reuse other formats nor are themselves reusable, certainly not in any easy or elegant way. hAtom uses a limited amount of prefixing to exactly reuse a particular semantic from the Atom spec, but even there it uses a generic prefix "entry-" for terms that could then be reused, rather than a specific prefix like "hatom-", which would look awkward in any instance of reuse outside of hAtom.
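A simple lint for the prefix anti-pattern could flag class names that begin with a format's own name followed by a hyphen. The function name format_specific_prefixes is hypothetical, and "hatom-title" is an invented example of the awkward format-specific prefix described above, shown only to contrast with the generic "entry-" prefix.

```python
from typing import Iterable, List


def format_specific_prefixes(
    class_names: Iterable[str], format_names: Iterable[str]
) -> List[str]:
    """Return class names that start with "<format-name>-", in input order."""
    prefixes = tuple(f.lower() + "-" for f in format_names)
    return [n for n in class_names if n.lower().startswith(prefixes)]


names = ["entry-title", "hatom-title", "description"]
print(format_specific_prefixes(names, ["hatom", "hcard"]))  # ['hatom-title']
```

A generic prefix such as "entry-" passes this check, which matches the distinction drawn above between reusable generic prefixes and format-specific ones.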

Anglo-centric renaming when reusing

Main article: minimal-vocabulary#the_Anglo_centric_renaming_anti_pattern

Avoid renaming vocabulary when reusing from other specifications. Even if you think you are picking a more understandable English term, you are actually making it more confusing to non-native-English developers, and you are going to waste even diligent native-English developers' time wondering if the two terms (your new "better" term, and the original term) mean exactly the same thing or not. Why even allow for the possibility of confusion? Avoid renaming when reusing.

See Also