DRY

DRY is an acronym for "Don't Repeat Yourself".

The concept is a major reason for using such microformats as hAtom. The idea is that rather than publishing something twice (repeating yourself), once as (x)HTML for browsers and again as XML for aggregators, you simply publish once using (x)HTML and allow the tools to take care of the rest. This puts content creators first: they only have to maintain and publish one source, and the technical barriers to publishing to the web are lower.
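For example, a blog post marked up with hAtom carries its title, date, author, and body once, in the same (x)HTML that browsers render, and feed readers or aggregators can parse that markup directly. A minimal sketch (the URL, dates, and author name are made up for illustration):

  <div class="hentry">
    <h2 class="entry-title">
      <a href="http://example.com/2011/06/publish-once" rel="bookmark">Publish once</a>
    </h2>
    <abbr class="updated published" title="2011-06-01T09:00:00Z">June 1, 2011</abbr>
    <address class="author vcard"><span class="fn">Alice Author</span></address>
    <div class="entry-content">
      <p>The post body, maintained in one place and published once.</p>
    </div>
  </div>

No separate XML feed has to be written or kept in sync by hand; tools can generate or consume a feed from this markup.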

Examples of DRY failures

This section is a stub.

There are numerous past examples of DRY failures, each of which resulted in bad (meta)data, some in new and interesting ways.

meta

keywords

HTML introduced the notion of meta keywords, typically used to repeat keywords already present on the visible page. They were indexed by 1990s search engines but have since been abandoned by search engines (e.g. Google) because they're far more noise than signal, for a number of reasons (a contrived example follows the list):

  • historical rot: the visible page changed as web authors edited it, but since nothing looked wrong when viewing the web page, they failed to update the invisible and duplicated meta keywords.
  • spamming: numerous sites stuffed their meta keywords with terms that were irrelevant or barely relevant to the subject of the page.
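To illustrate both failure modes with a made-up page, the invisible keywords below both lag behind and exaggerate the visible content:

  <head>
    <title>Chocolate chip cookie recipe</title>
    <!-- invisible duplication: partly stale, partly stuffed with barely relevant terms -->
    <meta name="keywords" content="cookies, recipe, oatmeal, free, best, cheap, celebrity news">
  </head>
  <body>
    <h1>Chocolate chip cookie recipe</h1>
    ...
  </body>

Nothing on the rendered page reveals that the keywords are wrong, so there is no natural pressure to fix them.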

ogp

See: http://pinboard.in/u:benward/b:9dcc058a6e29

In short, sites are abusing the Facebook Open Graph Protocol (OGP) meta markup to provide *different* (less useful) page titles to Facebook, apparently in order to entice more clicks through to their pages.
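A hypothetical example of the mismatch (the titles are invented):

  <head>
    <!-- title shown by browsers and most search engines -->
    <title>Quarterly earnings report, Q3</title>
    <!-- different, bait-style title served to Facebook via OGP -->
    <meta property="og:title" content="You won't believe these numbers!">
  </head>

Here the duplicated title is not merely redundant; it diverges by design, which is the DRY failure.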

description

Meta description may be the one exception to meta tag failures. While it is typically used to duplicate information that is visible on the page, search engines (notably Google) do tend to display a page's meta description as part of the search result for that page, so there is additional incentive to provide something accurate and up to date there.
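A small, made-up example of that duplication, which nonetheless tends to stay maintained because it is surfaced in search results:

  <head>
    <title>Chocolate chip cookie recipe</title>
    <!-- duplicates the page's visible summary, but is shown in search results,
         so authors have a reason to keep it accurate -->
    <meta name="description" content="A simple chocolate chip cookie recipe with a 20 minute bake time.">
  </head>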

sidefiles

Sidefiles, whether in XML or some other format, over time become hopelessly out of sync with the visible HTML pages that they're supposed to represent. Here are some examples.

RSS Atom RDF

In the mid-2000s the blog search engine Technorati indexed blogs' HTML as well as any RSS, Atom, and RDF feeds. Up to 30-40% of those feeds were broken, out of date, or simply had completely different content from the blog's HTML. At a schema.org workshop in mid-2011, representatives from Google verified that they'd seen similarly poor quality when indexing RSS and Atom feeds.

XML Sitemaps

XML Sitemaps may be the only known exception to sidefile DRY rot. That is, while essentially duplicating what could be discovered by crawling the hyperlinks of a website, XML Sitemaps appear to provide a reasonably accurate map of where major files/directories/pages are on a web site.
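For reference, a minimal sitemap in the sitemaps.org XML format looks like this (the URLs and date are placeholders):

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>http://example.com/</loc>
      <lastmod>2011-06-01</lastmod>
    </url>
    <url>
      <loc>http://example.com/blog/</loc>
    </url>
  </urlset>

It duplicates what a crawler could discover by following links, but only at the level of URLs, not page content.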

This is likely due to two things:

  • overall site information architecture (IA) changes less often than page content itself, thus sitemaps get out of sync less often
  • there is substantial search-engine-optimization incentive to keep an accurate sitemap (apparently more so than there was with meta keywords)

See http://en.wikipedia.org/wiki/Site_map for more.

Regardless, given the historically poor record of sidefiles, it is unlikely that a new sidefile would succeed as XML Sitemaps have.

It is much more likely that any new sidefile would fail in the same way RSS/Atom/RDF feeds did, if it is even lucky enough to get that much adoption.
