URL formats

Many systems define and represent a URL as a set of named parts. This page documents the implicit formats those systems use.

Why

While similar names are used for the various parts of a URL, it's quite surprising how much variety there is for this fundamental building block of the web.

Why does each of these descriptions of a URL use somewhat different names (and in many cases different punctuation boundaries) than the others, and how did this happen?

Perhaps by placing them in a historical order we can make some sense of the evolution of the terminology, which has likely also diverged when adopted by different communities.

URL specification

The URL specification is perhaps the most canonical source for the names of the different parts of a URL.

1994 http://www.w3.org/Addressing/URL/url-spec.txt

Names are quoted literally, dropping any "The" prefix and "part" suffix.

HTTP

The HTTP specification has a few notes about the format/portions of HTTP URLs.

1996 http://www.ietf.org/rfc/rfc1945.txt - 3.2.1 General Syntax

Also:

Canonicalization:
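As a rough illustration of canonicalization, the sketch below applies the three rules as this editor reads them in RFC 1945 section 3.2.2: lowercase the host, drop the port when it is 80, and replace an empty path with "/". The function name canonicalize_http_url is hypothetical, and Python 3's urllib.parse is an assumed tool, not something the RFC prescribes.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize_http_url(url):
    """Hypothetical helper: a sketch of HTTP URL canonicalization.

    Assumes a plain http URL without userinfo or an IPv6 literal host.
    """
    parts = urlsplit(url)
    # .hostname is returned lowercased by urlsplit
    host = parts.hostname or ""
    port = parts.port
    # Drop the port when it is absent or equal to the default, 80
    netloc = host if port in (None, 80) else f"{host}:{port}"
    # An empty path becomes "/"
    path = parts.path or "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


print(canonicalize_http_url("http://EXAMPLE.com:80"))
```

Under these assumptions, http://EXAMPLE.com:80 and http://example.com/ canonicalize to the same string.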

DOM

1996 https://developer.mozilla.org/en/DOM/window.location#Properties

The window.location object represents the URL of the window's page and thus also has properties (terms) for the different parts of that URL.

Properties:
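To make the DOM terms concrete, here is a minimal sketch that approximates the commonly documented window.location properties (href, protocol, host, hostname, port, pathname, search, hash) from a URL string. It is written in Python rather than JavaScript so that it can sit next to the other examples on this page. The function name location_properties is hypothetical, and real browsers normalise more than this does (for example, they omit default ports).

```python
from urllib.parse import urlsplit


def location_properties(url):
    """Hypothetical helper: approximate window.location properties.

    Assumes an absolute URL without userinfo; browsers apply extra
    normalisation that this sketch does not attempt.
    """
    parts = urlsplit(url)
    return {
        "href": url,
        "protocol": parts.scheme + ":",           # includes the trailing colon
        "host": parts.netloc,                     # hostname plus optional :port
        "hostname": parts.hostname or "",
        "port": "" if parts.port is None else str(parts.port),
        "pathname": parts.path or "/",
        "search": "?" + parts.query if parts.query else "",      # includes "?"
        "hash": "#" + parts.fragment if parts.fragment else "",  # includes "#"
    }


props = location_properties(
    "http://example.com/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties"
)
for name, value in props.items():
    print(name, "=", repr(value))
```

Note how the DOM names keep the punctuation: protocol ends with ":", search starts with "?", and hash starts with "#".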

CGI

~1997-1999? Common Gateway Interface, specifically, Environment Variables

http://example.com/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties

Terms:

Environment variables:
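As a sketch of how the example URL above might map onto a few well-known CGI environment variables, the code below splits the path into SCRIPT_NAME and PATH_INFO. The function name cgi_environment is hypothetical, and the script location /cgi-bin/printenv.pl is an assumption passed in by the caller: a real server decides where the script ends from its own configuration.

```python
from urllib.parse import urlsplit, unquote


def cgi_environment(url, script_name):
    """Hypothetical helper: derive a few CGI variables from a URL.

    script_name is an assumption supplied by the caller; it must be a
    prefix of the URL path.
    """
    parts = urlsplit(url)
    if not parts.path.startswith(script_name):
        raise ValueError("script_name is not a prefix of the URL path")
    # Fall back to the scheme's default port when none is given
    default_ports = {"http": 80, "https": 443}
    port = parts.port or default_ports.get(parts.scheme)
    return {
        "SERVER_NAME": parts.hostname or "",
        "SERVER_PORT": "" if port is None else str(port),
        "SCRIPT_NAME": script_name,
        # PATH_INFO is the extra path after the script, passed decoded
        "PATH_INFO": unquote(parts.path[len(script_name):]),
        # QUERY_STRING stays URL-encoded and excludes the leading "?"
        "QUERY_STRING": parts.query,
    }


env = cgi_environment(
    "http://example.com/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties",
    "/cgi-bin/printenv.pl",
)
for name, value in env.items():
    print(name, "=", value)
```

Unlike the DOM's search property, QUERY_STRING does not include the "?" delimiter.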

Python 2

2000[1] Python 2 urlparse

Attributes
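A short sketch of the attribute names in practice. It uses urllib.parse from Python 3, which is the renamed successor of the Python 2 urlparse module, so the import line is the main difference from Python 2. The URL is the Googler example from further down this page.

```python
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse

url = ("http://video.google.co.uk:80/videoplay"
       "?docid=-7246927612831078230&hl=en#00h02m30s")
parts = urlparse(url)

# The six tuple fields
print("scheme   =", parts.scheme)
print("netloc   =", parts.netloc)
print("path     =", parts.path)
print("params   =", parts.params)
print("query    =", parts.query)
print("fragment =", parts.fragment)

# Convenience attributes derived from netloc
print("hostname =", parts.hostname)
print("port     =", parts.port)
```

Here netloc keeps the port ("video.google.co.uk:80"), while hostname and port split it apart, and neither query nor fragment includes its leading delimiter.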

URI specification

2005 URI Generic Syntax

foo://example.com:8042/over/there?name=ferret#nose
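The five generic components (scheme, authority, path, query, fragment) can be pulled out of the example above with the regular expression given in Appendix B of RFC 3986. The sketch below wraps that expression in a hypothetical helper named split_uri; the group numbers follow the appendix.

```python
import re

# Regular expression from RFC 3986, Appendix B
URI_REGEX = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri):
    """Hypothetical helper: split a URI into its five generic components.

    Components that are absent come back as None (path is never None,
    though it may be empty).
    """
    match = URI_REGEX.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


for name, value in split_uri(
    "foo://example.com:8042/over/there?name=ferret#nose"
).items():
    print(name, "=", value)
```

Note that the URI specification uses "authority" for what Python calls netloc and the DOM calls host.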

Googler

2007 Per Matt Cutts's blog post "Talk like a Googler: parts of a url", for example:

http://video.google.co.uk:80/videoplay?docid=-7246927612831078230&hl=en#00h02m30s

Parts of a url:

related

URL formats was last modified: Thursday, August 25th, 2011
