Revision as of 23:52, 21 August 2011
<entry-title>URL formats</entry-title>
Different systems define and represent a URL as a set of named parts. This page documents the part names and formats that each of those systems uses.
URL specification
The URL specification is perhaps the most canonical source for the names of the different parts of a URL.
1994 http://www.w3.org/Addressing/URL/url-spec.txt
Names are quoted literally, dropping any "The" prefix and "part" suffix.
- PrePrefix - e.g. "URL:". The portion before the "http".
- Scheme - e.g. "http"
- :
- Internet protocol parts
  - // (until the following /)
  - user name (if present, followed by an @ after optional password (see next field)).
  - password (if present, preceded by a :)
  - internet domain name - e.g. "www.w3.org"
  - port number (if present, preceded by a :)
- Path
- search
- fragmentid - "the hash sign and following"
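As a rough illustration of the part names above, the following sketch maps them onto the fields returned by Python's standard urllib.parse.urlsplit. The function name split_url and the sample URL (including its user name, password, and port) are hypothetical, and the mapping is an approximation rather than an implementation of the 1994 specification.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return the parts of a URL keyed by the 1994 URL spec's names.

    Assumption: urlsplit's fields are a close enough match for these
    names. Note that urlsplit returns the fragment without the leading
    hash sign, whereas the spec's fragmentid includes it.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user name": parts.username,
        "password": parts.password,
        "internet domain name": parts.hostname,
        "port number": parts.port,
        "path": parts.path,
        "search": parts.query,
        "fragmentid": parts.fragment,
    }


# Hypothetical example URL; only the host comes from the list above.
example = split_url("http://user:secret@www.w3.org:8080/Addressing/URL/url-spec.txt?q=1#top")
for name, value in example.items():
    print(f"{name}: {value}")
```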
HTTP
The HTTP specification has a few notes about the format/portions of HTTP URLs.
1996 http://www.ietf.org/rfc/rfc1945.txt - 3.2.1 General Syntax
- URI
  - absoluteURI
    - scheme
    - :
  - relativeURI
    - net_path
      - //
      - net_loc
      - abs_path
        - /
        - rel_path
          - path
            - fsegment
            - segment (zero or more, if present, preceded by /)
          - params (if present, preceded by ;)
          - query (if present, preceded by ?)
  - fragment (if present, preceded by #)
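To make the "preceded by" delimiters concrete, here is a minimal sketch that splits a relative URI by peeling off the fragment, query, and params in that order. The function name split_relative_uri and the sample input are hypothetical; the code does not validate the input against the RFC 1945 grammar, and it does not distinguish an absent part from an empty one.

```python
def split_relative_uri(uri):
    """Split a relativeURI (plus optional fragment) on its delimiters.

    Simplification: the returned "path" keeps its leading "/" (i.e. it
    is abs_path without params and query), unlike the grammar's path.
    """
    rest, _, fragment = uri.partition("#")   # fragment is preceded by #
    rest, _, query = rest.partition("?")     # query is preceded by ?
    rest, _, params = rest.partition(";")    # params is preceded by ;
    net_loc = None
    if rest.startswith("//"):                # net_path: // net_loc [abs_path]
        net_loc, slash, tail = rest[2:].partition("/")
        rest = slash + tail
    return {
        "net_loc": net_loc,
        "path": rest,
        "params": params,
        "query": query,
        "fragment": fragment,
    }


# Hypothetical input for illustration.
print(split_relative_uri("//www.w3.org/a/b;type=a?x=1#top"))
```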
Also:
- http_URL
  - http://
  - host
  - port (if present, preceded by :)
  - abs_path (as defined above)
Canonicalization:
- host is lowercased
- :port is omitted if the port is 80
- empty abs_path is replaced with /
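The three canonicalization rules above could be sketched as follows, using Python's standard urllib.parse. The function name canonicalize_http_url is hypothetical, and the sketch assumes a plain http URL with no user information and no IPv6 literal host; it is not a complete normalizer.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize_http_url(url):
    """Apply the three canonicalization rules listed above to an http URL."""
    parts = urlsplit(url)
    # Rule 1: host is lowercased (urlsplit's hostname is already lowercase).
    host = parts.hostname or ""
    # Rule 2: :port is omitted if the port is 80 (or not given at all).
    netloc = host if parts.port in (None, 80) else f"{host}:{parts.port}"
    # Rule 3: empty abs_path is replaced with /.
    path = parts.path or "/"
    return urlunsplit((parts.scheme, netloc, path, parts.query, parts.fragment))


print(canonicalize_http_url("http://EXAMPLE.com:80"))
print(canonicalize_http_url("http://Example.COM:8080/a?b=1"))
```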
CGI
- http://tools.ietf.org/html/rfc3875
- http://en.wikipedia.org/wiki/Common_Gateway_Interface - has example: http://example.com/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties
Terms:
- script-URI
  - scheme - derived from SERVER_PROTOCOL (e.g. "http" for "HTTP/1.1"), not identical to it
  - ://
  - server-name - SERVER_NAME
  - :
  - server-port - SERVER_PORT
  - script-path - SCRIPT_NAME, URL-encoded
  - extra-path - PATH_INFO, URL-encoded
  - ?
  - query-string - QUERY_STRING
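The terms above can be assembled into a script-URI roughly as follows. This is a sketch under stated assumptions: the function name script_uri is hypothetical, the scheme is guessed by lowercasing the protocol name in SERVER_PROTOCOL (which cannot detect HTTPS), and the port is always written out even when it is the default.

```python
from urllib.parse import quote


def script_uri(env):
    """Build a script-URI from a dict of CGI meta-variables.

    Assumption: the scheme is the lowercased protocol name from
    SERVER_PROTOCOL, e.g. "HTTP/1.1" gives "http".
    """
    scheme = env.get("SERVER_PROTOCOL", "HTTP/1.1").split("/", 1)[0].lower()
    uri = f"{scheme}://{env['SERVER_NAME']}:{env['SERVER_PORT']}"
    # SCRIPT_NAME and PATH_INFO are URL-encoded; ";", "=" and "?" get escaped.
    uri += quote(env.get("SCRIPT_NAME", ""), safe="/")
    uri += quote(env.get("PATH_INFO", ""), safe="/")
    query = env.get("QUERY_STRING", "")
    if query:
        uri += "?" + query  # QUERY_STRING is already URL-encoded
    return uri


# Hypothetical environment, loosely based on the example URL above.
env = {
    "SERVER_PROTOCOL": "HTTP/1.1",
    "SERVER_NAME": "example.com",
    "SERVER_PORT": "80",
    "SCRIPT_NAME": "/cgi-bin/printenv.pl",
    "PATH_INFO": "/ponylove",
    "QUERY_STRING": "moar=kitties",
}
print(script_uri(env))
```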
Environment variables:
- SERVER_PROTOCOL - the protocol name and version, e.g. "HTTP/1.1", not the URL scheme
- HTTP_HOST - e.g. "example.com"
- SERVER_PORT - e.g. "80"
- REMOTE_USER
- PATH - not the URL path, but the system's executable search path as seen by the web server
- REQUEST_URI - e.g. "/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties"
  - SCRIPT_NAME - e.g. "/cgi-bin/printenv.pl"
  - PATH_INFO - e.g. "/ponylove"
  - QUERY_STRING - e.g. "q=20%C001er&moar=kitties"
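The relationship between REQUEST_URI and its three sub-parts can be sketched as below. The function name split_request_uri is hypothetical, and the sketch assumes the script's path is already known, since REQUEST_URI alone does not say where SCRIPT_NAME ends and PATH_INFO begins.

```python
from urllib.parse import unquote


def split_request_uri(request_uri, script_name):
    """Split REQUEST_URI into SCRIPT_NAME, PATH_INFO and QUERY_STRING."""
    path, _, query = request_uri.partition("?")
    if not path.startswith(script_name):
        raise ValueError("REQUEST_URI does not start with SCRIPT_NAME")
    return {
        "SCRIPT_NAME": script_name,
        # PATH_INFO is the remainder of the path, URL-decoded.
        "PATH_INFO": unquote(path[len(script_name):]),
        # QUERY_STRING is left exactly as sent (still URL-encoded).
        "QUERY_STRING": query,
    }


parts = split_request_uri(
    "/cgi-bin/printenv.pl/ponylove?q=20%C001er&moar=kitties",
    "/cgi-bin/printenv.pl",
)
for name, value in parts.items():
    print(f"{name}={value}")
```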