HTTP is the foundation of the Web: it is the protocol used to follow hypertext links on the Internet. Here are some details on how it works.
HTTP (HyperText Transfer Protocol) has been the most widely used protocol on the Internet since 1990. Version 0.9 was only intended to transfer data over the Internet (in particular web pages written in HTML). Version 1.0 of the protocol (the most widely used) makes it possible to transfer messages with headers describing the content of the message using MIME-style encoding. The purpose of HTTP is to transfer files (essentially in HTML format), located by a character string called a URL, between a browser (the client) and a web server (called httpd on UNIX machines).
Communication between browser and server
Communication between the browser and the server takes place in two stages:
- The browser sends an HTTP request.
- The server processes the request, then sends an HTTP response.
In reality, the exchange takes longer once the server's processing of the request is taken into account. Since we are only interested in the HTTP protocol itself, server-side processing is not explained in this article… If this subject interests you, refer to the article on CGI processing.
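The two stages can be observed with nothing but Python's standard library. The following is a minimal sketch, not a production setup: it starts a throwaway server on the loopback interface, sends one request, and reads the response. The names `HelloHandler`, `round_trip`, and `PAGE` are assumptions made for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<html><body>Hello</body></html>"  # hypothetical document


class HelloHandler(BaseHTTPRequestHandler):
    """Stage 2: the server processes the request and sends a response."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def round_trip():
    """Run one request/response exchange and return (status, body)."""
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Stage 1: the client sends an HTTP request.
        conn = http.client.HTTPConnection("127.0.0.1", port)
        conn.request("GET", "/")
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()
```

Calling `round_trip()` returns the status code and the body sent by the handler, which makes the request/response pairing visible without any external network access.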
HTTP request
An HTTP request is a set of lines sent to the server by the browser. It comprises:
- A request line: a line specifying the type of document requested, the method to apply, and the protocol version used. The line consists of three elements, which must be separated by a space:
  - The method
  - The URL
  - The version of the protocol used by the client (usually HTTP/1.0)
- Request header fields: a set of optional lines giving additional information about the request and/or the client (browser, operating system, …). Each of these lines consists of a name identifying the type of header, followed by a colon (:) and the value of the header
- The body of the request: a set of optional lines, which must be separated from the preceding lines by an empty line, used for example to send form data to the server with a POST command
An HTTP request therefore has the following syntax (<crlf> denotes a carriage return followed by a line feed):
    METHOD URL VERSION<crlf>
    HEADER: Value<crlf>
    . . .
    HEADER: Value<crlf>
    Empty line<crlf>
    REQUEST BODY
Here is an example of an HTTP request:
    GET https://www.commentcamarche.net/ HTTP/1.0
    Accept: text/html
    If-Modified-Since: Saturday, 15-January-2000 14:37:11 GMT
    User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 95)
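To make the layout concrete, here is a small sketch of how such a request could be assembled in Python. The function name `build_request` and its parameters are assumptions made for this illustration; the only rule taken from the text is the order request line, headers, empty line, body, with each line terminated by CRLF.

```python
def build_request(method, url, headers=None, body=b"", version="HTTP/1.0"):
    """Serialize a request: request line, headers, empty line, body."""
    lines = [f"{method} {url} {version}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Every line ends with CRLF; the extra CRLF is the empty separator line.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


request = build_request(
    "GET",
    "https://www.commentcamarche.net/",
    {
        "Accept": "text/html",
        "User-Agent": "Mozilla/4.0 (compatible; MSIE 5.0; Windows 95)",
    },
)
```

The resulting `request` is a byte string that could be written to a socket as-is. For a POST, the form data would be passed as `body`, usually together with a `Content-Length` header.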
Methods
Method | Description |
---|---|
GET | Request for the resource located at the specified URL |
HEAD | Request for the header of the resource located at the specified URL |
POST | Sending data to the program located at the specified URL |
PUT | Send data to specified URL |
DELETE | Deleting the resource located at the specified URL |
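The difference between these methods can be illustrated with a toy dispatcher working on an in-memory dictionary. This is only a sketch under simplifying assumptions: the function `handle`, the `store` dictionary, and the choice of status codes are hypothetical, and a real server does considerably more.

```python
def handle(method, url, store, body=b""):
    """Apply a method to an in-memory store; return (status code, body)."""
    if method in ("GET", "HEAD"):
        if url not in store:
            return 404, b""
        # HEAD gets the same status as GET, but no body is returned.
        return 200, (store[url] if method == "GET" else b"")
    if method == "PUT":
        store[url] = body  # the data is stored at the given URL
        return 200, b""
    if method == "DELETE":
        if store.pop(url, None) is None:
            return 404, b""
        return 200, b""
    if method == "POST":
        # A real server would pass the data to the program at `url`;
        # this sketch simply echoes it back.
        return 200, body
    return 501, b""  # method not implemented
```

For example, `handle("PUT", "/b", store, b"y")` stores the data under `/b`, after which `handle("GET", "/b", store)` returns it.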
Headers
Header name | Description |
---|---|
Accept | Type of content accepted by the browser (for example text/html). See MIME types |
Accept-Charset | Character set expected by the browser |
Accept-Encoding | Data encoding (for example compression) accepted by the browser |
Accept-Language | Language expected by the browser (English by default) |
Authorization | Identification of the browser to the server |
Content-Encoding | Request body encoding type |
Content-Language | Language of the request body |
Content-Length | Request body length |
Content-Type | Request body content type (for example text/html). See MIME types |
Date | Data transfer start date |
Forwarded | Used by intermediate machines between the browser and the server |
From | Allows you to specify the client's email address |
If-Modified-Since | Allows you to specify that the document must be sent only if it has been modified since a certain date |
Link | Relationship between two URLs |
Orig-URL | Original request URL |
Referer | URL of the link from which the request was made |
User-Agent | String giving information about the client, such as browser name and version, operating system |
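Since each header is a name, a colon, and a value, reading them back is straightforward. The sketch below is one possible way to do it; the function name `parse_headers` is an assumption. Names are lowercased because HTTP header names are case-insensitive.

```python
def parse_headers(lines):
    """Turn 'Name: value' lines into a dict keyed by lowercase name."""
    headers = {}
    for line in lines:
        # Split on the first colon only: values such as dates contain colons.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a header line; skipped in this sketch
        headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_headers([
    "Accept: text/html",
    "If-Modified-Since: Saturday, 15-January-2000 14:37:11 GMT",
    "User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 95)",
])
```

With this input, `headers["accept"]` holds `text/html`, and the date value is kept whole despite the colons it contains.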
HTTP Response
An HTTP response is a set of lines sent to the browser by the server. It comprises:
- A status line: a line specifying the protocol version used and the status of the processing of the request, given as a code and an explanatory text. The line consists of three elements, which must be separated by a space:
  - The version of the protocol used
  - The status code
  - The meaning of the code
- Response header fields: a set of optional lines giving additional information about the response and/or the server. Each of these lines consists of a name identifying the type of header, followed by a colon (:) and the value of the header
- The response body: it contains the requested document
An HTTP response therefore has the following syntax (<crlf> denotes a carriage return followed by a line feed):
    HTTP-VERSION CODE EXPLANATION<crlf>
    HEADER: Value<crlf>
    . . .
    HEADER: Value<crlf>
    Empty line<crlf>
    RESPONSE BODY
Here is an example of an HTTP response:
    HTTP/1.0 200 OK
    Date: Sat, 15 Jan 2000 14:37:12 GMT
    Server: Microsoft-IIS/2.0
    Content-Type: text/html
    Content-Length: 1245
    Last-Modified: Fri, 14 Jan 2000 08:25:13 GMT
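A response of this shape can be taken apart in a few lines. The following sketch assumes a complete, well-formed response held in memory; the function name `parse_response` is hypothetical, and malformed input is not handled.

```python
def parse_response(raw):
    """Split raw response bytes into (version, code, reason, headers, body)."""
    # The first empty line separates the headers from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Status line: version, code, then the explanatory text,
    # which may itself contain spaces (for example "Not Found").
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Date: Sat, 15 Jan 2000 14:37:12 GMT\r\n"
    b"Server: Microsoft-IIS/2.0\r\n"
    b"Content-Type: text/html\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
version, code, reason, headers, body = parse_response(raw)
```

Here the `Content-Length` value was shortened to match the five-byte body used in the sketch.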
Response headers
Header name | Description |
---|---|
Content-Encoding | Response body encoding type |
Content-Language | Language of the response body |
Content-Length | Response body length |
Content-Type | Response body content type (e.g. text/html). |
Date | Data transfer start date |
Expires | Data consumption deadline |
Forwarded | Used by intermediate machines between the browser and the server |
Location | Redirection to a new URL associated with the document |
Server | Characteristics of the server that sent the response |
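As an illustration of how a client might use one of these headers, the sketch below checks whether the deadline given by Expires has passed. The function name `is_expired` and the decision to treat a missing header as "not expired" are assumptions of this example; the date is assumed to be in the format shown in the response example above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(headers, now=None):
    """Return True if the Expires header names a moment that has passed."""
    expires = headers.get("Expires")
    if expires is None:
        return False  # no deadline given; treated as not expired here
    now = now or datetime.now(timezone.utc)
    return parsedate_to_datetime(expires) <= now
```

Passing `now` explicitly makes the function easy to try with fixed dates. Without it, the current UTC time is used.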
Response Codes
These are the codes you see when the browser cannot give you the requested page. The response code consists of three digits: the first indicates the status class, and the following two specify the exact nature of the status or error.
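Because the class is carried by the first digit, it can be recovered with an integer division. A minimal sketch, in which the names `STATUS_CLASSES` and `status_class` are assumptions made for the example:

```python
STATUS_CLASSES = {
    1: "Information",
    2: "Success",
    3: "Redirect",
    4: "Client error",
    5: "Server error",
}


def status_class(code):
    """Return the class named by the first digit of a three-digit code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid status code: {code}")
    return STATUS_CLASSES[code // 100]
```

For instance, `status_class(404)` gives `Client error` and `status_class(503)` gives `Server error`.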
Code | Message | Description |
---|---|---|
10x | Information message | These codes are not used in version 1.0 of the protocol |
20x | Success | These codes indicate the smooth running of the transaction |
200 | OK | The request was completed successfully |
201 | CREATED | Follows a POST command and indicates success; the body of the response is supposed to indicate the URL at which the newly created document can be found. |
202 | ACCEPTED | The request has been accepted, but the following procedure has not been completed |
203 | PARTIAL INFORMATION | When this code is received in response to a GET command, it indicates that the response is not complete. |
204 | NO RESPONSE | The server received the request but there is no information to return |
205 | RESET CONTENT | The server tells the browser to delete the contents of the fields of a form |
206 | PARTIAL CONTENT | This is a response to a request containing the Range header. The server must indicate the Content-Range header |
30x | Redirect | These codes indicate that the resource is no longer in the indicated location |
301 | MOVED | The requested data has been transferred to a new address |
302 | FOUND | The requested data is at a new URL, but may have moved since… |
303 | METHOD | This implies that the client should try a new address, preferably trying a method other than GET |
304 | NOT MODIFIED | If the client has made a conditional GET command (asking if the document has been modified since the last time) and the document has not been modified, it returns this code. |
40x | Client error | These codes indicate that the request is incorrect |
400 | BAD REQUEST | The request syntax is malformed or the request is impossible to satisfy |
401 | UNAUTHORIZED | The message parameter specifies the acceptable forms of authorization. The client must reformulate its request with the correct authorization data |
402 | PAYMENT REQUIRED | The client must reformulate its request with the correct payment data |
403 | FORBIDDEN | Access to the resource is simply prohibited |
404 | NOT FOUND | A classic! The server found nothing at the specified address. Gone without leaving a forwarding address… 🙂 |
50x | Server error | These codes indicate that there has been an internal server error |
500 | INTERNAL ERROR | The server encountered an unexpected condition that prevented it from fulfilling the request (servers have their bad days too…) |
501 | NOT IMPLEMENTED | The server does not support the requested service (nobody can do everything…) |
502 | BAD GATEWAY | The server received an invalid response from the server it was trying to access while acting as a gateway or proxy |
503 | SERVICE UNAVAILABLE | The server cannot answer you at the moment because the traffic is too heavy (all the lines of your correspondent are busy, please call back later) |
504 | GATEWAY TIMEOUT | The response from the server was too long compared to the time the gateway was prepared to wait for it (the time allotted to you has now expired…) |
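The 304 code in the table above is the server's half of a conditional GET. The decision can be sketched as follows; the function name `conditional_get` is hypothetical, and both dates are assumed to be in the `Sat, 15 Jan 2000 14:37:12 GMT` style that Python's `email.utils` can parse.

```python
from email.utils import parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return 304 if the document is unchanged since the client's date, else 200."""
    if if_modified_since is None:
        return 200  # plain GET: the document is always sent
    modified = parsedate_to_datetime(last_modified)
    since = parsedate_to_datetime(if_modified_since)
    # Unchanged since the client's copy: no body needs to be sent again.
    return 304 if modified <= since else 200
```

A document last modified on Friday and requested with a Saturday `If-Modified-Since` date yields 304, so the browser can reuse its cached copy.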
For more information on the HTTP protocol, it is best to refer to RFC 1945, which explains the protocol in detail.