HTTP Message Format

The HTTP specification [RFC 2616] defines the HTTP message formats. There are two types of HTTP messages, request messages and response messages, both of which are discussed below.

HTTP Request Message


Below we provide a typical HTTP request message:

GET /somedir/page.html HTTP/1.1
Host: www.someschool.edu
Connection: close
User-agent: Mozilla/4.0
Accept-language: fr

We can learn a lot by taking a close look at this simple request message. First of all, we see that the message is written in ordinary ASCII text, so that any computer-literate person can read it. Second, we see that the message consists of five lines, each followed by a carriage return and a line feed. The last line is followed by an additional carriage return and line feed. Although this specific request message has five lines, a request message can have many more lines or as few as one line. The first line of an HTTP request message is called the request line; the subsequent lines are called the header lines. The request line has three fields: the method field, the URL field, and the HTTP version field. The method field can take on several different values, including GET, POST, HEAD, PUT, and DELETE. The great majority of HTTP request messages use the GET method. The GET method is used when the browser requests an object, with the requested object identified in the URL field. In this example, the browser is requesting the object /somedir/page.html. The version field is self-explanatory; in this example, the browser implements version HTTP/1.1.
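To make this structure concrete, the short Python sketch below splits the example request into its request line and header lines, and then splits the request line into its three fields. The function name split_request is our own choice for illustration; this is a minimal sketch that assumes a well-formed message, not a complete HTTP parser.

```python
# The example request from the text, with explicit CR LF line endings.
raw_request = (
    "GET /somedir/page.html HTTP/1.1\r\n"
    "Host: www.someschool.edu\r\n"
    "Connection: close\r\n"
    "User-agent: Mozilla/4.0\r\n"
    "Accept-language: fr\r\n"
    "\r\n"  # the additional CR LF that follows the last line
)


def split_request(message: str):
    """Return (method, url, version, header_lines) for a request message.

    Hypothetical helper: assumes a well-formed message whose lines end in CR LF.
    """
    # Everything before the first empty line is the request line plus headers.
    head, _, _entity_body = message.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # The request line has three space-separated fields.
    method, url, version = lines[0].split(" ")
    return method, url, version, lines[1:]


method, url, version, header_lines = split_request(raw_request)
print(method)             # the method field
print(url)                # the URL field
print(version)            # the HTTP version field
print(len(header_lines))  # number of header lines after the request line
```

Of the five lines in the example, the first is the request line and the remaining four are header lines.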

Now let's consider the header lines in the example. The header line Host: www.someschool.edu specifies the host on which the object resides. You might think that this header line is unnecessary, as there is already a TCP connection in place to the host. But, as we'll see in "Web Caching", the information provided by the Host: header line is required by Web proxy caches. By including the Connection: close header line, the browser is telling the server that it doesn't want to bother with persistent connections; it wants the server to close the connection after sending the requested object. The User-agent: header line specifies the user agent, that is, the browser type that is making the request to the server. Here the user agent is Mozilla/4.0, a Netscape browser. This header line is useful because the server can actually send different versions of the same object to different types of user agents. (Each of the versions is addressed by the same URL.) Lastly, the Accept-language: header line shows that the user prefers to receive a French version of the object, if such an object exists on the server; otherwise, the server should send its default version. The Accept-language: header is just one of many content negotiation headers available in HTTP.
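A server might read these header lines roughly as in the following Python sketch. The helper parse_headers and the variable names are hypothetical; the sketch assumes each header line has the form "name: value" and lowercases the names because HTTP header names are case-insensitive.

```python
def parse_headers(header_lines):
    """Turn 'Name: value' lines into a dict keyed by lowercased header name.

    Hypothetical helper for illustration; repeated headers are not handled.
    """
    headers = {}
    for line in header_lines:
        # Split at the first colon only, since values may contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


example_lines = [
    "Host: www.someschool.edu",
    "Connection: close",
    "User-agent: Mozilla/4.0",
    "Accept-language: fr",
]
headers = parse_headers(example_lines)

# The browser asked for a non-persistent connection.
close_after_response = headers.get("connection", "").lower() == "close"
# The user prefers French if such a version of the object exists.
preferred_language = headers.get("accept-language")
print(close_after_response, preferred_language)
```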

Having looked at an example, let us now consider the general format of a request message, as shown in Figure 1. We see that the general format closely follows our earlier example. You may have noticed, however, that after the header lines (and the additional carriage return and line feed) there is an "entity body". The entity body is empty with the GET method, but is used with the POST method. An HTTP client sometimes uses the POST method when the user fills out a form, for instance when a user provides search words to a search engine. With a POST message, the user is still requesting a Web page from the server, but the particular contents of the Web page depend on what the user entered into the form fields. If the value of the method field is POST, then the entity body contains what the user entered into the form fields.

Figure 1: General format of an HTTP request message
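The general format can also be expressed as a small Python function that assembles a request line, the header lines, the empty line, and an entity body. This is a sketch under stated assumptions: the function name build_request, the path /search, and the form field name q are all hypothetical and do not come from the text.

```python
def build_request(method, url, headers, body=b""):
    """Assemble request line, header lines, an empty line, and the entity body.

    Hypothetical helper: no validation of the method, URL, or header values.
    """
    lines = [f"{method} {url} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CR LF; one extra CR LF separates headers from body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical form submission: the entity body carries what the user typed.
form_body = b"q=computer+networks"
post_message = build_request(
    "POST",
    "/search",  # assumed path, for illustration only
    {
        "Host": "www.someschool.edu",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form_body)),
    },
    form_body,
)

# With GET the entity body is simply left empty.
get_message = build_request("GET", "/somedir/page.html",
                            {"Host": "www.someschool.edu"})
print(post_message.decode("ascii"))
```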

We would be remiss if we didn't mention that a request generated with a form does not necessarily use the POST method. Instead, HTML forms often use the GET method and include the entered data (from the form fields) in the requested URL. For instance, if a form uses the GET method, has two fields, and the inputs to the two fields are monkeys and bananas, then the URL will have the structure www.somesite.cm/animalsearch?monkeys&bananas. In your day-to-day Web surfing, you have probably noticed extended URLs of this kind.
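The Python sketch below shows how such an extended URL might be built and then decoded again. Note one difference from the simplified URL above: browsers normally send each input together with the name of its form field, as name=value pairs. The field names first and second are hypothetical, chosen only for this illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical field names; the values are the two inputs from the text.
form_fields = {"first": "monkeys", "second": "bananas"}

# urlencode joins the name=value pairs with "&" and escapes unsafe characters.
query = urlencode(form_fields)
url = "http://www.somesite.cm/animalsearch?" + query
print(url)

# A server can recover the inputs from the query part of the URL.
recovered = parse_qs(urlsplit(url).query)
print(recovered["first"], recovered["second"])
```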

The HEAD method is similar to the GET method. When a server receives a request with the HEAD method, it responds with an HTTP message but it leaves out the requested object. Application developers often use the HEAD method for debugging. The PUT method is often used in conjunction with Web publishing tools. It allows a user to upload an object to a particular path (directory) on a particular Web server. The PUT method is also used by applications that need to upload objects to Web servers. The DELETE method allows a user, or an application, to delete an object on a Web server.

HTTP Response Message

Below we provide a typical HTTP response message. This response message could be the response to the example request message just discussed.

HTTP/1.1 200 OK
Connection: close
Date: Sat, 07 Jul 2007 12:00:15 GMT
Server: Apache/1.3.0 (Unix)
Last-Modified: Sun, 6 May 2007 09:23:24 GMT
Content-Length: 6821
Content-Type: text/html

(data data data data data . . . )

Let's take a careful look at this response message. It has three sections: an initial status line, six header lines, and then the entity body. The entity body is the meat of the message: it contains the requested object itself (represented by data data data data data . . . ). The status line has three fields: the protocol version field, a status code, and a corresponding status message. In this example, the status line shows that the server is using HTTP/1.1 and that everything is OK (that is, the server has found, and is sending, the requested object).
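A client could pull the three fields out of the status line along the lines of the following Python sketch. The function name parse_status_line is hypothetical. The split is limited to two cuts because the status message may itself contain spaces.

```python
def parse_status_line(status_line: str):
    """Return (version, status_code, status_message) from a status line.

    Hypothetical helper: assumes all three fields are present.
    """
    # Split on whitespace at most twice so a message such as "Not Found"
    # stays in one piece.
    version, code, message = status_line.split(None, 2)
    return version, int(code), message


print(parse_status_line("HTTP/1.1 200 OK"))
print(parse_status_line("HTTP/1.1 404 Not Found"))
```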

Now let's consider the header lines. The server uses the Connection: close header line to tell the client that it is going to close the TCP connection after sending the message. The Date: header line shows the time and date when the HTTP response was created and sent by the server. Note that this is not the time when the object was created or last modified; it is the time when the server retrieves the object from its file system, inserts the object into the response message, and sends the response message. The Server: header line shows that the message was created by an Apache Web server; it is similar to the User-agent: header line in the HTTP request message. The Last-Modified: header line shows the time and date when the object was created or last modified. The Last-Modified: header, which we will soon cover in more detail, is critical for object caching, both in the local client and in network cache servers (also known as proxy servers). The Content-Length: header line shows the number of bytes in the object being sent. The Content-Type: header line indicates that the object in the entity body is HTML text. (The object type is officially indicated by the Content-Type: header and not by the file extension.)
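The difference between the Date: and Last-Modified: values can be checked with a few lines of Python. This sketch uses the standard library function email.utils.parsedate_to_datetime, which understands this date format; the variable names are our own.

```python
from email.utils import parsedate_to_datetime

# Header values copied from the example response message.
date_value = "Sat, 07 Jul 2007 12:00:15 GMT"
last_modified_value = "Sun, 6 May 2007 09:23:24 GMT"

sent_at = parsedate_to_datetime(date_value)
modified_at = parsedate_to_datetime(last_modified_value)

# How old the object was at the moment the server sent the response.
age = sent_at - modified_at
print(sent_at.isoformat())
print(modified_at.isoformat())
print(age.days, "days since the object was last modified")
```

In the example, the object had not been modified for about two months when the server sent it, which is the kind of information a cache can use.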

Having looked at an example, let's now examine the general format of a response message, which is shown in Figure 2. This general format matches the previous example of a response message. Let's say a few additional words about status codes and their phrases. The status code and associated phrase indicate the result of the request. Some common status codes and associated phrases include:

● 200 OK: Request succeeded and the information is returned in the response.
● 301 Moved Permanently: Requested object has been permanently moved; the new URL is specified in the Location: header of the response message. The client software will automatically retrieve the new URL.
● 400 Bad Request: This is a generic error code indicating that the request could not be understood by the server.
● 404 Not Found: The requested document does not exist on this server.
● 505 HTTP Version Not Supported: The requested HTTP protocol version is not supported by the server.

Figure 2: General format of an HTTP response message
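Python's standard library already knows these codes and phrases, so a program rarely needs its own table. The sketch below looks up the five codes listed above with http.HTTPStatus and adds the general class of each code, which is given by its first digit. The function name describe and the wording of the class labels are our own.

```python
from http import HTTPStatus

# The first digit of a status code identifies its general class.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code: int) -> str:
    """Return 'code phrase (class)' for a known status code (hypothetical helper)."""
    status = HTTPStatus(code)  # raises ValueError for codes it does not know
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (200, 301, 400, 404, 505):
    print(describe(code))
```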

How would you like to see a real HTTP response message? This is highly recommended and very easy to do! First, Telnet into your favorite Web server. Then type in a short request message for some object that is housed on the server. For instance, if you have access to a command prompt, type:

telnet cis.poly.edu 80

GET /~ross/ HTTP/1.1
Host: cis.poly.edu

(Press the carriage return twice after typing the last line.) This opens a TCP connection to port 80 of the host cis.poly.edu and then sends the HTTP request message. You should see a response message that contains the base HTML file of Professor Ross's homepage. If you'd rather just see the HTTP message lines and not receive the object itself, replace GET with HEAD. Finally, replace /~ross/ with /~banana/ and see what kind of response message you get.
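If Telnet is not available, the same experiment can be approximated with a few lines of Python. This is a minimal sketch, not a robust HTTP client. The function names make_request and fetch are hypothetical, and we add a Connection: close header line (not part of the Telnet session above) so that the server closes the connection and the read loop terminates. The host from the text may no longer be reachable, so the call is shown only as a comment.

```python
import socket


def make_request(method: str, path: str, host: str) -> bytes:
    """Build the request message typed in the Telnet session, plus Connection: close."""
    return (
        f"{method} {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"  # our addition, so the server ends the exchange
        "\r\n"                   # the empty line ("press return twice")
    ).encode("ascii")


def fetch(host: str, path: str, method: str = "GET", port: int = 80) -> bytes:
    """Open a TCP connection, send one request, and read until the server closes."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(make_request(method, path, host))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # an empty read means the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)


# Example use (requires network access):
# print(fetch("cis.poly.edu", "/~ross/", method="HEAD").decode("iso-8859-1"))
```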

In this section we discussed a number of header lines that can be used within HTTP request and response messages. The HTTP specification defines many more header lines that can be inserted by browsers, Web servers, and network cache servers. We have covered only a small fraction of them. We'll cover a few more below and another small number when we discuss network Web caching in "Web Caching". A very readable and comprehensive discussion of the HTTP protocol, including its headers and status codes, is given in [Krishnamurty 2001]; see also [Luotonen 1998] for a developer's view.

How does a browser decide which header lines to include in a request message? How does a Web server decide which header lines to include in a response message? A browser will generate header lines as a function of the browser type and version (for example, an HTTP/1.0 browser will not generate any 1.1 header lines), the user configuration of the browser (for example, preferred language), and whether the browser currently has a cached, but possibly out-of-date, version of the object. Web servers behave similarly: there are different products, versions, and configurations, all of which influence which header lines are included in response messages.

