The Web and HTTP

Until the early 1990s the Internet was used mainly by researchers, academics, and university students to log in to remote hosts, to transfer files between local and remote hosts, to receive and send news, and to receive and send electronic mail. Though these applications were (and continue to be) very useful, the Internet was virtually unknown outside the academic and research communities. Then, in the early 1990s, a major new application arrived on the scene - the World Wide Web [Berners-Lee 1994]. The Web was the first Internet application that caught the general public's eye. It radically changed, and continues to change, how people interact inside and outside their work environments. It raised the Internet from just one of many data networks to essentially the one and only data network.

Perhaps what appeals most to users is that the Web operates on demand. Users receive what they want, when they want it. This is different from broadcast radio and television, which force users to tune in when the content provider makes the content available. In addition to being available on demand, the Web has many other features that people value. It is very easy for any individual to make information available over the Web - everyone can become a publisher at very low cost. Hyperlinks and search engines help us navigate through an ocean of Web sites. Graphics stimulate our senses. Forms, Java applets, and many other devices enable us to interact with pages and sites. And more and more, the Web provides a menu interface to vast quantities of audio and video material stored in the Internet - multimedia that can be accessed on demand.

Overview of HTTP


The HyperText Transfer Protocol (HTTP), the Web's application-layer protocol, is at the heart of the Web. It is defined in [RFC 1945] and [RFC 2616]. HTTP is implemented in two programs: a client program and a server program. The client program and server program, executing on different end systems, talk to each other by exchanging HTTP messages. HTTP defines the structure of these messages and how the client and server exchange them. Before explaining HTTP in detail, we should review some Web terminology.

A Web page (also called a document) consists of objects. An object is just a file - such as an HTML file, a JPEG image, a Java applet, or a video clip - that is addressable by a single URL. Most Web pages consist of a base HTML file and several referenced objects. For instance, if a Web page includes HTML text and five JPEG images, then the Web page has six objects: the base HTML file plus the five images. The base HTML file references the other objects in the page with the objects' URLs. Each URL has two components: the hostname of the server that houses the object and the object's path name. For instance, the URL

http://www.someSchool.edu/someDepartment/picture.gif

has www.someSchool.edu for a hostname and /someDepartment/picture.gif for a path name. Because Web browsers (such as Internet Explorer and Firefox) implement the client side of HTTP, in the context of the Web we will use the words browser and client interchangeably. Web servers, which implement the server side of HTTP, house Web objects, each addressable by a URL. Popular Web servers include Apache and Microsoft Internet Information Server.
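The two URL components described above can be pulled apart with Python's standard urllib.parse module. This is only a small sketch; the helper name split_url is made up for illustration. Note that Python reports the hostname in lowercase, which is harmless because hostnames are case-insensitive.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Return (hostname, path name) for a URL. Hypothetical helper."""
    parts = urlsplit(url)
    # .hostname is normalized to lowercase; .path keeps its original case.
    return parts.hostname, parts.path


host, path = split_url("http://www.someSchool.edu/someDepartment/picture.gif")
print(host)  # www.someschool.edu
print(path)  # /someDepartment/picture.gif
```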

HTTP defines how Web clients request Web pages from Web servers and how servers transfer Web pages to clients. We discuss the interaction between client and server in detail later, but the general idea is illustrated in Figure 1. When a user requests a Web page (for instance, clicks on a hyperlink), the browser sends HTTP request messages for the objects in the page to the server. The server receives the requests and responds with HTTP response messages that contain the objects.
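To make the idea of "one request per object" concrete, the following sketch lists the objects a browser would have to request for a page with five images. It is a simplification, assuming that only img tags reference objects; the page URL, the image file names, and the names ObjectCollector and objects_in_page are all invented for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ObjectCollector(HTMLParser):
    """Collect the URLs of images referenced by a base HTML file."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.object_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        for name, value in attrs:
            if name == "src" and value:
                # Resolve a relative reference against the base file's URL.
                self.object_urls.append(urljoin(self.base_url, value))


def objects_in_page(base_url, html_text):
    """Return the base HTML file's URL followed by its referenced objects."""
    collector = ObjectCollector(base_url)
    collector.feed(html_text)
    return [base_url] + collector.object_urls


# Hypothetical page: HTML text plus five JPEG images.
BASE_URL = "http://www.someSchool.edu/someDepartment/home.html"
IMAGES = "".join(f'<img src="pic{i}.jpg">' for i in range(1, 6))
HTML_TEXT = f"<html><body><p>Welcome</p>{IMAGES}</body></html>"

urls = objects_in_page(BASE_URL, HTML_TEXT)
for url in urls:
    print("request:", url)
```

With this input the list has six entries, matching the count of six objects for a base HTML file with five images.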

Figure 1: HTTP request-response behavior

HTTP uses TCP as its underlying transport protocol (rather than running on top of UDP). The HTTP client first initiates a TCP connection with the server. Once the connection is established, the browser and the server processes access TCP through their socket interfaces. As explained in "Principles of Network Applications", on the client side the socket interface is the door between the client process and the TCP connection; on the server side it is the door between the server process and the TCP connection. The client sends HTTP request messages into its socket interface and receives HTTP response messages from its socket interface. Likewise, the HTTP server receives request messages from its socket interface and sends response messages into its socket interface. Once the client sends a message into its socket interface, the message is out of the client's hands and is "in the hands" of TCP. Recall from "Principles of Network Applications" that TCP provides a reliable data transfer service to HTTP. This implies that each HTTP request message sent by a client process eventually arrives intact at the server; likewise, each HTTP response message sent by the server process eventually arrives intact at the client. Here we see one of the great advantages of a layered architecture - HTTP need not worry about lost data or the details of how TCP recovers from loss or reordering of data within the network. That is the job of TCP and the protocols in the lower layers of the protocol stack.
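The exchange through socket interfaces can be sketched with Python's socket module. This is a toy illustration rather than a real Web server: it runs both ends on the loopback address, serves exactly one connection, and uses a minimal HTTP/1.0-style request line and response that this section has not yet introduced. The function names serve_once and fetch and the response body are assumptions made for the example.

```python
import socket
import threading


def serve_once(listener):
    """Accept one TCP connection, read one request, send one response."""
    conn, _addr = listener.accept()
    with conn:
        request = b""
        # Read from the socket until the blank line that ends the request.
        while b"\r\n\r\n" not in request:
            chunk = conn.recv(1024)
            if not chunk:
                break
            request += chunk
        body = b"hello"  # placeholder object contents
        header = "HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n" % len(body)
        # Hand the response to TCP; reliable delivery is TCP's job.
        conn.sendall(header.encode("ascii") + body)


def fetch(host, port, path):
    """Open a TCP connection, send a request, return the raw response."""
    with socket.create_connection((host, port)) as sock:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        response = b""
        while True:  # read until the server closes the connection
            chunk = sock.recv(1024)
            if not chunk:
                break
            response += chunk
    return response


# Listen on an OS-chosen free port on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
port = listener.getsockname()[1]

server = threading.Thread(target=serve_once, args=(listener,))
server.start()
response = fetch("127.0.0.1", port, "/someDepartment/picture.gif")
server.join()
listener.close()

print(response.decode("ascii"))
```

Neither function contains any retransmission or ordering logic; both simply write to and read from their sockets and leave reliability to TCP, which is the layering benefit described above.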

It is important to note that the server sends requested files to clients without storing any state information about the client. If a particular client asks for the same object twice within a few seconds, the server does not respond by saying that it just served the object to the client; instead, the server resends the object, as it has completely forgotten what it did earlier. Because an HTTP server maintains no information about the clients, HTTP is said to be a stateless protocol. We also remark that the Web uses the client-server application architecture, as explained in "Principles of Network Applications". A Web server is always on, with a fixed IP address, and it services requests from potentially millions of different browsers.
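Statelessness can be illustrated with a handler whose reply depends only on the current request and the stored objects, never on earlier requests. The sketch below is a deliberately simplified model: the function handle_request, the OBJECTS table, and its contents are hypothetical, and the numeric status codes 200 and 404 are the standard HTTP codes for success and "not found".

```python
def handle_request(path, objects):
    """Return (status code, body) using only the request and stored objects.

    Nothing about the client or about earlier requests is recorded.
    """
    if path in objects:
        return 200, objects[path]
    return 404, b""


# Hypothetical object store: path name -> file contents.
OBJECTS = {"/someDepartment/picture.gif": b"placeholder image bytes"}

first = handle_request("/someDepartment/picture.gif", OBJECTS)
second = handle_request("/someDepartment/picture.gif", OBJECTS)
missing = handle_request("/noSuchObject.gif", OBJECTS)

# The second request is answered exactly like the first.
print(first == second)
```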


Copyright

The contents available on this website are copyrighted by TechPlus unless otherwise indicated. All rights are reserved by TechPlus, and content may not be reproduced, published, or transferred in any form or by any means, except with the prior written permission of TechPlus.