Peer-to-Peer Applications

Peer-to-Peer Applications

The applications explained in this section so far - including the Web, e-mail, and DNS - all employ client-server architectures with significant reliance on always-on infrastructure servers. Recall from "Network Application Architectures" that with a P2P architecture, there is minimal (or no) reliance on always-on infrastructure servers. Instead, pairs of intermittently connected hosts, called peers, communicate directly with each other. The peers are not owned by a service provider, but are instead desktops and laptops controlled by users.

In this section we'll study three different applications that are specifically well-suited for P2P designs. The first is file distribution, where the application distributes a file from a single source to a large number of peers. File distribution is a nice place to start our investigation of P2P, as it clearly exposes the self-scalability of P2P architectures. As a particular example for file distribution, we'll explain the popular BitTorrent system. The second P2P application we'll study is a database distributed over a large community of peers. For this application, we'll explore the concept of a Distributed Hash Table (DHT). Finally, for our third application, we'll study Skype, an extraordinarily successful P2P Internet telephony application.

P2P File Distribution


We begin our foray into P2P by considering a very natural application, namely, distributing a large file from a single server to a large number of hosts (called peers). The file might be a new version of the Linux operating system, a software patch for an existing operating system or application, an MP3 music file, or an MPEG video file. In client-server file distribution, the server must send a copy of the file to each of the peers - placing an huge burden on the server and consuming a large amount of server bandwidth. In P2P file distribution, each peer can redistribute any portion of the file it has received to any other peers, thereby assisting the server in the distribution process. As of this writing the most popular P2P file distribution protocol is BitTorrent. Originally developed by Bram Cohen, there are now many different independent BitTorrent clients conforming to the BitTorrent protocol, just as there are a number of Web browser clients that conform to the HTTP protocol. In this subsection, we first look at the self-scalability of P2P architectures in the context of file distribution. We then explain BitTorrent in some detail, highlighting its most important characteristics and features.

Scalability of P2P Architectures


To compare client-server architectures with peer-to-peer architectures, and illustrate the inherent self-scalability of P2P, we now examine a simple quantitative model for distributing a file to a fixed set of peers for both architecture types. As shown in Figure 1, the server and the peers are connected to the Internet with access links. Denote the upload rate of the server's access link by us, the upload rate of the ith peer's access link by ui, and the download rate of the ith peer's access link by di. Also denote the size of the file to be distributed (in bits) by F and the number of peers that want to obtain a copy of the file by N. The distribution time is the time it takes to get a copy of the file to all N peers. In our analysis of the distribution time below, for both client-server and P2P architectures, we make the simplifying (and usually accurate assumption that the Internet core has abundant

An illustrative file distribution problem

bandwidth, implying that all of the bottlenecks are in network access. We also assume that the server and clients are not participating in any other network applications, so that all of their upload and download access bandwidth can be fully devoted to distributing this file.

Let's first determine the distribution time for the client-server architecture, which we denote by Dcs, In the client-server architecture, none of the peers aids in distributing the file. We make the following observations:

●  The server must transmit one copy of the file to each of the N peers. Thus the server must transmit NF bits. Since the server's upload rate is us, the time to distribute the file must be at least NF/us.

●  Let dmin denote the download rate of the peer with the lowest download rate, that is, dmin = min {d1,dp,. . . ,dN}. The peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. Thus the minimum distribution time is at least F/dmin.

Putting these two observations together, we obtain



This provides a lower bound on the minimum distribution time for the client-server architecture. In the homework problems you will be asked to show that the server can schedule its transmissions so that the lower bound is actually achieved. So let's take this lower bound provided above as the actual distribution time, that is,



We see from Equation (b) that for N large enough, the client-server distribution time is given by NF/us. Thus, the distribution time increases linearly with the number of peers N. So, for example, if the number of peers from one week to the next increases a thousand-fold from a thousand to a million, the time required to distribute the file to all peers increases by 1,000.

Lets now go through a similar analysis for the P2P architecture, where each peer can assist the server in distributing the file. In particular, when a peer receives some file data, it can use its own upload capacity to redistribute the data to other peers. Calculating the distribution time for the P2P architecture is somewhat more complicated than for the client-server architecture, since the distribution time depends on how each peer distributes portions of the file to the other peers. Nevertheless, a simple expression for the minimal distribution time can be obtained [Kumar 2006]. To this end, we first make the following observations:

●  At the beginning of the distribution, only the server has the file. To get this file into the community of peers, the server must send each bit of the file at least once into its access link. Thus, the minimum distribution time is at least F/us. (Unlike the client-server scheme, a bit sent once by the server may not have to be sent by he server again, as the peers may redistribute the bit among themselves).

●  As with the client-server architecture, the peer with the lowest download rate cannot obtain all F bits of the file in less than F/dmin seconds. Thus the minimum distribution time is at least F/dmin.
 
●  Finally, observe that the total upload capacity of the system as a whole is equal to the upload rate of the server plus the upload rates of each of the individual peers, that is, utotal = us + u1 + ... + uN. The system must deliver (upload) F bits to each of the N peers, thus delivering a total of NF bits. This cannot be done at a rate faster than utotal . Thus, the minimum distribution time is also at least NFl(us + u1 + .... + uN).

Putting these three observations together, we obtain the minimum distribution time for P2P, denoted by Dp2p.



Equation (c) provides a lower bound for the minimum distribution time for the P2P architecture. It turns out that if we imagine that each peer can redistribute a bit as soon as it receives the bit, then there is a redistribution scheme that actually achieves this lower bound [Kumar 2006]. In reality, where chunks of the file are redistributed rather than individual bits, Equation (c) serves as a good approximation of the actual minimum distribution time. Thus, let's take the lower bound provided by Equation (c) as the actual minimum distribution time, that is,



Figure 2 compares the minimum distribution time for the client-server and P2P architectures assuming that all peers have the same upload rate u. In Figure 2, we have set F/u = 1 hour, us = 10u, and dmin ≥ us. Thus, a peer can transmit the entire file in one hour, the server transmission rate is 10 times the peer upload rate, and (for simplicity) the peer download rates are set large enough so as not to have an effect. We see from Figure 2 that for the client-server architecture, the distribution time increases linearly and without bound as the number of peers increases. However, for the P2P architecture, the minimal distribution time is not only always less than the distribution time of the client-server architecture; it is also less than one hour for any number of peers N. Thus, applications with the P2P

Distribution time for P2P and client-server architectures

architecture can be self-scaling. This scalability is a direct consequence of peers being redistributors as well as consumers of bits.



Tags

peers, service provider, download rate, distribution time, internet core

Copy Right

The contents available on this website are copyrighted by TechPlus unless otherwise indicated. All rights are reserved by TechPlus, and content may not be reproduced, published, or transferred in any form or by any means, except with the prior written permission of TechPlus.