Typical content distribution solutions are based on placing dedicated equipment inside or at the edge of the Internet. The best example of such solutions is Akamai [1], which runs several tens of thousands of servers all over the world. In recent years, a new paradigm for content distribution has emerged, based on a fully distributed architecture in which commodity PCs form a cooperative network and share their resources (storage, CPU, bandwidth).

Cooperative content distribution solutions are inherently self-scalable, in that the bandwidth capacity of the system increases as more nodes arrive: each new node requests service from, and at the same time provides service to, other nodes. Because each new node contributes resources, the capacity of the system grows as the demand increases, resulting, in principle, in limitless system scalability. With cooperation, the source of the file, i.e. the server, does not need to increase its resources to accommodate the larger user population; this also provides resilience to “flash crowds”, huge and sudden surges of traffic that usually lead to the collapse of the affected server. Therefore, end-system cooperative solutions can be used to efficiently and quickly deliver software updates, critical patches, videos, and other large files to a very large number of users while keeping the cost at the original server low.

This work was done while the first author was with Microsoft Research.

The best example of an end-system cooperative architecture is the BitTorrent system, which became extremely popular as a way of delivering Linux distributions and other popular content. BitTorrent splits the file into small blocks. As soon as a node downloads a block, whether from the origin server or from another peer, it acts as a server for that particular block and thus contributes resources for serving it. Further design choices, such as intelligent selection of the block to download and parallel downloading of blocks, improve the performance of the system. For a detailed description of the BitTorrent system see [2].
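The block-based cooperation described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not BitTorrent's actual protocol: the names `Peer`, `simulate`, and `reassemble` are hypothetical, the block size is deliberately tiny, block selection is modelled as "rarest among neighbours first", and parallel downloads and incentive mechanisms are not modelled.

```python
from collections import Counter
from typing import Optional

BLOCK_SIZE = 4  # bytes; hypothetical, tiny on purpose for illustration


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split the file into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class Peer:
    """A node that serves every block it already holds."""

    def __init__(self, name: str):
        self.name = name
        self.blocks: dict[int, bytes] = {}  # block index -> payload

    def pick_block(self, neighbours: list["Peer"]) -> Optional[tuple[int, "Peer"]]:
        """Pick a missing block that the fewest neighbours hold (assumed policy)."""
        counts = Counter(
            i for n in neighbours for i in n.blocks if i not in self.blocks
        )
        if not counts:
            return None
        # Ties are broken by the lowest block index to keep the run deterministic.
        rarest = min(counts, key=lambda i: (counts[i], i))
        holder = next(n for n in neighbours if rarest in n.blocks)
        return rarest, holder

    def download_from(self, neighbours: list["Peer"]) -> bool:
        """Fetch one block; once stored, this peer can serve it to others."""
        choice = self.pick_block(neighbours)
        if choice is None:
            return False
        index, holder = choice
        self.blocks[index] = holder.blocks[index]
        return True


def simulate(data: bytes, n_peers: int = 3) -> list[Peer]:
    """Run rounds until every peer holds the whole file."""
    server = Peer("server")
    server.blocks = dict(enumerate(split_into_blocks(data)))
    peers = [Peer(f"peer{i}") for i in range(n_peers)]
    everyone = [server] + peers
    while any(len(p.blocks) < len(server.blocks) for p in peers):
        for p in peers:
            p.download_from([n for n in everyone if n is not p])
    return peers


def reassemble(peer: Peer) -> bytes:
    """Concatenate a peer's blocks in index order."""
    return b"".join(peer.blocks[i] for i in range(len(peer.blocks)))
```

In this sketch the server is just another `Peer` that starts with all the blocks, so a peer may fetch a block from the server or from any other peer that already holds it.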

Despite their enormous potential and popularity, existing end-system cooperative schemes such as BitTorrent may suffer from a number of inefficiencies that decrease their overall performance. Such inefficiencies are more pronounced in large and heterogeneous populations, during flash crowds, in environments with high churn, or when cooperative incentive mechanisms are in place. In this paper we propose a new end-system cooperative solution that uses network coding, i.e. data encoding at the interior nodes of the network, to overcome most of these problems.
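To make "data encoding at the interior nodes" concrete, the following is a minimal sketch of random linear network coding. It makes several assumptions that this section does not state: coding is done over GF(2), so a linear combination is a plain XOR; coefficient vectors are stored as integer bitmasks; and the names `combine`, `Decoder`, and `demo` are hypothetical. It is meant to illustrate the idea, not the scheme proposed in the paper.

```python
import random


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (addition over GF(2))."""
    return bytes(x ^ y for x, y in zip(a, b))


def combine(packets: list[tuple[int, bytes]], rng: random.Random) -> tuple[int, bytes]:
    """XOR a random non-empty subset of packets.

    Each packet is (coefficient bitmask, payload). Any node that holds
    packets, source or interior, can produce a new coded packet this way.
    """
    mask = 0
    while mask == 0:
        mask = rng.getrandbits(len(packets))
    coeffs, payload = 0, bytes(len(packets[0][1]))
    for i, (c, p) in enumerate(packets):
        if (mask >> i) & 1:
            coeffs ^= c
            payload = xor_bytes(payload, p)
    return coeffs, payload


class Decoder:
    """Incremental Gaussian elimination over GF(2)."""

    def __init__(self, n_blocks: int):
        self.n = n_blocks
        self.rows: dict[int, tuple[int, bytes]] = {}  # pivot column -> row

    def receive(self, coeffs: int, payload: bytes) -> bool:
        """Add a coded packet; return True if it was linearly independent."""
        # Clear every existing pivot column from the incoming row.
        for pivot, (c, p) in self.rows.items():
            if (coeffs >> pivot) & 1:
                coeffs ^= c
                payload = xor_bytes(payload, p)
        if coeffs == 0:
            return False  # carries no new information
        pivot = coeffs.bit_length() - 1
        # Clear the new pivot column from the rows already stored.
        for k, (c, p) in list(self.rows.items()):
            if (c >> pivot) & 1:
                self.rows[k] = (c ^ coeffs, xor_bytes(p, payload))
        self.rows[pivot] = (coeffs, payload)
        return True

    def complete(self) -> bool:
        return len(self.rows) == self.n

    def blocks(self) -> list[bytes]:
        """At full rank, the row with pivot i is exactly original block i."""
        return [self.rows[i][1] for i in range(self.n)]


def demo(seed: int = 0) -> list[bytes]:
    """Source -> relay -> receiver; the relay re-encodes what it has received."""
    rng = random.Random(seed)
    original = [b"netw", b"ork ", b"codi", b"ng!!"]  # equal-length blocks
    source = [(1 << i, b) for i, b in enumerate(original)]
    relay: list[tuple[int, bytes]] = []
    decoder = Decoder(len(original))
    while not decoder.complete():
        relay.append(combine(source, rng))       # source sends a coded packet
        decoder.receive(*combine(relay, rng))    # relay sends a fresh combination
    return decoder.blocks()
```

The relay in `demo` never decodes anything: it only mixes the coded packets it holds, and the receiver recovers the original blocks once it has collected as many linearly independent packets as there are blocks.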
