Motivation:
This paper addresses how to design a cluster for Internet services and how to provide software services that can be shared by all the services running on that cluster.
Main Points:
- The three main qualities that a network service should have are scalability, availability and cost-effectiveness.
- The paper describes the use of commodity building blocks to design a cluster with the insight that commodity machines give the best price-performance ratio.
- Building on commodity machines leads to a set of challenges in the software layer, including load balancing, scheduling and maintaining shared state across a large number of machines.
- The cluster architecture consists of a pool of worker nodes, shared cache servers and frontends. A centralized manager handles load balancing, and there is a common user-profile database.
- The paper also proposes BASE semantics (Basically Available, Soft state, Eventual consistency) for data stored in the cache and workers. Relaxing consistency in this way increases the availability of data and makes the system easier to scale.
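To make the last two points concrete, here is a minimal Python sketch of a frontend that asks a centralized manager for the least-loaded worker and falls back to stale cached data when a worker fails, which is the kind of trade BASE semantics allow. All names here (`Manager`, `SoftStateCache`, `handle_request`, the queue-length load metric, the TTL) are hypothetical and chosen for illustration; the paper's actual components and protocols are more involved.

```python
class Manager:
    """Hypothetical centralized manager that tracks per-worker load reports."""

    def __init__(self):
        self.load = {}  # worker name -> last reported queue length

    def report(self, worker, queue_len):
        self.load[worker] = queue_len

    def pick_worker(self):
        # Choose the worker with the smallest reported queue length.
        # Raises ValueError if no worker has reported yet.
        return min(self.load, key=self.load.get)


class SoftStateCache:
    """Cache holding soft state: entries expire, but stale ones can still be read."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # key -> (value, time stored)

    def put(self, key, value, now):
        self.entries[key] = (value, now)

    def get(self, key, now, allow_stale=False):
        if key not in self.entries:
            return None
        value, stored_at = self.entries[key]
        if allow_stale or now - stored_at <= self.ttl:
            return value
        return None


def handle_request(key, cache, manager, workers, now):
    """Frontend logic: fresh cache hit, else a worker, else stale data."""
    fresh = cache.get(key, now)
    if fresh is not None:
        return fresh
    try:
        worker = manager.pick_worker()
        result = workers[worker](key)
    except Exception:
        # Assumption for this sketch: a stale answer is preferable to an error.
        return cache.get(key, now, allow_stale=True)
    cache.put(key, result, now)
    return result
```

The design choice illustrated is that the frontend never blocks on strict consistency: if no worker can produce a fresh result, it returns whatever the cache last held (or nothing), keeping the service available at the cost of possibly outdated data.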
Trade-offs/Influence:
- The architecture described here has been very influential and is close to the Google serving infrastructure (as of 2003). The Mesos design also has similar goals of provisioning a cluster for multiple applications.
- The idea of using commodity building blocks and providing transparent fault tolerance is the basis for the design of MapReduce-like systems.
- BASE semantics have also been very influential and led in turn to the CAP theorem, which has been useful in building storage systems such as Bigtable and PNUTS.