Motivation:
Datacenter networks need to support VM migration without dropping
open connections. Layer 2 networks support this because a host keeps
its address when it moves, but they do not scale to many thousands of
nodes. Layer 3 networks scale better but carry a configuration
overhead. This paper proposes a new scalable layer 2 network fabric
for datacenters.
Main points:
- The paper assumes a datacenter topology that is a multi-rooted tree.
This assumption holds for fat-trees and for some of the other
topologies developed in recent years.
- The main idea in the paper is the introduction of a Pseudo MAC (PMAC)
address, which allows end hosts to be named hierarchically at layer 2.
The edge switches translate between PMAC addresses and actual MAC
addresses in both directions.
- The networking fabric is co-ordinated by a centralized fabric manager.
The fabric manager helps avoid broadcasts of ARP requests and helps in
performing fault-tolerant routing.
- The paper also describes a Location Discovery Protocol (LDP), which
helps switches bootstrap automatically and discover their role in the
multi-rooted tree.
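
To make the PMAC idea concrete, here is a minimal sketch in Python. It
assumes the 48-bit layout pod(16).position(8).port(8).vmid(16) described
in the paper; the class and method names (PMAC, EdgeSwitch, FabricManager,
learn, to_actual, resolve_arp) are hypothetical and chosen only for
illustration. It is not the authors' implementation, which runs in the
switches rather than in a host-side script.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class PMAC:
    """Hierarchical 48-bit pseudo MAC: pod.position.port.vmid (assumed layout)."""
    pod: int       # 16 bits: pod that the edge switch belongs to
    position: int  # 8 bits: position of the edge switch within the pod
    port: int      # 8 bits: switch port the host is attached to
    vmid: int      # 16 bits: distinguishes VMs behind the same port

    def __post_init__(self) -> None:
        limits = {"pod": 0xFFFF, "position": 0xFF, "port": 0xFF, "vmid": 0xFFFF}
        for name, limit in limits.items():
            if not 0 <= getattr(self, name) <= limit:
                raise ValueError(f"{name} out of range")

    def to_int(self) -> int:
        return (self.pod << 32) | (self.position << 24) | (self.port << 16) | self.vmid

    @classmethod
    def from_int(cls, value: int) -> "PMAC":
        return cls((value >> 32) & 0xFFFF, (value >> 24) & 0xFF,
                   (value >> 16) & 0xFF, value & 0xFFFF)

    def __str__(self) -> str:
        return ":".join(f"{b:02x}" for b in self.to_int().to_bytes(6, "big"))


class EdgeSwitch:
    """Keeps the PMAC <-> actual MAC mapping for directly attached hosts."""

    def __init__(self, pod: int, position: int) -> None:
        self.pod, self.position = pod, position
        self.amac_to_pmac: Dict[str, PMAC] = {}
        self.pmac_to_amac: Dict[PMAC, str] = {}
        self._next_vmid: Dict[int, int] = {}  # per-port vmid counter

    def learn(self, amac: str, port: int) -> PMAC:
        """Assign a PMAC the first time an actual MAC is seen on a port."""
        if amac in self.amac_to_pmac:
            return self.amac_to_pmac[amac]
        vmid = self._next_vmid.get(port, 1)
        self._next_vmid[port] = vmid + 1
        pmac = PMAC(self.pod, self.position, port, vmid)
        self.amac_to_pmac[amac] = pmac
        self.pmac_to_amac[pmac] = amac
        return pmac

    def to_actual(self, pmac: PMAC) -> str:
        """Rewrite the destination PMAC back to the host's actual MAC."""
        return self.pmac_to_amac[pmac]


class FabricManager:
    """Central IP -> PMAC registry used to answer ARP without broadcast."""

    def __init__(self) -> None:
        self.ip_to_pmac: Dict[str, PMAC] = {}

    def register(self, ip: str, pmac: PMAC) -> None:
        self.ip_to_pmac[ip] = pmac

    def resolve_arp(self, ip: str) -> Optional[PMAC]:
        # None means the mapping is unknown; a real fabric would then
        # have to fall back to some form of broadcast.
        return self.ip_to_pmac.get(ip)
```

Because the location is encoded in the address itself, switches above
the edge can forward on a prefix of the PMAC (for example, the pod bits)
instead of keeping one entry per host, which is where the layer 3 style
scalability comes from.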
Trade-offs:
- This paper tries to get the benefits of Layer 3 (hierarchical
namespace, better routing etc.) and the benefits of Layer 2 (migrate
VMs without losing connections).
- The major insight in this paper is that since the datacenter network
topology is hierarchical and known in advance, a new level of
indirection (PMAC) can be used to get the benefits of both Layer 2 and
Layer 3.
- The centralized manager simplifies the design but is a scalability
bottleneck. The authors propose that a small cluster could be used, but
it's not clear whether this would affect the other properties of the
system.