Datacenters used to be built with WAN switches:

  • Bandwidth is limited by the performance of the largest switch available.
  • Datacenters don’t need most of the features/protocols that WAN switches provide.
  • Datacenter switches don’t need to be as highly available as WAN switches.

This paper is about Google’s experience building high-performance datacenter networks from commodity hardware components. Principles:

  • Clos topologies: fault tolerance and high scalability.

    Challenges: high fanout and more complex routing across multiple equal-cost paths (see the ECMP sketch after this list).

  • Merchant silicon: use cheap commodity switch chips and upgrade them frequently to benefit from Moore’s Law.
  • Centralized control protocols: the control plane is very complex in Clos networks.

    To control this complexity, the authors observed that each datacenter switch plays a predetermined forwarding role based on the cluster plan (see the controller sketch after this list).
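
For the equal-cost multipath challenge above, here is a minimal sketch of ECMP-style flow hashing: hashing a flow’s 5-tuple picks one of several equal-cost uplinks, so all packets of a flow follow the same path and avoid reordering. The names are hypothetical and real switches hash in hardware; this is just the idea in Python.

```python
import hashlib

def ecmp_next_hop(flow_tuple, next_hops):
    """Pick one of several equal-cost next hops by hashing the flow's
    5-tuple; every packet of the same flow hashes to the same uplink."""
    key = "|".join(str(field) for field in flow_tuple).encode()
    digest = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
    return next_hops[digest % len(next_hops)]

# Hypothetical flow (src_ip, dst_ip, src_port, dst_port, proto) and uplinks.
flow = ("10.0.1.5", "10.0.9.7", 51512, 80, "tcp")
uplinks = ["spine-1", "spine-2", "spine-3", "spine-4"]
print(ecmp_next_hop(flow, uplinks))  # deterministic choice among the uplinks
```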
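
For the “predetermined forwarding role” observation, here is a sketch of how a central controller might derive each switch’s configuration from a static cluster plan, instead of every switch discovering the topology through a distributed protocol. The plan format and all names are my illustrative assumptions, not the paper’s actual design.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SwitchConfig:
    name: str
    role: str            # "tor" | "aggregation" | "spine"
    uplinks: List[str]   # fixed by the switch's position in the Clos plan

# Hypothetical cluster plan: the topology is known when the cluster is built.
CLUSTER_PLAN = {
    "tor-1":   ("tor", ["agg-1", "agg-2"]),
    "tor-2":   ("tor", ["agg-1", "agg-2"]),
    "agg-1":   ("aggregation", ["spine-1", "spine-2"]),
    "agg-2":   ("aggregation", ["spine-1", "spine-2"]),
    "spine-1": ("spine", []),
    "spine-2": ("spine", []),
}

def build_configs(plan):
    """The central controller derives every switch's forwarding config from
    the static plan; switches only need to report link liveness, not run a
    full distributed routing protocol."""
    return [SwitchConfig(name, role, uplinks)
            for name, (role, uplinks) in plan.items()]

for cfg in build_configs(CLUSTER_PLAN):
    print(cfg)
```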

This is an overwhelmingly dense experience paper. The main takeaways I got from it are all very common lessons from systems evolution in recent decades:

  1. The datacenter setting is very different from the WAN setting (e.g. single owner, each switch has a fixed role, some availability can be sacrificed). The authors made several such observations and then exploited them.
  2. Commodity hardware has a strong price advantage. This reminds me of the NOW paper.
  3. In the era of Moore’s Law, it’s easy to grow bandwidth capacity exponentially, simply by buying cheap chips and upgrading them frequently (a back-of-the-envelope calculation follows).
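
A back-of-the-envelope version of takeaway 3, assuming merchant-silicon bandwidth roughly doubles every 18 months (the doubling period is my assumption; the paper reports on the order of 100x capacity growth over its ten-year span):

$$C(t) = C_0 \cdot 2^{t/T}, \qquad T \approx 1.5\ \text{years} \;\Rightarrow\; C(10\ \text{yr}) = C_0 \cdot 2^{10/1.5} \approx 100\,C_0$$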

In the Q&A session of their SIGCOMM’15 talk, someone remarked that “at several points, this paper is just relearning things we have known for thirty years.” The speaker answered that one big difference is that they transformed a hardware problem into a software and scheduling problem. I’m not familiar with networking, so I can’t tell what is known and what is new here. But even if they relearned something, it validates the problem/idea in an industry environment. I appreciate industry people sharing their valuable evolution path.