• Existing system optimizations targeting nanosecond-scale (e.g., memory access) and millisecond-scale (e.g., disk reads) events are inadequate for events in the microsecond range.
  • Google strongly prefers a synchronous programming model
  • Simple and consistent synchronous APIs and idioms across different languages
  • Shifting the burden of managing asynchronous events away from the programmer to the operating system or the thread library makes the code significantly simpler (a sketch of the contrast follows this list).
  • Microsecond-scale sources: fast NICs, fast flash devices, non-volatile memory, GPU/accelerator offload
  • Nice figures: Table 1 and Figure 1.
  • “A 2015 paper summarizing a multiyear longitudinal study at Google [10] showed that 20%–25% of fleetwide processor cycles are spent on low-level overheads we call the “datacenter tax.” Examples include serialization and deserialization of data, memory allocation and de-allocation, network stack costs, compression, and encryption.” Also see Atul Adya’s HotOS’19 paper.
  • One notable aspect of articles from Google is the weight they give to engineer productivity and code maintainability.
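
As a concrete illustration of the synchronous-model point above, here is a minimal Go sketch (my own, not from the article; the URL and function name are invented). The code reads top to bottom as if it were blocking, and the Go runtime plays the role the bullet assigns to the OS or thread library: when the goroutine blocks on the network, the scheduler parks it and runs other goroutines, so the programmer never writes callbacks or hand-rolled state machines.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
)

// fetch looks fully synchronous: each statement waits for the previous
// one. Under the hood the runtime multiplexes the blocked goroutine off
// the OS thread, so this simplicity costs little in throughput.
func fetch(url string) (string, error) {
	resp, err := http.Get(url) // blocks this goroutine, not the thread
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	body, err := fetch("https://example.com") // hypothetical target
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(body), "bytes")
}
```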

This article points out that many CPU cycles are wasted on (de-)serialization and the network stack. This reminds me of Atul Adya’s HotOS’19 paper, which essentially advocates building stateful applications instead of stateless ones, so that a lot of unnecessary (de-)serialization and network communication can be avoided.
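
To make that contrast concrete, a hypothetical Go sketch (all names invented, from neither paper): the stateless handler pays a remote fetch plus JSON deserialization on every request, while the stateful one keeps already-deserialized objects in process memory and skips both costs on repeated lookups.

```go
package main

import (
	"encoding/json"
	"sync"
)

type Profile struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

// Stateless style: every call pays the "datacenter tax" of a network
// round trip plus deserialization, because no state lives in the process.
func statelessLookup(store func(key string) []byte, key string) (Profile, error) {
	var p Profile
	raw := store(key)              // remote fetch (network stack cost)
	err := json.Unmarshal(raw, &p) // deserialization cost on every call
	return p, err
}

// Stateful style: the process owns deserialized objects, so repeated
// lookups avoid the network hop and the (de-)serialization entirely.
type StatefulCache struct {
	mu       sync.RWMutex
	profiles map[string]Profile
}

func (c *StatefulCache) Lookup(key string) (Profile, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	p, ok := c.profiles[key]
	return p, ok
}
```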