ScratchPad
All tagged posts are here.
# MLSys (x8)
- Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics
- In-Datacenter Performance Analysis of a Tensor Processing Unit
- Fractional GPUs: Software-based Compute and Memory Bandwidth Reservation for GPUs
- Unifying Data, Model and Hybrid Parallelism in Deep Learning via Tensor Tiling
- Beyond Data and Model Parallelism for Deep Neural Networks
- PipeDream: Generalized Pipeline Parallelism for DNN Training
- PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
- Deep Learning Inference Service at Microsoft
# Inference (x5)
- Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics
- In-Datacenter Performance Analysis of a Tensor Processing Unit
- Fractional GPUs: Software-based Compute and Memory Bandwidth Reservation for GPUs
- PRETZEL: Opening the Black Box of Machine Learning Prediction Serving Systems
- Deep Learning Inference Service at Microsoft
# Cloud (x5)
- Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization
- Azure Accelerated Networking: SmartNICs in the Public Cloud
- ASIC Clouds: Specializing the Datacenter
- Happiness index: Right-sizing the cloud’s tenant-provider interface
- Nines are Not Enough: Meaningful Metrics for Clouds
# Training (x4)
- Benchmarking TPU, GPU, and CPU Platforms for Deep Learning
- Unifying Data, Model and Hybrid Parallelism in Deep Learning via Tensor Tiling
- Beyond Data and Model Parallelism for Deep Neural Networks
- PipeDream: Generalized Pipeline Parallelism for DNN Training
# SDN (x3)
- Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization
- Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network
- Azure Accelerated Networking: SmartNICs in the Public Cloud
# NVM (x3)
- Strata: A Cross Media File System
- Attack of the Killer Microseconds
- SplitFS: Reducing Software Overhead in File Systems for Persistent Memory
# GPU (x3)
- Benchmarking TPU, GPU, and CPU Platforms for Deep Learning
- Attack of the Killer Microseconds
- Fractional GPUs: Software-based Compute and Memory Bandwidth Reservation for GPUs
# Google (x3)
- Slicer: Auto-Sharding for Datacenter Applications
- Evolve or Die: High-Availability Design Principles Drawn from Google's Network Infrastructure
- Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google's Datacenter Network
# Virtualization (x2)
- Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization
- Azure Accelerated Networking: SmartNICs in the Public Cloud
# VideoProcessing (x2)
- Jointly Optimizing Preprocessing and Inference for DNN-based Visual Analytics
- ASIC Clouds: Specializing the Datacenter
# TPU (x2)
- Benchmarking TPU, GPU, and CPU Platforms for Deep Learning
- In-Datacenter Performance Analysis of a Tensor Processing Unit
# Storage (x2)
# SOSP2019 (x2)
- PipeDream: Generalized Pipeline Parallelism for DNN Training
- SplitFS: Reducing Software Overhead in File Systems for Persistent Memory
# SIGCOMM2016 (x2)
- Evolve or Die: High-Availability Design Principles Drawn from Google's Network Infrastructure
- RDMA over Commodity Ethernet at Scale
# OperatingSystem (x2)
# NSDI2018 (x2)
- Andromeda: Performance, Isolation, and Velocity at Scale in Cloud Network Virtualization
- Azure Accelerated Networking: SmartNICs in the Public Cloud
# NIC (x2)
# Microsecond (x2)
# HotOS2019 (x2)
- Project PBerry: FPGA Acceleration for Remote Memory
- Nines are Not Enough: Meaningful Metrics for Clouds
# FPGA (x2)
- Project PBerry: FPGA Acceleration for Remote Memory
- Azure Accelerated Networking: SmartNICs in the Public Cloud
# ASIC (x2)
- In-Datacenter Performance Analysis of a Tensor Processing Unit
- ASIC Clouds: Specializing the Datacenter
# arXiv (x2)
- Unifying Data, Model and Hybrid Parallelism in Deep Learning via Tensor Tiling
- ALEX: An Updatable Adaptive Learned Index
# VLDB2020 (x1)
# SysML2019 (x1)
# SSD (x1)
# SOSP2017 (x1)
# SoCC2017 (x1)
# SLO (x1)
# SIGCOMM2015 (x1)
# Sharding (x1)
# Serverless (x1)
# RemoteMemory (x1)
# RDMA (x1)
# OSDI2018 (x1)
# OSDI2016 (x1)
# OldPaper (x1)
# MultiTenant (x1)
# MLSys2020 (x1)
# MLsys (x1)
# ML4Sys (x1)
# Microsoft (x1)
# Measurement (x1)
# LoadBalance (x1)
# Latency (x1)
# ISCA2017 (x1)
# ISCA2016 (x1)
# HotCloud2019 (x1)
# FileSystem (x1)
# DistributedSystem (x1)
# CUDA (x1)
# Clos (x1)
# CacheCoherence (x1)
# Blockchain (x1)
# Benchmark (x1)
# Availability (x1)
© 2020 ScratchPad ― Powered by Jekyll and Textlog theme