07-05 [Paper Notes] Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training