C3w3-reading: Data Parallelism link gives PipeDream paper

In the first Reading (for High-Performance Modeling) section of Course 3 Week 3, the hyperlink for data parallelism points to the PipeDream paper, which is Microsoft's system for pipeline parallelism, not data parallelism… right?

…turns out it’s a mixture of both:

“This paper describes PipeDream, a new distributed training system specialized for DNNs. Like model parallelism, it partitions the DNN and assigns subsets of layers to each worker machine. But, unlike traditional model parallelism, PipeDream aggressively pipelines minibatch processing, with different workers processing different inputs at any instant of time. This is accomplished by injecting multiple inputs into the worker with the first DNN layer, thereby keeping the pipeline full and ensuring concurrent processing on all workers. It also uses data parallelism for selected subsets of layers to balance computation load among workers. We refer to this combination of pipelining, model parallelism, and data parallelism as pipeline-parallel training.”
pages 1-2 of [Harlap et al. 2018, "PipeDream: Fast and Efficient Pipeline Parallel DNN Training"](https://arxiv.org/pdf/1806.03377.pdf)
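To make the pipelining idea in the quote concrete, here is a toy sketch (my own illustration, not PipeDream's actual scheduler) of a naive forward-only pipeline schedule: with `num_stages` workers each holding a subset of layers and `num_microbatches` inputs injected into the first stage, stage `s` processes microbatch `t - s` at step `t`, so different workers really are handling different inputs at the same instant.

```python
def pipeline_schedule(num_stages, num_microbatches):
    """Return a list of timesteps; each timestep maps stage -> microbatch.

    Toy model only: assumes every stage takes exactly one step per
    microbatch and ignores backward passes and weight updates.
    """
    steps = []
    for t in range(num_stages + num_microbatches - 1):
        active = {}
        for s in range(num_stages):
            m = t - s  # microbatch reaching stage s at time t
            if 0 <= m < num_microbatches:
                active[s] = m
        steps.append(active)
    return steps


if __name__ == "__main__":
    # 3 pipeline stages, 4 microbatches: once the pipeline fills,
    # all 3 workers are busy on different inputs simultaneously.
    for t, active in enumerate(pipeline_schedule(3, 4)):
        busy = ", ".join(f"stage{s}->mb{m}" for s, m in sorted(active.items()))
        print(f"step {t}: {busy}")
```

At step 2 the pipeline is full (stage 0 on microbatch 2, stage 1 on microbatch 1, stage 2 on microbatch 0), which is the "keeping the pipeline full and ensuring concurrent processing on all workers" behavior the paper describes. PipeDream additionally replicates some stages data-parallel to balance load, which this sketch does not model.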