Elastic Synchronization for Efficient and Effective Distributed Deep Learning

Zhao, Xing

dc.contributor.advisor	An, Aijun
dc.contributor.author	Zhao, Xing
dc.date.accessioned	2020-11-13T13:54:09Z
dc.date.available	2020-11-13T13:54:09Z
dc.date.copyright	2020-07
dc.date.issued	2020-11-13
dc.date.updated	2020-11-13T13:54:09Z
dc.degree.discipline	Computer Science
dc.degree.level	Master's
dc.degree.name	MSc - Master of Science
dc.description.abstract	Training deep neural networks (DNNs) on a large-scale cluster with an efficient distributed paradigm significantly reduces training time. However, a distributed paradigm developed only from a system engineering perspective is likely to hinder the model from learning, owing to the intrinsic optimization properties of machine learning. In this thesis, we present two efficient and effective models in the parameter server setting that address the limitations of state-of-the-art distributed models such as stale synchronous parallel (SSP) and bulk synchronous parallel (BSP). We introduce the DynamicSSP model, which adds smart dynamic communication to SSP, improves its communication efficiency, and replaces its fixed staleness threshold with a dynamic threshold. DynamicSSP converges faster and to a higher accuracy than SSP in a heterogeneous environment. Having recognized the importance of bulk synchronization in training, we propose the ElasticBSP model, which shares the properties of bulk synchronization and elastic synchronization. We develop fast online optimization algorithms with look-ahead mechanisms to materialise ElasticBSP. Empirically, ElasticBSP converges 1.77 times faster than BSP and achieves an overall accuracy 12.6% higher.
dc.identifier.uri	http://hdl.handle.net/10315/37937
dc.language	en
dc.rights	Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.
dc.subject	Computer science
dc.subject.keywords	Distributed Deep Learning
dc.subject.keywords	BSP
dc.subject.keywords	ASP
dc.subject.keywords	SSP
dc.subject.keywords	SGD
dc.subject.keywords	Optimization
dc.title	Elastic Synchronization for Efficient and Effective Distributed Deep Learning
dc.type	Electronic Thesis or Dissertation

Files

Original bundle

Name: Zhao_Xing_2020_Masters.pdf
Size: 6.34 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.83 KB
Format: Plain Text

Name: YorkU_ETDlicense.txt
Size: 3.36 KB
Format: Plain Text

Collections

Computer Science and Engineering