Evaluating the Scalability of Distributed Neural Networks in High-Performance Computing
SNVASRK Prasad, Aythepally Lakshmi Narayana, Prasanthi Potnuru
This study investigates the scalability of distributed neural networks (DNNs) in high-performance
computing (HPC) environments, focusing on the comparative analysis of horizontal and vertical scaling
methods. By distributing neural network training across multiple nodes (horizontal scaling) and upgrading
the hardware of individual nodes (vertical scaling), we assess key metrics such as training time, speedup,
efficiency, and resource utilization. Our experimental
results demonstrate that horizontal scaling significantly reduces training time but introduces challenges
in efficiency due to communication overhead and synchronization costs. Conversely, vertical scaling
offers improved resource utilization and maintains high efficiency, though its scalability is constrained
by hardware limitations. A hybrid approach, combining both scaling strategies, is shown to optimize
performance by balancing resource utilization and computational efficiency. These findings provide
valuable insights into optimizing distributed neural network training, highlighting the trade-offs and
potential of different scaling methods in HPC settings.