Authors M.G. Kurnosov
Month, Year November 2016
Index UDC 004.272
DOI 10.18522/2311-3103-2016-11-7587
Abstract The performance of collective communications (one-to-all broadcast/scatter, all-to-one gather/reduce, and all-to-all gather) is critical for large-scale high-performance computer systems and parallel algorithms. In this paper, analytical expressions in the LogP model for the execution time of collective operations are proposed. In contrast to well-known results, the proposed expressions are constructed for both general and special cases of the parameters of the computer system and the collective operation (the number of processes, the choice of the root process). Such estimates are needed for analyzing the scalability of algorithms on large-scale computer systems. The expressions are obtained as functions of the process rank, which makes it possible to analyze the load imbalance across processes. The LogP model is extended with a parameter λ, the time to transfer a one-byte message over shared memory. An approach to building optimal algorithms in the LogP model is demonstrated using the k-chain algorithm as an example: the optimal value of k in the LogP model is found, and a new algorithm based on this optimal value is developed. The execution time of the proposed algorithm grows as O(√P) in the number of processes P, which is more efficient than the linear running time of the original k-chain algorithm. The proposed algorithms are implemented on top of the MPI standard and studied on computer clusters with InfiniBand QDR networks. The choice of a model of parallel computation depends on the specifics of the algorithm and the target computer system; for example, if the algorithm aggregates groups of messages into larger packets, it is advisable to use the LogGP model.
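The abstract's idea of choosing k to balance the root's injection cost against the chain-forwarding cost can be illustrated with a minimal sketch. The cost function below is a hypothetical LogP-style model (L = latency, o = per-message overhead, g = gap), not the paper's exact expressions: the root sends the message to k chain heads, and each chain of length roughly (P−1)/k then forwards it hop by hop.

```python
import math

def t_kchain(P, k, L, o, g):
    """Illustrative LogP-style completion-time estimate of a k-chain broadcast.

    Hypothetical cost model, not the paper's exact expressions:
    the root injects k messages spaced by max(g, o), then the longest
    chain forwards the message hop by hop at L + 2*o per hop.
    """
    chain_len = math.ceil((P - 1) / k)   # hops along the longest chain
    t_root = (k - 1) * max(g, o) + o     # root injects k messages
    t_chain = chain_len * (L + 2 * o)    # store-and-forward along a chain
    return t_root + t_chain

def best_k(P, L, o, g):
    """Pick the k minimizing the model above by direct search (illustrative)."""
    return min(range(1, P), key=lambda k: t_kchain(P, k, L, o, g))
```

Minimizing a cost of the form a·k + b·P/k yields k* ≈ √(bP/a), so the minimal time grows on the order of √P rather than linearly in P, which matches the growth rate the abstract reports for the optimized algorithm.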

Keywords Collective operations; reduce; LogP; MPI; parallel programming; message passing.
