|Article title||PROGRAMMING OF HYBRID COMPUTER SYSTEMS IN THE PROGRAMMING LANGUAGE COLAMO|
|Authors||A.I. Dordopulo, I.I. Levin, I.A. Kalyaev, V.A. Gudkov, A.A. Gulenok|
|Section||SECTION II. MATHEMATICAL AND SOFTWARE SUPPORT OF SUPERCOMPUTERS|
|Month, Year||11, 2016|
|Abstract||The paper covers programming methods for hybrid computer systems that contain reconfigurable and microprocessor computational nodes. The programming technology for hybrid computer systems is based on the high-level programming language COLAMO with extensions that allow the structural, structural-procedural, multi-procedural and procedural forms of organization of calculations to be described in a unified parallel-pipeline form. The suggested parallel-pipeline form allows the form of organization of calculations to be modified. Such modifications are performed automatically by the COLAMO language preprocessor, which takes into account the current configuration of the hybrid computer system. On the basis of the canonical form and the ability to describe various forms of organization of calculations in COLAMO, we suggest a technology of resource-independent programming. With this technology, a program can be automatically adapted to a changed architecture or configuration of the hybrid computer system without any modification of the source code by the developer. To this end, the preprocessor transforms the source parallel program, written in COLAMO, into the canonical form: all arrays and variables of the program must provide both parallel and sequential access to both items and bits, and all fragments of calculations are described as implicitly parallel by means of the Implicit construct. The preprocessor then estimates the available computational resource, determines effective parameters for implementing the program on that resource and, if necessary, reduces the program's performance to adapt it to the current configuration of the hybrid computer system. Performance reduction is a set of methods that reduce the performance of the application in a balanced way. In several cases this reduces the hardware resource occupied by the task; in addition, because the organization of calculations changes, it becomes possible to occupy free nodes of the hybrid computer system. The technology provides two-way scaling: induction when the available computational resource increases, and reduction when it decreases. This makes programming resource-independent at implementation time, i.e. the developer is not bound to the available hardware resource of the computer system.|
|Keywords||Performance reduction; high-level programming language; programming of hybrid computer systems; technology of resource-independent programming.|
|References||1. Ilya Levin, Alexey Dordopulo, Vasiliy Kovalenko, Viacheslav Gudkov, Andrey Gulenok. Programming tools for reconfigurable computer systems based on Virtex-7 FPGAs with using soft-architectures, 13th International Conference on Parallel Computing Technologies (PaCT-2015), Petrozavodsk, Russia, August 31-September 4, 2015, pp. 349-362.
2. Dong X, Chai J, Yang J, Wen M, Wu N, Cai X, Zhang C, Chen Z. Utilizing multiple Xeon Phi coprocessors on one compute node, 14th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2014; Dalian; China; 24 August 2014 through 27 August 2014; Code 107001, 2014, Vol. 8631 LNCS, Issue PART 2, pp. 68-81.
3. Liang T.-Y., Li H.-F., Lin Y.-J., Chen B.-S. A Distributed PTX Virtual Machine on Hybrid CPU/GPU Clusters, Journal of Systems Architecture, 1 January 2016, Vol. 62, pp. 63-77.
4. Li H.-F., Liang T.-Y., Lin Y.-J. An OpenMP programming toolkit for hybrid CPU/GPU clusters based on software unified memory, Journal of Information Science and Engineering, May 2016, Vol. 32, Issue 3, pp. 517-539.
5. Evstigneev N.M., Ryabkov O.I. Primenenie arkhitektury multiGPU+CPU dlya zadach pryamogo chislennogo modelirovaniya laminarno-turbulentnogo perekhoda pri rassmotrenii zadach v kachestve nelineynykh dinamicheskikh sistem [Application of the multiGPU+CPU architecture to direct numerical simulation of the laminar-turbulent transition, treating the problems as nonlinear dynamic systems], Parallel'nye vychislitel'nye tekhnologii (PaVT’2016) [Parallel computational technologies (PCT’2016)]. Chelyabinsk: Izdatel'skiy tsentr YuUrGU, 2016, pp. 141-154.
6. Dordopulo Aleksey, Levin Ilya, Kalyaev Igor, Gudkov Vyacheslav, Gulenok Andrey. Programming of hybrid computer systems based on the performance reduction method, Parallel Computing Technologies (PCT 2016), Proceedings of the 10th Annual International Scientific Conference on Parallel Computing Technologies, Arkhangelsk, Russia, 2016, pp. 131-140.
7. El-Araby E., Taher M., Abouellail M., El-Ghazawi T., Newby G.B. Comparative analysis of high level programming for reconfigurable computers: Methodology and empirical study, 2007 3rd Southern Conference on Programmable Logic, SPL'07; Mar del Plata; Argentina; 26 February 2007 through 28 February 2007; Category number 07EX1511; Code 70259. 2007, Article number 4234328, pp. 99-106.
8. Xu J, Subramanian N, Alessio A, Hauck S. Impulse C vs. VHDL for accelerating tomographic reconstruction, 18th IEEE International Symposium on Field-Programmable Custom Computing Machines, FCCM 2010; Charlotte, NC; United States; 2 May 2010 through 4 May 2010; Category number P4056; Code 80904. 2010, Article number 5474054, pp. 171-174.
9. Gorodnichev M.A., Duchkov A.A., Sarychev V.G. Programmnaya realizatsiya metoda kogerentnogo summirovaniya na GPU s ispol'zovaniem programmnoy modeli NVIDIA CUDA [Software implementation of the method of coherent summation on the GPU with NVIDIA CUDA programming model], Parallel'nye vychislitel'nye tekhnologii (PaVT’2016) [Parallel computational technologies (PCT’2016)]. Available at: https://www.agora.guru.ru/pavt, pp. 118-130.
10. Kalyaev I.A., Dordopulo A.I., Levin I.I., Gudkov V.A., Gulenok A.A. Tekhnologiya programmirovaniya vychislitel'nykh sistem gibridnogo tipa [Programming technology for hybrid computer systems], Vychislitel'nye tekhnologii [Computational technologies], 2016, Vol. 21, No. 3, pp. 33-44. ISSN 1560-7534.
11. Semernikova E.E., Levin I.I., Gudkov V.A. Organizatsiya bitovoy obrabotki dannykh dlya rekonfiguriruemykh vychislitel'nykh sistem na yazyke programmirovaniya vysokogo urovnya [Organization of bit data processing for reconfigurable computer systems in the high-level programming language], Vestnik komp'yuternykh i informatsionnykh tekhnologiy [Herald of computer and information technologies], 2015, No. 5, pp. 3-9.
12. Dordopulo A.I., Levin I.I., Kalyaev I.A., Gudkov V.A., Gulenok A.A. Parallel'no-konveyernaya forma programmy kak osnova programmirovaniya vychislitel'nykh sistem gibridnogo tipa [Parallel-pipeline form of a program as a basis for programming hybrid computer systems], Vestnik UGATU [Bulletin of Ufa State Aviation Technical University], 2016, Vol. 20 (73), No. 3, pp. 122-128. ISBN 978-5-9275-1980-7.
13. Danilov I.G., Dordopulo A.I., Kalyaev Z.V., Levin I.I., Gudkov V.A., Gulenok A.A. and Bovkun A.V. Distributed Monitoring System For Reconfigurable Computer Systems, Procedia Computer Science, 2016, No. 101, pp. 341-350.
14. Antonov A.S., Voevodin Vad V., Daugel'-Dauge A.A., Zhumatiy S.A., Nikitenko D.A., Sobolev S.I., Stefanov K.S., Shvets P.A. Obespechenie operativnogo kontrolya i effektivnoy avtonomnoy raboty Superkomp'yuternogo kompleksa MGU [Providing run-time control and effective offline work of MSU Supercomputer complex], Vestnik Yuzhno-Ural'skogo gosudarstvennogo universiteta. Seriya "Vychislitel'naya matematika i informatika" [Bulletin of South Ural State University. Series: Computational mathematics and informatics], 2015, Vol. 4 (2), pp. 33-43.
15. Movchan A.V., Tsymbler M.L. Parallel'naya realizatsiya poiska samoy pokhozhey podposledovatel'nosti vremennogo ryada dlya sistem s raspredelennoy pamyat'yu [Parallel implementation of the search for the most similar subsequence of a time series for systems with distributed memory], Parallel'nye vychislitel'nye tekhnologii (PaVT’2016) [Parallel computational technologies (PCT’2016)]. Chelyabinsk: Izdatel'skiy tsentr YuUrGU, 2016, pp. 615-628.
16. Bakanov V.M. Upravlenie dinamikoy vychisleniy v protsessorakh potokovoy arkhitektury dlya razlichnykh tipov algoritmov [Controlling the dynamics of computations in stream-architecture processors for various types of algorithms], Programmnaya inzheneriya [Software Engineering], 2015, No. 9, pp. 20-24.
17. Konstantin Barkalov, Victor Gergel, Ilya Lebedev. Use of Xeon Phi Coprocessor for Solving Global Optimization Problems, Parallel Computing Technologies, 2015, pp. 307-318. DOI: 10.1007/978-3-319-21909-7_31.
18. Konstantin Y. Besedin, Pavel S. Kostenetskiy, Stepan O. Prikazchikov. Using Data Compression for Increasing Efficiency of Data Transfer Between Main Memory and Intel Xeon Phi Coprocessor or NVidia GPU in Parallel DBMS, Procedia Computer Science, 2015. Vol. 66, pp. 635-641.
19. Bernard Goossens DALI, David Parello, Katarzyna Porada, Djallal Rahmoune. Toward a Core Design to Distribute an Execution on a Manycore Processor, Proceedings of the 13th International Conference on Parallel Computing Technologies, 2015, Vol. 9251, pp. 390-404. ISBN: 978-3-319-21908-0. DOI: 10.1007/978-3-319-21909-7_38.
20. Pavel Pavlukhin, Igor Menshov. On Implementation High-Scalable CFD Solvers for Hybrid Clusters with Massively-Parallel Architectures, Lecture Notes in Computer Science, 2015, Vol. 9251, pp. 436-444.