Authors I.I. Levin, A.V. Pelipets, D.A. Sorokin
Month, Year 07, 2015
Index UDC 004.382.2
Abstract This paper assesses the suitability of reconfigurable computer systems for LU-decomposition of a square matrix. Factorization of a matrix into lower and upper triangular factors underlies many algorithms of numerical linear algebra. Decomposing a square matrix into a lower triangular matrix L and an upper triangular matrix U has cubic complexity with respect to the size of the linear system, so supercomputers have long been used for such problems on large data sets. Although the performance of cluster supercomputers grows steadily, their efficiency on linear algebra tasks is limited, primarily by the computational costs of interprocessor communication and storage of partial results. Unlike cluster supercomputers, reconfigurable computer systems based on FPGA technology can perform LU-factorization of a large matrix (n = 10^4) as a real-time streaming process, without using external memory. This approach is feasible when enough hardware resources are available for a pipelined implementation of the information graph. Preliminary studies show that modern reconfigurable computer systems make such an implementation possible, but the specific performance of the supercomputer module is low. If FPGA clock frequencies and logic resources continue to grow, LU-factorization could be implemented on a single reconfigurable computational module.
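
The cubic cost mentioned in the abstract can be illustrated with a minimal software sketch. The code below is not the authors' pipelined FPGA implementation. It is a plain Doolittle elimination without pivoting, which assumes every pivot is nonzero, and the function name `lu_decompose` and the operation counter are introduced here purely for illustration.

```python
def lu_decompose(a):
    """Doolittle LU factorization without pivoting (illustrative sketch).

    Assumes all pivots u[k][k] are nonzero. Returns (L, U, ops), where
    ops counts the divisions, multiplications and subtractions performed.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy that becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    ops = 0
    for k in range(n):                 # pivot column
        for i in range(k + 1, n):      # rows below the pivot
            factor = u[i][k] / u[k][k]
            ops += 1                   # one division
            l[i][k] = factor
            u[i][k] = 0.0              # eliminated entry
            for j in range(k + 1, n):  # update the rest of row i
                u[i][j] -= factor * u[k][j]
                ops += 2               # one multiplication, one subtraction
    return l, u, ops


if __name__ == "__main__":
    # Diagonally dominant test matrices keep every pivot nonzero.
    for n in (20, 40, 80):
        a = [[float(n + 1) if i == j else 1.0 for j in range(n)] for i in range(n)]
        _, _, ops = lu_decompose(a)
        print(n, ops)
```

The three nested loops give n(n-1)/2 + n(n-1)(2n-1)/3 operations, that is about 2n^3/3 for large n, so doubling the matrix size multiplies the work by roughly eight.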

Keywords Reconfigurable computer systems; FPGA; LU-factorization; Linpack Benchmark; specific performance.
References 1. Charles L. Byrne. Applied and Computational Linear Algebra: A First Course. University of Massachusetts Lowell, 2013, pp. XXIII-XXIV.
2. John R. Bacon, Thomas P. Kendall, Thomas Mussmann, Robert Palais, Victor E. Trujillo, II, Frank Wattenberg. Climate Science: Why Mathematicians Should Be Interested, Electronic Proceedings of the Twenty-fifth Annual International Conference on Technology in Collegiate Mathematics, Boston, Massachusetts. March 21-24, 2013, pp. 351-387.
3. Piotr Luszczek, Jakub Kurzak, Jack Dongarra. Looking back at dense linear algebra software, Journal of Parallel and Distributed Computing, July 2014, Vol. 74, Issue 7, pp. 2548-2560.
4. Jack Dongarra, Piotr Luszczek. LINPACK Benchmark. Encyclopedia of Parallel Computing. Springer US, 2011, pp. 1033-1036.
5. Voevodin V.V., Kuznetsov Yu.A. Matritsy i vychisleniya [Matrices and computations]. Moscow: Nauka. Glavnaya redaktsiya fiziko-matematicheskoy literatury, 1984, 320 p.
6. Kurzak J., Luszczek P., Faverge M., Dongarra J. LU Factorization with Partial Pivoting for a Multicore System with Accelerators, IEEE Transactions on Parallel & Distributed Systems, Aug. 2013, Vol. 24, No. 8, pp. 1613-1621.
7. Yamazaki I., Li X. New scheduling strategies and hybrid programming for a parallel right-looking sparse LU factorization algorithm on multicore cluster systems. IPDPS, 2012, pp. 619-630.
8. Badawy M.O., Hanafy Y.Y., Eltarras R. LU factorization using multithreaded system. Computer Theory and Applications (ICCTA), 2012, pp. 9-14.
9. Agullo E., Augonnet C., Dongarra J., Faverge M., Langou J., Ltaief H., Tomov S. LU factorization for accelerator-based systems, 9th ACS/IEEE International Conference on Computer Systems and Applications (AICCSA 11). Sharm El-Sheikh, Egypt, 2011, pp. 217-224.
10. Kalyaev I.A., Levin I.I., Semernikov E.A., Shmoylov V.I. Rekonfiguriruemye mul'tikonveyernye vychislitel'nye struktury [Reconfigurable multipipeline computing structures]. Rostov-on-Don: YuNTs RAN, 2008, 397 p.
11. Wei Wu, Yi Shan, Xiaoming Chen, Yu Wang, Huazhong Yang. FPGA Accelerated Parallel Sparse Matrix Factorization for Circuit Simulations. 7th International Symposium, ARC 2011, Belfast, UK, March 23-25, 2011, pp. 302-315.
12. Prawat Nagvajara, Chika Nwankpa, Jeremy Johnson. Reconfigurable Hardware Accelerators for Power Transmission System Computation. High Performance Computing in Power and Energy Systems. Springer Berlin Heidelberg, 2013, pp. 211-228.
13. Available at: (accessed 25 June 2015).
14. Starchenko A.V., Bertsun V.N. Metody parallel'nykh vychisleniy: Uchebnik [Methods of parallel computing: a textbook]. Tomsk: Izd-vo Tom. un-ta, 2013, p. 11.
15. Kalyaev I.A., Levin I.I., Semernikov E.A. Printsipy postroeniya mnogoprotsessornykh vychislitel'nykh sistem na osnove PLIS [Principles of building multiprocessor computing systems based on FPGAs], Vestnik Buryatskogo gosudarstvennogo universiteta. Ser. 9: matematika i informatika [Bulletin of the Buryat State University. Series 9: Mathematics and Informatics]. Ulan-Ude: Izd-vo Buryatsk. gos. un-ta, 2008, pp. 184-196.
16. Ortega Dzh. Vvedenie v parallel'nye i vektornye metody resheniya lineynykh sistem [Introduction to parallel and vector methods for solving linear systems]. Moscow: Mir, 1991, 376 p.
17. Available at: (accessed 25 June 2015).
18. Voevodin V.V. Vychislitel'nye osnovy lineynoy algebry [Computational principles of linear algebra]. Moscow: Nauka, 1977, 304 p.
19. Sorokin D.A. Metody resheniya zadach s peremennoy intensivnost'yu potokov dannykh na rekonfiguriruemykh vychislitel'nykh sistemakh. Dis. kand. tekhn. nauk [Methods for solving problems with variable data-stream intensity on reconfigurable computing systems. Cand. of eng. sc. diss.]. Taganrog, 2012, pp. 51-58.
20. Tarasov I. Evolyutsiya PLIS serii Virtex [Evolution of the Virtex FPGA series], Komponenty i tekhnologii [Components and Technologies], 2005, No. 1.
21. Tarasov I. Analiz predvaritel'nykh kharakteristik FPGA «serii 7» firmy Xilinx [Analysis of the preliminary characteristics of the Xilinx "7 series" FPGAs], Komponenty i tekhnologii [Components and Technologies], 2010, No. 8.
22. Tarasov I. Opisanie arkhitektury FPGA semeystv UltraScale kompanii Xilinx [Description of the architecture of the Xilinx UltraScale FPGA families], Komponenty i tekhnologii [Components and Technologies], 2014, No. 2.
