Article

Article title APPLYING DEEP LEARNING FOR SOLVING THE TASKS OF SELF-DIAGNOSIS OF DISTRIBUTED COMPUTER SYSTEMS
Authors K.E. Kramarenko, O.V. Moldovanova
Section SECTION III. DISTRIBUTED COMPUTING AND SYSTEMS
Month, Year 11, 2016
Index UDC 004.052.32
DOI 10.18522/2311-3103-2016-11-113120
Abstract The article addresses the problem of self-diagnosis of distributed computer systems, which consist of many elementary machines (nodes) interconnected by communication channels. As the number of nodes in a system grows, the probability of faults increases. A fault is an event in which an elementary machine loses its ability to perform its specified information-processing functions. A fault in a single node involved in a computation can produce an incorrect result and have devastating consequences for the entire distributed computer system. Developing self-diagnosis algorithms is therefore an urgent problem; their aim is to identify the faulty and fault-free nodes from a given syndrome of the distributed computer system. This problem can be reduced to a classification problem, which deep learning algorithms solve effectively. The paper presents the statement and constraints of the syndrome decoding problem for a distributed computer system, a description of the developed syndrome decoding algorithm based on a convolutional neural network, and an algorithm for generating training samples. The algorithms were implemented with the DeepLearnToolBox package in the MATLAB interactive environment. Experiments were carried out on test training samples with different numbers of nodes in the distributed computer system and different numbers of faulty nodes. The following convolutional neural network hyperparameters were selected experimentally: training sample length, number of training epochs, convolution kernel stride, number and size of convolution kernels per layer, and number of layers in the network. The efficacy of the algorithm was evaluated by how the number of correctly diagnosed nodes depends on the total number of faulty nodes in the distributed computer system.
Experiments have shown that the algorithm should be used in distributed computer systems in which faulty nodes make up no more than 30% of all nodes. Despite the short length of the training samples, the network retains good generalization ability.
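
The abstract mentions an algorithm for generating training samples but does not give its details. The sketch below is only an illustration of how labelled (syndrome, fault state) pairs could be produced. It assumes the PMC test model of reference 2 (a fault-free tester reports the true state of the tested node, a faulty tester reports an arbitrary result) and a ring-style test assignment in which node i tests the next `tests_per_node` nodes. The function names, the topology, and all parameters are hypothetical and are not taken from the paper.

```python
import random


def generate_sample(n, num_faulty, tests_per_node, rng):
    """Return one (syndrome, labels) pair for a randomly chosen fault set.

    labels[i] is 1 if node i is faulty, else 0.
    syndrome holds one test result per (tester, tested) pair.
    Assumption: PMC model with a ring-style test assignment.
    """
    faulty = set(rng.sample(range(n), num_faulty))
    labels = [1 if i in faulty else 0 for i in range(n)]
    syndrome = []
    for tester in range(n):
        for offset in range(1, tests_per_node + 1):
            tested = (tester + offset) % n
            if tester in faulty:
                # A faulty tester gives an unreliable, arbitrary result.
                result = rng.randint(0, 1)
            else:
                # A fault-free tester reports the true state of the tested node.
                result = labels[tested]
            syndrome.append(result)
    return syndrome, labels


def generate_training_set(n, num_faulty, tests_per_node, size, seed=0):
    """Build a list of labelled samples with a fixed seed for repeatability."""
    rng = random.Random(seed)
    return [generate_sample(n, num_faulty, tests_per_node, rng)
            for _ in range(size)]
```

Each syndrome would serve as the classifier input and each label vector as the target, which is how the decoding task reduces to classification. For example, `generate_training_set(10, 3, 2, 5)` yields five samples, each with a 20-element syndrome and a 10-element label vector that marks three faulty nodes.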

Keywords Self-diagnosis; distributed computer systems; artificial neural networks; deep learning; fault-tolerance; convolutional neural networks.
References 1. Khoroshevskiy V.G. Arkhitektura vychislitel'nykh sistem: ucheb. posobie [Architecture of computer systems: a training manual]. 2nd ed. Moscow: Izd-vo MGTU im. N.E. Baumana, 2008, 520 p.
2. Preparata F.P., Metze G., Chien R.T. On the Connection Assignment Problem of Diagnosable Systems, IEEE Trans. Electron. Comput., 1967, Vol. EC-16, No. 6, pp. 848-854.
3. Barsi F., Grandoni F., Maestrini P. A Theory of Diagnosability of Digital Systems, IEEE Trans. Comput., 1976, Vol. 25, No. 6, pp. 585-593.
4. Chwa K.Y., Hakimi S.L. Schemes for fault tolerant computing: a comparison of modularly redundant and t-diagnosable systems, Information and Control, 1981, No. 49, pp. 212-238.
5. Malek M. A comparison connection assignment for diagnosis of multiprocessor systems, Proc. 7th International symposium on computer architecture, New York, 1980, pp. 31-35.
6. Duarte Jr. E.P., Ziwich R.P., Albini L.C.P. A survey of comparison-based system-level diagnosis, ACM Comput. Surv., 2011, Vol. 43, No. 3, Article 22, 56 p.
7. Elhadef M. A modified Hopfield neural network for diagnosing comparison-based multiprocessor systems using partial syndromes, Parallel and Distributed Systems (ICPADS), International Conference on, 2011, pp. 646-653.
8. Elhadef M., Nayak A. Comparison-Based System-Level Fault Diagnosis: A Neural Network Approach, IEEE Transactions on Parallel & Distributed Systems, 2012, Vol. 23, No. 6, pp. 1047-1059.
9. Elhadef M., Romdhane L.B. Fault diagnosis using partial syndromes: a modified Hopfield neural network approach, International Journal of Parallel, Emergent and Distributed Systems, 2014, Vol. 29, No. 2, pp. 119-146.
10. Osovskiy S. Neyronnye seti dlya obrabotki informatsii [Neural networks for information processing]: translation from Polish by I.D. Rudinskogo. Moscow: Finansy i statistika, 2004, 344 p.
11. Petrov S.P. Svertochnaya neyronnaya set' dlya raspoznavaniya simvolov nomernogo znaka avtomobilya [Convolutional neural network for character recognition of license plate of the car], Sistemnyy analiz v nauke i obrazovanii [System analysis in science and education], 2013, No. 3, pp. 66-73.
12. Fralenko V.P., Suvorov R.E., Ovcharenko R.I., Tikhomirov I.A. Avtomaticheskaya klassifikatsiya izobrazheniy v zadachakh fil'tratsii kontenta [Automatic classification of images in content filtering], Informatsionnye tekhnologii i vychislitel'nye sistemy [Information technologies and computing systems], 2015, No. 3, pp. 3-11.
13. Zhang N., Donahue J., Girshick R., Darrell T. Part-based R-CNNs for fine-grained category detection, In Computer Vision, 2014, pp. 834-849.
14. Zeiler M.D., Fergus R. Visualizing and understanding convolutional networks, In Computer Vision, 2014, pp. 818-833.
15. Jia Y., Shelhamer E., Donahue J., Karayev S., Long J., Girshick R. Caffe: Convolutional architecture for fast feature embedding, Proceedings of the 22nd ACM international conference on Multimedia, 2014, pp. 675-678.
16. Donahue J., Hendricks L.A., Guadarrama S., Rohrbach M., Venugopalan S., Saenko K., Darrell T. Long-term recurrent convolutional networks for visual recognition and description, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 2625-2634.
17. Girshick R., Donahue J., Darrell T., Malik J. Region-based convolutional networks for accurate object detection and segmentation, IEEE transactions on pattern analysis and machine intelligence, 2016, Vol. 38, pp. 142-158.
18. Hoffman J., Guadarrama S., Tzeng E.S., Hu R., Donahue J., Girshick R., Darrell T., Saenko K. LSDA: Large scale detection through adaptation, Advances in Neural Information Processing Systems, 2014, pp. 3536-3544.
19. Mohamed A., Dahl G.E., Hinton G. Acoustic modeling using deep belief networks, Audio, Speech, and Language Processing, IEEE Transactions on, 2012, Vol. 20, No. 1, pp. 14-22.
20. Krizhevsky A., Sutskever I., Hinton G. ImageNet classification with deep convolutional neural networks, Nature, 2015, Vol. 521, pp. 436-444.
