DOI: 10.17587/prin.17.201-209
Technical Assurance of Server and Network Device Reliability based on Monitoring and Automatic Service Recovery
S. M. Kutsenko, Cand. Sc. (Ped.), Associate Professor, s.koutsenko@mail.ru,
E. A. Saltanaeva, Cand. Sc. (Eng.), Associate Professor, elena_maister@mail.ru,
Department of Information Technologies and Intelligent Systems, FSBEI of HE "Kazan State Power Engineering University", Kazan, 420066, Russian Federation
Corresponding author: Elena A. Saltanaeva, Associate Professor, Department of Information Technologies and Intelligent Systems, FSBEI of HE "Kazan State Power Engineering University", Kazan, 420066, Russian Federation, E-mail: elena_maister@mail.ru
Received on November 10, 2025
Accepted on December 10, 2025
In today's IT environment, it is critical not only to detect failures and incidents promptly but also to eliminate them quickly. Developing integrated technical means that ensure reliable processing and transfer of information and automatic recovery of services is a pressing task for enterprise IT departments. Solving this task with a system that monitors servers and network devices and integrates automatic service recovery ensures the reliable functioning of information systems. The proposed system is a framework of technical means for monitoring and control, including a convenient interface for visualizing data, configuring response rules, and analyzing the event history. The solution is aimed at ensuring the fault tolerance and noise immunity of the IT infrastructure, reducing the downtime of critical services, and minimizing the impact of failures on the enterprise's business processes, which ultimately ensures reliable processing of information flows within those processes. The web interface uses modern means of presenting information: the frontend is built with HTML5, CSS3, and plain JavaScript for maximum performance, and the server side is implemented in Python using Flask, which provides fast request processing and integration with monitoring systems. The main advantages of the developed solution are a high degree of automation of recovery processes, which reduces dependence on manual intervention, and the flexibility to adapt it to specific customer requirements. The system has a modular structure that makes efficient use of information resources and allows functionality to be extended easily through integration with external IT service management platforms. Automatic service recovery is no longer an optional add-on to monitoring; it has become a necessary component of modern highly available IT infrastructures.
Implementing such mechanisms requires significant design and testing effort, but it yields a substantial increase in service stability and a reduction in the operating costs of service support.
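As an illustration of how response rules and automatic recovery might fit together, the following minimal Python sketch maps an alert name to a recovery action, limits the number of automatic attempts, and records each outcome in an event history. It is not the authors' implementation: the names `RecoveryRule`, `RecoveryEngine`, `handle_alert`, the alert name `web_service_down`, and the host `srv-01` are all hypothetical, and the Flask endpoint and Zabbix integration described above are deliberately left out so that the example runs with the standard library alone.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class RecoveryRule:
    """Hypothetical response rule: which action to run for which alert."""
    trigger: str                     # alert name reported by the monitoring system
    action: Callable[[str], bool]    # takes a host name, returns True on success
    max_attempts: int = 3            # after this many failures, hand over to a human


@dataclass
class RecoveryEngine:
    """Matches incoming alerts to rules and keeps an event history."""
    rules: Dict[str, RecoveryRule] = field(default_factory=dict)
    attempts: Dict[Tuple[str, str], int] = field(default_factory=dict)
    history: List[dict] = field(default_factory=list)

    def add_rule(self, rule: RecoveryRule) -> None:
        self.rules[rule.trigger] = rule

    def handle_alert(self, host: str, trigger: str) -> str:
        rule = self.rules.get(trigger)
        key = (host, trigger)
        if rule is None:
            outcome = "no_rule"          # nothing configured for this alert
        elif self.attempts.get(key, 0) >= rule.max_attempts:
            outcome = "escalated"        # stop retrying, manual intervention needed
        else:
            self.attempts[key] = self.attempts.get(key, 0) + 1
            if rule.action(host):
                self.attempts[key] = 0   # reset the counter after a success
                outcome = "recovered"
            else:
                outcome = "failed"
        # Every decision is logged so that the event history can be analyzed later.
        self.history.append(
            {"time": time.time(), "host": host, "trigger": trigger, "outcome": outcome}
        )
        return outcome


if __name__ == "__main__":
    engine = RecoveryEngine()
    # Assumption: a real action would restart the service on the host;
    # here a stub that always succeeds stands in for it.
    engine.add_rule(RecoveryRule("web_service_down", action=lambda host: True))
    print(engine.handle_alert("srv-01", "web_service_down"))
```

In a deployment along the lines described in the abstract, a function such as `handle_alert` would presumably be called from a web endpoint that receives notifications from the monitoring system. The attempt limit is one simple way to keep a repeatedly failing restart from looping indefinitely.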
Keywords: server monitoring, automatic service recovery, fault tolerance, Zabbix, monitoring system, technical support
pp. 201–209
For citation:
Kutsenko S. M., Saltanaeva E. A. Technical Assurance of Server and Network Device Reliability based on Monitoring and Automatic Service Recovery, Programmnaya Ingeneria, 2026, vol. 17, no. 4, pp. 201–209. DOI: 10.17587/prin.17.201-209.
References:
- Gafurov I. A., Sitnikov S. Yu. Recognition of data anomalies for predicting equipment failures, Mezhdunarodnyj zhurnal informacionnyh tekhnologij i energoeffektivnosti, 2023, vol. 8, no. 2 (28), pp. 9–12 (in Russian).
- Mustahitdinova Yu. A., Zaripova R. S. Features of the administration of information and computing systems, Informacionnye tekhnologii v stroitel'nyh, social'nyh i ekonomicheskih sistemah, 2021, no. 1 (23), pp. 143–145 (in Russian).
- Mel'chakov A. S., Lisin M. A., Vlasov S. A. Monitoring of technological equipment by means of Zabbix, Himiya. Ekologiya. Urbanistika, 2023, vol. 3, pp. 264–267 (in Russian).
- Irawati I. Network Monitoring System, Jurnal JE-Unisla, 2020, vol. 5, no. 2, p. 359. DOI: 10.30736/je.v5i2.456.
- Kucenko S. M. Development of an automated software testing system, Nauchno-tekhnicheskij vestnik Povolzh'ya, 2023, no. 11, pp. 224–227 (in Russian).
- Guznova E. S., Kazantsev M. A. Zabbix monitoring system for event analysis and forecasting, Advances in modern radio electronics, 2023, vol. 77, no. 12, pp. 131–135. DOI: 10.18127/j20700784-202312-17 (in Russian).
- Sobel M. G. Linux: administration and system programming: [complete guide to using commands, shells, utilities and editors in Linux and Mac OS], translated from English by N. Vilchinsky, St. Petersburg, 2011, 310 p. (in Russian).
- Belousova I. D., Buzueva M. V. Using an incident monitoring system in an IT company, Enterprise Engineering and Knowledge Management (EE&KM-2022): Collection of scientific papers of the XXV Russian scientific conference, 2022, vol. 1, pp. 22–27 (in Russian).
- Panov M. A., Ishchenko E. A. Modern systems for monitoring and alerting about events: ensuring optimal use of resources and functioning of information systems and processes, Dynamics of complex systems – XXI century, 2024, vol. 18, no. 1, pp. 18–31. DOI: 10.18127/J19997493-202401-02 (in Russian).
- Mohd Fuzi M. F., Mohammad Ashraf N. F., Jamaluddin M. N. F. Integrated Network Monitoring using Zabbix with Push Notification via Telegram, Journal of Computing Research and Innovation, 2022, vol. 7, no. 1, pp. 147–155. DOI: 10.24191/jcrinn.v7i1.282.
- Frolov A., Vereshchagina E. ZABBIX: setting up nodes and notification via Telegram, Sistemnyj administrator, 2022, no. 12 (241), pp. 8–11 (in Russian).
- Zhalnin D. A., Stefanova I. A. Metrics used to monitor the server, Aktual'nye problemy informatiki, radiotekhniki i svyazi, Materialy XXXI Rossijskoj nauchno-tekhnicheskoj konferencii, Samara, Povolzhskij gosudarstvennyj universitet telekommunikacij i informatiki, 2024, pp. 284–286 (in Russian).
- Silva R. A. Da. A implementação do Zabbix com segurança: um estudo de caso Zabbix safely, Foco, 2024, vol. 17, no. 4, article e4851. DOI: 10.54751/revistafoco.v17n4-055.