DOI: 10.17587/prin.17.128-141
Detecting Multi-Step Attacks on IoT Devices Using Machine Learning and Big Data Processing Methods
I. V. Zelichenok, Junior Researcher, zelichenok@comsec.spb.ru,
I. V. Kotenko, D. Sc. (Eng.), Professor, ivkote@comsec.spb.ru,
Saint Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178, Russian Federation
Corresponding author: Igor V. Kotenko, Professor, Saint Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), 199178, Russian Federation. E-mail: ivkote@comsec.spb.ru
Received on July 29, 2025
Accepted on September 19, 2025
This paper presents an improved architecture for a network intrusion detection system (NIDS) that combines big data processing with deep learning. The architecture includes two modules with LSTM layers, one for short-term and one for long-term analysis of event chains, so that multi-step attacks can be tracked through temporal analysis. Combining the modules makes it possible not only to detect current threats promptly but also to reconstruct attack scenarios. Unlike existing solutions, the approach adapts dynamically to changing attack patterns and operates under limited computing resources. It incorporates big data processing techniques based on batch learning, parallel computing, and database clusters capable of processing a large number of events per second, which reduces the security system's response time to incidents. In tests on the Kitsune dataset, the system achieved an accuracy of 0.94 and an F1-measure of 0.91 in multi-class classification. These results indicate that the proposed NIDS architecture can detect multi-step attacks and handle big data even when computing resources are limited.
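The pairing of short-term and long-term analysis with batch processing can be illustrated with a minimal sketch. It only shows how a single event stream might be cut into a short and a long context and then grouped into batches. It is not the authors' implementation and contains no LSTM layers. The window lengths, the batch size, the event format, and the names `dual_windows` and `batches` are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Iterator, List, Tuple

# Hypothetical event record: a flat mapping of numeric features.
Event = Dict[str, float]

SHORT_WINDOW = 4   # assumed length of the short-term context
LONG_WINDOW = 16   # assumed length of the long-term context


def dual_windows(
    events: Iterable[Event],
    short_len: int = SHORT_WINDOW,
    long_len: int = LONG_WINDOW,
) -> Iterator[Tuple[List[Event], List[Event]]]:
    """After each new event, yield (short_context, long_context).

    The long context holds at most `long_len` most recent events;
    the short context is the last `short_len` of those.
    """
    history: Deque[Event] = deque(maxlen=long_len)
    for event in events:
        history.append(event)
        recent = list(history)
        yield recent[-short_len:], recent


def batches(items: Iterable, batch_size: int) -> Iterator[list]:
    """Group an iterable into lists of at most `batch_size` items."""
    batch: list = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # emit the final, possibly smaller, batch
        yield batch


# Synthetic stream of 20 events, used only to exercise the functions.
stream = [{"id": float(i)} for i in range(20)]
pairs = list(dual_windows(stream))
grouped = list(batches(pairs, batch_size=8))
```

In a real pipeline each `(short_context, long_context)` pair would be turned into feature tensors for the two sequence models, and each batch would be one incremental training or inference step.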
Keywords: information security, multi-stage attacks, network attack detection, Internet of Things, machine learning, deep learning, neural networks, iterative learning, big data processing, dynamic NoSQL databases
pp. 128—141
For citation:
Zelichenok I. Y., Kotenko I. V. Detecting Multi-Step Attacks on IoT Devices Using Machine Learning and Big Data Processing Methods, Programmnaya Ingeneria, 2026, vol. 17, no. 3, pp. 128—141. DOI: 10.17587/prin.17.128-141. (in Russian).
References:
- Zelichenok I., Kotenko I. L/STIM: A Framework for Detecting Multi-Stage Cyber Attacks, 2024 International Russian Smart Industry Conference (SmartIndustryCon), IEEE, 2024, pp. 208—213. DOI: 10.1109/SmartIndustryCon61328.2024.10516137.
- Alabdulatif A., Rizvi S., Hashmani M. Optimal machine learning models for kitsune to detect mirai botnet malware attack, Journal of Hunan University Natural Sciences, 2021, vol. 48 (6), article 12, pp. 91–102.
- Zhang L. Anomaly Detection Algorithm in IoT Environment Based on Deep Learning, SSRN 4791605, 2024. DOI: 10.2139/ssrn.4791605.
- Guida C., Nascita A., Montieri A., Pescape A. Cross-evaluation of deep learning-based network intrusion detection systems, 2023 10th International Conference on Future Internet of Things and Cloud (FiCloud), IEEE, 2023, pp. 328—335. DOI: 10.1109/FiCloud58648.2023.00055.
- Altamimi S., Abu Al-Haija Q. Maximizing intrusion detection efficiency for IoT networks using extreme learning machine, Discover Internet of Things, 2024, vol. 4, no. 1, article 5. DOI: 10.1007/s43926-024-00060-x.
- Branitskiy A., Kotenko I. Hybridization of computational intelligence methods for attack detection in computer networks, Journal of Computational Science, 2017, vol. 23, pp. 145—156. DOI: 10.1016/j.jocs.2016.07.010.
- Nascita A., Carillo R., Giampetraglia F. et al. Interpretability and Complexity Reduction in IoT Network Anomaly Detection Via XAI, 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), IEEE, 2024, pp. 325–329. DOI: 10.1109/ICASSPW62465.2024.10626031.
- Eke H. N., Petrovski A., Ahriz H. The use of machine learning algorithms for detecting advanced persistent threats, Proceedings of the 12th International Conference on Security of Information and Networks, 2019, pp. 1—8. DOI: 10.1145/3357613.335761.
- Ghafir I., Kyriakopoulos K. G., Lambotharan S. et al. Hidden Markov Models and Alert Correlations for the Prediction of Advanced Persistent Threats, IEEE Access, 2019, vol. 7, pp. 99508–99520. DOI: 10.1109/ACCESS.2019.2930200.
- Sahabandu D., Allen J., Moothedath S. et al. Quickest Detection of Advanced Persistent Threats: A Semi-Markov Game Approach, 2020 ACM/IEEE 11th International Conference on Cyber-Physical Systems (ICCPS), Sydney, Australia, IEEE, 2020, pp. 9—19. DOI: 10.1109/ICCPS48487.2020.00009.
- Amin M. A. R. A., Shetty S., Njilla L. et al. Hidden Markov Model and Cyber Deception for the Prevention of Adversarial Lateral Movement, IEEE Access, 2021, vol. 9, pp. 49662—49682. DOI: 10.1109/ACCESS.2021.3069105.
- Kotenko I., Chechulin A., Novikova E. Attack Modelling and Security Evaluation for Security Information and Event Management, SECRYPT 2012 — Proceedings of the International Conference on Security and Cryptography, 2012, pp. 391—394. DOI: 10.5220/0004063403910394.
- Gualberto E. S., De Sousa R. T., De Brito Vieira T. P. et al. The Answer is in the Text: Multi-Stage Methods for Phishing Detection Based on Feature Engineering, IEEE Access, 2020, vol. 8, pp. 223529–223547. DOI: 10.1109/ACCESS.2020.3043396.
- Nie P., Xu G., Jiao L. et al. Sparse Trust Data Mining, IEEE Transactions on Information Forensics and Security, 2021, vol. 16, pp. 4559—4573. DOI: 10.1109/TIFS.2021.3109412.
- Guo Y., Sun Y., Wu K., Jiang K. New Algorithms of Feature Selection and Big Data Assignment for CBR System Integrated by Bayesian Network, ACM Transactions on Knowledge Discovery from Data, 2020, vol. 14, no. 2, article 23. DOI: 10.1145/337308.
- Vu T., Belussi A., Migliorini S., Eldawy A. Using Deep Learning for Big Spatial Data Partitioning, ACM Transactions on Spatial Algorithms and Systems, 2021, vol. 7, no. 1, article 3. DOI: 10.1145/3402126.
- Yu J., Sarwat M. GeoSparkViz: a cluster computing system for visualizing massive-scale geospatial data, The VLDB Journal, 2021, vol. 30, pp. 237—258. DOI: 10.1007/s00778-020-00645-2.
- Kotenko I., Saenko I., Kushnerevich A. Parallel big data processing system for security monitoring in Internet of Things networks, J. Wirel. Mob. Netw. Ubiquitous Comput. Dependable Appl, 2017, vol. 8 (4), pp. 60—74. DOI: 10.22667/JOWUA.2017.12.31.060.
- Shu J., Fang K., Chen Y., Wang S. TH-iSSD: Design and Implementation of a Generic and Reconfigurable Near-Data Processing Framework, ACM Transactions on Embedded Computing Systems, 2023, vol. 22, no. 6, article 96. DOI: 10.1145/3563456.
- Javaheripi M., Chang J. W., Koushanfar F. AccHashtag: Accelerated Hashing for Detecting Fault-Injection Attacks on Embedded Neural Networks, ACM Journal on Emerging Technologies in Computing Systems, 2023, vol. 19, no. 1, article 7. DOI: 10.1145/3555808.
- Vincon T., Knodler C., Solis-Vasquez L. et al. Near-data processing in database systems on native computational storage under HTAP workloads, Proceedings of the VLDB Endowment, 2022, vol. 15, no. 10, pp. 1991—2004. DOI: 10.14778/3547305.3547307.
- Zhang G., Mariano B., Shen X., Dillig I. Automated Translation of Functional Big Data Queries to SQL, Proceedings of the ACM on Programming Languages, 2023, vol. 7, no. OOPSLA1, article 95. DOI: 10.1145/3586047.
- Ling Q., Tinkelman D. P. Costs and benefits of the LIFO-FIFO choice, Journal of Corporate Accounting & Finance, 2024, vol. 35, no. 3, pp. 11–20. DOI: 10.1002/jcaf.22712.
- Hochreiter S., Schmidhuber J. Long short-term memory, Neural Computation, 1997, vol. 9, no. 8, pp. 1735–1780. DOI: 10.1162/neco.1997.9.8.1735.
- Antonakakis M., April T., Bailey M. et al. Understanding the mirai botnet, 26th USENIX security symposium (USENIX Security 17), 2017, pp. 1093—1110.
- DeLaughter S., Sollins K. SYN Proof-of-Work: Improving Volumetric DoS Resilience in TCP, 2025 IEEE Symposium on Security and Privacy (SP), IEEE, 2025, pp. 1877–1890. DOI: 10.1109/SP61157.2025.00166.
- Su L., Miao Y., Song Y. et al. Linear and Numerical SDoF Bounds of Active RIS-Assisted MIMO Wiretap Interference Channel, IEEE Open Journal of the Communications Society, 2025, vol. 6, pp. 5599—5610. DOI: 10.1109/OJCOMS.2025.3582404.
- Oei S., Suyanto Y., Pulungan R. A Comprehensive Approach for Detecting and Handling MitM-ARP Spoofing Attacks, IEEE Access, 2025, vol. 13, pp. 115503—115519. DOI: 10.1109/ACCESS.2025.3585463.
- Sharafaldin I., Lashkari A. H., Ghorbani A. Toward generating a new intrusion detection dataset and intrusion traffic characterization, Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), 2018, vol. 1, pp. 108—116. DOI: 10.5220/0006639801080116.
- Mohi-ud-din G. NSL-KDD, IEEE Dataport, 2018. DOI: 10.21227/425a-3e55.
- Moustafa N., Slay J. UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set), 2015 Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 2015, pp. 1—6. DOI: 10.1109/MilCIS.2015.7348942.