ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 7. Vol. 31. 2025
DOI: 10.17587/it.31.356-363
S. S. Kolmogorova (1, 2), Cand. of Tech. Sc., Associate Professor, S. A. Ivanov (1), Cand. of Tech. Sc., Associate Professor, V. S. Pavlov (1), Cand. of Agricult. Sc., Associate Professor
(1) Saint Petersburg State Forest Technical University, St. Petersburg, Russian Federation
(2) Saint Petersburg Electrotechnical University "LETI", St. Petersburg, Russian Federation
Predictive Models for a Streaming Big Data Collection System Based on Distributed Electroinduction Sensors
Received on 06.09.2024
Accepted on 11.01.2025
The article discusses approaches to predicting parameters, in particular the characteristics of the electromagnetic field, from data collected in a distributed environment. Several classification models were investigated on a simulated sensor data set, covering three groups of categorization methods: Bayesian methods based on Bayes' theorem (Naive Bayes and Multinomial Naive Bayes); decision tree methods, which are multi-option methods underlying the Decision Stump, Hoeffding Tree (Very Fast Decision Tree), Hoeffding Option Tree and Hoeffding Adaptive Tree algorithms; and meta/ensemble methods, which combine a set of classification models that perform the same task and merge the decisions of the individual models to determine the output. Experimental analysis showed that the use of these models can significantly improve forecasting efficiency. The proposed methods are aimed at efficient classification. The article presents two results related to distributed machine learning: first, the performance of classifiers with and without regularization in terms of the accuracy metric; second, the relationship between the size of the dataset and this metric. To avoid overfitting and the resulting loss of model accuracy, L1 regularization (Lasso regression) is used. The results obtained are implemented in a real-time system that measures streaming information about the parameters of the electromagnetic field.
Keywords: forecasting, categorization, electrometry, distributed data collection system, streaming big data, distributed system, classification models
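As an illustration of the stream classification setting described in the abstract, the following is a minimal sketch of an incremental Naive Bayes classifier evaluated in a test-then-train (prequential) fashion. It is not the authors' implementation: the Gaussian likelihood, the class `StreamingGaussianNB`, the generator `simulated_stream`, and the two hypothetical features (field strength and frequency) are all assumptions made for this example.

```python
import math
import random


class StreamingGaussianNB:
    """Incremental Gaussian Naive Bayes (hypothetical helper for this sketch).

    Keeps per-class running means and squared deviations (Welford's method),
    so each sample is processed once and never stored.
    """

    def __init__(self):
        self.count = {}  # class -> number of samples seen
        self.mean = {}   # class -> per-feature running means
        self.m2 = {}     # class -> per-feature sums of squared deviations

    def learn_one(self, x, y):
        if y not in self.count:
            self.count[y] = 0
            self.mean[y] = [0.0] * len(x)
            self.m2[y] = [0.0] * len(x)
        self.count[y] += 1
        n = self.count[y]
        for i, v in enumerate(x):
            delta = v - self.mean[y][i]
            self.mean[y][i] += delta / n
            self.m2[y][i] += delta * (v - self.mean[y][i])

    def predict_one(self, x):
        if not self.count:
            return None  # nothing learned yet
        total = sum(self.count.values())
        best, best_score = None, -math.inf
        for y, n in self.count.items():
            score = math.log(n / total)  # log prior
            for i, v in enumerate(x):
                var = self.m2[y][i] / n + 1e-6  # small constant avoids zero variance
                diff = v - self.mean[y][i]
                score += -0.5 * math.log(2 * math.pi * var) - diff * diff / (2 * var)
            if score > best_score:
                best, best_score = y, score
        return best


def simulated_stream(n, seed=0):
    """Assumed two-class sensor readings: [field strength, frequency]."""
    rng = random.Random(seed)
    for _ in range(n):
        y = rng.randint(0, 1)
        centre = (1.0, 50.0) if y == 0 else (3.0, 60.0)
        yield [rng.gauss(centre[0], 0.5), rng.gauss(centre[1], 2.0)], y


def prequential_accuracy(model, stream):
    """Test-then-train: predict each sample first, then learn from it."""
    correct = seen = 0
    for x, y in stream:
        if model.predict_one(x) == y:
            correct += 1
        seen += 1
        model.learn_one(x, y)
    return correct / seen


accuracy = prequential_accuracy(StreamingGaussianNB(), simulated_stream(2000))
print(f"prequential accuracy: {accuracy:.3f}")
```

Calling `prequential_accuracy` with streams of different lengths gives a rough view of how accuracy depends on the amount of data seen, which is the second relationship the abstract mentions.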
P. 356-363
Full text on eLIBRARY