ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 12. Vol. 30. 2024
DOI: 10.17587/it.30.632-640
A. S. Romanov, PhD in Engineering Science, Associate Professor, Senior Researcher, A. M. Fedotova, Postgraduate Student, Junior Researcher, A. V. Kurtukova, Postgraduate Student, Junior Researcher, A. A. Shelupanov, Professor, Head of the Department,
Tomsk State University of Control Systems and Radioelectronics, Competence Center of the National Technological Initiative "Trusted Interaction Technologies" (CT NTI "Trusted Interaction Technologies"), Tomsk, Russian Federation
Methodology for Identification the Author of Destructive Textual Data
The article addresses authorship attribution for short comments written by social network users, including members of communities associated with destructive content. The study covers classification with both a fixed (closed) and an open set of authors. For the open-set case, several experiments were conducted, including detection of destructive content by introducing authors of such content. The results show that the proposed methodology achieves a high accuracy of 85%.
Keywords: attribution, neural networks, transfer learning, extremism, foreign agents, text processing, machine learning, classification
Acknowledgements: This research was funded by the Ministry of Science and Higher Education of Russia, Government Order for 2023–2025, project no. FEWM-2023-0015 (TUSUR).
P. 632-640