ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 12. Vol. 30. 2024
DOI: 10.17587/it.30.622-632
I. Yu. Kashirin, Dr. Tech. Sc., Professor, https://orcid.org/0000-0003-1694-7410
Ryazan State Radio Engineering University named after V. F. Utkin, Ryazan, Russian Federation
Tokenization of Political Texts in BERT Models Using ICF+ Ontologies
The paper considers the design of machine learning language models and their ensembles for complex analytics of news texts from domestic and Western electronic media. It presents an example software implementation of a new neural network language model with problem-oriented ontological tokenization. The implementation tools are Python 3.10 and Anaconda 2.1. A series of experiments on classifying news articles by ideological orientation, as Western or English-language Russian, confirms the effectiveness of the approach in comparison with the best foreign analogues.
Keywords: BERT models, ontological models, ICF+ relation, tokenizer, retriever, political news, ensembles of ML models, forecasting, semantic similarity
P. 622-632
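The abstract does not reproduce the ICF+ ontology or the tokenizer itself, so the following is only a minimal sketch of the general idea of ontological tokenization, assuming a greedy longest-match lookup of multi-word concepts before any subword splitting. The `ONTOLOGY` dictionary, the bracketed concept tokens, and the function name `ontology_tokenize` are hypothetical illustrations, not taken from the paper.

```python
import re

# Hypothetical mini-ontology: multi-word political concepts mapped to
# single concept tokens. These entries are illustrative, not from the paper.
ONTOLOGY = {
    ("european", "union"): "[EUROPEAN_UNION]",
    ("economic", "sanctions"): "[ECONOMIC_SANCTIONS]",
    ("sanctions",): "[SANCTIONS]",
}
MAX_SPAN = max(len(key) for key in ONTOLOGY)


def ontology_tokenize(text):
    """Greedy longest-match tokenization over ontology concepts."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate span first so that
        # "economic sanctions" wins over plain "sanctions".
        for span in range(min(MAX_SPAN, len(words) - i), 0, -1):
            concept = ONTOLOGY.get(tuple(words[i:i + span]))
            if concept is not None:
                tokens.append(concept)
                i += span
                break
        else:
            # No concept matched: keep the plain word so that a
            # downstream subword tokenizer can split it further.
            tokens.append(words[i])
            i += 1
    return tokens


example = ontology_tokenize("The European Union extended economic sanctions.")
print(example)
```

In a BERT pipeline, concept tokens of this kind would typically be registered as additional vocabulary entries so that each ontology concept receives its own embedding; whether the paper does this is not stated in the abstract.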
References
- Anastas'yev A. A., Astashkin M. S., Agafonov P. A., Kashirin I. Yu. Determining the reliability of news using knowledge-based ML models, IIASU'23: Artificial intelligence in management, control, and data processing systems. Proceedings of the II All-Russian scientific conference (Moscow, April 27-28, 2023), 2023, vol. 2, pp. 21-27.
- Platonov Ye. N., Rudenko V. Yu. Identification and classification of toxic statements by machine learning methods, Data modeling and analysis, 2022, vol. 12, no. 1, pp. 27-48.
- Badjatiya P., Gupta S., Gupta M., Varma V. Deep learning for hate speech detection in tweets, Proceedings of the 26th International Conference on World Wide Web Companion, 2017, pp. 759-760.
- Agrawal A., An A. Affective representations for sarcasm detection, 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2018, pp. 1029-1032.
- Xu H., Liu B., Shu L., Yu Ph. S. BERT post-training for review reading comprehension and aspect-based sentiment analysis, arXiv preprint arXiv:1904.02232 (2019).
- Chiarcos C., Apostol E.-S., Kabashi B., Truica C.-O. Modelling frequency, attestation, and corpus-based information with OntoLex-FrAC, Proceedings of the 29th International Conference on Computational Linguistics, pp. 4018-4027.
- Roumeliotis K. I., Tselikas N. D. ChatGPT and Open-AI Models: A Preliminary Review, Future Internet, 2023, vol. 15, article 192, available at: https://doi.org/10.3390/fi15060192.
- International repository for data analysis and original technological solutions. [Electronic resource], 2024, update date: 10.04.2024, available at: https://www.kaggle.com/ (date of access: 16.04.2022).
- International repository of language neural network models. [Electronic resource], 2024, update date: 12.03.2024, available at: https://huggingface.co/models (date of access: 26.09.2023).
- Kashirin I. Yu. Application of hierarchical number theory in the construction of ICF taxonomy for optimization of neural networks, Vestnik RGRTU, 2022, pp. 118-126.
- Baader F., Calvanese D., McGuinness D., Nardi D., Patel-Schneider P., eds. The Description Logics Handbook: Theory, Implementation and Applications, New York, Cambridge University Press, 2003.
- Kashirin I. Yu., Filatov I. Yu. Formalized Description of Intuitive Perception of Spatial Situations, 2019 8th Mediterranean Conference on Embedded Computing (MECO), Budva, Montenegro, 2019, pp. 1-4.
- Duineveld A. J., Stoter R., Weiden M. R., Kenepa B., Benjamins V. R. WonderTools? A comparative study of ontological engineering tools, International Journal of Human-Computer Studies, 2000, vol. 52, no. 6, pp. 1111-1133.
- Kashirin D. I., Kashirin I. Yu., Pyl'kin A. N. Polimorficheskoye predstavleniye znaniy v Semantic Web [Polymorphic knowledge representation in the Semantic Web], Moscow, Goryachaya liniya - Telekom, 2009, 138 p.
- Kashirin I. Yu. Iyerarkhicheskiye chisla dlya proyektirovaniya taksonomiy iskusstvennogo intellekta ICF [Hierarchical numbers for designing ICF artificial intelligence taxonomies], Vestnik RGRTU, 2020, no. 71, pp. 71-82.
- Definition of hierarchical numbers. [Electronic resource], 2024, update date: 04.03.2024, available at: https://kashirin.net/definition-of-hierarchical-numbers (date of access: 16.04.2022).