ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 12. Vol. 30. 2024

DOI: 10.17587/it.30.622-632

I. Yu. Kashirin, Dr. Tech. Sc., Professor, https://orcid.org/0000-0003-1694-7410
Ryazan State Radio Engineering University named after V. F. Utkin, Ryazan, Russian Federation

Tokenization of Political Texts in BERT Models Using ICF+ Ontologies

This paper considers the design of machine learning language models, and ensembles of such models, for complex analytics of news texts from domestic (Russian) and Western electronic media. An example software implementation of a new neural language model with problem-oriented ontological tokenization is given. The implementation uses Python 3.10 and Anaconda 2.1. The effectiveness of the approach relative to the best foreign counterparts is confirmed by a series of experiments in which news articles are classified by ideological orientation as either Western or English-language Russian.
Keywords: BERT models, ontological models, ICF+ relation, tokenizer, retriever, political news, ensembles of ML models, forecasting, semantic similarity

P. 622-632


 

 
