Informacionnye Tehnologii, 2018, vol. 24, no. 11, pp. 719-724

ABSTRACTS OF ARTICLES OF THE JOURNAL "INFORMATION TECHNOLOGIES".
No. 11. Vol. 24. 2018

DOI: 10.17587/it.24.719-724

A. B. Sorokin, PhD in Technique, Associate Professor, e-mail: ab_sorokin@mail.ru, A. P. Kushnarev, master student, e-mail: brainzeater@gmail.com, Moscow Technological University (MIREA)

Morphological Text Analyzer for Revealing the Completeness of Information

This article discusses the use of automatic part-of-speech tagging of Russian-language texts in digital form to determine whether a text contains excess or insufficient information and to identify and construct its concept. The main focus is on morphological analysis, one of the most difficult stages of text analysis because of specific features of Russian morphology, in particular the ambiguity in assigning a word to a particular part of speech. The accuracy of analyzing Russian-language texts is improved by identifying new patterns among the five parts of speech and by adding new inflectional endings to those already used in the Porter algorithm. Homonymy is resolved by creating an additional dictionary of homonyms containing the most commonly used homonymous word forms. Excessive and insufficient information for understanding the text is identified while constructing a conceptual structure and generating direct logical inference in the software package "Designer + Solver".
Keywords: conceptual structure, computational linguistics, part-of-speech tagging, POS tagging, automatic processing of documents, processing of texts in natural language, Porter stemming algorithm
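
The approach outlined in the abstract (a homonym dictionary consulted first, then Porter-style stripping of inflectional endings from an extensible list) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `HOMONYMS`, `ENDINGS`, and `tag`, the tag labels, and every dictionary entry are assumptions chosen for the example, and the full Russian Porter stemmer applies considerably more rules.

```python
# Hypothetical homonym dictionary: word forms that may belong to
# more than one part of speech. Entries are illustrative only.
HOMONYMS = {
    "стекло": ("NOUN", "VERB"),  # "glass" / "flowed down"
    "печь": ("NOUN", "VERB"),    # "stove" / "to bake"
}

# Hypothetical ending -> part-of-speech table. The real ending lists
# used in the article are not given in the abstract.
ENDINGS = {
    "ость": "NOUN",
    "ами": "NOUN",
    "ый": "ADJ",
    "ая": "ADJ",
    "ого": "ADJ",
    "ать": "VERB",
    "ить": "VERB",
    "ет": "VERB",
}


def tag(word, endings=ENDINGS, homonyms=HOMONYMS, min_stem=2):
    """Return (stem, candidate_tags) for a single word form."""
    w = word.lower()
    # Homonymous forms are looked up first and left ambiguous.
    if w in homonyms:
        return w, homonyms[w]
    # Try the longest endings first, keeping at least min_stem letters.
    for ending in sorted(endings, key=len, reverse=True):
        if w.endswith(ending) and len(w) - len(ending) >= min_stem:
            return w[: -len(ending)], (endings[ending],)
    return w, ("UNKNOWN",)


# Adding a new ending extends coverage without changing the function.
extended = {**ENDINGS, "ение": "NOUN"}
print(tag("красный"))
print(tag("печь"))
print(tag("чтение"))
print(tag("чтение", endings=extended))
```

Keeping the endings in a plain table means a newly identified ending is a data change rather than a code change, which is one plausible reading of "adding new inflectional endings to the existing ones".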

P. 719–724
