Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 9, 2018

DOI: 10.17587/prin.9.425-432
Extraction of Explicit Consumer Intentions from Social Network Messages
I. S. Pimenov, pimenov.1330@yandex.ru, Novosibirsk State University, Novosibirsk, 630090, Russian Federation, N. V. Salomatina, salomatina_nv@live.ru, Sobolev Institute of Mathematics, Novosibirsk, 630090, Russian Federation
Corresponding author: Salomatina Natalya V., PhD, Senior Researcher, Sobolev Institute of Mathematics, Novosibirsk, 630090, Russian Federation, E-mail: salomatina_nv@live.ru
Received on August 20, 2018
Accepted on August 21, 2018

This paper addresses the problem of automatic fact extraction from Russian texts. The facts under examination are the intentions of social network users to purchase certain goods or use certain services. The approach is based on semantic tagging of user messages by an expert, followed by automatic construction of rules. The training set for expert annotation consisted of messages from the "VKontakte" social network, selected through the LeadScanner API. The proposed system of semantic tags distinguishes between various intentional blocks: objects, their different properties, and emphatic constructions. Pre-processing of the training set included lemmatization and grammatical tagging with PyMorphy2. A directed graph was then constructed from the training set. Each node in this graph corresponds to an intentional block and stores its expert-assigned intentional tag together with the grammatical and/or lexical properties of its main word. The edges of the graph connect intentional blocks that occur in adjacent positions in any message of the training set. Intention objects and their properties were extracted by analyzing the test set in accordance with the constructed graph. The test set also included messages containing non-consumer intentions or no intentions at all. The results of the testing stage show that the approach makes it possible to determine whether a particular message expresses an intention and, if it does, to extract the intention object along with its relevant properties. The precision and recall of intention extraction were 81% and 74%, respectively. The extracted data can be used for further refinement of message classification.
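
The graph construction and matching outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The tag names (MARKER, PROPERTY, OBJECT), the feature strings, the function names `build_graph` and `tag_path`, and the two training messages are hypothetical and chosen only for illustration. The sketch also assumes that each message has already been reduced to a sequence of blocks, one feature per block.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Block(NamedTuple):
    tag: str      # intentional tag assigned by the expert (hypothetical names)
    feature: str  # grammatical or lexical property of the block's main word


def build_graph(messages: list[list[Block]]) -> dict[Block, set[Block]]:
    """Add an edge between every pair of blocks adjacent in some message."""
    edges: dict[Block, set[Block]] = defaultdict(set)
    for blocks in messages:
        for left, right in zip(blocks, blocks[1:]):
            edges[left].add(right)
    return edges


def tag_path(features: list[str],
             edges: dict[Block, set[Block]]) -> Optional[list[Block]]:
    """Find a path in the graph whose node features match the given sequence.

    Returns the matched blocks (with their tags) or None if no path exists.
    """
    if not features:
        return None
    nodes = set(edges) | {n for targets in edges.values() for n in targets}
    paths = [[n] for n in sorted(nodes) if n.feature == features[0]]
    for feature in features[1:]:
        paths = [
            path + [nxt]
            for path in paths
            for nxt in sorted(edges.get(path[-1], ()))
            if nxt.feature == feature
        ]
    return paths[0] if paths else None


# Illustrative training data: a marker verb lemma followed by an
# optional adjective (property) and a noun (object).
train = [
    [Block("MARKER", "kupit"), Block("PROPERTY", "ADJF"), Block("OBJECT", "NOUN")],
    [Block("MARKER", "kupit"), Block("OBJECT", "NOUN")],
]
graph = build_graph(train)

match = tag_path(["kupit", "ADJF", "NOUN"], graph)
objects = [b for b in match if b.tag == "OBJECT"] if match else []
```

In this sketch a message whose feature sequence cannot be traced along the edges of the graph yields `None`, which corresponds to deciding that the message does not express an intention of the learned form.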

Keywords: intention, intention marker, intentional block with annotation, directed graph, fact extraction
pp. 425–432
For citation:
Pimenov I. S., Salomatina N. V. Extraction of Explicit Consumer Intentions from Social Network Messages, Programmnaya Ingeneria, 2018, vol. 9, no. 9, pp. 425–432.
This work was supported by the Russian Foundation for Basic Research, project no. 0314-2016-0015.