Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 6, 2018

DOI: 10.17587/prin.9.262-271
Graph Methods for Computing Semantic Similarity of a Pair of Keywords and Their Application to the Problem of Keywords Clustering
K. V. Lunev, e-mail: kirilllunev@gmail.com, Faculty of Mechanics and Mathematics, Lomonosov Moscow State University, Institute of Mechanics, Moscow, 119192, Russian Federation
Corresponding author: Lunev Kirill V., Postgraduate Student, Faculty of Mechanics and Mathematics, Lomonosov Moscow State University, Institute of Mechanics, Moscow, 119192, Russian Federation, e-mail: kirilllunev@gmail.com
Received on March 03, 2018
Accepted on April 13, 2018

The article presents research into models, algorithms, and software for determining the semantic similarity between two keywords. The methods used are based on graph-theoretic algorithms. Each document is represented as the set of keywords associated with it. A measure of contextual similarity for a pair of keywords is developed. A keyword graph is then constructed for a given collection of documents: its nodes correspond to keywords, and its edges indicate that a pair of keywords is contextually close. A method for clustering the constructed graph is presented. Keywords that fall into the same cluster are semantically similar, which is an important result of this work. The software implementation of the developed models was tested on collections of keywords from scientific publications and on a collection of post tags from the VKontakte social network.
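
The pipeline outlined in the abstract (documents as keyword sets, a contextual similarity measure, a keyword graph, and graph clustering) can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the clustering method, so everything concrete here is an assumption: the Jaccard overlap of co-occurrence contexts stands in for the article's measure, connected components stand in for its clustering method, and the function names, the threshold of 0.6, and the toy documents are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_contexts(documents):
    """Map each keyword to the set of keywords it co-occurs with in any document."""
    contexts = defaultdict(set)
    for keywords in documents:
        unique = set(keywords)
        for kw in unique:
            contexts[kw] |= unique - {kw}
    return contexts


def context_similarity(a, b, contexts):
    """Jaccard overlap of the two keywords' contexts (an assumed measure,
    not the one defined in the article). Each keyword is excluded from the
    other's context so that direct co-occurrence does not distort the score."""
    ctx_a = contexts[a] - {b}
    ctx_b = contexts[b] - {a}
    union = ctx_a | ctx_b
    return len(ctx_a & ctx_b) / len(union) if union else 0.0


def build_graph(documents, threshold=0.6):
    """Keyword graph: nodes are keywords, edges join contextually close pairs.
    The threshold value is hypothetical."""
    contexts = build_contexts(documents)
    graph = {kw: set() for kw in contexts}
    for a, b in combinations(sorted(contexts), 2):
        if context_similarity(a, b, contexts) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def cluster(graph):
    """Connected components, used here as a simple stand-in for the
    article's clustering method."""
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Toy collection: each document is the list of keywords attached to it.
documents = [
    ["car", "engine", "wheel"],
    ["automobile", "engine", "wheel"],
    ["cat", "pet", "fur"],
    ["kitten", "pet", "fur"],
]
clusters = cluster(build_graph(documents))
```

In this toy collection, "car" and "automobile" never appear in the same document, yet they share the same context and therefore end up in one cluster, which is the kind of semantic grouping the abstract describes.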

Keywords: semantic similarity, natural language processing, graph algorithms, graph theory, clustering
pp. 262–271
For citation:
Lunev K. V. Graph Methods for Computing Semantic Similarity of a Pair of Keywords and Their Application to the Problem of Keywords Clustering, Programmnaya Ingeneria, 2018, vol. 9, no. 6, pp. 262–271.