Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 8, 2014

Measuring Semantic Similarity between Two Sentences
K. V. Lunev, Student, Programmer, Lomonosov Moscow State University, e-mail: kirilllunev@gmail.com

This paper presents the results of initial research into models, algorithms, and software tools for determining the semantic similarity of two sentences. These results can be used by search engines to return more relevant content, and also in solving problems of clustering, generalization, text indexing, and many others. The author assumes that a sentence can be divided into three parts, each of which describes one aspect of a fact: what happened, where, and when. The algorithm for dividing a sentence into these parts is not considered at this stage of the research. The author proposes metrics for determining the semantic similarity of the parts of the sentences. These part-level similarities are then used to compute the semantic similarity of the whole sentences. For this purpose the WordNet semantic network, the Yandex search engine, the Geocoding API, and appropriate algorithms are used.
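The overall scheme described above (score each of the three parts separately, then combine the part scores into one sentence score) can be illustrated with a minimal sketch. It is not the author's implementation: the `Fact` structure, the token-overlap placeholder `jaccard`, the weighted-average combination, and the default weights are all assumptions made for illustration. In the paper the part-level metrics rely on WordNet, Yandex, and the Geocoding API, none of which is called here.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    """A sentence already split into the three parts (hypothetical structure)."""
    what: str
    where: str
    when: str


def jaccard(a: str, b: str) -> float:
    """Placeholder part-level similarity: token overlap in [0, 1].

    Stands in for the paper's metrics, which are not reproduced here.
    """
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty parts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def sentence_similarity(x: Fact, y: Fact, weights=(0.5, 0.25, 0.25)) -> float:
    """Combine part-level scores with a weighted average (assumed scheme)."""
    scores = (
        jaccard(x.what, y.what),
        jaccard(x.where, y.where),
        jaccard(x.when, y.when),
    )
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)
```

With these assumed weights, two sentences that agree on what happened and when but differ in where receive a score of 0.75. A real system would replace `jaccard` with a separate metric for each part, for example a location-aware comparison for the "where" part.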

Keywords: natural language processing, sentence semantic similarity, algorithms, search engine, geocoding
pp. 30–39