Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 10, 2015

Method of Extracting Hyponym-Hypernym Relationships for Nouns from Definitions of Explanatory Dictionaries
Yu. A. Kiselev, Junior Research Fellow, e-mail: yuri.kiselev@urfu.ru, S. V. Porshnev, Head of Department, e-mail: sergey_porshnev@mail.ru, M. Yu. Mukhin, Professor, e-mail: mfly@sky.ru, Ural Federal University named after the first President of Russia B. N. Yeltsin, Ekaterinburg

A method is proposed for automatically retrieving pairs of nouns linked by hyponym-hypernym relations by processing definitions from explanatory dictionaries. Quantitative estimates of the method's precision and recall are obtained; they indicate that the method is effective enough to populate a thesaurus database with hyponym-hypernym relationships. The average precision of the method is 0.58, and it rises to 0.68 for the most frequent words of the Russian language. The precision for words belonging to a particular semantic class does not differ significantly from that for random words. A total of 23,500 hyponym-hypernym pairs were correctly retrieved from the Dictionary of the Russian Language. This recall is sufficient for subsequently building a hyponym-hypernym hierarchy between concepts from these pairs. All the retrieved data will be used in the YARN project, a large open WordNet-like machine-readable thesaurus for the Russian language that is being built through crowdsourcing. An analysis of the agreement among native Russian speakers in judging the correctness of the retrieved hyponym-hypernym pairs leads to the conclusion that crowdsourcing can be used to populate the thesaurus.

Keywords: thesaurus, dictionary, semantic relationships, hyponym-hypernym relationships, crowdsourcing, Russian language
pp. 38–48