Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 10, 2015

Construction of Quality Function for Scientific Papers Author Names Disambiguation Problem Using Supervised Learning Techniques
S. A. Afonin, Leading Researcher, Institute of Mechanics, Lomonosov Moscow State University; A. E. Gasparianc, Student, e-mail: gaspariants@mail.ru, Lomonosov Moscow State University

This paper considers the author name disambiguation problem. The problem consists of finding a mapping between a given set of authors of a scientific paper, i.e. a tuple of strings, and the set of all author records in a bibliographic database. An author's name may match more than one database record. In this case, disambiguation is performed using information about previous joint works recorded in the database. A number of numerical features reflecting the quality of a mapping are proposed, and supervised learning techniques are applied to obtain the final decision rule. Simulation on a real data set shows accuracy of 86 to 97 %, depending on the number of coauthors and the number of matching records.
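
The general idea can be illustrated with a minimal sketch. It is not the authors' implementation: the toy database, the two features, and the weights are all hypothetical, and the fixed weights stand in for a decision rule that would in practice be obtained by supervised learning on labeled mappings. The sketch enumerates every candidate mapping from name strings to records, computes numerical features from previous joint works, and keeps the mapping with the highest score.

```python
from itertools import combinations, product

# Hypothetical toy database: record id -> (author name, ids of recorded papers).
RECORDS = {
    "r1": ("Ivanov I.", {"p1", "p2"}),
    "r2": ("Ivanov I.", {"p3"}),
    "r3": ("Petrov P.", {"p1", "p2"}),
    "r4": ("Sidorov S.", {"p3"}),
}


def candidates(name):
    """Return all record ids whose stored name equals the given string."""
    return [rid for rid, (stored, _) in RECORDS.items() if stored == name]


def features(mapping):
    """Compute illustrative quality features for one mapping (tuple of record ids)."""
    pairs = list(combinations(mapping, 2))
    shared = [len(RECORDS[a][1] & RECORDS[b][1]) for a, b in pairs]
    linked = sum(1 for s in shared if s > 0)
    return {
        # Share of coauthor pairs that already have a joint work in the database.
        "linked_pair_fraction": linked / len(pairs) if pairs else 0.0,
        # Total number of joint works over all pairs.
        "joint_papers": sum(shared),
    }


def score(feats, weights):
    """Linear quality function; the weights are assumed to come from training."""
    return sum(weights[key] * value for key, value in feats.items())


def disambiguate(names, weights):
    """Pick the best-scoring mapping, or None if some name matches no record."""
    options = [candidates(name) for name in names]
    if any(not opts for opts in options):
        return None
    return max(product(*options), key=lambda m: score(features(m), weights))


# Hypothetical weights, fixed by hand here instead of being learned.
WEIGHTS = {"linked_pair_fraction": 1.0, "joint_papers": 0.1}
best = disambiguate(("Ivanov I.", "Petrov P."), WEIGHTS)
print(best)
```

In this toy example the name "Ivanov I." matches two records, and the record that shares earlier papers with "Petrov P." is preferred. Exhaustive enumeration is only practical for small numbers of coauthors and matching records.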

Keywords: automatic learning, classification, e-library, bibliographic record, author name, disambiguation, duplicate search
pp. 31–37