Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397
Issue No. 12, 2017
This article addresses problems in the automated processing of bibliographic data in scientometric systems by means of statistical analysis of large collections of such data. Experimental results on author identification in bibliographic data are based on the data set of ISTINA, a scientometric information system developed and deployed at Moscow State University. The new author identification algorithm presented in this paper achieves 95% accuracy on this data set. Using the algorithm improves the quality of data entered into the system, which leads to more reliable scientometric characteristics of individual researchers and administrative units. The paper also discusses possible approaches to some related, practically important problems, such as thematic search and classification of publications using the co-authorship graph, and thematic classification of scientific journals. Automatic discovery of researchers' topics of interest allows on-demand generation of documents that support decision making in specific research areas, and could be used to supply system users with information on relevant journals or upcoming scientific events.