Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397
Issue No. 3, 2014
In this paper we consider a method for extracting the various references to a concept or named entity mentioned in a news cluster. The method is based on the joint application of heterogeneous similarity features, such as the structural organization of news clusters, comparison of word contexts, and information from predefined resources. The word contexts are used as a basis for multiword expression extraction and main entity detection. At the end of cluster processing, groups of thematically related elements are obtained, and the main element of each group is determined. The proposed algorithm is evaluated on a news cluster summarization task.
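The grouping idea summarized above can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the equal weights, the 0.5 threshold, the union-find merging, and the choice of the most frequent member as the main element are all assumptions made for illustration. Only two of the feature types are modeled (word-context similarity and a lookup in a predefined resource); the structural features of the cluster are left out.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two bag-of-words context counters."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_expressions(freq, contexts, related_pairs, weights=(0.5, 0.5), threshold=0.5):
    """Group expressions whose combined similarity reaches the threshold.

    freq:          expression -> frequency in the cluster
    contexts:      expression -> Counter of neighbouring words
    related_pairs: set of frozenset pairs taken from a predefined resource
    All parameters and default values are hypothetical.
    """
    parent = {e: e for e in freq}

    def find(e):
        # Union-find root lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    for a, b in combinations(sorted(freq), 2):
        # Weighted sum of heterogeneous features (an assumed combination rule).
        score = (weights[0] * cosine(contexts[a], contexts[b])
                 + weights[1] * (frozenset((a, b)) in related_pairs))
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for e in freq:
        groups.setdefault(find(e), []).append(e)

    # Assumption: the main element of a group is its most frequent member.
    return {max(members, key=lambda e: freq[e]): sorted(members)
            for members in groups.values()}


freq = {"president": 5, "head of state": 2, "airport": 3}
contexts = {
    "president": Counter({"signed": 2, "decree": 1}),
    "head of state": Counter({"signed": 1, "decree": 1}),
    "airport": Counter({"flight": 2}),
}
related_pairs = {frozenset(("president", "head of state"))}
result = group_expressions(freq, contexts, related_pairs)
```

With this toy input, "president" and "head of state" fall into one group headed by "president", while "airport" stays in a group of its own.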