Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 4, 2013

Validation of the Thematic Models for Document Collections
A. A. Kuzmin, V. V. Strijov, e-mail: strijov@ccas.ru

We consider a collection of documents that comes with an expert thematic model. To verify the adequacy of the expert model, an algorithmic model is built by hierarchical clustering of the text collection. Agglomerative and divisive clustering methods are investigated. The error of the algorithmic model relative to the expert model is estimated, and the differences between the expert and algorithmic models are visualized.
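The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The toy term-count vectors, the cosine distance, the average-linkage merge rule, and the mismatch-rate error measure are all assumptions made for illustration, and only the agglomerative variant is shown.

```python
from itertools import permutations
from math import sqrt


def cosine_distance(u, v):
    """1 - cosine similarity between two term-count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, k):
    """Average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(
                    cosine_distance(vectors[i], vectors[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters[b]  # merge the closest pair
        del clusters[b]
    labels = [0] * len(vectors)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def mismatch_rate(expert, algo, k):
    """Share of documents whose cluster differs from the expert topic,
    minimised over all renamings of the cluster labels (fine for small k)."""
    best = len(expert)
    for perm in permutations(range(k)):
        errors = sum(1 for e, a in zip(expert, algo) if perm[a] != e)
        best = min(best, errors)
    return best / len(expert)


# Hypothetical toy collection: rows are documents, columns are term counts.
docs = [
    [3, 1, 0, 0],
    [2, 2, 0, 0],
    [4, 0, 1, 0],
    [0, 0, 3, 2],
    [0, 1, 2, 3],
    [0, 0, 1, 4],
]
# Hypothetical expert topics; the last document is deliberately "misfiled".
expert_labels = [0, 0, 0, 1, 1, 0]

algo_labels = agglomerative(docs, k=2)
error = mismatch_rate(expert_labels, algo_labels, k=2)
print(algo_labels, error)
```

In this toy setup the documents on which the two labelings disagree are the ones an expert would be asked to re-examine. A divisive method would start from a single cluster and split it repeatedly instead of merging.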

Keywords: document collection, thematic model, hierarchical model, clustering
pp. 16–20