Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 3, 2024

DOI: 10.17587/prin.15.125-133
Methods for Thematic Analysis and Ranking in Information Analysis Systems Based on Relational DBMS
D. A. Shachnev, PhD, Researcher, mitya57@gmail.com, Lomonosov Moscow State University, Moscow, 119991, Russian Federation
Corresponding author: Dmitry A. Shachnev, PhD, Researcher, Lomonosov Moscow State University, Moscow, 119991, Russian Federation, e-mail: mitya57@gmail.com
Received on December 13, 2023
Accepted on January 15, 2024

The paper presents models and algorithms for searching, filtering, and ranking data in an information analysis system, based on an ontological description of the system's structure. Users specify rules for filtering and ranking objects by their boolean, numeric, or enumerable properties, and SQL queries are evaluated based on these rules. The paper also considers how to take the subject area into account when searching for objects. Doing so helps solve many tasks that arise in current research information systems, such as finding experts in given subject areas or generating reports. A search query can be specified as a set of keywords, and a statistical approach is used to compare keyword sets.
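As a rough illustration of how user-specified filtering and ranking rules might be turned into an SQL query, here is a minimal sketch in Python using the standard sqlite3 module. It is not the paper's implementation: the publication table, its columns, the operator names, and the weighted-sum score are all assumptions made for the example.

```python
import sqlite3

# Hypothetical whitelist of properties that rules may refer to.
ALLOWED_COLUMNS = {"is_reviewed", "citations", "kind"}
# Hypothetical rule operators mapped to SQL comparison operators.
OPERATORS = {"eq": "=", "ge": ">=", "le": "<="}


def build_query(filters, ranking):
    """Build a parameterized SQL query from rules.

    filters: list of (column, operator, value) tuples, combined with AND.
    ranking: list of (column, weight) tuples, combined as a weighted sum.
    Returns (sql, params). Column names are checked against a whitelist,
    and all values are passed as bound parameters.
    """
    where_parts, where_params = [], []
    for column, op, value in filters:
        if column not in ALLOWED_COLUMNS or op not in OPERATORS:
            raise ValueError(f"unsupported filter: {column} {op}")
        where_parts.append(f"{column} {OPERATORS[op]} ?")
        where_params.append(value)

    score_parts, score_params = [], []
    for column, weight in ranking:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unsupported ranking column: {column}")
        score_parts.append(f"? * {column}")
        score_params.append(float(weight))

    score = " + ".join(score_parts) if score_parts else "0"
    where = " AND ".join(where_parts) if where_parts else "1 = 1"
    sql = (
        f"SELECT id, title, {score} AS score FROM publication "
        f"WHERE {where} ORDER BY score DESC, id"
    )
    # Placeholders in the SELECT list come before those in WHERE.
    return sql, score_params + where_params


def demo():
    """Run the sketch on a small in-memory table and return ranked ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE publication (id INTEGER PRIMARY KEY, title TEXT, "
        "is_reviewed INTEGER, citations INTEGER, kind TEXT)"
    )
    conn.executemany(
        "INSERT INTO publication VALUES (?, ?, ?, ?, ?)",
        [
            (1, "A", 1, 10, "article"),
            (2, "B", 0, 50, "article"),
            (3, "C", 1, 3, "book"),
            (4, "D", 1, 7, "article"),
        ],
    )
    filters = [("kind", "eq", "article"), ("citations", "ge", 5)]
    ranking = [("citations", 1.0), ("is_reviewed", 20.0)]
    sql, params = build_query(filters, ranking)
    ids = [row[0] for row in conn.execute(sql, params)]
    conn.close()
    return ids


if __name__ == "__main__":
    print(demo())
```

In this sketch, the filter keeps articles with at least 5 citations, and the ranking adds a fixed bonus for reviewed items to the citation count. Restricting column names to a whitelist and binding all values as parameters keeps user-supplied rules from injecting arbitrary SQL.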

Keywords: information analysis systems, current research information systems, CRIS, subject area, ranking, keywords, rubrics
pp. 125–133
For citation:
Shachnev D. A. Methods for Thematic Analysis and Ranking in Information Analysis Systems Based on Relational DBMS, Programmnaya Ingeneria, 2024, vol. 15, no. 3, pp. 125–133. DOI: 10.17587/prin.15.125-133.