Journal "Software Engineering"
a journal on theoretical and applied science and technology
ISSN 2220-3397

Issue No. 7, 2021

DOI: 10.17587/prin.12.358-372
Predictive Credit Risk Analytics Using Borrowers' Digital Footprint and Methods of Statistical Machine Learning
E. V. Orlova, Ufa State Aviation Technical University, Ufa, 450008, Russian Federation
Corresponding author: Orlova Ekaterina V., Professor, Ufa State Aviation Technical University, Ufa, 450008, Russian Federation, E-mail:
Received on May 20, 2021
Accepted on June 21, 2021

The article addresses the problem of reducing banks' credit risks arising from the insolvency of individual borrowers, using financial and socio-economic factors together with additional data on the borrowers' digital footprint. A critical analysis of existing approaches, methods and models in this area identifies a number of significant shortcomings that limit their application. There is no comprehensive approach to assessing a borrower's creditworthiness that draws on information such as data from social networks and search engines. A new methodological approach is proposed for assessing a borrower's risk profile, based on phased processing of quantitative and qualitative data and on modeling with methods of statistical analysis and machine learning. Machine learning methods are used to solve the clustering and classification problems: they make it possible to determine the structure of the data automatically and to make decisions through flexible, local training on the data. Hierarchical clustering and the k-means method are used to group borrowers with similar social, anthropometric and financial indicators, as well as similar digital footprint indicators, and to determine the risk profile of each group. The resulting homogeneous groups of borrowers, each with its own risk profile, are then used for detailed data analysis in a predictive classification model. The classification model uses stochastic gradient boosting to predict the risk profile of a potential borrower. The suggested approach to assessing the creditworthiness of individuals will reduce a bank's credit risks and increase its stability and profitability. The implementation results are of practical importance.
A comparative analysis of the effectiveness of the existing and the proposed credit risk assessment methodologies showed that the new methodology provides predictive analytics of heterogeneous information about a potential borrower, and that the accuracy of these analytics is higher. The proposed techniques form the core of a decision support system for justifying credit conditions for individuals while minimizing aggregate credit risk.
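
The two-stage idea described in the abstract (cluster borrowers into groups with a risk profile, then train a boosted classifier to predict the profile of a new applicant) can be illustrated with a minimal Python sketch. It is not the author's implementation: the two features (a debt-to-income ratio and a digital footprint score), the synthetic data, the rule "the cluster with the higher debt ratio is the risky one", the farthest-first initialisation, the use of decision stumps as base learners, and all function names are assumptions made for illustration. The hierarchical clustering step is omitted for brevity.

```python
import math
import random

def kmeans(points, k, iters=100):
    """Lloyd's k-means with a deterministic farthest-first initialisation (assumed)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels

def fit_stump(X, r):
    """Least-squares regression stump: (feature, threshold, left_value, right_value)."""
    mean = sum(r) / len(r)
    best = (math.inf, 0, math.inf, mean, mean)  # fallback: no split, predict the mean
    for j in range(len(X[0])):
        for t in sorted({x[j] for x in X}):
            left = [ri for x, ri in zip(X, r) if x[j] <= t]
            right = [ri for x, ri in zip(X, r) if x[j] > t]
            if not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]

def stump_value(stump, x):
    j, t, left, right = stump
    return left if x[j] <= t else right

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_sgb(X, y, rounds=40, lr=0.3, subsample=0.7, seed=0):
    """Stochastic gradient boosting for log-loss: each round fits a stump to the
    residuals y - p on a random subsample drawn without replacement.
    Simplification: leaf values are mean residuals, not Newton steps."""
    rng = random.Random(seed)
    p = sum(y) / len(y)
    f0 = math.log(p / (1 - p))  # initial log-odds; needs both classes present
    F = [f0] * len(X)
    stumps = []
    m = max(2, int(subsample * len(X)))
    for _ in range(rounds):
        idx = rng.sample(range(len(X)), m)
        resid = [y[i] - sigmoid(F[i]) for i in idx]
        stump = fit_stump([X[i] for i in idx], resid)
        stumps.append(stump)
        F = [f + lr * stump_value(stump, x) for f, x in zip(F, X)]
    return f0, lr, stumps

def predict_proba(model, x):
    f0, lr, stumps = model
    return sigmoid(f0 + lr * sum(stump_value(s, x) for s in stumps))

# Synthetic borrowers: (debt_to_income, digital_footprint_score), both hypothetical.
rng = random.Random(42)
low = [(rng.gauss(0.2, 0.08), rng.gauss(0.8, 0.08)) for _ in range(30)]
high = [(rng.gauss(0.7, 0.08), rng.gauss(0.3, 0.08)) for _ in range(30)]
borrowers = low + high

# Stage 1: group borrowers; treat the cluster with the higher debt ratio as risky.
centroids, cluster = kmeans(borrowers, k=2)
risky = max(range(2), key=lambda c: centroids[c][0])
y = [1 if c == risky else 0 for c in cluster]

# Stage 2: learn to predict the group's risk profile for a new applicant.
model = fit_sgb(borrowers, y)
p_high = predict_proba(model, (0.75, 0.25))
p_low = predict_proba(model, (0.15, 0.85))
print(f"P(risky | high-debt applicant) = {p_high:.2f}")
print(f"P(risky | low-debt applicant)  = {p_low:.2f}")
```

In practice a library implementation (for example, a gradient boosting classifier with a subsampling fraction below 1) would replace the hand-written stumps; the sketch only shows how cluster labels from the first stage can serve as training targets for the second.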

Keywords: data analytics, machine learning, clustering, classification, creditworthiness, digital footprint, reduction of credit risks
pp. 358–372
For citation:
Orlova E. V. Predictive Credit Risk Analytics Using Borrowers' Digital Footprint and Methods of Statistical Machine Learning, Programmnaya Ingeneria, 2021, vol. 12, no. 7, pp. 358–372.