Enhancing Loan Default Prediction with Text Mining

Barry Egan, Kyle Goslin
2022 International Conference on Intelligent Environments  
Credit scoring is a popular method used by financial institutions to evaluate an applicant's risk of default. However, in certain circumstances, an individual's credit score is not an accurate indicator of their risk of default, as it may be based on outdated information from a single point in time, or the individual may have no prior credit history from which to build a credit score. Several studies have investigated using text data to enhance the classification of loan default, with varying degrees of success. This research examines whether the text data contained in the loan applications of a peer-to-peer (P2P) lending platform can be utilized to enhance loan default prediction. Two models were created and optimized: one using only text data and the other using numeric data. The text and numeric models were then combined to see whether the classification performance of the individual models could be enhanced. The text model outperformed the numeric model, achieving accuracies 15.73% and 33.82% higher; combining the models nevertheless improved classification performance considerably, by between 2.8% and 19.87%. The results show that text data holds significant value for assessing credit risk, and that combining text data with numeric data enhances the prediction of loan default.
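The abstract does not say how the text and numeric models were combined. As a rough illustration only, the sketch below assumes one common option, late fusion, in which each model outputs a probability of default and the two are averaged with a weight. The function names, the weight, the threshold, and the probabilities are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def fuse_probabilities(
    p_text: Sequence[float],
    p_numeric: Sequence[float],
    text_weight: float = 0.5,  # assumed weight, not a value from the paper
) -> List[float]:
    """Weighted average of two models' default probabilities (late fusion)."""
    if len(p_text) != len(p_numeric):
        raise ValueError("both models must score the same loans")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    return [
        text_weight * pt + (1.0 - text_weight) * pn
        for pt, pn in zip(p_text, p_numeric)
    ]


def classify(probabilities: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a loan 1 (default) when its fused probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Made-up probabilities for three hypothetical loan applications.
p_text = [0.9, 0.2, 0.6]
p_numeric = [0.7, 0.4, 0.2]

fused = fuse_probabilities(p_text, p_numeric)
labels = classify(fused)
print(fused, labels)
```

In practice the two probability lists would come from trained classifiers, and the weight and threshold would be tuned on held-out data rather than fixed by hand.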
doi:10.3233/aise220039 dblp:conf/intenv/EganG22 fatcat:dmd7tkertzctliywa65exje674