Mortgage Default: Classification Trees Analysis

David Feldman, Shulamith Gross
2004 Social Science Research Network  
We introduce the powerful, flexible, and computationally efficient nonparametric Classification and Regression Trees (CART) algorithm to the real estate analysis of mortgage data. CART's strengths in dealing with large data sets, high dimensionality, mixed data types, missing data, different relationships between variables in different parts of the measurement space, and outliers are particularly appropriate for our data set. Moreover, CART is intuitive and easy to interpret and implement. We discuss the pros and cons of CART vis-à-vis traditional methods such as linear logistic regression, nonparametric additive logistic regression, discriminant analysis, partial least squares classification, and neural networks, with particular emphasis on real estate. We apply CART to produce the first academic mortgage default study of Israeli data. We find that borrowers' features, rather than mortgage contract features, are the strongest predictors of default if accepting "bad" borrowers is more costly than rejecting "good" ones. If these costs are equal, mortgage features are used as well. The higher (lower) the ratio of misclassification costs of bad risks versus good ones, the lower (higher) are the resulting misclassification rates of bad risks and the higher (lower) are the misclassification rates of good ones. This is consistent with real-world stylized facts of rejection of good risks in an attempt to avoid bad ones. JEL Codes: C12, D12, G21, R29
doi:10.2139/ssrn.659881 fatcat:etxq4c4zybdljfxipsu4uhezzu
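
The cost-ratio effect described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure or data: it fits a single-split tree (a stump) that minimizes total misclassification cost, which is a simplification of full CART, and it runs on synthetic data. The feature (a hypothetical payment-to-income ratio), the sample sizes, the distributions, and all function names are assumptions made for illustration only.

```python
import random


def fit_cost_sensitive_stump(xs, ys, cost_bad_as_good, cost_good_as_bad=1.0):
    """Pick the single split on xs that minimizes total misclassification cost.

    ys uses 1 for a "bad" borrower (defaulted) and 0 for a "good" one.
    Each leaf is labelled with the class that is cheaper to predict there.
    Returns (threshold, left_label, right_label).
    """
    best = None
    for t in sorted(set(xs)):
        leaves = (
            [y for x, y in zip(xs, ys) if x < t],
            [y for x, y in zip(xs, ys) if x >= t],
        )
        total, labels = 0.0, []
        for leaf in leaves:
            n_bad = sum(leaf)
            n_good = len(leaf) - n_bad
            cost_if_good = cost_bad_as_good * n_bad   # bads wrongly accepted
            cost_if_bad = cost_good_as_bad * n_good   # goods wrongly rejected
            labels.append(1 if cost_if_bad < cost_if_good else 0)
            total += min(cost_if_good, cost_if_bad)
        if best is None or total < best[0]:
            best = (total, t, labels[0], labels[1])
    return best[1], best[2], best[3]


def error_rates(xs, ys, stump):
    """Return (share of bads classified good, share of goods classified bad)."""
    t, left, right = stump
    preds = [left if x < t else right for x in xs]
    n_bad = sum(ys)
    n_good = len(ys) - n_bad
    bad_missed = sum(1 for p, y in zip(preds, ys) if y == 1 and p == 0)
    good_rejected = sum(1 for p, y in zip(preds, ys) if y == 0 and p == 1)
    return bad_missed / n_bad, good_rejected / n_good


# Synthetic, purely illustrative data: one hypothetical feature per borrower.
rng = random.Random(0)
xs = [rng.gauss(0.25, 0.08) for _ in range(400)] + \
     [rng.gauss(0.40, 0.08) for _ in range(100)]
ys = [0] * 400 + [1] * 100

results = []
for ratio in (1.0, 2.0, 5.0, 10.0):
    stump = fit_cost_sensitive_stump(xs, ys, cost_bad_as_good=ratio)
    bad_rate, good_rate = error_rates(xs, ys, stump)
    results.append((ratio, bad_rate, good_rate))
    print(f"cost ratio {ratio:>4}: bad misclassified {bad_rate:.2f}, "
          f"good misclassified {good_rate:.2f}")
```

Because each stump minimizes cost on the same sample, raising the cost ratio can never increase the in-sample share of bad risks classified as good, and can never decrease the share of good risks classified as bad. That matches the direction of the trade-off stated in the abstract; the magnitudes printed here say nothing about the Israeli data.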