Support vector machines with application to credit classification
Abstract
Credit granting is one of the most profitable products of a financial institution. To ensure profit, however, institutions must know to whom they lend their capital. In this scenario, a fundamental tool for supporting lending decisions is the credit risk model, whose purpose is to predict the creditworthiness of a borrower by classifying the customer as non-defaulting or defaulting. This tool must therefore produce results close to reality, with a low margin of error, to avoid financial losses for the credit-granting institution. In credit analysis, however, the databases used to build credit risk models contain more observations of non-defaulting customers (majority class) than of defaulting customers (minority class), which makes them imbalanced and prone to biasing the classification. Two alternatives for overcoming this bias and dealing adequately with class imbalance are to pre-process the data set to balance the classes or to modify the classification algorithm. In the credit risk context, this work therefore proposes to apply the support vector machine classifier to discriminate between customers requesting a loan, comparing the performance of this technique on balanced and imbalanced data sets. For the former, the SMOTE oversampling method is used; for the latter, the cost-sensitive support vector machine methodology is used, since it was proposed to deal with imbalanced classes. Furthermore, this work compares the performance of the support vector machine classifier with that of other classifiers commonly used in the credit scenario, such as logistic regression and random forest. The study is applied to real data and evaluated with respect to several metrics of predictive performance.
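To make the balancing step concrete, the following is a minimal, standard-library-only sketch of the interpolation idea behind SMOTE. It is not the code used in this work. The function name `smote`, its parameters, and the toy data are hypothetical, and the base point is drawn at random here, whereas the original SMOTE procedure iterates over every minority sample.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic minority samples by linear interpolation.

    Illustrative simplification: each synthetic point lies on the segment
    between a randomly chosen minority sample and one of its k nearest
    minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority samples by Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (made up): 8 non-defaulting customers and 3 defaulting ones.
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

In practice a library implementation would normally be preferred, for example `SMOTE` from imbalanced-learn for oversampling. The cost-sensitive alternative is commonly obtained in scikit-learn through the `class_weight` parameter of `SVC`, which penalises errors on the minority class more heavily instead of resampling the data.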