Classification model for imbalanced data: the SMOTE method and variants
Publisher
Universidade Federal de São Carlos
Abstract
Classification problems often involve datasets with highly imbalanced classes, for example rare-disease diagnoses, manufacturing defects, and fraudulent transactions. Training a model on a dataset with few observations of a particular class tends to yield poor predictive performance, especially for observations belonging to that minority class. In this undergraduate thesis, we present and compare several variants of the Synthetic Minority Over-sampling TEchnique (SMOTE) for oversampling imbalanced data, applied with Logistic Regression as the classification model. The goal is to show how these techniques can improve the identification and prediction of minority-class observations in realistic imbalanced scenarios, and to determine which combination of sampling technique and Logistic Regression model performs best.
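To make the oversampling idea concrete, the following is a minimal sketch of the basic SMOTE interpolation step using only the Python standard library. It is an illustration under stated assumptions, not the implementation used in the thesis: the function name `smote`, its parameters, and the toy data are hypothetical, and the variants the thesis compares are not reproduced here. Each synthetic point is placed at a random position on the segment between a minority observation and one of its k nearest minority neighbors.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Sketch of basic SMOTE: interpolate between minority-class neighbors.

    `minority` is a list of numeric feature vectors (hypothetical input format).
    """
    rng = random.Random(seed)
    # Cannot use more neighbors than there are other minority points.
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")

    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority observation as the base point.
        i = rng.randrange(len(minority))
        x = minority[i]
        # Find its k nearest minority neighbors (Euclidean), excluding itself.
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(x, p))[:k]
        # Interpolate toward one randomly chosen neighbor.
        neighbor = rng.choice(neighbors)
        u = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbor)])
    return synthetic


# Toy minority class with two features (made-up values for illustration).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_synthetic=6, k=2)
```

In a full workflow, the synthetic points would be appended to the training split only (never to the test split) before fitting the classifier, so that evaluation still reflects the original class imbalance.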
Citation
NORA, Andrielle Couto. Modelo de classificação para dados desbalanceados: método SMOTE e variantes. 2024. Trabalho de Conclusão de Curso (Graduação em Estatística) – Universidade Federal de São Carlos, São Carlos, 2024. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/19545.
Creative Commons License
Except where otherwise noted, this item is licensed under Attribution-NonCommercial-NoDerivs 3.0 Brazil
