Alternative models for classification on imbalanced data
Abstract
In binary classification, the most widely used method is the logistic regression model. However, several authors indicate that this model is not suitable when the data are imbalanced. To address this, different asymmetric link functions have been proposed as alternatives for binary response models; in recent years, for example, the power (P) and reverse power (RP) distributions have been introduced. In this work we develop new properties of the P and RP distributions in the context of models for classification on imbalanced data. Some classification metrics are also studied through a simulation study, and an application of the studied methodology is presented.
In addition, we extend the binary regression models to mixed models for binary classification in the context of longitudinal studies. To evaluate the performance of the models, a simulation study is performed. The studied methodology is also applied to a dataset in which the response is longitudinal and imbalanced. For parameter estimation, a Bayesian approach is adopted using an MCMC procedure based on the No-U-Turn Sampler (NUTS) algorithm. Furthermore, predictive checks, randomized Bayesian quantile residuals, and a measure of Bayesian influence are considered for model diagnosis. Different models are compared using model selection criteria.