Estatística - Interinstitucional (PIPGEs)
https://repositorio.ufscar.br/handle/ufscar/8205
2019-03-23T16:35:25Z

Modelos de regressão para resposta binária na presença de dados desbalanceados (Regression models for binary response in the presence of imbalanced data)
https://repositorio.ufscar.br/handle/ufscar/11103
2019-03-19T21:59:57Z
2019-02-22T00:00:00Z
In binary regression, imbalanced data arise when the proportion of responses equal to zero (or one) is significantly greater than the proportion of responses equal to one (or zero). In this work, we evaluate two methods developed to deal with imbalanced data and compare them with the use of asymmetric links. The results of a simulation study show that the correction methods do not adequately correct the bias in the estimation of the regression coefficients, and that the models with power and reverse power links produce better results for certain types of imbalanced data.
Additionally, we present an application to imbalanced data, identifying the best model among those proposed.
The parameters are estimated using a Bayesian approach based on the Hamiltonian Monte Carlo method with the No-U-Turn Sampler (NUTS) algorithm, and the models are compared using different model comparison criteria, predictive evaluation, and quantile residuals.
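
As a rough illustration of what an asymmetric link looks like, the sketch below implements a power link and a reverse power link in one common form, p = F(eta)^lambda and p = 1 - F(-eta)^lambda. The logistic baseline F, the function names, and the parameter name `lam` are assumptions made for this example; the abstract does not state which baseline distributions the thesis uses.

```python
import math


def logistic_cdf(x):
    """Standard logistic CDF, used here as the assumed baseline F."""
    return 1.0 / (1.0 + math.exp(-x))


def power_link_prob(eta, lam):
    """Success probability under a power link: p = F(eta) ** lam, lam > 0."""
    return logistic_cdf(eta) ** lam


def reverse_power_link_prob(eta, lam):
    """Success probability under a reverse power link: p = 1 - F(-eta) ** lam."""
    return 1.0 - logistic_cdf(-eta) ** lam


if __name__ == "__main__":
    # With lam = 1 both links reduce to the symmetric logistic link.
    # Other values of lam skew the response curve in opposite directions.
    for lam in (0.5, 1.0, 2.0):
        print(lam, power_link_prob(0.0, lam), reverse_power_link_prob(0.0, lam))
```

The extra parameter `lam` controls the skewness of the response curve, which is what lets such links adapt to a response with many more zeros than ones (or the reverse).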

Penalized regression methods for compositional data
https://repositorio.ufscar.br/handle/ufscar/11034
2019-02-27T17:23:55Z
2018-12-10T00:00:00Z
Compositional data consist of vectors known as compositions, whose components are positive, lie in the interval (0,1), and represent proportions or fractions of a "whole", so that the components sum to one. Compositional data are present in many areas, such as geology, ecology, economics, and medicine. There is therefore great interest in new modeling approaches for compositional data, especially when covariates influence this type of data. In this context, the main objective of this thesis is to present a new approach to regression models for compositional data. The main idea is to develop a method based on penalized regression, in particular the Lasso (least absolute shrinkage and selection operator), the elastic net, and the Spike-and-Slab Lasso (SSL), for estimating the model parameters. In particular, we aim to develop this modeling for compositional data when the number of explanatory variables exceeds the number of observations, in the presence of large databases, and when there are constraints on the dependent variables and covariates.
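
The sketch below illustrates the two ingredients named in this abstract: the sum-to-one constraint and Lasso estimation. It uses a centered log-ratio (clr) transform and a plain coordinate-descent Lasso. The choice of the clr transform, the function names, and the objective scaling are assumptions made for illustration; the abstract does not describe the thesis's actual formulation.

```python
import numpy as np


def closure(x):
    """Rescale positive rows so that each composition sums to one."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def clr(x):
    """Centered log-ratio transform: log parts minus their row mean."""
    logx = np.log(np.asarray(x, dtype=float))
    return logx - logx.mean(axis=-1, keepdims=True)


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b  # current residual
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += X[:, j] * b[j]          # remove coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = soft_threshold(rho, lam) / col_sq[j]
            r -= X[:, j] * b[j]          # add the updated coordinate back
    return b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    raw = rng.uniform(0.1, 1.0, size=(20, 5))  # hypothetical raw amounts
    Z = clr(closure(raw))                      # compositional covariates
    y = 2.0 * Z[:, 0] + rng.normal(scale=0.1, size=20)
    print(lasso_cd(Z, y, lam=0.05))
```

Coordinate descent needs no matrix inversion, so the same loop still runs when the number of columns exceeds the number of rows, which is the setting the abstract highlights.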

Modelagem conjunta de dados longitudinais e de sobrevivência para avaliação de desfechos clínicos do parto (Joint modeling of longitudinal and survival data for the evaluation of clinical outcomes of childbirth)
https://repositorio.ufscar.br/handle/ufscar/10942
2019-02-12T16:49:25Z
2018-12-06T00:00:00Z
As most pregnancy-related deaths and morbidities are clustered around the time of childbirth, the quality of care during this period is crucial for mothers and their babies. The partograph has been the central tool for monitoring women at this stage in recent decades and, because of its simplicity, is frequently used in low- and middle-income countries. However, its use is highly questioned owing to a lack of evidence that it contributes to labor. To improve the quality of labor care in these circumstances, the BOLD project was developed to reduce the occurrence of pregnancy-related problems and to develop a modern tool, called SELMA, designed as an alternative to the partograph. To relate fixed and dynamic characteristics assessed during delivery and to identify which elements can serve as triggers for an intervention that prevents a bad outcome, this thesis proposes the use of survival models with time-dependent covariates. Initially, we consider the joint modeling of survival and longitudinal data using flexible parametric hazard functions. To this end, we propose the use of five generalizations of the Weibull distribution, the Nakagami model, and a novel framework to discriminate among the usual parametric models via the generalized gamma distribution, and we perform an extensive simulation study to evaluate the maximum likelihood estimators and the proposed discrimination criteria. By its very nature, childbirth involves multiple events, which motivates the use of multi-state models. These are models for a stochastic process that at any time occupies one of a few possible states. In general, they are the most common models for describing the development of longitudinal failure time data and are often used in medical applications.
Considering this context, we propose including a time-dependent covariate in the multi-state model using a modified version of the input data, which gave satisfactory results similar to those expected in clinical practice.
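
One standard way to feed a time-dependent covariate to a survival or multi-state model is to restructure the input into counting-process rows (start, stop, covariate value, event). The sketch below shows that restructuring for a single subject. It is an assumption about what "a modified version of the input data" could look like, not the thesis's documented procedure, and the function name and the example measurements are hypothetical.

```python
def split_follow_up(measurements, end_time, event):
    """Turn repeated measurements into (start, stop, value, event) rows.

    measurements: iterable of (time, value) pairs for one subject.
    end_time:     time at which follow-up ends (transition or censoring).
    event:        1 if the transition occurred at end_time, 0 if censored.
    """
    ordered = sorted(measurements)
    rows = []
    for i, (start, value) in enumerate(ordered):
        if start >= end_time:
            break  # measurement taken at or after the end of follow-up
        next_time = ordered[i + 1][0] if i + 1 < len(ordered) else end_time
        stop = min(next_time, end_time)
        # Only the interval that closes the follow-up can carry the event.
        rows.append((start, stop, value, event if stop == end_time else 0))
    return rows


if __name__ == "__main__":
    # Hypothetical subject: a covariate measured at hours 0, 2 and 4,
    # with the transition observed at hour 5.
    for row in split_follow_up([(0, 3), (2, 5), (4, 8)], end_time=5, event=1):
        print(row)
```

Each row holds the covariate constant over its interval, so software that accepts (start, stop] data can treat the covariate as piecewise constant in time.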

O corte do FBST em modelos de alta dimensionalidade (The FBST cutoff in high-dimensional models)
https://repositorio.ufscar.br/handle/ufscar/10881
2019-01-30T12:00:26Z
2018-12-03T00:00:00Z
The problem of controlling the significance level of the Full Bayesian Significance Test (FBST) is studied in the context of Bayesian models for density estimation. A Bayesian method for density estimation is presented, together with how the FBST should be conducted with this method when the goal is to test whether one population has a certain density or whether two populations have the same density. For this purpose, a modified e-value definition is presented as an alternative way to calculate the FBST measure. Finally, a simulation study with different density distributions is presented, and the power function of the test is analyzed for the one- and two-population cases.
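
For orientation, the sketch below computes the standard FBST e-value in a deliberately simple case: a Beta posterior for a proportion and a point null hypothesis. The e-value is one minus the posterior probability of the tangential set, the set of parameter values whose posterior density exceeds the maximum density under the null. This is the usual definition, not the modified e-value proposed in the thesis, and the Beta model, the function names, and the Monte Carlo settings are assumptions made for this example.

```python
import math

import numpy as np


def beta_log_pdf(theta, a, b):
    """Log density of a Beta(a, b) distribution at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1) * np.log(theta) + (b - 1) * np.log1p(-theta)


def fbst_evalue_beta(a, b, theta0, n_draws=100_000, seed=0):
    """Monte Carlo e-value for H0: theta = theta0 under a Beta(a, b) posterior."""
    rng = np.random.default_rng(seed)
    draws = rng.beta(a, b, size=n_draws)
    # For a point null, the supremum of the density over H0 is the density at theta0.
    log_dens_h0 = beta_log_pdf(theta0, a, b)
    # Tangential set: draws whose posterior density exceeds that supremum.
    in_tangential = beta_log_pdf(draws, a, b) > log_dens_h0
    return 1.0 - in_tangential.mean()


if __name__ == "__main__":
    print(fbst_evalue_beta(10, 10, 0.5))  # null at the posterior mode
    print(fbst_evalue_beta(30, 5, 0.5))   # null far in the posterior tail
```

An e-value near one means the null value sits in a high-density region of the posterior, and an e-value near zero is evidence against it. Choosing the cutoff below which the null is rejected is the significance-level question this abstract addresses.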