Application of non-linear dimensionality reduction methods to parametric and non-parametric classifiers
Abstract
Much of the data collected and used in machine learning applications is structured in high-dimensional spaces. Images, text documents, and sensor data are examples of data collected all the time, and their number of attributes can easily exceed the number of samples in the set. As a consequence, the curse of dimensionality demands ways to mitigate its negative effects on models trained on such high-dimensional data sets. One way to address this is to apply dimensionality reduction methods, which seek to generate representations with a more manageable number of dimensions while minimizing the loss of information. Using such methods within machine learning is therefore a promising area, as they simplify the structure of the data that feeds the models. This work evaluated the use of different non-linear dimensionality reduction methods together with parametric and non-parametric models in classification tasks. UMAP and PaCMAP were applied to high-dimensional data sets available on the OpenML platform, and the classification performance of the Quadratic Discriminant Analysis (QDA), Gaussian Naive Bayes, k-NN, and XGBoost models was evaluated. The results show a performance improvement for the parametric models, mainly with the supervised implementation of UMAP. Although the methods were less effective with a more robust, computationally heavier model, XGBoost, they did reduce its execution time, which points to an opportunity for application and further study in such scenarios.
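The evaluation protocol described above, reducing a high-dimensional data set with a non-linear method and then comparing classifiers on the raw versus reduced representation, can be sketched as follows. This is a minimal illustration, not the thesis code: Isomap from scikit-learn stands in for UMAP/PaCMAP so the sketch runs with scikit-learn alone, and the digits data set stands in for the OpenML data sets; the data set, reducer, and hyperparameters are assumptions for illustration.

```python
# Sketch of the protocol: fit a non-linear reducer on the training split,
# project the test split, and compare classifier accuracy on raw vs.
# reduced features. Isomap is a stand-in for UMAP/PaCMAP here.
from sklearn.datasets import load_digits
from sklearn.manifold import Isomap
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

X, y = load_digits(return_X_y=True)  # 64-dimensional images, 10 classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit the reducer on the training split only, then project the test split,
# so no test information leaks into the learned embedding.
reducer = Isomap(n_components=5)
Z_tr = reducer.fit_transform(X_tr)
Z_te = reducer.transform(X_te)

models = {
    "QDA": QuadraticDiscriminantAnalysis(reg_param=0.1),
    "GaussianNB": GaussianNB(),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
}
for name, model in models.items():
    acc_raw = model.fit(X_tr, y_tr).score(X_te, y_te)
    acc_red = model.fit(Z_tr, y_tr).score(Z_te, y_te)
    print(f"{name}: raw={acc_raw:.3f} reduced={acc_red:.3f}")
```

With the real UMAP, the supervised variant corresponds to passing the training labels to `fit_transform(X_tr, y_tr)` in the umap-learn API, which is the configuration the abstract highlights for the parametric models.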