Analysis of graph construction algorithms on high-dimensional datasets
Publisher
Universidade Federal de São Carlos
Abstract
The rapid development and adoption of information technologies, such as the internet, have led to an explosion in data generation, greatly increasing the volume of new instances and the number of attributes that describe them. A large number of attributes does not necessarily lead to better results: algorithm performance tends to degrade as attributes are added in excess, a phenomenon known as the curse of dimensionality. In machine learning, traditional clustering algorithms play a crucial role in analyzing such data by segmenting it according to similarity. A more recent approach pairs graph construction algorithms with community detection algorithms. In this approach, a graph represents the relationships among the data points, and a community detection algorithm reveals densely connected groups and identifies hidden patterns. The aim of this study is therefore to analyze graph construction algorithms combined with community detection on high-dimensional datasets. Synthetic datasets were used, and the primary evaluation metric was the normalized mutual information (NMI) index. Datasets with different numbers of attributes, groups, and samples, and with different degrees of overlap, were evaluated to observe the effects of dimensionality on the results. Traditional clustering algorithms, such as K-means and Agglomerative Clustering, served as baselines against which the other methods were compared. The results indicate that graph construction algorithms combined with community detection are a viable alternative for high-dimensional datasets and, in some cases, outperform traditional clustering algorithms.
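The pipeline described in the abstract (build a graph from the data, detect communities, score the result with NMI) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the symmetric k-nearest-neighbour graph, label propagation as the community detection step, the arithmetic-mean normalization of NMI, the helper names (`make_blobs`, `knn_graph`, `label_propagation`, `nmi`), and the parameter values. The groups are deliberately well separated, so the sketch does not reproduce the overlap conditions studied in the work.

```python
import math
import random
from collections import Counter


def make_blobs(n_per_group, n_groups, dim, rng):
    """Synthetic data: unit-variance Gaussian groups, group c centred at 10*c on every axis."""
    points, labels = [], []
    for c in range(n_groups):
        for _ in range(n_per_group):
            points.append([rng.gauss(10.0 * c, 1.0) for _ in range(dim)])
            labels.append(c)
    return points, labels


def knn_graph(points, k):
    """Symmetric k-nearest-neighbour graph (Euclidean distance) as a dict of neighbour sets."""
    n = len(points)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        by_dist = sorted((j for j in range(n) if j != i),
                         key=lambda j: math.dist(points[i], points[j]))
        for j in by_dist[:k]:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def label_propagation(adj, rng, max_sweeps=100):
    """Asynchronous label propagation: each node adopts the most frequent label among its neighbours."""
    labels = {v: v for v in adj}  # every node starts in its own community
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)
        changed = False
        for v in nodes:
            counts = Counter(labels[u] for u in adj[v])
            if not counts:
                continue  # isolated node keeps its label
            best = max(counts.values())
            candidates = sorted(l for l, c in counts.items() if c == best)
            if labels[v] not in candidates:
                labels[v] = rng.choice(candidates)
                changed = True
        if not changed:
            break
    return [labels[v] for v in sorted(adj)]


def entropy(labels):
    """Shannon entropy (natural log) of a labelling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def nmi(a, b):
    """Normalized mutual information, normalized by the arithmetic mean of the two entropies."""
    n = len(a)
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((nab / n) * math.log(nab * n / (ca[x] * cb[y]))
             for (x, y), nab in joint.items())
    denom = (entropy(a) + entropy(b)) / 2
    return mi / denom if denom > 0 else 1.0


rng = random.Random(0)  # fixed seed so the run is repeatable
points, truth = make_blobs(n_per_group=30, n_groups=3, dim=50, rng=rng)
graph = knn_graph(points, k=5)
communities = label_propagation(graph, rng)
score = nmi(truth, communities)
print(f"communities found: {len(set(communities))}, NMI: {score:.3f}")
```

Varying `dim`, `n_groups`, `n_per_group`, or the spacing between group centres in this sketch mirrors, in a very reduced form, the kind of parameter sweep the abstract describes. A baseline such as K-means would be scored with the same `nmi` function for comparison.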
Citation
SOARES, Vitor Freitas Xavier. Análise de algoritmos de construção de grafos em conjuntos de dados de alta dimensionalidade. 2023. Undergraduate thesis (Bachelor's degree in Computer Engineering) – Universidade Federal de São Carlos, São Carlos, 2023. Available at: https://repositorio.ufscar.br/handle/20.500.14289/18517.
Creative Commons License
Except where otherwise noted, this item's license is described as Attribution 3.0 Brazil
