Density-based semi-supervised classification with anomaly recognition

Publisher

Universidade Federal de São Carlos

Abstract

In data mining, anomaly detection is important because observations that deviate from the majority can degrade machine learning models or can themselves be the main object of interest in many real-world scenarios. Semi-supervised classification, in turn, is essential when labeled data are scarce. In this work, we unify these two tasks into a single integrated process: we combine a state-of-the-art density-based clustering algorithm capable of detecting outliers with two well-known density-based semi-supervised classifiers, producing hybrid methods that perform both tasks. Experiments on 42 semi-synthetic datasets, with different proportions of labeled objects and two distinct types of anomalies, showed that the investigated anomaly detection method outperforms similar approaches, especially on datasets containing global anomalies. The results also show that combining the outlier detection method with the semi-supervised classifiers has only a minor impact on classification quality. The proposed hybrid approaches are therefore viable alternatives to their respective original methods, enabling explicit identification of anomalies without significantly compromising classification performance.

Citation

MASS, Bruno. Classificação semissupervisionada baseada em densidade com reconhecimento de anomalias. 2025. Dissertation (Master's in Computer Science) – Universidade Federal de São Carlos, São Carlos, 2025. Available at: https://repositorio.ufscar.br/handle/20.500.14289/23433.

Creative Commons License

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil