Evaluating the impact of base partition selection in multi-objective ensembles (Avaliação do impacto da seleção de partições base em ensemble multiobjetivo)
Pedote, Gabriel Leonardo
Unsupervised data clustering is not a trivial process: no prior knowledge is available, and real data is often complex and multi-faceted. To make matters worse, clustering traditionally aims to describe the data under a single perspective, and it is widely known that in several cases this approach seriously limits what the analysis can extract. Furthermore, changes in parameters and preprocessing techniques can dramatically change the final result, either revealing or hiding a possible plurality of meanings present in the data. To tackle some of these issues, recent approaches that build knowledge from multiple base partitions, such as ensemble clustering, have emerged. However, special care must be taken in composing those partitions, as their quality and diversity have proved to be closely related to the performance of the ensemble. To enhance the quality and diversity of those partitions, and thereby obtain better results, a number of methods to evaluate and select a subset of the partitions have been proposed and successfully applied. In this work, we expand this discussion by evaluating the impact of some state-of-the-art selection methods in the novel context of multi-objective cluster ensembles. In this context, our analysis shows improvements on two important fronts: (i) the results are more concise, which facilitates subsequent manual analysis, and (ii) they are obtained with less computational effort, all without affecting the quality of the results.
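
To make the idea of selecting base partitions by quality and diversity concrete, the following is a minimal sketch and not one of the selection methods evaluated in this work. It assumes each partition is a list of cluster labels over the same data points, uses the plain Rand index as the similarity measure, and takes "quality" to mean average agreement with the other partitions; the names `rand_index`, `select_partitions`, and the weight `alpha` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of point pairs on which two partitions (label lists) agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def select_partitions(partitions, k, alpha=0.5):
    """Greedily pick k partition indices, trading off quality and diversity.

    Assumptions (illustrative only):
      - quality of a partition = mean Rand index with all other partitions;
      - diversity of a candidate = 1 - mean Rand index with those already selected;
      - at least two partitions are given and k <= len(partitions).
    """
    n = len(partitions)
    sim = [[rand_index(partitions[i], partitions[j]) for j in range(n)]
           for i in range(n)]
    quality = [sum(sim[i][j] for j in range(n) if j != i) / (n - 1)
               for i in range(n)]

    # Start from the partition that agrees most with the rest of the pool.
    selected = [max(range(n), key=lambda i: quality[i])]

    while len(selected) < k:
        def score(i):
            diversity = 1 - sum(sim[i][s] for s in selected) / len(selected)
            return alpha * quality[i] + (1 - alpha) * diversity

        candidates = [i for i in range(n) if i not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Four partitions of four points; the first three are the same grouping
# under different labels, the fourth groups the points differently.
base = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 1]]
chosen = select_partitions(base, k=2)
```

With `alpha = 0.5`, the sketch first keeps a partition that agrees with most of the pool and then prefers a candidate that differs from it, so redundant copies of the same grouping are skipped. A smaller selected subset is what would make the subsequent ensemble step cheaper and its output more concise.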