Comparative analysis of music genre classification algorithms based on different visual representations and transfer learning
Abstract
Automatically organizing and retrieving musical information is in high demand, and labeling music with information that succinctly describes it supports these and related tasks. One of the most widely used ways to label music recordings is by genre, but genre classification remains challenging. In recent years, the literature has shown significant progress on this task through machine learning algorithms based on deep neural networks (DNNs). In this setting, the common practice is to feed a DNN with time-frequency visual representations of the audio. This work therefore presents a comparative analysis of genre classification with DNNs, contrasting the use of several visual representations of music obtained from its audio (Spectrogram, Mel-spectrogram, Chromagram, Tempogram, and Tonnetz) with transfer learning. It presents the foundational concepts, the results obtained, and an analysis of those results. The analysis seeks to understand what led the transfer learning approach to outperform the use of the various visual representations, achieving better and more consistent results.
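As an illustration of what a "visual representation" of audio means here, the following is a minimal sketch of the simplest one, a magnitude spectrogram computed with a Hann-windowed short-time Fourier transform. It is not the pipeline used in this work: the function names (`spectrogram`, `to_db`), the frame size, the hop size, the sample rate, and the synthetic 440 Hz test tone are all assumptions chosen for illustration, and the other representations (Mel-spectrogram, Chromagram, Tempogram, Tonnetz) are normally obtained from an audio analysis library rather than written by hand.

```python
import numpy as np


def spectrogram(signal, n_fft=1024, hop=256):
    """Magnitude spectrogram (frequency bins x frames) from a Hann-windowed STFT.

    n_fft and hop are illustrative defaults, not values taken from the work.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T


def to_db(magnitude, eps=1e-10):
    """Convert magnitudes to decibels, clamping to avoid log(0)."""
    return 20.0 * np.log10(np.maximum(magnitude, eps))


# Hypothetical input: one second of a 440 Hz sine tone at an assumed sample rate.
sr = 22050
n_fft = 1024
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440.0 * t)

spec = spectrogram(tone, n_fft=n_fft)
image = to_db(spec)  # 2-D array that could be fed to a network as an image

# The strongest frequency bin should sit within one bin width of 440 Hz.
peak_hz = np.argmax(spec.mean(axis=1)) * sr / n_fft
print(image.shape, round(float(peak_hz), 1))
```

The resulting two-dimensional array (frequency on one axis, time on the other) is what would be treated as an image by a DNN. The decibel scaling is a common choice because it compresses the large dynamic range of audio magnitudes.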