Nonparametric Bayesian hierarchical model applied to topic modeling
Date
2024-02-19
Author
Cunha, Robson Ortz Oliveira
Abstract
Given the growing importance of analyzing textual data in artificial intelligence, models that better capture human language and handle unstructured data are increasingly relevant. In this work, we study the Hierarchical Dirichlet Process (HDP) for topic modeling of text and explore its practical aspects by applying it to a data set (corpus) of legal proceedings comprising three different types of procedure. We discuss the main properties of the HDP from a Bayesian perspective, assuming that the data follow a Multinomial distribution under the bag-of-words text representation commonly used in natural language processing. We also apply text pre-processing techniques, which yield more parsimonious documents (data), and carry out a simulation study to assess the model's performance. Finally, we present the results of the applications and discuss data analysis issues in jurimetrics.
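As a rough illustration of the bag-of-words representation mentioned in the abstract, the sketch below turns a few documents into word-count vectors. The two example sentences, the whitespace tokenizer, and the function name `bag_of_words` are assumptions made for illustration; they are not taken from the thesis corpus or its pre-processing pipeline.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, count_matrix) for a list of raw text documents.

    Hypothetical helper: lowercases and splits on whitespace only,
    which is far simpler than real textual pre-processing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # Shared, sorted vocabulary across all documents.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    counts = []
    for tokens in tokenized:
        freq = Counter(tokens)
        # One count per vocabulary word; word order is discarded.
        counts.append([freq[word] for word in vocabulary])
    return vocabulary, counts


# Toy documents, invented for this example.
docs = ["the court denied the appeal", "the appeal was granted"]
vocabulary, counts = bag_of_words(docs)
```

Because word order is discarded, each row of `counts` is a vector of word frequencies over a fixed vocabulary, which is the kind of data a Multinomial likelihood describes.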