Prevendo a popularidade de um post no Instagram via métodos de Machine Learning

Silva, Marcos Costa da

dc.contributor.author	Silva, Marcos Costa da
dc.date.accessioned	2024-02-15T18:06:03Z
dc.date.available	2024-02-15T18:06:03Z
dc.date.issued	2024-01-24
dc.identifier.citation	SILVA, Marcos Costa da. Prevendo a popularidade de um post no Instagram via métodos de Machine Learning. 2024. Trabalho de Conclusão de Curso (Graduação em Estatística) – Universidade Federal de São Carlos, São Carlos, 2024. Disponível em: https://repositorio.ufscar.br/handle/ufscar/19292.	*
dc.identifier.uri	https://repositorio.ufscar.br/handle/ufscar/19292
dc.description.abstract	This work is motivated by the current technological landscape, in which people are increasingly connected through social networks and generate vast amounts of data every day. Used appropriately, these data can provide valuable insights for the digital market. By combining this source of information with modern artificial intelligence methods, specifically machine learning algorithms, we studied the main characteristics that drive the popularity of an Instagram post. We used a real database with profile information from 1887 Instagram users. First, we explored the data with statistical methods to understand their structure and composition. Next, we selected the most relevant variables for predicting the popularity of a post, using techniques such as feature importance computed with the LightGBM algorithm on the training set. We then normalized the selected continuous variables and modeled the data with both the LightGBM algorithm and a Multilayer Perceptron neural network. We conclude that the indicators developed during feature engineering, together with variables from the original database (such as the number of views, comments, and followers) and variables generated through Natural Language Processing (NLP) (such as the sentiment of the caption), were the most relevant for predicting the popularity of an Instagram post. Among all the models fitted, LightGBM with optimized parameters performed best, achieving an RMSE of 0.04 and an R^2 of 0.99.	eng
dc.description.sponsorship	No funding received	eng
dc.language.iso	por	por
dc.publisher	Universidade Federal de São Carlos	por
dc.rights	Attribution-NonCommercial-NoDerivs 3.0 Brazil	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc-nd/3.0/br/	*
dc.subject	Instagram	eng
dc.subject	LightGBM	eng
dc.subject	Multilayer perceptron	eng
dc.title	Prevendo a popularidade de um post no Instagram via métodos de Machine Learning	por
dc.title.alternative	Predicting the popularity of an Instagram post through Machine Learning methods	eng
dc.type	TCC	por
dc.contributor.advisor1	Izbicki, Rafael
dc.contributor.advisor1Lattes	http://lattes.cnpq.br/9991192137633896	por
dc.description.resumo	This work was motivated by the current technological scenario in which we live, where people are increasingly connected on social networks and generate a vast amount of data every day. If used properly, these data can provide valuable insights for the digital market. By combining this source of information with modern artificial intelligence methods, more specifically machine learning algorithms, we were able to study the main characteristics that drive the popularity of a post on Instagram. In our research, we used a real database with profile information from 1887 Instagram users. Initially, we explored the data using statistical methods to understand their structure and composition. We then selected the variables of interest and greatest relevance for predicting the popularity of a publication, using techniques such as feature importance computed with the LightGBM algorithm on the training set. Subsequently, we normalized the selected continuous variables and modeled the data using both the LightGBM algorithm and a Multilayer Perceptron neural network. As a final result, we concluded that the indicators developed during the feature engineering process, together with variables from the original database, such as the number of views, comments, and followers, and the variables generated through Natural Language Processing (NLP), such as the sentiment of the caption, proved to be the most relevant for predicting the popularity of a post on Instagram. Among all the fits tested, the best-performing model was LightGBM with optimized parameters, which achieved an RMSE of 0.04 and an R^2 of 0.99.	eng
dc.publisher.initials	UFSCar	por
dc.subject.cnpq	CIENCIAS EXATAS E DA TERRA::PROBABILIDADE E ESTATISTICA::ESTATISTICA	por
dc.publisher.address	Câmpus São Carlos	por
dc.publisher.course	Estatística - Es	por
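
The abstract reports model quality as RMSE and R^2. As a reading aid, the minimal sketch below shows how these two metrics are conventionally computed for a regression model. It is not the author's code: the function names `rmse` and `r2` and the toy values are hypothetical and are not taken from the thesis or its data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared residual."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1 - ss_res / ss_tot


# Hypothetical toy values for illustration only (not data from the thesis).
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]

print("RMSE:", rmse(y_true, y_pred))
print("R^2:", r2(y_true, y_pred))
```

RMSE is expressed in the units of the target variable, so a value such as the reported 0.04 can only be interpreted relative to the scale of the target, which this record does not state. R^2 is unitless and measures the share of the target's variance explained by the model.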

Files in this item

Name: license_rdf
Size: 810 bytes
Format: application/rdf+xml


Name: Relatório_TGB_Marcos_Costa.pdf
Size: 2.782 MB
Format: PDF



Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil