Filtragem automática de opiniões falsas: comparação compreensiva dos métodos baseados em conteúdo

Cardoso, Emerson Freitas

Files

CARDOSO_Emerson_2017.pdf (3.15 MB)

Date

2017-08-04

Authors

Cardoso, Emerson Freitas

Publisher

Universidade Federal de São Carlos

Abstract

Before buying a product or choosing a travel destination, people often seek other people’s opinions to get a sense of the quality of what they intend to acquire. Opinions have therefore always had a great influence on purchase decisions. With the growth of the Internet and a huge increase in the volume of data traffic, social networks emerged that let users post and view all kinds of information, and people began to search for opinions on the Web as well. Sites like TripAdvisor and Yelp make it easier to share online reviews, since they let users post their opinions from anywhere via smartphones and enable product manufacturers to gain relevant feedback quickly in a centralized way. As a result, most people nowadays trust online reviews as much as personal recommendations. However, competition between service providers and product manufacturers has also increased in social media, leading to the first cases of spam reviews: deceptive opinions published by hired people who try to promote or defame products or businesses. These reviews are carefully written to look authentic, which makes them difficult to detect, whether by humans or by automatic methods. They are thus used, in a misleading way, in an attempt to control the general opinion, causing financial harm to business owners and users. Several approaches have been proposed for spam review detection, and most of them use techniques involving machine learning and natural language processing. However, despite all the progress made, relevant questions remain open and require rigorous analysis to be properly answered. For instance, there is no consensus on whether the performance of traditional classification methods is affected by incremental learning or by changes in reviews’ features over time, nor on whether there is a statistically significant difference between the performances of content-based classification methods.
In this scenario, this work offers a comprehensive comparison of traditional machine learning methods applied to spam review detection. The comparison is made in multiple setups, employing different types of learning and different data sets. The experiments performed, along with a statistical analysis of the results, help provide appropriate answers to these open questions. In addition, all results obtained can be used as a baseline for future comparisons.

Keywords

Spam (Electronic mail), Natural language processing (Computer science), Spam reviews, Classification, Natural language processing, Machine learning

Citation

CARDOSO, Emerson Freitas. Filtragem automática de opiniões falsas: comparação compreensiva dos métodos baseados em conteúdo. 2017. Dissertação (Mestrado em Ciência da Computação) – Universidade Federal de São Carlos, Sorocaba, 2017. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/9141.

URI

https://repositorio.ufscar.br/handle/20.500.14289/9141

Collections

Teses e Dissertações
