Aspectos linguísticos na descrição de notícias satíricas do português do Brasil: uma proposta tipológica

Wick-Pedro, Gabriela

Files

Tese Final.pdf (5.19 MB)

Carta Orientador.pdf (135.9 KB)

Date

2022-10-27

Authors

Wick-Pedro, Gabriela

Publisher

Universidade Federal de São Carlos

Abstract

Deception on the web and in messaging applications has become a major contemporary problem. This has prompted initiatives in Linguistics and Computing to characterize deceptive texts linguistically and to detect them automatically. According to Rubin, Chen and Conroy (2015), there are three traditional types of misleading content: i) fabricated news, produced by the so-called yellow press or tabloids; ii) rumors, news disguised to deceive the public, which traditional news agencies may release through carelessness; and iii) satirical news, which resembles real news but is created for humorous purposes. Theoretically, according to Simpson (2003), satire can be defined, based on a triad, as a discursive practice that establishes and results in an ironic incongruity between a satirical target, a satirical author and a satirical audience, and whose purpose is to criticize or mock the satirical target. Thus, if not recognized as humorous content, satirical news can cause misunderstanding and create false beliefs in less attentive readers. Automatically detecting satirical news is therefore relevant from a linguistic-computational perspective, particularly given the scarcity of work in the literature on the computational analysis of satire and the absence of such work for the Portuguese language. This thesis reports the construction of a corpus of satirical news and a parallel corpus of true news for Brazilian Portuguese. The corpus is composed of a subcorpus of 150 satirical news articles (22,963 words and 1,212 sentences) extracted from the Sensacionalista website and another subcorpus of 150 real news articles (107,133 words and 5,721 sentences) extracted from several online news portals and corresponding to the satirical articles. In total, the corpus contains about 130 thousand words and 6,900 sentences.
Furthermore, this work analyzes and describes the morphosyntactic aspects, the differences in verb occurrences in satirical news, and the main lexical characteristics found in satirical and true articles. To perform this task, the corpus was automatically annotated with the PALAVRAS parser (BICK, 2000). NILC-Metrix (LEAL, 2021) was also used to measure textual complexity, along with LIWC (PENNEBAKER et al., 2015), which evaluates emotional, cognitive and structural components of a text using a dictionary that sorts words into categories. Finally, this work is expected to contribute to the linguistic description of satirical news and, through the results obtained, to lay the groundwork for future Natural Language Processing (NLP) work on the automatic identification of misleading content in Brazilian Portuguese.
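
The corpus figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration and not part of the thesis: the names `SUBCORPORA`, `corpus_totals` and `mean_per` are hypothetical, and the only data used are the counts stated above.

```python
# Counts copied from the abstract; every name below is a hypothetical helper.
SUBCORPORA = {
    "satirical": {"articles": 150, "words": 22_963, "sentences": 1_212},
    "real": {"articles": 150, "words": 107_133, "sentences": 5_721},
}


def corpus_totals(subcorpora):
    """Sum the article, word and sentence counts across all subcorpora."""
    keys = ("articles", "words", "sentences")
    return {key: sum(sub[key] for sub in subcorpora.values()) for key in keys}


def mean_per(subcorpus, numerator, denominator):
    """Return an average such as words per article or words per sentence."""
    return subcorpus[numerator] / subcorpus[denominator]


totals = corpus_totals(SUBCORPORA)

if __name__ == "__main__":
    print("Totals:", totals)
    for name, sub in SUBCORPORA.items():
        words_per_article = mean_per(sub, "words", "articles")
        words_per_sentence = mean_per(sub, "words", "sentences")
        print(f"{name}: {words_per_article:.2f} words/article, "
              f"{words_per_sentence:.2f} words/sentence")
```

The exact sums are 130,096 words and 6,933 sentences, which matches the rounded totals given in the abstract.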

Keywords

Satirical news, Satire, Fake news, Linguistic clues, Corpus

Citation

WICK-PEDRO, Gabriela. Aspectos linguísticos na descrição de notícias satíricas do português do Brasil: uma proposta tipológica. 2022. Thesis (Doctorate in Linguistics) – Universidade Federal de São Carlos, São Carlos, 2022. Available at: https://repositorio.ufscar.br/handle/20.500.14289/17089.

URI

https://repositorio.ufscar.br/handle/20.500.14289/17089

Collections

Theses and Dissertations

Creative Commons License

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil
