Anotação e caracterização de fenômenos léxico-ortográficos em tweets do mercado financeiro
Publisher
Universidade Federal de São Carlos
Abstract
With the growth of digital media and social networks, user-generated content (UGC) has become considerably more important, creating demand for Natural Language Processing (NLP) tools and applications that can handle the predominantly non-canonical language found across UGC genres. Annotated corpora are essential resources for this purpose, as are descriptions and analyses of their linguistic characteristics. Among these resources, corpora of tweets/posts stand out, given the relevance of the Twitter/X platform to various segments of society. This undergraduate thesis further advances the manual annotation of lexical-orthographic phenomena in DANTEStocks, a corpus of financial-market tweets. The annotation effort, which covered approximately 75% of the corpus's tweets, enabled a preliminary characterization of the corpus based on a hierarchical typology of classes, types, and subtypes designed to capture creative phenomena and variation with respect to the standard norm.
Citation
OLIVEIRA, Gabriela Pinheiro de. Anotação e caracterização de fenômenos léxico-ortográficos em tweets do mercado financeiro. 2025. Undergraduate thesis (Bachelor's degree in Linguistics) – Universidade Federal de São Carlos, São Carlos, 2025. Available at: https://repositorio.ufscar.br/handle/20.500.14289/23309.
Creative Commons License
Except where otherwise noted, this item is licensed under Attribution-NonCommercial-NoDerivs 3.0 Brazil
