Investigation of in-context learning techniques for named entity recognition in Portuguese

Publisher

Universidade Federal de São Carlos

Abstract

In Natural Language Processing (NLP), Named Entity Recognition (NER) is the task of identifying named entities (NEs) in texts and classifying them into predefined categories. It enables the automated extraction of key information, such as names of people, organizations, locations, dates, and other elements, and thus contributes to semantic text analysis. Given the effectiveness of Large Language Models (LLMs) for NER and other NLP tasks, this study conducted an initial investigation of several Prompt Engineering approaches for NER in texts written in standard or formal Portuguese. Specifically, we explored two In-Context Learning (ICL) strategies: zero-shot prompting, in which the task is performed based solely on the LLM's prior knowledge, and few-shot prompting, in which the LLM learns to perform the task from a few examples provided directly in the prompt. Each strategy was combined with additional prompting techniques, namely two instruction-based approaches (persona prompting and rule-based prompting) and one type of output formatting. To this end, a sample of 471 sentences from the Porttinari-base corpus, manually annotated with NEs, served as the gold standard for evaluating LLM-based NER with ICL techniques. Results show a substantial performance improvement with the few-shot approach (F1-Score of 66.0%) over the zero-shot scenario (F1-Score of 37.7%). Despite the limitations of the sample, we conclude that Prompt Engineering offers a promising and agile alternative to fine-tuned models for performing NER.
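
The contrast between zero-shot and few-shot prompting, and the way an F1-Score can be computed against a gold standard, may be easier to see in a small sketch. The code below is illustrative only and is not taken from the thesis: the label set, the persona and rule wording, the JSON output format, the example sentences, and the exact-match scoring over (text, label) pairs are all assumptions, and no LLM is actually called.

```python
import json

# Hypothetical label set and prompt wording (assumptions; the abstract does
# not give the actual categories or prompts used in the study).
LABELS = ["PER", "ORG", "LOC", "DATE"]
PERSONA = "You are a linguist specialized in annotating Portuguese texts."
RULES = ("Label only explicit named entities. Use only these categories: "
         + ", ".join(LABELS) + ".")
OUTPUT_FORMAT = ('Answer with a JSON list of objects of the form '
                 '{"text": ..., "label": ...}.')


def build_prompt(sentence, examples=()):
    """Build a zero-shot prompt when `examples` is empty, few-shot otherwise."""
    parts = [PERSONA, RULES, OUTPUT_FORMAT]
    for ex_sentence, ex_entities in examples:
        answer = json.dumps(
            [{"text": text, "label": label} for text, label in ex_entities],
            ensure_ascii=False,
        )
        parts.append(f"Sentence: {ex_sentence}\nEntities: {answer}")
    parts.append(f"Sentence: {sentence}\nEntities:")
    return "\n\n".join(parts)


def f1_score(gold, predicted):
    """Micro-averaged F1 over exact (text, label) matches, sentence by sentence."""
    tp = fp = fn = 0
    for gold_entities, predicted_entities in zip(gold, predicted):
        g, p = set(gold_entities), set(predicted_entities)
        tp += len(g & p)   # entities found with the right label
        fp += len(p - g)   # spurious or mislabeled predictions
        fn += len(g - p)   # gold entities that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example sentences, for illustration only.
examples = [("Maria mora em Lisboa.", [("Maria", "PER"), ("Lisboa", "LOC")])]
zero_shot = build_prompt("A Petrobras anunciou lucros em 2023.")
few_shot = build_prompt("A Petrobras anunciou lucros em 2023.", examples)
```

Here the only difference between the two prompts is the block of worked examples inserted before the target sentence; the persona, rules, and output-format instructions are shared, which mirrors the idea of combining each ICL strategy with the same additional prompting techniques.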

Citation

VINCENZI, Bianca Couto. Investigação de técnicas de in-context learning para reconhecimento de entidades nomeadas para português. 2025. Undergraduate thesis (Bachelor's degree in Linguistics) – Universidade Federal de São Carlos, São Carlos, 2025. Available at: https://repositorio.ufscar.br/handle/20.500.14289/24018.

Creative Commons License

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil