Explorando a avaliação de sumários automáticos multidocumento multilíngues

Nascimento, Darlan Xavier

Files

Dissertação-Darlan Xavier Nascimento.pdf (2.67 MB)

Carta do orientador assinada.pdf (442.47 KB)

Date

2020-03-12

Authors

Nascimento, Darlan Xavier

Publisher

Universidade Federal de São Carlos

Abstract

Multilingual Multi-Document Automatic Summarization (MMDS) is a computational task in which a summary is produced in a target language from a collection of at least two news stories on the same subject, one in the user’s language and the other(s) in foreign language(s). The scientific literature shows that few studies address methods that generate summaries in Portuguese. Building on the CF and CFUL summarization methods, this thesis describes a study whose goal was to refine the evaluation of summary quality by varying (i) the native language of the authors of the reference summaries, that is, summaries written by human subjects after reading the corresponding source texts, which are required for the automatic calculation of informativeness, and (ii) the compression rate (desired summary size). The thesis also describes the enlargement of the corpus used to investigate these methods, through the addition of texts in German (the original corpus included content in Portuguese and English) and the production of four extracts for each of the twenty clusters. The results show that the reference summaries are slightly affected by their writer’s native language, although additional factors may need to be taken into account, such as the size of each source text and content compatibility. Regarding the summarization methods, the study found that extracts with a lower compression rate performed better in the automatic evaluation of informativeness and worse in the assessment of linguistic quality.
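
As an illustration of how reference summaries enter an automatic informativeness score, the sketch below computes a simple unigram recall of a candidate extract against one or more references. This is an assumption made for illustration only: the abstract does not name the metric used in the thesis, and the function names `tokenize` and `unigram_recall` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def unigram_recall(candidate, references):
    """Average, over the references, of the share of reference unigrams
    that also occur in the candidate (counts are clipped).

    Hypothetical helper for illustration; not the metric reported in the thesis.
    """
    cand_counts = Counter(tokenize(candidate))
    scores = []
    for reference in references:
        ref_counts = Counter(tokenize(reference))
        total = sum(ref_counts.values())
        if total == 0:
            continue  # skip empty references
        overlap = sum(min(n, cand_counts[tok]) for tok, n in ref_counts.items())
        scores.append(overlap / total)
    return sum(scores) / len(scores) if scores else 0.0


if __name__ == "__main__":
    # Toy strings, not data from the corpus described above.
    extract = "the cat sat on the mat"
    refs = ["the cat is on the mat", "a cat sat on a mat"]
    print(round(unigram_recall(extract, refs), 3))
```

Because a score of this kind is computed against whichever references are supplied, replacing them with references written by a different group of authors can change the result for the same extract.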

Keywords

Automatic summarization, Computational linguistics, Summary evaluation

Citation

NASCIMENTO, Darlan Xavier. Explorando a avaliação de sumários automáticos multidocumento multilíngues. 2020. Dissertação (Mestrado em Linguística) – Universidade Federal de São Carlos, São Carlos, 2020. Disponível em: https://repositorio.ufscar.br/handle/20.500.14289/12642.

URI

https://repositorio.ufscar.br/handle/20.500.14289/12642

Collections

Theses and Dissertations

Creative Commons License

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil
