Machine Learning for Classifying the Copia and Gypsy Lineages of Angiosperms

Tavares, Thayana Vieira

Date

2021-12-17

Abstract

This study used machine learning algorithms (neural network, decision tree, and nearest-neighbors algorithm) to build classification models for 11 lineages of the Copia and Gypsy superfamilies, using angiosperm DNA sequences as training data. Of the eight models built, three were efficient: they classified the sequences satisfactorily and are potentially able to classify data from angiosperm species that were not in the training dataset. The classifications of these three models were also compared with predictions made by BLAST, a program whose algorithm is based on sequence alignment. BLAST obtained excellent classification metrics; however, when 80% identity and 80% coverage were required for a prediction to be made, it failed to classify 30% of the sequences.
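The threshold rule described for the alignment-based comparison can be illustrated with a minimal sketch. It is not the study's code: the `Hit` record, the function names, the tie-breaking by highest identity, and the reading of the thresholds as "at least 80%" are all assumptions made for illustration, and the lineage labels in the usage note are placeholders.

```python
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    """One alignment hit for a query sequence (hypothetical record layout)."""
    lineage: str     # lineage label of the matched reference sequence
    identity: float  # percent identity of the alignment, 0 to 100
    coverage: float  # percent of the query covered by the alignment, 0 to 100


def predict_lineage(hits: List[Hit],
                    min_identity: float = 80.0,
                    min_coverage: float = 80.0) -> Optional[str]:
    """Return the lineage of the best hit meeting both thresholds, else None.

    Assumption: "best" means highest identity, with coverage as tie-breaker.
    """
    passing = [h for h in hits
               if h.identity >= min_identity and h.coverage >= min_coverage]
    if not passing:
        return None  # no prediction: the sequence stays unclassified
    best = max(passing, key=lambda h: (h.identity, h.coverage))
    return best.lineage


def unclassified_fraction(hits_per_query: List[List[Hit]]) -> float:
    """Fraction of query sequences for which no prediction is made."""
    if not hits_per_query:
        return 0.0
    missing = sum(1 for hits in hits_per_query if predict_lineage(hits) is None)
    return missing / len(hits_per_query)
```

For example, `predict_lineage([Hit("lineage_A", 95.0, 90.0), Hit("lineage_B", 99.0, 60.0)])` returns `"lineage_A"`, because the second hit has higher identity but falls below the coverage threshold. A sequence with no passing hit counts toward the unclassified fraction, which is the kind of gap the abstract reports for the alignment-based approach.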

URI

https://repositorio.ufscar.br/handle/ufscar/15496

The following license files are associated with this item:

Creative Commons

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Brazil