Applying machine learning principles to build an automatic biocurator for the Gene Ontology (GO)
Amaral, Laurence Rodrigues do
The amount of biological data made available by universities, hospitals and research centers has grown exponentially, driven by bioinformatics, advanced computational methods and tools, and high-throughput techniques. This growth calls for new strategies for capturing, storing and analyzing data. In this scenario, a new research area called biocuration is emerging. Biocuration is becoming a fundamental part of biological and biomedical research; its main function is to structure and organize biological information, making it readable and accessible to both humans and computers. To support a fast and reliable understanding of new domains, several initiatives have been proposed, and the Gene Ontology (GO) is one of the main examples. GO is one of the main initiatives in bioinformatics; its goal is to standardize the representation of genes and their products, providing interconnections between species and databases. The main objective of this research is to propose a computational architecture that uses principles of never-ending learning to help biocurators with new GO classifications, a task that is currently entirely manual. The proposed architecture uses semi-supervised learning, combining different classifiers to classify new GO samples. This research also aims to build high-level knowledge in the form of simple IF-THEN rules and decision trees. GO biocurators can use the generated knowledge to search for important patterns in the biological data, revealing concise and relevant information about the application domain.
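To make the idea of semi-supervised learning with combined classifiers concrete, here is a minimal sketch in plain Python. It is not the architecture proposed in this work: the two classifiers (nearest centroid and 1-nearest neighbor), the agreement rule, the function names, the toy feature vectors and the labels `term_A` and `term_B` are all assumptions chosen for illustration. An unlabeled sample is promoted to the labeled set only when both classifiers agree on its label, and the loop repeats with the enlarged set.

```python
from math import dist


def centroid_predict(labeled, x):
    """Nearest-centroid classifier: return the class whose mean vector is closest to x."""
    sums, counts = {}, {}
    for features, label in labeled:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}
    return min(centroids, key=lambda lab: dist(centroids[lab], x))


def nearest_neighbor_predict(labeled, x):
    """1-nearest-neighbor classifier: return the label of the closest labeled sample."""
    _, label = min(labeled, key=lambda pair: dist(pair[0], x))
    return label


def self_train(labeled, unlabeled, max_rounds=10):
    """Promote unlabeled samples only when both classifiers agree (hypothetical rule)."""
    labeled = list(labeled)
    pending = list(unlabeled)
    for _ in range(max_rounds):
        promoted, still_pending = [], []
        for x in pending:
            votes = {centroid_predict(labeled, x), nearest_neighbor_predict(labeled, x)}
            if len(votes) == 1:
                promoted.append((x, votes.pop()))  # both classifiers agree
            else:
                still_pending.append(x)  # disagreement: leave for a later round
        if not promoted:
            break  # nothing new was learned, so stop iterating
        labeled.extend(promoted)
        pending = still_pending
    return labeled, pending


# Toy data: 2-D feature vectors with made-up labels, not real GO annotations.
seed = [((0.0, 0.0), "term_A"), ((1.0, 1.0), "term_B")]
pool = [(0.1, 0.2), (0.9, 0.8)]
final_labeled, leftover = self_train(seed, pool)
```

Samples on which the classifiers disagree stay in `leftover`, which in a curation setting would be the natural candidates to hand to a human biocurator.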