Constructive machine learning and hierarchical multi-label classification applied to molecule generation
Abstract
One of the goals of Medicinal Chemistry is to discover new molecules with drug-like characteristics, a challenging task because the search space is discrete, unstructured, and enormous. Computation has increasingly been used as an auxiliary tool in chemical research, and Machine Learning is one of the fields of computer science that has gained visibility and been applied across many areas of knowledge in recent years. This research addresses two areas of Machine Learning: Constructive Machine Learning and Hierarchical Multi-label Classification. It explores how Constructive Machine Learning can learn the intrinsic rules of molecule databases and generate new instances with characteristics similar to those of the molecules they contain. The Constructive Machine Learning methods chosen for the study fall into two types: those that use the SMILES molecular representation and those that represent molecules as graphs. Among the different possibilities for evaluating the methods and the generated molecules, this work proposes the use of hierarchical classification in the evaluation process. A hierarchical classifier previously trained on molecule datasets assigns the generated molecules to classes in a taxonomy, so that the relevance of the generated molecules to existing taxonomies can be verified. This work also proposes a measure of dissimilarity between two groups of molecules, the hierarchical distance, which takes into account the taxonomy of the molecules in each group to determine the dissimilarity between them.