Highly replicable and portable computing infrastructure for Data Science research using OpenHPC
Abstract
High-performance computing (HPC) refers to the use of supercomputers or of multiple computers for tasks that require a large amount of processing. HPC infrastructure is a requirement for research in a wide range of fields. Deploying and maintaining this type of infrastructure is not a simple task, which is why large scientific computing centers have large teams dedicated to it. This work reports the lessons learned from deploying an HPC infrastructure at the Federal University of São Carlos, which has a small staff. The OpenHPC project, made available and maintained by the free software community, helps reduce the complexity of this infrastructure. However, two adaptations were made to the standard OpenHPC installation process: (1) fixing a security vulnerability related to the way the node provisioner is configured; and (2) using Ceph as an alternative network file system that offers greater performance and reliability than NFS while being less complex to operate than Lustre. This study then addresses the use of containers as a way to promote the reproducibility and portability of scientific experiments. While container technologies for HPC environments, such as Singularity, are relatively mature, they do not yet offer as many ready-to-use components as Kubernetes and other cloud-based platforms. Therefore, this work contributes the implementation and documentation of a Singularity container that runs the Apache Spark platform, widely used in data science research, in an HPC environment. Furthermore, this work proposes and documents a set of conveniences for the day-to-day activities of a research group, for example, notification of the completion of experiments through instant messengers. Finally, the complete infrastructure is validated through a set of experiments.