The advent of technologies like ChatGPT has propelled machine learning (ML) into the spotlight, heralding a new era in a broad spectrum of research disciplines.
The ongoing surge in ML applications signifies a shift towards methodologies that extend far beyond what was possible just a few years ago.
For ML to truly revolutionize scientific inquiry, developing flexible and efficient modelling tools is imperative. In the field of molecular modelling, the Hylleraas Centre for Quantum Molecular Sciences, supported by Sigma2, is making significant strides in this direction using the framework of the Hylleraas Software Platform (HSP).
Maximizing research potential
Morten Ledum is serving as an Advanced User Support (AUS) User Liaison, funded by Sigma2 for 18 months over three years. His support exemplifies the targeted effort to enhance strong user communities within Norway's research ecosystem, enabling them to leverage Sigma2 resources fully. As a Norwegian Centre of Excellence, the Hylleraas Centre boasts around 60 active users of national e-infrastructure services. Annually, these users consume close to 100 million CPU hours on Sigma2 resources. The Centre supports a broad spectrum of scientific applications, utilizing a diverse range of software that spans commercial, open-source, and custom in-house codes.
Optimizing performance with active learning
A key initiative for 2023 was the development of a fully automated active learning (AL) framework within the HSP, in collaboration with Sigbjørn Løland Bore, a young research talent at the Hylleraas Centre. This project also involves the HSP developer Tilmann Bondenstein and aims to leverage AL to refine machine-learning potentials (MLPs) for complex scientific phenomena such as ion transfer processes, phase transitions, and chemical reactions.
"This initiative not only showcases the Hylleraas Centre's commitment to pushing the boundaries of scientific research but also highlights the crucial support provided by Sigma2. Together, we are paving the way to advance high-performance computing (HPC) interactions, to develop frameworks that can benefit communities well beyond the Hylleraas Centre."
Simen Reine, Senior Engineer at the Hylleraas Centre.
Unlike traditional methods that rely on a large number of quantum chemical (QC) calculations for data generation, AL offers a more efficient path by focusing computational efforts where they are most needed, thereby optimizing performance, increasing data efficiency and enhancing model precision.
The AL methodology is designed around a cyclic process. Starting from an initial dataset, each cycle trains a committee of ML models, identifies new data points by analysing where the committee's predictions deviate from one another, and performs QC calculations to acquire data at those points. Repeating this cycle automates the development of robust and efficient MLPs.
The framework developed within the HSP allows for a wide exploration of methodologies and the use of various software tools at each step. The HSP integrates with both Sigma2 and EuroHPC resources, including Saga, Betzy, and LUMI. The AL framework gives researchers complete control over the choice of software and computational strategy for both their QC calculations and their ML modelling. Supported tools include VASP, CP2K, DeepMD, NequIP and LAMMPS, with the results automatically processed into pandas data frames and the raw data kept available for further analysis.
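To make the cycle described above concrete, the following is a minimal sketch of committee-based active learning in Python. It is not the HSP implementation and does not use the HSP interface. All names here (`reference_energy`, `train_committee`, `committee_deviation`, `active_learning`) are hypothetical; a cheap one-dimensional function stands in for the QC calculation, and polynomial fits of different degrees stand in for the committee of ML models.

```python
import numpy as np


def reference_energy(x):
    """Hypothetical stand-in for an expensive QC calculation."""
    return np.sin(3.0 * x) + 0.5 * x**2


def train_committee(x, y, degrees=(2, 3, 4, 5)):
    """Train a small committee: one polynomial fit per degree (assumed model choice)."""
    return [np.polynomial.Polynomial.fit(x, y, d) for d in degrees]


def committee_deviation(committee, x):
    """Standard deviation of the committee predictions at each candidate point."""
    predictions = np.stack([model(x) for model in committee])
    return predictions.std(axis=0)


def active_learning(x_init, candidates, n_cycles=5, batch=3, tol=0.05):
    """Run the train / select / label cycle until the committee agrees or cycles run out."""
    x = np.array(x_init, dtype=float)
    y = reference_energy(x)                      # label the initial dataset
    candidates = np.array(candidates, dtype=float)
    history = []                                 # max committee deviation per cycle

    for _ in range(n_cycles):
        committee = train_committee(x, y)
        deviation = committee_deviation(committee, candidates)
        history.append(float(deviation.max()))
        if deviation.max() < tol:                # committee agrees everywhere: stop
            break

        # Select the candidates where the committee disagrees the most.
        pick = np.argsort(deviation)[-batch:]
        x_new = candidates[pick]

        # "Run QC" only at the selected points and grow the dataset.
        x = np.concatenate([x, x_new])
        y = np.concatenate([y, reference_energy(x_new)])
        candidates = np.delete(candidates, pick)

    return x, y, history


if __name__ == "__main__":
    x0 = np.linspace(-1.0, 1.0, 8)               # initial data cover only part of the range
    pool = np.linspace(-2.0, 2.0, 81)            # candidate configurations
    x, y, history = active_learning(x0, pool)
    print(f"dataset grew from {len(x0)} to {len(x)} points")
    print("max committee deviation per cycle:", history)
```

The point of the design is that the expensive reference calculation is only called at points the committee flags as uncertain, rather than across the whole candidate pool. In a real workflow the selection step would typically also use lower and upper deviation thresholds tuned for the system at hand, and the labelled data would be collected in a table such as a pandas data frame.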