Dedicated to research excellence: Welcome to new and returning users

17.10.2023

So far, 2023 has been dominated by technology, and there is no indication that this trend will diminish as we embark on our 2nd resource allocation period of the year. Artificial Intelligence (AI) and Machine Learning (ML) have gone from mere buzzwords to essential parts of our future daily lives.

It is fascinating to see that many projects on the national HPC and data storage services are utilising the power of AI and ML to drive groundbreaking research forward.

An abstract concept with a robotic arm holding a computer screen with a digitally displayed human brain

Enabling research through digital cornerstones

We take pride in our role in advancing scientific discoveries and knowledge that positively impacts our society. Supercomputers, advanced storage systems and high-capacity networks have become increasingly important as digital cornerstones for supporting scientific activities, research and innovation.

Before our 2nd resource allocation period this year, which commenced on 1 October, we received more than 500 applications from projects needing high-performance computing (HPC) and storage services. Now, we are proud to welcome more than 2,300 new and returning users on the Norwegian research infrastructure services — including Sigma2's part of the European supercomputer LUMI — for the coming 6 months.

We are eager to showcase some of the research made possible through our services. Below you can read more about our top 5 HPC projects in terms of CPU hours, top 5 storage projects in terms of terabytes (TB) and top 3 Sensitive Data Services (TSD) projects. We are very excited to follow these projects and witness the remarkable contributions they make to their respective fields. In addition to the top lists, we would also like to spotlight some other exciting projects, so please keep on reading.

Top 5 HPC

These projects span various scientific domains, from solar atmospheric modelling to quantum chemistry, and they demonstrate the remarkable potential of HPC in advancing our understanding of the natural world and addressing pressing challenges.

Click on the titles below to read more about each project and see how many CPU hours they have been awarded.

1. Solar Atmospheric modelling

The Sun's impact on Earth is crucial, affecting human health, technology, and critical infrastructures. Solar magnetism is central to understanding the Sun's magnetic field, its activity cycle, and its effects on our environment. Mats Carlsson and his fellow researchers are at the forefront of modelling the Sun's atmosphere with the Bifrost code, aiding our comprehension of the Sun's outer magnetic atmosphere, which influences phenomena like solar wind, satellites and climate patterns.

  • Project leader: Mats Carlsson
  • Institution: Rosseland Centre for Solar Physics, UiO

This project has been awarded 140,100 CPU hours.

2. Combustion of hydrogen blends in gas turbines at reheat conditions

Research into Carbon Capture and Storage (CCS) for fossil fuel-based power generation is crucial in mitigating climate change. With modern gas turbines (GTs) finely tuned for conventional fuel mixtures, CCS introduces challenges due to variations in fuel and operating conditions. To address this, high-resolution computational fluid dynamics is used to study hydrogen-blend combustion in gas turbines, aiming to understand thermo-acoustic behaviour in staged combustion chambers. Utilising the S3D code from Sandia National Laboratories and supercomputing resources, this research seeks to ensure stable and safe gas turbine operation for CCS applications, contributing to climate change mitigation efforts.

  • Project leader: Andrea Gruber
  • Institution: SINTEF Energi AS

This project has been awarded 75,000 CPU hours.

3. Bjerknes Climate Prediction Unit

The Bjerknes Climate Prediction Unit focuses on developing prediction models to address climate and weather-related challenges. These include predicting precipitation for hydroelectric power, sea surface temperatures for fisheries, and sea-ice conditions for the shipping industry. Bridging the gap between short-term weather forecasts and long-term climate projections, their goal is to create a highly accurate prediction system for northern climate. This involves understanding climate variability, developing data assimilation methods, and exploring the limits of predictability.

  • Project leader: Noel Sebastian Keenlyside
  • Institution: Geophysical Institute, UiB

This project has been awarded 45,000 CPU hours.

4. Heat and Mass Transfer in Turbulent Suspensions

This project focuses on performing detailed simulations of heat and mass transfer in two-phase systems, involving dilute suspensions of particles or bubbles in gas or liquid flows. These systems are common in both natural and technological settings. Building upon advancements in flow simulation algorithms, particularly in the areas of turbulent heat transfer and droplet evaporation, the project seeks to explore the added complexity introduced by phase-change thermodynamics. This complexity includes variable droplet size and phenomena like film boiling, marking a significant advancement from previous studies.

  • Project leader: Luca Brandt
  • Institution: Department of Energy and Process Engineering, NTNU

This project has been awarded 45,000 CPU hours.

5. Quantum chemical studies of molecular properties and energetics of large and complex systems

The Hylleraas Centre for Quantum Molecular Sciences employs advanced computational chemistry codes to simulate complex molecular systems interacting with fields and radiation. Their mission is to comprehend and predict new chemistry, physics, and biology in molecules under extreme conditions and electromagnetic fields, employing cutting-edge computational methods and parallel computing resources.

  • Project leader: Thomas Bondo Pedersen
  • Institution: The Hylleraas Centre for Quantum Molecular Sciences, Department of Chemistry, UiO

This project has been awarded 28,370 CPU hours.

While we applaud the impressive achievements of our top users, let's explore a fascinating project that demonstrates how high-performance computing (HPC) can drive development in a field like biology.

Do you know which type of animal cell is the most variable in terms of its shape and size?

It's the sperm cell, or spermatozoon, which changes and evolves rapidly.

A bullfinch sitting on a branch.
The project explores genes and proteins behind sperm variation. Bullfinch photographed by Erica H Leder.

A biology and genomics project at the Natural History Museum (UiO) seeks to unravel why and how sperm cells have diversified within the large and varied group of passerines, or songbirds. To do this, the project has been awarded 150,000 CPU hours on Saga, the preferred supercomputer among many biology users. The Natural History Museum has collected extensive data on sperm length in various bird species over the past 15 years.

A siskin sitting on a wooden stake.
The museum collected samples to study sperm genetics in passerine birds. Siskin photographed by Erica H Leder.

This data is invaluable for studying how bird sperm has evolved in terms of its structure and function over evolutionary history. To better understand the genetics behind sperm variation, the museum collected blood and tissue samples along with sperm samples, enabling researchers to study the genes responsible for these differences more effectively. The project will employ various scientific methods to identify which genes contribute to the variation in sperm length among passerine birds. Additionally, the project has expanded to explore the proteins in sperm, seminal fluid, and egg yolk membranes.

Sperm cells are evolving, even though their main role in fertilisation remains crucial. These cells are influenced by two opposing forces: one that ensures they perform their essential job in fertilisation and another that drives changes in their traits to improve their chances of fertilising eggs. These variations in sperm traits can affect their success in fertilising eggs and how females choose sperm, and can even help maintain the separation of different species by hindering interbreeding.

Supercomputing power needed

The researchers need supercomputer resources to routinely handle large amounts of genomic or transcriptomic data (transcriptomics is the study of cellular RNA transcripts).

"We are reconstructing phylogenetic relationships of bird species and estimating divergence times, which require millions of iterations and often take days to complete. We use these divergence time estimates of species and populations to better understand the speed of sperm morphological evolution. Additionally, we use transcriptomic data from testes (bulk RNAseq and single-cell sequencing) to compare how the genes responsible for sperm morphology have changed across species," says Erica H Leder, Associate Professor II at the Natural History Museum.

Preliminary analysis of single-cell transcriptomic data from testes of three species of birds - BF = Eurasian bullfinch, SI = Eurasian siskin, ZF = zebra finch.

These analyses take advantage of the large capacity for parallelisation and the RAM available on a supercomputer, in this case Saga. The main outcome of this project will be a comprehensive list of genes linked to and underlying the differences in sperm structure among passerine birds, and a greater understanding of how sperm morphology contributes to speciation. The project leader is Postdoctoral Fellow Emma Whittington.

Source: Sperm Evolution in Birds (Prosjektbanken — Forskningsrådet) and the project application.

Top 5 storage

The top 5 storage projects span a wide range of fields, from language technology to climate modelling, and demonstrate the critical role of advanced storage systems in facilitating research and innovation.

Click on the titles below to read more about each project and see how much storage they have been allocated for this period.

1. High-Performance Language Technologies

This timely project addresses the rapid advancements in Natural Language Processing (NLP) and its strong presence in Norwegian universities. It focuses on countering the dominance of major American and Chinese tech companies in training Very Large Language Models (VLLMs) and Machine Translation (MT) systems, which have wide applications.

To foster diversity, the project aims to lower entry barriers by establishing a European language data space, facilitating data gathering, computation, and reproducibility for universities and industries to develop free language and translation models across European languages and beyond.

  • Project leader: Stephan Oepen
  • Institution: Department of Informatics, UiO

This project has been allocated 2,749 TB of storage resources.

2. Storage for nationally coordinated NorESM experiments

The Norwegian climate research community has developed and maintained the Norwegian Earth System Model (NorESM) since 2007. NorESM has been instrumental in international climate assessments, including the 5th IPCC report through CMIP5.

The consortium, comprising multiple research institutions, is dedicated to contributing to the next phase, CMIP6, in partnership with global modelling centres. This collaborative effort seeks to advance our knowledge of climate change and provide accessible data to scientists, policymakers and the public.

  • Project leader: Mats Bentsen
  • Institution: NORCE Norwegian Research Centre AS

This project has been allocated 1,979 TB of storage resources.

3. Bjerknes Climate Prediction Unit

The Bjerknes Climate Prediction Unit focuses on developing advanced prediction models to address climate and weather-related challenges, such as predicting precipitation for hydroelectric power, sea surface temperatures for fisheries, and sea-ice conditions for shipping.

They aim to bridge the gap between short-term weather forecasts and long-term climate projections by assimilating real-world weather data into climate models. This project faces three key challenges: understanding predictability mechanisms, developing data assimilation methods, and assessing climate predictability limits.

  • Project leader: Noel Keenlyside
  • Institution: Geophysical Institute, UiB

This project has been allocated 1,539 TB of storage resources.

4. Storage for INES — Infrastructure for Norwegian Earth System modelling

The INES project, led by Norwegian climate research institutions, maintains and enhances the Norwegian Earth System Model (NorESM). Its goals include providing a cutting-edge Earth System Model, efficient simulation infrastructure, and compatibility with international climate data standards.

The Norwegian Climate Modeling Consortium plans to increase contributions for further development and alignment with research objectives.

  • Project leader: Terje Koren Berntsen
  • Institution: Department of Geosciences, UiO

This project has been allocated 1,319 TB of storage resources.

5. High-latitude coastal circulation modelling

This project, managed by Akvaplan-niva, encompasses physical oceanographic modelling activities, covering coastal ocean circulation modelling in the Arctic, Antarctic, and along the Norwegian coast.

These activities involve storing significant amounts of data, including high-resolution coastal models and atmospheric model output needed to force the ocean circulation models. To achieve high resolution in various regions, the project must store substantial model data and results from specific simulations.

  • Project leader: Magnus Drivdal
  • Institution: Akvaplan-niva AS

This project has been allocated 990 TB of storage resources.

Also on NIRD, we have an interesting new project worth noting. The NeuroConvergence project focuses on neuropsychiatric disorders within the field of genomics. What's particularly interesting is the project's utilisation of artificial intelligence and machine learning to uncover and understand these disorders.


Can advanced technology and teamwork cure neuropsychiatric disorders?

Neuropsychiatric disorders (NPDs) like ADHD, bipolar disorder, depression and schizophrenia cause loss of lives, suffering and societal costs worldwide. Traditional pharmacotherapies for these conditions, introduced 50-100 years ago, lack effectiveness and specificity. However, recent advances in molecular genetics and cutting-edge technologies are bringing forth a new era for NPD treatment.

A woman's hand on a broken mirror.
The NeuroConvergence project will combine artificial intelligence techniques, including computation, modelling, simulation, machine learning, big data analysis, and advanced experimental biotechnology, to uncover and evaluate new treatment options. Photo: Shutterstock.

Molecular genetic studies are uncovering promising genetic loci (which are used to pinpoint and study particular parts of an organism's DNA) that are enriched in brain pathways and proteins and offer potential therapeutic targets for neuropsychiatric disorders. At the same time, we see breakthroughs in computational and experimental molecular life sciences. This has paved the way for the NeuroConvergence project, an ambitious initiative that aims for systematic identification and testing of new molecular targets in the treatment of neuropsychiatric disorders.

However, the complexity of discovering brain disorder treatment targets requires collaboration across various disciplines. To tackle this challenge, an interdisciplinary research team has been assembled at the University of Bergen, consisting of experts in machine learning, biotechnology, human genomics, protein structural biology, medicinal chemistry, molecular pharmacology and clinical neuropsychiatry.

The project requires substantial storage capacity to accommodate datasets ranging from 10-100 GB, essential for storing training data, deep learning models, and various outputs from multiple training sessions. This storage infrastructure is crucial for running GPU-based analyses on simulated and publicly available genomic datasets using the NIRD toolkit.

Simultaneously, the project explores the likelihood and therapeutic potential of targets identified in preliminary studies. The project is led by Tom Michoel, Professor at the Department of Informatics at the University of Bergen. The collaboration extends beyond the Norwegian border, including outstanding institutions in Germany, the USA, Spain, and Sweden.

Source: New technologies for target discovery in neuropsychiatric disorders (Prosjektbanken — Forskningsrådet) and the project application.

Secure services for sensitive data

This section highlights the top 3 projects in terms of CPU hours and the top 3 in terms of storage allocations, all of which use Sensitive Data Services (TSD) to drive their impactful research. The projects span diverse areas, from severe mental disorders and cancer genome sequencing to the genetics of childhood-onset neurological disorders. TSD is a service delivered by the University of Oslo and provided as a national service by Sigma2.

Top 3 in terms of awarded CPU hours

1. NORMENT infrastructure for sensitive data in severe mental disorders

The Norwegian Centre for Mental Disorders Research (NORMENT) is a Center of Excellence (CoE) funded by the Research Council of Norway. NORMENT researchers have extensive access to sensitive data for diverse studies, necessitating robust storage and computational resources at TSD to ensure security and data access.

  • Project leader: Ole Andreassen
  • Institution: Faculty of Medicine, UiO

This project has been awarded 1,800,000 CPU hours.

2. IMPRESS-Norway

This project focuses on cancer genome sequencing using NGS methods in clinical research. TSD serves as the primary computational data platform, as the project aims to analyse hundreds of cancer genomes to identify DNA variations impacting diagnosis, prognosis, and treatment options, guiding future cancer treatments and clinical trials. International quality monitoring benchmarks are part of the project.

  • Project leader: Eivind Hovig
  • Institution: Department of Informatics, UiO

This project has been awarded 850,000 CPU hours.

3. NORMENT-MOBA infrastructure for sensitive data

This dedicated TSD project, a collaboration between the Norwegian Mother, Father, and Child Cohort Study (MoBa) and The Norwegian Centre for Mental Disorders Research (NORMENT), focuses on identifying genetic factors related to mental disorders using population cohorts and innovative analysis methods.

  • Project leader: Ole Andreassen
  • Institution: Faculty of Medicine, UiO

This project has been awarded 800,000 CPU hours.

Top 3 in terms of storage allocations

1. Computational analysis of human cancers for biomarker identification

This research project involves sequencing and analysing DNA, RNA, and epigenetic modifications from various human cancers, with a focus on methodological developments to identify novel cancer-specific variants for diagnostics and treatment. Primary areas of study include prostate cancer, colorectal cancer, and bladder cancer, along with other cancer types.

Comparative gene sequence analyses between tumours and healthy tissue aim to uncover genetic changes, which may serve as diagnostic, prognostic, and treatment-effectiveness markers. Sequencing is performed at a deep level to detect rare gene variants with clinical relevance and integrate results with diverse data types, including clinical and gene expression profiles.

  • Project leader: Rolf Skotheim
  • Institution: Oslo University Hospital

This project has been allocated 308 TB of storage resources.

2. Sensitive data storage, ELIXIR Norway infrastructure for life science

ELIXIR Norway aims to establish a systematic and secure data storage solution for bioscience users. A dedicated steering group, representing national and major user groups like BioMedData, grants life science researchers access to NIRD's storage space in compliance with Sigma2/NIRD policies. The project strives to provide seamless access to storage, integrated with data analysis and processing services while ensuring data security and privacy.

  • Project leader: Inge Jonassen
  • Institution: Faculty of Mathematics and Natural Sciences, UiB

This project has been allocated 220 TB of storage resources.

3. Congenital brain malformations

This project investigates severe childhood-onset neurological disorders, often caused by genetic mutations. Traditional DNA sequencing methods fall short, so the researchers use advanced techniques to analyse all genes and non-coding regions in patients lacking a diagnosis. By conducting functional studies on potential disease-causing variants, new disease-related genes are identified, improving the understanding of these disorders and providing genetic diagnoses for affected patients.

  • Project leader: Eirik Frengen
  • Institution: Faculty of Medicine, UiO

This project has been allocated 71 TB of storage resources.

Machine Learning system for scalable domain adaptation of Language Models

While numerous research projects are presently underway in the university sector, let's shift our focus to an industrial project within machine learning. The Iris.ai [EIC Accelerator] project is dedicated to developing a Machine Learning system for scalable domain adaptation of Language Models designed specifically for scientific text.

Close-up concept of a neural network.

Researchers encounter challenges in keeping up with the vast amount of new scientific knowledge generated daily. A "Researcher Workspace" is proposed to address this issue, aiming to process scientific knowledge efficiently. This workspace includes content-based search, smart filters, machine-generated summaries, automatic data extraction, and the ability to interact with information. Machine learning models can adapt to different research fields, enabling scalable solutions with interdisciplinary applications.

The project leader is Anita Schjøll Abildgaard at Iris.ai.

Illustration of a human neural network.

These are just a few examples of the projects that use the national e-infrastructure services this autumn. We follow the outcomes of these projects with great interest and wish all project managers and users the best of luck with their research.