HPC and storage systems

Access to the national supercomputing system empowers Norwegian research and innovation across various scientific fields. These high-performance machines accelerate progress in climate modelling, pharmaceutical research, astrophysics, materials science, and more. Supercomputers (HPC systems) not only perform tasks faster but also tackle once-impossible problems. They enable deeper exploration of cosmic mysteries, enhance climate prediction accuracy, facilitate innovative materials design, and advance life-saving drug development. In essence, these machines drive the frontiers of knowledge and technology.

HPC systems

Norway's e-infrastructure comprises three robust supercomputer systems and a state-of-the-art storage solution. Additionally, as part-owner of LUMI, Europe's most powerful supercomputer, Sigma2 enriches the landscape for the benefit of Norwegian researchers.

Each of the HPC facilities consists of a compute resource (several compute nodes, each with several processors and internal shared memory, plus an interconnect that connects the nodes), a central storage resource that is accessible by all the nodes, and a secondary storage resource for back-up (and in few cases also for archiving). All facilities use variants of the UNIX operating system (Linux, AIX, etc.).

Storage systems

NIRD provides data storage capacity for research projects with a data-centric architecture. It therefore also provides storage capacity for the HPC systems, the national data archive and other services requiring a storage backend. It consists of two geographically separated storage systems. The leading storage technology, combined with the powerful network backbone underneath, allows the two systems to be geo-replicated asynchronously, thus ensuring high availability and security of the data.

The NIRD storage facility, unlike its predecessor NorStore, is tightly integrated with the HPC systems, thus facilitating the computation of large datasets. Furthermore, the NIRD storage offers high-performance storage for post-processing, visualization, GPU computing and other services on the NIRD Service Platform.

Systems in production

Betzy — BullSequana XH2000

Betzy is Norway's most powerful supercomputer of all time.

The supercomputer is named after Mary Ann Elizabeth (Betzy) Stephansen, the first Norwegian woman with a PhD in mathematics. Betzy is a BullSequana XH2000, provided by Atos, with a theoretical peak performance of 6.2 PetaFlops.

Betzy is located at NTNU in Trondheim and has been in production since 24 November 2020.

Betzy offers mainly CPU compute capacity and some GPU compute capacity. The latest GPU and AI compute capacity is provided through NVIDIA accelerators. While the system is mainly suited for highly parallel MPI jobs, utilising from 512 up to 65 536 cores, it also offers smaller pre- and post-processing capabilities through dedicated nodes. Compared with Saga and Fram, Betzy is the system best suited for highly parallel jobs.
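
As a minimal illustration of the distributed-memory MPI pattern such jobs follow, the sketch below uses Python with the mpi4py package; the package and the toy reduction are illustrative assumptions, not a Betzy-specific recipe.

    # Minimal sketch of a distributed-memory MPI job (hypothetical example,
    # assuming the mpi4py package is available in the user's environment).
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()   # ID of this MPI rank within the job
    size = comm.Get_size()   # total number of ranks, e.g. 512-65 536 on a system like Betzy

    # Each rank computes a partial result; rank 0 collects the global sum.
    partial = float(rank)
    total = comm.reduce(partial, op=MPI.SUM, root=0)

    if rank == 0:
        print(f"Sum over {size} ranks: {total}")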

Technical specifications

Some common applications running on Betzy:

  • OpenFoam
  • NorESM
  • Bifrost
  • FluTAS
  • MGLET
  • ABINIT

The most common users on the machine (in descending order) are from the following fields of science:

  • Geosciences
  • Computational Fluid Dynamics (CFD)
  • Physics
  • Chemistry, and
  • Marine technology

Fram — Lenovo NeXtScale nx360

Named after the Norwegian arctic expedition ship Fram, this machine started production on 1 November 2017. The computer is hosted at UiT The Arctic University of Norway and is provided by Lenovo.

For technical details, please refer to our technical documentation. 

This distributed memory system offers CPU compute capacity interconnected with a high-bandwidth, low-latency InfiniBand network. The interconnect network is organized in an island topology, with 9216 cores on each island. The machine also has some nodes with more memory, enabling support for jobs demanding up to 512 GiB per node.

The machine is well suited for distributed memory jobs using MPI with between 32 and 512 cores. Fram serves as our “mid-range” system in terms of recommended job size.

Technical specifications

Some common applications run on Fram are:

  • VASP
  • CESM
  • ROMS
  • WRF
  • Python scripts
  • LAMMPS
  • Gaussian

The most common users on the machine (in descending order) are from the following fields of science:

  • Geosciences
  • Material science
  • Chemistry
  • Physics
  • Computational Fluid Dynamics (CFD), and
  • Marine technology

Saga — Apollo 2000/6500 Gen10

The national supercomputer Saga is named after the goddess in Norse mythology associated with wisdom. Saga is also a term for Icelandic epic prose literature.

This supercomputer opened to users in 2019 and is located at NTNU in Trondheim.

Technical specifications

Saga offers the latest GPU and AI compute capacity through NVIDIA accelerators. This machine has several large memory nodes and can serve jobs requiring up to 6 TiB of RAM per node. It is well suited for running single-core applications, shared memory applications (OpenMP) and applications utilising up to 256 cores.
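
As a rough sketch of the single-node, shared-memory pattern Saga favours, the example below uses Python's multiprocessing module as a stand-in for OpenMP; the worker count and workload are arbitrary illustrations.

    # Single-node parallelism sketched with Python's multiprocessing module,
    # used here as a stand-in for the shared-memory (OpenMP) pattern.
    from multiprocessing import Pool

    def square(x):
        return x * x

    if __name__ == "__main__":
        # For example, one worker process per allocated core on a single node.
        with Pool(processes=16) as pool:
            results = pool.map(square, range(1000))
        print(sum(results))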

Saga serves the majority of our HPC projects. Some common applications which run on Saga are:

  • Orca
  • Python scripts
  • VASP
  • Gaussian
  • LAMMPS

The primary users of Saga are within the following fields of science:

  • Chemistry
  • Material Science
  • Biosciences
  • Geosciences
  • Physics
  • Medical Science

LUMI — HPE Cray EX supercomputer

LUMI (Large Unified Modern Infrastructure) is the first of three pre-exascale supercomputers built to ensure that Europe is among the world leaders in computing capacity. Norway, through Sigma2, owns part of the LUMI supercomputer which is funded by the EU and consortium countries.

Key figures (Sigma2's share):

  • CPU-core hours: 34 003 333
  • GPU-hours: 1 771 000
  • TB-hours: 16 862 500
  • Central disk: a share of the total 117 PB
  • Theoretical performance (Rpeak): ~11 PFLOPS

Using 1 core of LUMI-C for 1 hour costs 1 CPU-core-hour, and using 1 GPU of the LUMI-G partition for 1 hour costs 1 GPU-hour. Storing 1 terabyte on LUMI-F consumes 10 terabyte-hours in 1 hour, on LUMI-P 1 terabyte-hour per hour, and on LUMI-O 0.5 terabyte-hours per hour.
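
As a worked sketch of how these rates combine into billing units, the helper below simply mirrors the rates stated above; the function and rate table are illustrative, not an official LUMI tool.

    # Illustrative sketch of LUMI billing-unit accounting based on the rates above.
    # The function and the rate table are hypothetical helpers, not an official LUMI API.
    STORAGE_RATE = {
        "LUMI-F": 10.0,  # TB-hours consumed per TB stored per hour
        "LUMI-P": 1.0,
        "LUMI-O": 0.5,
    }

    def billing_units(cpu_cores=0, gpus=0, tb_stored=0.0, tier="LUMI-P", hours=1.0):
        """Return (CPU-core-hours, GPU-hours, TB-hours) consumed over `hours`."""
        cpu_core_hours = cpu_cores * hours
        gpu_hours = gpus * hours
        tb_hours = tb_stored * STORAGE_RATE[tier] * hours
        return cpu_core_hours, gpu_hours, tb_hours

    # Example: 128 CPU cores and 4 GPUs for 24 hours, plus 2 TB stored on LUMI-F.
    print(billing_units(cpu_cores=128, gpus=4, tb_stored=2.0, tier="LUMI-F", hours=24.0))
    # -> (3072.0, 96.0, 480.0)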

NIRD — National Infrastructure for Research Data

NIRD offers storage services, archiving services, cloud services, and processing capabilities for stored data. It provides these services and capacities to scientific disciplines requiring access to advanced, large-scale, or high-end resources for storage, data processing, research data publication, or digital database and collection searches. NIRD is a high-performance storage system capable of supporting AI and analytics workloads, enabling simultaneous multi-protocol access to the same data.

NIRD provides storage resources with yearly capacity upgrades, data security through backup services and adaptable application services, multiple storage protocol support, migration to third-party cloud providers and much more. Alongside the national high-performance computing resources, NIRD forms the backbone of the national e-infrastructure for research and education in Norway, connecting data and computing resources for efficient provisioning of services.

Technical specifications

Hardware

NIRD consists of two separate storage systems, namely Tiered Storage (NIRD TS) and Data Lake (NIRD DL). The total capacity of the system is 49 PB (24 PB on NIRD TS and 25 PB on NIRD DL).

NIRD TS has several tiers spanned by a single filesystem. It is designed for performance and is used mainly for active project data.

NIRD DL has a flat structure and is designed mainly for less active data. NIRD DL provides unified access, i.e. file and object storage, for sharing data across multiple projects and for interfacing with external storage.
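
Since NIRD DL exposes object storage alongside file access, project data can in principle be reached with S3-style tooling. The sketch below uses the boto3 library; the endpoint URL, bucket name and credentials are placeholders, none of which are given in this text.

    # Hypothetical sketch of S3-style object access to a data-lake bucket.
    # The endpoint, bucket and credentials are placeholders, not real NIRD values.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://datalake.example.org",  # placeholder endpoint
        aws_access_key_id="PROJECT_KEY",              # placeholder credential
        aws_secret_access_key="PROJECT_SECRET",       # placeholder credential
    )

    # List the first few objects in a (placeholder) project bucket.
    response = s3.list_objects_v2(Bucket="example-project-bucket", MaxKeys=10)
    for obj in response.get("Contents", []):
        print(obj["Key"], obj["Size"])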

NIRD is based on the IBM Elastic Storage System and is built using ESS3200, ESS3500 and ESS5000 building blocks. I/O performance is ensured by IBM POWER9 servers with dedicated data movers, protocol nodes and more.

System information

  • System (building blocks): IBM ESS3200, IBM ESS3500, IBM ESS5000, IBM POWER9
  • Clusters (two physically separated clusters): NIRD TS and NIRD DL
  • Storage media: NIRD TS: NVMe SSD & NL-SAS; NIRD DL: NL-SAS
  • Capacity (total 49 PB): NIRD TS: 24 PB; NIRD DL: 25 PB
  • Performance (aggregated I/O throughput): NIRD TS: 209 GB/s; NIRD DL: 66 GB/s
  • Interconnect (100 Gbit/s Ethernet): NIRD TS: balanced 400 Gbit/s; NIRD DL: balanced 200 Gbit/s
  • Protocol nodes (NFS, S3): 4 x 200 Gbit/s and 5 x 50 Gbit/s

Software

IBM Storage Scale (GPFS) is deployed on NIRD, providing software-defined, high-performance file and object storage for AI and data-intensive workloads.
Insight into data is ensured by IBM Storage Discover.
Backup services and data integrity are ensured with IBM Storage Protect.

NIRD Service Platform Hardware

The NIRD Service Platform is a Kubernetes-based cloud platform providing persistent services such as web services, domain- and community specific portals, as well as on-demand services through the NIRD Toolkit.

The cloud solution on the NIRD Service Platform enables researchers to run microservices for pre/post-processing, data discovery and analysis as well as data sharing, regardless of dataset sizes stored on NIRD.

The NIRD Service Platform was designed with high-performance computing and artificial intelligence capabilities in mind: it is robust, scalable and able to run AI and ML workloads.

The technical specifications of the NIRD Service Platform are listed below:

  • Workers: 12
  • CPUs: 2368 cores (8 workers with 256 cores; 4 workers with 80 cores)
  • GPUs: 30 x NVIDIA V100
  • RAM: 9 TiB (4 workers with 512 GiB; 4 workers with 1024 GiB; 4 workers with 768 GiB)
  • Interconnect: Ethernet (8 workers with 2 x 100 Gbit/s; 4 workers with 2 x 10 Gbit/s)

See the NIRD Service Platform service description for more details.

Betzy and Saga are located at NTNU in Trondheim, and Fram at UiT in Tromsø. LUMI is located in a data centre facility in Kajaani, Finland. NIRD is located in the Lefdal Mine Datacenter, where Sigma2's future HPC systems will also be placed.

Decommissioned systems

Gardar (2012-2015)

Gardar was an HP BladeCenter cluster consisting of one frontend (management/head) node, 2 login nodes and 288 compute nodes running CentOS Linux managed by Rocks. Each node contained two Intel Xeon processors and 24 GB of memory. The compute nodes were located in HP racks; each HP rack contained three c7000 blade enclosures, and each enclosure contained 16 compute nodes.

Gardar had a separate storage system. The X9320 network storage system was available to the entire cluster and used the IBRIX Fusion software. The total usable storage of the X9320 was 71.6 TB. The storage system was connected to the cluster with a QDR InfiniBand network.

Technical details

  • System: HP BL280c G6 servers
  • Number of cores: 3456
  • Number of nodes: 288
  • CPU type: Intel Xeon E5649 (2.53 GHz), Westmere-EP
  • Peak performance: 35 TFLOPS
  • Total storage capacity: 71.6 TB

Hexagon (2008-2017)

Significant use of Hexagon traditionally came from areas of science such as computational chemistry, computational physics, computational biology, geosciences and mathematics. Hexagon was installed at the High Technology Centre in Bergen (HiB) and was managed and operated by the University of Bergen. Hexagon was upgraded from a Cray XT4 in March 2012.

Technical details

  • System: Cray XE6-200
  • Number of cores: 22272
  • Number of nodes: 696
  • Cores per node: 32
  • Interconnect: Cray Gemini
  • Peak performance: 204.9 TFLOPS
  • Operating system: Cray Linux Environment

Abel (2012-2020)

Named after the famous Norwegian mathematician Niels Henrik Abel, the Linux cluster at the University of Oslo was a shared resource for research computing capable of 258 TFLOP/s theoretical peak performance. At the time of installation on 1 October 2012, Abel reached position 96 on the Top500 list of the most powerful systems in the world.

Abel was an all-round, all-purpose cluster designed to handle multiple concurrent jobs and users with varying requirements. Rather than massively parallel applications, the primary application profile was small to moderately parallel applications with high I/O and/or memory demands.

Technical details

  • System: MEGWARE MiriQuid 2600
  • Number of cores: 10000+
  • Number of nodes: 650+
  • CPU type: Intel Xeon E5-2670
  • Max floating-point performance (double precision): 258 TFLOPS
  • Total memory: 40 TiB
  • Total disk capacity: 400 TiB

Stallo (2007-2021)

The Linux Cluster Stallo was a compute cluster at the University of Tromsø, which was installed on 1 December 2007, and included in NOTUR on 1 January 2008. The supercomputer was upgraded in 2013.

Stallo was intended for distributed-memory MPI applications with low communication requirements between the processors, shared-memory OpenMP applications using up to eight processor cores, parallel applications with moderate memory requirements (2-4 GB per core), and embarrassingly parallel applications.

Technical details

  • System: HP BL460c Gen8
  • Number of cores: 14116
  • Number of nodes: 518
  • CPU type: Intel Xeon E5-2670
  • Peak performance: 104 TFLOPS
  • Total memory: 12.8 TB
  • Total disk capacity: 2.1 PB

Vilje (2012-2021)

Vilje was a cluster system procured by NTNU in 2012, in cooperation with the Norwegian Meteorological Institute and Sigma. Vilje was used for numerical weather prediction in operational forecasting by the Norwegian Meteorological Institute, as well as for research on a broad range of topics at NTNU and other Norwegian universities, colleges and research institutes. The name Vilje is taken from Norse mythology.

Vilje was a distributed memory system that consisted of 1440 nodes interconnected with a high-bandwidth, low-latency switch network (FDR InfiniBand). Each node had two 8-core Intel Sandy Bridge processors (2.6 GHz) and 32 GB of memory. The total number of cores was 23040.

The system was well-suited (and intended) for large-scale parallel MPI applications. Access to Vilje was in principle only allowed for projects that had parallel applications that used a relatively large number of processors (≥ 128).

Technical details

  • System: SGI Altix 8600
  • Number of cores: 22464
  • Number of nodes: 1404
  • CPU type: Intel Sandy Bridge
  • Total memory: 44 TB
  • Total disk capacity: