Hardware

Hazel Linux cluster

Hazel is a heterogeneous Linux cluster that combines state-of-the-art CPUs, GPUs, and networking while keeping older resources in service as long as feasible. It consists of dual-socket Intel Xeon and AMD EPYC compute nodes, some with attached GPUs.

  • On the order of 400 compute nodes with over 14,000 cores
  • Majority of nodes connected with InfiniBand
  • Several nodes with attached GPUs, including NVIDIA A100, H100, H200, L40S, and other models
  • Most nodes have more than 128 GB of memory; the standard configuration is now 512 GB, with some 1024 GB nodes

See the Compute Resources page for full details on available hardware, queues, and scheduling.

Cluster status monitoring

Real-time availability of resources can be monitored on the cluster status pages.

Storage

Storage is provided at no cost to all HPC accounts:

  • Home directory (/home) — 1 GB per account for source code, scripts, and small executables
  • Scratch space (/share) — 20 TB per project for running applications and working with large data
  • Application storage (/usr/local/usrapps) — 100 GB per project by request; backed-up space for installing larger applications and conda environments

See the Storage page for details on directory locations and size limits.
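As an illustration, a typical workflow keeps scripts and source in the small home quota and stages large data in scratch space. The project name and directory layout below are hypothetical; this is only a sketch of the pattern:

    # Hypothetical project name used for illustration
    PROJECT=myproject

    # Keep scripts and source code in the small /home quota
    cd /home/$USER

    # Stage large input data and run jobs from scratch space
    mkdir -p /share/$PROJECT/$USER/run01
    cp bigdata.tar.gz /share/$PROJECT/$USER/run01/
    cd /share/$PROJECT/$USER/run01
    tar -xzf bigdata.tar.gz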

Partner Program

If existing compute resources are inadequate for the needs of a project, researchers can purchase additional compute nodes or storage through the HPC Partner Program.

Software

Officially supported applications

HPC staff maintain a core set of commonly used applications including commercial, open-source, and community-supported packages. These are available to all users via the module system.
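Modules are listed and loaded from the command line. The package names below are illustrative; the exact module names and versions available on Hazel may differ, so check the output of module avail:

    # List software provided through the module system
    module avail

    # Load a compiler and an MPI library (example names only)
    module load gcc
    module load openmpi

    # Show what is currently loaded
    module list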

User-installed software

Users can install their own software in their project directories under /usr/local/usrapps. HPC staff will assist with installation questions, but users are asked to consult the documentation first and attempt installations themselves when feasible.
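A common pattern is to build software from source with an install prefix inside the project's application directory. The project and tool names below are placeholders, and the build steps are a generic autotools-style sketch rather than instructions for any particular package:

    # Hypothetical install location under the project's application storage
    PREFIX=/usr/local/usrapps/myproject/mytool

    # Typical source build installed into the project space
    tar -xzf mytool-1.0.tar.gz
    cd mytool-1.0
    ./configure --prefix=$PREFIX
    make
    make install

    # Add the install location to PATH for this session
    export PATH=$PREFIX/bin:$PATH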

Conda and containers

Conda is available for managing Python environments and installing packages. Apptainer (formerly Singularity) is provided for running containerized applications, which helps with reproducibility and ease of deployment.
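As a sketch, a conda environment can be created under the project's application storage, and a containerized tool can be pulled and run with Apptainer. The environment path is a placeholder, and the container image is a publicly available test image, not software specific to Hazel:

    # Create and activate a conda environment in project application storage
    conda create --prefix /usr/local/usrapps/myproject/env_py311 python=3.11 numpy
    conda activate /usr/local/usrapps/myproject/env_py311

    # Pull a container image and run a command inside it with Apptainer
    apptainer pull lolcow.sif docker://godlovedc/lolcow
    apptainer exec lolcow.sif cowsay "hello from the container"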

Staff

HPC has a team of computational scientists responsible for maintaining the cluster, supporting users, and helping researchers run their applications effectively. Staff services include:

Consulting

HPC staff are available for consultations to help with:

  • Getting started on the cluster
  • Optimizing job scripts and resource requests
  • Porting and scaling applications
  • Software installation and environment setup
  • Troubleshooting job failures and performance issues

Contact HPC to schedule a consultation.

Training

HPC staff provide training workshops and resources for users at all levels.

Documentation

Extensive documentation is available covering all aspects of using the HPC cluster, from logging in to running parallel jobs. See the documentation overview for a full index.

Faculty Oversight

The HPC Faculty Oversight Committee advises on matters related to the HPC service, including acceptable use policy, resource allocation, and strategic planning.