Testing Phase: Slurm is currently in testing with limited hardware. Production deployment will follow successful testing.

Current Test Environment

The following resources are available for Slurm testing:

Compute Nodes

Node Type   CPU Architecture              Quantity   Constraint
CPU Node    Intel Xeon E5 v3 (Haswell)    4          --constraint=haswell
CPU Node    Intel Xeon E5 v4 (Broadwell)  4          --constraint=broadwell

GPU Nodes

Node Type   GPUs                             Quantity   Request
GPU Node    4x NVIDIA RTX 2080 (8 GB each)   1          --partition=gpu --gres=gpu:rtx_2080:N

Replace N with the number of GPUs to request (1-4).

MPI Modules with Slurm Integration

The following modules provide MPI compiler environments with Slurm integration, allowing srun to be used in place of mpirun:

Environment        Compiler      MPI Library     Module
GNU + OpenMPI      GCC 11.5      OpenMPI 4.1.8   openmpi-gcc/openmpi4.1.8-gcc11.5.0-slurm
Intel + Intel MPI  Intel 2025.3  Intel MPI       PrgEnv-intel/2025.3-slurm
NVIDIA HPC SDK     NVHPC 26.1    OpenMPI         PrgEnv-nvidia/26.1-slurm

These modules are configured for Slurm's Process Management Interface (PMI), enabling efficient job launch with srun. See Parallel Jobs Guide for usage details.
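As a minimal sketch of an MPI test job launched with srun, using the GNU/OpenMPI module from the table above (the program name ./mpi_hello is a placeholder for your own MPI binary):

```shell
#!/bin/bash
#SBATCH --job-name=test_mpi
#SBATCH --output=test_mpi.out.%j
#SBATCH --ntasks=4
#SBATCH --constraint=haswell
#SBATCH --time=00:30:00

# Slurm-integrated GNU/OpenMPI environment (see module table above)
module load openmpi-gcc/openmpi4.1.8-gcc11.5.0-slurm

# srun uses Slurm's PMI to start one MPI rank per task; no mpirun needed
srun ./mpi_hello
```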

Test Job Examples

CPU job on Haswell node

#!/bin/bash
#SBATCH --job-name=test_haswell
#SBATCH --output=test.out.%j
#SBATCH --ntasks=4
#SBATCH --constraint=haswell
#SBATCH --time=00:30:00

# srun runs hostname once per task (--ntasks=4)
srun hostname
echo "Running on Haswell"

CPU job on Broadwell node

#!/bin/bash
#SBATCH --job-name=test_broadwell
#SBATCH --output=test.out.%j
#SBATCH --ntasks=4
#SBATCH --constraint=broadwell
#SBATCH --time=00:30:00

# srun runs hostname once per task (--ntasks=4)
srun hostname
echo "Running on Broadwell"

GPU job

#!/bin/bash
#SBATCH --job-name=test_gpu
#SBATCH --output=test_gpu.out.%j
#SBATCH --partition=gpu
#SBATCH --gres=gpu:rtx_2080:1
#SBATCH --ntasks=1
#SBATCH --time=00:30:00

module load cuda
nvidia-smi
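The scripts above are submitted with sbatch and monitored with squeue. A short example, assuming you saved the GPU script as test_gpu.sh (the filename is illustrative):

```shell
# Submit the batch script; sbatch prints the assigned job ID
sbatch test_gpu.sh

# List your queued and running jobs
squeue -u $USER

# Inspect the output file; %j in the script expands to the job ID
cat test_gpu.out.<JOBID>
```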

Checking Availability

View the current state of test nodes:

# Show all nodes
sinfo -N -l

# Show GPU availability
sinfo -p gpu -o "%N %G %t"
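In the sinfo output, %N is the node name, %G its generic resources (e.g. gpu:rtx_2080:4), and %t its state (idle, mix, alloc, ...). For the full record of a single test node, including CPUs, memory, Gres, and features such as haswell or broadwell, scontrol can be used (the node name here is a placeholder):

```shell
# Detailed state of one node
scontrol show node <nodename>
```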

Known Limitations

  • Limited node count - queuing may occur even for small jobs
  • Single GPU node - multi-node GPU jobs not available during testing
  • Testing environment - configurations may change without notice

Reporting Issues

If you encounter issues during testing, please report them to HPC support with:

  • Job ID
  • Batch script used
  • Error messages or unexpected behavior
  • Output from squeue -j JOBID or sacct -j JOBID
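As a sketch of how to gather that information for a report (replace JOBID with your job's ID):

```shell
# Accounting record: final state, exit code and runtime of the job
sacct -j JOBID --format=JobID,JobName,State,ExitCode,Elapsed

# For a job still pending or running: job ID, state, and pending reason
squeue -j JOBID -o "%i %T %r"
```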

Documentation

While the test environment offers limited resources, you can familiarize yourself with Slurm using our documentation: