# Slurm Deployment Status

Current status of Slurm testing and deployment.

**Testing Phase:** Slurm is currently in testing with limited hardware. Production deployment will follow successful testing.
## Current Test Environment

The following resources are available for Slurm testing:
### Compute Nodes
| Node Type | CPU Architecture | Quantity | Constraint |
|---|---|---|---|
| CPU Node | Intel Xeon E5 v3 (Haswell) | 4 | `--constraint=haswell` |
| CPU Node | Intel Xeon E5 v4 (Broadwell) | 4 | `--constraint=broadwell` |
### GPU Nodes
| Node Type | GPUs | Quantity | Request |
|---|---|---|---|
| GPU Node | 4x NVIDIA RTX 2080 (8 GB each) | 1 | `--partition=gpu --gres=gpu:rtx_2080:N` (N = number of GPUs, 1–4) |
## MPI Modules with Slurm Integration

The following modules provide MPI compilers with Slurm integration, allowing `srun` to be used in place of `mpirun`:
| Environment | Compiler | MPI Library | Module |
|---|---|---|---|
| GNU + OpenMPI | GCC 11.5 | OpenMPI 4.1.8 | `openmpi-gcc/openmpi4.1.8-gcc11.5.0-slurm` |
| Intel + Intel MPI | Intel 2025.3 | Intel MPI | `PrgEnv-intel/2025.3-slurm` |
| NVIDIA HPC SDK | NVHPC 26.1 | OpenMPI | `PrgEnv-nvidia/26.1-slurm` |
These modules are configured for Slurm's Process Management Interface (PMI), enabling efficient job launch with `srun`. See the Parallel Jobs Guide for usage details.
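As a sketch of how these modules are intended to be used, the following batch script loads the GNU + OpenMPI toolchain from the table above and launches an MPI program with `srun`. The program name `./my_mpi_app` and the task count are placeholders, not part of the test environment:

```shell
#!/bin/bash
#SBATCH --job-name=mpi_test
#SBATCH --output=mpi_test.out.%j
#SBATCH --ntasks=8
#SBATCH --constraint=broadwell
#SBATCH --time=00:30:00

# Load the Slurm-integrated OpenMPI toolchain listed above
module load openmpi-gcc/openmpi4.1.8-gcc11.5.0-slurm

# srun uses PMI to start one process per allocated task,
# so no -np flag or hostfile is needed
srun ./my_mpi_app
```

Because `srun` reads the task count and node placement directly from the Slurm allocation, the same script works unchanged if `--ntasks` is adjusted.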
## Test Job Examples

### CPU job on a Haswell node

```bash
#!/bin/bash
#SBATCH --job-name=test_haswell
#SBATCH --output=test.out.%j
#SBATCH --ntasks=4
#SBATCH --constraint=haswell
#SBATCH --time=00:30:00

hostname
echo "Running on Haswell"
```
### CPU job on a Broadwell node

```bash
#!/bin/bash
#SBATCH --job-name=test_broadwell
#SBATCH --output=test.out.%j
#SBATCH --ntasks=4
#SBATCH --constraint=broadwell
#SBATCH --time=00:30:00

hostname
echo "Running on Broadwell"
```
### GPU job

```bash
#!/bin/bash
#SBATCH --job-name=test_gpu
#SBATCH --output=test_gpu.out.%j
#SBATCH --partition=gpu
#SBATCH --gres=gpu:rtx_2080:1
#SBATCH --ntasks=1
#SBATCH --time=00:30:00

module load cuda
nvidia-smi
```
## Checking Availability

View the current state of the test nodes:

```bash
# Show all nodes
sinfo -N -l

# Show GPU availability
sinfo -p gpu -o "%N %G %t"
```
## Known Limitations

- **Limited node count:** queuing may occur even for small jobs
- **Single GPU node:** multi-node GPU jobs are not available during testing
- **Testing environment:** configurations may change without notice
## Reporting Issues
If you encounter issues during testing, please report them to HPC support with:
- Job ID
- Batch script used
- Error messages or unexpected behavior
- Output from `squeue -j JOBID` or `sacct -j JOBID`
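The snippet below shows one way to collect that output before filing a report. `JOBID` is a placeholder for your actual job ID, and the `--format` fields are a suggested subset, not a required set:

```shell
# Queue state while the job is pending or running
squeue -j JOBID

# Accounting record after the job has finished
sacct -j JOBID --format=JobID,JobName,State,ExitCode,Elapsed
```

Including both commands is useful because `squeue` only shows jobs still in the queue, while `sacct` reports the final state and exit code of completed jobs.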
## Documentation
While testing with limited resources, you can familiarize yourself with Slurm using our documentation:
- Running Jobs with Slurm - Getting started guide
- Batch Script Template - Generic job script
- Submission FAQ - Common questions
- Migrating from LSF - For current LSF users