What Are Array Jobs?

Array jobs let you submit many similar jobs with a single script. Each array task runs independently with a unique index that you can use to process different input files, parameters, or data subsets.

Use array jobs when you need to:

  • Process many input files with the same program
  • Run parameter sweeps
  • Perform Monte Carlo simulations
  • Execute embarrassingly parallel workloads

Basic Syntax

#!/bin/bash
#SBATCH --job-name=my_array
#SBATCH --output=output_%A_%a.out    # %A = array job ID, %a = task index
#SBATCH --error=error_%A_%a.err
#SBATCH --array=1-100                # Create 100 tasks with indices 1-100
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# $SLURM_ARRAY_TASK_ID contains the current task index (1-100)
echo "Processing task $SLURM_ARRAY_TASK_ID"

./myprogram input_${SLURM_ARRAY_TASK_ID}.dat

Array Index Patterns

Pattern              Indices Created           Use Case
--array=1-100        1, 2, 3, ... 100          Simple sequential range
--array=0-99         0, 1, 2, ... 99           Zero-based indexing
--array=1,3,5,7      1, 3, 5, 7                Specific values
--array=1-100:2      1, 3, 5, ... 99           Step of 2 (odd numbers)
--array=0-100:10     0, 10, 20, ... 100        Step of 10
--array=1-1000%50    1-1000, max 50 running    Limit concurrent tasks

Environment Variables

Variable                    Description             Example Value
$SLURM_ARRAY_JOB_ID         Main array job ID       123456
$SLURM_ARRAY_TASK_ID        Current task index      42
$SLURM_ARRAY_TASK_COUNT     Total number of tasks   100
$SLURM_ARRAY_TASK_MIN       Minimum task index      1
$SLURM_ARRAY_TASK_MAX       Maximum task index      100
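Taken together, these variables let each task compute its own slice of a dataset without hard-coding the array size. A minimal sketch — the TOTAL record count, the data file name, and the defaults for running outside Slurm are all assumptions:

```shell
#!/bin/bash
# Defaults so the script can be tested outside Slurm (assumed values)
: "${SLURM_ARRAY_TASK_ID:=1}"
: "${SLURM_ARRAY_TASK_MIN:=1}"
: "${SLURM_ARRAY_TASK_MAX:=10}"
: "${SLURM_ARRAY_TASK_COUNT:=10}"

TOTAL=100000                                    # assumed total record count
CHUNK=$(( TOTAL / SLURM_ARRAY_TASK_COUNT ))
START=$(( (SLURM_ARRAY_TASK_ID - SLURM_ARRAY_TASK_MIN) * CHUNK + 1 ))
END=$(( START + CHUNK - 1 ))

# The highest-numbered task absorbs any remainder
if [ "$SLURM_ARRAY_TASK_ID" -eq "$SLURM_ARRAY_TASK_MAX" ]; then
    END=$TOTAL
fi

echo "Task $SLURM_ARRAY_TASK_ID processes records $START-$END"
# e.g. extract just this slice (big_data.txt is a placeholder):
# sed -n "${START},${END}p" big_data.txt > "chunk_${SLURM_ARRAY_TASK_ID}.txt"
```

Because the split is computed from $SLURM_ARRAY_TASK_COUNT, changing --array later does not require editing the script.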

Output File Naming

Use these placeholders in --output and --error:

  • %A - Array job ID (same for all tasks)
  • %a - Array task index (unique per task)
  • %j - Individual job ID (unique per task)
#SBATCH --output=results/job_%A_task_%a.out
#SBATCH --error=results/job_%A_task_%a.err

Example: Processing Multiple Input Files

#!/bin/bash
#SBATCH --job-name=process_files
#SBATCH --output=logs/process_%A_%a.out
#SBATCH --array=1-50
#SBATCH --ntasks=1
#SBATCH --time=02:00:00

# Process file corresponding to this task index
INPUT_FILE="data/sample_${SLURM_ARRAY_TASK_ID}.csv"
OUTPUT_FILE="results/output_${SLURM_ARRAY_TASK_ID}.csv"

./analyze.py --input "$INPUT_FILE" --output "$OUTPUT_FILE"

Example: Parameter Sweep

#!/bin/bash
#SBATCH --job-name=param_sweep
#SBATCH --output=sweep_%A_%a.out
#SBATCH --array=0-99
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Define parameter values
TEMPS=(100 200 300 400 500 600 700 800 900 1000)
PRESSURES=(1 2 3 4 5 6 7 8 9 10)

# Map the task index onto the 10x10 (temperature, pressure) grid:
# index = TEMP_IDX * 10 + PRESS_IDX, e.g. task 37 -> TEMP_IDX=3, PRESS_IDX=7
TEMP_IDX=$((SLURM_ARRAY_TASK_ID / 10))
PRESS_IDX=$((SLURM_ARRAY_TASK_ID % 10))

TEMP=${TEMPS[$TEMP_IDX]}
PRESSURE=${PRESSURES[$PRESS_IDX]}

echo "Running simulation: T=$TEMP, P=$PRESSURE"
./simulate --temp "$TEMP" --pressure "$PRESSURE" --output "result_${TEMP}_${PRESSURE}.dat"

Example: Using a File List

#!/bin/bash
#SBATCH --job-name=filelist
#SBATCH --output=logs/%A_%a.out
#SBATCH --array=1-100
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# Read the Nth line from a file list
INPUT_FILE=$(sed -n "${SLURM_ARRAY_TASK_ID}p" file_list.txt)

echo "Processing: $INPUT_FILE"
./process.py "$INPUT_FILE"

Create file_list.txt with one filename per line:

$ ls data/*.dat > file_list.txt
$ wc -l file_list.txt    # Check how many files
100 file_list.txt
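If the array range is larger than the file list (for example, the list was regenerated after submission), tasks past the last line read an empty filename. A defensive check that could go at the top of the script above — the default index for testing outside Slurm is an assumption:

```shell
#!/bin/bash
: "${SLURM_ARRAY_TASK_ID:=1}"   # default so the check can run outside Slurm

# Skip cleanly when the task index exceeds the number of listed files
N_FILES=$(wc -l < file_list.txt 2>/dev/null || echo 0)
if [ "$SLURM_ARRAY_TASK_ID" -gt "$N_FILES" ]; then
    echo "No input for task $SLURM_ARRAY_TASK_ID (only $N_FILES files listed)"
    exit 0
fi
```

Exiting 0 keeps the surplus tasks from being counted as failures when you audit the array with sacct.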

Limiting Concurrent Tasks

Use %N to limit how many array tasks run simultaneously:

#SBATCH --array=1-1000%50    # Max 50 tasks running at once

This is useful when:

  • Tasks share a limited resource (database, license server)
  • You don't want to dominate the queue
  • Tasks write to shared storage and you want to limit I/O
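The throttle can also be adjusted after submission with scontrol — the job ID 123456 and the new limit are placeholders:

```shell
# Lower the concurrency limit of a running array job to 20 tasks
scontrol update JobId=123456 ArrayTaskThrottle=20
```

This is handy when a shared resource turns out to be more loaded than expected and you want to back off without cancelling the array.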

Managing Array Jobs

Check status

# View all tasks
squeue -j JOBID

# One line per task instead of the condensed range display
squeue -r -j JOBID

# View specific task
squeue -j JOBID_42

Cancel array jobs

# Cancel entire array
scancel JOBID

# Cancel specific task
scancel JOBID_42

# Cancel range of tasks
scancel JOBID_[1-50]

Hold/release tasks

# Hold remaining tasks
scontrol hold JOBID

# Release held tasks
scontrol release JOBID

Rerunning Failed Tasks

If some tasks fail, you can rerun only those tasks:

# Check which tasks failed
sacct -j JOBID --format=JobID,State | grep FAILED

# Resubmit only failed tasks
#SBATCH --array=5,17,42,89    # List only the failed indices
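One way to collect those indices automatically is to parse sacct's machine-readable output; a sketch, assuming the default parent_index JobID format (123456 and array_job.sh are placeholders):

```shell
# Extract the indices of FAILED array tasks as a comma-separated list.
# The regex keeps only rows like 123456_17, skipping .batch/.extern steps.
FAILED=$(sacct -j 123456 --noheader --parsable2 --format=JobID,State \
    | awk -F'|' '$1 ~ /^[0-9]+_[0-9]+$/ && $2 == "FAILED" {split($1, a, "_"); print a[2]}' \
    | paste -sd, -)

echo "Failed tasks: $FAILED"
# Resubmit just those indices, overriding the script's --array line:
# sbatch --array="$FAILED" array_job.sh
```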

Combining with Job Dependencies

Run a job after all array tasks complete:

# Submit array job
$ sbatch array_job.sh
Submitted batch job 123456

# Submit post-processing job that waits for all tasks
$ sbatch --dependency=afterok:123456 postprocess.sh
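sbatch's --parsable flag prints just the job ID, which makes the dependency easy to script; when the follow-up job is itself an array, aftercorr links the jobs task by task. The script names below are placeholders:

```shell
# Capture the array job ID for use in the dependency
jid=$(sbatch --parsable array_job.sh)

# afterok: postprocess starts only after ALL tasks succeed
sbatch --dependency=afterok:"$jid" postprocess.sh

# aftercorr: task N of the second array starts as soon as
# task N of the first array completes successfully
sbatch --dependency=aftercorr:"$jid" stage2_array.sh
```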

Best Practices

  • Create output directories first: Array jobs may fail if output directories don't exist
    mkdir -p logs results
    sbatch array_job.sh
    
  • Use meaningful output filenames: Include both %A and %a to identify tasks
  • Test with small arrays first: Before submitting --array=1-10000, test with --array=1-5
  • Limit concurrent tasks: Use %N if tasks stress shared resources
  • Handle missing inputs gracefully:
    INPUT_FILE="data/input_${SLURM_ARRAY_TASK_ID}.dat"
    if [ ! -f "$INPUT_FILE" ]; then
        echo "Input file not found: $INPUT_FILE"
        exit 1
    fi
    

Further Resources