# Automating Job Submissions

Techniques for generating and submitting multiple jobs programmatically.
## When to Automate

Consider automation when you need to:

- Submit many jobs with different input files or parameters
- Generate batch scripts dynamically based on data
- Create complex job workflows with dependencies
- Run the same analysis on multiple datasets

**Note:** For simple parameter sweeps, array jobs are often easier than scripted submissions.
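For reference, a minimal array-job sketch covering the same many-inputs case (the `data/*.csv` glob and `process_one.sh` are illustrative placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=sweep
#SBATCH --output=logs/sweep_%A_%a.out
#SBATCH --array=1-25

# Slurm sets SLURM_ARRAY_TASK_ID per task; default to 1 for local testing
TASK_ID=${SLURM_ARRAY_TASK_ID:-1}

# Map the 1-based task index to a file in the 0-based glob array
FILES=(data/*.csv)
INPUT=${FILES[$((TASK_ID - 1))]}
./process_one.sh "$INPUT"
```

One submission covers all 25 inputs, and the scheduler handles the fan-out.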
## Shell Script Loops

The simplest automation is a shell loop that submits multiple jobs.

### Submit jobs for multiple input files

```bash
#!/bin/bash
# submit_all.sh - Submit a job for each input file

for file in data/*.csv; do
    filename=$(basename "$file" .csv)
    sbatch --job-name="process_${filename}" \
           --output="logs/${filename}.out" \
           --error="logs/${filename}.err" \
           process_job.sh "$file"
done
```

The batch script `process_job.sh` receives the filename as `$1`:

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

INPUT_FILE=$1
OUTPUT_FILE="results/$(basename "$INPUT_FILE" .csv)_result.csv"

./analyze.py --input "$INPUT_FILE" --output "$OUTPUT_FILE"
```
### Parameter sweep with loops

```bash
#!/bin/bash
# submit_sweep.sh - Parameter sweep: one job per combination

for temp in 100 200 300 400 500; do
    for pressure in 1 5 10 50 100; do
        sbatch --job-name="sim_T${temp}_P${pressure}" \
               --output="logs/sim_T${temp}_P${pressure}.out" \
               simulation.sh "$temp" "$pressure"
    done
done
```
## Generating Batch Scripts

For more complex jobs, generate the entire batch script dynamically.

### Shell script generator

```bash
#!/bin/bash
# generate_and_submit.sh

for i in $(seq 1 10); do
    # Generate the batch script. The heredoc delimiter is unquoted, so
    # ${i} and $((RANDOM)) expand now, at generation time: each script
    # gets a fixed, reproducible seed baked in.
    cat > job_${i}.sh << EOF
#!/bin/bash
#SBATCH --job-name=run_${i}
#SBATCH --output=logs/run_${i}.out
#SBATCH --error=logs/run_${i}.err
#SBATCH --ntasks=1
#SBATCH --time=02:00:00

echo "Running iteration ${i}"
./myprogram --iteration ${i} --seed $((RANDOM))
EOF

    # Submit the generated script
    sbatch job_${i}.sh
done
```
### Python script generator

```python
#!/usr/bin/env python3
# generate_jobs.py

import os
import subprocess

parameters = [
    {'name': 'small', 'size': 100, 'time': '01:00:00'},
    {'name': 'medium', 'size': 1000, 'time': '04:00:00'},
    {'name': 'large', 'size': 10000, 'time': '12:00:00'},
]

os.makedirs('generated_scripts', exist_ok=True)
os.makedirs('logs', exist_ok=True)

for param in parameters:
    script_content = f"""#!/bin/bash
#SBATCH --job-name={param['name']}
#SBATCH --output=logs/{param['name']}.out
#SBATCH --error=logs/{param['name']}.err
#SBATCH --ntasks=4
#SBATCH --time={param['time']}

./simulation --size {param['size']} --output results/{param['name']}.dat
"""
    script_path = f"generated_scripts/{param['name']}.sh"
    with open(script_path, 'w') as f:
        f.write(script_content)

    # Submit the job
    result = subprocess.run(['sbatch', script_path], capture_output=True, text=True)
    print(f"Submitted {param['name']}: {result.stdout.strip()}")
```
## Job Dependencies

Chain jobs together so they run in sequence.

### Linear pipeline

```bash
#!/bin/bash
# submit_pipeline.sh

# Submit first job, capture job ID
JOB1=$(sbatch --parsable preprocess.sh)
echo "Submitted preprocessing: $JOB1"

# Submit second job, depends on first
JOB2=$(sbatch --parsable --dependency=afterok:$JOB1 analyze.sh)
echo "Submitted analysis: $JOB2"

# Submit third job, depends on second
JOB3=$(sbatch --parsable --dependency=afterok:$JOB2 postprocess.sh)
echo "Submitted postprocessing: $JOB3"
```
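One caveat: if a job in an `afterok` chain fails, its dependents remain pending with reason `DependencyNeverSatisfied`. Slurm's `--kill-on-invalid-dep` option cancels them instead; a hedged sketch:

```bash
# Cancel the dependent job automatically if its dependency can never succeed
JOB2=$(sbatch --parsable --kill-on-invalid-dep=yes --dependency=afterok:$JOB1 analyze.sh)
```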
### Fan-out, fan-in pattern

```bash
#!/bin/bash
# submit_fanout.sh - Run multiple jobs, then merge results

# Submit parallel processing jobs
JOBS=""
for i in $(seq 1 10); do
    JOB=$(sbatch --parsable process_chunk.sh "$i")
    JOBS="${JOBS}:${JOB}"
done

# Remove leading colon
JOBS=${JOBS#:}
echo "Submitted processing jobs: $JOBS"

# Submit merge job that waits for all processing jobs
MERGE_JOB=$(sbatch --parsable --dependency=afterok:$JOBS merge_results.sh)
echo "Submitted merge job: $MERGE_JOB"
```
### Dependency types

| Option | Meaning |
|---|---|
| `--dependency=afterok:JOBID` | Run after JOBID completes successfully |
| `--dependency=afterany:JOBID` | Run after JOBID completes (success or failure) |
| `--dependency=afternotok:JOBID` | Run only if JOBID fails |
| `--dependency=after:JOBID` | Run after JOBID starts |
| `--dependency=singleton` | Run after all earlier jobs with the same name and user have finished |
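For example, a failure-only cleanup step plus an unconditional notification step (the script names are hypothetical):

```bash
#!/bin/bash
MAIN=$(sbatch --parsable main_job.sh)

# Runs only if the main job fails
sbatch --dependency=afternotok:$MAIN cleanup.sh

# Runs once the main job finishes, regardless of outcome
sbatch --dependency=afterany:$MAIN notify.sh
```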
## Reading Parameters from Files

For large parameter sets, read from a configuration file.

### CSV parameter file

Create `parameters.csv`:

```
name,temperature,pressure,iterations
run1,100,1.0,1000
run2,200,1.5,2000
run3,300,2.0,3000
```

Submit jobs from the CSV:

```bash
#!/bin/bash
# submit_from_csv.sh

# Skip the header line, then read each row
tail -n +2 parameters.csv | while IFS=, read -r name temp pressure iters; do
    sbatch --job-name="$name" \
           --output="logs/${name}.out" \
           --export=ALL,TEMP="$temp",PRESSURE="$pressure",ITERATIONS="$iters" \
           simulation.sh
done
```
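On the receiving side, the exported names appear as ordinary environment variables inside the batch script. A sketch of a matching `simulation.sh` (the `./simulate` flags are illustrative):

```bash
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

# TEMP, PRESSURE, and ITERATIONS arrive via --export from the submit loop
echo "Running ${ITERATIONS:-0} iterations at T=${TEMP:-?}, P=${PRESSURE:-?}"
./simulate --temperature "$TEMP" --pressure "$PRESSURE" --iterations "$ITERATIONS"
```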
### Python with CSV

```python
#!/usr/bin/env python3

import csv
import subprocess

with open('parameters.csv') as f:
    reader = csv.DictReader(f)
    for row in reader:
        cmd = [
            'sbatch',
            f'--job-name={row["name"]}',
            f'--output=logs/{row["name"]}.out',
            f'--export=ALL,TEMP={row["temperature"]},PRESSURE={row["pressure"]}',
            'simulation.sh',
        ]
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f'{row["name"]}: {result.stdout.strip()}')
```
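`sbatch` reports failures (for example, hitting a QOS limit) on stderr with a nonzero exit code, which the loop above silently ignores. A small hedged helper that surfaces them (the `submit` function is illustrative, not part of any Slurm API):

```python
import subprocess

def submit(cmd):
    """Run a submission command and return its stdout, raising on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"submission failed: {result.stderr.strip()}")
    # With --parsable, stdout is the job ID (optionally 'jobid;cluster')
    return result.stdout.strip().split(';')[0]
```

Call it as `submit(['sbatch', '--parsable', *flags, 'simulation.sh'])` and log the returned job ID.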
## Conditional Submissions

Submit jobs based on conditions:

```bash
#!/bin/bash
# submit_missing.sh - Only submit jobs for missing output files

for input in data/*.dat; do
    base=$(basename "$input" .dat)
    output="results/${base}_result.dat"
    if [ ! -f "$output" ]; then
        echo "Submitting job for $base (output missing)"
        sbatch --job-name="$base" process.sh "$input"
    else
        echo "Skipping $base (output exists)"
    fi
done
```
## Tracking Submitted Jobs

Keep a log of submitted jobs:

```bash
#!/bin/bash
# submit_with_log.sh

LOGFILE="submission_log_$(date +%Y%m%d_%H%M%S).txt"
echo "Submission started: $(date)" > "$LOGFILE"

for file in data/*.csv; do
    JOBID=$(sbatch --parsable process.sh "$file")
    echo "$JOBID,$file,$(date +%Y-%m-%d_%H:%M:%S)" >> "$LOGFILE"
done

echo "Submission complete: $(date)" >> "$LOGFILE"
echo "Jobs logged to $LOGFILE"
```
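The log can later be replayed against the accounting database to check outcomes; a sketch using standard `sacct` options (`-X` limits output to the allocation line):

```bash
#!/bin/bash
# check_log.sh - Report the state of every job recorded in a submission log

LOGFILE=$1

# Job lines start with a numeric job ID; skip the started/complete banners
grep -E '^[0-9]+,' "$LOGFILE" | while IFS=, read -r jobid file submitted; do
    state=$(sacct -j "$jobid" --format=State --noheader -X | awk '{print $1}')
    echo "$jobid ($file): ${state:-unknown}"
done
```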
## Rate Limiting

Avoid overwhelming the scheduler with too many submissions at once:

```bash
#!/bin/bash
# submit_with_delay.sh

for file in data/*.csv; do
    sbatch process.sh "$file"
    sleep 0.5  # Half-second delay between submissions
done
```

For very large submissions (1000+ jobs), consider using array jobs with a concurrency limit (`--array=1-1000%50`) instead.
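A sketch of that throttled array approach, pulling one input per task from a pre-built list (`filelist.txt` and `process.sh` are placeholders):

```bash
#!/bin/bash
#SBATCH --job-name=bigrun
#SBATCH --output=logs/bigrun_%A_%a.out
#SBATCH --array=1-1000%50   # at most 50 tasks run at once

# Each task processes the line of filelist.txt matching its array index
INPUT=$(sed -n "${SLURM_ARRAY_TASK_ID:-1}p" filelist.txt)
./process.sh "$INPUT"
```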
## Best Practices

- **Create directories first:** Ensure log and output directories exist before submitting:

  ```bash
  mkdir -p logs results
  ./submit_all.sh
  ```

- **Test with one job:** Verify your script works before submitting hundreds of jobs
- **Use `--parsable`:** When capturing job IDs, use `sbatch --parsable` for clean output
- **Quote variables:** Always quote file paths and parameters to handle spaces correctly
- **Prefer array jobs:** For simple parameter sweeps, array jobs are more efficient than scripted loops
- **Check queue limits:** Be aware of QOS limits on concurrent jobs
- **Keep submission scripts:** Save your submission scripts for reproducibility
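The testing step can be scripted directly: `sbatch --test-only` validates a batch script and estimates a start time without actually submitting it (file names here are placeholders):

```bash
# Validate the batch script without submitting a job
sbatch --test-only process_job.sh

# Then do a single real trial before launching the full loop
sbatch process_job.sh data/first_sample.csv
```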
## Further Resources

- Array Jobs - Simpler approach for many similar jobs
- Submission FAQ
- Batch Script Template