Many workflows involve submitting multiple compute jobs with slightly different parameters. Users can create a more efficient and reproducible workflow by using LSF job arrays or shell scripts to automate submission to LSF. Some basic scripts are provided as examples.
To cancel all of your own jobs at once, for example after a faulty automated submission, use bkill 0
LSF job arrays allow a user to submit multiple jobs to LSF as defined by an array of integers. If the workflow allows for inputs, outputs, and parameters to be fully characterized by a single number, job arrays are the most efficient way of submitting multiple jobs.
In the sample batch script below, LSF will spawn 25 serial jobs, and each will execute the line source ./echo_hostname.sh $LSB_JOBINDEX. Here, $LSB_JOBINDEX is the job array index (an integer from 1 to 25) for each job.
    #!/bin/bash
    #BSUB -J My_array[1-25]     #job name AND job array
    #BSUB -n 1                  #number of cores
    #BSUB -W 00:10              #walltime limit: hh:mm
    #BSUB -o Output_%J_%I.out   #output - %J is the job-id %I is the job-array index
    #BSUB -e Error_%J_%I.err    #error - %J is the job-id %I is the job-array index
    source ./echo_hostname.sh $LSB_JOBINDEX
The script job_array_serial.sh submits 25 jobs that run the program echo_hostname.sh, which echoes the hostname of the node that it is running on and which job of the job array it is. To use, type bsub < job_array_serial.sh. The scripts and the resulting output can be viewed here:
job_array_serial.sh
echo_hostname.sh
output
To avoid copy/paste errors, please copy these files from the apps directory:
/usr/local/apps/examples/scripts/job_arrays
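The contents of echo_hostname.sh are not reproduced on this page. As a rough illustration only, a script with the behavior described above might look like the following sketch. The function name report_host and the message wording are assumptions, not the actual file; the real script should be copied from the directory above.

```shell
#!/bin/bash
# Hypothetical stand-in for echo_hostname.sh (not the actual file).
# It expects the job array index as its first argument, which is how the
# batch script passes $LSB_JOBINDEX.

report_host() {
    # $1 = job array index; "unset" is printed if no index was passed
    local index="${1:-unset}"
    # uname -n prints the network name (hostname) of the current node
    echo "Job array index ${index} is running on host $(uname -n)"
}

report_host "$1"
```

Running the sketch by hand with an argument, for example bash echo_hostname.sh 3, prints one line containing the index and the name of the current machine.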
The script job_array.sh demonstrates that multiple parallel jobs may be submitted in the same manner as multiple serial jobs. The sample code hello_omp.F90 is a hybrid MPI-OpenMP example that echoes the hostname of the node that each thread is running on. To use, first compile the sample code, and then type bsub < job_array.sh. The script job_array.sh contains instructions for compiling the sample code. The scripts and the resulting output can be viewed here:
job_array.sh
hello_omp.F90
output
To avoid copy/paste errors, please copy these files from the apps directory:
/usr/local/apps/examples/scripts/job_arrays
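Job arrays require each job to be characterized by a single integer. When the real parameters are not integers, one common approach is to use the array index as a line number in a parameter file. The sketch below assumes a hypothetical file params.txt with one parameter set per line and a hypothetical program my_program; neither is part of the provided examples.

```shell
#!/bin/bash
# Hypothetical sketch: map a job array index to a line of a parameter file.
# params.txt, my_program, and get_params are illustrative names only.

get_params() {
    # $1 = parameter file, $2 = line number (the job array index)
    local file="$1" index="$2"
    # sed -n 'Np' prints only line N of the file
    sed -n "${index}p" "$file"
}

# LSF sets LSB_JOBINDEX inside an array job; default to 1 outside of LSF
index="${LSB_JOBINDEX:-1}"

if [ -f params.txt ]; then
    params="$(get_params params.txt "$index")"
    # Print the command instead of running it; replace echo with the real call
    echo "Job ${index} would run: ./my_program ${params}"
fi
```

With this layout, the range in the #BSUB -J line should match the number of lines in the parameter file.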
If automating job submissions requires something more complex than is available via the use of job arrays as described above, job submission may be automated with batch scripts.
The script multiple_jobs.sh uses bsub to run the program run.sh, which echoes the hostname of the node that it is running on. It can also be used to test whether an LSF batch script distributes jobs to the intended hosts. The scripts and the resulting output can be viewed here:
multiple_jobs.sh
run.sh
output
To avoid copy/paste errors, please copy these files from the apps directory:
/usr/local/apps/examples/scripts/basic/
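The general pattern of submitting several jobs from a shell loop can be sketched as follows. This is not the contents of multiple_jobs.sh; the labels a, b, c, the function make_job, the SUBMIT variable, and the argument passed to run.sh are all assumptions made for illustration. By default the sketch only prints the generated job scripts, so it can be inspected safely before anything is submitted.

```shell
#!/bin/bash
# Hypothetical sketch of loop-based submission (not the real multiple_jobs.sh).
# SUBMIT defaults to "cat", which prints each generated job script.
# Set SUBMIT=bsub to pipe each script to bsub on standard input instead.
SUBMIT="${SUBMIT:-cat}"

make_job() {
    # $1 = label used in the job name and output file names (illustrative)
    local label="$1"
    cat <<EOF
#!/bin/bash
#BSUB -J job_${label}
#BSUB -n 1
#BSUB -W 00:10
#BSUB -o out_${label}.%J
#BSUB -e err_${label}.%J
./run.sh ${label}
EOF
}

# Generate and submit (or print) one job script per label
for label in a b c; do
    make_job "$label" | $SUBMIT
done
```

Reviewing the printed scripts first, and then submitting a small number of jobs, is a cautious way to check that the jobs land on the intended hosts.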
The script R_loops.sh defines a set of years and models and then uses bsub and Rscript to run the R script codehpc.R for each scenario. The scripts and the resulting output can be viewed here:
R_loops.sh
codehpc.R
output
To avoid copy/paste errors, please copy these files from the apps directory:
/usr/local/apps/examples/scripts/R_loops/
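A nested loop over years and models might be structured like the sketch below. It is not the contents of R_loops.sh: the year and model values, the resource requests, the function submit_scenario, and the assumption that codehpc.R reads the year and model as command-line arguments are all illustrative. The sketch prints each bsub command rather than running it unless SUBMIT is set to bsub.

```shell
#!/bin/bash
# Hypothetical sketch of nested-loop submission (not the real R_loops.sh).
# SUBMIT defaults to "echo bsub" so commands are printed, not submitted.
SUBMIT="${SUBMIT:-echo bsub}"

# Illustrative scenario values
years="2001 2002"
models="modelA modelB"

submit_scenario() {
    # $1 = year, $2 = model; both are passed on to the R script
    local year="$1" model="$2"
    $SUBMIT -n 1 -W 30 -J "R_${model}_${year}" \
        -o "out_${model}_${year}.%J" -e "err_${model}_${year}.%J" \
        Rscript codehpc.R "$year" "$model"
}

# One job per (year, model) combination
for year in $years; do
    for model in $models; do
        submit_scenario "$year" "$model"
    done
done
```

Under these assumptions, the R script would pick up the two values with commandArgs(trailingOnly = TRUE).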