This page describes how to use resource strings to request the appropriate hardware from LSF, along with other elements needed to use the cluster efficiently.
Resource strings are used to ensure that a job will be placed on nodes that have the resources the job requires. There are resources that are native to LSF, such as the number of cores or memory needed, and there are resources that have been defined locally for the Hazel cluster, such as model name or node type.
Resource strings may contain a number of sections, such as span, rusage, and select, which are described below.
To specify a queue, use -q queue_name.
In general, users should not specify a queue. When no queue is specified, LSF chooses the most appropriate queue from the set of default queues, based on the number of cores and the time requested.
The exceptions are partner queues and specialty queues, which are queues with special resources.
The queues available to a user can be displayed by using bqueues -u user_name.
The properties of a queue can be displayed by using bqueues -l queue_name.
Job priority is determined by several factors including fair share priority, queue priority, and time of submission.
Shared memory jobs must be placed on a single node, or host. Some memory intensive MPI jobs or hybrid parallel jobs must limit the number of tasks per node.
To specify a span type, use -R "span[span_type=#number]".
Span type | Description |
---|---|
hosts | Maximum number of hosts to confine tasks |
ptile | Maximum number of tasks per host |
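As a rough sketch of how ptile relates to the number of hosts a job occupies: the host count is the task count divided by the tasks per host, rounded up. The helper name nodes_needed is hypothetical and is not an LSF command.

```shell
#!/bin/sh
# nodes_needed TASKS PTILE
# Hypothetical helper (not part of LSF): prints how many hosts a job
# occupies when TASKS tasks are placed at most PTILE per host.
nodes_needed() {
    tasks=$1
    ptile=$2
    # Integer ceiling division: (a + b - 1) / b
    echo $(( (tasks + ptile - 1) / ptile ))
}

nodes_needed 128 32   # 128 tasks, 32 per host
nodes_needed 50 25    # 50 tasks, 25 per host
```

The same arithmetic can be used to check that a chosen -n and ptile pair fills whole nodes before submitting.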
A usage string reserves resources for a job, for example a larger amount of memory. Usage is per host. See the generic LSF template for syntax information.
To specify a usage type, use -R "rusage[usage_type=#number]".
Usage type | Description |
---|---|
mem | Memory requirements |
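Several sections can be combined in a single -R string, separated by spaces. The sketch below assembles one and prints the resulting bsub command rather than submitting it. The helper build_resreq is hypothetical, and the value mem=64 is purely illustrative: the unit of a bare number depends on how LSF is configured on the cluster, so check the local documentation before using it.

```shell
#!/bin/sh
# build_resreq SELECT RUSAGE SPAN
# Hypothetical helper (not an LSF command): joins the non-empty sections
# into one resource string, separated by single spaces.
build_resreq() {
    out=""
    if [ -n "$1" ]; then out="select[$1]"; fi
    if [ -n "$2" ]; then out="${out:+$out }rusage[$2]"; fi
    if [ -n "$3" ]; then out="${out:+$out }span[$3]"; fi
    echo "$out"
}

# mem=64 is an illustrative value only; its unit is set by the cluster's
# LSF configuration.
resreq=$(build_resreq "stc" "mem=64" "ptile=32")

# Print the command instead of running it, so it can be reviewed first.
echo "bsub -n 128 -W 120 -R \"$resreq\" < job_script_name"
```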
To specify a resource, use -R "select[Resource_type]".
LSF will not show an error if the user specifies a combination of resources that does not exist. For example, -R "select[hc model==Gold6130]" would result in the job pending indefinitely, because the Gold6130 model processors have 16 cores while hc requests processors with six cores.
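One way to catch such a mismatch before submitting is to ask LSF which hosts satisfy a selection. The sketch below wraps the LSF bhosts command, whose -R option restricts the listing to hosts that meet a resource requirement. The wrapper name hosts_matching is hypothetical, and the exact message and exit status that bhosts gives for an unsatisfiable selection are not covered here, so treat the fallback message as an assumption.

```shell
#!/bin/sh
# hosts_matching SELECT_EXPR
# Hypothetical wrapper around the LSF bhosts command, which can list only
# the hosts that satisfy a resource requirement (-R).
hosts_matching() {
    if ! command -v bhosts >/dev/null 2>&1; then
        echo "bhosts not found: run this on a cluster login node" >&2
        return 127
    fi
    bhosts -R "select[$1]"
}

# Check a selection before using it in a bsub -R string.
# stc and Gold6130 are consistent: both describe sixteen-core processors.
hosts_matching "stc && model==Gold6130" ||
    echo "no hosts reported for this selection"
```

If a job is already pending, bjobs -p lists pending jobs together with the reasons LSF gives for not starting them.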
The following is a list of the types of resources available on Hazel and a description of each.
The required number of cores per node is specified through the number of cores per processor. Each node has two Intel Xeon processors of the same model. A specification of stc would select a node with two sixteen-core processors, i.e., a node with 32 cores.
Resource type | Description |
---|---|
oc | Processor model with eight (octa) cores |
tc | Processor model with ten cores |
twc | Processor model with twelve cores |
stc | Processor model with sixteen cores |
ttc | Processor model with thirty-two cores |
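Because every node has two processors, the cores available per node are twice the per-processor count in the table above. A minimal sketch of that lookup follows; the function name cores_per_node is hypothetical.

```shell
#!/bin/sh
# cores_per_node RESOURCE_TYPE
# Hypothetical helper: maps a Hazel core-count resource name to the total
# cores on a node, assuming two identical processors per node.
cores_per_node() {
    case "$1" in
        oc)  per_processor=8 ;;
        tc)  per_processor=10 ;;
        twc) per_processor=12 ;;
        stc) per_processor=16 ;;
        ttc) per_processor=32 ;;
        *)   echo "unknown resource type: $1" >&2; return 1 ;;
    esac
    echo $(( 2 * per_processor ))
}

cores_per_node stc   # two sixteen-core processors
cores_per_node ttc   # two thirty-two-core processors
```

The result is a natural value for ptile when a job should fill each node completely.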
Software compiled on one type of architecture may not run on another type of architecture, resulting in an error of illegal instruction. LSF resources may be used to specify the instruction set architecture (ISA).
Resource type | Description |
---|---|
sse | Processor model with SSE instructions |
sse2 | Processor model with SSE2 instructions |
ssse3 | Processor model with SSSE3 instructions |
sse4_1 | Processor model with SSE4.1 instructions |
sse4_2 | Processor model with SSE4.2 instructions |
avx | Processor model with AVX instructions |
avx2 | Processor model with AVX2 instructions |
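To find out which of these instruction sets the build machine supports, the flags line of /proc/cpuinfo can be inspected. The sketch below assumes a Linux x86 node, where the kernel's flag names (sse4_2, avx, avx2, and so on) match the resource names in the table. The helper has_isa is hypothetical.

```shell
#!/bin/sh
# has_isa FLAG [CPUINFO_FILE]
# Hypothetical helper: succeeds if FLAG appears as a whole word on the
# first "flags" line of the cpuinfo file (default: /proc/cpuinfo).
has_isa() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    # -w matches whole words only, so "avx" does not match "avx2".
    grep -m1 '^flags' "$file" 2>/dev/null | grep -qw -- "$flag"
}

for isa in sse4_2 avx avx2; do
    if has_isa "$isa"; then
        echo "$isa: supported on this node"
    else
        echo "$isa: not reported on this node"
    fi
done
```

Software built on a node that reports avx2 can then be submitted with -R "select[avx2]" so that it is not placed on an older processor.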
As with the ISA compatibility issues described above, a given piece of software may not be compatible with every model of GPU.
Resource type | Description |
---|---|
rtx2080 | Node with attached Nvidia RTX 2080 GPU |
gtx1080 | Node with attached Nvidia GTX 1080 GPU |
p100 | Node with attached Nvidia P100 GPU |
a10 | Node with attached Nvidia A10 GPU |
a30 | Node with attached Nvidia A30 GPU |
a100 | Node with attached Nvidia A100 GPU |
l40 | Node with attached Nvidia L40 GPU |
l40s | Node with attached Nvidia L40S GPU |
h100 | Node with attached Nvidia H100 GPU |
The type of interconnect may be specified.
LSF will not show an error if a job is placed in a queue that does not contain the specified interconnect. For example, when using ib, the job must be placed in a queue containing nodes with InfiniBand. Queues available to all users that have ib include standard_ib and mixed_ib.
Resource type | Description |
---|---|
ib | InfiniBand |
e10G | 10 Gigabit Ethernet |
The following model definitions are used for Hazel nodes. They correspond to the Intel Xeon model numbers of the processors on the nodes. Each node has two Intel Xeon processors of the same model. Here is a site with filter and search capabilities that lists processor model specifications.
To request a specific processor model, use -R "select[model==model_number]".
The following is a list of the model numbers currently available on Hazel.
E52650v3 | E52650v4 | Gold6130 | Gold6226R | Gold6326 |
Plat8358 | Plat8462Y |
Run a job with 128 tasks (-n 128) on four nodes with 32 cores per
node.
bsub -n 128 -W 120 -R "select[stc] span[ptile=32]" < job_script_name
or, omitting the select keyword (a shorthand LSF allows when the selection string comes first):
bsub -n 128 -W 120 -R "stc span[ptile=32]" < job_script_name
Nodes have two processors and the resource name defined for nodes
with sixteen-core processors is stc. This job would fully occupy 4 nodes.
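The same request can also be written as #BSUB directives at the top of the job script, which keeps the resource string alongside the commands it applies to. The sketch below is an assumption-laden outline, not the site's template: the output file names are arbitrary, and the echo line stands in for the real program launch.

```shell
#!/bin/bash
# Total number of tasks
#BSUB -n 128
# Wall-clock limit in minutes
#BSUB -W 120
# 32 tasks on each node with two sixteen-core processors
#BSUB -R "select[stc] span[ptile=32]"
# Standard output and error files; %J expands to the job ID
#BSUB -o out.%J
#BSUB -e err.%J

# LSB_JOBID is set by LSF inside a running job; the fallback value only
# covers running this script by hand outside LSF.
job_id=${LSB_JOBID:-not-in-LSF}
echo "job $job_id started on $(uname -n)"

# Placeholder: replace the echo above with the actual program launch.
```

With the directives in the script, the submission reduces to bsub < job_script_name.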
Run a job with 50 tasks with the tasks distributed 25 per node.
bsub -n 50 -W 200 -R "span[ptile=25]" < job_script_name
This resource string does not specify anything about the node selection
criteria beyond needing 25 cores on each node. If the job were scheduled
on nodes with 32 cores per node it is possible that LSF would schedule
other jobs on the nodes being used for this job to occupy the remaining
cores. In general, it is desirable to fully utilize nodes to avoid potential
contention from other jobs.
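The arithmetic behind this example can be worked through as follows, assuming the job lands on nodes with 32 cores each as in the scenario above.

```shell
#!/bin/sh
# Worked example: 50 tasks at 25 per node, on nodes assumed to have 32 cores.
tasks=50
ptile=25
cores_per_node=32

# Hosts needed, rounded up.
nodes=$(( (tasks + ptile - 1) / ptile ))
# Cores on each host that this job leaves unused.
idle_per_node=$(( cores_per_node - ptile ))
idle_total=$(( nodes * idle_per_node ))

echo "nodes used: $nodes"
echo "idle cores per node: $idle_per_node"
echo "idle cores in total: $idle_total"
```

The idle cores are the ones LSF may hand to other jobs, which is where the potential contention comes from.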
More examples
See the generic template for creating a detailed batch script for more information and examples about how to specify LSF resources.
To find the specifications of a node, run the following commands on that node:
uname -a                  # OS and kernel info
cat /etc/redhat-release   # Linux distribution
lscpu                     # CPU info
cat /proc/cpuinfo         # processor info
cat /proc/meminfo         # memory info
nvidia-smi                # GPU info