Job Priority and Fairshare
How Slurm determines which jobs run first.
How Job Priority Works
When multiple jobs are waiting for resources, Slurm uses a priority score to decide which job runs first. Higher-priority jobs are scheduled before lower-priority jobs.
The priority score is calculated from several factors:
- Fairshare: Your recent usage compared to your allocation
- QOS priority: Bonus from Quality of Service settings
- Job age: How long the job has been waiting
- Job size: Number of resources requested
- Partition priority: Priority assigned to the partition
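As a rough illustration of how these factors combine: Slurm's multifactor priority plugin scales each factor to a value between 0 and 1, multiplies it by a site-configured weight, and adds the results. The sketch below assumes hypothetical weights and a hypothetical helper named job_priority. It leaves out other terms Slurm can include (such as nice adjustments), so treat it as an approximation, not this cluster's configuration.

```shell
#!/bin/bash
# Sketch of a weighted-sum priority score.
# The weights are hypothetical; `sprio -w` shows the real ones.
# Arguments: age, fairshare, job size, partition, QOS factors (each 0 to 1).
job_priority() {
    awk -v a="$1" -v f="$2" -v j="$3" -v p="$4" -v q="$5" '
        BEGIN {
            w_age = 1000; w_fs = 10000; w_size = 0; w_part = 0; w_qos = 5000
            printf "%.0f\n", w_age*a + w_fs*f + w_size*j + w_part*p + w_qos*q
        }'
}

# A fully aged job with fairshare factor 0.9 and QOS factor 0.1
job_priority 1.0 0.9 0.0 0.0 0.1
```

With these assumed weights the example prints 10500: 1000 from age, 9000 from fairshare, and 500 from QOS.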
Fairshare
Fairshare is the primary factor in job priority. It ensures that all users get their fair portion of cluster resources over time.
How fairshare works
- Each account has a share allocation representing its portion of the cluster
- Slurm tracks your recent usage (CPU-hours consumed)
- If you've used less than your share, your priority increases
- If you've used more than your share, your priority decreases
- Usage decays over time, so heavy past usage won't penalize you forever
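To make the decay concrete, the sketch below assumes usage is halved once per half-life period, which is how Slurm's PriorityDecayHalfLife setting is usually described. The helper name decayed_usage and the 7-day half-life are assumptions for illustration; the value configured on this cluster may differ.

```shell
#!/bin/bash
# decayed_usage RAW_USAGE ELAPSED_DAYS HALF_LIFE_DAYS
# Prints how much of a past usage amount still counts after ELAPSED_DAYS.
decayed_usage() {
    awk -v u="$1" -v t="$2" -v h="$3" \
        'BEGIN { printf "%.0f\n", u * exp(-log(2) * t / h) }'
}

# 150000 units of usage, 14 days later, assumed 7-day half-life
decayed_usage 150000 14 7
```

After two half-lives only a quarter of the original usage still counts, so the example prints 37500.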
Fairshare formula
Your fairshare factor (between 0 and 1) is calculated as:
Fairshare Factor = 2^(-EffectiveUsage / ShareAllocation)
Where:
- EffectiveUsage is your recent resource consumption (with decay)
- ShareAllocation is your account's assigned share
A fairshare factor of 1.0 means you haven't used any resources recently (highest priority). A factor near 0 means you've used significantly more than your share (lowest priority).
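The formula can be tried out with a small helper. This is a sketch only: the function name fairshare_factor is hypothetical, and both arguments are assumed to be fractions of the whole cluster. The value reported by sshare can differ if the site uses a different fairshare algorithm or a damping factor.

```shell
#!/bin/bash
# fairshare_factor EFFECTIVE_USAGE SHARE_ALLOCATION
# Evaluates 2^(-EffectiveUsage / ShareAllocation).
fairshare_factor() {
    awk -v u="$1" -v s="$2" \
        'BEGIN { printf "%.4f\n", exp(-log(2) * u / s) }'
}

fairshare_factor 0     0.005   # no recent usage
fairshare_factor 0.005 0.005   # usage equal to the share
fairshare_factor 0.010 0.005   # usage twice the share
```

The three calls print 1.0000, 0.5000, and 0.2500: the factor halves each time usage grows by one full share.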
Check your fairshare
$ sshare -u $USER
Account User RawShares NormShares RawUsage EffectvUsage FairShare
-------------------- ---------- ---------- ----------- ----------- ------------- ----------
myproject unityID 100 0.005000 150000 0.003500 0.750000
Key columns:
- NormShares: Your share as a fraction of total cluster shares
- EffectvUsage: Your effective usage as a fraction of total usage
- FairShare: Your fairshare factor (higher is better)
View detailed fairshare
# Show full account hierarchy
sshare -a
# Show specific account
sshare -A myproject
# Show all users in your account
sshare -A myproject -a
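For scripting, sshare can also produce pipe-delimited output. The sketch below assumes the --noheader, --parsable2, and --format options behave as described in the sshare manual page; the helper name summarize_sshare is hypothetical. The live query only runs where sshare is installed.

```shell
#!/bin/bash
# Reads pipe-delimited rows of Account|User|NormShares|EffectvUsage|FairShare
# on stdin and prints "account fairshare" for rows that name a user.
summarize_sshare() {
    awk -F'|' 'NF >= 5 && $2 != "" { printf "%s %s\n", $1, $5 }'
}

# Live query, attempted only if sshare is available on this machine
if command -v sshare >/dev/null 2>&1; then
    sshare -u "$USER" --noheader --parsable2 \
        --format=Account,User,NormShares,EffectvUsage,FairShare | summarize_sshare
fi
```

Rows with an empty User field are account-level summaries, so the helper skips them.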
QOS Priority
Each QOS has a priority value that adds to your job's base priority:
| QOS | Priority Bonus | Effect |
|---|---|---|
| normal | 0 | Standard priority |
| short | 10 | Higher priority for short jobs |
| gpu | 0 | Standard priority for GPU jobs |
| short_gpu | 10 | Higher priority for short GPU jobs |
| Partner QOS | Varies | Priority based on partner allocation |
The short and short_gpu QOS provide a priority boost for jobs under 2 hours, plus access to idle partner hardware.
Job Age
Jobs gain priority the longer they wait in the queue. This prevents starvation, where low-priority jobs would otherwise never run.
The age factor increases linearly up to a maximum. After reaching maximum age priority, the job won't gain additional priority from waiting.
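A minimal sketch of a linear age factor with a cap is shown below. The 7-day maximum is an assumption for illustration (Slurm's setting for this is PriorityMaxAge, and the value on this cluster may differ), and age_factor is a hypothetical helper name.

```shell
#!/bin/bash
# age_factor WAIT_MINUTES MAX_AGE_MINUTES
# Grows linearly from 0 to 1, then stays at 1 once the maximum age is reached.
age_factor() {
    awk -v w="$1" -v m="$2" \
        'BEGIN { f = w / m; if (f > 1) f = 1; printf "%.2f\n", f }'
}

max_age=$((7 * 24 * 60))                   # assumed 7-day cap, in minutes
age_factor $((12 * 60)) "$max_age"         # waiting 12 hours
age_factor $((10 * 24 * 60)) "$max_age"    # waiting 10 days (capped)
```

The first call prints 0.07 and the second prints 1.00, since waiting beyond the cap adds nothing further.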
Checking Job Priority
View priority of pending jobs
$ sprio -u $USER
JOBID USER PRIORITY AGE FAIRSHARE QOS
123456 unityID 10500 1000 9000 500
123457 unityID 10200 700 9000 500
The PRIORITY column shows the total priority score. Higher values run first.
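To see how the total relates to its components, the sketch below ranks sprio-style rows by total priority and prints the sum of the AGE, FAIRSHARE, and QOS columns next to it. It assumes the six-column layout shown in the example above (real sprio output may include more columns), and rank_jobs is a hypothetical helper name.

```shell
#!/bin/bash
# Reads rows of "JOBID USER PRIORITY AGE FAIRSHARE QOS" (no header) on stdin,
# sorts them from highest to lowest total priority, and prints the total
# alongside the sum of the three component columns.
rank_jobs() {
    sort -k3,3nr |
        awk '{ printf "%s total=%d components=%d\n", $1, $3, $4 + $5 + $6 }'
}

printf '%s\n' \
    '123457 unityID 10200 700 9000 500' \
    '123456 unityID 10500 1000 9000 500' | rank_jobs
```

For the sample rows, job 123456 is listed first, and in both rows the components add up to the total.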
View priority breakdown
sprio -j JOBID -l
View priority factor weights
sprio -w
This shows the weights assigned to each priority factor.
Why Is My Job Waiting?
Use squeue to see why your job is pending:
$ squeue -u $USER -o "%.10i %.9P %.20j %.8u %.2t %.10M %.6D %R"
JOBID PARTITION NAME USER ST TIME NODES REASON
123456 compute analysis unityID PD 0:00 4 (Priority)
123457 compute preprocess unityID PD 0:00 1 (Resources)
Common pending reasons
| Reason | Meaning | What to do |
|---|---|---|
| Priority | Other jobs have higher priority | Wait; use short QOS for jobs under 2 hours |
| Resources | Waiting for requested resources | Wait; consider reducing resource request |
| QOSMaxCpuPerUserLimit | Hit your CPU limit for this QOS | Wait for running jobs to finish |
| QOSMaxJobsPerUserLimit | Hit your job count limit | Wait for running jobs to finish |
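| AssocGrpCPURunMinutesLimit | Your account's running jobs have reached its limit on committed CPU-minutes (cores × remaining time limit) | Wait for running jobs to progress or finish; contact HPC support if it persists |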
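
If you have many pending jobs, a tally of reasons can be easier to read than the full list. The sketch below assumes squeue's --states, --noheader, and --format="%r" options and single-word reason strings; count_reasons is a hypothetical helper name. The live query only runs where squeue is installed.

```shell
#!/bin/bash
# Reads one pending reason per line on stdin and prints "reason count",
# most frequent first.
count_reasons() {
    sort | uniq -c | sort -k1,1nr -k2,2 | awk '{ printf "%s %d\n", $2, $1 }'
}

# Live query, attempted only if squeue is available on this machine
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" --states=PENDING --noheader --format="%r" | count_reasons
fi
```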
Estimate start time
squeue -j JOBID --start
Note: Start time estimates are approximate and change as other jobs complete or are submitted.
Improving Your Priority
Use appropriate QOS
For jobs under 2 hours, use short or short_gpu for a priority boost and access to idle partner hardware.
Request only what you need
Smaller jobs are easier to schedule and consume less of your fairshare allocation:
- Request only the cores your application can use
- Set accurate time limits (shorter, realistic limits are easier to schedule, and a job that ends early releases its resources immediately)
- Request appropriate memory, not maximum
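The points above might translate into a batch script like the following sketch. The job name, core count, memory, and time values are placeholders to adapt, and my_app is a hypothetical application; only the short QOS name comes from this page.

```shell
#!/bin/bash
#SBATCH --job-name=analysis
# Jobs under 2 hours can use the short QOS
#SBATCH --qos=short
# A realistic time limit instead of the maximum
#SBATCH --time=01:30:00
# Only the cores and memory the application can use
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

# Match the thread count to the cores granted (falls back to 1 outside Slurm)
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with ${OMP_NUM_THREADS} threads"
# ./my_app input.dat    # hypothetical application, replace with your own
```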
Use job arrays efficiently
For many similar jobs, array jobs are more efficient than individual submissions and count as a single job against limits.
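A minimal array job sketch is shown below. The input file naming scheme and my_app are hypothetical; the %10 suffix is Slurm's syntax for limiting how many array tasks run at once.

```shell
#!/bin/bash
# 100 tasks, at most 10 running at the same time
#SBATCH --array=1-100%10
#SBATCH --qos=short
#SBATCH --time=00:30:00
#SBATCH --cpus-per-task=1

# Slurm sets SLURM_ARRAY_TASK_ID for each task; default to 1 so the
# script can also be tried outside Slurm.
task_id="${SLURM_ARRAY_TASK_ID:-1}"
input="inputs/sample_${task_id}.dat"
echo "Task ${task_id} processing ${input}"
# ./my_app "${input}"    # hypothetical application, replace with your own
```

Each task picks its own input from the task ID, so one submission covers the whole set.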
Spread usage over time
Running many large jobs at once rapidly consumes your fairshare. Spreading jobs over time keeps your priority higher.
Backfill Scheduling
Slurm uses backfill scheduling to improve cluster utilization. Even if your job has lower priority, it may start earlier if:
- It fits in a gap before higher-priority jobs can start
- It won't delay higher-priority jobs
- It has an accurate time limit
Tip: Accurate time limits help backfill scheduling. If your job requests 4 days but only runs 2 hours, you lose backfill opportunities.
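The effect of the time limit can be sketched with a simple check. This is a deliberate simplification: the real backfill scheduler also considers node counts, memory, and other constraints. The helper name fits_backfill and the 3-hour gap are assumptions for illustration.

```shell
#!/bin/bash
# fits_backfill JOB_TIME_LIMIT_MINUTES GAP_MINUTES
# Prints "yes" if the job's requested time limit fits inside the idle gap.
fits_backfill() {
    if [ "$1" -le "$2" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

gap=180                     # assume nodes are idle for 3 hours
fits_backfill 120 "$gap"    # 2-hour time limit
fits_backfill 5760 "$gap"   # 4-day time limit
```

The 2-hour request prints yes and the 4-day request prints no, even if both jobs would actually finish in 2 hours, because the scheduler plans with the requested limit.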
Partner Priority
Research groups that have purchased hardware for the cluster receive:
- Dedicated fairshare allocation for their purchased resources
- Partner QOS with higher priority on their hardware
- Access to general cluster resources at standard priority
See HPC Partner Program for details.
Account Hierarchy
Fairshare is calculated hierarchically:
Root
└── Institution
└── College
└── Department
└── Project (Account)
└── User
Usage rolls up through the hierarchy. If your department has used more than its share, all projects in that department may have reduced priority compared to projects in other departments.
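The roll-up can be pictured as summing each project's usage into its parent. The sketch below uses made-up department and project names and usage numbers purely for illustration; rollup_by_department is a hypothetical helper name.

```shell
#!/bin/bash
# Reads lines of "DEPARTMENT PROJECT USAGE" on stdin and prints the
# total usage per department, sorted by department name.
rollup_by_department() {
    awk '{ total[$1] += $3 }
         END { for (d in total) printf "%s %d\n", d, total[d] }' | sort
}

printf '%s\n' \
    'physics projA 120' \
    'physics projB 80' \
    'chemistry projC 50' | rollup_by_department
```

Here physics totals 200 and chemistry 50, so under this picture both physics projects are weighed against the larger department total.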
Useful Commands Summary
| Command | Description |
|---|---|
| sshare -u $USER | Show your fairshare |
| sprio -u $USER | Show priority of your pending jobs |
| squeue -u $USER | Show your jobs and pending reasons |
| squeue -j JOBID --start | Estimate job start time |
| sacctmgr show assoc user=$USER | Show your account associations |