Slurm scheduler
Roar is a shared system used by many researchers, and it relies on Slurm for job scheduling. The Slurm scheduler is the cluster's essential resource manager, responsible for fairly and efficiently distributing compute resources (such as CPUs, memory, and GPUs) to all users. It acts as the system's workload manager, preventing conflicts and ensuring orderly access to the hardware.
When you submit a job, you specify the resources you need. Slurm then performs several key functions: job queueing, resource allocation, policy enforcement, and job execution and monitoring.
Resource directives
Resource directives are used to specify how a job behaves and the resources to be allocated. Directives can define resources such as cores, memory, and time, but they can also set job options such as email alerts, job dependencies, and more.
They are required for all jobs, including both interactive jobs and batch jobs.
Interactive jobs via the Portal also use resource directives, though they are often specified through the request forms.
The most common directives are:
| Short option | Long option | Description |
|---|---|---|
| -J | --job-name | name the job |
| -A | --account | charge to an account |
| -p | --partition | request a partition |
| -N | --nodes | number of nodes |
| -n | --ntasks | number of tasks (cores) |
| NA | --ntasks-per-node | number of tasks per node |
| NA | --mem | memory per node |
| NA | --mem-per-cpu | memory per core |
| -t | --time | maximum run time |
| NA | --gres | GPU request |
| -C | --constraint | required node features |
| -e | --error | direct standard error to a file |
| -o | --output | direct standard output to a file |
Custom output file names
By default, batch job standard output and standard error are both directed to slurm-%j.out (where %j is the job ID). Output and error filenames can be customized by using --output=filename to redirect output to a specified file. If only --output is specified, both standard output and standard error are directed to that file. Specifying --error=filename directs standard error to its own file.
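For example, the following command (the script and file names here are placeholders) sends standard output and standard error to separate files:

```bash
# Redirect stdout and stderr to separate, custom-named files
sbatch --output=myjob.out --error=myjob.err myscript.sh
```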
Specifying resource directives
You provide these directives at the top of a batch script using #SBATCH, or as options on the command line (e.g., with salloc or srun).
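For example, a minimal batch script might look like the following sketch; the account and partition names are placeholders to replace with your own allocation and partition:

```bash
#!/bin/bash
#SBATCH --job-name=example         # name the job
#SBATCH --account=myaccount        # placeholder: your allocation account
#SBATCH --partition=mypartition    # placeholder: the partition to use
#SBATCH --nodes=1                  # one node
#SBATCH --ntasks=4                 # four tasks (cores)
#SBATCH --mem=8G                   # 8 GB of memory per node
#SBATCH --time=01:00:00            # maximum run time of one hour

# Commands to run come after the directives
hostname
```

The same resources can be requested on the command line for an interactive job, for example:

```bash
salloc --account=myaccount --partition=mypartition --nodes=1 --ntasks=4 --mem=8G --time=01:00:00
```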
On the Portal, you can use resource directives to further customize your job requests.
Note on Tasks vs. Cores
For most jobs, you can think of one task as one CPU core. So, --ntasks=8
requests 8 cores.
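As a sketch, either of the following forms requests 8 cores; the second additionally places all of them on a single node:

```bash
# Request 8 tasks (cores); Slurm may place them on one or more nodes
#SBATCH --ntasks=8

# Request 8 tasks, all on a single node
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
```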
Note on Memory
Be careful when requesting memory!
--mem=16G requests a total of 16 GB of memory per node for the job, shared by all tasks on that node.
--mem-per-cpu=4G requests 4 GB of memory for each core you've requested. If you requested 4 cores, this would total 16 GB.
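For example, these two requests (a sketch) allocate the same 16 GB total for a 4-core job:

```bash
# 16 GB for the job on the node, shared by the 4 tasks
#SBATCH --ntasks=4
#SBATCH --mem=16G

# 4 GB for each of the 4 cores, also 16 GB in total
#SBATCH --ntasks=4
#SBATCH --mem-per-cpu=4G
```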
Environment variables
Slurm defines environment variables within the scope of a job:
| Environment Variable | Description |
|---|---|
| SLURM_JOB_ID | ID of the job |
| SLURM_JOB_NAME | Name of the job |
| SLURM_NNODES | Number of nodes |
| SLURM_NODELIST | List of nodes |
| SLURM_NTASKS | Total number of tasks |
| SLURM_NTASKS_PER_NODE | Number of tasks per node |
| SLURM_QUEUE | Queue (partition) |
| SLURM_SUBMIT_DIR | Directory of job submission |
These can be used in your submit script so that it adapts dynamically to the resources allocated to the job.
Use $SLURM_NTASKS for parallel jobs
Using $SLURM_NTASKS to specify the number of parallel tasks in your submit script
allows your job to adapt dynamically to the job's resource request without having to
modify your script in multiple locations.
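A minimal sketch (the executable name is a placeholder): the launch line reads the task count from the job's own request, so changing --ntasks is the only edit needed.

```bash
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --time=00:30:00

# srun runs inside the job allocation; passing $SLURM_NTASKS keeps the
# launch line in sync with the resource request above
srun -n "$SLURM_NTASKS" ./my_parallel_program   # placeholder executable
```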
Replacement symbols
Replacement symbols can be used in Slurm directives to build job names and filenames with information specific to the job being run:
| Symbol | Description |
|---|---|
| %j | Job ID |
| %x | Job name |
| %u | Username |
| %N | Hostname where the job is running |
For more information on Slurm directives, environment variables, and replacement symbols, see the Slurm sbatch documentation for batch jobs and the Slurm salloc documentation for interactive jobs.
Replacement symbols to create unique output files
Replacement symbols in your resource requests can be used to create unique output
file names for each run of a job. For example, --output=myjob.%j.out will create a
different output file for each job by substituting the Slurm job ID for %j.
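As another sketch, combining %x and %j names the output and error files after the job name and job ID; for a job named analysis that receives job ID 12345, this produces analysis.12345.out and analysis.12345.err:

```bash
#SBATCH --job-name=analysis
#SBATCH --output=%x.%j.out
#SBATCH --error=%x.%j.err
```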