NOTE: To see what differs in scheduling, read the Bi to Freja migration guide.
Freja uses fairshare to prioritise between jobs from various groups. Each user group on Freja is assigned a number of shares of the cluster. The more resources a group has used, relative to its assigned share of the total available resources, the lower priority the group’s jobs get in the queue.
To improve utilisation the scheduler also uses backfill, which allows jobs to be started out of priority order when this does not delay the predicted start time of higher priority jobs.
The number of shares is also used to calculate node limits for high priority jobs. Values as of 2024-02-23:
| Group (Slurm Account) | Shares |
|---|---|
| rossby | 186 |
| sm_sb | 8 |
| sm_sp | 8 |
| sm_fouh | 9 |
| sm_fouo | 186 |
| sm_foum | 183 |
| sm_guest | 20 |
| sufm | 2 |
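To see how these shares and your group's recent usage translate into fairshare standings, you can use Slurm's standard `sshare` tool. A minimal sketch (the exact columns shown depend on the site configuration):

```bash
# Show fairshare information for your own user and account(s):
sshare -U

# Show the full share tree for one account, e.g. rossby:
sshare -a -A rossby
```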
There are two types of jobs on Freja: normal jobs and low priority jobs. There is also a tool available that lets all users change the priority of their own jobs.

The vast majority of jobs should be submitted as normal jobs. Normal jobs are prioritised using fairshare.

Low priority jobs have lower priority than normal and high priority jobs, and will only be started if no other jobs need the requested resources. Low priority jobs have a maximum allowed walltime of 4 hours. Usage of low priority jobs is "free" and is NOT included when calculating fairshare priority for normal jobs. To submit low priority jobs, use `--qos=low`.
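For example, a low priority job could be submitted like this (a sketch; `myscript.sh` is a placeholder for your own job script):

```bash
# Submit as low priority: not counted towards fairshare, max 4 hours walltime.
sbatch --qos=low -t 04:00:00 myscript.sh
```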
If you are a member of more than one group, you should always use an option like `-A rossby`, `-A sm_fouo` etc. to `sbatch`/`interactive` to tell Slurm what account to run under.

If you are only part of one group you do not need to use the `-A` option for normal job submission. You might have to use it under special circumstances, such as cron jobs.
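A minimal sketch of specifying the account explicitly (`myscript.sh` is a placeholder for your own job script):

```bash
# Batch job charged to the rossby account:
sbatch -A rossby myscript.sh

# The same applies to interactive sessions:
interactive -A sm_fouo -n 1 -t 01:00:00
```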
The maximum wall time for a job is 7 days (except for low priority jobs, which have a 4 hour limit). The default time limit (if you do not use a `-t` flag) is 2 hours. Please use the `-t` flag to set a time limit that is appropriate for each job!
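For example (a sketch; `myscript.sh` is a placeholder):

```bash
# Request 6 hours of walltime instead of relying on the 2 hour default:
sbatch -t 06:00:00 myscript.sh

# Slurm also accepts the days-hours:minutes:seconds format, e.g. 2 days:
sbatch -t 2-00:00:00 myscript.sh
```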
Avoid running long jobs if the work can be split into several shorter jobs without losing performance. Several shorter jobs can improve the overall scheduling on the cluster. However, there are limits as Freja is not optimised for very short jobs. For example, splitting a 30 minute job into 30 1-minute jobs is not recommended.
Freja has 3 fat nodes with extra memory. To use them, add `-C fat` to your job specification. Do not use `--mem` or similar options to request fat nodes or to specify that you do not need fat nodes.
Use of the fat nodes counts towards fairshare usage at double the cost of normal nodes. Jobs not requesting fat nodes can be scheduled on fat nodes if no other nodes are available, but will then not be hit with the extra cost.
All job types can request fat nodes.
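For example, to run a job on one of the fat nodes (a sketch; `myscript.sh` is a placeholder):

```bash
# Request a fat node; remember that this is charged at double the normal cost:
sbatch -C fat myscript.sh
```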
Node sharing is available on Freja. The idea behind node sharing is that you do not have to allocate a full compute node in order to run a small job. Thus, if you request a job like `sbatch -n 1 ...` the job may share the node with other jobs smaller than 1 node. Jobs using a full node or more will not experience this (that is, we will not pack two 70-core jobs into 3 nodes). You can turn off node sharing for otherwise eligible jobs using the `--exclusive` flag.
Using node sharing is highly recommended on Freja since there are 64 cores per node, but only 78 nodes.
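A sketch of the difference (`myscript.sh` is a placeholder):

```bash
# A 4-core job that may share its node with other small jobs:
sbatch -n 4 myscript.sh

# The same job with node sharing turned off (a full node is allocated):
sbatch -n 4 --exclusive myscript.sh
```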
Warning: If you do not include `-n`, `-N` or `--exclusive` to commands like `sbatch` and `interactive`, you will get a single core, not a full node.
When you allocate less than a full node, you get a proportional share of the node's memory. On a thin node with 384 GiB and 64 cores, that means that you get slightly less than 6 GiB per allocated core.
Note: you cannot request a fat node on Freja by passing a `--mem` or `--mem-per-cpu` option too large for thin nodes. You need to use the `-C fat` option discussed above.
Each compute node has a local disk with approximately 865 GiB available for user files, backed by local flash storage. The environment variable `$SNIC_TMP` in the job script environment points to a writable directory on the local disk that you can use. Each job has private copies of the following directories used for temporary storage:
- `/scratch/local` (`$SNIC_TMP`)
- `/tmp`
- `/var/tmp`
This means that one job cannot read files written by another job running on the same node. This applies even if it is two of your own jobs running on the same node!
Please note that anything stored on the local disk is deleted when your job ends. If some temporary or output files stored there need to be preserved, copy them to project storage at the end of your job script.
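A minimal job script sketch that uses the node-local disk and saves the results before the job ends; the project paths and program name are placeholders for your own data and code:

```bash
#!/bin/bash
#SBATCH -t 02:00:00
#SBATCH -n 1

# Placeholders: replace /proj/myproject/... with your own project storage paths.
INPUT=/proj/myproject/input.dat
RESULTS=/proj/myproject/results

# Work on the node-local disk via $SNIC_TMP (fast, but wiped when the job ends).
cd "$SNIC_TMP"
cp "$INPUT" .

# Run your program (placeholder name) on the local copy of the input.
/proj/myproject/bin/my_model input.dat > output.dat

# Copy anything worth keeping back to project storage before the job ends.
cp output.dat "$RESULTS"/
```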