Monitoring and Optimisation

Before you submit a batch of jobs, you have to optimise the resource request. This can be difficult, but tools like myjobs can help you determine the right resources. If resource usage is significantly higher than 100% of what was requested, the job may be killed because not enough resources are free on the node. If the usage is significantly lower (less than 50%), you should adjust the job submission; otherwise CPUs sit idle or memory is wasted.

The Slurm Job WebGUI is a very useful tool to get an overview of recently finished jobs. In this section we will have a look at a specific submission script that we want to optimise.

If you waste too many resources, the Cluster Support will contact you. If this happens more than twice, we will put you in ‘sustained mode’ until you comply with our rules.

Tools to monitor
Running jobs: myjobs
Finished jobs: Slurm Job WebGUI or reportseff --user $USER --since d=1 for your jobs from the last 24 hours.
Summary over a week: get_inefficient_jobs
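For quick reference, the typical calls are shown below. The module environment is the one used later in this section; the myjobs option is an assumption, so check myjobs -h for the exact syntax.

    # running job: real-time figures (option assumed, see myjobs -h)
    myjobs -j <JOB-ID>

    # finished jobs: summary of your jobs from the last 24 hours
    source /cluster/project/gdc/shared/stack/GDCstack.sh
    module load reportseff/2.7.6
    reportseff --user $USER --since d=1

    # weekly overview of inefficient jobs (in-house script)
    get_inefficient_jobs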

myjobs

Real-time figures (myjobs) of a running job where the requested resources are optimally used. For example, the job with ID 24765510 in the normal.24h queue requested 12 CPUs (Requested cores) and 98.8% of them are used (CPU utilisation), suggesting that the requested CPUs are used efficiently. The same holds for the requested memory (2 GB per core, or 24 GB in total), of which 89.2% is currently used (Resident memory utilisation). Furthermore, the CPU time is roughly 12 times the wall-clock time (Total CPU time ≈ Requested CPUs × Wall-clock), which means that all 12 CPUs are currently in use.
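If you want to cross-check these figures yourself, the same information can be pulled from the Slurm accounting database with sacct (job ID from the example above). This is only a minimal sketch, not a replacement for myjobs.

    # TotalCPU should be close to AllocCPUS x Elapsed for a well-used job,
    # and MaxRSS should be close to (but below) the requested memory
    sacct -j 24765510 --format=JobID,Elapsed,TotalCPU,AllocCPUS,ReqMem,MaxRSS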

For some tools resource usage can fluctuate over time, while for others it is extremely difficult to predict (e.g. BLAST, Trinity). Please avoid complex pipelines with completely different resource requirements within a single run (see below). Always keep in mind that the requested resources, not the actually used ones, are what counts, and that the submission priority drops for all GDC users if CPUs or memory are wasted.
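If a workflow really consists of stages with very different requirements (for example a memory-hungry assembly followed by light post-processing), one way to avoid a single oversized request is to submit each stage as its own job and chain them with Slurm dependencies. The script names and resource figures below are placeholders.

    # each stage gets its own resource request; stage 2 only starts if stage 1 succeeds
    jid=$(sbatch --parsable assembly.sh)              # e.g. 1 CPU, 64 GB
    sbatch --dependency=afterok:${jid} postproc.sh    # e.g. 4 CPUs, 4 GB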

myjobs is your best friend, unless you use a lot of cache memory (Nik Zemp).

For tools that use a lot of cache memory (e.g. BWA, BBMap, fastp, GATK), the effectively used memory can be much lower than what Slurm reports. If you would like to submit a batch of such jobs, it always makes sense to reduce the requested memory by 20-50% and check whether it is still sufficient.
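You do not have to edit the script for such a test: options passed on the sbatch command line override the #SBATCH directives, so a reduced memory request can be tried directly. The script name and values below are illustrative.

    # the script asks for 2G per core; try roughly 30% less from the command line
    sbatch --mem-per-cpu=1500 map_reads.sh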

Optimisation takes time, as you need to monitor the job regularly using myjobs. Please be very careful when optimising jobs overnight or on weekends: you need to be able to kill jobs in a timely manner if they misbehave. The following example illustrates how you can find optimal CPU, RAM and run-time requirements.
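Killing a misbehaving job is a single scancel call; for a job array you can cancel one task or the whole array (job ID from the example above).

    scancel 24765510_15    # cancel a single array task
    scancel 24765510       # cancel the whole job array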

We would like to do SNP calling and have split the reference genome into 120 chunks. Now we need to find out how many CPUs and how much memory are needed for each of the 120 jobs.
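Inside the submission script, each array task needs to know which of the 120 chunks it should process. A common pattern is to derive the input from $SLURM_ARRAY_TASK_ID; the chunk file naming and the freebayes call below are only a sketch, not the actual pipeline.

    # sketch of the job body: map the array index to one chunk of the reference
    CHUNK=chunk_${SLURM_ARRAY_TASK_ID}.bed    # hypothetical chunk naming
    freebayes -f reference.fa -t ${CHUNK} aln.bam > snps_chunk_${SLURM_ARRAY_TASK_ID}.vcf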

  1. Run a test with 5 representative chunks (do not just take the first 5) with 4 CPUs, 4 × 0.5 GB RAM and 24 hours run time.

    #SBATCH --job-name=fb                #Name of the job   
    #SBATCH --array=1,10,15,20,50%15   
    #SBATCH --ntasks=1                   #Requesting 1 task (always 1 here)
    #SBATCH --cpus-per-task=4            #Requesting 4 CPUs
    #SBATCH --mem-per-cpu=500            #Requesting 0.5 GB memory per core, 2 GB in total
    #SBATCH --time=24:00:00              #Requesting 24 hours run time
    
  2. The jobs get killed after 2 minutes because they exceed the memory limit. Let's increase the requested memory to 4 × 2 GB.

    #SBATCH --job-name=fb               #Name of the job   
    #SBATCH --array=1,10,15,20,50%15   
    #SBATCH --ntasks=1                  #Requesting 1 task (always 1 here)
    #SBATCH --cpus-per-task=4           #Requesting 4 CPUs
    #SBATCH --mem-per-cpu=2G            #Requesting 2 GB memory per core, 8 GB in total
    #SBATCH --time=24:00:00             #Requesting 24 hours run time     
    
  3. With myjobs you get real-time resource figures; use it regularly during the run. If the resource usage is completely off, just kill the job and restart it with more (or fewer) resources.

  4. Slurm itself does not print a summary of the resources used, but you can run reportseff <JOB-ID> to get one for finished jobs.

    source /cluster/project/gdc/shared/stack/GDCstack.sh
    module load reportseff/2.7.6
    reportseff  24765510
    
    JobID        State      Elapsed   TimeEff  CPUEff  MemEff
    24765510_1   COMPLETED  00:49:10  0.09%    30%     43%
    24765510_10  COMPLETED  00:49:10  0.09%    31%     41%
    24765510_15  COMPLETED  00:49:10  0.10%    35%     42%
    24765510_20  COMPLETED  00:58:10  0.10%    29%     32%
    24765510_50  COMPLETED  00:59:00  0.10%    32%     45%

    In this example the run time is around 1 hour, the memory usage is around 40%, and the CPU efficiency is around 30%, i.e. only about 1 of the 4 requested CPUs was used.

  5. Based on the test run we can now choose the settings for all 120 jobs (1 CPU, 1 × 3 GB RAM and 4 hours run time).

    #SBATCH --job-name=freebayes        #Name of the job   
    #SBATCH --array=1-120%15   
    #SBATCH --ntasks=1                  #Requesting 1 task (always 1 here)
    #SBATCH --cpus-per-task=1           #Requesting 1 CPU
    #SBATCH --mem-per-cpu=3G            #Requesting 3 GB memory per core
    #SBATCH --time=4:00:00              #Requesting 4 hours run time
    
  6. After all jobs have finished you can check which jobs failed and rerun them with 1 × 5 GB memory (one way to find the failed tasks is shown in the sketch after this list).

    #SBATCH --job-name=freebayes            #Name of the job   
    #SBATCH --array=11,99%10
    #SBATCH --ntasks=1                      #Requesting 1 task (always 1 here)
    #SBATCH --cpus-per-task=1               #Requesting 1 CPU
    #SBATCH --mem-per-cpu=5G                #Requesting 5 GB memory per core
    #SBATCH --time=4:00:00                  #Requesting 4 hours run time
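One way to find the failed tasks is to filter the array's accounting records by state and then resubmit only those indices. The grep pattern covers the usual failure states; the script name and the index list are illustrative, and the command-line options again override the #SBATCH directives.

    # list array tasks that did not finish successfully
    sacct -j <JOB-ID> --format=JobID,State%20,Elapsed,MaxRSS | grep -E 'FAILED|OUT_OF_MEMORY|TIMEOUT'

    # resubmit only those indices with more memory
    sbatch --array=11,99 --mem-per-cpu=5G freebayes.sh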