Running Jobs

When you log in to Euler, you arrive on a login node. The actual computation is done on the compute nodes. The batch system, called slurm, manages and schedules your jobs. To use the cluster, you need to define the resources (CPUs, memory, sometimes disk space) and the run time in advance. A typical workflow is shown below.

  1. Copy your data to your scratch folder. The scratch drives are faster, so this often speeds up the analysis (see the example after this list).
  2. If you know which tool you would like to use, check whether a submission script is available and make sure you are using the most recent version of the tool.
  3. Test the command in an interactive job.
  4. Run a few test jobs with a submission script to find the optimal resources. Information about running jobs can be obtained through the monitoring tools, as explained below.
  5. Afterwards, run all jobs and check the log files.
  6. Transfer only the output you want to keep to the GDC home or GDC project folder, and compress the files.
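
For step 1, a minimal sketch of copying raw data to your personal scratch (the $SCRATCH variable and the folder names are assumptions; adjust the paths to your project):

mkdir -p ${SCRATCH}/my_project
rsync -av raw/ ${SCRATCH}/my_project/raw/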

You should avoid running demanding commands on the command line of the login node! This might render the login node unusable for other users. Basic Linux commands (e.g. ls, cp, mv or a quick grep) are not a problem for the login node. Please also see "Interactive jobs" below.

When submitting a job, you should know approximately:

  • How much memory the tool will use
  • How many CPU cores you need
  • How long it will run

Memory allocation

Apart from disk space, memory is the most expensive resource on Euler, and it is not unlimited; you therefore need to be considerate with your requests.

For tools that process data sequentially, such as read mapping, quality trimming, but also SNP calling, slurm reports the used memory including cache. The figures provided by slurm can therefore be much higher than the memory effectively used. For the tools listed below you can request less memory than what myjobs reports.

Before you submit a batch of jobs, check whether you can reduce the requested memory by 20-50% (see the sketch after the list below). The less memory you request per job, the more jobs you can run in parallel.

  • BWA
  • bbmap
  • angsd
  • GATK
  • Graphtyper
  • Freebayes
  • Fastp
  • prinseq
  • Samtools
  • Kaiju
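
A hedged sketch of this approach (fastp and the file names are only placeholders; the actual values depend on your data):

#1. Run a single test job with a generous memory request
sbatch --cpus-per-task=4 --mem-per-cpu=2G --time=04:00:00 --wrap="fastp -i sample_R1.fq.gz -I sample_R2.fq.gz -o trim_R1.fq.gz -O trim_R2.fq.gz -w 4"

#2. Check the effectively used memory once the job has finished
myjobs -j <Job-ID>

#3. If much less memory was used than requested, lower --mem-per-cpu for the full batch of jobs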

Java tools

For Java tools (GATK, picard, beagle, bbmap), make sure that you define the environment variables correctly.

You can set the environment variable, for example for Picard tools:

export JAVA_TOOL_OPTIONS=-Xmx4G
picard FixMateInformation -h

For tools like GATK, the memory can be limited as follows:

gatk --java-options "-Xmx4G" HaplotypeCaller -h

And here the corresponding option for bbmap:

bbmap.sh -Xmx4G -h
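
The Java heap size (-Xmx) should be somewhat smaller than the memory requested from slurm, as Java needs memory on top of the heap. A minimal sketch (the reference and bam file are hypothetical): request 5 GB from slurm and limit the heap to 4 GB:

sbatch --cpus-per-task=1 --mem-per-cpu=5G --time=24:00:00 --wrap="gatk --java-options '-Xmx4G' HaplotypeCaller -R Ref/Ref.fasta -I sample.bam -O sample.vcf.gz"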

CPU allocation

Whether or not you can use multiple CPUs depends on the architecture of the tool; you cannot speed up a tool by simply requesting more CPUs. If a tool can run on multiple CPUs, you normally also need to set this number in your command (e.g. --threads, -p or cpus). However, you cannot increase this number indefinitely: many bioinformatics tools stop scaling at some point (i.e. 4 CPUs will not necessarily speed up your process 4 times compared to 1 CPU). As a general rule, you should only request > 4 CPUs per job if you know that the program scales well.
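
A minimal sketch of matching the slurm request and the tool's thread option (here samtools sort with -@ on a hypothetical bam file):

sbatch --cpus-per-task=4 --mem-per-cpu=2G --time=02:00:00 --wrap="samtools sort -@ 4 sample.bam -o sample_sort.bam"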

Running time

Short queues have the highest priority; you should therefore not request 120 hours for a 3-hour job. You get the best turnaround when the requested run time is as close as possible to the actual run time. You should also avoid submitting jobs that last less than 5 minutes: group such tasks into larger units or loop over them within a longer job.
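
A minimal sketch of looping within a longer job (sample.lst and the file names are assumptions; a single md5sum call would be far too short for its own job):

#!/bin/bash
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=1G
#SBATCH --time=02:00:00
#SBATCH --output=checksums.log

#Loop over all samples within one job instead of submitting many jobs that last less than 5 minutes
while read sample; do
    md5sum raw/${sample}_R1.fq.gz raw/${sample}_R2.fq.gz >> checksums.txt
done < sample.lst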

Script templates and workflows

This manual contains template scripts that can be used as a starting point for your own script. If you need help, we are happy to provide more specific submission scripts on request.

Please avoid scripting long pipelines that use many different tools with different memory or CPU requirements. Entire pipelines can often not be optimised, for example because a lot of memory is only needed in one particular stage. Use job chaining or run the different tools independently, where the resources can be optimised much more easily. This is also less error-prone.

Software wrappers

Software wrappers are easy to use but are a black box (e.g. snpArcher, ATLAS-Pipeline, DeepARG, QIIME, R workflows like dada2). We do not recommend the use of such wrappers, as they are usually extremely inefficient and sometimes impossible to optimise, which would be essential on Euler. We are happy to help you build your own workflow step by step.

Workflow managers

Workflow managers such as snakemake or Nextflow are tools for creating repeatable workflows. The problem is that the language is rather complex and it takes time to optimise all the sub-jobs. Both tools are available on Euler, but the GDC does not support them. If you want to use them yourself, you need to make sure that the jobs run efficiently. We recommend using simple bash scripts instead.

Job submission

A simple command can be submitted as shown below. For more complex commands or tools, we recommend using a submission script.

sbatch --cpus-per-task=1 --time=04:00:00 --wrap="my_command"

Job controlling

Since the cluster is very large, jobs occasionally get lost or killed, so it is important to check carefully whether all jobs have completed successfully. Counting the output files and sorting them by size is also useful, as is taking a look at the standard output file (e.g. slurm*.out).

With squeue you get an overview of your submitted jobs:

squeue --all

This table shows the meaning of the slurm job state codes:

Abbreviation   Status          Description
PD             PENDING         Job is awaiting resource allocation
R              RUNNING         Job currently has an allocation
CD             COMPLETED       Job has terminated all processes on all nodes with an exit code of zero
F              FAILED          Job terminated with a non-zero exit code or other failure condition
OOM            OUT_OF_MEMORY   Job experienced an out-of-memory error
TO             TIMEOUT         Job terminated upon reaching its time limit

You get a summary of finished jobs with reportseff:

reportseff --format JobID,ReqCPUS,CPUEff,ReqMem,MaxVMSize,MemEff,Timelimit,elapsed,TimeEff <Job-ID>/<Array-ID>

The standard output of a job can be found in slurm-*.out (or in the file given with --output). Here you will also find possible error messages. Inspect it carefully, as every tool may report errors differently (e.g. error, kill, Error, Killed, exit, Exited, ...). echo statements in the submission script can be used as checkpoints for your job control. A further sanity check is always to count the number of output files and check their sizes.
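
A few example checks (the mapping/ folder and log names follow the array scripts below; adjust them to your own jobs):

#Count the output files and sort them by size
ls mapping/*_sort.bam | wc -l
ls -lhS mapping/

#Search the log files for typical error keywords
grep -iE "error|kill|exit" slurm-*.out *.log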

Below you will find some typical error messages and troubleshooting suggestions:

Error                                    Reason               Solution
slurmstepd: error: poll(): Bad address   Insufficient memory  Moderately increase the requested memory

Interactive jobs

The batch system also offers interactive jobs. Once the job is allocated, you can work on the command line as if you were on a regular server. As with other job submissions, you can request resources. As long as you do not use more CPUs or RAM than requested, you can work on the command line for the time you asked for. This is especially useful when testing new scripts or tools, or when running short but computationally demanding commands, where it would be tedious to submit each of them to the cluster. Submitting an interactive job is easy:

srun --pty bash

This will give you a shell on a compute node for 1 hour. This way, you can quickly run a script (without using sbatch again), check the results, edit the script, run it again, etc., all inside the same job. You can of course request more CPUs if you want to test a parallel program.
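
For example, an interactive shell with more resources can be requested like this (the values are up to you):

srun --cpus-per-task=4 --mem-per-cpu=2G --time=02:00:00 --pty bash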

If you need more than 20 GB of memory and/or more than 4 CPUs, submit a batch job rather than using an interactive job.

Submission scripts

Single jobs

Let's consider the SNP-caller bcftools. If you want to submit a single bcftools job, you can prepare a submission script as shown below.

#!/bin/bash   
#SBATCH --job-name=bcf           #Name of the job   
#SBATCH --ntasks=1               #Requesting 1 task (always 1 here)
#SBATCH --cpus-per-task=1        #Requesting 1 CPU
#SBATCH --mem-per-cpu=1G         #Requesting 1 GB of memory per core
#SBATCH --time=4:00:00           #Requesting 4 hours running time 
#SBATCH --output=bcf.log         #Log file

##########################################
echo "Nik Zemp, GDC, 02/10/23"
echo "$(date) start ${SLURM_JOB_ID}"
##########################################

#Source the GDC stack
source /cluster/project/gdc/shared/stack/GDCstack.sh

#Load the needed modules
module load bcftools/1.16


#define in and outputs
out=SNPs
if [ ! -e ${out} ]  ; then mkdir ${out} ; fi
Ref=Ref/Ref.fasta

#The bcftools command, for a list of bam files
echo "Run bcf"
bcftools mpileup -f ${Ref} --skip-indels -b bam.lst -a 'FORMAT/AD,FORMAT/DP' | bcftools call -mv -Ob -o ${out}/raw.bcf

##############################################
##Get a summary of the job 
myjobs -j ${SLURM_JOB_ID}
##############################################

This is the script (name: submit.bcftools.slurm.sh) that you would prepare in an editor, including some explanatory comments. The actual bcftools command is near the end of the script; it is the same command you would run on the command line of a regular server.

To run this submission script, you would type:

sbatch < submit.bcftools.slurm.sh

Job arrays

Many computing tasks can be split into smaller tasks that run in parallel (e.g. BLAST, read mapping, SNP calling, data filtering). A cluster is extremely well suited for such jobs, and the sbatch command has an option specifically for this. Let's consider the bcftools SNP-calling example again. This time we have much more data and would like to speed things up. The bcftools tool itself is not parallelised (even if the manual suggests otherwise), but we can easily split the job into sub-jobs by splitting the data into subsets (e.g. chromosomes). We can then feed bcftools these data subsets (bcftools option -r) and have many instances of bcftools run in parallel.
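
For this we need a file chrom.lst with one chromosome (or scaffold) name per line. Assuming the reference has been indexed with samtools faidx, it can for example be created from the fasta index:

cut -f1 Ref/Ref.fasta.fai > chrom.lst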

Below is the submit script (submit.bcftools.array.slurm.sh) you would use to run such a bcftools job array.

#!/bin/bash
#SBATCH --job-name=bcf       #Name of the job
#SBATCH --array=1-20%10      #Array with 20 Jobs, always 10 running in parallel
#SBATCH --ntasks=1           #Requesting 1 task for each job (always 1)
#SBATCH --cpus-per-task=1    #Requesting 1 CPU for each job
#SBATCH --mem-per-cpu=2G     #Requesting 2 Gb memory per core and job
#SBATCH --time=4:00:00       #4 hours run-time per job
#SBATCH --output=bcf_%a.log  #Log files


##########################################
echo "Nik Zemp, GDC, 02/10/23"
echo "$(date) start ${SLURM_JOB_ID}"
##########################################

#Source the GDC stack
source /cluster/project/gdc/shared/stack/GDCstack.sh

#Load the needed modules
module load bcftools/1.16


#define input and outputs
out=SNPs
if [ ! -e ${out} ]  ; then mkdir ${out} ; fi

##The slurm array task ID (1-20 in our case; see the --array line in the header) is used to extract the name of the chromosome from chrom.lst.
chrom=$(sed -n ${SLURM_ARRAY_TASK_ID}p chrom.lst)

Ref=Ref/Ref.fasta

#The bcftools command
bcftools mpileup -f ${Ref} --skip-indels -b bam.lst -r ${chrom} -a 'FORMAT/AD,FORMAT/DP' | bcftools call -mv -Ob -o ${out}/raw.${chrom}.bcf

##############################################
##Get a summary of the job
myjobs -j ${SLURM_JOB_ID}
##############################################

Usage:

sbatch < submit.bcftools.array.slurm.sh

Below is the submit script (submit.map.array.slurm.sh) for a read-mapping job array. In this example we use the node scratch to speed up the analysis and write only the final sorted bam file to the output directory. You should have a list with all samples (one per line, 20 lines in total), and we will iterate through this list (sample.lst).

#!/bin/bash
#SBATCH --job-name=map         #Name of the job
#SBATCH --array=1-20%5         #Array with 20 Jobs, always 5 running in parallel
#SBATCH --ntasks=1             #Requesting 1 task for each job (always 1)
#SBATCH --cpus-per-task=4      #Requesting 4 CPUs for each job, 20 CPUs in total
#SBATCH --mem-per-cpu=2G       #Requesting 2 GB of memory per core (8 GB per job, 40 GB in total)
#SBATCH --time=4:00:00         #4 hours run-time per job
#SBATCH --tmp=20G              #Request 20G local scratch
#SBATCH --output=map_%a.log    #Log files


##########################################
echo "Nik Zemp, GDC, 02/10/23"
echo "$(date) start ${SLURM_JOB_ID}"
##########################################

#Source the GDC stack
source /cluster/project/gdc/shared/stack/GDCstack.sh

#Load the needed modules
module load samtools/1.20 bwa-mem2/2.2.1


sample=$(sed -n ${SLURM_ARRAY_TASK_ID}p sample.lst)

##provide path infos
in=raw
out=mapping

#Let's extract the number of requested CPUs and save it as variable cpu.
cpu=${SLURM_CPUS_ON_NODE}

##generate output folder if not present
if [ ! -e ${out} ]  ; then mkdir ${out} ; fi

echo "Start processing \${sample}" 
#Running bwa-mem2 with the default settings. We save the temporary sam file on the node scratch
echo "Run bwa-mem2"
bwa-mem2 mem Ref.fasta ${in}/${sample}_R1.fq.gz ${in}/${sample}_R2.fq.gz -t ${cpu} > ${TMPDIR}/${sample}.sam


##We now convert the human readable sam file to the binary bam file and sort it.
echo "sam to bam and sort"
samtools sort ${TMPDIR}/${sample}.sam -T ${TMPDIR} -@ ${cpu} -o ${out}/${sample}_sort.bam

##############################################
##Get a summary of the job
myjobs -j ${SLURM_JOB_ID}
##############################################

Usage:

sbatch < submit.map.array.slurm.sh

In a second step, submitted as a separate job, we would process the sorted bam files further, removing low-quality alignments and PCR duplicates, as these processes are memory-intensive and do not scale well.
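
A minimal sketch of such a second step inside another job array (Picard and the quality threshold are assumptions; the file names follow the mapping script above):

sample=$(sed -n ${SLURM_ARRAY_TASK_ID}p sample.lst)

#Mark and remove PCR duplicates
picard MarkDuplicates I=mapping/${sample}_sort.bam O=mapping/${sample}_dedup.bam M=mapping/${sample}_dup_metrics.txt REMOVE_DUPLICATES=true

#Keep only alignments with mapping quality >= 20 and index the result
samtools view -b -q 20 mapping/${sample}_dedup.bam > mapping/${sample}_filt.bam
samtools index mapping/${sample}_filt.bam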

Job chaining

Jobs can be "chained", meaning a job is waiting until another job has finished. For example, you can submit two jobs at the same time but jobB will start only once jobA has finished.

sbatch < jobA.slurm.sh

Get the job ID of jobA using squeue (e.g. 9999).

Then submit the second job with a dependency:

sbatch --dependency=afterok:"9999" < jobB.slurm.sh

Arrays can also be chained.

sbatch < job_array.slurm.sh

Get the job ID (not the array task ID) using squeue (e.g. 6666_[1-20]).

Start jobB once the array tasks 6666_1 and 6666_10 have finished successfully:

sbatch --dependency=afterok:"6666_1:6666_10" < jobB.slurm.sh
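
If you prefer not to look up the job ID manually, sbatch can print it directly with the --parsable option; a minimal sketch:

jobA=$(sbatch --parsable < jobA.slurm.sh)
sbatch --dependency=afterok:${jobA} < jobB.slurm.sh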

R scripts

Running RStudio on Euler can sometimes be a bit annoying, especially if you want to visualise your data. If you have entire workflows, you can submit R scripts via sbatch, save the data as RData and do e.g. the plotting or filtering locally on your computer.

Here is an example of an R script that simply reads a huge data file and saves it as an RData object.

Let's copy the lines below into a file and call it Reformat.R.

#!/usr/bin/env Rscript

## Use these arguments to provide file names for the commands below; the order must be consistent
args <- commandArgs(trailingOnly = TRUE)

## Load the package; much faster than the base functions
library(tidyverse)

## Read the table (no header in the input file)
samples <- read_csv(args[1], col_names = FALSE)

## Save it as an RData file
name <- paste(args[2], "RData", sep = ".")
save(samples, file = name)

Let's make the script executable:

chmod +x Reformat.R

As we normally don't want to load user-specific data, we use the --vanilla option and send the script via sbatch.

sbatch --cpus-per-task=1 --mem-per-cpu=2G --time 4:00:00 --wrap="Rscript --vanilla Reformat.R Fst_chromosomes1.txt Fst_chromosomes1_reduced"

This command will then output the file Fst_chromosomes1_reduced.RData

Graphical-User-Interface tools

Programs with graphical user interfaces (GUIs) such as RStudio can be run via JupyterHub; however, the GDC does not support them. Find more information here.