A: RNA-seq
- High rRNA contamination: ribosomal RNA depletion did not work well.
- Few adapters: Remove them for assemblies
bbduk.sh -in=example_A.fq.gz -out=example_A_trim.fq.gz trimq=15 literal=AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT ktrim=r k=11 mink=11
B: DNA-Seq (Host parasite system)
- Adapters: remove them
- Extremely low GC content -> species specific
- A few reads are contaminants (the GC peak is not very sharp)
bbduk.sh -in=example_B.fq.gz -out=example_B_trim.fq.gz qtrim=rl trimq=15 literal=AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT ktrim=r k=11 mink=11 minlength=100
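To put a number on the low GC content, a rough check can be run directly on the FASTQ. This is a minimal sketch, assuming a standard 4-line FASTQ (no wrapped sequence lines). `mean_gc` is a hypothetical helper name, not part of BBTools, and N bases are counted in the denominator.

```shell
# Mean GC percentage over all bases in a FASTQ stream (hypothetical helper).
# Usage: gzip -dc example_B.fq.gz | mean_gc
mean_gc() {
  awk '
    NR % 4 == 2 {                      # sequence line of each 4-line record
      seq = toupper($0)
      bases += length(seq)
      gc += gsub(/[GC]/, "", seq)      # gsub returns the number of G/C removed
    }
    END {
      if (bases > 0) printf "%.1f\n", 100 * gc / bases
      else print "NA"
    }
  '
}
```

Running it before and after trimming shows whether the GC value shifts once adapters and contaminant reads are gone.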
C: RAD-Seq
- Few adapters: Remove them
- The restriction site produces an over-representation of bases
- Low quality at the restriction site: not a problem
bbduk.sh -in=example_C.fq.gz -out=example_C_trim.fq.gz qtrim=rl trimq=15 literal=AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT ktrim=r k=11 mink=11 minlength=100
D: Ampli-Seq
- Many small reads: filtering needed
bbduk.sh -in=example_D.fq.gz -out=example_D_trim.fq.gz qtrim=rl trimq=15 minlength=100
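Before picking a `minlength` value it can help to see how many reads would be dropped. The following is a sketch only, assuming a 4-line FASTQ; `count_short_reads` is a hypothetical helper name and the default threshold of 100 simply mirrors the command above.

```shell
# Count reads shorter than a minimum length in a FASTQ stream (hypothetical helper).
# Usage: gzip -dc example_D.fq.gz | count_short_reads 100
count_short_reads() {
  awk -v min="${1:-100}" '
    NR % 4 == 2 {                      # sequence line of each 4-line record
      total++
      if (length($0) < min) short++
    }
    END { printf "%d of %d reads shorter than %d bp\n", short, total, min }
  '
}
```

If most reads fall below the threshold, a lower `minlength` (as used for example E) may be the better trade-off.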
E: Ampli-Seq
- Many low quality bases: trimming and filtering needed
- Problem with the sequencing run
bbduk.sh -in=example_E.fq.gz -out=example_E_trim.fq.gz qtrim=rl trimq=15 minlength=50
F: DNA-Seq
- Few adapters: trimming and filtering needed
- Strange GC peak: will be removed when the adapters are removed.
bbduk.sh -in=example_F.fq.gz -out=example_F_trim.fq.gz qtrim=rl trimq=15 literal=GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCC
G: ONT data
- Good run
- No filtering needed
H: Metagenome data
- Wide GC peak: Expected if you have species with different GC contents.
- Remove low complexity reads for assemblies
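One possible way to remove low complexity reads is BBDuk's entropy filter. The sketch below is an assumption, not the command used for this example: the file names follow the naming pattern of the other examples, the threshold of 0.5 is a starting point to tune, and `filter_low_complexity` with its `DRY_RUN` switch is a hypothetical wrapper.

```shell
# Hypothetical wrapper: filter_low_complexity IN OUT [MIN_ENTROPY]
# Reads with entropy below MIN_ENTROPY (0 to 1) are discarded by bbduk.sh.
# Set DRY_RUN=1 to print the command instead of running it.
filter_low_complexity() {
  in_fq=$1
  out_fq=$2
  min_entropy=${3:-0.5}              # assumed default, adjust per dataset
  set -- bbduk.sh "in=$in_fq" "out=$out_fq" "entropy=$min_entropy"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage (file names are assumptions):
# filter_low_complexity example_H.fq.gz example_H_filt.fq.gz 0.5
```

Higher entropy thresholds are more stringent, so check how many reads are lost before using the output for assembly.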
I: 10X data
- Internal barcode at the beginning of the sequence: will be removed by the tools later
- Quality is not too bad for 10X reverse reads
J: DNA-seq
- Poly-G tail: trimming and filtering may be needed
bbduk.sh -in=example_J.fq.gz -out=example_J_trim.fq.gz qtrim=rl trimq=15 literal=GGGGGGGGGGGGGGGGGGGGGGG minlength=100
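To judge whether poly-G handling is needed at all, the number of affected reads can be counted first. This is a rough sketch, assuming a 4-line FASTQ; `count_polyg_tails` is a hypothetical helper and the default run length of 10 is an arbitrary choice.

```shell
# Count reads whose 3-prime end is a run of at least N G bases (hypothetical helper).
# Usage: gzip -dc example_J.fq.gz | count_polyg_tails 10
count_polyg_tails() {
  awk -v n="${1:-10}" '
    NR % 4 == 2 {                      # sequence line of each 4-line record
      total++
      run = 0
      # walk back from the end of the read while the base is G
      for (i = length($0); i > 0 && substr($0, i, 1) == "G"; i--) run++
      if (run >= n) tailed++
    }
    END { printf "%d of %d reads end in >=%d G\n", tailed, total, n }
  '
}
```

Comparing the count on the raw and trimmed files gives a quick check that the tails were actually removed.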