How to perform adapter trimming and quality control on paired-end Illumina reads with Trimmomatic
I have paired-end Illumina RNA-seq data and I need to trim adapters and low-quality bases before alignment. I've been told to use Trimmomatic but I'm not sure what parameters to use. How do I set up a Trimmomatic command for PE reads, and how do I check the trimming worked?
✓ Accepted Answer
**Basic Trimmomatic PE command:**
```bash
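# Backslashes continue the command across lines.
# The last ILLUMINACLIP field is parsed as a boolean, so write True
# (the literal word "keepBothReads" is read as false).
trimmomatic PE -threads 8 -phred33 \
  sample_R1.fastq.gz sample_R2.fastq.gz \
  sample_R1_paired.fastq.gz sample_R1_unpaired.fastq.gz \
  sample_R2_paired.fastq.gz sample_R2_unpaired.fastq.gz \
  ILLUMINACLIP:TruSeq3-PE-2.fa:2:30:10:2:True \
  LEADING:3 \
  TRAILING:3 \
  SLIDINGWINDOW:4:15 \
  MINLEN:36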
```
**Parameters explained:**
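- `ILLUMINACLIP:<adapters.fa>:2:30:10:2:<keepBothReads>`: adapter FASTA file, seed mismatches allowed (2), palindrome clip threshold (30), simple clip threshold (10), minimum adapter length in palindrome mode (2), and whether to keep the reverse read after palindrome clipping (set to `True` to keep both reads)
- `LEADING:3` / `TRAILING:3`: remove bases with quality below 3 from the start and the end of each read, respectively
- `SLIDINGWINDOW:4:15`: scan from the 5' end with a 4-base window and cut the read once the window's average quality drops below 15
- `MINLEN:36`: discard reads shorter than 36 bp after trimming. Steps run in the order given, so keep this one last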
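To apply the same settings to several samples, the command can be wrapped in a small function. This is only a sketch: the `<sample>_R1.fastq.gz` / `<sample>_R2.fastq.gz` naming convention, the `trim_sample` function name, and the `ADAPTERS` variable are assumptions made for illustration, not anything Trimmomatic requires.

```bash
#!/usr/bin/env bash
# Sketch: run the same Trimmomatic PE settings over every sample in the
# current directory. Assumes files are named <sample>_R1.fastq.gz and
# <sample>_R2.fastq.gz (hypothetical convention; adjust to your data).

# Adapter FASTA; override by exporting ADAPTERS before running.
ADAPTERS="${ADAPTERS:-TruSeq3-PE-2.fa}"

trim_sample() {
    local r1="$1"
    local sample="${r1%_R1.fastq.gz}"   # strip the suffix to get the sample prefix
    local r2="${sample}_R2.fastq.gz"

    trimmomatic PE -threads 8 -phred33 \
        "$r1" "$r2" \
        "${sample}_R1_paired.fastq.gz" "${sample}_R1_unpaired.fastq.gz" \
        "${sample}_R2_paired.fastq.gz" "${sample}_R2_unpaired.fastq.gz" \
        "ILLUMINACLIP:${ADAPTERS}:2:30:10:2:True" \
        LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:36
}

for r1 in *_R1.fastq.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    trim_sample "$r1"
done
```

Because the outputs end in `_paired.fastq.gz` or `_unpaired.fastq.gz`, re-running the loop does not pick up already-trimmed files as new inputs.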
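**Adapter files** ship with Trimmomatic. In a conda install they are typically under `$CONDA_PREFIX/share/trimmomatic/adapters/`; pass the full path to `ILLUMINACLIP` unless the file is in your working directory: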
- `TruSeq3-PE-2.fa` — Illumina TruSeq stranded kits (most common)
- `NexteraPE-PE.fa` — Nextera XT, Nextera Flex
**Quality check before and after:**
```bash
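# FastQC does not create output directories, so make them first
mkdir -p fastqc_before fastqc_after
# Before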
fastqc sample_R1.fastq.gz sample_R2.fastq.gz -o fastqc_before/
# After
fastqc sample_R1_paired.fastq.gz sample_R2_paired.fastq.gz -o fastqc_after/
# Aggregate all samples
multiqc fastqc_before/ fastqc_after/ -o multiqc_report/
```
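As a quick numeric check alongside FastQC, one could count reads before and after trimming (Trimmomatic also prints a summary of surviving pairs to stderr when it finishes). The sketch below assumes gzip-compressed FASTQ with four lines per record and reuses the file names from the Trimmomatic command; `count_reads` and `report_survival` are helper names introduced here for illustration.

```bash
#!/usr/bin/env bash
# Sketch: count reads in gzipped FASTQ files to see how many pairs survived.
# Assumes four lines per record (no wrapped sequence lines).

count_reads() {
    # Decompress to stdout and divide the line count by 4.
    gzip -dc "$1" | awk 'END { print int(NR / 4) }'
}

report_survival() {
    local raw="$1" r1p="$2" r2p="$3"
    local n_raw n_r1 n_r2
    n_raw=$(count_reads "$raw")
    n_r1=$(count_reads "$r1p")
    n_r2=$(count_reads "$r2p")

    # Paired outputs must stay in sync: same number of reads in R1 and R2.
    if [ "$n_r1" -ne "$n_r2" ]; then
        echo "ERROR: paired files differ ($n_r1 vs $n_r2 reads)" >&2
        return 1
    fi
    awk -v a="$n_r1" -v b="$n_raw" \
        'BEGIN { printf "%d of %d pairs kept (%.1f%%)\n", a, b, 100 * a / (b ? b : 1) }'
}

# Example call, run only if the files from the Trimmomatic command exist.
if [ -e sample_R1.fastq.gz ] && [ -e sample_R1_paired.fastq.gz ] \
    && [ -e sample_R2_paired.fastq.gz ]; then
    report_survival sample_R1.fastq.gz \
        sample_R1_paired.fastq.gz sample_R2_paired.fastq.gz
fi
```

A large drop in surviving pairs, or a mismatch between the two paired files, is a sign that the adapter file or thresholds deserve a second look before alignment.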
**Modern alternative: fastp**
```bash
# -i/-I: R1/R2 input, -o/-O: R1/R2 output, -w: worker threads
fastp -i R1.fastq.gz -I R2.fastq.gz \
  -o R1_clean.fastq.gz -O R2_clean.fastq.gz \
  --detect_adapter_for_pe \
  -j report.json -h report.html -w 8
```
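fastp detects adapters automatically, so no adapter file is needed, and it writes its own JSON and HTML QC reports. Its authors report it as several times faster than Trimmomatic (roughly 3× in the comparison cited here), though the actual speedup depends on thread count and I/O.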