36

How to sort, index, filter, and extract reads from BAM files using samtools

I have an unsorted BAM file from a BWA-MEM alignment. I need to: (1) sort and index it, (2) filter out unmapped and low-quality reads, (3) extract reads from a specific genomic region, and (4) check alignment statistics. What are the samtools commands for each step?
14 views asked 3 weeks ago by Admin
1 Answer
30
✓ Accepted Answer
Here is a full samtools workflow: ```bash # 1. Sort and index samtools sort -@ 8 -o sorted.bam unsorted.bam samtools index sorted.bam # 2. Filter reads # -F 4 exclude unmapped reads # -F 256 exclude secondary alignments # -F 2048 exclude supplementary alignments # -q 30 minimum MAPQ score samtools view -@ 4 -b -F 2308 -q 30 sorted.bam > filtered.bam samtools index filtered.bam # 3. Extract a specific region samtools view -b filtered.bam chr1:1000000-2000000 > region.bam samtools index region.bam # 4. Alignment statistics samtools flagstat filtered.bam samtools stats filtered.bam | grep ^SN | cut -f 2- # 5. Coverage per position samtools depth -a filtered.bam > coverage.txt # 6. Quick quality check (if samtools mpileup needed) samtools mpileup -q 20 -Q 20 filtered.bam | head -100 ``` **Common -F flag values:** ``` 4 = read unmapped 8 = mate unmapped 256 = not primary alignment 512 = read fails platform QC 1024 = PCR duplicate 2048 = supplementary alignment ``` To filter multiple flags combine with bitwise OR: `-F 2308` = `-F 4|256|2048` (unmapped + secondary + supplementary) For duplicate marking use `samtools markdup` (part of samtools 1.10+) or GATK MarkDuplicates.
answered 2 weeks ago by Admin