How to use Docker and Singularity to containerize bioinformatics tools for reproducibility
I want to make my bioinformatics analysis fully reproducible using containers. My HPC cluster doesn't allow Docker (requires root), but Singularity is available. How do I: (1) build a Docker image with my tools, (2) convert it to Singularity, and (3) run Singularity containers on an HPC cluster?
1 Answer
✓ Accepted Answer
**Step 1 — Create a Dockerfile**
```dockerfile
FROM condaforge/mambaforge:latest
LABEL maintainer="[email protected]"
# Install tools via conda
RUN mamba install -c conda-forge -c bioconda --yes \
        star=2.7.11 \
        samtools=1.19 \
        fastqc \
        trimmomatic \
        multiqc \
    && conda clean -afy
# Set working directory
WORKDIR /data
```
**Step 2 — Build and push Docker image**
```bash
# Build image
docker build -t myname/rnaseq-tools:1.0 .
# Test it works
docker run --rm myname/rnaseq-tools:1.0 STAR --version
# Push to Docker Hub (free)
docker push myname/rnaseq-tools:1.0
```
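Since the goal is reproducibility, it can help to check that an image reference is pinned before you build on it or push it; the Dockerfile above, for instance, starts from a `:latest` base, which can change over time. A minimal sketch, assuming plain tag or digest references; `is_pinned` is a hypothetical helper name, not a Docker command:

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed only if an image reference carries an explicit
# tag other than "latest", or a sha256 digest.
is_pinned() {
    local ref="${1#docker://}"   # tolerate a docker:// prefix
    local name="${ref##*/}"      # last path component, where the tag or digest lives
    case "$name" in
        *@sha256:*) return 0 ;;  # pinned by digest
        *:latest)   return 1 ;;  # explicit but moving tag
        *:*)        return 0 ;;  # explicit version tag
        *)          return 1 ;;  # no tag means "latest"
    esac
}

# Example: warn about each unpinned reference given on the command line
for ref in "$@"; do
    is_pinned "$ref" || echo "warning: $ref is not pinned to a version" >&2
done
```

Run against the names used here, `myname/rnaseq-tools:1.0` passes and `condaforge/mambaforge:latest` triggers the warning.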
**Step 3 — Convert to Singularity on HPC**
```bash
# Pull from Docker Hub and convert (run on HPC)
singularity pull docker://myname/rnaseq-tools:1.0
# Creates: rnaseq-tools_1.0.sif
# Or pull pre-made images from BioContainers (tags of this form are hosted on quay.io)
singularity pull docker://quay.io/biocontainers/blast:2.12.0--h3289130_3
```
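Because `singularity pull` names the output file after the image and tag (as with `rnaseq-tools_1.0.sif` above), a script can check for that file before pulling again. A minimal sketch, assuming tag-style references rather than digests; `sif_name_for` and `pull_if_missing` are hypothetical helper names, not Singularity commands:

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the default .sif filename for a docker:// URI,
# e.g. docker://myname/rnaseq-tools:1.0 -> rnaseq-tools_1.0.sif
sif_name_for() {
    local ref="${1#docker://}"   # strip the URI scheme
    local name="${ref##*/}"      # keep the last path component (image:tag)
    case "$name" in
        *:*) ;;                          # tag already present
        *)   name="${name}:latest" ;;    # no tag means "latest"
    esac
    printf '%s.sif\n' "$(printf '%s' "$name" | tr ':' '_')"
}

# Hypothetical helper: pull only if the image file is not already present,
# then print the image filename on stdout.
pull_if_missing() {
    local uri="$1" sif
    sif="$(sif_name_for "$uri")"
    if [ -f "$sif" ]; then
        echo "Using existing $sif" >&2
    else
        singularity pull "$sif" "$uri" >&2
    fi
    printf '%s\n' "$sif"
}
```

Typical use would be `image="$(pull_if_missing docker://myname/rnaseq-tools:1.0)"` on a login node, followed by `singularity exec "$image" ...` in the job.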
**Step 4 — Run Singularity on HPC**
```bash
# Run a command
singularity exec rnaseq-tools_1.0.sif STAR --version
# Bind mount your data directories
singularity exec \
    --bind /scratch/myproject:/data \
    --bind /ref:/reference \
    rnaseq-tools_1.0.sif \
    STAR --genomeDir /reference/star_index \
         --readFilesCommand zcat \
         --readFilesIn /data/R1.fastq.gz /data/R2.fastq.gz
# SLURM job script using Singularity
cat > run_star.sh << 'EOF'
#!/bin/bash
#SBATCH -J star_align
#SBATCH -c 8
#SBATCH --mem 32G
singularity exec --bind "$SCRATCH":/data rnaseq-tools_1.0.sif \
    STAR --runThreadN 8 --genomeDir /data/star_index ...
EOF
sbatch run_star.sh
```
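Two things tend to go wrong in job scripts like `run_star.sh`: a bind source that does not exist on the compute node, and a thread count that drifts away from the `#SBATCH -c` value. A minimal sketch of guarding against both; `bind_spec` is a hypothetical helper, and it relies on `--bind` accepting a comma-separated list of `src:dest` pairs:

```bash
#!/usr/bin/env bash
# Hypothetical helper: join host:container pairs into one comma-separated
# value for --bind, failing early if a host directory is missing.
bind_spec() {
    local spec="" pair host
    for pair in "$@"; do
        host="${pair%%:*}"               # part before the first colon
        if [ ! -d "$host" ]; then
            echo "bind source not found: $host" >&2
            return 1
        fi
        spec="${spec:+$spec,}$pair"      # append, comma-separated
    done
    printf '%s\n' "$spec"
}

# Inside a SLURM job, reuse the allocated core count instead of hard-coding it
# (falls back to 1 when run outside SLURM).
threads="${SLURM_CPUS_PER_TASK:-1}"
```

In the job script this could be used as `binds="$(bind_spec "$SCRATCH:/data")" || exit 1`, then `singularity exec --bind "$binds" rnaseq-tools_1.0.sif STAR --runThreadN "$threads" ...`.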
**Quick alternative — Singularity with conda (no Docker needed):**
```bash
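# Build Singularity from a conda environment file. There is no "conda" bootstrap
# agent, so start from a conda base image and install environment.yml in %post.
cat > rnaseq.def << 'EOF'
Bootstrap: docker
From: condaforge/mambaforge:latest
%files
    environment.yml /environment.yml
%post
    /opt/conda/bin/mamba env update -n base -f /environment.yml
EOF
# Building from a .def file needs root or --fakeroot (if your site enables it)
singularity build --fakeroot rnaseq.sif rnaseq.def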
```
For Snakemake users: `snakemake --use-singularity` automatically pulls and runs the image named in each rule's `container:` directive, so rules without that directive still run on the host.