
How to use Docker and Singularity to containerize bioinformatics tools for reproducibility

I want to make my bioinformatics analysis fully reproducible using containers. My HPC cluster doesn't allow Docker (requires root), but Singularity is available. How do I: (1) build a Docker image with my tools, (2) convert it to Singularity, and (3) run Singularity containers on an HPC cluster?
7 views asked 4 days ago by Admin
1 Answer
✓ Accepted Answer
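**Step 1: Create a Dockerfile**

```dockerfile
# For a reproducible base, pin a specific tag here instead of "latest"
FROM condaforge/mambaforge:latest
LABEL maintainer="[email protected]"

# Install tools via conda (pin the version of every tool you rely on)
RUN mamba install -c conda-forge -c bioconda --yes \
        star=2.7.11 \
        samtools=1.19 \
        fastqc \
        trimmomatic \
        multiqc \
    && conda clean -afy

# Set working directory
WORKDIR /data
```

**Step 2: Build and push the Docker image**

Do this on a machine where you have Docker, such as your laptop or a workstation.

```bash
# Build image
docker build -t myname/rnaseq-tools:1.0 .

# Test it works
docker run --rm myname/rnaseq-tools:1.0 STAR --version

# Push to Docker Hub (free for public repositories)
docker push myname/rnaseq-tools:1.0
```

**Step 3: Convert to Singularity on the HPC**

```bash
# Pull from Docker Hub and convert (run on HPC)
singularity pull docker://myname/rnaseq-tools:1.0
# Creates: rnaseq-tools_1.0.sif

# Or pull pre-made images from BioContainers (hosted on quay.io)
singularity pull docker://quay.io/biocontainers/blast:2.12.0--h3289130_3
```

**Step 4: Run Singularity on the HPC**

```bash
# Run a command
singularity exec rnaseq-tools_1.0.sif STAR --version

# Bind mount your data directories (host_path:container_path)
singularity exec \
    --bind /scratch/myproject:/data \
    --bind /ref:/reference \
    rnaseq-tools_1.0.sif \
    STAR --genomeDir /reference/star_index \
         --readFilesIn /data/R1.fastq.gz /data/R2.fastq.gz \
         --readFilesCommand zcat   # needed because the FASTQ files are gzipped

# SLURM job script using Singularity
# ($SCRATCH is site-specific; use whatever your cluster defines)
cat > run_star.sh << 'EOF'
#!/bin/bash
#SBATCH -J star_align
#SBATCH -c 8
#SBATCH --mem 32G

singularity exec --bind "$SCRATCH":/data rnaseq-tools_1.0.sif \
    STAR --runThreadN 8 --genomeDir /data/star_index ...
EOF
sbatch run_star.sh
```

**Quick alternative: build a Singularity image from a conda environment (no local Docker needed):**

```bash
# Definition file: start from a conda base image and install environment.yml into it
cat > rnaseq.def << 'EOF'
Bootstrap: docker
From: condaforge/mambaforge:latest

%files
    environment.yml /environment.yml

%post
    /opt/conda/bin/conda env update -n base -f /environment.yml
    /opt/conda/bin/conda clean -afy
EOF

# Building needs root, or --fakeroot if your cluster has it enabled
singularity build --fakeroot rnaseq.sif rnaseq.def
```

For Snakemake users: with `snakemake --use-singularity`, Snakemake pulls the image named in each rule's `container:` directive and runs that rule inside it. Newer Snakemake releases spell the option `--software-deployment-method apptainer`.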
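A registry tag such as `1.0` can be re-pushed with different contents, so for reproducibility it can help to record exactly which image file produced a set of results. Below is a minimal sketch, assuming GNU `sha256sum` is available on the cluster. The function name `record_image` and the file name `container.sha256` are hypothetical choices for illustration, not part of Singularity.

```bash
# Hypothetical helper: store the checksum of the image next to the results.
# Usage: record_image IMAGE.sif OUTDIR
record_image() {
    image="$1"
    outdir="$2"

    # Refuse to continue if the image file does not exist
    if [ ! -f "$image" ]; then
        echo "error: image not found: $image" >&2
        return 1
    fi

    mkdir -p "$outdir" || return 1

    # sha256sum prints "<hash>  <path>"; keeping that format means
    # `sha256sum -c` can verify the same image file later
    sha256sum "$image" > "$outdir/container.sha256"
}
```

For example, call `record_image "$PWD/rnaseq-tools_1.0.sif" results/` at the start of a job script, and later run `sha256sum -c results/container.sha256` to confirm the image has not changed. Passing an absolute path to the image lets the check run from any directory.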
answered 3 weeks ago by Admin