33

How to handle batch effects in scRNA-seq data using Seurat?

I'm integrating scRNA-seq datasets from 3 different batches (different labs, same tissue type). After merging in Seurat, the UMAP clusters by batch rather than by cell type. How do I correct for batch effects? I've tried `RunHarmony()` but I'm not sure if I'm applying it correctly.
7 views asked 1 day ago by Admin
1 Answer
28
✓ Accepted Answer
Harmony is a good choice. Here's the workflow:

```r
library(Seurat)
library(harmony)

# Merge your objects (add.cell.ids only prefixes the cell barcodes so they stay unique)
combined <- merge(batch1, y = list(batch2, batch3),
                  add.cell.ids = c('batch1', 'batch2', 'batch3'))

# Standard preprocessing
combined <- NormalizeData(combined)
combined <- FindVariableFeatures(combined, nfeatures = 3000)
combined <- ScaleData(combined)
combined <- RunPCA(combined, npcs = 50)

# Run Harmony on PCA embeddings
# Argument names differ between harmony versions; check ?RunHarmony for yours
combined <- RunHarmony(
  combined,
  group.by.vars = 'orig.ident',  # batch label column; confirm it really separates your 3 batches
  reduction = 'pca',
  dims.use = 1:30,
  assay.use = 'RNA'
)

# Use Harmony embeddings for downstream steps
combined <- RunUMAP(combined, reduction = 'harmony', dims = 1:30)
combined <- FindNeighbors(combined, reduction = 'harmony', dims = 1:30)
combined <- FindClusters(combined, resolution = 0.5)
```

**Key point**: use `reduction = 'harmony'` for UMAP and clustering, NOT `reduction = 'pca'`. Harmony stores its corrected embeddings as a separate reduction, so anything still pointed at `'pca'` runs on the uncorrected data and will keep clustering by batch.

If Harmony doesn't work well (e.g. very different protocols), try Seurat's native CCA integration (`FindIntegrationAnchors` followed by `IntegrateData`) or scVI (Python).
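To judge whether the correction helped beyond eyeballing the UMAP, one option is to look at how the batches are distributed within each cluster. The sketch below is only an illustration in base R: the toy `meta` data frame stands in for `combined@meta.data`, and the helper names `batch_fractions` and `mixing_entropy` are made up for this example, not part of Seurat or harmony.

```r
# Toy metadata standing in for combined@meta.data (hypothetical values).
meta <- data.frame(
  batch   = c("b1", "b2", "b3", "b1", "b2", "b3", "b1", "b1", "b1"),
  cluster = c("0",  "0",  "0",  "1",  "1",  "1",  "2",  "2",  "2")
)

# Fraction of each cluster's cells that come from each batch (rows sum to 1).
batch_fractions <- function(meta, cluster_col = "cluster", batch_col = "batch") {
  counts <- table(meta[[cluster_col]], meta[[batch_col]])
  prop.table(counts, margin = 1)
}

# Normalized Shannon entropy per cluster:
# 1 means the batches are evenly mixed, 0 means the cluster is a single batch.
mixing_entropy <- function(fractions) {
  n_batches <- ncol(fractions)
  apply(fractions, 1, function(p) {
    p <- p[p > 0]  # drop empty batches so log(0) never occurs
    -sum(p * log(p)) / log(n_batches)
  })
}

fractions <- batch_fractions(meta)
entropy <- mixing_entropy(fractions)

print(round(fractions, 2))
print(round(entropy, 2))
```

With a real object you would pass `combined@meta.data` and set `cluster_col = "seurat_clusters"` and `batch_col` to whatever column holds your batch label. Clusters with entropy near 0 are worth a closer look, but they are not automatically a failure: a cell type that is genuinely present in only one batch will also score low.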
answered 4 days ago by Admin