How to handle batch effects in scRNA-seq data using Seurat?

I'm integrating scRNA-seq datasets from 3 different batches (different labs, same tissue type). After merging in Seurat, the UMAP clusters by batch rather than by cell type. How do I correct for batch effects? I've tried `RunHarmony()` but I'm not sure if I'm applying it correctly.

batch-correction r-programming scrna-seq seurat single-cell

414 views · asked 2 months ago by Admin

1 Answer

28 ✓

✓ Accepted Answer

Harmony is a good choice. Here's the correct workflow:

```r
library(Seurat)
library(harmony)

# Merge your objects
combined <- merge(batch1, y = list(batch2, batch3),
                  add.cell.ids = c('batch1', 'batch2', 'batch3'))

# Standard preprocessing
combined <- NormalizeData(combined)
combined <- FindVariableFeatures(combined, nfeatures = 3000)
combined <- ScaleData(combined)
combined <- RunPCA(combined, npcs = 50)

# Run Harmony on PCA embeddings
# (argument names can differ between harmony versions; check ?RunHarmony)
combined <- RunHarmony(
  combined,
  group.by.vars = 'orig.ident',  # metadata column holding the batch label;
                                 # it must take a different value per batch
  reduction = 'pca',
  dims.use = 1:30,
  assay.use = 'RNA'
)

# Use Harmony embeddings for downstream steps
combined <- RunUMAP(combined, reduction = 'harmony', dims = 1:30)
combined <- FindNeighbors(combined, reduction = 'harmony', dims = 1:30)
combined <- FindClusters(combined, resolution = 0.5)
```

**Key point**: use `reduction = 'harmony'` for UMAP and clustering, NOT `reduction = 'pca'`. Harmony only adds a corrected embedding; the PCA reduction is left unchanged, so anything computed from `'pca'` still carries the batch effect.

If Harmony doesn't work well (e.g. the batches come from very different protocols), try Seurat's native CCA integration (`IntegrateData`) or scVI (Python).
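To check whether the correction was applied properly, it can help to look at how the batches mix after clustering. The following is a minimal sketch, not a definitive test: it assumes the `combined` object from the workflow above, that `orig.ident` is the batch column, and that `FindClusters()` has already written its default `seurat_clusters` column.

```r
library(Seurat)

# 1. Confirm the column passed to group.by.vars really separates the batches.
#    If every cell has the same value here, Harmony has nothing to correct.
table(combined$orig.ident)

# 2. Colour the Harmony-based UMAP by batch, then by cluster.
DimPlot(combined, reduction = 'umap', group.by = 'orig.ident')
DimPlot(combined, reduction = 'umap', group.by = 'seurat_clusters')

# 3. Fraction of each cluster contributed by each batch (each row sums to 1).
mix <- prop.table(
  table(combined$seurat_clusters, combined$orig.ident),
  margin = 1
)
round(mix, 2)
```

Clusters made up almost entirely of one batch are worth a closer look, but they are not automatically a failure: a cell type that is truly present in only one lab's samples will also show up that way, so compare against known marker genes before tuning Harmony further.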

answered 2 months ago by Admin