Complete single-cell RNA-seq analysis pipeline in Seurat from CellRanger output to cell type annotation
I have 10x Genomics scRNA-seq data processed through CellRanger. I now have the filtered_feature_bc_matrix output. What is the complete Seurat workflow from loading data to annotating cell types — including QC, normalization, clustering, and marker identification?
1 Answer
✓ Accepted Answer
Here is the complete Seurat v5 pipeline:
```r
library(Seurat)
library(dplyr)
library(ggplot2)
# ── 1. Load CellRanger output ──
counts <- Read10X(data.dir = 'filtered_feature_bc_matrix/')
obj <- CreateSeuratObject(counts, project = 'my_scrnaseq',
                          min.cells = 3, min.features = 200)
# ── 2. QC filtering ──
# '^MT-' matches human mitochondrial gene symbols; mouse symbols use '^mt-'
obj[['percent.mt']] <- PercentageFeatureSet(obj, pattern = '^MT-')
# View QC violin plots to decide thresholds
VlnPlot(obj, features = c('nFeature_RNA','nCount_RNA','percent.mt'), ncol=3)
# Filter: adjust thresholds for your data
obj <- subset(obj, subset =
  nFeature_RNA > 200 &
  nFeature_RNA < 6000 &
  percent.mt < 20
)
# ── 3. Normalization and HVG selection ──
obj <- NormalizeData(obj)
obj <- FindVariableFeatures(obj, nfeatures = 3000)
obj <- ScaleData(obj, vars.to.regress = 'percent.mt') # regress mt content
# ── 4. Dimensionality reduction ──
obj <- RunPCA(obj, npcs = 50)
ElbowPlot(obj, ndims = 50) # choose number of significant PCs
obj <- RunUMAP(obj, dims = 1:30)
# ── 5. Clustering ──
obj <- FindNeighbors(obj, dims = 1:30)
obj <- FindClusters(obj, resolution = 0.5) # increase for more clusters
DimPlot(obj, reduction='umap', label=TRUE)
# ── 6. Find marker genes per cluster ──
markers_all <- FindAllMarkers(
  obj,
  only.pos = TRUE,
  min.pct = 0.25,
  logfc.threshold = 0.25
)
# slice_max() replaces the superseded top_n(); keeps the 10 highest-logFC genes per cluster
top_markers <- markers_all %>%
  group_by(cluster) %>%
  slice_max(avg_log2FC, n = 10)
DoHeatmap(obj, features = top_markers$gene) + NoLegend()
# ── 7. Annotate cell types manually ──
# After reviewing markers against known marker databases (CellMarker, PanglaoDB)
new_ids <- c(
  '0' = 'T cells',
  '1' = 'B cells',
  '2' = 'NK cells',
  '3' = 'Monocytes',
  '4' = 'Dendritic cells'
  # etc.
)
obj <- RenameIdents(obj, new_ids)
DimPlot(obj, label=TRUE, reduction='umap')
```
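On the QC step: the fixed cutoffs above are only starting points. One common alternative is to derive bounds from the data itself, for example median ± 3 median absolute deviations (MADs). The sketch below is base R on toy numbers; `mad_bounds` is a hypothetical helper, not a Seurat function, and the choice of 3 MADs is an assumption you should tune.

```r
# Hypothetical helper (not part of Seurat): median +/- nmads * MAD bounds
mad_bounds <- function(x, nmads = 3) {
  center <- median(x, na.rm = TRUE)
  spread <- mad(x, na.rm = TRUE)  # base R mad() uses the 1.4826 scaling constant
  c(lower = center - nmads * spread, upper = center + nmads * spread)
}

# Toy per-cell gene counts standing in for obj$nFeature_RNA
n_feature <- c(1800, 2100, 1950, 2300, 2050, 1900, 2200, 9000, 150)

bounds <- mad_bounds(n_feature)

# TRUE for cells inside the bounds, FALSE for outliers on either side
keep <- n_feature > bounds["lower"] & n_feature < bounds["upper"]
print(bounds)
print(keep)
```

On a real object you could apply the same helper to `obj$nFeature_RNA` and keep the passing cells with `subset(obj, cells = colnames(obj)[keep])`. Check the resulting bounds against the violin plots before trusting them, since MAD-based cutoffs can be too strict for heterogeneous samples.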
**Automated annotation (faster):**
```r
library(SingleR)
library(celldex)
# Reference dataset (Human Primary Cell Atlas)
ref <- HumanPrimaryCellAtlasData()
pred <- SingleR(test = GetAssayData(obj, layer='data'),
                ref = ref, labels = ref$label.main)
# SingleR returns a DataFrame; store only the per-cell label column as metadata
obj$singler_labels <- pred$labels
```
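SingleR labels are per cell, so it is often useful to compare them against the clusters before renaming anything. A minimal sketch, assuming a simple majority vote per cluster is good enough: the inputs below are toy vectors standing in for `Idents(obj)` and `pred$labels`, and the variable names are illustrative.

```r
# Toy inputs standing in for Idents(obj) and pred$labels
clusters <- factor(c("0", "0", "0", "1", "1", "2", "2", "2"))
labels <- c("T_cells", "T_cells", "B_cell", "B_cell", "B_cell",
            "NK_cell", "NK_cell", "T_cells")

# Cross-tabulate clusters (rows) against predicted labels (columns)
tab <- table(cluster = clusters, label = labels)

# Most frequent label per cluster; ties resolve to the first column
consensus <- colnames(tab)[apply(tab, 1, which.max)]
names(consensus) <- rownames(tab)

print(tab)
print(consensus)
```

With the real data the table would be `table(Idents(obj), pred$labels)`. Clusters where no single label dominates are worth checking by hand against the marker genes rather than accepting the majority label.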