What is the difference between TPM, FPKM, and RPKM in RNA-seq gene expression normalization?

Question

I keep seeing TPM, FPKM, and RPKM used in RNA-seq papers and tools. What is the mathematical difference between them? Which should I use for between-sample comparisons, and which do tools like DESeq2 and edgeR actually want?

Admin · Accepted Answer

All three units normalize for gene length and sequencing depth. They differ in the order of the two steps, and that order determines what the values in each sample sum to:

**RPKM** (Reads Per Kilobase per Million mapped reads)
- Normalize for sequencing depth first, then gene length
- Formula: `RPKM = (read_count × 10^9) / (total_reads × gene_length_bp)`
- Problem: the sum of RPKM values differs between samples, so the same RPKM value represents a different share of the library in each sample → NOT directly comparable across samples

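**FPKM** (Fragments Per Kilobase per Million mapped fragments) is the paired-end analogue of RPKM: it counts fragments (read pairs) instead of individual reads, so a pair is not counted twice. Same formula, same problem.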
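
To make the RPKM formula concrete, here is a minimal pandas sketch. The function name `counts_to_rpkm` and the toy counts and gene lengths are made up for illustration, and it assumes "total reads" means the sum of counts over the genes in the table:

```python
import pandas as pd

def counts_to_rpkm(counts_df, gene_lengths_bp):
    """counts_df: genes x samples; gene_lengths_bp: Series indexed by gene (bp)."""
    per_million = counts_df.sum(axis=0) / 1e6       # depth scaling, one value per sample
    rpm = counts_df.div(per_million, axis=1)        # step 1: reads per million
    return rpm.div(gene_lengths_bp / 1e3, axis=0)   # step 2: per kilobase of gene

# Hypothetical toy data: both samples have the same depth (1000 reads)
toy_counts = pd.DataFrame(
    {"sample_a": [100, 200, 700], "sample_b": [500, 400, 100]},
    index=["gene1", "gene2", "gene3"],
)
toy_lengths = pd.Series([1000, 2000, 5000], index=["gene1", "gene2", "gene3"])

rpkm = counts_to_rpkm(toy_counts, toy_lengths)
print(rpkm.sum(axis=0))  # per-sample totals are not equal
```

With this toy input the RPKM totals come out to 340,000 for `sample_a` and 720,000 for `sample_b`, even though both samples have identical depth. That mismatch is the comparability problem described above.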

**TPM** (Transcripts Per Million)
- Normalize for gene length FIRST, then sequencing depth
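- Formula: `TPM = RPK / sum_of_all_RPK × 10^6`, where `RPK = read_count / gene_length_kb` and the sum runs over all genes in the sample
- TPM values in each sample always sum to 1 million, so a given TPM means the same share of the transcript pool in every sample → more comparable across samples than RPKM/FPKM. It is still a relative measure: a few very highly expressed genes can push the TPM of every other gene down.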

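```python
import pandas as pd

def counts_to_tpm(counts_df, gene_lengths):
    """counts_df: genes × samples raw counts; gene_lengths: Series of lengths in bp, indexed by gene"""
    lengths_kb = gene_lengths.reindex(counts_df.index) / 1000  # align lengths to the count rows
    if lengths_kb.isna().any():
        raise ValueError("gene_lengths is missing genes present in counts_df")
    rpk = counts_df.div(lengths_kb, axis=0)   # step 1: reads per kilobase
    scale = rpk.sum(axis=0) / 1e6             # step 2: per-sample "per million" factor
    return rpk.div(scale, axis=1)

# Usage, with raw_counts (DataFrame) and gene_length_series (Series) loaded from your own data:
# tpm = counts_to_tpm(raw_counts, gene_length_series)
```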
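
As a sanity check on the "order of operations" point, the sketch below computes both units on a small made-up table and confirms that TPM is just RPKM rescaled so each sample sums to one million. The counts and lengths are hypothetical, and "total reads" is again taken as the column sum of the table:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data
counts = pd.DataFrame(
    {"sample_a": [100, 200, 700], "sample_b": [500, 400, 100]},
    index=["gene1", "gene2", "gene3"],
)
lengths_bp = pd.Series([1000, 2000, 5000], index=counts.index)

# RPKM: depth first, then length
rpkm = counts.div(counts.sum(axis=0) / 1e6, axis=1).div(lengths_bp / 1e3, axis=0)

# TPM: length first, then depth
rpk = counts.div(lengths_bp / 1e3, axis=0)
tpm = rpk.div(rpk.sum(axis=0) / 1e6, axis=1)

# Rescaling RPKM by its per-sample total reproduces TPM
tpm_from_rpkm = rpkm.div(rpkm.sum(axis=0), axis=1) * 1e6

print(np.allclose(tpm.values, tpm_from_rpkm.values))
print(tpm.sum(axis=0))
```

So if you only have RPKM/FPKM tables, you can convert them to TPM per sample without going back to the raw counts, provided the table contains all genes that were used in the original normalization.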

**Which to use:**
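- **DESeq2 / edgeR**: raw, un-normalized counts only. They estimate their own normalization factors internally (median-of-ratios size factors in DESeq2, TMM in edgeR)
- **Between-sample comparison**: TPM rather than FPKM/RPKM for descriptive comparisons; for statistical testing, use the normalized counts produced by DESeq2 or edgeR
- **Publication figures**: TPM
- **Cross-study comparison**: none of the above, because none of these units removes batch effects; apply a batch-correction method such as ComBat-seq to the counts, or similar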

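**Rule of thumb**: Never feed TPM, FPKM, or RPKM into DESeq2 or edgeR. Their statistical models assume raw counts, so pre-normalized input distorts the variance estimates and makes the DE results unreliable.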
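
To give a feel for what "their own normalization" means, here is a simplified Python sketch of the median-of-ratios idea behind DESeq2's size factors. It is an illustration under stated assumptions, not DESeq2's actual code: the function name and toy counts are hypothetical, and genes with a zero in any sample are simply dropped before the ratios are computed.

```python
import numpy as np
import pandas as pd

def median_of_ratios_size_factors(counts_df):
    """counts_df: genes x samples of raw counts. Returns one size factor per sample."""
    # Keep genes with nonzero counts in every sample so the logs are finite
    expressed = counts_df[(counts_df > 0).all(axis=1)]
    log_counts = np.log(expressed)
    # Per-gene log geometric mean across samples acts as a pseudo-reference sample
    log_geo_mean = log_counts.mean(axis=1)
    # Size factor = median ratio of each sample to the reference, back on the linear scale
    return np.exp(log_counts.sub(log_geo_mean, axis=0).median(axis=0))

# Hypothetical toy data: sample_b was sequenced twice as deeply as sample_a
toy_counts = pd.DataFrame(
    {"sample_a": [10, 20, 30, 0], "sample_b": [20, 40, 60, 5]},
    index=["gene1", "gene2", "gene3", "gene4"],
)
size_factors = median_of_ratios_size_factors(toy_counts)
normalized = toy_counts.div(size_factors, axis=1)
print(size_factors)
```

Note that gene length never enters this calculation. For differential expression the same gene is compared across samples, so its length cancels out, and only the depth and composition differences need correcting. That is why these tools start from raw counts rather than from a length-normalized unit.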