45

How to predict protein structure with AlphaFold2 using ColabFold locally or on Google Colab

I have 50 novel protein sequences from a genome annotation and want to predict their 3D structures. I don't have access to expensive cloud GPU instances. What is the most practical way to run AlphaFold2 predictions — is ColabFold better than the full AlphaFold2 pipeline for most researchers?
12 views asked 1 month ago by Admin
1 Answer
38
✓ Accepted Answer
**ColabFold is the recommended approach for most researchers.** It runs the same AlphaFold2 models but builds the MSA with MMseqs2 instead of JackHMMER/HHblits, which typically makes it about 10× faster per prediction.

**Option 1: ColabFold on Google Colab (free GPU)**

Open: `https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb`

- Paste sequence, click Run All
- Free T4 GPU (~15 min per protein)
- Limited to ~1500 aa and ~5 free GPU hours/day

**Option 2: ColabFold locally (if you have a GPU)**

```bash
# Install
pip install 'colabfold[standard]'

# Run batch prediction on monomers
colabfold_batch sequences.fasta output_dir/ --num-recycle 3 --num-models 5

# For complexes only: append --model-type alphafold2_multimer_v3
```

**Option 3: Full AlphaFold2 (for maximum accuracy, requires ~2 TB database)**

```bash
# After installing AlphaFold2 and downloading databases
python run_alphafold.py \
  --fasta_paths=sequence.fasta \
  --max_template_date=2024-01-01 \
  --model_preset=monomer \
  --output_dir=./output \
  --data_dir=/databases/alphafold
```

**Interpreting output:**

- `ranked_0.pdb`: top-ranked model from full AlphaFold2 (usually best); ColabFold instead puts the rank in the file name, e.g. `rank_001`
- `pLDDT score`: per-residue confidence (>90 = very high, 70–90 = high, 50–70 = low, <50 = very low, often disordered)
- `PAE plot`: predicted aligned error; low values between domains = reliable inter-domain arrangement

For novel proteins, also consider ESMFold (Meta) via its API. It needs no MSA, which makes it good for rapid screening:

```python
import requests

# Placeholder sequence: replace with your own (one-letter amino-acid codes)
sequence = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"

response = requests.post(
    'https://api.esmatlas.com/foldSequence/v1/pdb/',
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
    data=sequence,
)
response.raise_for_status()  # stop here on an HTTP error instead of printing an error page
print(response.text)  # returns PDB format
```
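With 50 predictions to look through, it can help to rank them by confidence before opening any structure viewer. Below is a minimal sketch, assuming each output PDB stores pLDDT in the B-factor column (as AlphaFold2 and ColabFold output does). It reads one value per residue from the Cα atoms and bins them using the pLDDT thresholds above. The function names `plddt_per_residue` and `summarize`, and the 50–70 "low" band, are illustrative choices rather than part of either tool.

```python
from statistics import mean


def plddt_per_residue(pdb_text: str) -> list[float]:
    """Return one pLDDT value per residue, read from the B-factor of CA atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB is fixed-width: atom name in columns 13-16, B-factor in 61-66 (1-based)
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarize(scores: list[float]) -> dict[str, float]:
    """Mean pLDDT plus the fraction of residues in each confidence band."""
    if not scores:
        raise ValueError("no CA atoms found; is this a protein PDB file?")
    n = len(scores)
    return {
        "mean_plddt": round(mean(scores), 2),
        "frac_very_high": sum(s > 90 for s in scores) / n,
        "frac_high": sum(70 <= s <= 90 for s in scores) / n,
        "frac_low": sum(50 <= s < 70 for s in scores) / n,       # assumed band
        "frac_very_low": sum(s < 50 for s in scores) / n,        # often disordered
    }
```

To use it, read each top-ranked PDB file as text (for example with `pathlib.Path(p).read_text()`), call `summarize(plddt_per_residue(text))`, and sort the results by `mean_plddt`. A high `frac_very_low` usually flags proteins that are largely disordered or poorly predicted.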
answered 1 week ago by Admin