24

How do I perform a local BLAST search against a custom protein database in Python?

I have a set of protein sequences in a FASTA file and I want to run a local BLAST search against a custom database I built. I'm using Biopython. Here's what I have so far:

```python
from Bio.Blast.Applications import NcbiblastpCommandline

blastp_cline = NcbiblastpCommandline(
    query='my_sequences.fasta',
    db='my_custom_db',
    evalue=0.001,
    outfmt=5,
    out='results.xml'
)
stdout, stderr = blastp_cline()
```

I get the error: `BLAST Database error: No alias or index file found for protein database`. What am I missing?
4 views asked 3 days ago by Admin
2 Answers
18
✓ Accepted Answer
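You need to create the BLAST database with `makeblastdb` before you can query it. That error means `blastp` could not find the database index files for the name you passed as `db`. Here's the full workflow:

```python
from Bio.Blast.Applications import NcbimakeblastdbCommandline, NcbiblastpCommandline

# Step 1: Create the database from the FASTA file of subject sequences
makeblastdb_cline = NcbimakeblastdbCommandline(
    input_file='my_protein_sequences.fasta',
    dbtype='prot',          # protein database
    out='my_custom_db',     # prefix for the database files
    title='My Custom Protein DB'
)
stdout, stderr = makeblastdb_cline()

# Step 2: Run BLAST; db must match the out prefix used in step 1
blastp_cline = NcbiblastpCommandline(
    query='my_sequences.fasta',
    db='my_custom_db',
    evalue=0.001,
    outfmt=5,               # XML output
    out='results.xml'
)
stdout, stderr = blastp_cline()
```

The value of `db` has to match the `out` prefix given to `makeblastdb`, and a relative name is resolved against the current working directory, so run both steps from the same directory or use absolute paths. Also make sure BLAST+ is installed and accessible from your PATH. You can verify with `which blastp` on Linux/Mac or `where blastp` on Windows.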
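If you would rather not depend on the `Bio.Blast.Applications` wrappers (recent Biopython releases mark them as deprecated and suggest calling the tools through `subprocess`), the same two steps can be sketched with the standard library. This is a minimal sketch, assuming the BLAST+ binaries `makeblastdb` and `blastp` are on your PATH; the helper names `build_makeblastdb_cmd`, `build_blastp_cmd`, `run`, and `rebuild_and_search` are made up for illustration, and the file names are the ones used above.

```python
import shutil
import subprocess


def build_makeblastdb_cmd(fasta, db_name, title=None):
    """Return the makeblastdb argument list for a protein database."""
    cmd = ["makeblastdb", "-in", fasta, "-dbtype", "prot", "-out", db_name]
    if title:
        cmd += ["-title", title]
    return cmd


def build_blastp_cmd(query, db_name, out, evalue=0.001):
    """Return the blastp argument list, writing XML (outfmt 5) to `out`."""
    return [
        "blastp", "-query", query, "-db", db_name,
        "-evalue", str(evalue), "-outfmt", "5", "-out", out,
    ]


def run(cmd):
    """Run a command, failing early if the executable is not on PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH; is BLAST+ installed?")
    # check=True raises CalledProcessError if the tool exits with an error
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


def rebuild_and_search():
    """Hypothetical driver: build the database, then search it."""
    run(build_makeblastdb_cmd("my_protein_sequences.fasta", "my_custom_db",
                              title="My Custom Protein DB"))
    run(build_blastp_cmd("my_sequences.fasta", "my_custom_db", "results.xml"))
```

Calling `rebuild_and_search()` from the directory that holds both FASTA files should leave `results.xml` next to them. Building the argument lists separately from running them makes it easy to print a command and try it by hand when something fails.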
answered 2 days ago by Admin
7
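Also worth noting: if you're searching a large database, consider using DIAMOND instead of `blastp`. It can be orders of magnitude faster for large-scale protein searches. Note that DIAMOND builds its own database file (`my_custom_db.dmnd` here), which is not interchangeable with a `makeblastdb` database:

```bash
# Build a DIAMOND database from the protein FASTA file
diamond makedb --in my_proteins.fasta -d my_custom_db

# Search the query sequences against it
diamond blastp -q query.fasta -d my_custom_db -o results.tsv
```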
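To read `results.tsv` back into Python, something like the following may work. It is a rough sketch that assumes DIAMOND was left on its default output, which as far as I know is the BLAST tabular layout with the 12 standard columns; if you pass a custom `--outfmt`, adjust `COLUMNS` to match. The function name `parse_tabular` is my own.

```python
import csv

# Standard BLAST tabular (outfmt 6) column order, assumed for results.tsv
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]
INT_FIELDS = ("length", "mismatch", "gapopen", "qstart", "qend", "sstart", "send")
FLOAT_FIELDS = ("pident", "evalue", "bitscore")


def parse_tabular(lines):
    """Parse tab-separated hit lines into a list of dicts with typed values."""
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        # Skip blank lines and comment/header lines
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(COLUMNS, row))
        for key in INT_FIELDS:
            hit[key] = int(hit[key])
        for key in FLOAT_FIELDS:
            hit[key] = float(hit[key])
        hits.append(hit)
    return hits
```

Usage would be along the lines of `with open("results.tsv") as handle: hits = parse_tabular(handle)`, after which you can filter on `hit["evalue"]` or `hit["pident"]` as needed.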
answered 5 days ago by Admin