Here is a shell script to run PLAST:
#!/bin/bash
#$ -S /bin/bash
. /etc/profile
#$ -o logo
#$ -cwd
#$ -pe threaded 10
cd $PWD
CPUs=10
DB=/db1/nr-nt-fasta-oct-2020/nt
QF=yourquery.fasta
plast -e 1e-10 -max-hit-per-query 1 -outfmt 1 -a $CPUs -p plastn -max-database-size 10000000000 -i $QF -d $DB -o $QF.plout -force-query-order 1000
To parse the output, see http://129.173.88.134:81/dokuwiki/doku.php?id=dayana_salas_-_utility_scripts_taxonomy_coloring_trees_phylogenetics_mixture_models_domain_architecture_and_more
Here is an example shell script to run BLAST:
#!/bin/bash
#$ -S /bin/bash
. /etc/profile
#$ -pe threaded 1
#$ -cwd
source activate blast
export BLASTDB=/db1/nr-nt-oct-2020-v5/
DB=nt
query=your_query.fasta
blastn -db $DB -query $query -out yourqueryresults.blout -num_threads 1 -outfmt "6 qseqid sseqid evalue pident qcovs length slen qlen qstart qend sstart send stitle"
source deactivate
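Both scripts use the NCBI nt database (/db1/nr-nt-jan-2019/nt.nal), but BLAST and PLAST specify the database differently: the PLAST script passes a full path with -d, whereas the BLAST script sets BLASTDB to the database directory and passes only the database name with -db.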
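The BLAST script writes tab-separated output in a custom column order (qseqid, sseqid, evalue, pident, qcovs, ...). As a minimal sketch of working with that layout, the function below keeps only hits above an identity threshold and a query-coverage threshold. The name filter_hits and the default thresholds are hypothetical and chosen for illustration; they are not part of BLAST.

```shell
#!/bin/bash
# Sketch: filter tabular BLAST output written with
#   -outfmt "6 qseqid sseqid evalue pident qcovs length slen qlen qstart qend sstart send stitle"
# In that layout column 4 is pident and column 5 is qcovs.
# filter_hits and its default thresholds are illustrative assumptions.
filter_hits() {
    local blout=$1
    local min_pident=${2:-90}   # minimum percent identity
    local min_qcovs=${3:-50}    # minimum query coverage (percent)
    # Split on tabs only, because stitle (last column) may contain spaces.
    awk -F'\t' -v p="$min_pident" -v c="$min_qcovs" \
        '($4 + 0) >= (p + 0) && ($5 + 0) >= (c + 0)' "$blout"
}
```

For example, `filter_hits yourqueryresults.blout 95 80 > filtered.blout` would keep hits with at least 95% identity and 80% query coverage.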
Guide for BLAST usage
- blastp: search a protein database (e.g., SwissProt db, NCBI-nr) using a protein sequence query
- blastn: search a nucleotide database (e.g., NCBI-nt, MMETSP_DB_clean.v2018.fa) using a nucleotide sequence query
- blastx: search a protein database with a translated nucleotide sequence query
- tblastn: search a translated nucleotide database with a protein sequence query
- tblastx: search a translated nucleotide database with a translated nucleotide sequence query
Note: blastp and blastx can usually provide better hit alignments than blastn, especially for distantly related species. This is because amino acid sequences are more conserved than nucleotide sequences (Koonin and Galperin, 2002).
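The list above amounts to a lookup from query type and database type to a program. A minimal sketch of that lookup follows; choose_blast_program is a hypothetical helper, and the labels "nucl" and "prot" are assumptions made here for illustration.

```shell
#!/bin/bash
# Sketch: map query and database sequence types to a BLAST program.
# choose_blast_program is a hypothetical helper, not a BLAST command.
# Arguments: query type and database type, each "nucl" or "prot".
choose_blast_program() {
    local query_type=$1 db_type=$2
    case "$query_type:$db_type" in
        prot:prot) echo blastp ;;
        nucl:nucl) echo blastn ;;    # tblastx is the translated alternative
        nucl:prot) echo blastx ;;
        prot:nucl) echo tblastn ;;
        *)
            echo "unknown combination: $query_type vs $db_type" >&2
            return 1
            ;;
    esac
}
```

For example, `choose_blast_program nucl prot` prints blastx, the program for a nucleotide query against a protein database.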
Common errors
These errors appear when the BLAST program (e.g., blastn or blastp) does not match the sequence type (amino acid or nucleotide) of the query or the database:
Error 1:
FASTA-Reader: Ignoring invalid residues at position(s): On line 7: 4, 8, 10, 13, 27-29, 32, 42, 45, 51, 53, 56, 63, 66-67, 70, 78
FASTA-Reader: Ignoring invalid residues at position(s): On line 8: 6, 9, 15, 19-20, 22, 28, 34-39, 45-48, 52
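Solution 1: The BLAST program does not match the query type. This message typically appears when an amino acid query is passed to a program that expects a nucleotide query (e.g., blastn). Use the program that matches the query.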
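One way to catch Error 1 before submitting a job is to guess the query type from its residues. The sketch below is a rough heuristic under stated assumptions: guess_seq_type is a hypothetical helper, only the first 1000 sequence lines are sampled, and the 90% threshold for calling a file nucleotide is an arbitrary choice.

```shell
#!/bin/bash
# Sketch: guess whether a FASTA file holds nucleotide or protein sequences.
# guess_seq_type is a hypothetical helper; the 1000-line sample and the
# 90% cutoff are illustrative assumptions, not BLAST behaviour.
guess_seq_type() {
    local fasta=$1
    local residues total nuc
    # Drop header lines, sample the first 1000 sequence lines,
    # remove whitespace and convert to upper case.
    residues=$(grep -v '^>' "$fasta" | head -n 1000 \
        | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]')
    total=${#residues}
    if [ "$total" -eq 0 ]; then
        echo unknown
        return 1
    fi
    # Count characters that are plausible nucleotide codes.
    nuc=$(printf '%s' "$residues" | tr -cd 'ACGTUN' | wc -c)
    if [ $((nuc * 100)) -ge $((total * 90)) ]; then
        echo nucl
    else
        echo prot
    fi
}
```

For example, if `guess_seq_type your_query.fasta` prints prot, the query should go to blastp or tblastn rather than blastn.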
Error 2: BLAST Database error: No alias or index file found for protein database [XXX.fa] in search path [/misc/scratch2/XXX:]
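Solution 2: A nucleotide database was treated as a protein database, for example by running blastp or blastx against it. Use a program that searches nucleotide databases (blastn, tblastn, or tblastx).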
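Error 2 mentions alias and index files, so a quick check is to see which of those files exist for the database prefix. The sketch below assumes the usual BLAST+ file extensions (.nin index and .nal alias for nucleotide databases, .pin and .pal for protein databases); blast_db_type is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch: report whether a BLAST database prefix is nucleotide or protein
# by looking for its index (.nin/.pin) or alias (.nal/.pal) file.
# blast_db_type is a hypothetical helper; the extensions are the usual
# ones written by makeblastdb and are assumed here.
blast_db_type() {
    local prefix=$1
    if [ -e "$prefix.nal" ] || [ -e "$prefix.nin" ]; then
        echo nucl
    elif [ -e "$prefix.pal" ] || [ -e "$prefix.pin" ]; then
        echo prot
    else
        # No index or alias file found for this prefix.
        echo none
        return 1
    fi
}
```

If the function prints none, the FASTA file has probably not been formatted yet; a nucleotide file can be formatted with `makeblastdb -in XXX.fa -dbtype nucl`.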
