blast_and_plast

Differences

This shows you the differences between two versions of the page.

--- blast_and_plast [2021/09/01 16:35] 38.20.199.40
+++ blast_and_plast [2024/11/04 10:14] (current) 110.239.172.216
Line 10:
 cd $PWD
 CPUs=10
-DB=/db1/nr-nt-fasta-oct-2020/nt
+DB=/scratch3/rogerlab_databases/other_dbs/nr_March252023/plast/nr.fasta
 QF=yourquery.fasta
-plast -e 1e-10 -max-hit-per-query 1 -outfmt 1 -a $CPUs -p plastn -max-database-size 10000000000 -i $QF -d $DB -o $QF.plout -force-query-order 1000
+plast -e 1e-10 -max-hit-per-query 1 -outfmt 1 -a $CPUs -p plastp -max-database-size 10000000000 -i $QF -d $DB -o $QF.plout -force-query-order 1000
 </code>
 
-to parse the output see http://129.173.88.134:81/dokuwiki/doku.php?id=dayana_salas_-_utility_scripts_taxonomy_coloring_trees_phylogenetics_mixture_models_domain_architecture_and_more
+to parse the output see https://perun.biochem.dal.ca/user-wiki/doku.php?id=taxonomy_recovery
  
  
Line 26:
 #$ -S /bin/bash
 . /etc/profile
-#$ -pe threaded 1
+#$ -pe threaded 10
 #$ -cwd
 source activate blast
-export BLASTDB=/db1/nr-nt-oct-2020-v5/
+export BLASTDB=/db1/blast-may-2024/
 DB=nt
 query=your_query.fasta
-blastn -db $DB -query $query -out yourqueryresults.blout -num_threads -outfmt "6 qseqid sseqid evalue pident qcovs length slen qlen qstart qend sstart send stitle"
+blastn -db $DB -query $query -out yourqueryresults.blout -num_threads 10 -outfmt "6 qseqid sseqid evalue pident qcovs length slen qlen qstart qend sstart send stitle"
-source deactivate
+conda deactivate
 </code>
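
The custom `-outfmt "6 ..."` string in the job script writes one tab-separated line per hit, with columns in the order listed there. A minimal sketch of filtering such a file follows. The helper name `filter_hits` and the default thresholds (90% identity, 50% query coverage) are assumptions made for illustration and are not part of the wiki page.

```shell
#!/bin/bash
# Column order follows the -outfmt string used in the job script:
#  1 qseqid   2 sseqid   3 evalue   4 pident   5 qcovs   6 length   7 slen
#  8 qlen     9 qstart  10 qend    11 sstart  12 send   13 stitle

# filter_hits FILE [MIN_PIDENT] [MIN_QCOVS]
# Prints qseqid, sseqid, evalue and stitle for hits that pass both thresholds.
# The thresholds are illustrative defaults, not recommended values.
filter_hits() {
    local blout=$1
    local min_pident=${2:-90}
    local min_qcovs=${3:-50}
    # -F'\t' keeps stitle intact even though it usually contains spaces.
    awk -F'\t' -v p="$min_pident" -v c="$min_qcovs" \
        '$4 + 0 >= p && $5 + 0 >= c { print $1 "\t" $2 "\t" $3 "\t" $13 }' "$blout"
}
```

For example, `filter_hits yourqueryresults.blout 95 80 > filtered.tsv` would keep only hits with at least 95% identity and 80% query coverage.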
  
-Both shells using NCBI nt database (/db1/nr-nt-jan-2019/nt.nal), but the formats for specifying DB are different for BLAST and PLAST.
+Both shells use NCBI nt database, but PLAST doesn't support new v5 NCBI nr and nt databases and can cause Segmentation fault error.
 
-Guide for **BLAST** usage
-blastp:search protein database(e.g.SwissProt db, NCBI-nr) using protein sequence query
-blastn:search nucleotide database(e.g., NCBI-nt, MMETSP_DB_clean.v2018.fa)using nucleotide sequence query
-blastx:search protein database with translated nucleotide sequence query
-tblastn:search translated nucleotide database with protein sequence query
-tblastx:search translated nucleotide database with translated nucleotide sequence query
+<Last updated by Dandan Zhao on Jun 112024>
+
+
+
+
+
 
-Note: blastp and blastx can usually provide better alignments than blastn, especially for distantly related species.This is because amino acids sequences are more conserved than nucleotides (Koonin and Galperin, 2002).
 
-General bugs when mistakenly use blast options(e.g., blastn or blastp) or query sequence (amino acids or nucleotides sequences):
-
-Error 1:
-FASTA-Reader: Ignoring invalid residues at position(s): On line 7: 4, 8, 10, 13, 27-29, 32, 42, 45, 51, 53, 56, 63, 66-67, 70, 78
-FASTA-Reader: Ignoring invalid residues at position(s): On line 8: 6, 9, 15, 19-20, 22, 28, 34-39, 45-48, 52
-
-Solve :
-This is due to mistakenly use the blast options.
-
-Error 2:
-BLAST Database error: No alias or index file found for protein database [XXX.fa] in search path [/misc/scratch2/XXX:]
-
-Solve 2:
-This is due to mistakenly treat nucleotide database as protein database.
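
Both errors in the removed guide come from a mismatch between the BLAST program and the type of the query or database. A rough sketch of checking the query type before submitting a job is given here. The function names `guess_seq_type` and `pick_blast_program` and the 90% cutoff are hypothetical, and the residue-counting heuristic is an assumption rather than anything BLAST itself does.

```shell
#!/bin/bash
# guess_seq_type FASTA
# Prints "nucl", "prot", or "unknown" (no residues found).
# Heuristic (an assumption, not an NCBI rule): if at least 90% of the
# letters in the sequence lines are A, C, G, T, U or N, call it nucleotide.
guess_seq_type() {
    awk '!/^>/ {
             line = toupper($0)
             gsub(/[^A-Z]/, "", line)            # drop gaps, stops, whitespace
             total += length(line)
             nuc += gsub(/[ACGTUN]/, "", line)   # gsub returns the match count
         }
         END {
             if (total == 0) print "unknown"
             else if (nuc / total >= 0.9) print "nucl"
             else print "prot"
         }' "$1"
}

# pick_blast_program QUERY_TYPE DB_TYPE
# Maps query/database types to a program, following the removed guide.
pick_blast_program() {
    case "$1:$2" in
        nucl:nucl) echo blastn ;;
        prot:prot) echo blastp ;;
        nucl:prot) echo blastx ;;
        prot:nucl) echo tblastn ;;
        *) echo "unsupported combination: $1 query, $2 database" >&2
           return 1 ;;
    esac
}
```

For example, `pick_blast_program "$(guess_seq_type your_query.fasta)" nucl` prints `blastn` for a nucleotide query and `tblastn` for a protein query. The sketch never returns tblastx, the other program the guide lists for a nucleotide query against a nucleotide database. Short or low-complexity protein files can be misclassified, so the result is a hint rather than a guarantee.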
blast_and_plast.1630524935.txt.gz · Last modified: by 38.20.199.40