blast_and_plast
Differences
This shows you the differences between two versions of the page.
| blast_and_plast [2021/09/02 16:00] – 38.20.199.40 | blast_and_plast [2024/11/04 10:14] (current) – 110.239.172.216 | ||
|---|---|---|---|
| Line 10: | Line 10: | ||
| cd $PWD | cd $PWD | ||
| CPUs=10 | CPUs=10 | ||
| - | DB=/db1/nr-nt-fasta-oct-2020/ | + | DB=/scratch3/ |
| QF=yourquery.fasta | QF=yourquery.fasta | ||
| - | plast -e 1e-10 -max-hit-per-query 1 -outfmt 1 -a $CPUs -p plastn | + | plast -e 1e-10 -max-hit-per-query 1 -outfmt 1 -a $CPUs -p plastp |
| </ | </ | ||
| - | to parse the output see http://129.173.88.134:81/dokuwiki/ | + | to parse the output see https://perun.biochem.dal.ca/user-wiki/ |
| Line 26: | Line 26: | ||
| #$ -S /bin/bash | #$ -S /bin/bash | ||
| . / | . / | ||
| - | #$ -pe threaded | + | #$ -pe threaded |
| #$ -cwd | #$ -cwd | ||
| source activate blast | source activate blast | ||
| - | export BLASTDB=/ | + | export BLASTDB=/ |
| DB=nt | DB=nt | ||
| query=your_query.fasta | query=your_query.fasta | ||
| - | blastn -db $DB -query $query -out yourqueryresults.blout -num_threads | + | blastn -db $DB -query $query -out yourqueryresults.blout -num_threads |
| - | source | + | conda deactivate |
| </ | </ | ||
| - | Both shells | + | Both shells |
| - | **Guide for BLAST usage** | + | <Last updated by Dandan Zhao on Jun 11, 2024> |
| - | - blastp: | ||
| - | - blastn: | ||
| - | - blastx: | ||
| - | - tblastn: | ||
| - | - tblastx: | ||
| - | //Note: blastp and blastx can usually provide better hit alignments than blastn, especially for distantly related species. This is because amino acid sequences are more conserved than nucleotide sequences (Koonin and Galperin, 2002).// | ||
| - | |||
| - | **General bugs** | ||
| - | |||
| - | When blast options are used by mistake (e.g., | ||
| - | |||
| - | < | ||
| - | Error 1: | ||
| - | FASTA-Reader: | ||
| - | FASTA-Reader: | ||
| - | </ | ||
| - | |||
| - | Solve 1: | ||
| - | This is caused by using the wrong blast options. | ||
| - | |||
| - | < | ||
| - | Error 2: | ||
| - | BLAST Database error: No alias or index file found for protein database [XXX.fa] in search path [/ | ||
| - | </ | ||
| - | |||
| - | Solve 2: | ||
| - | This is caused by treating a nucleotide database as a protein database. | ||
| - | |||
| - | **Parsing Blast results** | ||
| - | |||
| - | Use the BLASTP search option to blast the amino acid sequences against the uniprot_db database. | ||
| - | < | ||
| - | > ./blastp -query XXX.fasta -db uniprot_db -out BLASTP_XXX_uniprot.xml -evalue 1e-5 -outfmt 5 | ||
| - | </ | ||
| - | |||
| - | |||
| - | The **BLAST XML file** (-outfmt 5) can include useful information compared with the BLAST tabular file (-outfmt 6), such as the aligned sequence, the sequence of the hit, and the description of the hits in the database. However, the XML format is not human-readable. | ||
| - | |||
| - | Users will need to employ a commonly used parser (// | ||
| - | |||
| - | < | ||
| - | |||
| - | python blastxml_to_tabular.py -c qseqid, | ||
| - | |||
| - | </ | ||
| - | |||
| - | " | ||
| - | |||
| - | < | ||
| - | 1 qseqid | ||
| - | 2 sseqid | ||
| - | 3 pident | ||
| - | 4 length | ||
| - | 5 mismatch | ||
| - | 6 gapopen | ||
| - | 7 qstart | ||
| - | 8 qend End of alignment in query | ||
| - | 9 sstart | ||
| - | 10 send End of alignment in subject (database hit) | ||
| - | 11 evalue | ||
| - | 12 bitscore | ||
| - | 13 sallseqid | ||
| - | 14 score Raw score | ||
| - | 15 nident | ||
| - | 16 positive | ||
| - | 17 gaps Total number of gaps | ||
| - | 18 ppos Percentage of positive-scoring matches | ||
| - | 19 qframe | ||
| - | 20 sframe | ||
| - | 21 qseq Aligned part of query sequence | ||
| - | 22 sseq Aligned part of subject sequence | ||
| - | 23 qlen Query sequence length | ||
| - | 24 slen Subject sequence length | ||
| - | 25 salltitles | ||
| - | |||
| - | $ python blastxml_to_tabular.py -o output.tabular -c std input.xml | ||
| - | $ python blastxml_to_tabular.py -o output.tabular -c ext input.xml | ||
| - | $ python blastxml_to_tabular.py -o output.tabular -c qseqid, | ||
| - | </ | ||
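
The column list in the removed section puts `qseqid` in column 1, `evalue` in column 11 and `bitscore` in column 12 of the standard tabular layout. As a minimal sketch, assuming a tab-separated file with those 12 standard columns in that order, the function below keeps only hits at or below an e-value cutoff and then the highest-bitscore hit per query. The function name `filter_best_hits` and the file names in the usage note are hypothetical and do not come from the wiki page.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BLAST or the wiki scripts).
# Reads BLAST tabular output on stdin, assuming the 12 standard columns:
#   $1 = qseqid, $11 = evalue, $12 = bitscore
# Prints, for each query, the single hit with the highest bitscore among
# the hits whose evalue is <= the cutoff given as the first argument.
filter_best_hits() {
    local cutoff="$1"   # e.g. 1e-10
    awk -F'\t' -v cutoff="$cutoff" '
        ($11 + 0) <= (cutoff + 0) {
            # Remember the best-scoring line seen so far for this query.
            if (!($1 in best) || ($12 + 0) > best[$1]) {
                best[$1] = $12 + 0
                line[$1] = $0
            }
        }
        END { for (q in line) print line[q] }
    ' | sort -k1,1      # awk array order is unspecified, so sort by query id
}
```

A possible call, with placeholder file names: `filter_best_hits 1e-10 < results.tabular > best_hits.tabular`. If the tabular file was produced with a custom column selection, the column numbers in the awk program would need to be adjusted to match.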
blast_and_plast.1630609250.txt.gz · Last modified: by 38.20.199.40
