blast_protocol
Differences
This shows you the differences between two versions of the page.
| blast_protocol [2021/09/02 16:19] – 38.20.199.40 | blast_protocol [2022/09/06 14:49] (current) – 134.190.232.106 | ||
|---|---|---|---|
| Line 7: | Line 7: | ||
| - __tblastx__: | - __tblastx__: | ||
| - | //Note: blastp | + | {{: |
| + | |||
| + | //**blastp** can usually provide better hit alignments than blastn, especially for distantly related species. This is partially | ||
| + | |||
| + | // | ||
| + | |||
| + | // | ||
| + | Courtesy of the web source: https:// | ||
| **General bugs** | **General bugs** | ||
| Line 34: | Line 41: | ||
| Use the BLASTP search option to search the amino acid sequences against the uniprot_db database. | Use the BLASTP search option to search the amino acid sequences against the uniprot_db database. | ||
| < | < | ||
| - | > ./blastp -query XXX.fasta -db uniprot_db -out BLASTP_XXX_uniprot.xml -evalue 1e-5 -outfmt 5 | + | ./blastp -query XXX.fasta -db uniprot_db -out BLASTP_XXX_uniprot.xml -evalue 1e-5 -outfmt 5 |
| </ | </ | ||
| Line 77: | Line 84: | ||
| 25 salltitles | 25 salltitles | ||
| - | | + | |
| - | | + | |
| - | | + | |
| </ | </ | ||
| - | **V5 database** | + | #This is another way to get easily parsed BLAST output, using -outfmt '6 qseqid sseqid ...' | ||
| + | |||
| + | < | ||
| + | # | ||
| + | #$ -S /bin/bash | ||
| + | . / | ||
| + | #$ -pe threaded 2 | ||
| + | #$ -cwd | ||
| + | source activate blast | ||
| + | export BLASTDB=/ | ||
| + | DB=nr | ||
| + | query=ATCG00670.1.fasta | ||
| + | blastp -db $DB -query $query -out / | ||
| + | source deactivate | ||
| + | </ | ||
| + | |||
| + | Sep 6th, 2022: Since DIAMOND is faster than BLAST+ for blastp and blastx searches, this is an alternative that uses DIAMOND. | ||
| + | |||
| + | < | ||
| + | # | ||
| + | #$ -S /bin/bash | ||
| + | . / | ||
| + | #$ -pe threaded 40 | ||
| + | #$ -cwd | ||
| + | source activate / | ||
| + | #DB=nr | ||
| + | while read line | ||
| + | do | ||
| + | |||
| + | diamond blastp -p 40 -k 5 -e 1e-10 -f 6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore stitle salltitles --header -d / | ||
| + | |||
| + | done <$1 | ||
| + | |||
| + | conda deactivate | ||
| + | |||
| + | </ | ||
| + | |||
| + | |||
| + | **V5 NCBI database** | ||
| + | |||
| + | The latest blast+ package can be found via https:// | ||
| + | {{: | ||
| + | |||
| + | The V5 NCBI database can be found via https:// | ||
| + | |||
| + | In order to limit your BLAST+ search by taxonomy, you’ll need to obtain the taxid(s) for your organism(s). Two options can be used here: " | ||
| + | |||
| + | To look up the taxid for an organism of interest, e.g.: | ||
| + | < | ||
| + | ./ | ||
| + | </ | ||
| + | The get_species_taxids.sh script ships with the BLAST+ package, in its bin directory. | ||
| + | The taxid for Bacteria is 2. Next, acquire the list of species-level taxonomy IDs under Bacteria. | ||
| + | |||
| + | < | ||
| + | ./ | ||
| + | </ | ||
| + | |||
| + | Using 2.txids to limit the NCBI v5 database search scope is far more efficient. | ||
| + | |||
| + | < | ||
| + | ./blastp -db nr -query QUERY -taxidlist 2.txids -outfmt 5 -out OUTPUT.tab | ||
| + | ./blastp -db nr -query QUERY -taxids 1117, | ||
| + | </ | ||
| + | |||
| + | If you use " | ||
| + | |||
| + | |||
| + | Note: Please refer to the guide for the most up-to-date information. https:// | ||
| + | |||
| + | {{: | ||
| + | <Last updated by Xi Zhang on Sep 3rd, | ||
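
The taxonomy-limited searches in this revision pass taxids either as a file with one ID per line (-taxidlist) or as a comma-separated list (-taxids). A minimal sketch of converting between the two is shown here. It assumes a hypothetical helper named taxids_to_csv, which is not part of BLAST+, and it only checks that each line is a plain integer, not that the ID exists in the NCBI taxonomy.

```shell
#!/bin/bash
# Hypothetical helper (not shipped with BLAST+): validate a taxid file
# (one numeric ID per line) and print the IDs as a comma-separated string.
taxids_to_csv() {
    local file=$1
    # Reject the file if any non-blank line is not a plain integer.
    if grep -v '^[[:space:]]*$' "$file" | grep -qv '^[0-9][0-9]*$'; then
        echo "error: non-numeric taxid in $file" >&2
        return 1
    fi
    # Drop blank lines, sort numerically, remove duplicates, join with commas.
    grep -v '^[[:space:]]*$' "$file" | sort -n -u | paste -sd, -
}
```

For a short list the result can be passed directly, e.g. `-taxids "$(taxids_to_csv my.txids)"`, where my.txids is a placeholder file name. For a large list such as 2.txids, passing the file itself with -taxidlist is likely the better choice.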
blast_protocol.1630610391.txt.gz · Last modified: by 38.20.199.40
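
The tabular output requested above (format 6 with qseqid first and bitscore twelfth) can be reduced to one best hit per query with a few lines of awk. This is a sketch under stated assumptions: best_hits is a hypothetical helper name, the input is tab-separated in the column order used in the DIAMOND command, and any header lines start with "#".

```shell
#!/bin/bash
# Hypothetical helper: print the highest-bitscore hit for each query from a
# tab-separated file whose 1st column is qseqid and 12th column is bitscore.
best_hits() {
    awk -F'\t' '
        /^#/ || NF < 12 { next }            # skip header lines and short rows
        !($1 in best) || $12 + 0 > best[$1] + 0 {
            if (!($1 in best)) order[++n] = $1   # remember first-seen order
            best[$1] = $12                       # best bitscore so far
            line[$1] = $0                        # full row for that hit
        }
        END { for (i = 1; i <= n; i++) print line[order[i]] }
    ' "$1"
}
```

Calling `best_hits results.tsv > best.tsv` (both file names are placeholders) keeps queries in the order they first appear. On ties the earlier row wins, because a later row replaces the stored one only when its bitscore is strictly higher.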
