bioinformatics_tools2
Differences
This shows you the differences between two versions of the page.
| bioinformatics_tools2 [2021/10/08 16:53] – 134.190.232.9 | bioinformatics_tools2 [2022/02/28 11:53] (current) – 134.190.232.106 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| On Perun, jobs are usually submitted with: qsub -q 768G-batch script.sh or qsub -q 256-batch script.sh. But what if you have thousands of scripts waiting to run? Submitting thousands of shell scripts by hand is impractical. This page introduces two approaches to submitting batch scripts/ | On Perun, jobs are usually submitted with: qsub -q 768G-batch script.sh or qsub -q 256-batch script.sh. But what if you have thousands of scripts waiting to run? Submitting thousands of shell scripts by hand is impractical. This page introduces two approaches to submitting batch scripts/ | ||
| - | **Approach One: submit array jobs** | + | |
| + | **Approach One: submit for loop shell script** | ||
| + | |||
| + | < | ||
| + | #script: shell.sh | ||
| + | |||
| + | # | ||
| + | #$ -S /bin/bash | ||
| + | . / | ||
| + | #$ -cwd | ||
| + | #$ -o logfile | ||
| + | #$ -pe threaded 20 | ||
| + | #export PATH=/ | ||
| + | |||
| + | while read line | ||
| + | do | ||
| + | |||
| + | mafft --auto --thread 20 / | ||
| + | |||
| + | / | ||
| + | |||
| + | FastTree / | ||
| + | |||
| + | done < "$1" | ||
| + | </ | ||
| + | |||
| + | This script needs a list of sequence names that contains only the IDs. Run the script like this: | ||
| + | |||
| + | Note: $line.ko.txt VS $line_ko.txt, | ||
| + | |||
| + | < | ||
| + | #A plain name_list file holding only the sequence IDs of your FASTA, e.g. | ||
| + | Gene1 | ||
| + | Gene2 | ||
| + | Gene3 | ||
| + | |||
| + | #This can be easily obtained via: | ||
| + | grep '>' | ||
| + | |||
| + | # If your FASTA seq includes gene descriptions e.g., directly retrieved from NCBI | ||
| + | > gen1 hypothetical protein balabalala | ||
| + | TAGTTAGTCGATCGTACGTA | ||
| + | |||
| + | Simply run: awk ' | ||
| + | |||
| + | #Then run the shell script. | ||
| + | chmod +x shell.sh | ||
| + | ./shell.sh name_list.txt | ||
| + | </ | ||
| + | |||
| + | # The list.txt file must end with a line break; otherwise the last line will not be processed. | ||
| + | |||
| + | |||
| + | **Approach Two: submit array jobs** | ||
| Below is a real case: BLASTing thousands of genes against the NCBI nr database. However, it could take weeks to run if we BLAST the whole gene set against the nr database directly. | Below is a real case: BLASTing thousands of genes against the NCBI nr database. However, it could take weeks to run if we BLAST the whole gene set against the nr database directly. | ||
| Line 43: | Line 96: | ||
| If you are familiar with ${SGE_TASK_ID}, | If you are familiar with ${SGE_TASK_ID}, | ||
| - | - Method one: using ' | + | * Method one: using ' |
| < | < | ||
| Line 51: | Line 104: | ||
| # So in this case: -query / | # So in this case: -query / | ||
| # will be renamed to | # will be renamed to | ||
| - | # -query / | + | # -query / |
| # Technically, | # Technically, | ||
| </ | </ | ||
| Line 88: | Line 141: | ||
| </ | </ | ||
| - | | + | * Method two: Run shell script split.sh |
| < | < | ||
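
Two pitfalls mentioned under Approach One (the `$line.ko.txt` versus `$line_ko.txt` note and the missing final line break in the list file) can be illustrated with a minimal sketch. The temporary list file and the `_ko.txt` suffix are assumptions chosen only for the demonstration; this is not the page author's script. In shell, `$line_ko.txt` is read as a variable named `line_ko` followed by `.txt`, whereas a dot cannot be part of a variable name, so `$line.ko.txt` works as intended. Writing `${line}_ko.txt` removes the ambiguity.

```shell
#!/bin/bash
# Hypothetical demo: the list file and the _ko.txt suffix are illustrative only.

# Write a name list whose last line has NO trailing newline.
list=$(mktemp)
printf 'Gene1\nGene2\nGene3' > "$list"

processed=""
# `|| [ -n "$line" ]` keeps the final line even when the file lacks
# a trailing newline, which a plain `while read line` would skip.
while IFS= read -r line || [ -n "$line" ]; do
    # Braces end the variable name before the underscore.
    # $line_ko.txt would instead expand a variable called line_ko.
    processed="${processed:+$processed }${line}_ko.txt"
done < "$list"

echo "$processed"
rm -f "$list"
```

With this loop form the list file no longer has to end with an empty line, although keeping the trailing line break remains harmless.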
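
For Approach Two, the general idea of letting each array task choose its own input through `${SGE_TASK_ID}` can be sketched as follows. This is a variant that selects one line of a name list per task, not the file-renaming method described on the page; the `-t 1-3` range, the temporary list file, the helper `pick_line`, and the fallback task number are all assumptions made for illustration.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -t 1-3
# Assumption: one array task per line of a hypothetical name list.

# Create a small demo list so the sketch also runs outside the scheduler.
list=$(mktemp)
printf 'Gene1\nGene2\nGene3\n' > "$list"

# Hypothetical helper: print line number $1 of file $2.
pick_line() {
    sed -n "${1}p" "$2"
}

# The scheduler sets SGE_TASK_ID for each array task;
# fall back to 2 for a local dry run.
task_id="${SGE_TASK_ID:-2}"
gene=$(pick_line "$task_id" "$list")

echo "task ${task_id} would process ${gene}"
```

Submitted once with qsub, a script of this shape would be started once per task ID in the `-t` range, so a single submission replaces thousands of manual ones.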
