handy_custom_functions
Differences
This shows you the differences between two versions of the page.
| handy_custom_functions [2021/05/12 14:19] – 168.91.18.151 | handy_custom_functions [2023/07/25 12:08] (current) – 134.190.232.186 | ||
|---|---|---|---|
| Line 11: | Line 11: | ||
| Here I discuss some more custom functions that I find very useful in my daily workflow. To add these functions to your system, simply add them to your '' | Here I discuss some more custom functions that I find very useful in my daily workflow. To add these functions to your system, simply add them to your '' | ||
| - | ===Selecting or removing sequences from a FASTA file=== | + | |
| + | ===Reformatting | ||
| + | |||
| + | For most of my analyses, the header format of NCBI FASTA files is very annoying. This function will convert the annoying format into ''> | ||
| < | < | ||
| - | # fish out a sequence from a fasta file | + | # format NCBI headers to something readable |
| - | function | + | function |
| - | | + | |
| - | | + | |
| - | case "$1" in | + | sed -i -r -e '/ |
| - | -s) seqtk subseq $fasta < | + | sed -i -r -e '/ |
| - | -l) seqtk subseq $fasta <(grep -f " | + | |
| - | | + | |
| - | esac | + | |
| } | } | ||
| </ | </ | ||
| - | This is essentially a wrapper | + | ===Replacing work names with final names for publication=== |
| - | < | + | When I analyse new genomes, transcriptomes, etc., I work with ' | ||
| - | # select any sequence that has < | + | |
| - | # useful if you want to find a single sequence | + | |
| - | $ grabseq -s < | + | |
| - | + | ||
| - | # select a particular set of sequences | + | |
| - | # useful if you want to multiple sequences | + | |
| - | $ grabseq -l < | + | |
| - | </ | + | |
| < | < | ||
| - | # remove a particular entry from a fasta file | + | # replace taxa names in trees, | ||
| - | function | + | function |
| - | | + | |
| - | | + | |
| - | | + | |
| - | -s) seqtk subseq | + | |
| - | -l) seqtk subseq $fasta <( grep ">" | + | |
| - | *) echo -e "rmseq -s < | + | |
| - | | + | |
| } | } | ||
| </ | </ | ||
| - | This function does the opposite of grabseq. It will remove particular sequences from a FASTA file. | + | ===Some other functions=== |
| < | < | ||
| - | # remove a particular sequence that has < | + | # fasta to phylip |
| - | $ rmseq -s < | + | # depends on trimal |
| + | function fa2phy { | ||
| + | trimal -in "$1" -out "${1%.*}.phylip" -phylip | ||
| + | } | ||
| - | # remove a particular set of sequences that have < | + | # reverse complement function |
| - | $ rmseq -l < | + | function revcomp { |
| + | tr " | ||
| + | } | ||
| + | |||
| + | # sum up all numbers | ||
| + | function total { | ||
| + | tr ' | ||
| + | } | ||
| </ | </ | ||
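
The bodies of `revcomp` and `total` are cut off in the diff above, so here is a minimal sketch of how such helpers could look. This is not the author's code. The names `revcomp_sketch` and `total_sketch` are hypothetical, and both functions are assumed to read from standard input: one plain nucleotide sequence per line for the first, whitespace-separated numbers for the second.

```shell
# Hypothetical sketch, not the original wiki functions.

# Reverse complement: reverse each input line, then swap complementary bases.
# Assumes plain sequence lines on stdin (no FASTA headers) and only A/C/G/T/N.
function revcomp_sketch {
    rev | tr 'ACGTNacgtn' 'TGCANtgcan'
}

# Sum all whitespace-separated numbers on stdin; prints 0 for empty input.
function total_sketch {
    awk '{ for (i = 1; i <= NF; i++) sum += $i } END { print sum + 0 }'
}
```

For example, `echo AACG | revcomp_sketch` reverses the line to `GCAA` and then complements it, and `cut -f2 counts.tsv | total_sketch` would add up the second column of a hypothetical `counts.tsv`.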
handy_custom_functions.1620839968.txt.gz · Last modified: by 168.91.18.151
