handy_custom_functions
Differences
This shows you the differences between two versions of the page.
| handy_custom_functions [2021/05/12 14:20] – 168.91.18.151 | handy_custom_functions [2023/07/25 12:08] (current) – 134.190.232.186 | ||
|---|---|---|---|
| Line 11: | Line 11: | ||
| Here I will discuss some more custom functions that I find very useful in my daily workflow. To add these functions to your system, simply add them to your '' | Here I will discuss some more custom functions that I find very useful in my daily workflow. To add these functions to your system, simply add them to your '' | ||
| - | ===Selecting or removing sequences from a FASTA file=== | + | |
| + | ===Reformatting | ||
| + | |||
| + | For most of my analyses, the header format of NCBI FASTA files is very annoying. This function will convert the annoying format into ''> | ||
| < | < | ||
| - | # fish out a sequence from a fasta file | + | # format NCBI headers to something readable |
| - | function | + | function |
| - | | + | |
| - | | + | |
| - | case "$1" in | + | sed -i -r -e '/ |
| - | -s) seqtk subseq $fasta < | + | sed -i -r -e '/ |
| - | -l) seqtk subseq $fasta <(grep -f " | + | |
| - | | + | |
| - | esac | + | |
| } | } | ||
| </ | </ | ||
| - | This is essentially a wrapper | + | ===Replacing work names with final names for publication=== |
| - | < | + | When I do my analyses with new genomes, transcriptomes, etc., I work with ' | ||
| - | # select any sequence that has < | + | |
| - | # useful if you want to find a single sequence | + | |
| - | $ grabseq -s < | + | |
| - | # select a particular set of sequences that have <pattern> in their header and are in the < | + | <code> |
| - | # useful if you want to select multiple sequences | + | # replace taxon names in trees, fasta, etc |
| - | $ grabseq | + | function replace_names { |
| + | input=$1 | ||
| + | mappingfile=$2 | ||
| + | cp "$input" "$input.nms" | ||
| + | cat "$mappingfile" | while read -r SEARCH REPLACE; do | ||
| + | sed -i -r " | ||
| + | done | ||
| + | } | ||
| </ | </ | ||
| - | The next function does the opposite of grabseq. It will remove particular sequences from a FASTA file. | + | ===Some other functions=== |
| < | < | ||
| - | # remove a particular entry from a fasta file | + | # fasta to phylip |
| - | function | + | # depends on trimal |
| - | | + | function |
| - | to_rmv=$2 | + | |
| - | case "$1" in | + | |
| - | | + | |
| - | -l) seqtk subseq $fasta <( grep ">" | + | |
| - | | + | |
| - | esac | + | |
| } | } | ||
| - | </ | ||
| - | < | + | # reverse complement function |
| - | # remove a particular sequence that has < | + | function revcomp { |
| - | $ rmseq -s < | + | tr " |
| + | } | ||
| - | # remove a particular set of sequences that have < | + | # sum up all numbers |
| - | $ rmseq -l < | + | function total { |
| + | tr ' | ||
| + | } | ||
| </ | </ | ||
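
The ''sed'' expression inside ''replace_names'' is cut off in the diff above, so here is a minimal sketch of how the work-name to final-name replacement could look. It is not the function from this page: the name ''replace_names_sketch'', the two-column whitespace-separated mapping file, and the plain ''s/…/…/g'' substitution are all assumptions made for illustration. It also assumes GNU sed (for ''-i'') and names that contain no characters special to sed, such as ''/'' or ''&''.

```shell
# Hypothetical sketch, not the original replace_names from this page.
# Assumption: the mapping file has two whitespace-separated columns:
#   work_name final_name
# Assumption: names contain no characters special to sed (/, &, ., ...).
replace_names_sketch() {
    local input=$1
    local mappingfile=$2
    # work on a copy so the original file stays untouched
    cp "$input" "$input.nms"
    while read -r search replace; do
        # skip blank lines in the mapping file
        [ -z "$search" ] && continue
        # GNU sed: replace every occurrence of the work name in place
        sed -i -e "s/$search/$replace/g" "$input.nms"
    done < "$mappingfile"
}

# Example usage (file names are placeholders):
#   replace_names_sketch alignment.fasta names.tsv
#   less alignment.fasta.nms
```

Note that a plain substitution like this also matches substrings, so a work name such as ''sp1'' would also hit ''sp10''. If your work names overlap, sort the mapping file so that longer names come first, or anchor the pattern on the delimiters used in your headers.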
handy_custom_functions.1620840036.txt.gz · Last modified: by 168.91.18.151
