bioinformatics_tools
Differences
This shows you the differences between two versions of the page.
| bioinformatics_tools [2022/05/01 20:12] – 134.190.232.106 | bioinformatics_tools [2022/08/21 10:35] (current) – 173.212.112.187 | ||
|---|---|---|---|
| Line 10: | Line 10: | ||
| - HSDFinder: a BLAST-based strategy for identifying highly similar duplicated genes in eukaryotic genomes (2021) | - HSDFinder: a BLAST-based strategy for identifying highly similar duplicated genes in eukaryotic genomes (2021) | ||
| - HSDatabase – Identification and functional annotation of highly similar duplicated genes in eukaryotic genomes (2022) | - HSDatabase – Identification and functional annotation of highly similar duplicated genes in eukaryotic genomes (2022) | ||
| - | - Comprehensive analysis | + | - An overview |
| + | - HSDicipher: a downstream analysis package for HSDFinder and HSDatabase (2023) | ||
| So far, the first step, designing the HSDFinder tool, has been completed after many trials, and the selected eukaryotic species have been collected into HSDatabase. The comprehensive analysis is under way. | So far, the first step, designing the HSDFinder tool, has been completed after many trials, and the selected eukaryotic species have been collected into HSDatabase. The comprehensive analysis is under way. | ||
| Line 33: | Line 34: | ||
| - | **How to analyze the data from HSDFinder? | + | **How to analyze the data from HSDFinder? |
| Although there is no golden rule for distinguishing partial duplicates from more complete ones, relatively complete duplicates are generally expected to differ by less than 50% in amino acid length and to have the same number and function of conserved domains. | Although there is no golden rule for distinguishing partial duplicates from more complete ones, relatively complete duplicates are generally expected to differ by less than 50% in amino acid length and to have the same number and function of conserved domains. | ||
| Line 45: | Line 46: | ||
| * HSD_Categories.py counts the gene copies within each HSD group; e.g., a 2-group is an HSD group with only two gene copies. | * HSD_Categories.py counts the gene copies within each HSD group; e.g., a 2-group is an HSD group with only two gene copies. | ||
| * HSD_add_on.py merges a series of combined thresholds according to the formula: E + (D + (C + (B + A))) | * HSD_add_on.py merges a series of combined thresholds according to the formula: E + (D + (C + (B + A))) | ||
| - | |||
| * A = 90%_100aa+(90%_70aa+(90%_50aa+(90%_30aa+90%_10aa))) | * A = 90%_100aa+(90%_70aa+(90%_50aa+(90%_30aa+90%_10aa))) | ||
| * B = 80%_100aa+(80%_70aa+(80%_50aa+(80%_30aa+80%_10aa))) | * B = 80%_100aa+(80%_70aa+(80%_50aa+(80%_30aa+80%_10aa))) | ||
| Line 52: | Line 52: | ||
| * E = 50%_100aa+(50%_70aa+(50%_50aa+(50%_30aa+50%_10aa))) | * E = 50%_100aa+(50%_70aa+(50%_50aa+(50%_30aa+50%_10aa))) | ||
| - | <Last updated by Xi Zhang on Oct 6th, | + | <Last updated by Xi Zhang on Oct 6th, |
| + | <Last updated by Xi Zhang on May 1st, | ||
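
The nested formula E + (D + (C + (B + A))) and the group-size categories described in the diff can be sketched as below. This is only an illustration, not the code of HSD_add_on.py or HSD_Categories.py: the function names `add_on`, `merge_nested`, and `count_categories`, the representation of an HSD group as a `frozenset` of gene IDs, and above all the merge rule (a group from a looser threshold is added only if none of its genes is already in a kept group) are assumptions made for the example. The actual scripts may resolve overlaps differently.

```python
from collections import Counter
from functools import reduce


def add_on(base, extra):
    """Return the groups in `base` plus the non-overlapping groups in `extra`.

    ASSUMED rule, for illustration only: a group from `extra` is kept
    only if it shares no gene with any group already kept.
    """
    merged = list(base)
    seen = set().union(*base)  # every gene already assigned to a group
    for group in extra:
        if seen.isdisjoint(group):
            merged.append(group)
            seen |= group
    return merged


def merge_nested(result_sets):
    """Fold result sets from strictest to loosest.

    For [A, B, C, D, E] this evaluates E + (D + (C + (B + A))),
    starting from the innermost term (B + A).
    """
    return reduce(add_on, result_sets, [])


def count_categories(groups):
    """Map group size to number of groups, e.g. {2: 5} means five 2-groups."""
    return dict(Counter(len(group) for group in groups))


# Hypothetical gene IDs; A stands for the stricter threshold, B for the looser.
A = [frozenset({"g1", "g2"})]
B = [frozenset({"g1", "g2", "g3"}), frozenset({"g4", "g5"})]

merged = merge_nested([A, B])
categories = count_categories(merged)
```

Under the assumed rule, the three-gene group from B is skipped because it overlaps the group already taken from A, so `merged` holds two 2-groups. The same fold applies inside each of A to E, for example ordering the 90% results as 10aa, 30aa, 50aa, 70aa, 100aa.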
bioinformatics_tools.1651446746.txt.gz · Last modified: by 134.190.232.106
