What is HSDatabase?

Background:

The idea for the HSDatabase project comes from several milestone articles in psychrophilic algae research. The aim of the database is to provide a platform for comprehensive analysis of eukaryotic species, particularly those that can survive in harsh environments, in order to unravel the role of gene duplication in genome adaptation.

  1. Draft genome sequence of the Antarctic green alga Chlamydomonas sp. UWO241 (2021)
  2. Adaptation to Extreme Antarctic Environments Revealed by the Genome of a Sea Ice Green Alga (2020)
  3. A constitutive stress response is a result of low temperature growth in the Antarctic green alga Chlamydomonas sp. UWO241 (2021)
  4. Photosynthetic Adaptation to Polar Life: Energy Balance, Photoprotection and Genetic Redundancy (2021)
  5. HSDFinder: a BLAST-based strategy for identifying highly similar duplicated genes in eukaryotic genomes (2021)
  6. HSDatabase – Identification and functional annotation of highly similar duplicated genes in eukaryotic genomes (2022)
  7. An overview of online resources for gene duplication detection within species: Mini review (2022)
  8. HSDicipher: A downstream analysis package of HSDFinder and HSDatabase (2023)

So far, the first step, designing the HSDFinder tool, has been completed after many trials, and the selected eukaryotic species have been collected into HSDatabase. The comprehensive analysis is under way.

How to prepare the species files?

HSDatabase offers a species request option that allows users to submit species they are interested in.

How to document data in HSDatabase?

Several files are required to document a species in the database.

Request a new species

If you would like a new species to be added to HSDatabase, please use the following form. The new species has to meet the following requirement:

HSDatabase is based on data provided by the NCBI FTP site. If your species is stored on the FTP site, it would be very helpful to provide us with the FTP links to the peptide database. At a minimum, a link to the species information is required.

How to analyze the data from HSDFinder?

HSDicipher: https://github.com/zx0223winner/HSDicipher

Although there is no golden rule for distinguishing partial duplicates from more complete ones, relatively complete duplicates are believed to differ in amino acid length by less than 50% and to share the same number and function of conserved domains.
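
The rule of thumb above can be sketched as a small check. This is only an illustrative sketch, not part of HSDFinder or HSDicipher: the GeneCopy record, the function names, and the choice to measure the length difference relative to the longer copy are all assumptions made for this example, and matching domain identifiers is used here as a stand-in for "same number and function" of conserved domains.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GeneCopy:
    """Hypothetical record for one gene copy (not an HSDFinder data type)."""
    gene_id: str
    aa_length: int            # protein length in amino acids
    domains: Tuple[str, ...]  # conserved-domain identifiers, e.g. Pfam accessions


def length_difference(a: GeneCopy, b: GeneCopy) -> float:
    """Length difference as a fraction of the longer copy (assumed normalization)."""
    longer = max(a.aa_length, b.aa_length)
    if longer <= 0:
        raise ValueError("amino acid lengths must be positive")
    return abs(a.aa_length - b.aa_length) / longer


def is_relatively_complete(a: GeneCopy, b: GeneCopy,
                           max_length_diff: float = 0.5) -> bool:
    """Apply the two criteria: <50% length difference and matching domains."""
    # Comparing sorted identifiers checks both the number of domains and
    # their identity, regardless of the order in which they were listed.
    same_domains = sorted(a.domains) == sorted(b.domains)
    return length_difference(a, b) < max_length_diff and same_domains


if __name__ == "__main__":
    copy_a = GeneCopy("geneA", 400, ("PF00069",))
    copy_b = GeneCopy("geneB", 350, ("PF00069",))
    copy_c = GeneCopy("geneC", 150, ("PF00069",))
    print(is_relatively_complete(copy_a, copy_b))  # True: 12.5% difference, same domain
    print(is_relatively_complete(copy_a, copy_c))  # False: 62.5% difference
```

The 0.5 default mirrors the 50% figure in the text; since there is no golden rule, the threshold is left as a parameter so that it can be tightened or relaxed for a given analysis.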

<Last updated by Xi Zhang on Oct 6th, 2021>
<Last updated by Xi Zhang on May 1st, 2022>