/db1/
/scratch3/rogerlab_databases/other_dbs/
/scratch4/db/
/scratch5/db/
/db1/blast-may-2024: nt, nr (updated May 2024)
/db1/alphafold3 (updated Mar 2025)
/db1/funannotate (updated Jan 2026)
/db1/extra-data-sets/eggnog_5.0 (updated Sep 2016)
/db1/blast-may-2024/nr.pal (updated May 2024)
/db1/blast-may-2024/refseq_protein.pal (updated May 2024)
/db1/blast-may-2024/nt.nal (updated May 2024)
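The .pal and .nal files above are BLAST+ alias files, and BLAST+ programs take the database name without that extension. A minimal sketch of building a blastp command from one of the listed paths follows; the query and output file names are placeholders, and the command is only assembled, not executed.

```python
from pathlib import Path

# Alias-file extensions used by BLAST+ for protein (.pal) and nucleotide (.nal) databases.
ALIAS_SUFFIXES = {".pal", ".nal"}


def blast_db_name(alias_path: str) -> str:
    """Return the value to pass to -db for a .pal/.nal alias file."""
    path = Path(alias_path)
    if path.suffix in ALIAS_SUFFIXES:
        path = path.with_suffix("")
    return str(path)


def blastp_command(query: str, alias_path: str, out: str, threads: int = 4) -> list[str]:
    """Build (but do not run) a blastp command line with tabular output."""
    return [
        "blastp",
        "-query", query,
        "-db", blast_db_name(alias_path),
        "-out", out,
        "-outfmt", "6",
        "-num_threads", str(threads),
    ]


# proteins.faa and hits.tsv are placeholder file names.
print(" ".join(blastp_command("proteins.faa", "/db1/blast-may-2024/nr.pal", "hits.tsv")))
```

The printed command can be pasted into a job script, or the list can be handed to `subprocess.run` once the BLAST+ module is loaded.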
/scratch3/rogerlab_databases/other_dbs/blast_protein_database/ (updated Apr 2024)
/scratch3/rogerlab_databases/other_dbs/nr_010621 (updated Jun 2021)
/opt/perun/share/extra-data-sets/uniprot/ (updated Jul 25, 2017)
/db1/extra-data-sets/MMETSP/MMETSP_db/ (2018, Kolisko et al., cleaned up with WinstonCleaner; see the README file in the directory for more information)
/opt/perun/share/extra-data-sets/CAM_P_0001000.nt.fa (old marine metagenome database)
/db1/extra-data-sets/nr-fasta.jun2024/nr.fasta (updated Jun 2024)
/scratch3/rogerlab_databases/other_dbs/nt_Jun2024/nt/nt.fasta (updated Jun 2024)
Note: PLAST needs the FASTA file to run and cannot use the new v5 format of the NCBI nr and nt databases.
/scratch5/db/Eukfinder/Diamond/nr.dmnd (updated Nov 2025)
/scratch5/db/Eukfinder/nt_Jun2024/nr/nr.dmnd (updated Dec 2024)
/scratch3/rogerlab_databases/other_dbs/nr_March252023/diamond/nr.dmnd (updated Mar 2023)
/scratch3/rogerlab_databases/other_dbs/nr_032121/nr.dmnd (updated Mar 2021)
/scratch3/rogerlab_databases/other_dbs/nr_06162020/nr.dmnd (updated Jun 2020)
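As a rough sketch, a DIAMOND protein search against one of the .dmnd files above could be assembled as follows. The default database is simply the most recently updated path in the list; the query and output names are placeholders, and the thread count is an arbitrary assumption.

```python
# Most recently updated DIAMOND nr database listed above.
DEFAULT_DIAMOND_DB = "/scratch5/db/Eukfinder/Diamond/nr.dmnd"


def diamond_blastp_command(query: str, out: str,
                           db: str = DEFAULT_DIAMOND_DB,
                           threads: int = 8) -> list[str]:
    """Build (but do not run) a 'diamond blastp' command with tabular output."""
    return [
        "diamond", "blastp",
        "--query", query,
        "--db", db,
        "--out", out,
        "--outfmt", "6",
        "--threads", str(threads),
    ]


# proteins.faa and diamond_hits.tsv are placeholder file names.
print(" ".join(diamond_blastp_command("proteins.faa", "diamond_hits.tsv")))
```

Passing a different `db` value selects one of the older snapshots, which may be useful when reproducing an earlier analysis.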
/scratch2/software/centrifuge-1.0.3/ (updated 2019)
To use: the base index is “nt”, which corresponds to the large NCBI nucleotide database in Centrifuge format.
Centrifuge Database for Eukfinder:
/scratch3/Eukfinder/DB/Centrifuge_DB/
The base index is Centrifuge_NewDB_Sept2020
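Centrifuge takes an index prefix (directory plus base index name) through its -x option rather than a directory alone. The sketch below joins the two pieces given above, assuming the index files sit directly inside the listed directory; the read, output, and report file names are placeholders.

```python
import os


def centrifuge_command(index_dir: str, base_index: str, reads: str,
                       out: str, report: str, threads: int = 8) -> list[str]:
    """Build (but do not run) a centrifuge command for unpaired reads."""
    # Assumption: the index files live directly in index_dir.
    prefix = os.path.join(index_dir, base_index)
    return [
        "centrifuge",
        "-x", prefix,
        "-U", reads,
        "-S", out,
        "--report-file", report,
        "-p", str(threads),
    ]


# reads.fq, classified.tsv and report.tsv are placeholder file names.
cmd = centrifuge_command(
    "/scratch3/Eukfinder/DB/Centrifuge_DB/",
    "Centrifuge_NewDB_Sept2020",
    "reads.fq", "classified.tsv", "report.tsv",
)
print(" ".join(cmd))
```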
/scratch5/db/checkm2/CheckM2_database/ (updated Mar 2021)
Several EggNOG databases are available at:
/opt/perun/share/extra-data-sets/eggnog/
/scratch4/db/eggnog-mapper-2.1.4/
/db1/extra-data-sets/eggnog_5.0/ (updated Dec 2024)
PRESSED databases
Full archaea hmm profiles : archaea_DB.hmmer
Full bacteria hmm profiles : BACT_DB.hmmer
Full eukaryotes hmm profiles : EUK_DB.hmmer
Virus and virus-like:
Picornavirales.hmmer, Retrotranscribing.hmmer, Retroviridae.hmmer, ssDNA.hmmer, ssRNA.hmmer,
ssRNA_negative.hmmer, ssRNA_positive.hmmer, Tymovirales.hmmer, Viruses.hmmer, Nidovirales.hmmer
If you need all domains together (bacteria, archaea and eukarya):
/opt/perun/share/extra-data-sets/eggnog/fulleggnogdb/fullEggNOGDB.hmmer
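"Pressed" here means the profile database has been processed with hmmpress, which writes four binary index files (.h3f, .h3i, .h3m, .h3p) next to it; hmmscan needs those files. The following sketch checks for them and assembles an hmmscan command. The query and table file names are placeholders.

```python
from pathlib import Path

# Index files that hmmpress writes alongside an HMM database.
PRESSED_SUFFIXES = (".h3f", ".h3i", ".h3m", ".h3p")

FULL_EGGNOG_DB = "/opt/perun/share/extra-data-sets/eggnog/fulleggnogdb/fullEggNOGDB.hmmer"


def is_pressed(hmm_db: str) -> bool:
    """Return True if all four hmmpress index files exist for hmm_db."""
    return all(Path(hmm_db + suffix).is_file() for suffix in PRESSED_SUFFIXES)


def hmmscan_command(hmm_db: str, query: str, tblout: str, cpus: int = 4) -> list[str]:
    """Build (but do not run) an hmmscan command with a per-target table."""
    return ["hmmscan", "--cpu", str(cpus), "--tblout", tblout, hmm_db, query]


# proteins.faa and eggnog_hits.tbl are placeholder file names.
print(" ".join(hmmscan_command(FULL_EGGNOG_DB, "proteins.faa", "eggnog_hits.tbl")))
```

Calling `is_pressed` before submitting a job gives an early, readable failure instead of an hmmscan error partway through a pipeline.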
NOT PRESSED (individual profiles):
hmm for bacteria at /opt/perun/share/extra-data-sets/eggnog/bactNOG_hmm
hmm for archaea at /opt/perun/share/extra-data-sets/eggnog/arNOG_hmm
hmm for eukaryotes at /opt/perun/share/extra-data-sets/eggnog/euNOG_hmm
hmm for bacteria-archaea-eukaryotes at /opt/perun/share/extra-data-sets/eggnog/NOG_hmm
EggNOG ANNOTATIONS are within each of the directories for individual profiles (not pressed), except for NOG (NOG.annotations.tsv), which is in:
/opt/perun/share/extra-data-sets/eggnog
PANTHER 17 Classification:
/scratch3/rogerlab_databases/other_dbs/PANTHER17.0/
PANTHER fasta files by family:
/scratch3/rogerlab_databases/other_dbs/PANTHER17.0/books (updated Jan 2022)
/scratch3/rogerlab_databases/other_dbs/EukProt_V3/proteins/ (updated Mar 2022)
/scratch4/db/EukProtv3/ (updated Aug 2022)
/scratch5/db/foldseek
/scratch5/db/foldseek-gpu/
/scratch3/rogerlab_databases/other_dbs/kraken2/hash.k2d (updated Nov 2023)
/scratch4/db/kraken2/ (16S, EUK_SSU)
/scratch4/db/Kraken2PlusPFP/Kraken2_Standard_Jun2024/hash.k2d (Standard, updated Jun 2024)
/scratch4/db/Kraken2PlusPFP/Kraken2PlusPFP_Jun2024/hash.k2d (PlusPFP, updated Jun 2024)
/scratch4/db/Kraken2PlusPFP/hash.k2d (PlusPFP, updated July 2025)
The most up-to-date Kraken2 databases can be downloaded from:
https://benlangmead.github.io/aws-indexes/k2
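Several Kraken2 entries above point at the hash.k2d file itself, whereas the kraken2 --db option expects the directory that contains it. A small sketch that accepts either form is shown below; the read, output, and report file names are placeholders.

```python
from pathlib import Path


def kraken2_db_dir(db_path: str) -> str:
    """Return the database directory, given either the directory or its hash.k2d."""
    path = Path(db_path)
    return str(path.parent if path.name == "hash.k2d" else path)


def kraken2_command(db_path: str, reads: str, output: str,
                    report: str, threads: int = 8) -> list[str]:
    """Build (but do not run) a kraken2 command for a single read file."""
    return [
        "kraken2",
        "--db", kraken2_db_dir(db_path),
        "--threads", str(threads),
        "--output", output,
        "--report", report,
        reads,
    ]


# reads.fq, kraken.out and kraken.report are placeholder file names.
print(" ".join(kraken2_command(
    "/scratch4/db/Kraken2PlusPFP/hash.k2d", "reads.fq", "kraken.out", "kraken.report")))
```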
/db1/extra-data-sets/Acc2tax/
/db1/extra-data-sets/Acc2tax/Acc2Tax_04_01_2024
/scratch3/rogerlab_databases/other_dbs/Acc2Tax_March252023
/scratch5/db/Eukfinder/Acc2tax (Nov 2025)
gtdbtk-2.0.0: /scratch4/db/gtdbtk-2.0.0/ (Mar 2022)
gtdbtk-1.5.0: /scratch4/db/gtdbtk-1.5.0/
/scratch5/db/gtdbtk/ (Apr 2025)
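GTDB-Tk locates its reference data through the GTDBTK_DATA_PATH environment variable. The sketch below prepares an environment pointing at one of the directories above; it assumes the reference data sits directly under the listed directory, which should be checked before use.

```python
import os


def gtdbtk_env(data_path: str, base_env=None) -> dict:
    """Return a copy of the environment with GTDBTK_DATA_PATH set."""
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: the reference data is directly under data_path.
    env["GTDBTK_DATA_PATH"] = data_path
    return env


env = gtdbtk_env("/scratch5/db/gtdbtk/")
print(env["GTDBTK_DATA_PATH"])
# The environment can then be passed to a subprocess, for example:
#   subprocess.run(["gtdbtk", "check_install"], env=env, check=True)
```

Building a separate environment per run, rather than exporting the variable globally, makes it harder to mix reference data from one release with a different GTDB-Tk version.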
/scratch3/rogerlab_databases/other_dbs/Pfam_Feb2025
Updated Database Locations for Eukfinder (v1.2.4)
Centrifuge: /scratch3/Eukfinder/DB/Centrifuge_DB/ (Sept 2020, gut environment focused)
/scratch3/Eukfinder/DB/Centrifuge_DB_2024/ (Nov 2024, marine environment focused)
/scratch5/db/Eukfinder/Centrifuge/ABV/ (Apr 2025, Bacteria/Archaea/Virus refseq)
PLAST: /scratch3/Eukfinder/DB/PLAST_DB/ (Sept 2020, gut environment focused)
/misc/scratch3/Eukfinder/DB/PLAST_DB_2024/ (Nov 2024, marine environment focused)
<Last updated by Dandan Zhao on Jan 20, 2026>