gene_prediction_just_augustus


By Jason Shao & Joran Martijn (Last Edited: October 25th 2024)

Intro

Augustus is an ab initio gene predictor that uses Hidden Markov Models (HMMs) pre-trained on existing datasets. Training a custom HMM on your own data is possible, but this basic tutorial only covers the pre-existing models.

Example Usage

source activate augustus-3.5.0

export AUGUSTUS_CONFIG_PATH="/misc/scratch3/jasons/protist_gene_prediction/software/custom_augustus_config"

augustus \
    --species=generic \
    <your genome> \
    --gff3=on \
    --outfile=<outfile name>.gff3

conda deactivate

Note that the species is set to generic here to minimize bias when annotating a divergent organism. If your organism is closely related to one of the pre-trained species listed below, pass that species' identifier to --species instead, which should yield a better prediction.
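
As a rough illustration of swapping the species, the sketch below only prints the command that would be run, so it can be checked before launching a real job. The helper name build_augustus_cmd and the file names are hypothetical; the options are the same ones used in the example above.

```shell
# Hypothetical dry-run helper: prints the augustus command instead of running it.
# Arguments: species identifier, genome FASTA, output GFF3 file.
build_augustus_cmd() {
    species="$1"
    genome="$2"
    outfile="$3"
    printf 'augustus --species=%s %s --gff3=on --outfile=%s\n' \
        "$species" "$genome" "$outfile"
}

# Example: a ciliate genome annotated with the tetrahymena model
# (file names are placeholders).
build_augustus_cmd tetrahymena my_ciliate.fasta my_ciliate.gff3
```

Once the printed command looks right, it can be run inside the activated augustus environment.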

If you use a pre-trained species, you can omit the export line. It is only needed because Augustus stopped shipping the probability files for the generic species around version 3.3.3, so the example points AUGUSTUS_CONFIG_PATH at a custom config directory that provides them. Why they did that, unfortunately, remains a mystery.
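
The rule about the export line can be sketched as a small wrapper that sets AUGUSTUS_CONFIG_PATH only for the generic species. This is a minimal sketch, not part of Augustus: the function name choose_augustus_config and the CUSTOM_CONFIG variable are made up for illustration, and the path is the one from the example above.

```shell
# Custom config directory from the example above.
CUSTOM_CONFIG="/misc/scratch3/jasons/protist_gene_prediction/software/custom_augustus_config"

# Hypothetical helper: export the custom config path only for the generic model.
choose_augustus_config() {
    species="$1"
    if [ "$species" = "generic" ]; then
        export AUGUSTUS_CONFIG_PATH="$CUSTOM_CONFIG"
    fi
    # For any other species, keep whatever the environment already provides.
}

choose_augustus_config generic
echo "AUGUSTUS_CONFIG_PATH=${AUGUSTUS_CONFIG_PATH:-not set}"
```

Calling the helper with a pre-trained identifier such as human leaves the variable untouched.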

Pre-trained Species

Identifier | Species | Major Lineage
human | Homo sapiens | Opisthokonta (Metazoa)
fly | Drosophila melanogaster | Opisthokonta (Metazoa)
arabidopsis | Arabidopsis thaliana | Archaeplastida (Plantae)
brugia | Brugia malayi | Opisthokonta (Metazoa)
aedes | Aedes aegypti | Opisthokonta (Metazoa)
tribolium | Tribolium castaneum | Opisthokonta (Metazoa)
schistosoma | Schistosoma mansoni | Opisthokonta (Metazoa)
tetrahymena | Tetrahymena thermophila | SAR (Alveolata)
galdieria | Galdieria sulphuraria | Archaeplastida (Plantae)
maize | Zea mays | Archaeplastida (Plantae)
toxoplasma | Toxoplasma gondii | SAR (Alveolata)
caenorhabditis | Caenorhabditis elegans | Opisthokonta (Metazoa)
aspergillus_fumigatus | Aspergillus fumigatus | Opisthokonta (Fungi)
aspergillus_nidulans | Aspergillus nidulans | Opisthokonta (Fungi)
aspergillus_oryzae | Aspergillus oryzae | Opisthokonta (Fungi)
aspergillus_terreus | Aspergillus terreus | Opisthokonta (Fungi)
botrytis_cinerea | Botrytis cinerea | Opisthokonta (Fungi)
candida_albicans | Candida albicans | Opisthokonta (Fungi)
candida_guilliermondii | Candida guilliermondii | Opisthokonta (Fungi)
candida_tropicalis | Candida tropicalis | Opisthokonta (Fungi)
chaetomium_globosum | Chaetomium globosum | Opisthokonta (Fungi)
coccidioides_immitis | Coccidioides immitis | Opisthokonta (Fungi)
coprinus | Coprinus cinereus | Opisthokonta (Fungi)
coyote_tobacco | Nicotiana attenuata | Archaeplastida (Plantae)
cryptococcus_neoformans_gattii | Cryptococcus neoformans gattii | Opisthokonta (Fungi)
cryptococcus_neoformans_neoformans_B | Cryptococcus neoformans | Opisthokonta (Fungi)
debaryomyces_hansenii | Debaryomyces hansenii | Opisthokonta (Fungi)
encephalitozoon_cuniculi_GB | Encephalitozoon cuniculi | Opisthokonta (Fungi)
eremothecium_gossypii | Eremothecium gossypii | Opisthokonta (Fungi)
fusarium_graminearum | Fusarium graminearum | Opisthokonta (Fungi)
histoplasma_capsulatum | Histoplasma capsulatum | Opisthokonta (Fungi)
kluyveromyces_lactis | Kluyveromyces lactis | Opisthokonta (Fungi)
laccaria_bicolor | Laccaria bicolor | Opisthokonta (Fungi)
lamprey | Petromyzon marinus | Opisthokonta (Metazoa)
leishmania_tarentolae | Leishmania tarentolae | Excavata
lodderomyces_elongisporus | Lodderomyces elongisporus | Opisthokonta (Fungi)
magnaporthe_grisea | Magnaporthe grisea | Opisthokonta (Fungi)
neurospora_crassa | Neurospora crassa | Opisthokonta (Fungi)
phanerochaete_chrysosporium | Phanerochaete chrysosporium | Opisthokonta (Fungi)
pichia_stipitis | Pichia stipitis | Opisthokonta (Fungi)
rhizopus_oryzae | Rhizopus oryzae | Opisthokonta (Fungi)
saccharomyces_cerevisiae_S288C | Saccharomyces cerevisiae | Opisthokonta (Fungi)
schizosaccharomyces_pombe | Schizosaccharomyces pombe | Opisthokonta (Fungi)
thermoanaerobacter_tengcongensis | Thermoanaerobacter tengcongensis | Bacteria
trichinella | Trichinella spiralis | Opisthokonta (Metazoa)
ustilago_maydis | Ustilago maydis | Opisthokonta (Fungi)
yarrowia_lipolytica | Yarrowia lipolytica | Opisthokonta (Fungi)
nasonia | Nasonia vitripennis | Opisthokonta (Metazoa)
tomato | Solanum lycopersicum | Archaeplastida (Plantae)
chlamydomonas | Chlamydomonas reinhardtii | Archaeplastida
amphimedon | Amphimedon queenslandica | Opisthokonta (Metazoa)
pneumocystis | Pneumocystis jirovecii | Opisthokonta (Fungi)
wheat | Triticum aestivum | Archaeplastida (Plantae)
chicken | Gallus gallus | Opisthokonta (Metazoa)
zebrafish | Danio rerio | Opisthokonta (Metazoa)
E_coli_K12 | Escherichia coli | Bacteria
s_aureus | Staphylococcus aureus | Bacteria
volvox | Volvox carteri | Archaeplastida
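
Before starting a long run, it can help to confirm that the chosen identifier actually has parameters in the config directory in use. The sketch below assumes the usual Augustus layout, where each species has its own subdirectory under species/ in the config directory; the function name species_available is hypothetical. Augustus can also print its own list of identifiers with --species=help.

```shell
# Hypothetical helper: succeeds if the config directory holds a
# parameter directory for the given species identifier.
species_available() {
    # $1 = Augustus config directory, $2 = species identifier
    [ -d "$1/species/$2" ]
}

# Check a couple of identifiers against the config path currently in use.
config_dir="${AUGUSTUS_CONFIG_PATH:-}"
for sp in generic tetrahymena; do
    if species_available "$config_dir" "$sp"; then
        echo "$sp: parameters found"
    else
        echo "$sp: no parameters under $config_dir/species"
    fi
done
```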