gene_prediction_with_funannotate

Differences

This shows you the differences between two versions of the page.

gene_prediction_with_funannotate [2023/10/06 15:00] – [Predict] 134.190.232.90gene_prediction_with_funannotate [2026/02/26 12:11] (current) 129.173.242.70
Line 1: Line 1:
 ====== Gene prediction with the Funannotate pipeline ====== ====== Gene prediction with the Funannotate pipeline ======
  
-Joran Martijn (December 2022)+Created by Joran Martijn in December 2022 
 + 
 +Updated by Jason Shao on February 26th, 2026
  
 Funannotate is a genome prediction, annotation, and comparison software package. It was originally written to annotate fungal genomes (small eukaryotes ~ 30 Mb genomes), but has evolved over time to accommodate larger genomes.  Funannotate is a genome prediction, annotation, and comparison software package. It was originally written to annotate fungal genomes (small eukaryotes ~ 30 Mb genomes), but has evolved over time to accommodate larger genomes. 
  
-In my experience it seems to do quite a lot better in predicting gene models than the Braker2 pipeline with //Ergobibamus cyprionides// . In addition to gene prediction, it can also facilitate functional annotation (hence the name FUNctional - ANNOTATE)+In my experience it seems to do quite a lot better in predicting gene models than the Braker2 pipeline with //Ergobibamus cyprionides// . In addition to gene prediction, it can also facilitate functional annotation (hence the name FUNctional - ANNOTATE - though it may also refer to FUNgi, which was its original target clade)
  
 An additional advantage is that it has the capacity to prepare all the files necessary for a NCBI GenBank submission. An additional advantage is that it has the capacity to prepare all the files necessary for a NCBI GenBank submission.
Line 135: Line 137:
 NOTE also that this step generates the `funannotate_out` output directory, which can be used as an input argument in future funannotate jobs. NOTE also that this step generates the `funannotate_out` output directory, which can be used as an input argument in future funannotate jobs.
  
 +An obscure error can occur at the PASA step with funannotate 1.8.17. If the run fails there, check:
 +
 +''funannotate_out/pasa-transdecoder.log''
 +<code>
 +...
 +CMD: cdna_alignment_orf_to_genome_orf.pl Blastocystis_ST2_pasa.assemblies.fasta.transdecoder.gff3 Blastocystis_ST2_pasa.pasa_assemblies.gff3 Blastocystis_ST2_pasa.assemblies.fasta > Blastocystis_ST2_pasa.assemblies.fasta.transdecoder.genome.gff3
 +sh: 1: cdna_alignment_orf_to_genome_orf.pl: not found
 +Error, cmd: cdna_alignment_orf_to_genome_orf.pl Blastocystis_ST2_pasa.assemblies.fasta.transdecoder.gff3 Blastocystis_ST2_pasa.pasa_assemblies.gff3 Blastocystis_ST2_pasa.assemblies.fasta > Blastocystis_ST2_pasa.assemblies.fasta.transdecoder.genome.gff3 died with ret 32512 at /home/jasons/.conda/envs/funannotate_jds/opt/pasa-2.5.3/scripts/pasa_asmbls_to_training_set.dbi line 150.
 +</code>
 +
 +However, ''cdna_alignment_orf_to_genome_orf.pl'' is in fact shipped with 1.8.17 (twice, no less); the directory that contains it is simply not on ''PATH'', so the shell cannot find it.
 +
 +A simple fix would be to include this ''export'' statement in your submission script:
 +<code>
 +export PATH="$CONDA_PREFIX/opt/transdecoder/util:$PATH"
 +</code>
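Before submitting the job, it may be worth confirming that the ''export'' actually makes the script resolvable. The following is a minimal sketch, assuming the conda environment is already activated so that ''CONDA_PREFIX'' is set; ''check_on_path'' is a hypothetical helper name, not part of funannotate or PASA.

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
check_on_path() {
    command -v "$1" >/dev/null 2>&1
}

# Same export as above; CONDA_PREFIX is assumed to be set by the active conda environment.
util_dir="${CONDA_PREFIX:-}/opt/transdecoder/util"
export PATH="$util_dir:$PATH"

if check_on_path cdna_alignment_orf_to_genome_orf.pl; then
    echo "cdna_alignment_orf_to_genome_orf.pl found on PATH"
else
    echo "cdna_alignment_orf_to_genome_orf.pl still missing; check $util_dir" >&2
fi
```

If the script is reported as missing, the TransDecoder utilities may live somewhere else in your particular environment.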
 ==== Predict ==== ==== Predict ====
  
Line 166: Line 184:
  
 NOTE: If you are running ''funannotate predict'' outside of the ''funannotate'' conda environment on Perun (for example if you are running it within a distinct contained environment as part of a Snakemake workflow), it may complain that the funannotate database is not properly configured. It is actually already available and properly configured at ''/scratch4/db/funannotate'', but your particular funannotate installation may not know about it. Do ''export FUNANNOTATE_DB="/scratch4/db/funannotate"'' prior to your execution and it should work now. NOTE: If you are running ''funannotate predict'' outside of the ''funannotate'' conda environment on Perun (for example if you are running it within a distinct contained environment as part of a Snakemake workflow), it may complain that the funannotate database is not properly configured. It is actually already available and properly configured at ''/scratch4/db/funannotate'', but your particular funannotate installation may not know about it. Do ''export FUNANNOTATE_DB="/scratch4/db/funannotate"'' prior to your execution and it should work now.
 +
 +To verify the versions of the databases:
 +<code>
 +funannotate database
 +</code>
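The export from the note above and the version check can be combined in a submission script. This is only a sketch: ''set_funannotate_db'' is a hypothetical helper, and the path is the Perun location mentioned above, so adjust it for other systems.

```shell
# Hypothetical helper: export FUNANNOTATE_DB only if the directory actually exists.
set_funannotate_db() {
    db_dir="$1"
    if [ -d "$db_dir" ]; then
        export FUNANNOTATE_DB="$db_dir"
    else
        echo "Database directory not found: $db_dir" >&2
        return 1
    fi
}

# Path taken from the note above (Perun); an assumption on any other system.
if set_funannotate_db "/scratch4/db/funannotate"; then
    # List the installed databases and their versions, if funannotate is available.
    if command -v funannotate >/dev/null 2>&1; then
        funannotate database
    fi
fi
```

Failing early on a missing directory avoids launching a long ''funannotate predict'' job that would only complain about the database later.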
 +
 +If for some reason you need to re-install the databases from scratch, you can do so with:
 +<code>
 +funannotate setup -d <your_dir>
 +</code>
 +
 +And if you do this on a shared system, you might receive this error:
 +<code>
 +urllib.error.HTTPError: HTTP Error 403: Forbidden
 +</code>
 +
 +This is a known issue with GO, and possibly other database hosts, which reject requests coming through institutional proxies as "automated scraping".
 +The fix is to make the following modifications so that the request appears to come from a regular browser:
 +
 +''.conda/envs/funannotate/lib/python3.xx/site-packages/funannotate/setupDB.py''
 +<code>
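 +# line 9: import Request alongside urlopen
 +from urllib.request import urlopen, Request
 +...
 +# lines 75-76: send a browser-like User-Agent header with the request
 +req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
 +u = urlopen(req)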
 +</code>
 +Make sure to indent the edited lines with spaces, not tabs.
  
 Many of the required inputs do not have to be explicitly specified, since they have been generated in the previous ''funannotate train'' step and they are available in the ''funannotate_out'' directory. Many of the required inputs do not have to be explicitly specified, since they have been generated in the previous ''funannotate train'' step and they are available in the ''funannotate_out'' directory.