functional_annotation_with_the_funannotate_pipeline

Differences

This shows you the differences between two versions of the page.

functional_annotation_with_the_funannotate_pipeline [2024/12/18 10:45] kathy
functional_annotation_with_the_funannotate_pipeline [2025/12/09 13:02] (current) – [EggNOG mapping] 134.190.190.181
Line 12: Line 12:
  
 In my experience it was easiest to run the InterProScan search, EggNOG mapping and SignalP prediction separately, and then point to each of the resulting outfiles in the final ''funannotate annotate'' step.
 +
 +**Important note!!!**
 +
 +Before proceeding, check your gff3 file for errors by running ''validate_gene_models_in_gff3.py''. (Activate the gffutils environment to run this script; you may also have to comment out the ''import regex as re'' line if you're not checking Blastocystis genomes.)
 + 
 +This script locates errors in your gff3 file that often arise from manual editing (premature stop codons, incorrect exon numbering, missing start or stop codons, start or stop locations that do not match exon locations, etc.). An incorrect phase designation for an exon can also cause premature stop codons, so keep that in mind when looking for the cause of a premature stop.
 +
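As a rough illustration of the phase issue mentioned above, the sketch below recomputes the phase each CDS segment should carry from the lengths of the preceding segments. It is not part of ''validate_gene_models_in_gff3.py''. The function names ''expected_phases'' and ''phase_mismatches'' and the tuple layout are hypothetical, and it assumes the CDS segments of one transcript are supplied in transcription order (descending coordinates for minus-strand genes).

```python
def expected_phases(cds_lengths, first_phase=0):
    """Return the GFF3 phase expected for each CDS segment.

    cds_lengths must be in transcription order (hypothetical input layout).
    """
    phases = []
    phase = first_phase
    for length in cds_lengths:
        phases.append(phase)
        # Bases left at the end of this segment after skipping `phase`
        # bases and reading as many complete codons as possible.
        leftover = (length - phase) % 3
        # The next segment must contribute the bases that finish that codon.
        phase = (3 - leftover) % 3
    return phases


def phase_mismatches(cds_segments):
    """Return indices of segments whose recorded phase differs from the
    expected one.

    cds_segments: list of (start, end, phase) tuples, 1-based inclusive
    coordinates, in transcription order. The recorded phase of the first
    segment is trusted as the starting point.
    """
    if not cds_segments:
        return []
    lengths = [end - start + 1 for start, end, _ in cds_segments]
    expected = expected_phases(lengths, first_phase=cds_segments[0][2])
    return [
        i
        for i, ((_, _, recorded), exp) in enumerate(zip(cds_segments, expected))
        if recorded != exp
    ]
```

A segment flagged here is a candidate cause for a premature stop in the translated protein, but the real script should remain the reference check.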
  
 ==== Pre-step to prepare files ====
  
-To get the results from InterProScan, Eggnog mapper and SignalP to integrate into funannotate annotation results you need to prepare the gff3 file and protein data using the two funannotate scripts below
+To integrate the results from InterProScan, EggNOG mapper and SignalP into the funannotate annotate results, you need to prepare the gff3 file and protein data using the two funannotate scripts below:
 + 
  
 <code>
Line 40: Line 49:
 You will use the protein_file.faa generated above as the input for the InterProScan (''--input''), EggNOG mapping (''-i'') and SignalP (''--fastafile'') scripts below, and the renamed.gff3 file in the funannotate annotate script below (see special note).
  
 +If you skip the above steps, your results from InterProScan, EggNOG mapping and SignalP will not appear in the final output from funannotate annotate!
  
 ==== InterProScan ====
Line 87: Line 97:
 #$ -cwd
 #$ -q 256G-batch
-#$ -m bea 
-#$ -M joran.martijn@dal.ca 
 #$ -pe threaded 40
  
functional_annotation_with_the_funannotate_pipeline.1734533112.txt.gz · Last modified: by kathy