automatically_fixing_gene_models

automatically_fixing_gene_models [2024/07/16 11:07] (current) – [Running fix_genes_with_false_introns.py] 134.190.232.164
Each newly created gene in the output GFF3 file has ''fix_genes.py'' in its source field (the second column). To see only the newly created genes, extract them with ''awk '$2=="fix_genes.py"' <output_GFF3_file> > only_new_genes.gff3''. I'd recommend loading the original GFF3, the entire updated GFF3, and a GFF3 with just the new genes into IGV. This lets you check whether the updated genes actually make sense.
  
Hopefully they do! If they don't, contact me (Joran) or try to see if you can update the code yourself. So far I've tested the script on Ergobibamus cyprinoides (which has a relatively simple genome structure) and Meteora (which is a bit more complicated). I already had to adjust the code quite a bit to yield sensible results with Meteora, so chances are it may not work very well with another genome.
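If you'd rather do the source-field filtering in Python than in awk, something like the sketch below should work. It is not part of the script itself: the function names (''filter_by_source'', ''count_feature_types'') and the example lines are made up for illustration, and it assumes a standard tab-delimited GFF3 with the source in column 2 and the feature type in column 3. Counting the feature types of the new records is just a quick sanity check before you open IGV.

```python
from collections import Counter


def filter_by_source(gff3_lines, source="fix_genes.py"):
    """Keep only feature lines whose source column (column 2) equals `source`."""
    kept = []
    for line in gff3_lines:
        # Skip comments/directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2 and fields[1] == source:
            kept.append(line)
    return kept


def count_feature_types(gff3_lines):
    """Count how often each feature type (column 3) occurs."""
    counts = Counter()
    for line in gff3_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


# Hypothetical example records, only to show the behaviour.
example = [
    "##gff-version 3\n",
    "chr1\tfix_genes.py\tgene\t1\t100\t.\t+\t.\tID=g1\n",
    "chr1\tAUGUSTUS\tgene\t200\t300\t.\t+\t.\tID=g2\n",
    "chr1\tfix_genes.py\tmRNA\t1\t100\t.\t+\t.\tID=g1.t1;Parent=g1\n",
]
new_only = filter_by_source(example)
print(count_feature_types(new_only))
```

On a real file you would pass the opened file handle instead of ''example'' and write the kept lines to ''only_new_genes.gff3''.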
  
Remember that this script will not always yield correct genes. It is very simple: it looks only for the longest ORFs in a defined region (while respecting supported introns, of course). Sometimes a shorter version of an ORF is the one that actually corresponds to a gene. It's up to you to detect these cases and curate them by hand. The script does a lot of the heavy lifting for you, but it won't be perfect.
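To make the "longest ORF" idea and its blind spot concrete, here is a minimal sketch. It is not the code used in the script: ''longest_orf'' is a hypothetical helper that scans only the forward strand, ignores introns entirely, and assumes ATG starts and the standard stop codons (TAA, TAG, TGA).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumption: standard genetic code


def longest_orf(seq):
    """Return (start, end, frame) of the longest ATG..stop ORF, or None.

    Coordinates are 0-based, end-exclusive, forward strand only.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                # The first ATG after the previous stop opens the ORF.
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if best is None or end - start > best[1] - best[0]:
                    best = (start, end, frame)
                start = None
    return best


# Two in-frame ATGs before one stop: the most upstream ATG always wins.
print(longest_orf("ATGAAAATGCCCTAA"))
```

In the example the ORF starting at the first ATG is reported, even though the real gene could just as well start at the second, in-frame ATG. That is exactly the kind of case you'd need to catch and curate by hand.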