gene_prediction_just_genemark
Differences
This shows you the differences between two versions of the page.
| gene_prediction_just_genemark [2023/02/23 12:57] – 134.190.232.186 | gene_prediction_just_genemark [2026/02/26 11:53] (current) – 129.173.242.70 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Gene prediction with just GeneMark ====== | ====== Gene prediction with just GeneMark ====== | ||
| - | Joran Martijn | + | Created by Joran Martijn |
| + | |||
| + | Updated by Jason Shao on February 26th, 2026. | ||
| **GeneMark** is one of the oldest gene prediction tools still in development, | **GeneMark** is one of the oldest gene prediction tools still in development, | ||
| Line 40: | Line 42: | ||
| This is perhaps the most straightforward and pure //ab initio// gene prediction tool. Only the genome FASTA file is provided, and the algorithm will do its best without any external sources of evidence or training input (hence Self-training), | This is perhaps the most straightforward and pure //ab initio// gene prediction tool. Only the genome FASTA file is provided, and the algorithm will do its best without any external sources of evidence or training input (hence Self-training), | ||
| + | |||
| + | Create a conda environment for GeneMark-ES: | ||
| + | |||
| + | < | ||
| + | conda create -n genemark-es perl perl-mce perl-yaml perl-hash-merge perl-parallel-forkmanager | ||
| + | </ | ||
| + | |||
| + | Running GeneMark-ES: | ||
| < | < | ||
| + | source activate genemark-es | ||
| gmes_petap.pl --sequence < | gmes_petap.pl --sequence < | ||
| </ | </ | ||
| Line 72: | Line 83: | ||
| NOTE that '' | NOTE that '' | ||
| + | ==== Running GeneMark on perun ==== | ||
| + | As far as I know, there is no working environment on perun dedicated to GeneMark. However, braker2 calls GeneMark, so the braker2 environment has all the dependencies needed to run GeneMark as well. | ||
| + | |||
| + | < | ||
| + | #!/bin/bash | ||
| + | #$ -S /bin/bash | ||
| + | #$ -cwd | ||
| + | #$ -m bea | ||
| + | #$ -pe threaded 20 | ||
| + | |||
| + | source activate braker2 | ||
| + | |||
| + | # add gmes_petap.pl installation location to the $PATH | ||
| + | export PATH="/ | ||
| + | |||
| + | # input | ||
| + | ORIGINAL_GENOME=' | ||
| + | RNASEQ=' | ||
| + | THREADS=20 | ||
| + | |||
| + | # if you have no transcriptome data and you just want to do ab initio gene prediction | ||
| + | gmes_petap.pl --sequence $ORIGINAL_GENOME --ES --cores=$THREADS | ||
| + | |||
| + | # if you have a fungal-like genome, use GeneMark-ES with --fungus | ||
| + | gmes_petap.pl --sequence $ORIGINAL_GENOME --ES --cores=$THREADS --fungus | ||
| + | |||
| + | # if you have transcriptome data, use GeneMark-ET | ||
| + | ## get intron hints from the RNA-seq alignment BAM file | ||
| + | bam2hints --intronsonly --minintronlen 20 --in=$RNASEQ --out=intron_hints.gff | ||
| + | ## process hints | ||
| + | cat intron_hints.gff | sort -n -k4,4 | sort -s -n -k5,5 | sort -s -n -k3,3 | sort -s -k1,1 > intron_hints.sort.gff | ||
| + | join_multiple_hints.pl < intron_hints.sort.gff > hintsfile.tmp.gff | ||
| + | filterIntronsFindStrand.pl < | ||
| + | ## run GeneMark-ET | ||
| + | gmes_petap.pl --verbose --sequence=$ORIGINAL_GENOME --ET=hintsfile.gff --et_score 10 --cores=2 | ||
| + | |||
| + | </ | ||
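
The job script above contains three alternative ''gmes_petap.pl'' invocations (plain ES, ES with ''--fungus'', and ET with a hints file), and in practice you would run only one of them. The following is a minimal sketch of one way to select the invocation, assuming a hypothetical helper function ''build_genemark_cmd'' that is not part of GeneMark. It only prints the command line as a dry run, and it uses only the flags already shown in the script. The file names are placeholders.

```shell
#!/bin/bash
# Sketch only: build_genemark_cmd is a hypothetical helper, not part of GeneMark.
# It prints the gmes_petap.pl command for one of the three cases in the job script.
build_genemark_cmd() {
    local genome="$1"     # path to the genome FASTA file
    local threads="$2"    # number of cores to pass to --cores
    local mode="$3"       # "es", "fungus", or "et"
    local hints="${4:-}"  # hints GFF file, only needed for mode "et"

    case "$mode" in
        es)
            echo "gmes_petap.pl --sequence $genome --ES --cores=$threads"
            ;;
        fungus)
            echo "gmes_petap.pl --sequence $genome --ES --cores=$threads --fungus"
            ;;
        et)
            if [ -z "$hints" ]; then
                echo "error: mode 'et' needs a hints file" >&2
                return 1
            fi
            echo "gmes_petap.pl --sequence $genome --ET=$hints --et_score 10 --cores=$threads"
            ;;
        *)
            echo "error: unknown mode '$mode'" >&2
            return 1
            ;;
    esac
}

# Dry run with placeholder file names
build_genemark_cmd genome.fasta 20 es
# prints: gmes_petap.pl --sequence genome.fasta --ES --cores=20
```

Printing the command first makes it easy to check the flags before submitting the job. Once the output looks right, the printed line can be pasted into the job script in place of the two invocations that do not apply.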
