gene_prediction_with_braker2_pipeline
Differences
This shows you the differences between two versions of the page.
| gene_prediction_with_braker2_pipeline [2025/11/18 12:17] – [Braker2] 134.190.191.148 | gene_prediction_with_braker2_pipeline [2025/11/18 14:23] (current) – [Genome-guided transcriptome assembly] 134.190.191.148 | ||
|---|---|---|---|
| Line 130: | Line 130: | ||
| #$ -cwd | #$ -cwd | ||
| #$ -pe threaded 10 | #$ -pe threaded 10 | ||
| + | |||
| cd $PWD | cd $PWD | ||
| + | |||
| source activate trinity-2.11-with-workaround | source activate trinity-2.11-with-workaround | ||
| - | Trinity --CPU 10 --max_memory 100G --genome_guided_bam yourgenome.fasta.sambamsorted.bam --genome_guided_max_intron 1000 --SS_lib_type RF | + | |
| + | Trinity | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| conda deactivate | conda deactivate | ||
| </ | </ | ||
| Line 149: | Line 158: | ||
| [[https:// | [[https:// | ||
| - | - 1. Intron start and end coordinates (//intron hints//) are extracted from the RNAseq BAM file | + | - Intron start and end coordinates (//intron hints//) are extracted from the RNAseq BAM file |
| - | - 2. These are then used along with the genome FASTA file to train GeneMarkET | + | - These are then used along with the genome FASTA file to train GeneMarkET |
| - | - 3. The trained GeneMarkET performs an "//ab initio//" | + | - The trained GeneMarkET performs an "//ab initio//" |
| - | - 4. Those predicted gene structures for which all introns are supported by the RNAseq data (//anchored introns//) are selected to train AUGUSTUS | + | - Those predicted gene structures for which all introns are supported by the RNAseq data (//anchored introns//) are selected to train AUGUSTUS |
| - | - 5. The trained AUGUSTUS now predicts gene structures using again the intron hints as " | + | - The trained AUGUSTUS now predicts gene structures using again the intron hints as " |
| {{:: | {{:: | ||
| Line 161: | Line 170: | ||
| If you only use RNAseq as extrinsic evidence, you essentially can only use //donor splice site// and //acceptor splice site// hints. If you also have protein homology information, | If you only use RNAseq as extrinsic evidence, you essentially can only use //donor splice site// and //acceptor splice site// hints. If you also have protein homology information, | ||
| + | The intron hints carry explicit location information and influence the optimal path through the generalized hidden Markov model (GHMM) (AUGUSTUS+ paper, Stanke et al. 2006). Because the model is probabilistic, **hints can sometimes be ignored if the intrinsic information is strong enough!** | ||
| Predict genes using GeneMark-ET and AUGUSTUS through BRAKER2: | Predict genes using GeneMark-ET and AUGUSTUS through BRAKER2: | ||
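
The "anchored introns" selection described in the step list above (keep only predicted gene structures whose introns are all supported by RNAseq) can be illustrated with a small filter. This is only a sketch of the idea, not what BRAKER2 runs internally. The function name `select_anchored` and both tab-separated input layouts (hints: `seq start end`; predicted introns: `gene seq start end`) are assumptions made for the illustration, not formats the pipeline produces.

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of genes whose introns are ALL
# present in the intron-hint table (the "anchored" gene structures).
#   $1 = hints TSV:             seq <TAB> start <TAB> end
#   $2 = predicted introns TSV: gene <TAB> seq <TAB> start <TAB> end
# Both layouts are illustrative assumptions, not BRAKER2 file formats.
select_anchored() {
  awk -F'\t' '
    # First file: remember every RNAseq-supported intron.
    NR == FNR { hint[$1 FS $2 FS $3] = 1; next }
    # Second file: flag a gene as soon as one intron lacks support.
    {
      seen[$1] = 1
      if (!(($2 FS $3 FS $4) in hint)) bad[$1] = 1
    }
    # Report genes that were seen and never flagged.
    END { for (g in seen) if (!(g in bad)) print g }
  ' "$1" "$2" | sort
}
```

Calling `select_anchored hints.tsv introns.tsv` prints one gene ID per line. Single-exon genes never appear in the intron table, so this sketch neither selects nor rejects them. The hints file is also assumed to be non-empty, because the `NR == FNR` idiom misreads the second file when the first is empty.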
