benchmarking_universal_single-copy_orthologs_busco
Differences
This shows you the differences between two versions of the page.
| benchmarking_universal_single-copy_orthologs_busco [2024/12/10 13:59] – 129.173.94.151 | benchmarking_universal_single-copy_orthologs_busco [2024/12/10 14:49] (current) – 129.173.94.151 | ||
|---|---|---|---|
| Line 3: | Line 3: | ||
| Documentation by Dayana Salas-Leiva (last update by Dandan Zhao 12-10-2024) | Documentation by Dayana Salas-Leiva (last update by Dandan Zhao 12-10-2024) | ||
| - | Web: http:// | + | Web: http:// |
| - | User Guide: https:// | + | |
| - | You can check the completeness of your genome by identifying single-copy orthologs from the OrthoDB database. Newer versions of BUSCO utilize | + | You can check the completeness of your genome by identifying single-copy orthologs from the OrthoDB database. Newer versions of BUSCO use Metaeuk as the default gene predictor but can also be run with other tools such as tBLASTn, AUGUSTUS, Prodigal, and HMMER3. |
| - | **tBLASTn** for eukaryotic genome and prokaryotic transcriptome | + | tBLASTn for eukaryotic genome and prokaryotic transcriptome modes |
| - | **Augustus** for eukaryotic genome mode | + | |
| - | **Metaeuk** for eukaryotic genome and eukaryotic transcriptome modes | + | |
| - | **Prodigal** for prokaryotic genome mode | + | |
| - | **HMMER3** for all modes | + | |
| - | There are two main dataset versions of busco: **odb9** (contains 303 orthologs only compatible with Busco3) and **odb10** (contains 255 orthologous only compatible with Busco4) | + | Augustus for eukaryotic genome mode |
| - | The following are examples of a shell script for genomic and proteomic search: | + | Metaeuk for eukaryotic genome and eukaryotic transcriptome modes |
| + | |||
| + | Prodigal for prokaryotic genome mode | ||
| + | |||
| + | HMMER3 for all modes | ||
| + | |||
| + | There are two main dataset versions of BUSCO: odb9 (contains 303 orthologs, only compatible with BUSCO 3) and odb10 (contains 255 orthologs, compatible with BUSCO 4 and BUSCO 5) | ||
| + | |||
| + | The following are example shell scripts for a genomic and a proteomic search: | ||
| *** BUSCO 3 *** | *** BUSCO 3 *** | ||
| Line 29: | Line 32: | ||
| cd $PWD | cd $PWD | ||
| # | # | ||
| - | | ||
| | | ||
| - | | + | |
| - | | + | |
| conda deactivate | conda deactivate | ||
| Line 44: | Line 46: | ||
| cd $PWD | cd $PWD | ||
| # | # | ||
| - | | ||
| | | ||
| | | ||
| - | | + | |
| conda deactivate | conda deactivate | ||
| Line 60: | Line 61: | ||
| #$ -cwd | #$ -cwd | ||
| cd $PWD | cd $PWD | ||
| - | LINEAGEDB='/ | + | |
| - | source activate busco-4 | + | |
| # | # | ||
| - | | + | |
| - | busco -i < | + | busco -i < |
| Line 75: | Line 75: | ||
| #$ -cwd | #$ -cwd | ||
| cd $PWD | cd $PWD | ||
| - | LINEAGEDB='/ | + | |
| - | source activate busco-4 | + | |
| # | # | ||
| - | | + | |
| - | busco -i < | + | busco -i < |
| - | ** * BUSCO 5.2.2 * ** | + | ** * BUSCO 5.2.2 * ** |
| - | ** Genomic: default metaeuk** | + | ** Genomic: default metaeuk** |
| - | source activate busco-5 | + | |
| + | INPUT=' | ||
| + | OUTDIR=' | ||
| + | MODE=' | ||
| + | # setting the lineage db | ||
| + | ## the latest busco db for eukaryota is odb10 | ||
| + | LINEAGEDB='/ | ||
| + | ## busco v5 only works with odb10 | ||
| + | ## it will not work with odb9 | ||
| + | # run busco | ||
| + | ## do not specify output dir with a trailing slash, it will lead to a fatal error | ||
| + | ## modes are genome, proteins, transcriptome | ||
| + | ## the below command will use Metaeuk as gene predictor | ||
| + | busco \ | ||
| + | --in $INPUT \ | ||
| + | --out $OUTDIR \ | ||
| + | --mode $MODE \ | ||
| + | --lineage_dataset $LINEAGEDB \ | ||
| + | --cpu 8 | ||
| + | conda deactivate | ||
| - | | + | ** Proteomic: |
| - | | + | |
| - | | + | |
| - | + | ||
| - | # setting the lineage db | + | |
| - | ## the latest busco db for eukaryota is odb10 | + | |
| - | | + | |
| - | ## busco v5 only works with odb10 | + | |
| - | ## it will not work with odb9 | + | |
| - | + | ||
| - | + | ||
| - | # run busco | + | |
| - | ## do not specify output dir with a trailing slash, it will lead to a fatal error | + | |
| - | ## modes are genome, proteins, transcriptome | + | |
| - | ## the below command will use Metaeuk as gene predictor | + | |
| - | busco \ | + | |
| - | --in $INPUT \ | + | |
| - | --out $OUTDIR \ | + | |
| - | | + | |
| - | | + | |
| - | --cpu 8 | + | |
| - | + | ||
| - | conda deactivate | + | |
| - | + | ||
| - | + | ||
| - | **Proteomic: | + | |
| - | + | ||
| - | | + | |
| - | + | ||
| - | # in the busco-5 environment, | + | |
| - | # / | + | |
| - | # but we don't have writing permissions there | + | |
| - | # not sure why we need writing permissions but it doesnt work anyway | + | |
| - | # but we copied that dir to a place where we do have writing permissions: | + | |
| - | # you may want to copy it to your own home | + | |
| - | | + | |
| - | + | ||
| - | | + | |
| - | | + | |
| - | | + | |
| - | | + | |
| - | + | ||
| - | # setting the lineage db | + | |
| - | ## the latest version (as of writing) is odb10 | + | |
| - | | + | |
| - | ## busco v5 only works with odb10 | + | |
| - | ## it will not work with odb9 | + | |
| - | + | ||
| - | busco \ | + | |
| - | --in $INPUT \ | + | |
| - | --out $OUTDIR \ | + | |
| - | | + | |
| - | | + | |
| - | --cpu 8 \ | + | |
| - | | + | |
| - | | + | |
| - | # | + | |
| - | + | ||
| - | conda deactivate | + | |
| + | source activate busco-5 | ||
| + | # in the busco-5 environment, | ||
| + | # / | ||
| + | # but we don't have write permissions there | ||
| + | # it is unclear why write permissions are needed, but it does not work without them | ||
| + | # so we copied that dir to a place where we do have write permissions: | ||
| + | # you may want to copy it to your own home | ||
| + | export AUGUSTUS_CONFIG_PATH=" | ||
| + | INPUT=' | ||
| + | OUTDIR=' | ||
| + | MODE=' | ||
| + | AUGUSTUS_SPECIES=' | ||
| + | # setting the lineage db | ||
| + | ## the latest version (as of writing) is odb10 | ||
| + | LINEAGEDB='/ | ||
| + | ## busco v5 only works with odb10 | ||
| + | ## it will not work with odb9 | ||
| + | busco \ | ||
| + | --in $INPUT \ | ||
| + | --out $OUTDIR \ | ||
| + | --mode $MODE \ | ||
| + | --lineage_dataset $LINEAGEDB \ | ||
| + | --cpu 8 \ | ||
| + | --augustus \ | ||
| + | --augustus_species $AUGUSTUS_SPECIES | ||
| + | conda deactivate | ||
| + | | ||
| **Note**: Take out the mitochondrial genome before running this analysis. | **Note**: Take out the mitochondrial genome before running this analysis. | ||
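
One possible way to take out the mitochondrial genome is to filter its contig out of the assembly FASTA before passing the file to BUSCO. The sketch below is only an illustration: the function name `remove_contig` and the contig ID `chrM` are hypothetical, and it assumes the mitochondrial sequence is a single record whose ID (the first word after `>` in the header) is already known.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of BUSCO): print a FASTA file without one record.
# Assumption: the record to drop is identified by its exact ID, i.e. the first
# whitespace-delimited word of the header line, without the leading '>'.
remove_contig() {
    local contig_id="$1"   # e.g. the ID of the mitochondrial contig
    local fasta="$2"       # path to the input FASTA file
    awk -v id="$contig_id" '
        # On each header line, decide whether the following record is skipped.
        /^>/ { skip = ($1 == (">" id)) }
        # Print header and sequence lines of every record that is not skipped.
        !skip { print }
    ' "$fasta"
}
```

It could be used as `remove_contig chrM genome.fasta > genome.nomito.fasta`, with the filtered file then given to BUSCO as the input. Because the ID is matched exactly, a partial name such as `chr` does not remove anything.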
