benchmarking_universal_single-copy_orthologs_busco

Differences

This shows you the differences between two versions of the page.

benchmarking_universal_single-copy_orthologs_busco [2020/04/13 09:05] 24.138.68.92
benchmarking_universal_single-copy_orthologs_busco [2024/12/10 14:49] (current) 129.173.94.151
Line 1: Line 1:
 **BUSCO: Benchmarking Universal Single-Copy Orthologs**
  
-Documentation by Dayana Salas-Leiva (last update 04-13-2020)
+Documentation by Dayana Salas-Leiva (last update by Dandan Zhao 12-10-2024)
  
-Web:http://busco.ezlab.org/
+Web: http://busco.ezlab.org/ User Guide: https://busco.ezlab.org/busco_userguide.html
  
-You can check the completeness of your genome by picking out single-copy orthologs. **BUSCO** runs **tBLASTn**, **AUGUSTUS**, and **HMMER 3** based on single-copy orthologs from the **OrthoDB** database.
+You can check the completeness of your genome by identifying single-copy orthologs from the OrthoDB database. Newer versions of BUSCO use MetaEuk as the default gene predictor but can also be run with other tools such as tBLASTn, AUGUSTUS, Prodigal, and HMMER3:
  
-There are two main dataset versions of busco: **odb9** (contains 303 orthologs only compatible with Busco3) and **odb10** (contains 255 orthologous only compatible with Busco4)
+tBLASTn for eukaryotic genome and prokaryotic transcriptome modes
 +
 +AUGUSTUS for eukaryotic genome mode
 +
 +MetaEuk for eukaryotic genome and eukaryotic transcriptome modes
 +
 +Prodigal for prokaryotic genome mode
 +
 +HMMER3 for all modes
 +
 +There are two main BUSCO dataset versions: odb9 (the eukaryota set contains 303 orthologs; compatible only with BUSCO 3) and odb10 (the eukaryota set contains 255 orthologs; compatible with BUSCO 4 and 5).
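
The dataset/version pairing described on this page (odb9 with BUSCO 3; odb10 with BUSCO 4 and, per the script comments further down, BUSCO 5) can be checked before a job is submitted. The following is a minimal sketch, not part of BUSCO itself: the function names `lineage_release` and `lineage_matches_busco` are hypothetical, and the check assumes the lineage directory name ends in `_odb<number>`, as in `eukaryota_odb10`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a BUSCO command): infer the OrthoDB release from a
# lineage directory name such as ".../lineages/eukaryota_odb10/".
lineage_release() {
    local path="${1%/}"          # drop one trailing slash, if present
    local name="${path##*/}"     # keep only the last path component
    case "$name" in
        *_odb[0-9]*) printf '%s\n' "odb${name##*_odb}" ;;
        *) return 1 ;;           # name does not look like a lineage dataset
    esac
}

# Succeeds only when the release matches the BUSCO major version, using the
# pairing stated on this page (v3 -> odb9; v4 and v5 -> odb10).
lineage_matches_busco() {
    local major="$1" release
    release="$(lineage_release "$2")" || return 1
    case "$major:$release" in
        3:odb9|4:odb10|5:odb10) return 0 ;;
        *) return 1 ;;
    esac
}
```

A job script could call `lineage_matches_busco 5 "$LINEAGEDB" || exit 1` before invoking `busco`, so that a mismatched dataset fails immediately rather than partway through the run.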
  
 The following are example shell scripts for a genomic and a proteomic search:
Line 71: Line 81:
  
  
 +** * BUSCO 5.2.2 * ** 
 +
 +** Genomic: default metaeuk** 
 +
 +    source activate busco-5
 +    INPUT='contigs_clean.fasta'
 +    OUTDIR='busco5_out'
 +    MODE='genome'
 +    # setting the lineage db
 +    ## the latest busco db for eukaryota is odb10
 +    LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/'
 +    ## busco v5 only works with odb10
 +    ## it will not work with odb9
 +    # run busco
 +    ## do not specify output dir with a trailing slash, it will lead to a fatal error
 +    ## modes are genome, proteins, transcriptome
 +    ## the below command will use Metaeuk as gene predictor
 +    busco \
 +        --in $INPUT \
 +        --out $OUTDIR \
 +        --mode $MODE \
 +        --lineage_dataset $LINEAGEDB \
 +        --cpu 8
 +    conda deactivate
 +
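
Once a run like the one above finishes, the completeness figures can be pulled out of the short summary text file that BUSCO writes into the output directory. This is a sketch under assumptions: it relies on the summary containing a one-line score of the form `C:..%[S:..%,D:..%],F:..%,M:..%,n:N`, the exact summary file name varies by version and lineage and is not fixed here, and the function names `busco_score_line` and `busco_complete_pct` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading a BUSCO short summary file.

# Print the first one-line score found in the file given as $1.
busco_score_line() {
    grep -m1 -oE 'C:[0-9.]+%\[S:[0-9.]+%,D:[0-9.]+%\],F:[0-9.]+%,M:[0-9.]+%,n:[0-9]+' "$1"
}

# Print only the "complete" percentage (the number after "C:").
busco_complete_pct() {
    busco_score_line "$1" | sed -E 's/^C:([0-9.]+)%.*/\1/'
}
```

For example, `busco_complete_pct "$OUTDIR"/short_summary*.txt` would print the completeness percentage, assuming exactly one matching summary file exists in the output directory.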
 +** Genomic: AUGUSTUS** 
  
 +    source activate busco-5
 +    # in the busco-5 environment, AUGUSTUS_CONFIG_PATH is set to
 +    # /scratch2/software/anaconda/envs/busco-5/config/
 +    # but we don't have write permission there, and the run fails without it
 +    # (BUSCO's AUGUSTUS retraining step writes species parameters into this dir),
 +    # so we copied that dir to a place where we do have write permission;
 +    # you may want to copy it to your own home
 +    export AUGUSTUS_CONFIG_PATH="$HOME/busco/config/"
 +    INPUT='contigs_clean.fasta'
 +    OUTDIR='busco5_contigs_clean_out'
 +    MODE='genome'
 +    AUGUSTUS_SPECIES='leishmania_tarentolae'
 +    # setting the lineage db
 +    ## the latest version (as of writing) is odb10
 +    LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/'
 +    ## busco v5 only works with odb10
 +    ## it will not work with odb9
 +    busco \
 +        --in $INPUT \
 +        --out $OUTDIR \
 +        --mode $MODE \
 +        --lineage_dataset $LINEAGEDB \
 +        --cpu 8 \
 +        --augustus \
 +        --augustus_species $AUGUSTUS_SPECIES
 +    conda deactivate
 +    
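
The script comments above describe copying the AUGUSTUS config directory to a writable location and pointing `AUGUSTUS_CONFIG_PATH` at the copy. A minimal sketch of that step follows; the function name `setup_augustus_config` is hypothetical, and the source and destination paths are whatever applies on your system (the paths in the script above are specific to this cluster).

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy an AUGUSTUS config directory to a writable place
# (only if the copy does not exist yet) and export AUGUSTUS_CONFIG_PATH.
# Usage: setup_augustus_config <source_config_dir> <writable_destination_dir>
setup_augustus_config() {
    local src="$1" dest="${2%/}"
    if [ ! -d "$dest" ]; then
        mkdir -p "$(dirname "$dest")" || return 1
        cp -r "$src" "$dest" || return 1
    fi
    # Refuse to continue if the destination is still not writable.
    [ -w "$dest" ] || return 1
    export AUGUSTUS_CONFIG_PATH="$dest/"
}
```

Called as `setup_augustus_config /scratch2/software/anaconda/envs/busco-5/config "$HOME/busco/config"`, this would reproduce the manual copy described in the comments and leave an existing copy untouched on later runs.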
  
 **Note**: Take out the mitochondrial genome before running this analysis.
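
One way to take the mitochondrial sequences out of an assembly is to filter the FASTA file by sequence ID. The sketch below assumes you already know which contigs are mitochondrial and have listed their IDs, one per line, in a text file; how those contigs are identified is outside the scope of this example, and the function name `drop_fasta_records` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a FASTA file to stdout without the records whose
# ID (first word after ">") appears in a list of IDs.
# Usage: drop_fasta_records ids_to_drop.txt input.fasta > output.fasta
drop_fasta_records() {
    awk '
        # First file: collect the IDs to drop (blank lines are ignored).
        FILENAME == ARGV[1] { if ($1 != "") drop[$1] = 1; next }
        # Header line: decide whether to keep this record.
        /^>/ { id = substr($1, 2); keep = !(id in drop) }
        # Print header and sequence lines of kept records only.
        keep
    ' "$1" "$2"
}
```

For example, `drop_fasta_records mito_ids.txt contigs_clean.fasta > contigs_nomito.fasta` would produce an input file for BUSCO with the listed contigs removed; `mito_ids.txt` and `contigs_nomito.fasta` are placeholder names.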