benchmarking_universal_single-copy_orthologs_busco

Differences

This shows you the differences between two versions of the page.

benchmarking_universal_single-copy_orthologs_busco [2024/12/10 14:00] 129.173.94.151
benchmarking_universal_single-copy_orthologs_busco [2024/12/10 14:49] (current) 129.173.94.151
Line 3: Line 3:
 Documentation by Dayana Salas-Leiva (last update by Dandan Zhao 12-10-2024) Documentation by Dayana Salas-Leiva (last update by Dandan Zhao 12-10-2024)
  
-Web: http://busco.ezlab.org/ +Web: http://busco.ezlab.org/ User Guide: https://busco.ezlab.org/busco_userguide.html
-User Guide: https://busco.ezlab.org/busco_userguide.html+
  
-You can check the completeness of your genome by identifying single-copy orthologs from the OrthoDB database. Newer versions of BUSCO use **Metaeuk** as the default gene predictor but can also be run with other tools such as **tBLASTn**, **AUGUSTUS**, **Prodigal**, and **HMMER3**.+You can check the completeness of your genome by identifying single-copy orthologs from the OrthoDB database. Newer versions of BUSCO use Metaeuk as the default gene predictor but can also be run with other tools such as tBLASTn, AUGUSTUS, Prodigal, and HMMER3.
  
-**tBLASTn** for eukaryotic genome and prokaryotic transcriptome modes+tBLASTn for eukaryotic genome and prokaryotic transcriptome modes
  
-**Augustus** for eukaryotic genome mode+Augustus for eukaryotic genome mode
  
-**Metaeuk** for eukaryotic genome and eukaryotic transcriptome modes+Metaeuk for eukaryotic genome and eukaryotic transcriptome modes
  
-**Prodigal** for prokaryotic genome mode+Prodigal for prokaryotic genome mode
  
-**HMMER3** for all modes+HMMER3 for all modes
  
 +There are two main BUSCO dataset versions: odb9 (the eukaryota set contains 303 orthologs; compatible only with BUSCO 3) and odb10 (the eukaryota set contains 255 orthologs; compatible with BUSCO 4 and 5).
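
Because a mismatched dataset only fails once the job is running, it can help to check the pairing before submitting. The following is a minimal sketch, assuming the lineage directory name ends in `_odb9` or `_odb10` as in the paths used on this page; `check_lineage` is a hypothetical helper, not part of BUSCO.

```shell
# Hypothetical helper (not part of BUSCO): succeed when the lineage directory
# name matches the BUSCO major version, following the pairing described on
# this page (BUSCO 3 with odb9, BUSCO 4 and 5 with odb10).
check_lineage() {
    _name="${2%/}"          # drop one trailing slash, if present
    _name="${_name##*/}"    # keep only the last path component
    case "$1:$_name" in
        3:*_odb9|4:*_odb10|5:*_odb10) return 0 ;;
        *) return 1 ;;
    esac
}

LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/'
if check_lineage 5 "$LINEAGEDB"; then
    echo "lineage matches BUSCO 5"
else
    echo "lineage does not match BUSCO 5" >&2
fi
```

The check only inspects the directory name, so it will not catch a renamed or incomplete dataset.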
  
-There are two main dataset versions of busco: **odb9** (contains 303 orthologs only compatible with Busco3) and **odb10** (contains 255 orthologs only compatible with Busco4) +The following are example shell scripts for genomic and proteomic searches:
- +
-The following are example shell scripts for genomic and proteomic searches:
  
 ***  BUSCO 3 *** ***  BUSCO 3 ***
Line 34: Line 32:
    cd $PWD    cd $PWD
    #busco3.0.0    #busco3.0.0
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/odb9/lineages/eukaryota_odb9/' 
    source activate busco-3    source activate busco-3
-   export AUGUSTUS_CONFIG_PATH="$HOME/Shared/BUSCO/config" +   export AUGUSTUS_CONFIG_PATH="/home/dsalas/Shared/BUSCO/config" 
-   run_BUSCO.py -i <fasta_file> -o <output_dir_name> -l $LINEAGEDB -m geno --cpu 1+   run_BUSCO.py -i <fasta_file> -o <output_dir_name> -l /home/dsalas/Shared/BUSCO/eukaryota_odb9 -m geno --cpu 1
    conda deactivate    conda deactivate
  
Line 49: Line 46:
    cd $PWD    cd $PWD
    #busco3.0.0    #busco3.0.0
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/odb9/lineages/eukaryota_odb9/' 
    source activate busco-3    source activate busco-3
    export AUGUSTUS_CONFIG_PATH="/home/dsalas/Shared/BUSCO/config"    export AUGUSTUS_CONFIG_PATH="/home/dsalas/Shared/BUSCO/config"
-   run_BUSCO.py -i <fasta_file> -o <output_dir_name> -l $LINEAGEDB -m prot --cpu 1+   run_BUSCO.py -i <fasta_file> -o <output_dir_name> -l /home/dsalas/Shared/BUSCO/eukaryota_odb9 -m prot --cpu 1
    conda deactivate    conda deactivate
  
Line 65: Line 61:
    #$ -cwd    #$ -cwd
    cd $PWD    cd $PWD
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/odb9/lineages/eukaryota_odb9/' +   source activate busco
-   source activate busco-4+
    #BUSCO 4.0.5    #BUSCO 4.0.5
-   export AUGUSTUS_CONFIG_PATH="$HOME/Shared/BUSCO/config" +   export AUGUSTUS_CONFIG_PATH="/home/dsalas/Shared/BUSCO/config" 
-   busco -i <input_scaffolds_file> -o <output_dir_name> -l $HOME/BUSCO/eukaryota_odb9 -m geno --cpu 1+   busco -i <input_scaffolds_file> -o <output_dir_name> -l /home/dsalas/Shared/BUSCO/eukaryota_odb9 -m geno --cpu 1
  
  
Line 80: Line 75:
    #$ -cwd    #$ -cwd
    cd $PWD    cd $PWD
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/odb9/lineages/eukaryota_odb9/' +   source activate busco
-   source activate busco-4+
    #BUSCO 4.0.5    #BUSCO 4.0.5
-   export AUGUSTUS_CONFIG_PATH="$HOME/BUSCO/config" +   export AUGUSTUS_CONFIG_PATH="/home/dsalas/Shared/BUSCO/config" 
-   busco -i <predicted protein fasta> -o <output_dir-name> -l $HOME/BUSCO/eukaryota_odb9 -m prot --cpu 1+   busco -i <predicted protein fasta> -o <output_dir-name> -l /home/dsalas/Shared/BUSCO/eukaryota_odb9 -m prot --cpu 1
  
  
-** * BUSCO 5.2.2 * **+** * BUSCO 5.2.2 * ** 
  
-** Genomic: default metaeuk**+** Genomic: default metaeuk** 
  
-   source activate busco-5 +    source activate busco-5 
-   INPUT='contigs_clean.fasta' +    INPUT='contigs_clean.fasta' 
-   OUTDIR='busco5_out' +    OUTDIR='busco5_out' 
-   MODE='genome' +    MODE='genome' 
-   # setting the lineage db +    # setting the lineage db 
-   ## the latest busco db for eukaryota is odb10 +    ## the latest busco db for eukaryota is odb10 
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/' +    LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/' 
-   ## busco v5 only works with odb10 +    ## busco v5 only works with odb10 
-   ## it will not work with odb9 +    ## it will not work with odb9 
-   # run busco +    # run busco 
-   ## do not specify output dir with a trailing slash, it will lead to a fatal error +    ## do not specify output dir with a trailing slash, it will lead to a fatal error 
-   ## modes are genome, proteins, transcriptome +    ## modes are genome, proteins, transcriptome 
-   ## the below command will use Metaeuk as gene predictor +    ## the below command will use Metaeuk as gene predictor 
-   busco \ +    busco \ 
-       --in $INPUT \ +        --in $INPUT \ 
-       --out $OUTDIR \ +        --out $OUTDIR \ 
-       --mode $MODE \ +        --mode $MODE \ 
-       --lineage_dataset $LINEAGEDB \ +        --lineage_dataset $LINEAGEDB \ 
-       --cpu 8 +        --cpu 8 
-   conda deactivate+    conda deactivate
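
The script comments warn that an output directory with a trailing slash leads to a fatal error. One way to guard against that is to normalise the name before calling `busco`. This is a small sketch using plain POSIX parameter expansion; `strip_trailing_slashes` is a hypothetical helper name.

```shell
# Hypothetical helper: remove every trailing slash from a directory name
# so that a value such as 'busco5_out/' becomes 'busco5_out'.
strip_trailing_slashes() {
    _p="$1"
    while [ "${_p%/}" != "$_p" ]; do
        _p="${_p%/}"
    done
    printf '%s\n' "$_p"
}

OUTDIR='busco5_out/'
OUTDIR="$(strip_trailing_slashes "$OUTDIR")"
echo "$OUTDIR"
```

The normalised `$OUTDIR` can then be passed to `--out` unchanged.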
  
 +** Proteomic:** 
  
-**Proteomic:** +    source activate busco-5 
- +    # in the busco-5 environment, AUGUSTUS_CONFIG_PATH is set to 
-   source activate busco-5 +    # /scratch2/software/anaconda/envs/busco-5/config/ 
-   # in the busco-5 environment, AUGUSTUS_CONFIG_PATH is set to +    # but we don't have write permission there 
-   # /scratch2/software/anaconda/envs/busco-5/config/ +    # not sure why write permission is needed, but it doesn't work without it 
-   # but we don't have write permission there +    # so we copied that dir to a place where we do have write permission: 
-   # not sure why write permission is needed, but it doesn't work without it +    # you may want to copy it to your own home 
-   # so we copied that dir to a place where we do have write permission: +    export AUGUSTUS_CONFIG_PATH="$HOME/busco/config/" 
-   # you may want to copy it to your own home +    INPUT='contigs_clean.fasta' 
-   export AUGUSTUS_CONFIG_PATH="$HOME/busco/config/" +    OUTDIR='busco5_contigs_clean_out' 
-   INPUT='contigs_clean.fasta' +    MODE='genome' 
-   OUTDIR='busco5_contigs_clean_out' +    AUGUSTUS_SPECIES='leishmania_tarentolae' 
-   MODE='genome' +    # setting the lineage db 
-   AUGUSTUS_SPECIES='leishmania_tarentolae' +    ## the latest version (as of writing) is odb10 
-   # setting the lineage db +    LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/' 
-   ## the latest version (as of writing) is odb10 +    ## busco v5 only works with odb10 
-   LINEAGEDB='/scratch5/db/Eukfinder/BUSCO/busco_downloads/lineages/eukaryota_odb10/' +    ## it will not work with odb9 
-   ## busco v5 only works with odb10 +    busco \ 
-   ## it will not work with odb9 +        --in $INPUT \ 
-   busco \ +        --out $OUTDIR \ 
-       --in $INPUT \ +        --mode $MODE \ 
-       --out $OUTDIR \ +        --lineage_dataset $LINEAGEDB \ 
-       --mode $MODE \ +        --cpu 8 \ 
-       --lineage_dataset $LINEAGEDB \ +        --augustus \ 
-       --cpu 8 \ +        --augustus_species $AUGUSTUS_SPECIES
-       --augustus \ +    conda deactivate 
-       --augustus_species $AUGUSTUS_SPECIES +    
-   conda deactivate +
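
The comments in the script above describe copying the Augustus config directory to a writable location. A minimal sketch of that step follows, assuming the source directory is readable; `setup_augustus_config` is a hypothetical helper, and the paths are whatever you pass in.

```shell
# Hypothetical helper: copy an Augustus config directory to a writable
# destination (only if the destination does not exist yet) and export
# AUGUSTUS_CONFIG_PATH so that it points at the copy.
setup_augustus_config() {
    _src="${1%/}"
    _dest="${2%/}"
    if [ ! -d "$_dest" ]; then
        mkdir -p "$(dirname "$_dest")" || return 1
        cp -r "$_src" "$_dest" || return 1
    fi
    if [ ! -w "$_dest" ]; then
        echo "no write permission on $_dest" >&2
        return 1
    fi
    AUGUSTUS_CONFIG_PATH="$_dest/"
    export AUGUSTUS_CONFIG_PATH
}
```

With the paths from the script comments, the call would be `setup_augustus_config /scratch2/software/anaconda/envs/busco-5/config "$HOME/busco/config"`, placed before the `busco` command.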
  
 **Note**: Take out the mitochondrial genome before running this analysis. **Note**: Take out the mitochondrial genome before running this analysis.
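
One way to take the mitochondrial contigs out is to filter the FASTA file by sequence ID. This sketch assumes you already know which IDs are mitochondrial (this page does not cover how to identify them) and have listed them one per line in a file; `remove_contigs` and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: print a FASTA file without the records whose ID
# (the first word of the header line, minus '>') appears in an ID list.
# $1: file with one sequence ID per line; $2: input FASTA file.
remove_contigs() {
    awk -v ids="$1" '
        BEGIN {
            # load the IDs to drop from the list file
            while ((getline line < ids) > 0) {
                split(line, f, " ")
                if (f[1] != "") drop[f[1]] = 1
            }
            close(ids)
        }
        /^>/ { id = substr($1, 2); keep = !(id in drop) }
        keep
    ' "$2"
}
```

For example, `remove_contigs mito_ids.txt contigs_clean.fasta > contigs_nomito.fasta` would write the filtered assembly to a new file that can then be used as `INPUT`.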
benchmarking_universal_single-copy_orthologs_busco.1733853654.txt.gz · Last modified: by 129.173.94.151