phylogeny_protocol5

phylogeny_protocol5 [2022/02/07 14:12] – created 134.190.232.106phylogeny_protocol5 [2022/02/08 14:02] (current) 134.190.232.106
Line 1: Line 1:
-It is often annoying to install
 +**How do you build a YAML file to set up a Conda environment?**
 + 
 +"//Users should be able to focus in their science, not in installing and managing software, so make it easy to them is fundamental.// " - Advice for Bioinformaticians. 
 + 
 +**Background** 
 + 
 +It is often annoying to set up installation dependencies (i.e., folder directories, hardcoded $PATH entries, and versions), especially when the software developer does not follow good programming practices, such as avoiding hardcoded paths to files or scripts.
 +
 +To make tools and pipelines more user friendly and reproducible, the installation process should be concise and straightforward, so that most debugging effort can be avoided.
 +
 +To speed up installation, developers usually prepare a Conda environment definition file that lists all the dependencies, a Dockerfile to build an image, or ideally both. The user then installs the dependencies into a Conda environment, or makes each tool findable through the $PATH environment variable.
 + 
 +  
 +**Build YAML file**
 +
 +Take the software pipeline PhyloToL (Ceron-Romero et al., 2019) as an example. It requires the following dependencies:
 + 
 +  - Biopython (https://biopython.org/)
 +  - DendroPy (https://dendropy.org/)
 +  - P4 (http://p4.nhm.ac.uk/)
 +  - Bioperl (https://bioperl.org/)
 +  - MAFFT (v7; https://mafft.cbrc.jp/alignment/software/)
 +  - USEARCH (any version; https://www.drive5.com/usearch/)
 +  - Guidance (v2.02; http://guidance.tau.ac.il/overview.html)
 +  - trimAl (v1.3; http://trimal.cgenomics.org/)
 +  - RAxML (v8; https://cme.h-its.org/exelixis/web/software/raxml/index.html)
 + 
 +1. First, create a named Conda environment and activate it.
 +
 +<code>
 +conda create -n PhyloToL python=3.6
 +conda activate PhyloToL
 +
 +# Most scripts in PhyloToL are Python 3, so we start by installing the Python dependencies.
 +# Package builds differ between macOS and Linux. This setup was tested on Perun, which runs Ubuntu Linux, so all packages here are Linux builds and are not macOS compatible.
 +
 +</code>
 + 
 +2. Biopython, DendroPy, Bioperl, MAFFT, Guidance, trimAl, and RAxML are easy to install with conda.
 +
 +For example, DendroPy (https://anaconda.org/bioconda/dendropy) is installed with:
 +conda install -c bioconda dendropy  
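
The other conda-installable dependencies can be handled the same way in a single command. The sketch below only assembles and prints that command so it can be reviewed first. The package names are the ones that appear in the exported PhyloToL.yml further down this page (Guidance is left out because it does not appear in that file); the channel choice and order are assumptions.

```shell
# Package names as they appear in the exported PhyloToL.yml on this page.
packages="biopython dendropy perl-bioperl mafft trimal raxml"

# Assumption: the environment is called PhyloToL, and conda-forge plus bioconda
# provide all of these packages for your platform.
install_cmd="conda install -n PhyloToL -c conda-forge -c bioconda --yes $packages"

# Print the command for review; nothing is installed at this point.
echo "$install_cmd"
```

To run the installation, paste the printed command into a shell where conda is available.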
 + 
 +3. P4 and USEARCH, however, are not as easy to set up and have to be installed manually.
 + 
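 +<code>
 +# USEARCH
 +# Download the binary from https://www.drive5.com/usearch/download.html
 +# Make the binary executable (adjust the path to where you saved it)
 +chmod +x /usr/bin/usearch6.0.98_i86linux32
 +# Prepend the tool directory to PATH so the shell can find the binary
 +export PATH=/misc/scratch2/xizhang/PhyloTol/Conda:$PATH
 +
 +</code>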
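
The USEARCH steps can be wrapped in a small helper so that they do not depend on one user's directories. This is a minimal sketch: the function name `add_usearch_to_path` and the directory layout are hypothetical, and the binary still has to be downloaded by hand from the USEARCH site first.

```shell
add_usearch_to_path() {
    # $1 = absolute path to the downloaded USEARCH binary
    # $2 = directory to expose on PATH (created if it does not exist)
    [ -f "$1" ] || { echo "missing binary: $1" >&2; return 1; }
    mkdir -p "$2"
    chmod +x "$1"
    # Link the versioned binary under the short name "usearch".
    ln -sf "$1" "$2/usearch"
    case ":$PATH:" in
        *":$2:"*) ;;                        # directory is already on PATH
        *) PATH="$2:$PATH"; export PATH ;;  # otherwise prepend it
    esac
}

# Typical use (paths are placeholders):
#   add_usearch_to_path "$HOME/Downloads/usearch6.0.98_i86linux32" "$HOME/PhyloToL_tools/bin"
```

Linking the binary as `usearch` keeps the version out of the command name, so scripts do not need to change when the binary is updated.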
 + 
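 +<code>
 +# P4
 +# Installation instructions: https://p4.nhm.ac.uk/installation.html
 +
 +# Install the libraries that P4 builds against
 +conda install scipy gsl nlopt bitarray
 +
 +# Download the source from https://github.com/pgfoster/p4-phylogenetics and unpack it,
 +# then expose the package and its scripts
 +export PYTHONPATH=$PYTHONPATH:/misc/scratch2/xizhang/PhyloTol/Conda/p4-phylogenetics-master
 +export PATH=$PATH:/misc/scratch2/xizhang/PhyloTol/Conda/p4-phylogenetics-master/bin
 +
 +# The hardcoded paths in setup.py must be changed to point at the Conda environment:
 +#   my_include_dirs = ["/home/xizhang/.conda/envs/PhyloToL/include"]
 +#   my_lib_dirs = ["/home/xizhang/.conda/envs/PhyloToL/lib"]
 +
 +# nlopt and bitarray can also be installed from specific channels
 +conda install -c conda-forge nlopt
 +conda install -c anaconda bitarray
 +
 +# Build the extension in place (run from the p4-phylogenetics-master directory)
 +python3 setup.py build_ext -i
 +
 +# Check that p4 starts
 +p4 -help
 +
 +</code>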
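
The two setup.py lines for P4 do not have to be typed by hand for each user. A minimal sketch, assuming the environment is active so that conda has set `CONDA_PREFIX`; the function name `p4_setup_lines` is hypothetical and it only prints the lines to paste into setup.py.

```shell
p4_setup_lines() {
    # $1 = environment prefix; defaults to CONDA_PREFIX, which `conda activate` sets.
    prefix="${1:-${CONDA_PREFIX:-}}"
    [ -n "$prefix" ] || { echo "no prefix given and CONDA_PREFIX is unset" >&2; return 1; }
    printf 'my_include_dirs = ["%s/include"]\n' "$prefix"
    printf 'my_lib_dirs = ["%s/lib"]\n' "$prefix"
}

# Typical use after `conda activate PhyloToL`:
#   p4_setup_lines
```

Deriving both paths from the environment prefix avoids copying another user's home directory into setup.py.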
 + 
 +4. With all of this done, the user can export the environment to a YAML file.
 + 
 +conda env export > PhyloToL.yml 
 + 
 +<code> 
 +name: PhyloToL 
 +channels: 
 +  - bioconda 
 +  - anaconda 
 +  - conda-forge 
 +  - defaults 
 +dependencies: 
 +  - _libgcc_mutex=0.1=main 
 +  - _openmp_mutex=4.5=1_gnu 
 +  - asttokens=2.0.5=pyhd8ed1ab_0 
 +  - biopython=1.79=py36h8f6f2f9_0 
 +  - bitarray=1.6.0=py36h7b6447c_0 
 +  - ca-certificates=2020.10.14=0 
 +  - certifi=2020.6.20=py36_0 
 +  - dendropy=4.5.2=pyh3252c3a_0 
 +  - executing=0.8.2=pyhd8ed1ab_0 
 +  - ld_impl_linux-64=2.35.1=h7274673_9 
 +  - libblas=3.9.0=11_linux64_openblas 
 +  - libcblas=3.9.0=11_linux64_openblas 
 +  - libffi=3.3=he6710b0_2 
 +  - libgcc=7.2.0=h69d50b8_2 
 +  - libgcc-ng=9.3.0=h5101ec6_17 
 +  - libgfortran-ng=11.2.0=h69a702a_12 
 +  - libgfortran5=11.2.0=h5c6108e_12 
 +  - libgomp=9.3.0=h5101ec6_17 
 +  - liblapack=3.9.0=11_linux64_openblas 
 +  - libopenblas=0.3.17=pthreads_h8fe5266_1 
 +  - libstdcxx-ng=9.3.0=hd4cf53a_17 
 +  - mafft=7.310=h1b792b2_4 
 +  - ncurses=6.3=h7f8727e_2 
 +  - nlopt=2.7.0=py36he9b8a8a_1 
 +  - numpy=1.19.5=py36hfc0c790_2 
 +  - openssl=1.1.1m=h7f8727e_0 
 +  - perl=5.26.2=h14c3975_0 
 +  - perl-bioperl=1.6.924=4 
 +  - perl-threaded=5.32.1=hdfd78af_1 
 +  - perl-yaml=1.29=pl526_0 
 +  - pip=21.2.2=py36h06a4308_0 
 +  - python=3.6.13=h12debd9_1 
 +  - python-devtools=0.8.0=pyhd8ed1ab_0 
 +  - python_abi=3.6=2_cp36m 
 +  - raxml=8.2.12=h779adbc_3 
 +  - readline=8.1.2=h7f8727e_1 
 +  - setuptools=58.0.4=py36h06a4308_0 
 +  - six=1.16.0=pyh6c4a22f_0 
 +  - sqlite=3.37.0=hc218d9a_0 
 +  - tk=8.6.11=h1ccaba5_0 
 +  - trimal=1.4.1=h7d875b9_5 
 +  - wheel=0.37.1=pyhd3eb1b0_0 
 +  - xz=5.2.5=h7b6447c_0 
 +  - zlib=1.2.11=h7f8727e_4 
 +prefix: /home/xizhang/.conda/envs/PhyloToL 
 + 
 +</code> 
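
The exported file ends with a `prefix:` line that points at the exporting user's home directory, and every entry carries a Linux build string. A minimal sketch for producing a more portable file follows; the helper name `strip_prefix` and the intermediate file name are hypothetical.

```shell
strip_prefix() {
    # Print an exported environment file without its machine-specific "prefix:" line.
    grep -v '^prefix:' "$1"
}

# Typical use (assumes conda is available and the PhyloToL environment exists):
#   conda env export -n PhyloToL --no-builds > PhyloToL.full.yml
#   strip_prefix PhyloToL.full.yml > PhyloToL.yml
```

Dropping the build strings with `--no-builds` keeps the versions pinned while leaving conda free to pick a build that matches the target machine.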
 + 
 +**Execute YAML file**
 +
 +With this YAML file in hand, a new user can easily install the PhyloToL software dependencies (except P4 and USEARCH) as shown below.
 + 
 +<code>
 +
 +source ~/.bashrc
 +
 +conda env create -f PhyloToL.yml
 +
 +# On success, conda prints:
 +#
 +# To activate this environment, use
 +#
 +#     $ conda activate PhyloToL
 +#
 +# To deactivate an active environment, use
 +#
 +#     $ conda deactivate
 +
 +</code>
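
Because P4 and USEARCH are not covered by the YAML file, it can help to check which tools are actually reachable once the environment is active. A minimal sketch, with a hypothetical helper name `check_tools`; the command names in the usage comment are assumptions about what each package installs.

```shell
check_tools() {
    # Print every command name that cannot be found on PATH.
    # Returns non-zero if at least one is missing.
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Typical use after `conda activate PhyloToL` (command names are assumptions):
#   check_tools mafft trimal raxmlHPC usearch p4
```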
 + 
 +<Last updated by Xi Zhang on Feb 8th,2022>