phylogeny_protocol2

Differences

This shows you the differences between two versions of the page.

phylogeny_protocol2 [2021/10/10 19:05] 134.190.232.9phylogeny_protocol2 [2022/02/07 15:21] (current) 134.190.232.106
Line 1: Line 1:
 +The GitHub resource for this protocol: https://github.com/zx0223winner/TreeTuner
 +
 **Background**
  
Line 337: Line 339:
 MDNKYTSSAQNVLVLAQEQAKYFKHQAVGTEHLLLALAIEKEGIASKILGQ MDNKYTSSAQNVLVLAQEQAKYFKHQAVGTEHLLLALAIEKEGIASKILGQ
 </code> </code>
 +
 +Directory of the renamed MMETSP database: /misc/scratch2/###/###/mmetsp
  
 With the two renamed databases available, you can merge them with 'cat', build the new merged database with 'makeblastdb', and then BLAST against it again. 8-)
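The merge step can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the protocol's exact commands: the helper names `merge_fasta` and `duplicate_ids` and all file names are hypothetical, and `-dbtype prot` assumes protein sequences like the example above. The BLAST+ calls are left commented out so the helpers can be tried without BLAST+ installed.

```shell
# Hypothetical helper: concatenate FASTA files into one merged file.
# Usage: merge_fasta OUT IN1 IN2 ...
merge_fasta() {
    out=$1
    shift
    cat "$@" > "$out"
}

# Hypothetical helper: print sequence IDs that occur more than once.
# The ID is taken as the header text between '>' and the first space.
duplicate_ids() {
    sed -n 's/^>\([^ ]*\).*/\1/p' "$1" | sort | uniq -d
}

# Example workflow (file names are placeholders):
# merge_fasta merged.fasta renamed_db1.fasta renamed_db2.fasta
# duplicate_ids merged.fasta          # ideally prints nothing
# makeblastdb -in merged.fasta -dbtype prot -out merged_db
# blastp -query query.fasta -db merged_db -outfmt 6 -out hits.tsv
```

Checking for duplicate IDs before building the database is a cautious extra step; renaming both databases beforehand should normally keep the IDs unique.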
Line 344: Line 348:
 Finally, having used two different methods, we can return to the topic raised at the very beginning: coarse- and fine-tuning large phylogenetic datasets by reducing their redundancy and complexity.
  
-1. **Fine-tuning**  Laura Eme (2012-14) written in Perl
+1. **Coarse-tuning**: Let's start with the relatively simple one, coarse-tuning via TreeTrimmer (Maruyama et al. 2013).
 + 
 +<code> 
 +ruby treetrimmer.rb sample/####_aligned_trimmed.newick sample/###_parameter_input.in sample/taxonomic_info.txt > ###_treetrimmer.newick 
 +</code> 
 + 
 +The "##..newick" and "###input.in" files are easy to prepare. The taxonomic_info.txt file, however, needs to be reformatted.
 + 
 +<code> 
 +taxonomic_info.txt 
 +NP_563657 Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliopsida; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Brassicales; Brassicaceae; Camelineae; Arabidopsis; Arabidopsis thaliana 
 +XP_002889406 Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliopsida; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; malvids; Brassicales; Brassicaceae; Camelineae; Arabidopsis; Arabidopsis lyrata; Arabidopsis lyrata subsp. lyrata 
 +</code> 
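As a quick sanity check of this layout, the lineages can be tallied at a chosen position. The sketch below assumes each line is an accession, then whitespace, then a "; "-separated lineage, as in the example above; the helper name `count_by_rank` is hypothetical. Because lineages differ in depth, the same position does not always correspond to the same taxonomic rank, so the counts are only a rough guide.

```shell
# Hypothetical helper: count entries by the Nth element of the lineage.
# Usage: count_by_rank taxonomic_info.txt N   (N is 1-based)
count_by_rank() {
    awk -v rank="$2" '
        {
            line = $0
            sub(/^[^ \t]+[ \t]+/, "", line)   # drop the accession column
            sub(/[ \t]+$/, "", line)          # drop trailing whitespace
            n = split(line, ranks, /; */)     # split the lineage on ";"
            if (rank <= n) count[ranks[rank]]++
        }
        END { for (r in count) print count[r] "\t" r }
    ' "$1" | sort -rn
}

# Example: count_by_rank taxonomic_info.txt 2
```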
 + 
 +The taxonomic_info.txt file can be created with the acc2tax program. Please read more here: http://129.173.88.134:81/dokuwiki/doku.php?id=phylogeny_protocol3
 + 
 +__Note: acc2tax needs the gene ID without the version suffix (e.g. NP_563657), and the same goes for the NCBI ID.__ For the usage of the program, see: http://129.173.88.134:81/dokuwiki/doku.php?id=taxonomy_recovery; http://129.173.88.134:81/dokuwiki/doku.php?id=phylogeny_protocol3
 + 
 +<code> 
 +>WP_048801694.1 ATP-dependent Clp protease ATP-binding subunit [Leuconostoc citreum]GEK62024.1 ATP-dependent Clp protease ATP-binding subunit ClpC [Leuconostoc citreum] 
 +MDNKYTSSAQNVLVLAQEQAKYFKHQAVGTEHLLLALAIEKEGIASKILGQFNVTDDDIREEIEHFTGYGM 
 +</code> 
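Since headers such as the one above carry a version suffix (".1"), one way to prepare IDs for acc2tax is to strip it first. The following is a minimal sketch, assuming the accession is the first token after ">" and the version is a trailing ".digits"; the helper names `strip_accession_version` and `accession_list` are hypothetical and not part of acc2tax.

```shell
# Hypothetical helper: remove the ".N" version from FASTA header accessions.
# Reads FASTA on stdin, writes FASTA on stdout; sequence lines are untouched.
strip_accession_version() {
    sed -E '/^>/ s/^>([^ .]+)\.[0-9]+/>\1/'
}

# Hypothetical helper: list version-less accessions, one per line.
accession_list() {
    sed -n -E 's/^>([^ .]+).*/\1/p'
}

# Example (file names are placeholders):
# strip_accession_version < input.fasta > input_noversion.fasta
# accession_list < input.fasta > accessions.txt
```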
 + 
 +With taxonomic_info.txt ready, you can generate the tree file and an additional taxa file:
 +<code> 
 +XP_026407875 Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliopsida; Mesangiospermae; Ranunculales; Papaveraceae; Papaveroideae; Papaver; Papaver somniferum 2 4 
 +XP_034682772 Eukaryota; Viridiplantae; Streptophyta; Streptophytina; Embryophyta; Tracheophyta; Euphyllophyta; Spermatophyta; Magnoliopsida; Mesangiospermae; eudicotyledons; Gunneridae; Pentapetalae; rosids; rosids incertae sedis; Vitales; Vitaceae; Viteae; Vitis; Vitis riparia 2 
 +</code> 
 + 
 +This tree gives a rough estimate of the tree's diversity.
 + 
 + 
 +2. **Fine-tuning**: written in Perl by Laura Eme (2012-14)
  
 <code> <code>
Line 388: Line 424:
 Based on the trimmed aligned sequences, you can run a more rigorous downstream IQ-TREE analysis.
  
-2. **Coarse-tuning**Let's start with the relatively simple one coarse-tuning via Treetrimmer (Maruyama et.al 2013)
+Note: not all genes/species have taxa. This has nothing to do with updates to the NCBI taxonomy.
 + 
 +The '0' in the gene name 'CP_0177652116_0_Stygamoeba_regulata_BSH-02190019' is not an NCBI taxid.
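To see which sequences ended up without a taxonomy entry, the ID list can be compared against taxonomic_info.txt. This is a rough sketch, assuming one sequence ID per line in the first file and the accession as the first whitespace-separated column of taxonomic_info.txt; the helper name `missing_taxa` and the file name `ids.txt` are hypothetical.

```shell
# Hypothetical helper: print IDs that have no line in taxonomic_info.txt.
# Usage: missing_taxa ids.txt taxonomic_info.txt
missing_taxa() {
    awk -v tax="$2" '
        BEGIN {
            # Load the first column of the taxonomy file into a lookup table.
            while ((getline line < tax) > 0) {
                split(line, f, /[ \t]+/)
                seen[f[1]] = 1
            }
            close(tax)
        }
        NF && !($1 in seen) { print $1 }   # skip blank lines, report misses
    ' "$1"
}

# Example: missing_taxa ids.txt taxonomic_info.txt
```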
  
  
-<Last updated by Xi Zhang on Oct 6th,2021> upcoming
+<Last updated by Xi Zhang on Oct 6th,2021>