assembling_long_read_data

Differences

This shows you the differences between two versions of the page.

assembling_long_read_data [2017/11/09 12:48] 129.173.88.84
assembling_long_read_data [2018/01/08 14:51] (current) 129.173.88.84
Line 1: Line 1:
 ====== ASSEMBLING LONG READ DATA ======
 +
 +Documentation by Sarah Shah
  
 When you have your porechopped reads in fastq and fasta formats, try out the following assemblers:
  
-Programs: ABruijn ([[https://github.com/fenderglass/ABruijn]]), Canu ([[http://canu.readthedocs.io/en/latest/quick-start.html]]), smartdenovo ([[https://github.com/ruanjue/smartdenovo]]), miniasm ([[https://github.com/lh3/miniasm]])
+Programs: ABruijn ([[https://github.com/fenderglass/ABruijn]]), Flye ([[https://github.com/fenderglass/Flye]]), Canu ([[http://canu.readthedocs.io/en/latest/quick-start.html]]), smartdenovo ([[https://github.com/ruanjue/smartdenovo]]), miniasm ([[https://github.com/lh3/miniasm]])
  
 **ABruijn**
Line 26: Line 28:
  
 /scratch2/software/ABruijn-1.0/bin/abruijn /path/to_your_fasta /path/to_an_output_directory <estimated coverage> --platform nano --threads 10
 +</code>
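
The ABruijn command above expects an estimated coverage value. A minimal sketch for deriving one from the read set, assuming coverage is approximated as total read bases divided by the expected genome size in bp; the function name ''estimate_coverage'' is hypothetical and not part of any assembler:

```shell
# Hypothetical helper (not part of ABruijn or Flye):
# estimated coverage = total read bases / expected genome size in bp.
# Usage: estimate_coverage reads.fasta genome_size_in_bp
estimate_coverage() {
    # Sum the length of every sequence line (headers start with ">"),
    # then divide by the genome size and truncate to an integer.
    awk -v g="$2" '
        !/^>/ { bases += length($0) }
        END   { printf "%d\n", int(bases / g) }
    ' "$1"
}
```

For a 45 Mb genome this would be called as ''estimate_coverage /path/to_your_fasta 45000000''. The result is only a rough estimate, since the true genome size is usually not known before assembly.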
 +
 +ABruijn has been replaced by **Flye** as of January 2018. Example usage:
 +<code>
 +#!/bin/bash
 +#$ -S /bin/bash
 +. /etc/profile
 +#$ -cwd
 +#$ -pe threaded 16
 +#$ -o leg
 +
 +source /scratch2/software/python-2.7-env/bin/activate
 +
 +unset PYTHONPATH
 +
 +flye --nano-raw Acas_merged_pc_fl.fastq --genome-size 45m --out-dir Acas_filtlongFlye --threads 16 --iterations 3 --min-overlap 3000
 </code>
 **Canu**
Line 60: Line 78:
 Download smartdenovo to your account on Perun.
 <code>
-/path/to/smartdenovo/smartdenovo.pl -p prefix reads.fa > prefix.mak
-make -f prefix.mak
+/path/to/smartdenovo/smartdenovo.pl reads.fa > reads.mak
+make -f reads.mak
 </code>
 The **.utg** file is the important output.
Line 69: Line 87:
 The simplest and the fastest of all the assemblers here. First, self-map the fasta file using minimap2:
 <code>
-minimap2 -x ava-ont reads.fa reads.fa | gzip -1 > reads.paf.gz
+minimap2 -x ava-ont reads.fq reads.fq | gzip -1 > reads.paf.gz
 </code>
  
Line 82: Line 100:
 </code>
  
-Do a quick BLAST search of your contigs and separate out the eukaryotic and bacterial contigs. Compare your assemblies using QUAST ([[http://quast.bioinf.spbau.ru/]]) and continue to [[nanopore_tools_for_polishing|polishing and correcting]] your chosen assembly.
+----
 + 
 +The Unicycler GitHub page ([[https://github.com/rrwick/Unicycler]]) has nice examples of what good, alright, and terrible graphs look like.
 + 
 +Do a quick BLAST search of your contigs and separate out the eukaryotic and bacterial contigs. Compare your assemblies using QUAST ([[http://quast.bioinf.spbau.ru/]]) and continue to **[[nanopore_tools_for_polishing|polishing and correcting]]** your chosen assembly. 
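
Before running QUAST, a quick first comparison of the assemblies can be done directly in the shell. The sketch below is only an illustration, not a replacement for QUAST: the function name ''assembly_stats'' is hypothetical, and it assumes each assembly is a plain FASTA file. It reports the contig count, total length, and N50.

```shell
# Hypothetical helper: contig count, total length and N50 of a FASTA assembly.
# Usage: assembly_stats assembly.fasta
assembly_stats() {
    # First awk: print one length per contig (sequence may span several lines).
    # sort: order lengths from longest to shortest.
    # Second awk: N50 is the length of the contig at which the running sum
    # first reaches half of the total assembly length.
    awk '
        /^>/ { if (len) print len; len = 0; next }
             { len += length($0) }
        END  { if (len) print len }
    ' "$1" | sort -rn | awk '
        { lens[NR] = $1; total += $1 }
        END {
            running = 0; n50 = 0
            for (i = 1; i <= NR; i++) {
                running += lens[i]
                if (running >= total / 2) { n50 = lens[i]; break }
            }
            printf "contigs=%d total=%d N50=%d\n", NR, total, n50
        }
    '
}
```

Running it on each assembler's output gives one comparable line per assembly. Fewer contigs and a higher N50 usually indicate a more contiguous assembly, but they say nothing about correctness, which is why polishing and a proper QUAST run still follow.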
  