assembling_long_read_data

Differences

This shows you the differences between two versions of the page.

assembling_long_read_data [2017/11/09 12:53] 129.173.88.84
assembling_long_read_data [2018/01/08 14:51] (current) 129.173.88.84
Line 5: Line 5:
 When you have your porechopped reads in fastq and fasta formats, try out the following assemblers:
  
-Programs: ABruijn ([[https://github.com/fenderglass/ABruijn]]), Canu ([[http://canu.readthedocs.io/en/latest/quick-start.html]]), smartdenovo ([[https://github.com/ruanjue/smartdenovo]]), miniasm ([[https://github.com/lh3/miniasm]])
+Programs: ABruijn ([[https://github.com/fenderglass/ABruijn]]), Flye ([[https://github.com/fenderglass/Flye]]), Canu ([[http://canu.readthedocs.io/en/latest/quick-start.html]]), smartdenovo ([[https://github.com/ruanjue/smartdenovo]]), miniasm ([[https://github.com/lh3/miniasm]])
  
 **ABruijn**
Line 28: Line 28:
  
 /scratch2/software/ABruijn-1.0/bin/abruijn /path/to_your_fasta /path/to_an_output_directory <estimated coverage> --platform nano --threads 10
 +</code>
 +
 +ABruijn has been replaced by **Flye** as of January 2018. Example usage:
 +<code>
 +#!/bin/bash
 +#$ -S /bin/bash
 +. /etc/profile
 +#$ -cwd
 +#$ -pe threaded 16
 +#$ -o leg
 +
 +source /scratch2/software/python-2.7-env/bin/activate
 +
 +unset PYTHONPATH
 +
 +flye --nano-raw Acas_merged_pc_fl.fastq --genome-size 45m --out-dir Acas_filtlongFlye --threads 16 --iterations 3 --min-overlap 3000
 </code>
 **Canu**
Line 62: Line 78:
 Download smartdenovo to your account on Perun.
 <code>
-/path/to/smartdenovo/smartdenovo.pl -p prefix reads.fa > prefix.mak
+/path/to/smartdenovo/smartdenovo.pl reads.fa > reads.mak
-make -f prefix.mak
+make -f reads.mak
 </code>
 The **.utg** file is the important output.
Line 71: Line 87:
 The simplest and the fastest of all the assemblers here. First, self-map the fasta file using minimap2:
 <code>
-minimap2 -x ava-ont reads.fa reads.fa | gzip -1 > reads.paf.gz
+minimap2 -x ava-ont reads.fq reads.fq | gzip -1 > reads.paf.gz
 </code>
  
Line 83: Line 99:
 awk '/^S/{print">"$2"\n"$3}' in.gfa | fold > out.fa
 </code>
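To see what the awk one-liner above does, here is a minimal sketch on a made-up GFA file. The file names ''toy.gfa'' and ''toy.fa'' and the sequences are hypothetical and only serve as an illustration. Each line starting with ''S'' (a segment) becomes one FASTA record; other lines, such as ''L'' (link) lines, are skipped, and ''fold'' wraps long sequences at its default width of 80 columns.

```shell
# Hypothetical toy GFA: two segment (S) lines and one link (L) line, tab-separated.
printf 'S\tutg000001l\tACGTACGT\nS\tutg000002l\tGGCC\nL\tutg000001l\t+\tutg000002l\t+\t4M\n' > toy.gfa

# Same one-liner as above: for every S line, print the segment name as a
# FASTA header and the sequence on the following line.
awk '/^S/{print">"$2"\n"$3}' toy.gfa | fold > toy.fa

# Show the resulting FASTA.
cat toy.fa
```

The toy output has two records, ''>utg000001l'' and ''>utg000002l'', each followed by its sequence.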
 +
 +----
 +
 +The Unicycler GitHub page ([[https://github.com/rrwick/Unicycler]]) has nice examples of what good, alright, and terrible graphs look like.
  
 Do a quick BLAST search of your contigs and separate out the eukaryotic and bacterial contigs. Compare your assemblies using QUAST ([[http://quast.bioinf.spbau.ru/]]) and continue to **[[nanopore_tools_for_polishing|polishing and correcting]]** your chosen assembly.
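For a rough first look before running QUAST, contig count, total length and N50 can be computed with standard awk and sort. This is only a sketch and not a replacement for QUAST: the helper name ''asm_stats'' and the toy FASTA file are hypothetical, and N50 is taken here as the length of the contig at which the running sum of lengths, sorted longest first, reaches half of the total.

```shell
# asm_stats (hypothetical helper): print contig count, total bases and N50
# for one FASTA file. Step 1 emits one length per contig (wrapped sequence
# lines are summed), step 2 sorts lengths longest first, step 3 walks the
# sorted lengths until the running sum reaches half of the total.
asm_stats() {
    awk '/^>/{if(len)print len; len=0; next}{len+=length($0)}END{if(len)print len}' "$1" |
    sort -rn |
    awk '{l[NR]=$1; total+=$1}
         END{run=0
             for(i=1;i<=NR;i++){run+=l[i]; if(run>=total/2){n50=l[i]; break}}
             printf "contigs=%d total=%d N50=%d\n", NR, total, n50}'
}

# Made-up example: three contigs of length 10, 6 and 4.
printf '>a\nACGTACGTAC\n>b\nACGTAC\n>c\nACGT\n' > toy_asm.fa
asm_stats toy_asm.fa
```

Running the helper on each assembler's output gives a quick side-by-side comparison; QUAST remains the tool to use for the full report.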
  
assembling_long_read_data.1510246425.txt.gz · Last modified: by 129.173.88.84