
ASSEMBLING LONG READ DATA

Once your reads have been trimmed with Porechop and saved in FASTQ and FASTA formats, try the following assemblers:

Programs: ABruijn (https://github.com/fenderglass/ABruijn), Canu (http://canu.readthedocs.io/en/latest/quick-start.html), smartdenovo (https://github.com/ruanjue/smartdenovo), miniasm (https://github.com/lh3/miniasm)

ABruijn

ABruijn is relatively simple to use. As its name suggests, it uses an A-Bruijn graph to find overlaps between reads, and it includes a polishing step to improve the quality of the assembly.

It requires FASTA input. The final output is a file named polished_(1 + number of iterations specified).fasta.
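Because ABruijn needs FASTA input, reads that exist only as FASTQ have to be converted first. The following is a minimal sketch of one way to do this with awk. It assumes each FASTQ record occupies exactly four lines (sequences not wrapped), and the function name fastq_to_fasta and the file names in the usage comment are hypothetical.

```shell
#!/bin/bash
# Sketch: convert a FASTQ file with 4 lines per record to FASTA.
# Assumption: sequence and quality strings are not wrapped over several lines.
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }  # header line: replace leading @ with >
         NR % 4 == 2 { print }                    # sequence line: copy unchanged
        ' "$1"
}

# Usage (hypothetical file names, substitute your own):
# fastq_to_fasta reads.fastq > reads.fasta
```

The resulting FASTA file is what gets passed as the first argument to abruijn in the job script.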

#!/bin/bash
#$ -S /bin/bash
. /etc/profile
#$ -cwd
#$ -pe threaded 10

unset PYTHONPATH
export PATH=/scratch2/software/Python-2.7.13/bin:$PATH
export LD_LIBRARY_PATH=/scratch2/software/Python-2.7.13/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=/scratch2/software/hdf5-1.8.18/lib:$LD_LIBRARY_PATH
export PATH=/scratch2/software/blasr/bin:$PATH

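# Replace the placeholder paths with your own (avoid spaces in paths)
cd /path/to_your_working_directory

# Replace <estimated coverage> with a number before submitting;
# --threads should match the slot count requested with -pe above
/scratch2/software/ABruijn-1.0/bin/abruijn /path/to_your_fasta /path/to_an_output_directory <estimated coverage> --platform nano --threads 10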
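The <estimated coverage> value in the command above has to be supplied by you. If you do not already know it, one rough way to get a number is to divide the total number of bases in the reads by the expected genome size. The sketch below does this with awk. The function name estimate_coverage is hypothetical, and the genome size is an assumption you must provide yourself, since it cannot be derived from the reads alone.

```shell
#!/bin/bash
# Sketch: estimate coverage as (total bases in FASTA) / (expected genome size),
# rounded to the nearest integer.
# $1 = FASTA file of reads, $2 = expected genome size in bases (your own estimate)
estimate_coverage() {
    awk -v g="$2" '
        !/^>/ { total += length($0) }              # count bases on non-header lines
        END   { printf "%d\n", total / g + 0.5 }   # round to nearest integer
    ' "$1"
}

# Usage (hypothetical file name and genome size, substitute your own):
# estimate_coverage reads.fasta 5000000
```

This is only a ballpark figure. It counts every base in the file, including reads that may not come from the target genome.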

Canu

smartdenovo

miniasm