De novo transcriptome assembly with rnaSPAdes

By Joran Martijn (Apr 2023)

SPAdes is a versatile assembly tool that originally specialized in assembling single-cell sequence data, or any data generated with the Multiple Displacement Amplification technique (also known as whole-genome amplification), but soon expanded to other assembly problems, such as metagenome assembly and transcriptome assembly.

The rnaSPAdes manual is available here.

rnaSPAdes seems to be on par with, or more accurate than, Trinity on both simulated transcriptome data and real transcriptomic data. It also seems to be somewhat faster and less RAM-hungry than Trinity and other assemblers (see this table). These tables are part of the rnaSPAdes paper, so they may be somewhat biased towards the authors' own tool, but it is nonetheless another valid option for transcriptome assembly. Another perk is that it is actively maintained, and the developers are very helpful with users' specific issues!

Below is an example Perun submission script for running an rnaSPAdes job (also available on the RogerLab GitHub page):

#!/bin/bash
#$ -S /bin/bash
. /etc/profile
#$ -cwd
#$ -m bea
#$ -M <your-email>
#$ -pe threaded 20
#$ -q 256G-batch

OUTDIR='rnaspades-3.14.1'
FW_READS='2_trimmomatic-0.39/eb_rna_fw.prd.fastq.gz'
RV_READS='2_trimmomatic-0.39/eb_rna_rv.prd.fastq.gz'
THREADS=20
MEMLIMIT=250

# The library was prepped with the dUTP method,
# so all forward reads (read1) map to the antisense strand
# and all reverse reads (read2) map to the sense strand
# of the reference genome.

# rnaSPAdes calls this strand-specific type 'rf'.

source activate spades

# Quote the variables so paths containing spaces are passed intact
rnaspades.py \
    -o "$OUTDIR" \
    --ss rf \
    -1 "$FW_READS" \
    -2 "$RV_READS" \
    -t "$THREADS" \
    -m "$MEMLIMIT"

conda deactivate
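
Once the job has finished, a quick sanity check is to count the assembled transcripts. The sketch below assumes that rnaSPAdes wrote its assembly to transcripts.fasta inside the output directory, as described in the rnaSPAdes manual. The helper count_fasta_records is a hypothetical name used for illustration and is not part of rnaSPAdes.

```shell
# Hypothetical helper (not part of rnaSPAdes): count the records in a FASTA file.
count_fasta_records() {
    local fasta="$1"

    # Fail early if the file is missing or empty, e.g. because the job crashed.
    if [ ! -s "$fasta" ]; then
        echo "missing or empty file: $fasta" >&2
        return 1
    fi

    # Every FASTA record starts with a '>' header line.
    # grep -c exits non-zero when it counts 0 lines, so tolerate that case.
    grep -c '^>' "$fasta" || true
}

# Example call, assuming the OUTDIR used in the submission script:
# count_fasta_records "rnaspades-3.14.1/transcripts.fasta"
```

A missing or empty file usually means the run did not complete, in which case the log files in the output directory are the first place to look.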