nanopore_tools_for_polishing

Differences

This shows you the differences between two versions of the page.

nanopore_tools_for_polishing [2019/01/18 08:21] 36.2.110.248nanopore_tools_for_polishing [2024/08/07 13:01] (current) 134.190.232.164
Line 1: Line 1:
 ====== Polishing your MinION assembly ====== ====== Polishing your MinION assembly ======
 Documentation by Jon Jerlström Hultqvist and Shelby Williams Documentation by Jon Jerlström Hultqvist and Shelby Williams
 +(updates by Joran Martijn)
  
 **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.** **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.**
Line 38: Line 39:
 minimap2 -t 8 $input interleavedshortreads.fq > temporary.paf minimap2 -t 8 $input interleavedshortreads.fq > temporary.paf
 echo "minimap2 done" echo "minimap2 done"
-racon -u -e 0.1 -w 5000 -q 1 -t 8 interleavedshortreads.fq temporary.paf $input >$output+racon -u -e 0.1 -w 500 -q 1 -t 8 interleavedshortreads.fq temporary.paf $input >$output
 echo "racon done" echo "racon done"
 rm temporary.paf rm temporary.paf
Line 48: Line 49:
 -S $output.sam -S $output.sam
 echo "Bowtie done" echo "Bowtie done"
-source deactivate+conda deactivate
 samtools view -F 4 -bS $output.sam |samtools sort > $output.sorted.bam samtools view -F 4 -bS $output.sam |samtools sort > $output.sorted.bam
 samtools index $output.sorted.bam > $output.sorted.bam.bai samtools index $output.sorted.bam > $output.sorted.bam.bai
Line 81: Line 82:
  
 First, make a BWA index of the assembly you wish to map onto by using the following command: First, make a BWA index of the assembly you wish to map onto by using the following command:
 +
 <code> <code>
 bwa index assembly_to_polish.fasta bwa index assembly_to_polish.fasta
 </code> </code>
 +
 Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped: Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped:
 +
 <code> <code>
-bwa mem -t 16 assembly_to_polish.fasta /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | samtools view -Sb | samtools sort >  piloninput.sorted.bam+bwa mem \
 +    -t 16 \
 +    assembly_to_polish.fasta \
 +    /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz \
 +    /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | \
 +        samtools sort --threads 16 -o piloninput.sorted.bam
 </code> </code>
 +
 +UPDATE: You can now run bwa-mem2, an optimized version of bwa mem. It generates the exact same output, but is 2-4x faster. Note that bwa-mem2 needs its own index, built with bwa-mem2 index instead of bwa index:
 +
 +<code>
 +bwa-mem2 mem \
 +    -t 16 \
 +    assembly_to_polish.fasta \
 +    /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz \
 +    /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | \
 +        samtools sort --threads 16 -o piloninput.sorted.bam
 +</code>
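The two mapping commands above differ only in the aligner binary, so a single wrapper can cover both. The following is a minimal sketch, not part of meteora_bwa.sh: the function names pick_aligner and map_reads are hypothetical, and it assumes samtools plus either bwa-mem2 or bwa are on your PATH. The index is built with whichever aligner gets picked, because bwa and bwa-mem2 each use their own index files.

```shell
#!/bin/bash
# Hypothetical helper: prefer bwa-mem2 when it is installed, else fall back to bwa.
pick_aligner() {
    if command -v bwa-mem2 >/dev/null 2>&1; then
        echo "bwa-mem2"
    elif command -v bwa >/dev/null 2>&1; then
        echo "bwa"
    else
        echo "neither bwa-mem2 nor bwa found on PATH" >&2
        return 1
    fi
}

# Hypothetical wrapper: index the assembly, then map paired-end reads onto it.
# Usage: map_reads assembly.fasta reads_1.fastq.gz reads_2.fastq.gz out.sorted.bam [threads]
map_reads() {
    local assembly=$1 reads1=$2 reads2=$3 out_bam=$4 threads=${5:-16}
    local aligner
    aligner=$(pick_aligner) || return 1
    # Build the index with the same tool that will do the mapping.
    "$aligner" index "$assembly" || return 1
    "$aligner" mem -t "$threads" "$assembly" "$reads1" "$reads2" |
        samtools sort --threads "$threads" -o "$out_bam" -
}
```

After sourcing the functions, a call such as map_reads assembly_to_polish.fasta trimmedreads_1_PairNtrim.fastq.gz trimmedreads_2_PairNtrim.fastq.gz piloninput.sorted.bam would produce the same sorted BAM as the commands above.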
 +
 Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so: Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so:
 +
 <code> <code>
-java -Xmx16G -jar /scratch2/software/pilon/pilon-1.22.jar --genome assembly_to_polish.fasta --frags piloninput.sorted.bam --output P2x --outdir Pilon2x --threads 16+java -Xmx16G -jar /scratch2/software/pilon/pilon-1.22.jar \
 +    --genome assembly_to_polish.fasta \
 +    --frags piloninput.sorted.bam \
 +    --output P2x \
 +    --outdir Pilon2x \
 +    --threads 16
 </code> </code>
 +
 +UPDATE: As of v1.24, the --threads option is no longer maintained. It seems Pilon doesn't use more than 200-300% CPU (i.e. 3 threads) at most, so setting --threads to 4 or so should be sufficient.
 +
 You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run: You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run:
 +
 <code> <code>
 samtools index /path/to_bam_file samtools index /path/to_bam_file
 </code> </code>
 +
 This will return a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be placed in the script. This will return a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be placed in the script.
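The index check can be scripted so that samtools index only runs when it is actually needed. This is a minimal sketch under stated assumptions: the function names bam_needs_index and ensure_bam_index are hypothetical, the index is assumed to use the .bam.bai naming described above, and samtools is assumed to be on your PATH.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) when the BAM has no usable index,
# i.e. <bam>.bai is missing or is older than the BAM file itself.
bam_needs_index() {
    local bam=$1
    local bai="$bam.bai"
    [ ! -e "$bai" ] || [ "$bam" -nt "$bai" ]
}

# Hypothetical wrapper: (re)build the index only when it is missing or stale.
ensure_bam_index() {
    local bam=$1
    if bam_needs_index "$bam"; then
        samtools index "$bam"
    fi
}
```

Calling ensure_bam_index piloninput.sorted.bam before the Pilon command would cover both a missing index and one left over from an earlier mapping run.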
  
Line 102: Line 134:
 Shell script: Shell script:
 {{ :unicyclersh.docx |}} {{ :unicyclersh.docx |}}
 +
 +<code>
 +#!/bin/bash
 +#$ -S /bin/bash
 +. /etc/profile
 +#$ -cwd
 +#$ -pe threaded 16
 +
 +#cd /scratch2/jon/MinION/BMAN/assemblies/Unicycler_polish/
 +
 +echo "Starting"
 +
 +unset PYTHONPATH
 +export PATH=/scratch2/software/gcc-6.3.0/bin:/scratch2/software/Python-3.6.0/bin:$PATH
 +export LD_LIBRARY_PATH=/scratch2/software/gcc-6.3.0/lib64:/scratch2/software/Python-3.6.0/lib:$LD_LIBRARY_PATH
 +
 +/scratch2/software/Python-3.6.0/bin/unicycler_polish -1 /scratch2/shelbyw/RCL_Unicycler/RCL_1_PairNtrim.fq -2 /scratch2/shelbyw/RCL_Unicycler/RCL_2_PairNtrim.fq --long_reads RCL_MinION.CutAdapt75.3000.chop.fastq.gz -a RCL_unclean_AB_assembly_fix_Racon2_Pilon3.fasta --pilon=/scratch2/software/pilon/pilon-1.22.jar --samtools=/opt/perun/bin/samtools --threads 16
 +
 +
 +echo "Done!"
 +
 +</code>
  
 Formatting: Formatting:
Line 139: Line 193:
 If Illumina reads are available, it might be possible to skip nanopolish altogether and go directly to Pilon polishing after Racon. This was done in the Solanum pennellii pre-print, where nanopolish simply was not feasible. If Illumina reads are available, it might be possible to skip nanopolish altogether and go directly to Pilon polishing after Racon. This was done in the Solanum pennellii pre-print, where nanopolish simply was not feasible.
  
-Location: /scratch2/software/nanopolish+Location: /scratch2/software/anaconda/envs/nanopolish-0.12/bin
  
 Scripts: Scripts:
Line 158: Line 212:
 nanopolish merge - merges the pieces into a new consensus. nanopolish merge - merges the pieces into a new consensus.
  
-**Updated Nanopolish protocol (as of July 17 2018):** +**Updated Nanopolish protocol (as of March 8 2020):** 
  
 First, index your unchopped, raw reads file.  First, index your unchopped, raw reads file. 
 Use the sequencing_summary.txt produced by albacore during basecalling to speed up this step significantly. If you have several sequencing_summary.txt files, list the path to each txt-file in a fof-file and pass it with -f. This also works for a single file: Use the sequencing_summary.txt produced by albacore during basecalling to speed up this step significantly. If you have several sequencing_summary.txt files, list the path to each txt-file in a fof-file and pass it with -f. This also works for a single file:
 +For the following step, **DO NOT use more than 1 thread**, because the program is not threaded!
 <code> <code>
 #!/bin/bash #!/bin/bash
Line 167: Line 222:
 . /etc/profile . /etc/profile
 #$ -cwd #$ -cwd
-#$ -pe threaded +#$ -pe threaded 1
 cd $PWD cd $PWD
  
-export PATH=/scratch2/software/anaconda/bin:$PATH +fast5path=/scratch2/path2/fast5/ 
-source activate nanopolish-python3 +fastq=/path2fastqlongreads.fastq 
- +seqsummary=/path2tosequencing_summary.txt 
-/scratch2/software/anaconda/envs/nanopolish-python3/bin/nanopolish index +source activate nanopolish-0.13.2 
--d /path/to/fast5/directory/+export PATH=/scratch2/software/anaconda/envs/nanopolish-0.13.2/bin:$PATH 
--f summary_files.fof \ +nanopolish index -d $fast5path -s $seqsummary $fastq 
-/path/to/reads.fastq+conda deactivate
  
-source deactivate 
  
 </code> </code>
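When a run produced several sequencing_summary.txt files, the fof-file for -f can be generated rather than typed by hand. The sketch below is an illustration only: the function name make_summary_fof is hypothetical, and it assumes the summary files sit somewhere below one directory and are named sequencing_summary*.txt.

```shell
#!/bin/bash
# Hypothetical helper: collect every sequencing_summary*.txt below a run
# directory into a file-of-filenames, one absolute path per line.
# Usage: make_summary_fof /path/to/basecalled_run summary_files.fof
make_summary_fof() {
    local run_dir=$1 fof=$2
    find "$(cd "$run_dir" && pwd)" -type f -name 'sequencing_summary*.txt' |
        sort > "$fof"
    # Fail when no summary file was found, so the caller notices early.
    [ -s "$fof" ]
}
```

The resulting file could then replace the -s option in the index step, for example nanopolish index -d $fast5path -f summary_files.fof $fastq.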
Line 204: Line 257:
 samtools index reads.sorted.bam samtools index reads.sorted.bam
  
-source deactivate+conda deactivate
  
 </code> </code>
Line 224: Line 277:
 cd $PWD cd $PWD
  
-export PATH=/scratch2/software/anaconda/bin:$PATH +export PATH=/scratch2/software/anaconda/envs/nanopolish-0.12/bin:$PATH 
-source activate nanopolish-python3+ 
 +source activate nanopolish-0.12
  
-python /scratch2/software/anaconda/envs/nanopolish-python3/bin/nanopolish_makerange.py +nanopolish_makerange.py reference.fasta | parallel --results nanopolish.results -P 20 \ 
-reference.fasta | parallel --results nanopolish.results -P 20 \ +nanopolish variants --consensus -o polished.{1}.fa -w {1} \ 
-/scratch2/software/anaconda/envs/nanopolish-python3/bin/nanopolish variants --consensus polished.{1}.fa -w {1} \ +-r /path/to/reads.fastq -b reads.sorted.bam -g reference.fasta -t 10 --min-candidate-frequency 0.1
--r /path/to/reads.fastq +
--b reads.sorted.bam -g reference.fasta -t 10 --min-candidate-frequency 0.1+
  
-source deactivate+conda deactivate
  
 </code> </code>
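Because the variants step runs one job per range, a single failed job leaves a gap in the pieces that are merged afterwards. The sketch below checks for such gaps. It is a hedged illustration: the function name missing_segments is hypothetical, and it assumes the polished.{1}.fa naming from the command above and one range per line on standard input, as printed by nanopolish_makerange.py.

```shell
#!/bin/bash
# Hypothetical helper: read ranges (one per line) on stdin and print every
# range whose polished.<range>.fa is missing or empty in the given directory.
# Usage: nanopolish_makerange.py reference.fasta | missing_segments [output_dir]
missing_segments() {
    local out_dir=${1:-.} range
    while IFS= read -r range; do
        [ -n "$range" ] || continue                          # skip blank lines
        [ -s "$out_dir/polished.$range.fa" ] || echo "$range"
    done
}
```

If the function prints nothing, every range has a non-empty output file; any range it does print can be rerun with nanopolish variants and the same -w value before merging.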
nanopore_tools_for_polishing.1547814087.txt.gz · Last modified: by 36.2.110.248