nanopore_tools_for_polishing
Differences
This shows you the differences between two versions of the page.
| nanopore_tools_for_polishing [2019/01/18 08:21] – 36.2.110.248 | nanopore_tools_for_polishing [2024/08/07 13:01] (current) – 134.190.232.164 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Polishing your MinION assembly ====== | ====== Polishing your MinION assembly ====== | ||
| Documentation by Jon Jerlström Hultqvist and Shelby Williams | Documentation by Jon Jerlström Hultqvist and Shelby Williams | ||
| + | (updates by Joran Martijn) | ||
| **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.** | **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.** | ||
| Line 38: | Line 39: | ||
| minimap2 -t 8 $input interleavedshortreads.fq > temporary.paf | minimap2 -t 8 $input interleavedshortreads.fq > temporary.paf | ||
| echo " | echo " | ||
| - | racon -u -e 0.1 -w 5000 -q 1 -t 8 interleavedshortreads.fq temporary.paf $input >$output | + | racon -u -e 0.1 -w 500 -q 1 -t 8 interleavedshortreads.fq temporary.paf $input >$output |
| echo "racon done" | echo "racon done" | ||
| rm temporary.paf | rm temporary.paf | ||
| Line 48: | Line 49: | ||
| -S $output.sam | -S $output.sam | ||
| echo " | echo " | ||
| - | source | + | conda deactivate |
| samtools view -F 4 -bS $output.sam |samtools sort > $output.sorted.bam | samtools view -F 4 -bS $output.sam |samtools sort > $output.sorted.bam | ||
| samtools index $output.sorted.bam > $output.sorted.bam.bai | samtools index $output.sorted.bam > $output.sorted.bam.bai | ||
| Line 81: | Line 82: | ||
| First, make a BWA index of the assembly you wish to map onto by using the following command: | First, make a BWA index of the assembly you wish to map onto by using the following command: | ||
| + | |||
| < | < | ||
| bwa index assembly_to_polish.fasta | bwa index assembly_to_polish.fasta | ||
| </ | </ | ||
| + | |||
| Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped: | Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped: | ||
| + | |||
| < | < | ||
| - | bwa mem -t 16 assembly_to_polish.fasta / | + | bwa mem \ |
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| </ | </ | ||
| + | |||
| + | UPDATE: You can now run bwa-mem2, which is an optimized version of bwa mem. It generates the exact same output, but is 2-4x faster: | ||
| + | |||
| + | < | ||
| + | bwa-mem2 mem \ | ||
| + | -t 16 \ | ||
| + | assembly_to_polish.fasta \ | ||
| + | / | ||
| + | / | ||
| + | samtools sort --threads 16 -o piloninput.sorted.bam | ||
| + | </ | ||
| + | |||
| Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so: | Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so: | ||
| + | |||
| < | < | ||
| - | java -Xmx16G -jar / | + | java -Xmx16G -jar / |
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| + | | ||
| </ | </ | ||
| + | |||
| + | UPDATE: As of v1.24, the --threads option is no longer maintained. It seems Pilon doesn' | ||
| + | |||
| You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run: | You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run: | ||
| + | |||
| < | < | ||
| samtools index / | samtools index / | ||
| </ | </ | ||
| + | |||
| This will create a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be referenced in the script. | This will create a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be referenced in the script. | ||
| Line 102: | Line 134: | ||
| Shell script: | Shell script: | ||
| {{ : | {{ : | ||
| + | |||
| + | < | ||
| + | #!/bin/bash | ||
| + | #$ -S /bin/bash | ||
| + | . / | ||
| + | #$ -cwd | ||
| + | #$ -pe threaded 16 | ||
| + | |||
| + | #cd / | ||
| + | |||
| + | echo " | ||
| + | |||
| + | unset PYTHONPATH | ||
| + | export PATH=/ | ||
| + | export LD_LIBRARY_PATH=/ | ||
| + | |||
| + | / | ||
| + | |||
| + | |||
| + | echo " | ||
| + | |||
| + | </ | ||
| Formatting: | Formatting: | ||
| Line 139: | Line 193: | ||
| If Illumina reads are available, it might be possible to skip nanopolish altogether and go directly to Pilon polishing after Racon. This has been exemplified in the Solanum pennellii preprint, where nanopolish was simply not feasible. | If Illumina reads are available, it might be possible to skip nanopolish altogether and go directly to Pilon polishing after Racon. This has been exemplified in the Solanum pennellii preprint, where nanopolish was simply not feasible. | ||
| - | Location: / | + | Location: / |
| Scripts: | Scripts: | ||
| Line 158: | Line 212: | ||
| nanopolish merge - merges the pieces into a new consensus. | nanopolish merge - merges the pieces into a new consensus. | ||
| - | **Updated Nanopolish protocol (as of July 17 2018):** | + | **Updated Nanopolish protocol (as of March 8 2020):** |
| First, index your unchopped, raw reads file. | First, index your unchopped, raw reads file. | ||
| Use the sequencing_summary.txt produced by albacore during basecalling to speed up this step significantly. If you have several sequencing_summary.txt files, list their paths in a file of filenames (fof) and pass that file with -f. This also works for a single file: | Use the sequencing_summary.txt produced by albacore during basecalling to speed up this step significantly. If you have several sequencing_summary.txt files, list their paths in a file of filenames (fof) and pass that file with -f. This also works for a single file: | ||
| + | For the following step, **DO NOT use more than 1 thread**, because the program is not threaded! | ||
| < | < | ||
| #!/bin/bash | #!/bin/bash | ||
| Line 167: | Line 222: | ||
| . / | . / | ||
| #$ -cwd | #$ -cwd | ||
| - | #$ -pe threaded | + | #$ -pe threaded |
| cd $PWD | cd $PWD | ||
| - | export PATH=/scratch2/software/anaconda/bin:$PATH | + | fast5path=/scratch2/path2/fast5/ |
| - | source activate nanopolish-python3 | + | fastq=/ |
| - | + | seqsummary=/ | |
| - | / | + | source activate nanopolish-0.13.2 |
| - | -d / | + | export PATH=/ |
| - | -f summary_files.fof \ | + | nanopolish index -d $fast5path |
| - | / | + | conda deactivate |
| - | source deactivate | ||
| </ | </ | ||
| Line 204: | Line 257: | ||
| samtools index reads.sorted.bam | samtools index reads.sorted.bam | ||
| - | source | + | conda deactivate |
| </ | </ | ||
| Line 224: | Line 277: | ||
| cd $PWD | cd $PWD | ||
| - | export PATH=/ | + | export PATH=/ |
| - | source activate nanopolish-python3 | + | |
| + | source activate nanopolish-0.12 | ||
| - | python / | + | nanopolish_makerange.py reference.fasta | parallel --results nanopolish.results -P 20 \ |
| - | reference.fasta | parallel --results nanopolish.results -P 20 \ | + | nanopolish variants --consensus |
| - | / | + | -r / |
| - | -r / | + | |
| - | -b reads.sorted.bam -g reference.fasta -t 10 --min-candidate-frequency 0.1 | + | |
| - | source | + | conda deactivate |
| </ | </ | ||
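
The Pilon section notes that Pilon may fail to recognize the BAM file as indexed, and that running samtools index fixes this. A minimal sketch of that check follows. The function names `has_bam_index` and `ensure_bam_index` are hypothetical and not part of the original scripts, and the sketch assumes that samtools is on the PATH and that the index sits next to the BAM file as either `<name>.bam.bai` or `<name>.bai`.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original scripts): succeed if a
# non-empty index file sits next to the given BAM file, named either
# <name>.bam.bai or <name>.bai.
has_bam_index() {
    local bam="$1"
    [ -s "${bam}.bai" ] || [ -s "${bam%.bam}.bai" ]
}

# Hypothetical wrapper: run samtools index only when no index is present.
# Assumes samtools is available on the PATH.
ensure_bam_index() {
    local bam="$1"
    if has_bam_index "$bam"; then
        echo "index found for $bam"
    else
        echo "no index for $bam, running samtools index"
        samtools index "$bam"
    fi
}
```

Calling `ensure_bam_index piloninput.sorted.bam` in the Pilon script before the java command would skip re-indexing when the .bai file already exists.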
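
The nanopolish indexing step mentions collecting several sequencing_summary.txt files into a file of filenames that is passed with -f. One possible way to build such a file is sketched here. The function name `make_summary_fof`, the directory layout, and the `sequencing_summary*.txt` filename pattern are assumptions for illustration and should be adjusted to the actual basecaller output.

```shell
#!/bin/bash
# Hypothetical helper: write the paths of all sequencing summary files found
# under a directory into a file of filenames, one path per line.
# The filename pattern is an assumption; adjust it to your basecaller output.
make_summary_fof() {
    local search_dir="$1"
    local fof="$2"
    find "$search_dir" -type f -name 'sequencing_summary*.txt' | sort > "$fof"
    # Report how many summary files were listed.
    echo "$(wc -l < "$fof" | tr -d ' ') summary file(s) written to $fof"
}
```

For example, `make_summary_fof /path/to/basecalls summary_files.fof` (with a placeholder path) writes the list, which can then be given to nanopolish index with `-f summary_files.fof`.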
