nanopore_tools_for_polishing

Differences

This shows you the differences between two versions of the page.

nanopore_tools_for_polishing [2020/03/31 14:26] 24.138.68.92
nanopore_tools_for_polishing [2024/08/07 13:01] (current) 134.190.232.164
Line 1: Line 1:
 ====== Polishing your MinION assembly ====== ====== Polishing your MinION assembly ======
 Documentation by Jon Jerlström Hultqvist and Shelby Williams Documentation by Jon Jerlström Hultqvist and Shelby Williams
 +(updates by Joran Martijn)
  
 **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.** **Be aware that some scripts and commands might not be working any longer on Perun due to the switch to the new conda-environment system. Sections will be progressively updated to reflect this.**
Line 81: Line 82:
  
 First, make a BWA index of the assembly you wish to map onto by using the following command: First, make a BWA index of the assembly you wish to map onto by using the following command:
 +
 <code> <code>
 bwa index assembly_to_polish.fasta bwa index assembly_to_polish.fasta
 </code> </code>
 +
 Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped: Next, use the meteora_bwa.sh script to map the short reads onto your assembly. This will create a sorted.bam file. In this example, two paired-end read files will be mapped:
 +
 <code> <code>
-bwa mem -t 16 assembly_to_polish.fasta /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | samtools view -Sb | samtools sort >  piloninput.sorted.bam
 +bwa mem \
 +    -t 16 \
 +    assembly_to_polish.fasta \
 +    /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz \
 +    /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | \
 +        samtools sort --threads 16 -o piloninput.sorted.bam
 </code> </code>
 +
 +UPDATE: You can now run bwa-mem2, which is an optimized version of bwa mem. It generates the exact same output, but is 2-4x faster:
 +
 +<code>
 +bwa-mem2 mem \
 +    -t 16 \
 +    assembly_to_polish.fasta \
 +    /scratch2/user/path/to/trimmedreads_1_PairNtrim.fastq.gz \
 +    /scratch2/user/path/to/trimmedreads_2_PairNtrim.fastq.gz | \
 +        samtools sort --threads 16 -o piloninput.sorted.bam
 +</code>
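Note that bwa-mem2 does not read the index made by ''bwa index''; it needs its own index, built with ''bwa-mem2 index''. The sketch below checks whether that index is already present next to the assembly. The list of file extensions is an assumption based on recent bwa-mem2 releases, and the function name ''bwamem2_index_missing'' is made up for this example.

```shell
# Sketch only: report whether the bwa-mem2 index for a reference is incomplete.
# Assumption: bwa-mem2 writes <ref>.0123, .amb, .ann, .bwt.2bit.64 and .pac.
bwamem2_index_missing() {
    local ref="$1" ext
    for ext in 0123 amb ann bwt.2bit.64 pac; do
        if [ ! -s "${ref}.${ext}" ]; then
            return 0    # at least one index file is missing or empty
        fi
    done
    return 1            # all expected index files are present
}

# Example use (commented out so the sketch has no side effects):
# if bwamem2_index_missing assembly_to_polish.fasta; then
#     bwa-mem2 index assembly_to_polish.fasta
# fi
```

Run the check in the folder that holds the assembly, before the mapping command above.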
 +
 Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so: Once this is finished, use Pilon.sh to make changes in the assembly and generate a new consensus sequence. Pilon.sh can be formatted like so:
 +
 <code> <code>
-java -Xmx16G -jar /scratch2/software/pilon/pilon-1.22.jar --genome assembly_to_polish.fasta --frags piloninput.sorted.bam --output P2x --outdir Pilon2x --threads 16
 +java -Xmx16G -jar /scratch2/software/pilon/pilon-1.22.jar \
 +    --genome assembly_to_polish.fasta \
 +    --frags piloninput.sorted.bam \
 +    --output P2x \
 +    --outdir Pilon2x \
 +    --threads 16
 </code> </code>
 +
 +UPDATE: As of v1.24 the --threads option is no longer maintained. Pilon does not seem to use more than 200-300% CPU (i.e. about 3 threads), so setting --threads to 4 or so should be sufficient.
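One way to apply this in a job script is to cap the thread count instead of passing on all requested slots. This is a minimal sketch: it assumes the job runs under Grid Engine, which sets ''NSLOTS'' for ''-pe threaded N'' jobs, and the function name ''pilon_thread_count'' is made up for this example. The cap of 4 comes from the note above.

```shell
# Sketch only: pick a thread count for Pilon, capped at 4.
# Uses the first argument if given, else NSLOTS, else 1.
pilon_thread_count() {
    local slots="${1:-${NSLOTS:-1}}" cap=4
    if [ "$slots" -lt "$cap" ]; then
        echo "$slots"
    else
        echo "$cap"
    fi
}

# Example use (commented out so the sketch has no side effects):
# java -Xmx16G -jar /scratch2/software/pilon/pilon-1.22.jar \
#     --genome assembly_to_polish.fasta \
#     --frags piloninput.sorted.bam \
#     --output P2x --outdir Pilon2x \
#     --threads "$(pilon_thread_count)"
```

With this in place the job can also request fewer slots (for example ''-pe threaded 4''), which leaves more of the node free for other users.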
 +
 You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run: You may run into an error where Pilon does not recognize the bam file created from the previous step as being indexed. To fix this, run:
 +
 <code> <code>
 samtools index /path/to_bam_file samtools index /path/to_bam_file
 </code> </code>
 +
 This will return a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be placed in the script. This will return a .bam.bai file. This file needs to be in the same folder as Pilon.sh, but does not need to be placed in the script.
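To avoid the error altogether, the job script can check for the index before Pilon starts. The sketch below looks for an index next to the BAM file. It assumes the index is named either ''<file>.bam.bai'' (what ''samtools index'' writes, as described above) or ''<file>.bai'' (a naming used by some other tools). The function name ''find_bam_index'' is made up for this example.

```shell
# Sketch only: print the path of an existing BAM index, or fail if none is found.
find_bam_index() {
    local bam="$1" candidate
    for candidate in "${bam}.bai" "${bam%.bam}.bai"; do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# Example use (commented out so the sketch has no side effects):
# if ! find_bam_index piloninput.sorted.bam > /dev/null; then
#     samtools index piloninput.sorted.bam
# fi
```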
  
Line 102: Line 134:
 Shell script: Shell script:
 {{ :unicyclersh.docx |}} {{ :unicyclersh.docx |}}
 +
 +<code>
 +#!/bin/bash
 +#$ -S /bin/bash
 +. /etc/profile
 +#$ -cwd
 +#$ -pe threaded 16
 +
 +#cd /scratch2/jon/MinION/BMAN/assemblies/Unicycler_polish/
 +
 +echo "Starting"
 +
 +unset PYTHONPATH
 +export PATH=/scratch2/software/gcc-6.3.0/bin:/scratch2/software/Python-3.6.0/bin:$PATH
 +export LD_LIBRARY_PATH=/scratch2/software/gcc-6.3.0/lib64:/scratch2/software/Python-3.6.0/lib:$LD_LIBRARY_PATH
 +
 +/scratch2/software/Python-3.6.0/bin/unicycler_polish -1 /scratch2/shelbyw/RCL_Unicycler/RCL_1_PairNtrim.fq -2 /scratch2/shelbyw/RCL_Unicycler/RCL_2_PairNtrim.fq --long_reads RCL_MinION.CutAdapt75.3000.chop.fastq.gz -a RCL_unclean_AB_assembly_fix_Racon2_Pilon3.fasta --pilon=/scratch2/software/pilon/pilon-1.22.jar --samtools=/opt/perun/bin/samtools --threads 16
 +
 +
 +echo "Done!"
 +
 +</code>
  
 Formatting: Formatting:
Line 169: Line 223:
 #$ -cwd #$ -cwd
 #$ -pe threaded 1 #$ -pe threaded 1
- 
 cd $PWD cd $PWD
  
-export PATH=/scratch2/software/anaconda/envs/nanopolish-0.12/bin:$PATH
 +fast5path=/scratch2/path2/fast5/
 +fastq=/path2fastqlongreads.fastq
 +seqsummary=/path2tosequencing_summary.txt
 +source activate nanopolish-0.13.2
 +export PATH=/scratch2/software/anaconda/envs/nanopolish-0.13.2/bin:$PATH
 +nanopolish index -d $fast5path -s $seqsummary $fastq
 +conda deactivate
  
-source activate nanopolish-0.12 
- 
-nanopolish index \ 
--d /misc/scratch3/jon/MINION_RAW_DATA/BlastoE_200206/20200206_1906_MN19285_FAK59336_42cb123e/fast5/ \ 
--f /misc/scratch3/jon/MINION_RAW_DATA/BlastoE_200206/20200206_1906_MN119285_FAK59336_42cb123e/sequencing_summary_FAK59336_f9140271.txt \ 
-/misc/scratch3/gseaton/CHE_Blasto_raw/E_Blasto_long_raw/PORECHOP/Blasto_E_chopped.fastq  
- 
-conda deactivate 
  
 </code> </code>
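Since the three paths in the script are placeholders that have to be edited by hand, it can help to check them before the job spends time in the queue. The following is a minimal sketch of such a check. The function name ''check_nanopolish_inputs'' and the error messages are made up for this example and are not part of nanopolish.

```shell
# Sketch only: verify that the inputs for "nanopolish index" exist.
# Arguments: fast5 directory, sequencing summary file, FASTQ file with the long reads.
check_nanopolish_inputs() {
    local fast5_dir="$1" summary="$2" reads="$3" status=0
    [ -d "$fast5_dir" ] || { echo "missing fast5 directory: $fast5_dir" >&2; status=1; }
    [ -f "$summary" ]   || { echo "missing sequencing summary: $summary" >&2; status=1; }
    [ -f "$reads" ]     || { echo "missing reads file: $reads" >&2; status=1; }
    return "$status"
}

# Example use (commented out so the sketch has no side effects):
# check_nanopolish_inputs "$fast5path" "$seqsummary" "$fastq" || exit 1
# nanopolish index -d "$fast5path" -s "$seqsummary" "$fastq"
```

All three checks run before the function returns, so every wrong path is reported in one go rather than one per submission.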
Line 231: Line 282:
  
 nanopolish_makerange.py reference.fasta | parallel --results nanopolish.results -P 20 \ nanopolish_makerange.py reference.fasta | parallel --results nanopolish.results -P 20 \
-/scratch2/software/anaconda/envs/nanopolish-python3/bin/nanopolish variants --consensus polished.{1}.fa -w {1} \
 +nanopolish variants --consensus -o polished.{1}.fa -w {1} \
 -r /path/to/reads.fastq -b reads.sorted.bam -g reference.fasta -t 10 --min-candidate-frequency 0.1 -r /path/to/reads.fastq -b reads.sorted.bam -g reference.fasta -t 10 --min-candidate-frequency 0.1
  
nanopore_tools_for_polishing.1585675562.txt.gz · Last modified: by 24.138.68.92