fastx_toolkit_resources

FASTX-toolkit http://hannonlab.cshl.edu/fastx_toolkit/

  1. used mainly to manipulate sequence reads from Illumina sequencers
  2. most of the tools can handle both FASTA and FASTQ files

ways to run the tools

  1. command line
    1. on perun
      1. conda environment
        1. source activate fastx_toolkit

fastx_clipper -h

  2. online graphical interface

http://hannonlab.cshl.edu/fastx_toolkit/galaxy.html
https://usegalaxy.org/

  3. local downloads

http://hannonlab.cshl.edu/fastx_toolkit/download.html

DOES NOT WORK with lowercase nucleotides

DOES NOT WORK with interleaved sequences (convert them to single-line sequences with fasta_formatter first, see below)

DOES NOT WORK with amino acids
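
The three limitations above can be checked before a file is handed to the tools. The following is a minimal sketch in awk, not part of FASTX-toolkit: the function name fasta_check is made up for this page, it assumes FASTA (not FASTQ) input, and it treats anything other than A, C, G, T and N as suspicious, which will also flag IUPAC ambiguity codes.

```shell
# fasta_check: hypothetical helper, not a FASTX-toolkit command.
# Reads FASTA on stdin and counts lines/records that hit the known limitations.
fasta_check() {
  awk '
    /^>/ { lines = 0; next }                  # new record: reset its line counter
    {
      gsub(/[ \t\r]/, "")                     # ignore stray whitespace
      if ($0 == "") next                      # skip blank lines
      lines++
      if (lines == 2) wrapped++               # a second sequence line means interleaved
      if ($0 != toupper($0)) lower++          # lowercase letters present
      if (toupper($0) ~ /[^ACGTN]/) other++   # letters outside A, C, G, T, N
    }
    END {
      printf "lowercase_lines=%d wrapped_records=%d non_ACGTN_lines=%d\n", lower, wrapped, other
    }
  '
}

# demo: one lowercase record, one interleaved record, one protein record
printf '>a\nacgt\n>b\nACGT\nACGT\n>c\nMKVL\n' | fasta_check
# prints: lowercase_lines=1 wrapped_records=1 non_ACGTN_lines=1
```

A file that reports all zeros should be safe with respect to these three limitations; any other problem is outside the scope of this check.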

common syntax

  1. -i for input file
  2. -o for output file
  3. -h for help

fastx_clipper

  1. -l N discards sequences shorter than N nucleotides
    1. default is 5

more p2.fasta

>sequence1 blahblah
ACGTACGTACGTACGTACGTACGTT
>sequence47 bluhbluhbluh
TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
>myfavourite
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
>myfav2
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
>contig23 acanth
AGTAGTGACTGAGTAATAGACGTAG

fastx_clipper -i p2.fasta -o blah -l 50

more blah

>sequence47 bluhbluhbluh
TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
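
To cross-check which records should survive a given -l value, the length cut-off can be mimicked with awk. This is only a sketch: length_filter is a made-up name, it assumes single-line FASTA input, and it reproduces nothing but the length filter (fastx_clipper also clips adapter sequence, which is ignored here).

```shell
# length_filter: hypothetical helper that mimics only the -l N length cut-off.
# Usage: length_filter N < single_line.fasta
length_filter() {
  awk -v min="$1" '
    /^>/ { header = $0; next }                  # remember the header line
    length($0) >= min { print header; print }   # keep records with at least min bases
  '
}

# demo: the 8 nt record is dropped, the 16 nt record is kept
printf '>short\nACGTACGT\n>long\nACGTACGTACGTACGT\n' | length_filter 10
```

Records shorter than N are dropped together with their header, which is the behaviour shown in the -l 50 run above.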

fastx_renamer

  1. renames the sequences
    1. -n TYPE
      1. SEQ (default) uses the nucleotide sequence as the name
      2. COUNT uses a running counter
  2. the extra information on the header line is lost

more p2.fasta

>sequence1 blahblah
ACGTACGTACGTACGTACGTACGTT
>sequence47 bluhbluhbluh
TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
>myfavourite
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
>contig23 acanth
AGTAGTGACTGAGTAATAGACGTAG

fastx_renamer -i p2.fasta -o p2renamed.fasta -n SEQ

>ACGTACGTACGTACGTACGTACGTT
ACGTACGTACGTACGTACGTACGTT
>TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
>AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
>AGTAGTGACTGAGTAATAGACGTAG
AGTAGTGACTGAGTAATAGACGTAG

fastx_renamer -i p2.fasta -o p2renamed.fasta -n COUNT

>1
ACGTACGTACGTACGTACGTACGTT
>2
TTTTTTAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCC
>3
AGAGAGAGAGAGAGAGAGAGAGAGAGAGAGAG
>4
AGTAGTGACTGAGTAATAGACGTAG
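
Because the original header text is lost, it can be worth saving a lookup table before renaming with -n COUNT. A minimal sketch, assuming FASTA input and assuming the counter starts at 1 and follows file order as in the output above; header_table is a made-up name, not a FASTX-toolkit command.

```shell
# header_table: hypothetical helper.
# Prints "counter<TAB>original header" for every record read from stdin.
header_table() {
  awk '/^>/ { n++; print n "\t" substr($0, 2) }'   # substr drops the leading ">"
}

# demo with two of the headers from p2.fasta (sequences shortened)
printf '>sequence1 blahblah\nACGT\n>contig23 acanth\nAGTA\n' | header_table
```

Redirecting the output to a file (for example header_table < p2.fasta > p2.names.tsv, where the file name is only an illustration) keeps the mapping from counter back to the full header.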

fasta_formatter

  1. can change multiline (interleaved) sequences to single-line (sequential) sequences
  2. can change the line length of interleaved sequences
    1. -w N sets the number of nucleotides per line

more p5.fasta

>M86863.1
GTAACATGACGTTGACCGTGCGGGGCTACATGTAGCAGCTGGGTGTGCTAACTACGGATACATGCCTACA
ACCCCCACAAGTCAAGACCATTGCGACGCGGAAACAGGAGCCCGCAAAAGAGGAGAAAAACAACGGCGAG
>seq2
ACTCGGGGGCGGAGTGGGTCACGTGACTTTCCTTTTTCCCCTCACCTGGCCCGCTCCGTCCATATCTCTG
TCGTACAAGACAATATTGTCGCAACGCAAAAGGTCCATAAATTACTGGGTAGACGCAACTCTATTTGAAG
GCAACCTACCGTTTGCTTTTAGTGTTTTGGTTTTGTTACCATATCCAAAAAAAAACCATATATCCAAAAA
TTCCGCTGCACCATCTCTTCTTCTCTCCATCAACTACCCCTGCGGAGAAATTCACACCACAGTTACAATG

fasta_formatter -i p5.fasta -o p5oneline

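>M86863.1
GTAACATGACGTTGACCGTGCGGGGCTACATGTAGCAGCTGGGTGTGCTAACTACGGATACATGCCTACAACCCCCACAAGTCAAGACCATTGCGACGCGGAAACAGGAGCCCGCAAAAGAGGAGAAAAACAACGGCGAG
>seq2
ACTCGGGGGCGGAGTGGGTCACGTGACTTTCCTTTTTCCCCTCACCTGGCCCGCTCCGTCCATATCTCTGTCGTACAAGACAATATTGTCGCAACGCAAAAGGTCCATAAATTACTGGGTAGACGCAACTCTATTTGAAGGCAACCTACCGTTTGCTTTTAGTGTTTTGGTTTTGTTACCATATCCAAAAAAAAACCATATATCCAAAAATTCCGCTGCACCATCTCTTCTTCTCTCCATCAACTACCCCTGCGGAGAAATTCACACCACAGTTACAATG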

fasta_formatter -i p5.fasta -o p5.20 -w 20

>M86863.1
GTAACATGACGTTGACCGTG
CGGGGCTACATGTAGCAGCT
GGGTGTGCTAACTACGGATA
CATGCCTACAACCCCCACAA
GTCAAGACCATTGCGACGCG
GAAACAGGAGCCCGCAAAAG
AGGAGAAAAACAACGGCGAG
>seq2
ACTCGGGGGCGGAGTGGGTC
ACGTGACTTTCCTTTTTCCC
CTCACCTGGCCCGCTCCGTC
CATATCTCTGTCGTACAAGA
CAATATTGTCGCAACGCAAA
AGGTCCATAAATTACTGGGT
AGACGCAACTCTATTTGAAG
GCAACCTACCGTTTGCTTTT
AGTGTTTTGGTTTTGTTACC
ATATCCAAAAAAAAACCATA
TATCCAAAAATTCCGCTGCA
CCATCTCTTCTTCTCTCCAT
CAACTACCCCTGCGGAGAAA
TTCACACCACAGTTACAATG
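
The re-wrapping shown above can be reproduced with awk to check what a given width should look like. This is a sketch of the idea only, not how fasta_formatter is implemented: fasta_rewrap is a made-up name, the input is assumed to be FASTA, and the width must be a positive integer.

```shell
# fasta_rewrap: hypothetical helper.
# Usage: fasta_rewrap WIDTH < in.fasta   (WIDTH must be a positive integer)
fasta_rewrap() {
  awk -v w="$1" '
    function emit(   i) {                   # print the stored sequence in chunks of w
      for (i = 1; i <= length(seq); i += w) print substr(seq, i, w)
      seq = ""
    }
    /^>/ { emit(); print; next }            # header: write out the previous record first
    { gsub(/[ \t\r]/, ""); seq = seq $0 }   # sequence line: join onto the current record
    END { emit() }                          # write out the last record
  '
}

# demo: a 10 nt record split over two lines, re-wrapped at 4 nt per line
printf '>s\nACGTAC\nGTAC\n' | fasta_rewrap 4
```

Each record is first joined into one string and then cut into pieces of the requested width, so the last line of a record may be shorter than the others.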

fastx_trimmer

  1. sets the first and last base to keep
    1. default is the entire read
    2. -f N first base to keep
      1. 1 = first base of the read
    3. -l N last base to keep
      1. counted on the entire read, before -f is applied

more p6.fasta

>seq1
AAAAAAAAAATTTTTTTTTTGGGGGGGGGG
>seq2
CCCAAATTTGGGCCCAAATTTGGG

fastx_trimmer -i p6.fasta -o p6.trimmed -f 10

>seq1
ATTTTTTTTTTGGGGGGGGGG
>seq2
GGGCCCAAATTTGGG

fastx_trimmer -i p6.fasta -o p6.trimmed -f 10 -l 15

>seq1
ATTTTT
>seq2
GGGCCC
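
Both positions are 1-based and counted on the original read, so -f 10 -l 15 keeps 15 - 10 + 1 = 6 bases. The same arithmetic can be sketched with awk's substr, which is also 1-based. trim_range is a made-up name and the sketch assumes single-line FASTA input; it is not the fastx_trimmer implementation.

```shell
# trim_range: hypothetical helper.
# Usage: trim_range FIRST LAST < single_line.fasta
trim_range() {
  awk -v first="$1" -v last="$2" '
    /^>/ { print; next }                              # headers pass through unchanged
    { print substr($0, first, last - first + 1) }     # keep bases first..last, inclusive
  '
}

# demo with the two records from p6.fasta
printf '>seq1\nAAAAAAAAAATTTTTTTTTTGGGGGGGGGG\n>seq2\nCCCAAATTTGGGCCCAAATTTGGG\n' | trim_range 10 15
# prints:
# >seq1
# ATTTTT
# >seq2
# GGGCCC
```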

fastx_reverse_complement

  1. reverse complements the sequences
    1. reverses the order of the bases and complements each one

more p7.fasta

>seq1
AAAAAAAAAAGGGGGGGGGG
>seq2
AGAGAGTT

fastx_reverse_complement -i p7.fasta -o p7rc

>seq1
CCCCCCCCCCTTTTTTTTTT
>seq2
AACTCTCT
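
The two steps (reverse the read, then swap A with T and C with G) can be written out in awk as a sanity check on the output above. This is a sketch under assumptions: revcomp is a made-up name, input is uppercase single-line FASTA, and any character other than A, C, G, T is copied unchanged.

```shell
# revcomp: hypothetical helper, not the fastx_reverse_complement implementation.
revcomp() {
  awk '
    BEGIN { comp["A"] = "T"; comp["C"] = "G"; comp["G"] = "C"; comp["T"] = "A" }
    /^>/ { print; next }                      # headers pass through unchanged
    {
      out = ""
      for (i = length($0); i >= 1; i--) {     # walk the read from last base to first
        base = substr($0, i, 1)
        out = out ((base in comp) ? comp[base] : base)
      }
      print out
    }
  '
}

# demo with the two records from p7.fasta
printf '>seq1\nAAAAAAAAAAGGGGGGGGGG\n>seq2\nAGAGAGTT\n' | revcomp
# prints:
# >seq1
# CCCCCCCCCCTTTTTTTTTT
# >seq2
# AACTCTCT
```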