How to extract a single FASTA entry from a multifasta file

You have a multifasta file containing 200 individual sequences, but you only want one of them. How do you extract it?

DO NOT open the file in a text editor and simply cut and paste the entry you want. Unless you are very careful, this approach can introduce hidden, unwanted characters that cause programs expecting plain Unix-style text files to fail.
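One common example of such a hidden character is the carriage return that some editors add at the end of every line. The following is a minimal sketch of how to check for it with standard Unix tools; the file name edited.fasta and its contents are made up for illustration, and the function name count_cr_lines is hypothetical.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)

# Hypothetical file saved with Windows-style line endings (CR+LF).
printf '>seq1 demo\r\nATGC\r\n' > "$tmpdir/edited.fasta"

# Print the number of lines that contain a carriage return.
# A result of 0 means no carriage returns were found.
count_cr_lines() {
    grep -c "$(printf '\r')" "$1"
}

count_cr_lines "$tmpdir/edited.fasta"
```

A non-zero count suggests the file was saved with non-Unix line endings. Other kinds of hidden characters would need their own checks.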

A better way is to use makeblastdb and blastdbcmd, both part of the NCBI BLAST+ package. First, build a BLAST database from the multifasta file:

makeblastdb -in name_of_multifasta_file -dbtype nucl -parse_seqids (for nucleotide files)

makeblastdb -in name_of_multifasta_file -dbtype prot -parse_seqids (for protein files)

Next, identify the name of the entry you want. This is the text string right after the > and before the first space in the header. For example, in the header

>m.260160 g.260160 ORF g.260160

the name is m.260160.
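If you are not sure which names are present, it can help to list the first word of every header. This is a minimal sketch using standard Unix tools rather than BLAST+; the file name example.fasta, its contents, and the function name list_entry_names are made up for illustration.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)

# Small illustrative multifasta file (hypothetical contents).
printf '>m.260160 g.260160 ORF g.260160\nATGGCC\n>m.260161 g.260161 ORF g.260161\nATGTTT\n' > "$tmpdir/example.fasta"

# List entry names: keep only header lines, drop the leading ">",
# and keep only the text before the first space.
list_entry_names() {
    grep '^>' "$1" | sed 's/^>//' | cut -d ' ' -f 1
}

list_entry_names "$tmpdir/example.fasta"
```

Each line printed is a candidate value for the -entry option of blastdbcmd.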

Finally, extract that entry with blastdbcmd:

blastdbcmd -db name_of_multifasta_file -entry name_of_entry_you_want -out name_of_single_entry
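After extraction, a quick sanity check is to confirm that the output file holds exactly one entry. This is a minimal sketch, assuming the output is an ordinary FASTA file. A stand-in file with made-up contents is created here so the example runs without BLAST+ installed, and the function name count_entries is hypothetical.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmpdir=$(mktemp -d)

# Stand-in for the file written by blastdbcmd (hypothetical contents).
printf '>m.260160 g.260160 ORF g.260160\nATGGCC\nGGTTAA\n' > "$tmpdir/name_of_single_entry"

# Count FASTA entries by counting header lines (lines starting with ">").
count_entries() {
    grep -c '^>' "$1"
}

count_entries "$tmpdir/name_of_single_entry"
```

A count of 1 means the file contains a single entry. Any other value suggests the wrong name was passed to -entry or the extraction did not work as intended.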

extracting_a_single_fasta_entry_from_a_multifasta_file.1561042887.txt.gz · Last modified: by 129.173.90.41