How to extract a single fasta file from a multifasta file
You have a multifasta file containing 200 individual sequences, but you only want one of them. How do you extract it?
DO NOT open the file in a text editor and simply cut and paste the one you want. Unless you are very careful, this approach can introduce hidden, unwanted characters that cause programs expecting plain Unix-style text files to fail.
A better way is to use makeblastdb and blastdbcmd from the NCBI BLAST+ package. First, build a BLAST database from the multifasta file:
makeblastdb -in name_of_multifasta_file -dbtype nucl -parse_seqids
- for nucleotide sequences
makeblastdb -in name_of_multifasta_file -dbtype prot -parse_seqids
- for protein sequences
Next, identify the name of the entry you want. This is the text string right after the > and before the first space in the header. For example, in the header
>m.260160 g.260160 ORF g.260160
the name is m.260160
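If you are not sure which names the file contains, you can list them from the command line instead of opening the file in an editor. The following is a minimal sketch, not part of BLAST+: list_entry_names is a hypothetical helper name, and example.fasta is a small made-up file created only so the example runs on its own.

```shell
# Create a tiny two-entry multifasta file for illustration (hypothetical data).
printf '>m.260160 g.260160 ORF g.260160\nATGGCC\n>m.260161 g.260161 ORF g.260161\nATGTTT\n' > example.fasta

# Print the entry name of every header line: the text after ">" up to the
# first whitespace. list_entry_names is a hypothetical helper name.
list_entry_names() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

list_entry_names example.fasta
# prints:
# m.260160
# m.260161
```

On a real file, run list_entry_names name_of_multifasta_file and pick the name you need from the output.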
Finally, extract that entry with blastdbcmd:
blastdbcmd -db name_of_multifasta_file -entry name_of_entry_you_want -out name_of_single_entry
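If you do this often, the two steps can be wrapped in a small shell function. This is only a sketch under some assumptions: extract_entry is a hypothetical helper name, BLAST+ is assumed to be installed and on your PATH, and the database is assumed to take the same name as the multifasta file, as in the commands above.

```shell
# Sketch: build the BLAST database and pull out one entry in a single call.
# extract_entry is a hypothetical helper, not a BLAST+ command.
extract_entry() {
    fasta=$1    # name of the multifasta file
    entry=$2    # name of the entry you want, e.g. m.260160
    dbtype=$3   # nucl for nucleotide sequences, prot for protein sequences
    out=$4      # name of the output file for the single entry

    # Reject anything other than the two database types used above.
    case $dbtype in
        nucl|prot) ;;
        *) echo "dbtype must be nucl or prot" >&2; return 2 ;;
    esac

    # Stop early with a clear message if BLAST+ is not available.
    if ! command -v makeblastdb >/dev/null 2>&1 ||
       ! command -v blastdbcmd >/dev/null 2>&1; then
        echo "makeblastdb and blastdbcmd (BLAST+) not found on PATH" >&2
        return 127
    fi

    # Same commands as above; the second runs only if the first succeeds.
    makeblastdb -in "$fasta" -dbtype "$dbtype" -parse_seqids &&
        blastdbcmd -db "$fasta" -entry "$entry" -out "$out"
}

# Usage:
#   extract_entry name_of_multifasta_file m.260160 nucl name_of_single_entry
```

The function rebuilds the database on every call, which is simple but wasteful for large files. If you extract many entries from the same file, run makeblastdb once and call blastdbcmd directly afterwards.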
