By Joran Martijn
Selecting sequences:
# select the sequence whose FASTA header ID exactly matches a given pattern
$ seqkit grep -p
# select sequences whose FASTA header ID matches a given regular expression (regex) pattern
$ seqkit grep -rp
# select a set of sequences whose exact IDs are listed in a file (one ID per line)
$ seqkit grep -f
# select a set of sequences whose IDs match the regex patterns listed in a file (one pattern per line)
$ seqkit grep -rf
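The difference between exact and regex matching is easiest to see on a small input. The sketch below assumes seqkit is installed and on the PATH; the file names and sequence IDs are made up for illustration and are not part of the original notes.

```shell
# Toy input (hypothetical file name and IDs, for illustration only)
printf '>seq1 first toy record\nACGT\n>seq2 second toy record\nGGCC\n>seq10 third toy record\nTTAA\n' > example.fasta

# Exact match on the ID: selects seq1 only, not seq10
seqkit grep -p seq1 example.fasta > exact.fasta

# Regex match on the ID: '^seq1' selects both seq1 and seq10
seqkit grep -rp '^seq1' example.fasta > regex.fasta

# List the headers of each result
grep '^>' exact.fasta
grep '^>' regex.fasta
```

Quoting the regex keeps the shell from interpreting characters such as `^`, `*` or `|` before seqkit sees them.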
Removing sequences:
# remove the sequence whose FASTA header ID exactly matches a given pattern
$ seqkit grep -vp
# remove sequences whose FASTA header ID matches a given regular expression (regex) pattern
$ seqkit grep -vrp
# remove a set of sequences whose exact IDs are listed in a file (one ID per line)
$ seqkit grep -vf
# remove a set of sequences whose IDs match the regex patterns listed in a file (one pattern per line)
$ seqkit grep -vrf
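Adding `-v` inverts the match, so the same ID file can be used either to keep or to drop a set of sequences. A minimal sketch, again assuming seqkit is on the PATH; `example.fasta`, `ids.txt` and the IDs are hypothetical names chosen for the example.

```shell
# Toy input and ID list (hypothetical names, for illustration only)
printf '>seq1 first toy record\nACGT\n>seq2 second toy record\nGGCC\n>seq10 third toy record\nTTAA\n' > example.fasta
printf 'seq1\nseq10\n' > ids.txt   # one exact ID per line

# Keep only the sequences listed in ids.txt (seq1 and seq10)
seqkit grep -f ids.txt example.fasta > kept.fasta

# Remove the sequences listed in ids.txt, leaving everything else (seq2)
seqkit grep -vf ids.txt example.fasta > removed.fasta

# List the headers of each result
grep '^>' kept.fasta
grep '^>' removed.fasta
```

Together the two output files partition the input: every record lands in exactly one of them.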
A FASTA header consists of two parts: the header ID and the header DESCRIPTION. The ID is everything between ''>'' and the first whitespace character, and the DESCRIPTION is everything after that.
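To make the ID/DESCRIPTION split concrete, here is a small sketch that takes one header apart using only shell parameter expansion. It does not use seqkit, and the header text is a made-up example.

```shell
# Hypothetical header line, for illustration only
header='>seq1 Escherichia coli 16S rRNA'

line=${header#>}            # drop the leading '>'
id=${line%%[[:space:]]*}    # keep everything before the first whitespace
desc=${line#"$id"}          # what remains after the ID
desc=${desc#[[:space:]]}    # drop the single separating whitespace character

echo "ID: $id"              # prints: ID: seq1
echo "DESCRIPTION: $desc"   # prints: DESCRIPTION: Escherichia coli 16S rRNA
```

This is why a pattern such as `seq1` matches this record by ID, while a word that appears only in the DESCRIPTION does not match unless the search is told to look at the full header.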