**Quick taxonomy recovery** using the accession numbers from either a Blast or Plast output:\\
Use the acc2tax program, which is available on the environment PATH.\\
Here is an example of how to run it for protein IDs (-p):\\
acc2tax -i /db1/extra-data-sets/Acc2tax/acc2taxIN_example -p -d /db1/extra-data-sets/Acc2tax/Acc2Tax_071119 -o taxonomy.out

Make sure that your input file contains only the accession numbers without their version suffix; see the example file given above.

:-( E.g., MBI4782295.1 must be given as MBI4782295, otherwise the lookup fails with:\\
Couldn't find: [MBI4782295.1]

:-D Trim the version ".1" from the accession MBI4782295.1:
> cat file | cut -d '.' -f1 > out_file

Note:

1. You can get the accession list directly from the Blast/Plast result (output.txt) using the command below:
> cat output.txt | cut -f2 | cut -d '.' -f1 > out_file

2. If the accession numbers contain "|" (e.g., gb|KAA8922376.1|):
> cat output.txt | cut -d "|" -f2 | cut -d '.' -f1 > out_file

3. Some accessions may still be reported as not found, as below, even when the NCBI taxonomy database is updated to the latest version:\\
Couldn't find: [MBR3349819]\\
Couldn't find: [HBS54143]\\
Couldn't find: [MYJ28876]

This might be because the species these protein IDs (MBR3349819, HBS54143) come from cannot be placed in the taxonomy the way NP_051083 can, i.e., their lineage is not in (full) status. For comparison, NP_051083 resolves to a full lineage:\\
NP_051083 cellular organisms,Eukaryota,Viridiplantae,Streptophyta,Streptophytina,Embryophyta,Tracheophyta,Euphyllophyta,Spermatophyta,Magnoliopsida,Mesangiospermae,eudicotyledons,Gunneridae,Pentapetalae,rosids,malvids,Brassicales,Brassicaceae,Camelineae,Arabidopsis,Arabidopsis thaliana

==========================

acc2tax database location:\\
/db1/extra-data-sets/Acc2tax/\\
/db1/extra-data-sets/Acc2tax/Acc2Tax_04_01_2024 (up to date as of Jan 04, 2024)\\
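
The accession clean-up steps from the notes above (take column 2, strip a "gb|...|" style wrapper, drop the version) can be combined into one helper. This is only a minimal sketch, not part of acc2tax: the function name `extract_accessions` is hypothetical, it assumes tab-separated Blast/Plast output with the subject ID in column 2, and it additionally removes duplicate accessions, which the commands above do not do.

```shell
#!/bin/sh
# Sketch: print version-less accessions from tabular Blast/Plast output.
# Assumptions (not from acc2tax itself): columns are tab-separated and
# the subject ID is in column 2. The function name is hypothetical.
extract_accessions() {
    awk -F '\t' '
    {
        id = $2
        # For IDs such as "gb|KAA8922376.1|", keep the field after the first "|"
        n = split(id, parts, "|")
        if (n > 1) id = parts[2]
        # Drop a trailing version suffix such as ".1" or ".2"
        sub(/\.[0-9]+$/, "", id)
        # Print each accession once, in order of first appearance
        if (id != "" && !seen[id]++) print id
    }' "$@"
}
```

Possible usage: after defining the function in your shell, run `extract_accessions output.txt > out_file`, then pass out_file to acc2tax with -i as in the example at the top of this page.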