curation_of_phylogenomic_datasets
Differences
This shows you the differences between two versions of the page.
| curation_of_phylogenomic_datasets [2023/09/19 12:21] – [Identifying xenologs] 134.190.232.90 | curation_of_phylogenomic_datasets [2025/03/06 11:50] (current) – 134.190.145.228 | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ====== | + | Joran Martijn |
| + | |||
| + | ====== Curation of phylogenomic datasets ====== | ||
| Phylogenomic analyses attempt to use genomic data to answer phylogenetic questions. Often we're asking about the shape of a species tree. How did modern day taxa diverge over their evolutionary history? What is the deepest divergence (i.e. the root) of these taxa? | Phylogenomic analyses attempt to use genomic data to answer phylogenetic questions. Often we're asking about the shape of a species tree. How did modern day taxa diverge over their evolutionary history? What is the deepest divergence (i.e. the root) of these taxa? | ||
| Line 15: | Line 17: | ||
| * if one of the pair had undergone horizontal gene transfer at some point in its evolutionary history since its divergence with the other of the pair, and the pair's common ancestor gene was present in the LCA or one of its descendants, | * if one of the pair had undergone horizontal gene transfer at some point in its evolutionary history since its divergence with the other of the pair, and the pair's common ancestor gene was present in the LCA or one of its descendants, | ||
| - | Typically when we construct new phylogenomic datasets, we use similarity searches such as BLAST, DIAMOND and HMMER to generate sets of genes. | + | Typically when we construct new phylogenomic datasets, we use similarity searches such as BLAST, DIAMOND and HMMER (sometimes in combination with Markov Clustering, or MCL, algorithms) |
| This is an extremely practical approach, but can be fairly rough. Genes that are truly orthologs relative to genes that were found with BLAST may be missed if similarity searches are too stringent. On the other hand, genes that are NOT true orthologs (i.e. their divergence with the genes found with BLAST // | This is an extremely practical approach, but can be fairly rough. Genes that are truly orthologs relative to genes that were found with BLAST may be missed if similarity searches are too stringent. On the other hand, genes that are NOT true orthologs (i.e. their divergence with the genes found with BLAST // | ||
| Line 63: | Line 65: | ||
| Be on the lookout for phylogenetic artefacts though. A gene that is in fact a regular ortholog may branch with strong support with an unrelated taxon, for example because they have similar taxonomic compositions, | Be on the lookout for phylogenetic artefacts though. A gene that is in fact a regular ortholog may branch with strong support with an unrelated taxon, for example because they have similar taxonomic compositions, | ||
| - | * Are situated on a long, well-supported branch that, if used for rooting the gene tree, yields an ingroup with a species-tree-like topology. This may indicate genes that were introduced into these taxa via horizontal gene transfer from a donor //outside// the species tree, i.e. **out-xenologs**. This pattern is pretty much identical to that of **out-paralogs** (see above). In either case, you would want to remove these genes from the phylogenomics dataset | + | * Are situated on a long, well-supported branch that, if used for rooting the gene tree, yields an ingroup with a species-tree-like topology. This may indicate genes that were introduced into these taxa via horizontal gene transfer from a donor //outside// the species tree, i.e. **out-xenologs**. |
| + | |||
| + | This pattern is pretty much identical to that of **out-paralogs** (see above). In either case, you would want to remove these genes from the phylogenomics dataset | ||
| * NOTE: I made the terms ' | * NOTE: I made the terms ' | ||
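
The "long, well supported branch" pattern described in the out-xenolog bullet can be pre-screened automatically before inspecting gene trees by eye. The following is only a minimal sketch under stated assumptions, not the procedure used on this page: the toy tree, the function names, the cutoff of 5 times the median branch length, and the support threshold of 90 are all hypothetical choices for illustration. It assumes a plain Newick string in which internal node labels are support values, and it only flags candidate clades. It does not check whether rooting on the flagged branch gives a species-tree-like ingroup, and it cannot tell out-xenologs from out-paralogs.

```python
from statistics import median

# Hypothetical toy gene tree; internal node labels are support values.
EXAMPLE_TREE = "((A:0.1,B:0.1)95:0.05,(C:0.1,D:0.12)90:0.04,(X:0.1,Y:0.1)100:1.5);"


def parse_newick(text):
    """Parse a simple Newick string (no quoted labels, no comments) into
    nested dicts with the keys 'name', 'length' and 'children'."""
    text = text.strip()
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(parse_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
        # Read the label: a leaf name, or a support value on internal nodes.
        start = pos
        while text[pos] not in ":,);":
            pos += 1
        name = text[start:pos]
        # Read the optional branch length after ":".
        length = 0.0
        if text[pos] == ":":
            pos += 1
            start = pos
            while text[pos] not in ",);":
                pos += 1
            length = float(text[start:pos])
        return {"name": name, "length": length, "children": children}

    return parse_node()


def leaf_names(node):
    """Return the names of all leaves below (or at) this node."""
    if not node["children"]:
        return [node["name"]]
    return [name for child in node["children"] for name in leaf_names(child)]


def flag_long_branches(root, factor=5.0, min_support=90.0):
    """Flag internal branches longer than factor x the median branch length
    whose support is at least min_support. Both thresholds are assumptions."""
    nodes = []

    def walk(node):
        for child in node["children"]:
            nodes.append(child)
            walk(child)

    walk(root)
    cutoff = factor * median(n["length"] for n in nodes)
    flagged = []
    for n in nodes:
        if not n["children"]:
            continue  # this sketch only considers internal branches
        try:
            support = float(n["name"])
        except ValueError:
            continue  # no numeric support label on this branch
        if n["length"] > cutoff and support >= min_support:
            flagged.append((leaf_names(n), n["length"], support))
    return flagged


print(flag_long_branches(parse_newick(EXAMPLE_TREE)))
```

Any clade flagged this way is only a candidate for closer inspection. Long branches can also arise from the phylogenetic artefacts mentioned above, so the thresholds should be tuned per dataset and the flagged genes checked in the tree itself before removal.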
curation_of_phylogenomic_datasets.1695136890.txt.gz · Last modified: by 134.190.232.90
