ISM Research Memorandum
No.
1022
Title:
Quantitative Assessments of Genome-Wide Indels Support Atlantogenata
at the Root of Placental Mammals
Author(s):
Waddell, Peter J (ISM, University of Tokyo);
Umehara, Satoshi (University of Tokyo);
Griche, Karim-Cyril (C/- University of Tokyo);
Kishino, Hirohisa (University of Tokyo)
Key words:
Indel phylogenetics; Atlantogenata; Boreotheria; Supraprimates; Laurasiatheria;
Eutheria; Placentalia; phylogenomics; ancestral polymorphism.
Abstract:
Recently sequenced genomes of mammals and other vertebrates potentially contain
many rarely occurring indels. These should provide an abundant source of phylogenetic
characters with which to test difficult-to-resolve parts of the tree of placental
(eutherian) mammals. One of the hardest and most important parts of the tree is the
location of the root. While the hypothesis Boreotheria (also called Boreoeutheria) is
consistent with LINE insertion data, it is unclear which of the three hypotheses,
Epitheria (Xenarthra diverging first), Atlantogenata (Xenarthra with Afrotheria) or
Exafricoplacentalia (Afrotheria first), is correct. Here we test these hypotheses using
a log likelihood ratio test and highly conserved well-aligned indels that are five base
pairs or larger in size. We find 51 indels supporting Boreotheria against just 2
and 1 supporting its alternatives, and 4, 15 and 3 supporting Epitheria, Atlantogenata and
Exafricoplacentalia, respectively. Our tests therefore reject the alternatives to Boreotheria
(p << 0.05) and to Atlantogenata (p < 0.05), so the root of placentals appears to fall
between these two main lineages of placental mammals. Exploratory analyses of large
collections of indels reveal some important possibilities. One of these is that the
duration of the two main lineages of placentals may have been very brief, possibly
~1 million years. Another is that indels located some distance from
other indels do indeed seem to show much lower homoplasy (C.I. ≥ 0.8) than sequence
data (C.I. ~ 0.5-0.6). On a more cautionary note, there are also indications of
unexpected biases in the homology and alignment of the genomic data. Given these
caveats, it is important that maximally independent types of low-homoplasy molecular
data corroborate all major clades of mammals. Estimating the ancestral population sizes
of mammals at such an ancient age is shown to be particularly difficult, but potentially
soluble. Taking account of biases in estimating the proportion of characters involved
in ancestral sorting and in the lengths of deep internal edges is essential. The
resolved root of placentals is important to reconstructions of ancestral characters and
genomes, and should yield important insights in these areas.
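
As a rough illustration of how a log likelihood ratio statistic behaves on three-way indel counts such as those quoted above, the following minimal sketch applies a generic G-test. It assumes a simple multinomial model with a null hypothesis of equal support for the three competing resolutions (as might be expected for an unresolved node); this is an assumption made for illustration and is not necessarily the test used in this memorandum. The function name `g_test_equal_support` is hypothetical.

```python
import math


def g_test_equal_support(counts):
    """Generic G-test (log likelihood ratio) for three competing hypotheses.

    Assumed null (illustrative only): each of the three resolutions is
    supported with equal probability 1/3 under a multinomial model.
    Returns the statistic G and an asymptotic chi-square p-value (df = 2).
    """
    if len(counts) != 3:
        raise ValueError("expected exactly three counts")
    n = sum(counts)
    if n <= 0:
        raise ValueError("need at least one observed character")
    expected = n / 3.0
    # G = 2 * sum(O * ln(O / E)); a zero count contributes nothing.
    g = 2.0 * sum(c * math.log(c / expected) for c in counts if c > 0)
    # With 2 degrees of freedom the chi-square survival function is exp(-G/2).
    p_value = math.exp(-g / 2.0)
    return g, p_value


if __name__ == "__main__":
    # Counts as quoted in the abstract: Boreotheria vs. its two alternatives,
    # then Epitheria, Atlantogenata, Exafricoplacentalia.
    for label, counts in [("Boreotheria", [51, 2, 1]),
                          ("root position", [4, 15, 3])]:
        g, p = g_test_equal_support(counts)
        print(f"{label}: G = {g:.2f}, p = {p:.3g}")
```

Under this assumed null, both configurations give p below 0.05, with the Boreotheria counts far more extreme than the root-position counts. A test tailored to the alternatives of interest (for example, one that contrasts the best-supported hypothesis with the second best) could give different p-values.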