Ongoing Adaptive Evolution of ASPM, a Brain Size Determinant in Homo sapiens
- Nitzan Mekel-Bobrov1,2,
- Sandra L. Gilbert1,
- Patrick D. Evans1,2,
- Eric J. Vallender1,2,
- Jeffrey R. Anderson1,
- Richard R. Hudson3,
- Sarah A. Tishkoff4,
- Bruce T. Lahn1,*
- * To whom correspondence should be addressed. E-mail: blahn@bsd.uchicago.edu
Abstract
The gene ASPM (abnormal spindle-like microcephaly associated) is a specific regulator of brain size, and its evolution in the lineage leading to Homo sapiens was driven by strong positive selection. Here, we show that one genetic variant of ASPM in humans arose merely about 5800 years ago and has since swept to high frequency under strong positive selection. These findings, especially the remarkably young age of the positively selected variant, suggest that the human brain is still undergoing rapid adaptive evolution.
Homozygous null mutations of ASPM cause primary microcephaly, a condition characterized by severely reduced brain size with otherwise normal neuroarchitecture (1). Studies have suggested that ASPM may regulate neural stem cell proliferation and/or differentiation during brain development, possibly by mediating spindle assembly during cell division (1, 2). Phylogenetic analysis of ASPM has revealed strong positive selection in the primate lineage leading to Homo sapiens (3–5), especially in the past 6 million years of hominid evolution in which ASPM acquired about one advantageous amino acid change every 350,000 years (4). These data argue that ASPM may have contributed to human brain evolution (3–6). Here, we investigate whether positive selection has continued to operate on ASPM since the emergence of anatomically modern humans.
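As a quick check on the rate quoted above, the following minimal sketch converts "one advantageous amino acid change every 350,000 years" over 6 million years into an implied number of changes. It assumes a constant rate over the whole interval, which is a simplification introduced here; the function name is illustrative only.

```python
# Figures quoted in the text.
HOMINID_SPAN_YEARS = 6_000_000   # "past 6 million years of hominid evolution"
YEARS_PER_CHANGE = 350_000       # one advantageous change per 350,000 years


def implied_changes(span_years, years_per_change):
    """Number of changes implied by a constant rate over the span (assumption)."""
    return span_years / years_per_change


changes = implied_changes(HOMINID_SPAN_YEARS, YEARS_PER_CHANGE)
print(f"about {changes:.0f} advantageous amino acid changes")
```

Under that assumption the quoted rate corresponds to roughly 17 advantageous changes since the start of the interval.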
Human ASPM has 28 exons with a 10,434–base pair open reading frame (1) (fig. S1). We resequenced the entire 62.1-kb genomic region of ASPM in samples from 90 ethnically diverse individuals obtained through the Coriell Institute and from a common chimpanzee (7). This revealed 166 polymorphic sites (table S1). Using established methodology (7), we identified 106 haplotypes. One haplotype, numbered 63, had an unusually high frequency of 21%, whereas the other haplotypes ranged from 0.56% to 3.3% (fig. S2). Moreover, this haplotype differed consistently from the others at multiple polymorphic sites (save for a few rare haplotypes that are minor mutational or recombinational variants of haplotype 63, as discussed later) (table S2). Two of these polymorphic sites are nonsynonymous, both in exon 18, and are denoted A44871G and C45126A (numbers indicate genomic positions from the start codon, and letters at the beginning and end indicate ancestral and derived alleles, respectively). These two sites reside in a region of the open reading frame that was shown previously to have experienced particularly strong positive selection in the lineage leading to humans (4) (fig. S1).
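The haplotype frequencies above can be related to chromosome counts with a little arithmetic. The sketch below assumes that each of the 90 individuals contributes two copies of this autosomal locus (180 chromosomes in total); the interpretation of 0.56% and 3.3% as one and six copies is an inference from that assumption, not a figure stated in the text.

```python
INDIVIDUALS = 90
CHROMOSOMES = 2 * INDIVIDUALS  # assumes two copies per individual


def percent_frequency(copies, total=CHROMOSOMES):
    """Frequency of a haplotype, in percent, given its copy number."""
    return 100.0 * copies / total


# Lowest and highest frequencies quoted for the non-63 haplotypes.
print(f"1 copy   -> {percent_frequency(1):.2f}%")
print(f"6 copies -> {percent_frequency(6):.1f}%")

# Approximate copy number implied by the 21% frequency of haplotype 63.
approx_copies_h63 = round(0.21 * CHROMOSOMES)
print(f"21% of {CHROMOSOMES} chromosomes is about {approx_copies_h63} copies")
```

On this reading, most haplotypes are seen between one and six times, whereas haplotype 63 is seen several dozen times.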
The unusually high frequency of haplotype 63 is strongly suggestive of positive selection (8). We tested the statistical significance of this signal using coalescent modeling (7). The frequency of haplotype 63 is notably higher in Europeans and Middle Easterners (including Iberians, Basques, Russians, North Africans, Middle Easterners, and South Asians), as compared with other populations (table S1). We therefore focused on this group to take advantage of its relatively simple and homogeneous demographic structure (9). Because 7 of the 50 Europeans and Middle Easterners were homozygous for haplotype 63, we tested the probability of obtaining 7 or more homozygotes (among 50) for a single haplotype across a 62.1-kb region containing 122 segregating sites (the number of polymorphic sites found in Europeans and Middle Easterners). The recombination rate and the gene conversion rate of the locus used in the test were obtained from our polymorphism data (7). For demographic history, we assumed a severe bottleneck followed by exponential growth (7) that is likely to be much more stringent than the bottleneck associated with the colonization of Eurasia (10). These parameters produced a highly significant departure from the neutral expectation (P < 0.00001). The simulation was repeated with a wide range of demographic histories (7), all of which produced highly significant results. We repeated the above tests on the entire Coriell panel, which also demonstrated strong significance. Finally, we repeated these tests using the inferred frequency of haplotype 63 (instead of the number of individuals who are homozygous for this haplotype) and again obtained statistically significant results. These data indicate that haplotype 63 has spread to high frequency under positive selection. The two nonsynonymous polymorphisms previously mentioned are either the target of selection or closely linked to the target site.
We define haplogroup D (where D stands for “derived”) as the class of haplotypes with the derived G allele at the A44871G nonsynonymous polymorphic site. The haplotypes with the A allele are defined as non-D. This classification is meant to capture the notable structure in the 106 haplotypes. Haplogroup D comprises two subgroups. One contains the predominant haplotype 63 and its closely related mutational variants [including haplotypes 64, 66, and 71 (table S2)]. The other contains recombinant haplotypes between haplotype 63 (or its close mutational variants) and non-D haplotypes (including haplotypes 62, 65, and 105). The frequency of haplogroup D chromosomes is 28% in the entire Coriell panel and 44% in Europeans and Middle Easterners. Another prominent feature separating D and non-D haplotypes is the fact that the two classes have fixed differences relative to each other at multiple sites, where D haplotypes have the derived alleles (except for the rare recombinants between D and non-D chromosomes). This unusual haplotype structure is consistent with the following evolutionary history: (i) a rapid increase of haplotype 63 from a single ancestral copy to high frequency, and (ii) the introduction of minor variants of haplotype 63 by mutation and by recombination between D and non-D chromosomes (these variants make up the other members of haplogroup D).
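The D versus non-D classification described above can be sketched in a few lines. The data layout (one dictionary per chromosome, mapping site name to allele) and the function names are hypothetical and chosen only for illustration; the toy chromosomes are not real data.

```python
DIAGNOSTIC_SITE = "A44871G"   # nonsynonymous site used to define haplogroup D
DERIVED_ALLELE = "G"          # derived allele at that site


def haplogroup(alleles):
    """Return 'D' if the chromosome carries the derived G allele, else 'non-D'."""
    return "D" if alleles[DIAGNOSTIC_SITE] == DERIVED_ALLELE else "non-D"


def d_frequency(chromosomes):
    """Fraction of chromosomes assigned to haplogroup D."""
    d_count = sum(1 for c in chromosomes if haplogroup(c) == "D")
    return d_count / len(chromosomes)


# Toy input (hypothetical): four chromosomes, two with the derived allele.
toy = [{"A44871G": "G"}, {"A44871G": "A"}, {"A44871G": "A"}, {"A44871G": "G"}]
print(d_frequency(toy))

# Consistency check on reported figures, assuming 180 chromosomes
# (90 individuals, two copies each) and the 50 D chromosomes cited in the text.
print(f"{100 * 50 / 180:.1f}%")
```

The last line gives a value that rounds to the 28% reported for the entire Coriell panel.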
Another signature of positive selection is extended linkage disequilibrium (LD) (8), which is evident in the 62.1-kb region. Indeed, the 50 haplogroup D chromosomes show nearly complete LD across the region, with only three cases of LD breakdown [haplotypes 62, 65, and 105, present in one, two, and one copy, respectively (table S2)]. In contrast, the non-D chromosomes do not show any unusual LD across the region. We investigated the decay of LD beyond the 62.1-kb region by sequencing the Coriell panel for two flanking intergenic segments of roughly 5 kb each, positioned 25 kb upstream and downstream of ASPM. There is notable LD breakdown in these two segments, confirming that the site of selection is most likely within ASPM.
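For readers unfamiliar with how LD between two sites is quantified, the sketch below computes two standard pairwise measures, D' and r², from phased haplotypes. This is a generic textbook calculation and not necessarily the measure used in the study; the input format (one string of alleles per chromosome) and the toy haplotypes are assumptions made for illustration.

```python
def ld_stats(haplotypes, i, j, allele_i, allele_j):
    """Return (D', r^2) between sites i and j for the given reference alleles."""
    n = len(haplotypes)
    p_a = sum(h[i] == allele_i for h in haplotypes) / n
    p_b = sum(h[j] == allele_j for h in haplotypes) / n
    p_ab = sum(h[i] == allele_i and h[j] == allele_j for h in haplotypes) / n

    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d_prime, r2


# Toy haplotypes (hypothetical): two sites per chromosome.
complete = ["GA", "GA", "AC", "AC"]      # alleles always travel together
shuffled = ["GA", "GC", "AA", "AC"]      # every combination equally common
print(ld_stats(complete, 0, 1, "G", "A"))
print(ld_stats(shuffled, 0, 1, "G", "A"))
```

The first toy set illustrates complete LD, as described for the haplogroup D chromosomes, and the second illustrates its absence.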
Following established methodology (7), we estimated the coalescence age (i.e., time to the most recent common ancestor) of haplogroup D at 5800 years, with a 95% confidence interval between 500 and 14,100 years. In comparison, the coalescence age of all the chromosomes (both D and non-D) is ∼800,000 years. Thus, the age of haplogroup D substantially postdates the emergence of anatomically modern humans, estimated at ∼200,000 years ago (11). A rough calculation showed that the fitness advantage of haplogroup D is in the range of a few percent over non-D haplotypes.
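The text does not spell out the rough fitness calculation, so the following is only one way such an estimate might be sketched, not the authors' method. It uses a deterministic model of genic selection in which the log-odds of the allele frequency increase by about s per generation. The generation time (25 years), the effective population size (10,000), and the starting frequency of one copy in 2N chromosomes are all assumptions introduced here.

```python
import math

AGE_YEARS = 5800            # coalescence age of haplogroup D (from the text)
CURRENT_FREQ = 0.28         # haplogroup D frequency in the Coriell panel (from the text)
GENERATION_YEARS = 25       # assumption, not from the text
EFFECTIVE_POP_SIZE = 10_000 # assumption, not from the text


def logit(p):
    """Log-odds of a frequency."""
    return math.log(p / (1.0 - p))


def selection_coefficient(p_start, p_end, generations):
    """Approximate s under deterministic genic selection (logistic growth)."""
    return (logit(p_end) - logit(p_start)) / generations


generations = AGE_YEARS / GENERATION_YEARS
p_start = 1.0 / (2 * EFFECTIVE_POP_SIZE)   # a single new copy
s = selection_coefficient(p_start, CURRENT_FREQ, generations)
print(f"s is roughly {100 * s:.1f}% per generation")
```

With these assumed inputs the estimate comes out at a few percent, the same order of magnitude as the figure stated above. Different generation times or starting frequencies shift the value, and the wide confidence interval on the age shifts it considerably more.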
By genotyping the A44871G diagnostic polymorphism described earlier, we obtained the global frequency distribution of haplogroup D chromosomes from a separate panel of 1186 individuals (7). Consistent with the Coriell panel, we observed a much higher frequency of haplogroup D chromosomes in Europeans and Middle Easterners than in other populations (Fig. 1). The corresponding estimate of FST, a statistic of genetic differentiation, is 0.29 between Europeans/Middle Easterners and other populations and 0.31 between Europeans/Middle Easterners and sub-Saharan Africans. These values indicate considerable genetic differentiation at this locus (12). Several scenarios may account for such notable differentiation. One is that haplogroup D first arose somewhere in Eurasia and is still in the process of spreading to other regions. Another is that it arose in sub-Saharan Africa, but reached higher frequency outside of Africa partly because of the bottleneck during human migration out of Africa. Finally, it is possible that differential selective pressure in different geographic regions is partly responsible.
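To illustrate what an FST value measures, the sketch below computes a simple heterozygosity-based FST for one biallelic site in two equally weighted populations. The estimator actually used in the study is not described in this text, so this is a generic definition, and the input frequencies are hypothetical placeholders rather than the panel's data.

```python
def fst_two_populations(p1, p2):
    """FST = (H_T - H_S) / H_T for one biallelic site, two equally weighted populations."""
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)                           # pooled heterozygosity
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population
    if h_total == 0.0:
        return 0.0  # site is monomorphic overall, so there is no differentiation to measure
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies, for illustration only.
print(fst_two_populations(0.5, 0.1))
print(fst_two_populations(0.3, 0.3))  # identical frequencies give 0.0
print(fst_two_populations(1.0, 0.0))  # fixed differences give 1.0
```

FST ranges from 0 (identical allele frequencies) to 1 (fixed differences), which is the scale on which the reported values of 0.29 and 0.31 should be read.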
Collectively, our data offer strong evidence that haplogroup D emerged very recently and subsequently rose to high frequency under strong positive selection. The recent selective history of ASPM in humans thus continues the trend of positive selection that has operated at this locus for millions of years in the hominid lineage (3–5). Although the age of haplogroup D and its geographic distribution across Eurasia roughly coincide with two important events in the cultural evolution of Eurasia—namely, the emergence and spread of domestication from the Middle East ∼10,000 years ago (13) and the rapid increase in population associated with the development of cities and written language 5000 to 6000 years ago around the Middle East (14)—the significance of this correlation is not yet clear.
Supporting Online Material
www.sciencemag.org/cgi/content/full/309/5741/1720/DC1
Materials and Methods
Figs. S1 and S2
Tables S1 and S2
References and Notes
- Received for publication 30 June 2005.
- Accepted for publication 10 August 2005.