Iranian_Mazandarani
Iran_Chalcolithic 42.05
Iran_IA:F38 31.35
Iran_Neolithic 14.95
Andronovo_Kytmanovo 7.6
Yamnaya-Catacomb_Ulan 1.95
Han 1.35
Andamanese_Onge 0.75
Papuan 0
distance%=0.285 / distance=0.00285

Iranian_Zoroastrian
Iran_Chalcolithic 40.95
Iran_IA:F38 38.25
Yamnaya-Catacomb_Ulan 11.25
Iran_Neolithic 6.45
Andronovo_Kytmanovo 2.8
Andamanese_Onge 0.15
Han 0.15
Papuan 0
distance%=0.2979 / distance=0.002979

Kurdish
Iran_IA:F38 96.3
Andronovo_Kytmanovo 2.35
Han 0.6
Andamanese_Onge 0.55
Papuan 0.2
Iran_Chalcolithic 0
Iran_Neolithic 0
Yamnaya-Catacomb_Ulan 0
distance%=1.3845 / distance=0.013845

Latvian
Yamnaya_Peshany 44
Loschbour 30.65
Sweden_MN:Gokhem4 25.35
Barcin_Neolithic 0
Ulchi 0
distance%=1.2569 / distance=0.012569

Polish
Sweden_MN:Gokhem4 39.4
Yamnaya_Peshany 39
Loschbour 21.6
Barcin_Neolithic 0
Ulchi 0
distance%=0.5208 / distance=0.005208

Swedish
Sweden_MN:Gokhem4 42.15
Yamnaya_Peshany 36.55
Loschbour 21.3
Barcin_Neolithic 0
Ulchi 0
distance%=0.939 / distance=0.00939

Kalash
Iran_Neolithic 54.05
Yamnaya-Catacomb_Ulan 25.5
Andronovo_Kytmanovo 9.35
Han 7.6
Andamanese_Onge 1.95
Papuan 1.55
distance%=0.4592 / distance=0.004592

Pashtun_Afghanistan
Iran_Neolithic 54.8
Andronovo_Kytmanovo 31
Han 7.5
Yamnaya-Catacomb_Ulan 3.5
Andamanese_Onge 1.75
Papuan 1.45
distance%=0.4921 / distance=0.004921

Pathan
Iran_Neolithic 55.25
Yamnaya-Catacomb_Ulan 19.1
Andronovo_Kytmanovo 12.3
Han 8.45
Andamanese_Onge 3.4
Papuan 1.5
distance%=0.4848 / distance=0.004848

Tajik_Pomiri
Iran_Neolithic 41.7
Andronovo_Kytmanovo 30.65
Yamnaya-Catacomb_Ulan 19.15
Han 6.95
Andamanese_Onge 1.05
Papuan 0.5
distance%=0.5318 / distance=0.005318

Admittedly, these estimates look very conservative, but certainly not out of the ballpark. I suspect that I'll be able to improve the models and statistical fits as new Bronze Age steppe samples become available. Indeed, I'll be updating the spreadsheet above regularly.
Friday, July 22, 2016
I've got a new test. Currently I'm only using it to explore ancient genomes, but at some point I'll make another version available to the general public, one way or another. However, that might take a little bit of work and time, to mitigate the calculator effect and so on. Below is a spreadsheet featuring a wide range of ancient and present-day samples from recent papers. A table with the Fst genetic distances between the seven ancestral populations is available here. Also, using the K7 ancestry proportions, I modeled the ancient ancestry of a few present-day populations from the Near East, Northern Europe and South Central Asia with the nMonte R script. Bronze Age steppe admixture in groups from the latter two regions is usually inferred at 40-50% with tools based on formal stats, such as qpAdm and TreeMix, so I wanted to check if I could reproduce such results.
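For anyone unfamiliar with this kind of modeling, here's a minimal sketch of the general idea: look for the mix of source populations whose weighted-average ancestry proportions land closest to those of the target. To be clear, this is not the nMonte script. It's a plain grid search in Python that uses Euclidean distance as the fit measure (my assumption for the sketch), and the function names and the three-component vectors below are made up for illustration rather than taken from the K7 spreadsheet.

```python
from itertools import product
from math import sqrt


def mix(weights, sources):
    """Weighted average of the source ancestry vectors."""
    k = len(sources[0])
    return [sum(w * s[i] for w, s in zip(weights, sources)) for i in range(k)]


def distance(a, b):
    """Euclidean distance between two ancestry vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_mixture(target, sources, steps=20):
    """Try every weight combination in multiples of 1/steps that sums to 1
    and return (weights, distance) for the closest fit to the target."""
    best = None
    n = len(sources)
    for combo in product(range(steps + 1), repeat=n - 1):
        if sum(combo) > steps:
            continue  # the last weight would be negative
        counts = list(combo) + [steps - sum(combo)]
        weights = [c / steps for c in counts]
        d = distance(target, mix(weights, sources))
        if best is None or d < best[1]:
            best = (weights, d)
    return best


# Hypothetical three-component ancestry vectors (not real K7 values).
sources = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
# A target built as an exact 50/30/20 mix, so the search has a known answer.
target = mix([0.5, 0.3, 0.2], sources)
weights, d = fit_mixture(target, sources)
print(weights, d)
```

A real target is never an exact mix of the available sources, so the leftover distance is what tells you how well the model fits. Judging by the figures reported with each model in this post, the distance% value is simply the distance multiplied by 100.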
Thursday, July 14, 2016
Open access at Science:
Abstract: We sequenced Early Neolithic genomes from the Zagros region of Iran (eastern Fertile Crescent), where some of the earliest evidence for farming is found, and identify a previously uncharacterized population that is neither ancestral to the first European farmers nor has contributed significantly to the ancestry of modern Europeans. These people are estimated to have separated from Early Neolithic farmers in Anatolia some 46-77,000 years ago and show affinities to modern day Pakistani and Afghan populations, but particularly to Iranian Zoroastrians. We conclude that multiple, genetically differentiated hunter-gatherer populations adopted farming in SW-Asia, that components of pre-Neolithic population structure were preserved as farming spread into neighboring regions, and that the Zagros region was the cradle of eastward expansion.
Broushaki et al., Early Neolithic genomes from the eastern Fertile Crescent, Science 14 Jul 2016, DOI: 10.1126/science.aaf7943
Economic overhaul + population shift in Late Neolithic Iran
qpAdm tour of Iran
Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic
Monday, July 11, 2016
Very interesting new preprint at bioRxiv:
Abstract: It is a long standing question as which genes define the characteristic facial features among different ethnic groups. In this study, we use Uyghurs, an ancient admixed population to query the genetic bases why Europeans and Han Chinese look different. Facial trait variations were analyzed based on high dense 3D facial images; numerous biometric spaces were examined for divergent facial features between European and Han Chinese, ranging from inner-landmarks to dense shape geometrics. A series of genome-wide association analyses were conducted on a discovery panel of Uyghurs. Six significant loci were identified and four of which, rs1868752, rs118078182, rs60159418 at or near UBASH3B, COL23A1, PCDH7 and rs17868256 were replicated in two independent cohorts of Uyghurs or Southern Han Chinese. We further developed a quantitative model to predict 3D faces based on 277 top GWAS SNPs. In hypothetic forensic scenarios, this model was found to significantly enhance the rate of suspect verification, suggesting a practical potential of related research.

Lu Qiao et al., Detecting Genome-wide Variants of Eurasian Facial Shape Differentiation: DNA based Face Prediction Tested in Forensic Scenario, bioRxiv, posted July 11, 2016, doi: http://dx.doi.org/10.1101/062950
Back in May I hypothesized that present-day East Asians were prehistoric hybrids of partly Ancient North Eurasian (ANE) origin. I got the idea from a series of TreeMix runs (see here). This was essentially confirmed recently in the Lazaridis et al. 2016 preprint. Refer to page 147 in the paper's supplementary information PDF here. However, based on more recent TreeMix runs featuring data from Lazaridis et al., I'd say the situation is more complex than just some minor ANE-related admixture in East Asians. I suspect now that all East Asians, including even the Onge, an ancient isolate population from the Andaman Islands, harbor significant ANE-related ancestry that may have arrived in East Asia in separate waves. Here's what I'm talking about. Note that all of the samples on the East Asian node - Upper Paleolithic west Siberian forager Ust-Ishim, Han Chinese and Onge - are influenced by a massive migration edge from the base of the AG3-MA1 or ANE branch. However, as per the second graph, only the ancestors of more northerly East Asians, like those of the Han, appear to have been recipients of the latest ANE-related admixture into East Asia. East and West Eurasians separated at least 45,000 years ago, but...
Saturday, July 9, 2016
Lazaridis et al. showed that their Steppe_EMBA grouping, which included Afanasievo, Poltavka and Yamnaya, as well as two Potapovka samples, one Russia_EBA sample and one Srubnaya_outlier sample, was best modeled in the following two ways using qpAdm:
Steppe_EMBA
Eastern Hunter-Gatherer (EHG) 0.568
Iran Chalcolithic (Iran_ChL) 0.432

Steppe_EMBA
Caucasus Hunter-Gatherer (CHG) 0.181
Eastern Hunter-Gatherer (EHG) 0.527
Iran Chalcolithic (Iran_ChL) 0.292

I'm not a huge fan of either of these models, but especially the first one, even though I understand that they're both statistically very sound. For one, the uniparental markers don't match, and two, TreeMix seems to disagree (see here).
Outgroups
Anatolia_Neolithic, Andamanese_Onge, Chukchi, Han, Israel_Natufian, Karitiana, Kostenki14, Levant_Neolithic, MA1, Mbuti.DG, Papuan, WHG

Steppe_EMBA
Anatolia Chalcolithic (Anatolia_ChL) 0.128
Caucasus Hunter-Gatherer (CHG) 0.375
Eastern Hunter-Gatherer (EHG) 0.497

As far as I can tell, it's a very decent fit, especially considering that I'm using 12 outgroups and three reference populations. To me, at least, the standard errors look surprisingly low for such a complex model: 0.033, 0.046 and 0.020, respectively.

Now, I'm not arguing here that Chalcolithic Anatolia is the answer. What I'm saying is that multiple lines of evidence do not support Chalcolithic Iran as a real source of admixture for Steppe_EMBA, and I'm offering what I see as a plausible alternative among the currently available samples.

I know that this is a work in progress for the Broad MIT/Harvard team, and we'll have to wait for more ancient samples and another paper or two before a consensus is reached on the topic. But here's my prediction: Steppe_EMBA only has 10-15% admixture from the post-Mesolithic Near East not including the North Caucasus, and basically all of this comes via female mediated gene flow from farming communities in the Caucasus and perhaps present-day Ukraine.
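To put the remark about low standard errors into numbers, here's a small Python sketch that divides each coefficient of the Anatolia_ChL + CHG + EHG model by its standard error and checks that the coefficients sum to one. The figures are the ones quoted above; treating the ratio as a rough Z-score is my own shorthand for the sketch, not something reported by qpAdm in this form.

```python
# Coefficient and standard error for each reference population in the
# three-way Steppe_EMBA model quoted in this post.
model = {
    "Anatolia_ChL": (0.128, 0.033),
    "CHG": (0.375, 0.046),
    "EHG": (0.497, 0.020),
}


def z_scores(model):
    """Return coefficient / standard error for each reference population."""
    return {pop: coef / se for pop, (coef, se) in model.items()}


scores = z_scores(model)
total = sum(coef for coef, _ in model.values())

for pop, z in scores.items():
    print(f"{pop}: coefficient is {z:.2f} standard errors from zero")
print(f"sum of coefficients = {total:.3f}")
```

Even the smallest coefficient, the one for Anatolia_ChL, sits close to four standard errors away from zero, which is why the fit looks tight for a three-way model.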
Friday, July 8, 2016
Open access at Genome Biology and Evolution:
Abstract: In a recent interdisciplinary study, Das and co-authors have attempted to trace the homeland of Ashkenazi Jews and of their historical language, Yiddish (Das et al. 2016. Localizing Ashkenazic Jews to Primeval Villages in the Ancient Iranian Lands of Ashkenaz. Genome Biology and Evolution). Das and co-authors applied the geographic population structure (GPS) method to autosomal genotyping data and inferred geographic coordinates of populations supposedly ancestral to Ashkenazi Jews, placing them in Eastern Turkey. They argued that this unexpected genetic result goes against the widely accepted notion of Ashkenazi origin in the Levant, and speculated that Yiddish was originally a Slavic language strongly influenced by Iranian and Turkic languages, and later remodeled completely under Germanic influence. In our view, there are major conceptual problems with both the genetic and linguistic parts of the work. We argue that GPS is a provenancing tool suited to inferring the geographic region where a modern and recently unadmixed genome is most likely to arise, but is hardly suitable for admixed populations and for tracing ancestry up to 1000 years before present, as its authors have previously claimed. Moreover, all methods of historical linguistics concur that Yiddish is a Germanic language, with no reliable evidence for Slavic, Iranian, or Turkic substrata.

Flegontov et al., Pitfalls of the geographic population structure (GPS) approach applied to human genetic history: A case study of Ashkenazi Jews, Genome Biol Evol (2016) doi: 10.1093/gbe/evw162

See also...
Khazar shmazar
Irano-Turko-Slavic roots of Ashkenazi Jews?
Wednesday, July 6, 2016
Just wanted to see if I could model Early Neolithic versus Chalcolithic Zagros farmer ancestry in present-day Iranians using qpAdm. I reckon I can, more or less. The outcomes below are all fairly solid statistical fits, especially considering the complexity of the models and the close similarity between the Early Neolithic and Chalcolithic Zagros farmers. Update 23/08/2016: I added most of the Iranian Zoroastrians from Broushaki et al. 2016 to the analysis.
Outgroups
Bichon, Chukchi, Karelia_HG, Karitiana, Kostenki14, Levant_Neolithic, MA1, Mbuti.DG, Mota, Papuan, Ust_Ishim

Iranian_Bandari
Iran_Chalcolithic 0.136 ± 0.121
Iran_Neolithic 0.631 ± 0.152
Yamnaya_Samara 0.164 ± 0.033
Han 0.026 ± 0.017
Yoruba 0.044 ± 0.013

Iranian_Lor
Iran_Chalcolithic 0.723 ± 0.078
Iran_Neolithic 0.106 ± 0.079
Yamnaya_Samara 0.130 ± 0.024
Han 0.041 ± 0.011

Iranian_Mazandarani
Iran_Chalcolithic 0.558 ± 0.066
Iran_Neolithic 0.209 ± 0.065
Yamnaya_Samara 0.178 ± 0.022
Han 0.055 ± 0.010

Iranian_Persian
Iran_Chalcolithic 0.617 ± 0.064
Iran_Neolithic 0.181 ± 0.062
Yamnaya_Samara 0.148 ± 0.022
Han 0.054 ± 0.010

Iranian_Zoroastrian
Iran_Chalcolithic 0.592 ± 0.061
Iran_Neolithic 0.172 ± 0.061
Yamnaya_Samara 0.209 ± 0.021
Han 0.027 ± 0.010

However, please note that despite the close similarity between the Early Neolithic and Chalcolithic Zagros farmers, the latter did not, for the most part, descend from the former. In fact, it's very likely that the Chalcolithic farmers were largely, or perhaps even entirely, derived from newcomers to present-day Iran from somewhere to the west of the Zagros Mountains (see here).

It's true that in the basic four-way qpAdm model in Lazaridis et al. the Chalcolithic Zagros farmers are largely modeled as Neolithic Zagros farmers (or Iran_N). However, a more comprehensive analysis in the same paper explains them as a mixture of Caucasus Hunter-Gatherers (CHG), Neolithic farmers from the Levant, and Neolithic Zagros farmers, with admixture ratios of 0.631, 0.202 and 0.167, respectively. I can basically reproduce the same model with the outgroups listed above, except with Israel_Natufian in place of Levant_Neolithic, which I have to use as one of the reference populations.
Iran_Chalcolithic
Caucasus_HG 0.522 ± 0.111
Iran_Neolithic 0.246 ± 0.108
Levant_Neolithic 0.232 ± 0.026

The qpAdm algorithm is freely available at GitHub here. All of the present-day and ancient samples are freely available at the Reich Lab website here.

See also...
Ulan IV
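As a rough check on how closely the Iran_Chalcolithic model above matches the one in Lazaridis et al., here's a short Python sketch that expresses the gap between the two sets of admixture ratios in units of my standard errors. It's only a back-of-the-envelope comparison: it ignores the uncertainty in the published ratios, which isn't quoted in this post, and the dictionary keys are simply the population labels used above.

```python
# Admixture ratios for Iran_Chalcolithic quoted from Lazaridis et al.
published = {
    "Caucasus_HG": 0.631,
    "Levant_Neolithic": 0.202,
    "Iran_Neolithic": 0.167,
}

# Coefficient and standard error from the reproduced qpAdm model in this post.
reproduced = {
    "Caucasus_HG": (0.522, 0.111),
    "Iran_Neolithic": (0.246, 0.108),
    "Levant_Neolithic": (0.232, 0.026),
}


def gap_in_se(published, reproduced):
    """Absolute difference between the two estimates for each population,
    divided by the standard error of the reproduced estimate."""
    return {
        pop: abs(published[pop] - coef) / se
        for pop, (coef, se) in reproduced.items()
    }


gaps = gap_in_se(published, reproduced)
for pop, gap in gaps.items():
    print(f"{pop}: {gap:.2f} standard errors apart")
```

All three published ratios fall within about one standard error of the reproduced coefficients, which is what I mean by basically reproducing the same model.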
Monday, July 4, 2016
Courtesy of Arbuckle et al. at the Journal of Archaeological Science. Emphasis is mine:
Abstract: In this paper we address the timing of and mechanisms for the appearance of domestic cattle in the Eastern Fertile Crescent (EFC) region of SW Asia through the analysis of new and previously published species abundance and biometric data from 86 archaeofaunal assemblages. We find that Bos exploitation was a minor component of animal economies in the EFC in the late Pleistocene and early Holocene but increased dramatically in the sixth millennium BC. Moreover, biometric data indicate that small sized Bos, likely representing domesticates, appear suddenly in the region without any transitional forms in the early to mid sixth millennium BC. This suggests that domestic cattle were imported into the EFC, possibly associated with the spread of the Halaf archaeological culture, several millennia after they first appear in the neighboring northern Levant.

These findings more or less correlate with the results in the new Lazaridis et al. preprint:
During subsequent millennia, the early farmer populations of the Near East expanded in all directions and mixed, as we can only model populations of the Chalcolithic and subsequent Bronze Age as having ancestry from two or more sources. The Chalcolithic people of western Iran can be modelled as a mixture of the Neolithic people of western Iran, the Levant, and Caucasus Hunter Gatherers (CHG), consistent with their position in the PCA (Fig. 1b).

In other words, the small cows weren't just imported into the Eastern Fertile Crescent; they came with people who also made a major genetic impact on the region. Here's my own PCA featuring the relevant Lazaridis et al. samples. Key: Caucasus_HG = Caucasus Hunter-Gatherer; Iran_ChL = Iran Chalcolithic; Iran_HG = Iran Hunter-Gatherer; Iran_N = Iran Neolithic; Levant_N = Levant Neolithic.

Obviously, this doesn't square too well with the idea of a Proto-Indo-European homeland in the Zagros Mountains of western Iran, does it?

See also...
qpAdm tour of Iran
Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic
Saturday, July 2, 2016
If Indo-Iranian languages didn't expand from the Andronovo horizon, but rather from an earlier archaeological steppe culture, which is what it seems like based on the latest analysis of ancient genomes from the steppe (see page 123 here), then I reckon the best option is the Catacomb Culture. As far as I can tell, one of the Yamnaya samples from Allentoft et al. 2015, RISE552 from the Ulan IV burial, might actually be a Catacomb sample. That's because Ulan IV is classified as a West Manych Catacomb Culture site. Check out this awesome paper on one of the graves from this site here.
Outgroups
Bichon, Chukchi, Israel_Natufian, Karitiana, Kostenki14, MA1, Mbuti.DG, Papuan, Ust_Ishim

Kalash
Ulan_IV 0.609 ± 0.051
Iran_Neolithic 0.184 ± 0.066
Andamanese_Onge 0.175 ± 0.041
Han 0.032 ± 0.023

I'm not saying this model is definitive by any stretch, but it's more or less statistically sound, with fairly low standard errors for each of the coefficients (0.051, 0.066, 0.041, 0.023 respectively). It's also very similar to the optimal qpAdm model of the Kalasha in Lazaridis et al. 2016. Interestingly, it also matches closely a TreeMix analysis that I posted at my other blog last year, months before I even knew that ancient genomes from Neolithic Iran were on the way (see here). This is what I said in that blog entry:
Both of these models are correct; they just show the same thing in different ways. So if we mesh them together the Kalash and Pathans come out ~65% LNE/EBA European (which includes substantial Caucasus or Caucasus-related ancestry), ~12% ASI, and ~23% something as yet undefined. If I had to guess, I'd say the mystery ~23% was Neolithic admixture from what is now Iran.

That's not bad considering how difficult it is to make predictions about ancient population movements without direct evidence from ancient DNA. In any case, it's a lot better than what has been published on the topic in some major journals.