
Wednesday, June 22, 2016

Yamnaya =/= Eastern Hunter-Gatherers + Iran Chalcolithic


The fully public version of the Lazaridis et al. 2016 dataset is now available for download at the Reich Lab website here. Many thanks to the authors for releasing their data before formal publication, and in fact apparently even before the end of the peer review.

As usual from this team, it's high quality stuff with hundreds of thousands of SNPs genotyped in most of the 45 ancient samples from the Near East. And to think that only a couple of years ago the idea of getting genome-wide data from even a single ancient individual from hot places like the Near East was just that, an idea.

I'm planning to do a lot with this data, but the first issue I want to tackle is the genetic structure of the Yamnaya pastoralists from the Early Bronze Age (EBA) European steppes.

Lazaridis et al. show that Early to Middle Bronze Age steppe groups, including Yamnaya, tagged by them as Steppe EMBA, are best modeled with formal statistics as a mixture of Eastern European Hunter-Gatherers (EHG) and Chalcolithic farmers from western Iran. The mixture ratios are 56.8/43.2, respectively.
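As a rough illustration of what a 56.8/43.2 mixture means at the level of allele frequencies, here is a minimal Python sketch. The SNP frequencies and the function name mixed_frequency are invented for the example; they are not taken from the Lazaridis et al. dataset.

```python
# Mixture weights quoted above for Steppe EMBA (EHG / Iran Chalcolithic).
W_EHG, W_IRAN_CHL = 0.568, 0.432


def mixed_frequency(p_ehg, p_iran_chl, w_ehg=W_EHG):
    """Expected allele frequency in a two-way mixture of the two sources."""
    return w_ehg * p_ehg + (1.0 - w_ehg) * p_iran_chl


# Hypothetical allele frequencies at three SNPs (illustrative, not real data).
ehg = [0.10, 0.80, 0.50]
iran_chl = [0.60, 0.20, 0.50]
steppe_expected = [mixed_frequency(a, b) for a, b in zip(ehg, iran_chl)]
```

Under this simple model, every SNP in the mixed group sits between the two source frequencies, closer to the EHG value because EHG carries the larger weight.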

However, they add that a model of Steppe EMBA as a three-way mixture between EHG, the Chalcolithic farmers and Caucasus Hunter-Gatherers (CHG) is also a good fit and plausible.

I've looked at the topic before and concluded that Yamnaya had to be in large part of CHG origin, with only minor admixture from early farmers, probably from Eastern Europe (see here and here). After having a chance to study the data from Lazaridis et al. 2016, I stand by my earlier results.

Below are a couple of TreeMix graphs featuring Yamnaya alongside a variety of modern and ancient groups, including several potentially relevant to its ancestry, such as Armenia_Chalcolithic and Iran_Chalcolithic from Lazaridis et al. 2016. The full output is available for download here.

Now, it is true that TreeMix is a temperamental algorithm. It can react in extreme ways to the types of samples chosen by the user, often showing results that might appear wrong, or at the very least counter-intuitive. On the other hand, my experience shows that it's also exceptionally effective at picking up and characterizing significant and relatively sudden pulses of admixture. Moreover, unlike modeling with formal stats, it's an unsupervised test.

Clearly, the graphs below are very much at odds with the claim that Yamnaya might be in large part of Iranian Chalcolithic or similar ancestry. As per my earlier tests, it appears to be overwhelmingly a mixture between EHG and CHG.



It's also important to note that the uniparental marker data in Lazaridis et al. firmly back up my TreeMix output, with the Steppe EMBA groups showing starkly different Y-chromosome and mitochondrial (mtDNA) haplogroups from the ancient samples from Iran.


Indeed, mtDNA haplogroup U7 is an excellent diagnostic marker for ancestry from the southern Caspian region, and, sure enough, it appears in the Iranian Chalcolithic set. Conversely, it's conspicuous by its absence from all Bronze Age steppe remains tested to date.

Admittedly, it's still extremely difficult to be precise about the source of the southern admixture in Yamnaya without lots of high quality samples from all over the steppe and surrounds. But already Iran looks a highly unlikely proposition.

See also...

Modeling Steppe_EMBA

The story of mtDNA haplogroup U7

Economic overhaul + population shift in Late Neolithic Iran

Indian genetic history in three simple graphs

102 comments:

Gökhan said...

Then you insist on "authantic Kartvelian ancestry " in Yamnaya? I think you ignoring the barrier role of great Caucaus mountains on preventing such a genetic impact!!! Otherwise we should admit that CHG was already in southern russian steppes during Neolethic. My best Candidate for source of southern admixtures in Yamnaya is East of Caspian, modern day Tachikistan.

Davidski said...

Then you insist on "authantic Kartvelian ancestry " in Yamnaya?

No idea what language they spoke, but they were CHG-rich and probably already lived north of the Caucasus somewhere during the Neolithic.

My best Candidate for source of southern admixtures in Yamnaya is East of Caspian, modern day Tachikistan.

Will end up looking as much of a fail on graphs like those above as the Chalcolithic Iranians IMHO.

Nirjhar007 said...

Say Tajikistan ;) ...

I don't care much about the graphs. It's just becoming quite clear that a population from Iran or somewhere similar contributed to the steppes. They could have been agropastoralists or herders. But we must remember that geography is a very significant factor. The closer a sample is to the steppe area, the more compatible it will appear, if I'm not wrong.

Tesmos said...

David, will you/Chad create a new calculator based on these new results? Or is it not worth it?

Davidski said...

@Nirjhar

I don't care much about the graphs.

Bahaha.

It's not a coincidence that Yamnaya is the first sample flagged by TreeMix to be a mixture, and also not a coincidence that the mixture edge comes from CHG.

Iranian Chalcolithic admixture is also totally ruled out by the mtDNA data.

Davidski said...

@Tesmos

We'll have a go at it eventually.

Nirjhar007 said...

Yes Dave Bahahaha...

Krefter said...

I agree CHG is a better proxy than Iran_Chl for Yamnaya's non-EHG side. We'll definitely need more stats, though, to rule out or keep the possibility that Yamnaya had ancestry from Iran. The following parts from Lazaridis 2016 also support the idea that CHG is a better proxy for Yamnaya's non-EHG side.

Figure S7.10
f3(Steppe_EMBA; EHG, CHG)=-0.02 compared to f3(Steppe_EMBA; EHG, Iran_Chl)=-0.015

Figure S7.9
f4(EHG, Steppe_EMBA; CHG, Chimp)=-0.03
f4(EHG, Steppe_EMBA; Iran_Chl, Chimp)=-0.019

EHG admixture in CHG can't explain these stats. If it did, EHG would be pulled closer to CHG, and in Figure S7.9 Steppe_EMBA's excess closeness to CHG relative to EHG would be smaller, not greater, than its excess closeness to Iran_Chl relative to EHG.

Table S7.11, though, shows that Steppe_EMBA's closeness to outgroups (in f4 stats) can be explained slightly better as Iran_Chl+EHG. The difference is tiny though.
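For readers who haven't worked with these statistics, here is a minimal Python sketch of how f3 and f4 are defined on allele frequencies. It uses the plain textbook definitions on made-up frequencies; it leaves out the finite-sample bias correction and the block-jackknife standard errors that real software applies, so it is not how the values quoted above were computed.

```python
import numpy as np


def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    a, b, c, d = (np.asarray(x, dtype=float) for x in (a, b, c, d))
    return float(np.mean((a - b) * (c - d)))


def f3(target, src1, src2):
    """f3(Target; Src1, Src2): mean over SNPs of (t - s1) * (t - s2)."""
    t, s1, s2 = (np.asarray(x, dtype=float) for x in (target, src1, src2))
    return float(np.mean((t - s1) * (t - s2)))


# Hypothetical allele frequencies at four SNPs (not real data).
ehg = np.array([0.1, 0.9, 0.2, 0.7])
chg = np.array([0.8, 0.2, 0.9, 0.1])
chimp = np.array([0.0, 1.0, 0.0, 1.0])

# A toy "steppe" population built as an even mix of the two sources.
steppe = 0.5 * ehg + 0.5 * chg

f3_value = f3(steppe, ehg, chg)         # negative: target lies between the sources
f4_value = f4(ehg, steppe, chg, chimp)  # negative: steppe is closer to CHG than EHG is
```

A negative f3(Target; A, B) is the classic signal that the target is a mixture of groups related to A and B, and a more negative f4(EHG, Target; X, Chimp) means the target shares more alleles with X than EHG does.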

MfA said...

That perfectly overlaps with the Maykop culture. Maykop is the link between the steppe and Armenia/Iran Chalcolithic. The increased CHG and EHG levels and the steppe mtDNA in Armenia_ChL suggest that Maykop, though probably not as much as Yamnaya, is going to be quite EHG-rich.

Iosif Lazaridis (Broad) said...

It's great to see the data already being analyzed and I hope it will be useful in your analyses!

I just wanted to leave a brief comment that the model of Steppe_EMBA as a mixture of EHG+CHG is rejected (Table S7.11), while that of EHG+Iran_ChL is not. Note that in Table S7.11 we are modeling Steppe_EMBA and the references with respect to 13 outgroup populations (the set O9ALNW), not all of which are included in the TreeMix graph.

It is possible for some models to succeed with a particular set of outgroups (both EHG+CHG and EHG+Iran_ChL are feasible with only the O9 set of outgroups; Table S7.10), but for some of them to be rejected when additional outgroups are introduced (Table S7.11). As we mention further down, that doesn't mean there is no CHG-related ancestry in Steppe_EMBA as we can model it as a 3-way mixture involving CHG as one of the sources. What it does mean, however, is that CHG+EHG cannot be the only sources, as this model is rejected (Table S7.11). A further test of our overall model is that when we withhold Iran_ChL as a source, and infer mixture proportions by intersecting the EHG->Steppe_EMBA and Levant_N+Levant_BA clines (p. 134), we get fairly reasonable agreement (mixture proportions).

We try to be cautious in our interpretation of the admixture models, because of three factors: (i) we don't know the geographical extent of populations like "CHG" or "Iran_ChL" so admixture from Iran_ChL does not imply admixture from geographical Iran or CHG from the geographical Caucasus, (ii) we do not have samples from many places and it's very likely that slightly different mixtures than the sampled populations existed elsewhere, (iii) it is possible that the actual history of admixture may be more complex than the simplest parsimonious models identified by the analysis.

Overall, our admixture analysis rejects several possible models (such as EHG+CHG) and thus puts constraints on what may have happened, and also proposes some models that are more resilient to rejection (such as EHG+Iran_ChL+CHG). But, by no means should these be regarded as the final word or unique solutions, but rather as one possible way that the data can be modeled.
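To make the role of the outgroups more concrete, here is a toy Python sketch of the general idea behind fitting a two-way mixture with f4 statistics. It is only a simplified least-squares illustration on simulated frequencies. It is not qpAdm, it has no standard errors or formal rejection test, and the names fit_two_way, good_target and bad_target are invented for the example.

```python
import itertools
import numpy as np


def f4(a, b, c, d):
    """f4(A, B; C, D) as the mean over SNPs of (a - b) * (c - d)."""
    return float(np.mean((a - b) * (c - d)))


def fit_two_way(target, src1, src2, outgroups):
    """Least-squares weight of src1 in target, plus the size of the misfit."""
    pairs = list(itertools.combinations(outgroups, 2))
    x = np.array([f4(src1, src2, oi, oj) for oi, oj in pairs])
    y = np.array([f4(target, src2, oi, oj) for oi, oj in pairs])
    alpha = float(x @ y / (x @ x))
    misfit = float(np.linalg.norm(y - alpha * x))
    return alpha, misfit


rng = np.random.default_rng(0)
n_snps = 5000
src1 = rng.uniform(0.05, 0.95, n_snps)
src2 = rng.uniform(0.05, 0.95, n_snps)

# Hypothetical outgroups, each loosely related to both sources.
outgroups = [
    np.clip(w * src1 + (1 - w) * src2 + rng.normal(0, 0.1, n_snps), 0, 1)
    for w in (0.2, 0.4, 0.6, 0.8)
]

# Case 1: the target really is a two-way mix, so the model fits.
good_target = 0.568 * src1 + 0.432 * src2
alpha_good, misfit_good = fit_two_way(good_target, src1, src2, outgroups)

# Case 2: the target also carries ancestry tied to one of the outgroups,
# so the two-source model leaves an unexplained residual.
bad_target = 0.5 * src1 + 0.3 * src2 + 0.2 * outgroups[0]
alpha_bad, misfit_bad = fit_two_way(bad_target, src1, src2, outgroups)
```

In the first case the weight comes back as 0.568 with essentially no misfit. In the second the two-source model cannot absorb the extra ancestry, which is the kind of signal that can show up only once an outgroup sensitive to it is included.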

Nirjhar007 said...

Thank you Doctor :) .

Davidski said...

@Iosif

Have you considered a model with Yamnaya as a three-way mixture of EHG, CHG and an Anatolia Chalcolithic-like group from Eastern Europe or the North Caucasus, like maybe Cucuteni-Tripolie?

Their uniparental markers might be a better match with Yamnaya.

Gioiello said...

@ Davidski

U7a is certainly more diffused in Southern Asia (more India than Iran as a place of expansion), but U7b is very old in Europe and Italy above all. I wrote a lot about that from the first fake paper of Brisighelli. The U6* found in Romania 35000 years ago should make all be cautious.

Davidski said...

Thanks. So we can say with some confidence that Bronze Age steppe groups did not have ancestry from Iran, South Asia or Italy. ;)

Karl_K said...

"So we can say with some confidence that Bronze Age steppe groups did not have ancestry from Iran, South Asia or Italy."

Yes. But we can't rule out South America, Australia, or China.

Nirjhar007 said...

Or Poland...

Gioiello said...

I'd say that we have only a few samples tested, so we cannot form reliable hypotheses about whether the steppes have some samples of U7a, U7b or others. We need much more data. Perhaps autosomal DNA is more reliable, perhaps...

Nirjhar007 said...

Autosomes are never very reliable, especially for narrating language changes. It's bound to get complex and confusing, with more samples coming up...

Gioiello said...

But you all know that I think that R-L23 came to the steppes from Italy or anyway Western Europe. Seen that Lazaridis reads this blog, why Italy has only 5 Y tested and other countries have hundreds? When some tests on Tyrrhenian Italy? Weren't you searching for R-L51?

Ricardo Costa de Oliveira said...

Don't forget the individual from Hotu Cave, whose age and location are important keys to the history. I would like to know what type of Y-DNA J he has in terms of detailed SNPs.

Davidski said...

Possibly the real thing is something like Iran_LN+CHG+EEF.

Yamnaya is mostly EHG and CHG, although there is some extra stuff in there that will be hard to work out. And that creates an extra complexity that is similar by coincidence to Iran Chalcolithic.

ryukendo kendow said...

@ Davidski

When we use qpAdm, as you no doubt would, we can look at residuals with the right pops/outgroups to figure this out, especially comparing the models with CHG vs Iran_LN, as its Iran_LN that is missing the EEF/Levant_N affinity but it still does better.

Nirjhar007 said...

Gioiello,

Are you asking me? Well, mail him, it's a far better option, Dottore.

Roy King said...

@Davidski,
Your Treemix graph shows a migration edge from Yamnaya to Armenia Chalcolithic. This fits with an early migration of Yamnaya ancestry to Armenia and conforms to the similar pattern with Kumtepe4 and the Anatolian Chalcolithic sample. Yet these Chalcolithic samples predate the Yamnaya (circa 4000 BCE for the chalcolithic samples, 3500 BCE for Kumtepe4 and 3000 BCE for Yamnaya). Do you have an explanation for this pattern?

Nirjhar007 said...

Perhaps, Roy, it's the reverse?

Chad Rohlfsen said...

I've got the samples too, along with the Onge. I'll be running many mixes starting tonight or tomorrow.

Karl_K said...

@Nirjhar

"Autosomes are never very reliable"

Great stuff. Keep it coming!

They are where most of the information will be.

It is so much easier for a population to lose a haplogroup than to lose the entire autosome. Two classic examples are the Neanderthal and Denisovan admixture events. We wouldn't know they existed without autosomal data.
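The point about haplogroups being easy to lose can be illustrated with a small neutral-drift simulation in Python. This is a toy Wright-Fisher sketch with made-up parameters (population size, starting frequency, number of loci). It also ignores the fact that autosomes have a larger effective population size than Y-DNA or mtDNA, which would only strengthen the contrast.

```python
import numpy as np

rng = np.random.default_rng(2)
pop_size = 200      # hypothetical number of breeding lineages per generation
start_freq = 0.1    # incoming ancestry starts at 10% at every locus
n_loci = 2000       # independent loci, each drifting on its own
generations = 1000

# Number of copies of the incoming variant at each locus.
counts = np.full(n_loci, round(start_freq * pop_size))
for _ in range(generations):
    # Each generation resamples pop_size lineages from the previous one.
    counts = rng.binomial(pop_size, counts / pop_size)

freqs = counts / pop_size
lost_fraction = float(np.mean(freqs == 0.0))  # loci where the variant vanished
mean_freq = float(np.mean(freqs))             # average ancestry across all loci
```

Any single locus, which is all a Y-chromosome or mtDNA lineage amounts to, loses the incoming variant in roughly nine runs out of ten here, while the average across many loci stays close to the original 10%.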

As for the language and culture, you really have almost no data on that.

Davidski said...

rk,

Here's a graph with Iran Neolithic added.

https://drive.google.com/file/d/0B9o3EYTdM8lQNGE0OFhCSE11NkU/view?usp=sharing

I can try more outgroups tomorrow, but I'll be somewhat surprised if I see a migration edge from Iran Chalcolithic or Neolithic to Yamnaya.

Roy,

The migration edge into Armenia Chalcolithic is from the base of the EHG branch. Yamnaya is on a sister branch because its southern ancestry is taken care of by the migration edge from Caucasus HG.

So that migration edge into Armenia Chalcolithic looks like it's coming from something similar to Samara Eneolithic. But I can try and check that directly tomorrow.

Nirjhar007 said...

Karl,
Autosomes are blunt . SNP's are more indicative and definitive .

As for the data , give me your mail. I will show you data .

Anybud,

Apparently the Hg's can be tested too now?.

Krefter said...

@Ryu,

The strongest F3 signal for Steppe_EMBA is CHG. The stat f4(EHG, Steppe_EMBA; CHG, Chimp) is more negative than the stat f4(EHG, Steppe_EMBA; Iran_Chl or Iran_N, Chimp).

Evidence goes both ways. Like David has said, Iran_Chl has EEF/Natufian-related ancestry, as Yamnaya probably does, so this could be why Yamnaya fits better as Iran_Chl+EHG than CHG+EHG.

Also, as other posters have stated, mtDNA from Iran_Chl rules out the possibility they're an important ancestor of Yamnaya. Maybe a relative, but not Iran_Chl itself. We need lots of CHG mtDNA to know how good of a fit they are.

Davidski said...

TreeMix is bendy alright, but it's very good at picking up strong pulses of admixture.

Yamnaya's non EHG and CHG ancestry on these graphs is probably taken care of by its position on the tree.

REZA said...

Off topic but related to ancient dna studies:

Ancient DNA from Sanganji Jomon in fastq format
https://www.ebi.ac.uk/ena/data/view/PRJEB6943

Ancient DNA from Patagonia and Tierra del Fuego in CRAM format
https://www.ebi.ac.uk/ena/data/view/PRJEB6943

Free Whole Genome Sequence of Andamanese in VCF format
https://www.ebi.ac.uk/ena/data/view/ERZ126015

Karl_K said...

@Nirjhar

"Karl,
Autosomes are blunt . SNP's are more indicative and definitive...
Apparently the Hg's can be tested too now?"

What are you talking about? I like where you're going... but it is so far from crazy I can't exactly understand at the moment.

Davidski said...

rk,

Residuals for the last graph.

https://3.bp.blogspot.com/-9OjkAdqUuQQ/V2qqMB1NF7I/AAAAAAAAEk0/AdVifGzjuUc_BiPjYvqxD1sttsJ69WlcwCLcB/s1600/Residuals.png

rozenfag said...

@ REZA : Is there a corresponding publication for Jomon DNA?

REZA said...

@ rozenfag:
Yes: https://eurogenes.blogspot.com/2015/01/ancient-jomon-people-not-like-present.html

mickeydodds1 said...

In the immortal words of Sir Alex Ferguson, 'it's squeaky bum time'.

Iranocentrist said...

Where is Mike Thomas, the equalizer

Arch Hades said...

Well since CHG is related to Iranian Neolithic, and since we know CHG has been in the Caucasus since the late Upper Paleolithic, considering geographical closeness.... I don't see why the southern strain in the Yamnaya would not be CHG.

But there was some data in this new study by Lazaridis that kind of irked me. For one, their ADMIXTURE model showed CHG to be mixed, modeled around 15-20% EHG, 80-85% Iranian Neolithic. Because of Jones et al I thought CHG represented a 'pure' strain that was in isolation on the Northern side of the West Asian highlands for 10-15 thousand years away from the Near East. Not just part of a genetic strain that was already long in Iran and that was actually diluted a little bit when it entered into the Caucasus.

Atriðr said...

@Gökhan
I think you ignoring the barrier role of great Caucaus [sic] mountains on preventing such a genetic impact

This is a recurring issue I have as well. There needs be to demonstrated how such migrations would have a) traversed the mountains in significant numbers or b) used naval means via the Black or Caspian Seas.

The reason for the linguistic-genetic diversity in the Caucasus is because of those mountains.

Now is also a good time to post this: Finno-Ugric has Indo-Iranian loan words, but Indo-Iranian does not. Or let me re-phrase, modern Indo-Iranian languages have no Finno-Ugric loan words, but Finno-Ugric languages have Indo-Iranian loan words.

In my opinion, the most viable reason for this is that the Indo-Iranians that went south did not yet meet the Finno-Ugrics (for long enough a time period). While another group of Indo-Iranians did.

Which brings me back to the Caucasus mountains.

A vector of Iran_N or Iran_Chlc moving around the Caspian Sea from south to north, then east to west via the Sintashta area (moving up between the Caspian and Aral Seas) ... maybe via BMAC first too (a possible route), although the Sintashta-area route is my stronger leaning. Then Sintashta down to BMAC-IVC, and Sintashta area to Yamnaya.

@Davidski Have you ever plotted this (going through Andronovo before Yamnaya), David?
Would it be possible when you get the chance?

Colin Welling said...

@ gio But you all know that I think that R-L23 came to the steppes from Italy or anyway Western Europe. Seen that Lazaridis reads this blog, why Italy has only 5 Y tested and other countries have hundreds? When some tests on Tyrrhenian Italy? Weren't you searching for R-L51?

You aren't simplifying matters with your theory and each piece is already so implausible.

r1b was in the mesolithic steppe, in an unadmixed guy nonetheless. R1b L23 is dominant in the yamnaya, who lived at a time very close to the genesis of R1b L23 (probably within about 1000 years). The R1b L51 found in bell beakers was associated with the introduction of steppe genes into central europe.

The evidence clearly supports that r1b L23 was born on or very near the steppe and that the R1b L51 lineage came from the steppe during the late neolithic early bronze age.

What you have done with your theory is claim that the Yamnaya are the descendants of recent migrants from Italy, and that the R1b L51 in Bell Beakers came from Italy and therefore has nothing to do with the brand new steppe heritage they introduced to central Europe.

Colin Welling said...

@david Have you considered a model with Yamnaya as a three-way mixture of EHG, CHG and an Anatolia Chalcolithic-like group from Eastern Europe or the North Caucasus, like maybe Cucuteni-Tripolie?

But that would draw the yamnaya too westward, especially with WHG. Remember that we are talking about the Southern yamnaya and the eastern yamnaya. My guess is that the non EHG in these yamnaya samples came from the Southern Caucasus. It makes sense given the Kartvelian connection to PIE.

ak2014b said...

Great that the authors of the paper are commenting here. I can ask them directly.

@Iosif Lazaridis (Broad)

The provided accession number PRJEB14455 to get the BAM files isn't recognised at ebi any more.

Earlier on, I was able to download the sole BAM file I saw for this paper using the number. The file was called I0013.1240k.bam, but no sample in this paper is named I0013. I went back to check if more BAM files had been uploaded in the meantime, and the accession number doesn't seem to exist any more. (The 230 Genomes paper still has all 246 BAM files at its accession number though.)

Has the number changed and if so what is it now?

Matt said...

Haven't read the comment thread yet, but hmm.... The specific reasons given for the model with both Iran and CHG in the paper were that taking admixture from CHG alone gave too much distance from other Near Eastern populations.

Since these treemix have the admixture edge coming from effectively the very base of the CHG, thus closer to Iran_Chalcolithic and Armenia_Chalcolithic, perhaps that gets round this problem?

I'd advise trying the treemix with Anatolia_Neolithic or other "Western ME" farmers in as a test, since that's the condition in which Iran_Chalcolithic admixture became better than CHG alone in the paper.

Again suggests admixture from a population that is CHG like, but slightly more "mainstream" for the early Neolithic Middle East than CHG itself? (CHG is a drifted upper paleolithic "glacial survivor" relative that was less well situated to participate in proto-neolithic cultural / genetic developments that took place in slightly more southerly warmer populations).

Olympus Mons said...


a. Iran or Armenia Chalcolithic is a thousand years after the "southern Caucasus" source of CHG in Yamnaya was kicked out of the Caucasus, when some (not that many) jumped the mountains and at that point became part of the steppe admixture.
b. Only when the 7th/6th millennium BCE is sampled in the southern Caucasus will it be found. The 4th millennium, or even the second half of the 5th, is already too late...
c. Only when the Meshoko culture, and probably even Kalmykia, is sampled will it be perfectly clear.


Colin Welling said...

@ Iosif We try to be cautious in our interpretation of the admixture models, because of three factors: (i) we don't know the geographical extent of populations like "CHG" or "Iran_ChL" so admixture from Iran_ChL does not imply admixture from geographical Iran or CHG from the geographical Caucasus,

in figure 1 of the paper you can see that the middle eastern populations are converging on one another since the neolithic. There was definitely a gradient of admixture in the middle east around the time of the chalcolithic.

BTW, are you guys planning on testing the western Yamnaya? I'm curious whether their non-EHG heritage will be similar to the other Yamnaya samples. I think it will say a lot about the character of the Yamnaya. For example, did the Yamnaya rapidly expand from a single, admixed population? Or were the Yamnaya only connected to one another by their EHG side? I'm also dying to know about the possible connection between Bell Beakers and the western steppe. Thanks.

Olympus Mons said...

Davidski,
"comes from CHG..." errr we are talking about that guy in the land of the shulaveri-shomu, right?! humm. ok.

Olympus Mons said...

@Colin Welling,
There is a conundrum. Both Yamnaya and Bell Beaker will have similar admixture components. It's just that Bell Beaker should have much more CHG, because BB were much less diluted by EHG than Yamnaya.

BB were never, never steppe or Yamnaya. But the main "Bell Beaker" genetics came from a population that went to just north of the Caucasus mountains and a thousand years later became part (probably a small part) of Yamnaya, at the same time as they were starting to spread into western Europe.

So looking just at DNA will help zero, nada, to solve the mystery, unless a lot more samples come into the mix.

So, sample baby, sample.

MfA said...

David,

When you do an ADMIXTURE run, can you separate the Lor, Mazandaran, Shiraz and Bandari samples and not lump all of them under just the Iranians?

Ariele Iacopo Maggi said...

https://genetiker.wordpress.com/2016/06/22/y-snp-calls-from-the-ancient-near-east/

Genetiker has r2 from both iranian samples.

Gioiello said...

Maggi writes: "Genetiker has r2 from both iranian samples".


Of course.

Ariele Iacopo Maggi said...

Genetiker also said this about the early bronze age armenian y-dna " It’s R1b1a-A702(xV88, P297) "

Davidski said...

rk & Matt,

With this topology TreeMix doesn't see Yamnaya as a mixture of anything, even with as many as 6 edges.

https://drive.google.com/file/d/0B9o3EYTdM8lQZnBPNG5XSE1zQnc/view?usp=sharing

But it's a really nice tree with one edge, and again essentially shows Yamnaya to be intermediate between EHG and CHG.

The residuals look pretty good, at least for Yamnaya.

https://4.bp.blogspot.com/-9sImQEbNfOI/V2sn-IPNuyI/AAAAAAAAElI/jLXWPVDei9s73tCaDhA3xu9G4n9tYMUhACLcB/s1600/Residuals.png

Roy King said...

Genetiker has Hotu Cave as J2a-M410.

Roy King said...

Seems as though the Caspian (Hotu Cave) could have been the refugium for J2a-M410.

Davidski said...

The difference between Armenia Chalcolithic and Armenia EBA is intriguing. Armenia Chalcolithic is more European, while Armenia EBA is more Caucasian.

https://drive.google.com/file/d/0B9o3EYTdM8lQMU1IUnYzbzdkTUk/view?usp=sharing

But I've been unable to get a migration edge from Samara Eneolithic to Armenia Chalcolithic. The migration edges into Armenia Chalcolithic are from the base of the EHG branch, and they're pretty easy to reproduce using different topologies.

Iranocentrist said...

David congrats on calling R2 for the P sample.

Davidski said...

I was just guessing at the time, but it was a natural choice considering the J and L south of the Caucasus/Caspian that were already in the results.

Rob said...

Atrior

"This is a recurring issue I have as well. There needs be to demonstrated how such migrations would have a) traversed the mountains in significant numbers or b) used naval means via the Black or Caspian Seas. "

There's the Azerbaijan corridor, widely known.
That there are many language groups in the Caucasus only serves to confirm rather than doubt its role as a conduit of language spread.

The issue of lack of FU loans in IA might also relate to issue of social language dominance.

Rob said...

Ok so ancient samples are pointing toward an R*- diversification in south Central Asia, somewhat expectedly.
So R1 existed quite broadly by the Neolithic, but it's clear that L23 arose northwest of central Asia

postneo said...

CHG is very old so its understandable that a chalc population would do better. why is iran_chalc chosen over armenian_chalc? Perhaps iran_chalc had more differentiated proto asi needed for a fit.

Nirjhar ADNA is blunt but so are many other things of interest. Languages don't ebb or boom like uniparental dna. languages are under selection pressure to be stable and be mutually intelligible over a reasonable length of time and space.

Davidski said...

@Rob

Ok so ancient samples are pointing toward an R*- diversification in south Central Asia, somewhat expectedly.

No they're not.

They're showing that South Central Asia was a sink, even for lineages that were until recently thought to have originated there.

South Asian R1a is from the Bronze Age steppes, while R2 is from Neolithic Iran.

Gioiello said...

@ Davidski

"South Asian R1a is from the Bronze Age steppes, while R2 is from Neolithic Iran"

And don't forget that the R1b1-L389- (Joshi, Raza) I tested in India I supposed having come from Central Asia, from Uzbekistan, thus born around the Caucasus. Practically Kura Araxes has a Villabruna R1b1a (XP297) many thousands of years later, and that Caucasus has R1b1-L389+ with YCAII=21-23 whereas Italy has 18-22, 18-23, 23-23 I am saying from so long and it is one of the proofs (if they are Worth something) of my Italian Refugium, and that Caucasus and Italy have from so long contemporaneously the same haplogroups I am sayin Always from so long.

Rob said...

Well, we can drop the adjunctive "south" from "Central Asia". The case for R1a seems clear enough, but R2 really doesn't look Iranian here. It's a drifter from further north, but I don't think that north was Russia

GrenadierGunther said...

I know you haven't done this in awhile(I think when Corded Ware/Yamnaya samples came out was the last time you did this), but could you post the Eurogenes K15 for these samples?

I find K15 much more accurate than direct population composition if you don't have nearly all ancient samples(don't bother trying to convince me otherwise, lol).

Would be really appreciated. Shame the y-str Felix guy isn't around anymore to upload ancient samples on GEDMatch.

postneo said...

@Gioiello
"And don't forget that the R1b1-L389- (Joshi, Raza) I tested in India I supposed having come from Central Asia, from Uzbekistan, thus born around the Caucasus"

You need to refresh your geography. Uzbekistan is nowhere near the caucasus. Its far, like driving from rome to moscow.

Raza could have uzbeg roots but not Joshi.

Rob said...

Iranocentrist

Alberto said...

@Davidski

The difference between Armenia Chalcolithic and Armenia EBA is intriguing. Armenia Chalcolithic is more European, while Armenia EBA is more Caucasian.

Yes, in figure S7.16 (page 98 SI), the f4(Armenia_ChL, Armenia_EBA; A, Chimp) show the most positive results for: EHG, SHG, Samara_Eneolithic, WHG, MA1, Kostenki14. While the most negative are: Iran_N, CHG, Iran_LN.

In the f3 stats (Extended Data Table 2) the Armenia_ChL samples don't show any strong signal of admixture with the available populations. The lowest score is with EHG + Levant_N (Z=-1.5), quite weak. So it seems that either we don't have the right populations yet or these people were unadmixed for quite a while.

I think that this points to an origin in the east, where we don't have samples, or where they could have been unadmixed for a while (and having high EHG ancestry). I wonder how good (or bad) these samples will be for modelling S-C Asian populations.

BTW, the strongest signal in those f3s for Steppe_EMBA is for EHG + Abkhasian (Z=-11.2).

Shaikorth said...

"In the f3 stats (Extended Data Table 2) the Armenia_ChL samples don't show any strong signal of admixture with the available populations. The lowest score is with EHG + Levant_N (Z=-1.5), quite weak. So it seems that either we don't have the right populations yet or these people were unadmixed for quite a while."

Or they were admixed and we have the right populations, but too much drift screws up the f3 test. See Kalash and BedouinB.

Alberto said...

@Shaikorth

Or they were admixed and we have the right populations, but too much drift screws up the f3 test. See Kalash and BedouinB.

Yes, but I guess that implies that they were not admixed for quite a while. Or could such drift happen in just a few generations?

Grey said...

Arch Hades

"I thought CHG represented a 'pure' strain that was in isolation on the Northern side of the West Asian highlands for 10-15 thousand years away from the Near East. Not just part of a genetic strain that was already long in Iran and that was actually diluted a little bit when it entered into the Caucasus."

If there were multiple HG groups in various LGM refuges who expanded out afterwards then if they had no particular advantage they'd spread till they bumped into each other.

Say for the sake of argument the refuges were
- aegean and southern Black sea
- south caspian
- north black/caspian seas
- nile delta
- persian gulf
and you assign CHG to southern Caspian and imagine where they would spread before they bumped into one of the other HG groups coming the other way then you'd get borders roughly
- west: east anatolia
- north: half way up the east/west caspian sea shore
- south: mid Iran/Zagros
- east: altai?
then the first CHG-like farmers might start from somewhere within that range.

(and similarly for the others if there was also first natufian farmers, first aegean farmers etc)

So the CHG HGs that were found in the Caucasus might be from that first HG expansion and not the later CHG farmer expansion.

The CHG-like farmer expansion might have started from Iran/Zagros or maybe Anatolia near the border with ENF or somewhere off east of the Caspian - they'd all be CHG-like.

Grey said...

Atrior
"This is a recurring issue I have as well. There needs be to demonstrated how such migrations would have a) traversed the mountains in significant numbers or b) used naval means via the Black or Caspian Seas."

I tend to think the most likely route for CHG-like farmers onto the steppe was east of Caspian or west of Black sea.

Shaikorth said...

@Alberto

No, it wouldn't happen so fast in reality. However it didn't have to happen - this is an ancient sample with lower quality than the Kalash and BedouinB samples, and we know this creates artificial drift.

Gioiello said...

@ postneo
"You need to refresh your geography. Uzbekistan is nowhere near the Caucasus. It's far, like driving from Rome to Moscow.
Raza could have Uzbek roots but not Joshi".

Of course Joshi is an outlier:
N93357 Bhaskar Dinkar Joshi, 1910-1997 India R-P25
13 18 14 10 11-12 12 12 13 12 13 28 15 9-10 11 11 25 15 20 29 12-14-15-15-16-17 10 10 20-24 14 15 18 19 30-35 12 12 11 8 15-16 8 10 11 8 10 11 12 22-24 15 10 12 12 15 8 11 22 20 15 12 11 13 10 11 11 12
but the haplogroup formed 17200 years ago and only a deep SNP test of Joshi could say more. Still, the older the time, the lower the probability that his origin is in India.
Anyway these R-L389- have nothing to do with the subclades, which formed in European-Siberian hunter-gatherers.


Shaikorth said...

@RK

Think so (because haplotypes do not survive intact) and that's why Chromopainter etc. was used only on higher quality samples.

IIRC Rathlin1 and Ballynahatty are the lowest coverage (10x) ancient samples that have been run on Chromopainter and produced sensible results.

Alberto said...

@Shaikorth

There are 5 Armenia_ChL samples and only one is low coverage. The other 4 are 2x-3x. In the TreeMix above they also don't show as an extremely drifted population (nowhere near CHG or EHG).

I think this is probably a matter of not having the right populations for showing a strong signal of admixture, but who knows. Let's see what further analysis shows about these samples.

Karl_K said...

@RK

"Shaikorth, will quality issues mess up TVD and segment-based analyses?"

If you don't have both alleles for most positions (or better, actual haplotypes), then the segment analysis doesn't work at all.

This is because they first look for the segments that are impossible to be the same (homozygous different). For diploid sequence, all other situations are possibly identical segments.

If your coverage is too low to make accurate homozygous calls, then you couldn't even tell how a mother and child were related to each other.
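
The "homozygous different" check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code of any particular IBD tool: genotypes are assumed to be coded as alternate-allele counts (0, 1, 2, or None for a missing call), and the function names and the `min_len` cutoff are made up for the example.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed coding: count of the alternate allele (0, 1 or 2); None = missing call.
Genotype = Optional[int]


def is_opposite_homozygous(a: Genotype, b: Genotype) -> bool:
    """True when both calls are present and homozygous for different alleles (0 vs 2)."""
    return a is not None and b is not None and {a, b} == {0, 2}


def compatible_segments(
    g1: Sequence[Genotype], g2: Sequence[Genotype], min_len: int = 3
) -> List[Tuple[int, int]]:
    """Return (start, end) SNP index ranges, end exclusive, that contain no
    opposite-homozygous site and span at least min_len SNPs (hypothetical cutoff)."""
    n = min(len(g1), len(g2))
    segments: List[Tuple[int, int]] = []
    start = 0
    for i in range(n):
        if is_opposite_homozygous(g1[i], g2[i]):
            # This site rules out a shared haplotype, so close the current run.
            if i - start >= min_len:
                segments.append((start, i))
            start = i + 1
    if n - start >= min_len:
        segments.append((start, n))
    return segments


person_a = [0, 1, 2, 2, 0, 1, 1, 0]
person_b = [0, 1, 2, 0, 0, 2, 1, 1]
print(compatible_segments(person_a, person_b))
```

A parent and child share an allele at every site, so with accurate diploid calls they never produce a breaking site. If low coverage makes true heterozygotes look homozygous, false 0-vs-2 sites appear and the runs get chopped up, which is the failure described above.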

Shaikorth said...

2-3x is still quite low, though better than many of the old genomes. But SHG's signal isn't even negative and we have a pretty good idea what they are. If we had 10x coverage Armenia_ChL the signal should reach the -3 threshold.
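
For anyone unsure what the "-3 threshold" refers to, here is a rough sketch of an f3 admixture test with a Z-score. It is a simplification under stated assumptions, not what any published tool does exactly: it takes plain allele frequencies per SNP, uses equal-sized consecutive blocks for the jackknife, and skips the finite-sample corrections that real software applies. The function name and the `n_blocks` parameter are invented for the example.

```python
from math import sqrt
from typing import Sequence, Tuple


def f3_with_z(
    c: Sequence[float], a: Sequence[float], b: Sequence[float], n_blocks: int = 5
) -> Tuple[float, float]:
    """Estimate f3(C; A, B) as the mean of (c - a) * (c - b) over SNPs and
    return (estimate, Z), with the standard error from a simple
    leave-one-block-out jackknife (assumed equal-sized consecutive blocks)."""
    terms = [(ci - ai) * (ci - bi) for ci, ai, bi in zip(c, a, b)]
    n = len(terms)
    if n < n_blocks or n_blocks < 2:
        raise ValueError("need at least 2 blocks and one SNP per block")
    total = sum(terms)
    estimate = total / n

    block_size = n // n_blocks
    leave_one_out = []
    for k in range(n_blocks):
        lo = k * block_size
        hi = n if k == n_blocks - 1 else lo + block_size
        block = terms[lo:hi]
        leave_one_out.append((total - sum(block)) / (n - len(block)))

    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (x - mean_loo) ** 2 for x in leave_one_out
    )
    se = sqrt(variance)
    z = estimate / se if se > 0 else float("nan")
    return estimate, z


# Toy frequencies: the target sits exactly halfway between the two sources.
source_a = [0.1] * 10
source_b = [0.5, 0.6, 0.5, 0.5, 0.6, 0.6, 0.5, 0.6, 0.7, 0.5]
target = [(x + y) / 2 for x, y in zip(source_a, source_b)]
estimate, z = f3_with_z(target, source_a, source_b)
print(estimate, z)
```

A target lying between the two sources gives a negative f3, and the usual convention is to call it significant when Z is below -3. Extra drift in the target, whether real or an artifact of low-quality data, pushes the estimate upward, which is why an admixed population can fail to reach the threshold.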

Olympus Mons said...

So according to Genetiker, that R1b had P297.
Go figure in my thesis there is a suppl chapter called...
SUPPL III
EXODUS TO IBERIA
Plight of the P297 mutating to M269?

oh, boy, oh Boy.

http://blogs.sapo.pt/cloud/file/eb6b52b82097d41dfa0e5797a2fa7945/olympusmons/2016/From%20Shulaveri%20to%20Bell%20beaker.pdf

Gioiello said...

@ Olympus Mons
"So according to Genetiker, that R1b had P297.
Go figure in my thesis there is a suppl chapter called...
SUPPL III
EXODUS TO IBERIA
Plight of the P297 mutating to M269?

oh, boy, oh Boy".

Look again:

Sample  Region   Culture      Haplogroup
I1635   Armenia  Kura-Araxes  R1b1a-CTS4244(xV88, P297)
I1293   Iran     Mesolithic   J2a-CTS1085
I1945   Iran     Neolithic    R2a-Y3399
I1949   Iran     Neolithic    pre-R2-M479
I1662   Iran     Copper Age   J2a-PF5008(xL581)
I1685   Levant   Natufian     CT(xJ1, J2a, J2b, T1, P)
I1690   Levant   Natufian     CT(xJ, L, R1a, V88, M269)
I1414   Levant   PPNB         E1b1b1b2-CTS11781
I1416   Levant   PPNB         CT(xH, I, J, K)
I1727   Levant   PPNB         F(xG, J, LT, K2)
I1700   Levant   PPNC         H2-P96
I1705   Levant   Bronze Age   J1a2b-Z2324
I1730   Levant   Bronze Age   J2b-L282(xJ2b2)

No J in Middle East (Levant) before Bronze Age.
No R-P297 in Armenia (x means: no).

There is only one person winning on all fronts: ME.

Olympus Mons said...

Gioiello,
Gosh, you're right. Jumped the gun. ahahahah.

No, you are not winning on all fronts.

But he was M415, right? So M415 is equivalent to P25, therefore M415 can still be direct ancestral to M269/L23.. no?


Olympus Mons said...

and Gioiello,
Nobody has a sample from the right... time. Once you get southern Caucasus samples from the 7th/6th millennia BCE, it will all be clear.



Gioiello said...

Olympus, you are a kind person, like all Caucasian people, but I already wrote many years ago what is in the Caucasus and what is in Italy. About R1b1-L389+, Caucasus has only one haplotype with YCAII=21-23, and Italy has at least three: 18-22, 18-23, 23-23, and the subclades with 19-23 derive from 18-23 and not from the Caucasian 21-23.
This is my theory, and so far the oldest R1b1a at the P297 level has been found in Italy (Villabruna, 14000 YBP, which means years before present).

Olympus Mons said...

Gioiello
I am looking for the origin of M269... not R1b itself. And I am positive that M269 either was "born" in Shulaveri or while Shulaveri was running...

So I am fine with Italy as a refuge for any haplogroup you choose. Although I don't believe it. There is a great variance in R1b in Portugal, Spain and Italy (north). To me it's just the fact that the R1b spread into Europe started with Bell Beaker at Zambujal, Portugal. It moved to Spain, southern France and northern Italy and did not stop. Bell Beaker turned into Celts, into Etruscans and so forth...

Gioiello said...

Olympus, things are much more complex than you think. Iberia had migrations from Italy from 7500 years ago and also later, and in Iberia there are above all recent subclades of R1b1a2, and R-V88 also clearly arrived from Sardinia/Italy. Let's wait until much more aDNA is tested, in Italy as well as in Iberia and other countries. Italy has so far only 5 Y tested.

Olympus Mons said...

Gioiello,
Yes. Lots more sampling.
It's underway...

Olympus Mons said...

Gioiello,
I assume this Kura-Araxes has P25. So if you go to my thesis (http://shulaveri2bellbeaker.blogs.sapo.pt/), P25 in the southern Caucasus is well in agreement.
I figure that P297 or even L23 will be picked up at Arukhlo, Mentesh Tepe and so forth...

Atriðr said...

@Rob
There's the Azerbajan [sic] corridor, widely known
That there are many language groups in the Caucaus [sic] only serves to confirm rather than doubt its role as a conduit of language spread


I mountaineered for over a decade of my life. It's not that easy to move large numbers of people through mountain ranges. Not impossible; but mountains are generally refuges. How did CHG (or whatever new name is used to refer to the precise substrate) enter the Steppes? I'd still look for an easier path than through mountains. Vast movements are generally over plains or by ship (as we can observe in Europe right now).

The issue of lack of FU loans in IA might also relate to issue of social language dominance.
Yes. The elite-model is one of the theories for this reason. However, some of the words that entered Finno-Ugric from I-I (like *orya) makes it unclear. And rarely do you only have unidirectional exchange. There will always be some exchange.

@Grey
I tend to think the most likely route for CHG-like farmers onto the steppe was east of Caspian or west of Black sea.
I like both these routes, although I favor east of Caspian more.

Rob said...

@ Atrior

Fair enough, but again, looking at the diversity of languages, from various Caucasian languages to Iranic to Turkic & Mongolic, tells us people moved through there constantly, right?

There is the possibility of CHG moving via east of the Caspian, but that would have been operant earlier than the Chalcolithic (although the dates for the Kelteminar culture keep changing, so it's hard to make a solid theory here). Otherwise, the east Caspian route only opened up with the Bactrian camels closer to 2000 BC, so too late for that.
The other option is via Anatolia, the Balkans & back round to the steppe. That's probably the least likely.

About I-A. Didn't scholars like Trubachev identify some 'odd' strata, like some supposedly outright Indic toponyms in the Black Sea area?

Jijnasu said...

As regards FU there is a theory that the loans in FU are from a sister group to IA and Ir, which they label Andronovan

Olympus Mons said...

@Grey
When would you think that CHG-like farmers migrated to steppe? If you had to guess what would be the millennia?

Grey said...

Olympus Mons

No idea :)

I just look at maps and make guesses - the details are something else.

Atriðr said...

@Rob
Yes, for sure. But its (the Caucasus) rich diversity comes from its ideal usage as a refuge imo (from both directions). Yes, the flux in data and dates won't allow to pinpoint anywhere with certainty just yet - but soon.

Yes, Trubachyov thought that toponyms in Crimea and NE of Black Sea were Indo-Aryan instead of Iranian. I agree with this. Although, I would say Indo-Iranian (or mother language of Sanskrit/Avestan); Indo-Aryan (if one considers it something close to Sanskrit) is closer to Indo-Iranian than later Iranic languages.

@Jijnasu - yes, Andronovan as sister group is my instinct as well, as inferred from one of my above comments. But really, I hold it to be the same Indo-Iranian language - simply the one that did not go down to the subcontinent.

I think genetics will beat me to it, but yes, Andronovo has a much bigger role to play.

Once BMAC results are released, and IVC for clarity - we can start claiming things with certainty. Already, the picture is almost clear.

Open Genomes said...

@Iosif Lazaridis (Broad):

Please see the comments here showing high IBD between
M291439 (I1706 Levantine Early Bronze Age, 'Ain Ghazal, Jordan, 2490-2300 BCE),
F999933 (BR2, Late Bronze Age Kyjatice Culture, Ludas-Varjú-dűlő, Hungary, 1270-1110 BCE)
and
M472767 (I0232 Srubnaya Novoselki, Northern Forest, Samara Russia, 1850-1200 BCE) and F999933 (BR2, Late Bronze Age Kyjatice Culture, Ludas-Varjú-dűlő, Hungary, 1270-1110 BCE):

http://eurogenes.blogspot.com/2016/06/german-bell-beakers-in-context-of.html