Y chromosome genetic matches

The National Geographic Genographic Project analyzed my Y chromosome, which has helped me understand my deep genealogy. Augmented with further tests I paid for, the analysis found the following copy numbers of Short Tandem Repeats at various positions:

DYS: 393, 390, 19/394, 391, 385a*, 385b*
value: 13, 25, 16, 11, 11, 14
rate: 0.29–1.72, 1.47–3.62, 1.55–3.67, 1.83–4.18, 1.41–2.98, F

DYS: 426, 388, 439*, 389-1, 392, 389-2
value: 12, 12, 10, 14, 11, 32
rate: 4.19–9.22, 1.41–3.74, 0.12–1.17, 2.24–5.02

DYS: 458*, 459a, 459b, 455, 454, 447
value: 17, 10, 10, 11, 11, 24
rate: F, F, F

DYS: 437, 448, 449*, 464a*, 464b*, 464c*
value: 14, 20, 32, 15, 15, 16
rate: F, F, F, F

DYS: 464d*, 460, GATA H4, YCAIIa, YCAIIb, 456*
value: 16, 12, 12, 19, 23, 15
rate: F, F

DYS: 607, 576*, 570*, CDYa*, CDYb*, 442
value: 16, 19, 18, 36, 42, 14
rate: F, F, F, F

DYS: 438
value: 11
rate: 0.06–1.69

The 95% confidence intervals on the rate estimates are from a summary web page and are in changes per 1000 generations. Rates marked with F are unknown, but are probably faster than the nominal rate of 2 per 1000 generations. Markers with * in the name are sometimes discounted as being fast. A paper parameterizes the mutation rates as $3.1\times10^{-6}\,e^{0.200M-1.06|\delta|}$ for an increase, and $4\times10^{-7}\,e^{0.302M-1.06|\delta|}$ for a decrease, by $\delta$ from the marker value $M$. A nomenclature change on May 19, 2003 adjusted all the 464x values downward by 1; the values here are in the new scheme. The 389-2 value used here is the length of the entire 389 region including 389-1. The values are in FtDNA notation; by the NIST standards, 442 should be 19 and GATA H4.1 should be 13. CDY is also called DYS724.
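The parameterization quoted above can be evaluated directly. The sketch below assumes that both the M term and the |δ| term sit inside the exponent (the only reading that keeps the rate positive); the function name `step_rate` and the signed-δ convention are illustrative choices, not taken from the paper, and the units are whatever the paper uses.

```python
import math

def step_rate(M, delta):
    """Rate of a change by `delta` repeats from a marker value of `M`.

    Assumption: rate = prefactor * exp(slope * M - 1.06 * |delta|).
    Positive delta means an increase, negative delta a decrease.
    """
    if delta == 0:
        raise ValueError("delta must be a non-zero number of repeats")
    if delta > 0:
        prefactor, slope = 3.1e-6, 0.200   # increase
    else:
        prefactor, slope = 4e-7, 0.302     # decrease
    return prefactor * math.exp(slope * M - 1.06 * abs(delta))

# Compare single-step increases and decreases for a few marker values.
for M in (11, 16, 25, 32):
    print(M, step_rate(M, +1), step_rate(M, -1))
```

Under this reading, increases are favoured for short markers and decreases for long ones, with the two rates crossing near M ≈ 20, and each additional repeat in the step size costs a factor of e^1.06 in rate.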

From the markers, I was classified as an R1a (M17) individual, and SNP typing classified me as belonging to haplogroup R1a1a (M124-[xR2a] M157-[uncommon] M269-[xR1b1a1a2] M343-[xR1b] M56-[uncommon] M64.2-?[uncommon] M87-[uncommon] P25-[unstable xR1b1] SRY10831.2-[R1a1] M17+[R1a1a] M173+[R1] M198+[R1a1a] M207+[R]; later I also verified M94+[BT] M42+[BT] M299+[BT] M139+[BT] M294+[CT] M168+[CT] P143+[CF]), just like 72% of the West Bengal Brahmins: see here for a basic idea of how a Single Nucleotide Polymorphism test works. (Further testing showed I am actually R1a1a1b2a1a2c Y7 with additional Y30, Y29, Y944, and Y2428, but not Y16494 (which is provisionally R1a1a1b2a1a2c2d5~xa); looking at the pattern of my close STR matches, I suspect that 12 marker matches may resolve the haplogroup only up to R1a1a1b and no shallower.) As shown in this diagram, before the cladistic nomenclature started, this haplogroup was variously named haplogroup 3, IX, 1D, 45, Eu19, H16, or D, according to the classification systems adopted by various authors.

Obviously a difference in a slower mutating site like 392 could count as much as 14 times a mutation at a faster moving site like 439, but I ignore that in the following discussion.

According to the Family Tree DNA site, no one with this exact profile has been tested by them yet. Ancient DNA from a Siberian mummy more than 3000 years old is, however, a 12 marker match for me, but see there for a discussion on whether that may be coincidental. A 9 marker (19,388,389I,389II,390,391,392,393,439) exact match for me is very common (10% of R1a1a) in Kyrgyzstan, and 12 marker matches (the first 12) have been seen in 1.5% of people represented in the YSearch database originating from Kazakhstan (1/66; 95% exact binomial confidence interval on percentage [0.07,8]), 0.9% from Pakistan (1/108; [0.02,5]), 0.8% from Russian Federation (19 Altai(Siberian)+1 Kirghiz / 2357; [0.5,1.3]), 0.3% from Slovakia (1/367; [0.007,1.5]) and India (3/1107; [0.05,0.8]), 0.2% from Greece (1/534; [0.005,1]) and Mongolia (1/575; [0.004,1]), 0.1% from Hungary (1/786; [0.003,0.7]) and Ukraine (1/1049; [0.002,0.5]), and a sprinkling from Scotland (1/7719; [0.0003,0.07]), Germany (1/8388; [0.0003,0.07]), England (2/17101; [0.001,0.04]), and Ireland (1/9618; [0.0002,0.06]).

The confidence interval in each case reflects the effect of only a small number of people having been tested: thus, if the true percentage of people of Russian origin who are a 12/12 match for me is outside the range 0.5–1.3%, the chance of seeing a large enough sampling fluctuation that we observe 20 matches out of 2357 randomly chosen people from a large population of Russians would be less than 5%. But the actual population tested is not a random sample: the majority of the Siberian matches may have been from the Mendur-Sokkon village in the Ust'-Kanskii region in the southern part of the Altai republic. Similarly, people who got tested may be related in socio-economic and genealogical ways: for example, the population from India that is tested may be skewed towards the group that resides in the US today. In addition, people report their origin in idiosyncratic ways: so while some Englishmen report it as England, others will say they are from the UK. Finally, with itinerant groups like Gypsies, the country of origin is not a stable marker over long periods. The numbers here should, therefore, be interpreted cautiously.
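The exact binomial intervals quoted above can be recomputed from the match counts. The following is a minimal sketch of one standard construction (the Clopper-Pearson interval, found here by bisection on the binomial tail probabilities); the page does not say which tool produced its numbers, so the function names and the choice of method are assumptions.

```python
from math import exp, lgamma, log, log1p

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if k < 0:
        return 0.0
    if k >= n or p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log1p(-p))
        total += exp(log_term)
    return min(total, 1.0)

def exact_binomial_ci(k, n, alpha=0.05):
    """Two-sided exact (Clopper-Pearson) interval for k matches out of n."""
    def solve(f, target):
        # f is decreasing in p on [0, 1]; bisect for f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper

low, high = exact_binomial_ci(20, 2357)   # the Russian Federation counts
print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

For the 20 matches out of 2357 this should land close to the quoted 0.5–1.3% range. The interval only accounts for sampling noise; it does nothing about the non-random sampling discussed above.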

At distance one, we find 7% from Kyrgyzstan (1/14; [0.2,33]), 4% from Slovenia (4/101; [1,10]), 2.8% from Pakistan (3/108; [0.6,8]), 2.2% from Slovakia (8/367; [0.9,4]), 1.5% from India (17/1107; [0.9,2]) and Iceland (2/135; [0.2,5]), 1.2% from Bulgaria (1/84; [0.03,6]) and Canada (2/170; [0.1,4]), 0.8% from Norway (8/952; [0.4,1.6]), 0.7% from Denmark (4/587; [0.2,1.7]), Russian Federation (12 Altai(Siberian) + 4 others / 2357; [0.4,1.1]), Syrian Arab Republic (1 Arab/152; [0.02,4]), and Uzbekistan (1/152; [0.02,4]), 0.6% from Lebanon (1/154; [0.02,4]), Hungary (5/786; [0.2,1.5]), Lithuania (1 Ashkenazi-Levite + 3 others / 699; [0.2,1.5]), Greece (3/534; [0.1,1.6]), and Poland (1 Prussia + 13 others/2535; [0.3,0.9]), 0.5% from Austria (2/389; [0.06,1.8]), 0.4% from Ukraine (4/1049; [0.1,1.0]), 0.3% from Mongolia (2/575; [0.04,1.3]) and Czech Republic (1/385; [0.007,1.4]), 0.2% from Germany (15/8388; [0.1,0.3]), Sweden (2/1213; [0.02,0.6]), and Italy (4/2479; [0.04,0.4]), 0.1% from China (1 Chinese Ethnic Minority/897; [0.003,0.6]) and United Kingdom (1 Great Britain, 1 Shetland Islands and 2 others / 7722; [0.01,0.1]), and a sprinkling from Spain (1/2292; [0.001,0.2]), Ireland (4/9618; [0.01,0.1]), Scotland (3/7719; [0.008,0.1]), and England (6/17101; [0.01,0.08]).

A small note on the timing issues for these matches: the previous calculations show, for example, that the chance is less than 5% that descendants of two brothers will have a 12 marker match 62 generations later. But if enough of those descendants all do their DNA tests, then it is almost certain that some matches will be found. So, once we search a large database like this, we may often find ancestries deeper than the calculations naïvely suggest, and the last common ancestor with many of these people may have lived further back in the past.
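The database effect described above can be made concrete with a rough sketch. It assumes a deliberately simple model that is not necessarily the one behind the page's own calculation: every marker mutates independently at the nominal rate of 2 per 1000 generations, back mutations are ignored, and the tested descendants are treated as independent. The function names are illustrative.

```python
def p_exact_match(generations, markers=12, rate=0.002):
    """Chance that two lines of descent still match at every marker.

    Assumes independent markers, one shared rate, and no back mutation.
    Both lines accumulate mutations, hence the factor of 2.
    """
    transmissions = 2 * generations * markers
    return (1.0 - rate) ** transmissions

def p_at_least_one(p_single, n_tested):
    """Chance that at least one of n independently tested people matches."""
    return 1.0 - (1.0 - p_single) ** n_tested

p = p_exact_match(62)
print(f"single pair still matching after 62 generations: {p:.3f}")
for n in (1, 10, 100, 1000):
    print(n, f"{p_at_least_one(p, n):.3f}")
```

Under these assumptions a single pair matches with a probability of about 5%, yet the chance that at least one of a hundred such descendants matches is above 99%, which is why matches found by searching a large database can point to deeper ancestry than the single-pair estimate suggests.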

The security modal haplotype (i.e. the ‘catch all’) of the modal R1a haplotypes listed by the Scandinavian Y-DNA project is close (1 difference in 14 markers carefully chosen to represent R1a). I am almost equally distant to the specific modal haplotypes: the Eastern European one is closest at a distance 20/37, and the rest are at distances of 21/37 (English, Eurasian, and Old Norse) or 22/37 (Ashkenazi-Levite). A naïve calculation indicates the last common ancestor was between 5 and 159 generations back, at 99% confidence; but since these are not random sequences, rather they were chosen because they are modal, the estimates may be skewed.

The values 393:13 426:12 388:12 392:11 455:11 454:11 437:14 438:11 also appear as slow mutating markers among the R1a1 in Bornholm, Denmark (descendants of Henning Andersen b.c. 1627, d.c. 1709). The values 426:12 388:12 455:11 437:14 448:20 YCAIIa:19 442:14 438:11 also appear in R1a from the Baltic to Northern India, and 426:12 388:12 392:11 464b:15 464c:16 464d:16 YCAIIa:19 438:11 appear in Ukraine to Scotland-Poland. And 19:16 388:12 389I:14 389II:32 390:25 391:11 392:11 393:13 439:10 (and A7.2:10) are the median markers among the R1a1a in Kyrgyzstan.

The following is from older information: 12/12 matches were found in Austria-Hungary(1/151), Greece (1/312), Hungary (1/385), India (3/793), Ireland (1/5460), Kazakhstan (1/38), Mongolia (1/574), Pakistan (1/66), Russia (19 Siberia/Altai + 1 Kirghiz out of 1827), Scotland (1/4382), Slovakia (1/221) and Ukraine (1/558). One each from Austria-Hungary, India, Mongolia, and 3 of unknown origin, and all 20 from Russia had their haplogroup determined to be R1a by an SNP test.

Matches at distance one (95% CI for last common ancestor of 5–121 or 6–148 generations depending on the model and assuming the slow mutation rate) are from Austria-Hungary (1/151), Bulgaria (1/30), China (1* Chinese ethnic minority/677), Czechoslovakia (1/130), Denmark (1**+2/335), England (2/10167), Germany (1*+1**+8/4628), Greece (3/312), Hungary (1/385), Iceland (1*+1**/130), India (6*+1**+5/793), Ireland (1/5460), Italy (2/1140), Kyrgyzstan (1*/10), Mongolia (2*/574), Norway (1*+2/511), Pakistan (1/66), Poland (5/1170), Prussia (1/159), Russia (1*+1+12* Altai(Siberian)/1827), Scotland (1*+1/4382), Shetland (1/141), Slovakia (1**+6/221), Slovenia (1/46), Sweden (1/680), Syrian Arab (1*/129), Ukraine (2/558), United Kingdom (1/3405), United States (1/555), and Uzbekistan (1*/147). The asterisked numbers (and 2 of unknown origin) in each case had their haplogroup confirmed as R1a (** is R1a1) by an SNP test. Some details about more distant matches are here.

In the lists below, a number of Y SNPs are referred to. Both ISOGG and Yfull provide phylogenies for these. In modern browsers, these trees will be displayed (after processing) in the windows below. The early history is roughly given by the following: Y-Adam (~60–90Kabp) → A0-T → A1 (~140Kabp) → A1b (~110 Kabp) → BT=A1b2 (~55Kabp, NE Africa) → CTCF (~31–55Kabp, NE Africa) → F (~60–80Kabp, out of Africa) → GHIJK → HIJK → IJK → K (~40Kabp) → K2 → K2b → P=K2b2 (~35Kabp, Central Asia) → P1=K2b2a → R=K2b2a2 M207 (~27Kabp, Asia) → R1 M173 (~18.5Kabp, SW Asia) → R1a M420 (Eurasian Steppes or Indus Valley) → R-SRY10831.2/Page65.2/PF6234/SRY1532.2, R-M448/L122/PF6237 and R-M459/PF6235 are R1a1, R-L120/M516/PF6236 is roughly here → R-M17, R-M198/PF6238, R-L168, R-M514/PF6240, R-M515 and R-M512/PF6239 are R1a1a, R-L12, R-L235, R-L399, R-L450, R-L451, R-L457/PF6191, R-L458, R-L579, R-Page68 roughly here → R-M417 and Page7 are R1a1a1, → R-CTS7083/L664/S298 (and CTS4385?) marks R1a1a1a (Northwestern branch) and R-S224/Z645, R-S441/Z647 (and CTS5508, CTS9754, PF6162, PF6168, F3044?) marks R1a1a1b (Eastern branch). This brings us to the approximate limit of 12 STR resolution (5±0.5kabp).

Migration map of R1a1a1 according to Eupedia.

A low resolution Y chromosome phylogeny inferred in a recent paper.

If the browser supports it: Yfull data reformatted to highlight relevant SNPs. The data is copylefted; scroll the window to the top to see the copyleft notice. Tooltips provide age information when available. The tree can be reformatted to display other sets of SNPs; instructions are at the top of the window.

If the browser supports it: ISOGG data reformatted to highlight relevant SNPs. The data is copyrighted; scroll the window to the top to see the copyright notice. Tooltips provide information about the SNP when available. The tree can be reformatted to display other sets of SNPs; instructions are at the top of the window.

If the browser supports it: FTDNA data reformatted to highlight relevant SNPs. The data is copyrighted; scroll the window to the top to see the copyright notice. The tree can be reformatted to display other sets of SNPs; instructions are at the top of the window.

I have managed to communicate with five of the many families with exact matches to the 12 markers who have details listed on the FtDNA site:

I have not been able to contact

I have not yet tried to contact

Three Kyrgyz tribes (Sari-Bagish, Saruu, Adigine) from "Umuraki, Bishkek ataul, kyrgyzs, Sari-Bagish clan", "Inka, Cholpon-ata ataul, kyrgyzs, Saruu clan", and "Alay ataul, Kazike, kyrgyzs, Adigine clan" are a match (R-M512).

In addition there are private matches with names Koshenov, Maffett (R-M512; from William Norris), T.K. (R-512; Koshimbai of Baibaqty of Saruqyrgyz of Kazak) and 073-B001 (R-M512). (A post about matches to Nurtan Mambetov KG shows one of the names as T K Koshenov.) There is also an 11/11 match with Dennis Maule, descendant of Riccardo Maule (c.1890–c.1956) from Vincensa, Italy. (A bug in uploading omitted 389-2 and inserted a fake DYS 19b=DYS19/394; this ysearch account seems to be affected by it.)

Ancestry.com finds two 10/10 matches for me with Cathy Warren and Zachary Kurek which they interpret as having a last common ancestor about 15 generations back. They also find numerous other matches with the predicted common ancestor 31–35 generations back.

Not updated: A match at distance 1/12 occurs with Harish Radhakrishna Kamath (no longer on the site, but a private match exists with this name), whose most distant known paternal ancestor is Raghavendra Kamath, 1913–1989, from Karkal, and who is R-Z94. The Kamaths are a Gaura Saraswat Brahmin family from Mangalore whose family tradition says they moved from Kashmir to Trihotapura in Bengal, moved to Goa around 500 AD, and from there moved to Mangalore after the Portuguese started administering that region. A 37 marker test shows that 12 markers differ, and the total difference is 22 in the stepwise mutation sense; but since all except one of the multiple differences are at faster varying sites, the relevant result is probably that 14–15 mutations in 37 markers gives a 95% confidence interval of between 10 and 125 generations, without convergent or back mutation correction and without correcting for the fact that the match was found by searching a large database.

Someone related to Ms. Susan Colby (descendant of Jurgen Kolbe b c 1675, Angermunde, Brandenburg,Pr; R-Z283) and many other families also appear at a distance of 1/12: Balda (Ramesh; R-M417), Barker (Jordan Cheyne; descendant of Thomas Madison probably 1941 IL; R-M512), Berezik (private match; from Antonio Berezik, b.c.1785, Lesko, Poland; R-M512), Berezik (private match; from Antonii Berezik, b.c.19th c., Lesko, Poland; R-M512), Bergtun (Dag Harald; most distant Knut Eriksen Rakvåg,b. c. 1695,Rakvåg,Oterøy,Romsdal,MRO,Norway; R-M198), Bialkowski (Jacek Piotr; descendant of Franciszek Bialkowski c 1790–1856 Swidniki, Pols; R-CTS3402), Cholewa (private match; from Henryk Cholewa, b. 1885, Tarnawa, Poland; R-Z280), Cremer (James Clarence; from Thomas Cremer b. c.1832? 1835? Hanover, Germany; R-M198), Dibek (Mithat; most distant from Bulgaristan; R-M512), Dixon (James Robert; descendant of John Dickson c 1775–1836; R-Z280), English (private match; [E11] Strickland English, c.1805–aft.1860, Vermont, USA; R-Z93), Ermolenko (Petr Efimovich; Ivan Ermolenko 1740–1760; R-512), Fox (Stephen Lynn; most distant ancestor Gottlieb Fuchs Dec 1846–19 Jul 1903 from Germany; R-M512), Gyetvay (Mark; from Hungary; R-M198), Hall (Joshua, great-great-grandson of Henry, descendant of John Hall of Hunslet c 1780, Leeds, England; R-M417), Hallai (Julian, no longer listed; but a private match with that name from Jozsèf Hallai 1875–1916, Kisiratos, Hungary; R-M198), Hasan (A.M.; R-M512), Herman (Jason Christopher; from Frank Herman, 18 Feb 1888–1 Apr 1954; R-M512), Higgs (Harold; from Peter Rollason b. 1760?, Warwickshire, U.K.; R-CTS4179), Higgs (Marvis E.; from Peter Rollason b. 1760?, Warwickshire, U.K.; R-L176), Houghton (Christopher; R-M417), Irvine (David Peter Gerard; from Samuel Irvine, 1789, Limavady, County Londonderry; R-M512), Kadyrbekov (Zamirbek; R-M417), Kennedy (Donald Grinker, descendant of Abraham Kennedy, b c 1858, Vilnius, Lithuania; R-M417), Kozhomuratov (Edyl; R-M512), Kozhoyarov (Sharshenaly; R-M417), Kulbayev (Torekhan; Babasan tribe of Northern Kazakhstan; R-M417), Kurek (Zachary Michael; descendant of Joseph Kurek b c. 1879 in Czechoslovakia; R-M417), Kutbi (Tharwat; from Jameel Ibrahim Kutbi; R-L657), Larsen (Kurt P.; most distant Lars Nielsen 1718–1800 of Tise, Borglum, Hjørring Amt, Denmark; R-CTS4179), MacDonald (David; earliest known ancestor b.c. 1795, possibly Virginia; his son John b. 1828 Tennessee d.c. 1880 Texas; R-L176), MacDonald (Donald John; R-M417), MacDonald (private match; from John McDonald, 1786; R-M512), Maksutov (Nurlan; R-M417), Malik (Tariq Murtaza; of Awan Malik family), Mantor (Donald Duncan; from Samuel Duncan, 1619–Apr 1680; R-M417), Matheson (Alister Hugh, no longer listed; but a private match of that name; R-M512), Mathieson (Malolm B.; most distant known ancestor is James Matthieson b. 1826 at Paisley, Scotland), McDonald (Neill Flemmon; from John Norman McDonald; R-L176), McEachern (Carlton Leslie; R-L176), Meszaros (Anthony Joseph; descendant of Istvan Meszaros b 1856 in Nyirlugos Kereknád, Hungary; R-L579), Miclash (Robert; R-M417), Mughal-Pathan (Yasser Khan; descendant of Abdulla Khan Mughal-Pathan), Onus (Curtis; R-M417), O'Reagan Cook (Dr. John Michael, M.D.; descendant of Alexander Cook), Passarella (Richard; R-M417), Psaltopoulos (Dimitrios Vassilios; most distant Lazaros Psaltoglou from northern Turkey; R-M417), Rao (Varun Shankar; descendant of Venkayya b.1830; R-Y7), Rickerl (David; descendant of Michael Rückerl b. c. 1750 of Germany; R-Z280), Rollason (private match; from Peter Rollason b. 1760?, Warwickshire, U.K.; R-CTS4179), Runnells (Martin George; descendant of Thomas Runnels, b 1790 VA or KY), Ryskulov (Uran; R-M512), Samakov (Aybek; R-M459), Samarkin (from Dei Samarkin Samarsky uezd, Erzya, Russia; R-M417), Sanasy (Vijayakumar; Murugesu, 1876–1944; R-M417), Seroka (Wladislaw; from Mieczyslaw Seroka, Lublin province; R-M512), Seyit (Abuov Omar; of Omar from Seyit from Azhibai from Aidar from Kete; R-Z93), Seyit (Aidyngaliev Alkerey; of Alkerey from Seyit from Azhibai from Aidar from Kete; R-M417), Seyit (Kojanov Alseyit and Gazezov Alseyit; of Alseyit from Seyit from Azhibai from Aidar from Kete; R-M512), Shomin (Bernard Noel; from Juro(George) Nicolai/Nicolas/Nicholas Šomin; R-M417), Skvortsov (Vitally Denysovych; descendant of Porfyriy Skvortsov, b. c.1865, d. c. 1920 Radalivka, Ukraine; R-Z92), Sydykov (Azamat; R-M198), Tagankin (Vladimir Cladimirovich; descendant of Alexandr Tagankin c. 1890), Turner (John Wendell, descendant of Jeremiah 1773 VA–1872 KY; R-M198), Usubali (Marat Shertai; two kits; Solto-Tœlœk; R-M512), Velikoselski (Olag; descendant of Ivan Velikoselskiy, b. c.1580 in Novgorod-Tomsk, Russia; R-Z280), Windsor (John David; John T Windsor 1839–1924 from previous Stapleton? Z2122+ if correct would make him R1a1a1b2a2b; R-Z94), and the family of Dr. P.S. Ramanujam (descendant of Venkatacharan). Many people at much closer distances also appear in the ySearch database.

Similarly, the matches at a distance of 2/25 with Marwan Aladnan (R-L657), Fazim Mohammad (R-M512) and Varun Shankar Rao (R-Y7; descendant of Venkayya b. 1830; with ancestors as Ramaswamy of Kuthalam, Tamil Nadu, Rao of Mangalore, Karnataka, and Venkateswaran of Palakkad, Kerala) may not be particularly significant. As stated earlier, looking at the distribution of haplotypes, a 12 marker near match cannot distinguish the clades under R1a1a1b.
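The two ways of counting differences used above (the number of markers that differ, and the total difference in the stepwise mutation sense) can be sketched as follows. The first haplotype uses the first 12 marker values from the table on this page; the second haplotype is hypothetical and exists only to exercise the function. Treating multi-copy markers such as 385a/385b as independent, and using 389-2 as the full region length, are simplifying assumptions.

```python
def marker_distances(a, b):
    """Compare two haplotypes given as {marker name: repeat count} dicts.

    Only markers present in both haplotypes are compared.
    Returns (markers_compared, markers_differing, stepwise_distance).
    """
    shared = sorted(set(a) & set(b))
    differing = sum(1 for m in shared if a[m] != b[m])
    stepwise = sum(abs(a[m] - b[m]) for m in shared)
    return len(shared), differing, stepwise

# First 12 markers as listed on this page.
mine = {"393": 13, "390": 25, "19": 16, "391": 11, "385a": 11, "385b": 14,
        "426": 12, "388": 12, "439": 10, "389-1": 14, "392": 11, "389-2": 32}

# Hypothetical comparison haplotype: one step at 439, two steps at 390.
other = dict(mine, **{"439": 11, "390": 23})

compared, differing, stepwise = marker_distances(mine, other)
print(f"{differing}/{compared} markers differ, stepwise distance {stepwise}")
```

A match "at distance 1/12" in the text corresponds to one differing step over 12 compared markers; the Kamath comparison at 37 markers has 12 differing markers but a stepwise distance of 22, because several markers differ by more than one repeat.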

Sorenson molecular genealogy foundation claims 6/7 (DYS 393,458,459(only one),455,454,464,456) match with Webb (descendant of William Smith Webb, b. 18 Mar 1818 in Virginia) and another 6/7 (393,394,not 458,455,437,449,442) match with Mitchell (descendant of William Michell b. 11 Jul 1842, whose son Thomas b. 14 Jan 1874 was born in Frontenec, Ontario, Canada). Ybase also finds a number of matches.
