Wednesday, March 25, 2015

Getting extra out of 23andme and AncestryDNA tests: Y SNPs

After learning from the ISOGG website that both 23andme and AncestryDNA tests include some Y SNPs, I did some investigating to see what we might get out of those results.

23andme Haplogroups and SNPs

First, 23andme gives outdated haplogroup assignments, but by looking at the raw data, you can learn which SNPs they have tested. This turned out to be very helpful in evaluating an important lead. I had found a 12-for-12 match, based upon FamilyTreeDNA STR (short tandem repeat) testing, with someone with the surname Morford. Because we had strong autosomal DNA matches with Morford descendants, and because the Morfords in question lived close to my Long ancestors in Greene County, Pennsylvania, I am pretty confident that we have a genetic connection to these Morfords.

It turned out the Morford descendant had done additional SNP tests that placed him in haplogroup R-P312. When I looked at my dad’s 23andme results, I saw that he was positive for U106 and, beyond that, L48, while being negative for L47. Because R-U106 and R-P312 are mutually exclusive haplogroups, we apparently could not be a match on the patrilineal line despite sharing 12 STR markers. So the search for the patrilineal ancestor continues! However, by leveraging the 23andme results, I was able to learn which specific additional SNP I could test to best refine my haplogroup.
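The reasoning above can be sketched in a few lines of Python. This is only an illustration: the tiny `PARENT` table is a simplified slice of the Y tree that I made up for the example (intermediate branches are left out), and the function names are hypothetical, not part of any testing company's tools.

```python
# Simplified, illustrative slice of the Y tree: child SNP -> parent SNP.
# Intermediate branches are deliberately omitted (assumption for this sketch).
PARENT = {
    "P312": "M269",
    "U106": "M269",
    "L48": "U106",
    "L47": "L48",
}

def lineage(snp):
    """Return the path from a SNP back to the root of this mini tree."""
    path = [snp]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def terminal_snp(calls):
    """Pick the positive SNP that sits deepest in the mini tree."""
    positives = [snp for snp, result in calls.items() if result == "+"]
    return max(positives, key=lambda snp: len(lineage(snp)))

def compatible(snp_a, snp_b):
    """Two results can fit one paternal line only if one SNP lies on the other's lineage."""
    return snp_a in lineage(snp_b) or snp_b in lineage(snp_a)

# Calls described in the post: positive for U106 and L48, negative for L47.
dad_calls = {"U106": "+", "L48": "+", "L47": "-"}
dad_terminal = terminal_snp(dad_calls)

print(dad_terminal)                      # L48
print(compatible(dad_terminal, "P312"))  # False: different branches
```

The negative L47 call is what points to the next useful test: something below L48 other than L47.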

I also learned that 23andme assigned another Long cousin to a slightly different haplogroup than my dad, whose assignment had an extra “d” at the end, indicating a slightly more refined haplogroup. The difference, it appears, is that the cousin probably had a “no-call” on the relevant SNP. Beware that such subtle differences can arise from methodology, not genetics.

Comparing AncestryDNA Y SNPs

Next, I explored AncestryDNA, where two Barmore cousins had results. One of them had done a 12-marker Y test at FamilyTreeDNA, which placed him in haplogroup J2a1h. We wanted to confirm that the other Barmore cousin matched his Y DNA. I was pretty sure he would, because the two were matches at AncestryDNA (and GEDMATCH, based on autosomal DNA). After a query to CeCe Moore, I learned that Ann Turner had created a spreadsheet tool for exploring Y SNPs from AncestryDNA. I downloaded the spreadsheet and copied in the raw data. The results showed that the cousin had 26 positive SNPs with haplogroup associations. Most significantly, they included 4 SNPs for which he tested positive that would place him in J2a1h2a1, a subgroup of J2a1h. This is consistent with the result of the Barmore cousin who took the 12-marker test. It seemed like pretty convincing evidence that the two cousins indeed shared Y DNA, and it allows us to focus on more detailed STR or SNP testing in lieu of spending money on another basic Y test.

Thanks to CeCe, Tim Janzen, and the team managing the U106 haplogroup project for answering my questions about how to get a bit more mileage out of my Y results!

Friday, March 20, 2015

Using Genetic Genealogy to Break through Brick Walls in Your Family Tree

Getting Started in Genetic Genealogy

Genetic genealogy is the process of using DNA tests to determine how people are related through shared DNA (by “blood”). To better understand this rapidly evolving field, the International Society of Genetic Genealogy has a useful guide and glossary for “newbies”. Also see “Definitions of the terms used in genetic genealogy” at the FamilyTreeDNA website for more definitions.
There are four main types of DNA used for genetic genealogy. Autosomal DNA is the most useful for general genealogy in recent generations, although all types may help to answer particular questions.
Y chromosome is passed only from fathers to sons, so it traces only a single line of descent (patrilineal). It is very useful for testing even distant relationships through that line.
Mitochondrial (mtDNA): passed only from mothers to children—single line of descent only (matrilineal). It is useful for testing even distant relationships through that line, but challenging to use due to surname changes that generally make it more difficult to trace female lines.
Autosomal (auDNA or atDNA): all chromosomes except the sex chromosomes—shared and recombined from parents, it represents all lines but is only reliable at detecting relationships within about 5-6 generations (see figure below).
See the figure from the Your Genetic Genealogist blog by CeCe Moore.
X chromosome DNA is passed from fathers to daughters or from mothers to either sons or daughters, giving it a unique pattern of descent (see figures 2A and 2B). It can be particularly useful in narrowing relationships identified through autosomal DNA tests. Tests for autosomal DNA include X-DNA, but only FamilyTreeDNA, 23andme, and GEDMATCH show your X-DNA results (AncestryDNA does not).

See the figure that shows the average percentage of X DNA from each generation. Note that you should not necessarily “expect” the average, because the X chromosome recombines somewhat unpredictably.
Also see the X-DNA Inheritance chart for females (shaded boxes show possible sources of X DNA).

A simple rule to remember is that X DNA cannot pass through 2 males in sequence.
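That rule is easy to turn into a quick check. The sketch below is my own illustration (the function names are made up): it tests whether a given line of descent could carry X DNA, and counts how many ancestors in each generation could have contributed to your X.

```python
def can_carry_x(path):
    """path lists sexes from the tester up to the ancestor, e.g. "MFM".

    The first letter is the tester. A father never passes his X to a son,
    so any two males in a row break the chain.
    """
    return "MM" not in path

def x_ancestors(tester_sex, generations):
    """Count the possible X-DNA ancestors in each generation above the tester."""
    current = [tester_sex]
    counts = []
    for _ in range(generations):
        parents = []
        for sex in current:
            # A male gets his X only from his mother; a female from both parents.
            parents.extend(["F"] if sex == "M" else ["M", "F"])
        counts.append(len(parents))
        current = parents
    return counts

print(can_carry_x("MFM"))   # True: man -> his mother -> her father
print(can_carry_x("MMF"))   # False: man -> his father breaks the chain
print(x_ancestors("M", 5))  # [1, 2, 3, 5, 8]
print(x_ancestors("F", 5))  # [2, 3, 5, 8, 13]
```

The counts grow much more slowly than the 2, 4, 8, 16, 32 ancestors of an ordinary pedigree, which is why an X match can narrow down the candidate lines so quickly.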

Basics of Autosomal DNA Testing Strategies

Each of the companies offers the basic DNA test for about $99 (note that there are often sales throughout the year, especially around the winter holidays and DNA Day on April 25). Buying multiple kits can also save money. This site compares the three major companies:
FamilyTreeDNA is particularly helpful for testing hypothesized relationships, because matches are generally responsive, and the site has the best tools for examining your matches in detail. FamilyTreeDNA also has good international coverage.
AncestryDNA is particularly helpful for identifying unknown ancestors and relatives of US origins, because it has the largest database of users in the U.S. with family trees and will generally yield the most matches. However, the site lacks tools for examining your matches (see below for how to get tools!). AncestryDNA is also in the process of extending its services to the U.K.
23andme may be helpful for finding living relatives, because they have a broad database of users (not just genealogists!). However, many users are anonymous and lack family history information, so it is challenging to use for genealogy. One perk of 23andme is that their test provides haplogroup information for Y DNA (for males) and mitochondrial DNA. Their results for the Y are not directly comparable to FamilyTreeDNA’s Y tests, but they can add some clues and help to exclude some families.
To get the most answers to your questions, you may choose all three, and FamilyTreeDNA accepts transfers from the other two companies for $39. (Alert: the newest version of 23andme (V4) does not transfer to FamilyTreeDNA or GEDMATCH).
The Geno 2.0 test from National Geographic provides “deeper” (=older) ancestry than is used by genealogists, but there is a free transfer to FamilyTreeDNA where the results may be useful, as they include some Y and mitochondrial DNA results.

Interpreting Autosomal DNA Results

The centimorgan (cM) is a unit of genetic distance based on how often recombination occurs: two points 1 cM apart have about a 1% chance of being separated by recombination in a single generation. The longer a shared segment is in cM, the more likely it is to have been inherited from a recent common ancestor.
A block of more than 10 cM indicates definite shared ancestry.
A block of 5-10 cM indicates probable shared ancestry (most companies and GEDMATCH use a threshold of about 7 cM to determine matches. GEDMATCH also uses 7 cM as a common threshold for matches based upon X DNA, although it is more complicated to interpret those values because men and women have different amounts of X DNA).
Smaller segments can indicate shared ancestry, but they may also be false positives (see the post by Roberta Estes for more).
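If you keep your matches in a spreadsheet or script, these rules of thumb are simple to apply automatically. Here is a minimal sketch; the cutoffs are just the approximate figures given above, and the function names are my own.

```python
# Rule-of-thumb cutoffs from this post (approximate, not official values).
DEFINITE_CM = 10.0
PROBABLE_CM = 5.0
MATCH_THRESHOLD_CM = 7.0  # roughly what most companies and GEDMATCH use

def classify_segment(cm):
    """Label a shared segment by its length in centimorgans."""
    if cm > DEFINITE_CM:
        return "definite shared ancestry"
    if cm >= PROBABLE_CM:
        return "probable shared ancestry"
    return "possible shared ancestry or false positive"

def would_be_reported(cm):
    """Would a segment of this length clear the usual matching threshold?"""
    return cm >= MATCH_THRESHOLD_CM

# Made-up segment lengths for illustration.
for cm in (23.4, 8.1, 5.6, 3.2):
    print(cm, classify_segment(cm), would_be_reported(cm))
```

Note that a 5.6 cM segment counts as "probable" here but would still fall below the usual 7 cM matching threshold, so you would only see it after lowering the threshold in a tool that lets you do so.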

Table A: Likelihood based upon length of shared segment

Length of shared segment
Likelihood you and your match share a common ancestor within 6 generations (values will be different for endogamous populations)
>30  cM
20-30 cM
12-20 cM
6-12 cM
<6 cM

Table B: Likelihood of matching actual relatives

Shared  DNA
Average cM Shared
Likelihood of Matching
Mother, father, siblings
Grandparents, aunts, uncles, half-siblings, double first cousins
Great-grandparents, first cousins, great-uncles, great-aunts, half-aunts/uncles, half-nephews/nieces
First cousins once removed, half first cousins
Second cousins, first cousins twice removed
Second cousins once removed
Third cousins, second cousins twice removed
Third cousins once removed
Fourth cousins
Fourth cousins once removed
Fifth cousins
Sixth cousins or more distant
Triangulation is the process of determining that a particular autosomal DNA segment has been inherited from a common ancestor by identifying two or more cousins who share that segment. Note that this does not mean that all descendants of that ancestor will have that segment, but it suggests that the segment might be an indicator of descent from that family line.
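For those who like to tinker, here is a rough sketch of the first step of triangulation: finding pairs of matches whose shared segments overlap on the same chromosome. The data layout, the names, and the 5 million base pair minimum overlap are all assumptions made for the example, not the method used by any particular site.

```python
from itertools import combinations

def overlap(a, b):
    """Return the shared (chromosome, start, end) of two segments, or None."""
    if a["chrom"] != b["chrom"]:
        return None
    start = max(a["start"], b["start"])
    end = min(a["end"], b["end"])
    return (a["chrom"], start, end) if start < end else None

def triangulation_candidates(segments, min_overlap):
    """List pairs of matches whose segments overlap by at least min_overlap base pairs."""
    found = []
    for a, b in combinations(segments, 2):
        shared = overlap(a, b)
        if shared and shared[2] - shared[1] >= min_overlap:
            found.append((a["match"], b["match"], shared))
    return found

# Hypothetical segments that three cousins share with me (positions in base pairs).
segments = [
    {"match": "Cousin A", "chrom": 7, "start": 10_000_000, "end": 25_000_000},
    {"match": "Cousin B", "chrom": 7, "start": 18_000_000, "end": 30_000_000},
    {"match": "Cousin C", "chrom": 12, "start": 5_000_000, "end": 9_000_000},
]

candidates = triangulation_candidates(segments, min_overlap=5_000_000)
print(candidates)  # [('Cousin A', 'Cousin B', (7, 18000000, 25000000))]
```

An overlap alone is only a candidate. Because you carry two copies of each chromosome, one cousin could match on your mother's copy and the other on your father's, so you still need to confirm that the two cousins also match each other on that segment.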

Tools for Triangulation

GEDMATCH is a free, donation-supported site for comparing results across the 3 major companies. By donating $10, you can become a “Tier 1” member with access to some additional tools, including Triangulation.
The Autosomal DNA Segment Analyzer triangulates your FamilyTreeDNA matches.
Genome Mate allows you to keep track of your matches across the platforms from FamilyTreeDNA, 23andme, and GEDMATCH.
Matches in common: finding all matches shared by two or more individuals. This feature is available at FamilyTreeDNA and GEDMATCH.
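If you export match lists for several kits, finding matches in common is just a set intersection. A small sketch follows, with made-up kit and match names and a hypothetical function name:

```python
def matches_in_common(match_lists):
    """Return the matches that appear in every kit's match list, sorted by name."""
    kits = iter(match_lists.values())
    common = set(next(kits))
    for matches in kits:
        common &= set(matches)
    return sorted(common)

# Hypothetical match lists for three related testers.
match_lists = {
    "Dad": ["R. Morford", "T. Barmore", "J. Smith", "E. Long"],
    "Cousin 1": ["E. Long", "J. Smith", "K. Jones"],
    "Cousin 2": ["J. Smith", "E. Long", "T. Barmore"],
}

shared = matches_in_common(match_lists)
print(shared)  # ['E. Long', 'J. Smith']
```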

Some Common Questions

Why does my known cousin not appear as a DNA match? The odds of matching depend on the degree of the relationship (see table above), and known cousins may appear closer or more distant due to random inheritance of DNA. As a result, some cousins, even as close as 3rd cousins, may not appear in your match lists. Non-paternity may also account for the lack of a match: the presumed relatives actually had different fathers and/or mothers than expected.
Why does my sibling have different matches than I do? Because autosomal DNA is randomly inherited from one’s parents, siblings will have somewhat different autosomal DNA. Also, note that females inherit X DNA from their fathers and their mothers, while males inherit X-DNA only from their mothers, so brothers and sisters have different X-DNA results. For this reason, it may be helpful to have results from your siblings in addition to your own results.
Why do I have a relatively close match to someone, yet we cannot find our relationship? In cases where people actually know their recent ancestors, this result may reflect having more than one shared line of ancestry. You may look for “cousin marriages” in the trees of such individuals, which will increase the DNA passed down by those ancestors in common.

Supercousins or “Up cousins”: A person one or more generations higher (removed "upwards") than anyone alive in your direct line, whose DNA results can help you make connections.

Two pathways for searches

Pathway 1: start from records or family lore
1. Find a suspected common ancestor based upon records or family lore
2. Find one or more descendants to test
3. Get DNA results
4. Check whether shared segments and matches in common support the hypothesis

Pathway 2: start from DNA matches
1. Find matches in common with one or more shared DNA segments
2. Search the trees of those matches for families and places in common
3. Identify likely common ancestors
4. Determine if paper records support a connection

Ten strategies for using genetic genealogy to break through brick walls

1)      Secure samples from the oldest generations: In your immediate family, recruit DNA samples from the highest generation available on the line of interest. Once processed and stored with a company like FamilyTreeDNA, DNA samples may be used for additional testing in the future.
a.      Note that siblings will have somewhat different results so it can be worth getting samples from each. In particular, males and females have different X DNA results.
b.      For general searches with a focus on U.S. ancestry, I recommend starting with AncestryDNA and transferring results to FamilyTreeDNA ($39) and GEDMATCH (free or $10 to get triangulation).
c.       If you want to validate a hypothesized relationship, going straight to FamilyTreeDNA may be a more efficient solution because they offer more sophisticated analysis tools.
2)      Build a cousin network for genetic genealogy: recruit 1st, 2nd, 3rd, 4th, and 5th degree cousins who share descent on your lines of interest to take DNA tests. Especially seek out “supercousins” from higher generations who carry more DNA from the ancestors of interest.
a.      Note that results from cousins who have multiple lines of descent from the same ancestors will have greater power to detect matches with those ancestors.
b.      Cousins whose ancestors were half-siblings of your ancestor of interest will have a weaker match, but their results can help you to isolate that paternal or maternal line.
c.       Living cousins who would have an X DNA, Y DNA, or mitochondrial DNA connection may be particularly valuable for validating relationships, including ones that may be too distant for autosomal DNA to reliably trace.
3)      Find the cousins to fill out your network: To identify cousins who would be helpful in your search, you can use Wikitree and other online family trees to identify living descendants who have tested or might be willing to test. Genealogy sites will generally yield higher response rates, but even general sites like Facebook can work, although fewer people tend to reply there.
4)      Share your information so others can help you: Link your DNA results and all of your known surnames to complete family trees so that folks can better find points of connection—let them help you! Avoid posting partial trees (for example, your paternal or maternal family only) and clearly identify lines that represent a known or suspected adoption.
5)      Contact your matches but give them details to understand the connection: When you contact individuals with whom you share a match, be sure to identify the type of match (i.e., autosomal) and the name associated with the kit. Genetic genealogists often manage results from many individuals.
6)      Systematically search your DNA results using multiple strategies:
a.      Find matches in common with known relatives; make notes associating those individuals with the shared surnames and/or locations. Note that even if you can’t trace a particular matching individual to your family, you may be able to figure out where they connect to your tree, and triangulate on their results to find other matches in common.
b.      Search matches by surnames, particularly relatively rare ones (all three companies permit this, although Ancestry works the best, and the AncestryDNA Helper tool allows you to search by full names). Based on paper genealogy, you may have hunches about which families are connected to yours. If you can find someone with a rare surname in that family, you might try searching for that surname in your matches.
c.       Search matches by placenames, particularly when relatively rare ( is best for this kind of search)
d.      Search matches by shared DNA segments (use the tools under “Triangulation”).
7)      Group and sort your results: Generate lists or spreadsheets that show clusters of shared matches by family group. You can also generate spreadsheets showing shared matches by DNA segments on each chromosome. If someone unknown shares one of those segments, you may be able to guess to which line they relate.
8)      Use the hints (shaking leaves) at AncestryDNA to identify folks who appear to share a common ancestor. Contact those people and encourage them to upload their results to GEDMATCH so that you can search for matches in common and compare matching DNA segments.
9)      Break out the advanced tools: If you have AncestryDNA results, install Jeff Snavely’s “AncestryDNA Helper” tool (available in the Google Chrome store for use only with the Chrome web browser) to automatically obtain lists of matches and ancestors of matches. If you have multiple kits in your Ancestry account, you can use this to easily identify shared matches.
10)  Hunt for new leads: You can use the “Ancestors of Matches” results from the AncestryDNA Helper to find individual names (first and last name combined) that are particularly common in your ancestry. You can sort by the “incidence” column to determine individual names that appear multiple times. This is currently one of the best options when trying to trace a common surname like Smith.
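
If you would rather do the counting yourself, for example from a list of ancestors you have copied out of your matches' trees, the same “incidence” idea takes only a few lines. This is my own sketch with invented names and an invented data layout; it is not how the AncestryDNA Helper works internally.

```python
from collections import Counter

def name_incidence(ancestors_by_match):
    """Count how many matches list each full ancestor name, most frequent first.

    Each name is counted once per match, and names seen in only one
    match's tree are dropped.
    """
    counts = Counter()
    for names in ancestors_by_match.values():
        counts.update(set(names))
    repeated = [(name, n) for name, n in counts.items() if n > 1]
    return sorted(repeated, key=lambda item: (-item[1], item[0]))

# Hypothetical ancestor lists taken from three matches' trees.
ancestors_by_match = {
    "match1": ["John Smith", "Ann Barmore", "John Smith"],
    "match2": ["John Smith", "Eli Long", "Mary Jones"],
    "match3": ["Ann Barmore", "Eli Long", "John Smith"],
}

incidence = name_incidence(ancestors_by_match)
print(incidence)  # [('John Smith', 3), ('Ann Barmore', 2), ('Eli Long', 2)]
```

Counting full names rather than surnames alone is what makes this workable for a common surname, although the same name in two trees can still belong to two different people.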

Final word: Genetic genealogy adds a powerful scientific tool for family historians. You will want a skeptical frame of mind when pursuing possible leads and matches—do not discount that folks may be related along multiple lines or that family trees may have errors.