Science the Heck Out of Your DNA — Part 7

This is my 100th post!
A special thanks to all of my readers for making blogging so much fun.
Scroll down for links to other posts in this series.

I presented a talk on this method at the i4GG conference in December 2017. The video is available for purchase here, either individually or as part of the all-conference package.

The first three posts in this series explained the underlying principles behind the probability approach to genetic genealogy: form two or more hypotheses about how an “unknown” person (or target) fits into a known tree, calculate how likely each hypothesis is, then focus on the most likely hypotheses for additional testing and/or paper-trail confirmation.  The next three posts walked us through specific examples in which the method was applied.  Those examples used a calculator in a table format; you enter a matrix of names, shared cM amounts, and hypothesized relationships to perform the calculations.

Tables, however, aren’t very intuitive when we’re dealing with branching trees, and it’s very easy to make an error when determining or entering the relationships.  I often forget to specify half relationships, and typos are always a concern.

For the past few months, Jonny Perl of DNA Painter and I have been working on a much more intuitive way to test hypotheses.  The math is the same, but the interface allows you to build a visual descendant tree rather than deal with a messy (and boring) table.  Plus, it calculates the relationships for you!

Announcing … the What Are the Odds? tool! (WATO for short.)

What I love most about Jonny Perl’s programming is that he makes complex concepts intuitive and even fun. (Check out his DNA Painter chromosome mapper, if you haven’t already.)  For that reason, I’m tempted to unleash you on WATO with no instructions and let you figure it out on your own.  If that’s your cup of tea, have at it!  If not, continue reading for a quick tutorial.

 

Before You Start

The goal of WATO is to help you figure out where an unknown person fits into a known tree.  We’ll call that person the “target”.  If you are an adoptee looking for birth family, you’re the target.  Alternatively, if you have a well-supported tree and an unknown match, they are the target, and this approach can help you figure out how they are related to your family.

Before you use the tool, your target person should have:

  • Multiple DNA matches of 40 cM or more who are all descended in known ways from the same ancestor or couple. (A few matches below 40 cM are fine, too.)
  • A descendant tree of the ancestral person/couple that includes the known DNA matches in their proper places
  • The amount of shared DNA (cM are preferable, but percent will work) between the target person and each of those DNA matches. (The shared DNA amounts can come from any of the companies, but for FTDNA data, you must subtract out segments smaller than 7 cM first.)
  • Two or more educated guesses about where the target person might fit into the tree
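The FTDNA adjustment in the list above is simple arithmetic, and a minimal Python sketch may make it concrete. The function name and the segment lengths below are hypothetical, made up for illustration; only the 7 cM cutoff comes from the list.

```python
def total_shared_cm(segments_cm, min_segment_cm=7.0):
    """Sum the shared segments, skipping any smaller than the cutoff."""
    return sum(cm for cm in segments_cm if cm >= min_segment_cm)


# Hypothetical segment lengths (in cM) reported for one FTDNA match.
ftdna_segments = [42.5, 18.3, 6.9, 3.2, 1.5, 7.0]

# Only 42.5, 18.3, and 7.0 are kept; the three small segments are dropped.
print(round(total_shared_cm(ftdna_segments), 1))  # 67.8
```

The adjusted total is the number to enter for that match, rather than the total the company reports.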

 

How to Use the Tool

To get started, go to this URL:  https://dnapainter.com/tools/probability

You will be greeted with a set of instructions.  Read through them if you like, then click “CLOSE INSTRUCTIONS” at the top right to dismiss them. The blank tool will look like this:

 

 

First, click on the text that says “Enter target name here” to title the tree.  Then, enter a few notes to summarize the question at hand.  This fictional example will try to identify the birth father of “Doug Mayo”, who has several DNA matches to the Pickle family.

 

 

Click on the box labeled “Most recent common ancestor or couple” and enter the name(s) of the common ancestor/couple of the DNA matches.

Hover your cursor over the box and click “Add child” in the menu that appears. Repeat for each child of that couple who has a DNA-tested descendant, plus one.  In our example, Peter Pickle and his wife Gladys Honey had seven children, but only four are represented by DNA-tested descendants. I’ll add four children, plus a fifth to represent their other three children in our hypotheses.

Click on each child in turn to enter a name. Because no known descendants of Maisie, Gerald, or James Pickle have tested yet, I’ll lump them together in the same box for now to avoid cluttering the diagram.

Use the same approach to fill in the descendant tree, tracing down to their DNA-tested descendants. You don’t need to add every single descendant (just the ones that lead to DNA matches), and you don’t need to know every person’s real name.
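If you are curious how a relationship can be read off a descendant tree like this one, here is a minimal Python sketch. It is not WATO’s code. The PARENT table, the function names, and the placeholder people “Match A” and “Match B” are illustrative assumptions, and the sketch ignores half relationships entirely.

```python
# Child -> parent links for part of the fictional Pickle tree.
# "Match A" and "Match B" are hypothetical placeholders, not people from the example.
PARENT = {
    "Jasper": "Peter & Gladys",
    "Martha": "Peter & Gladys",
    "JJ": "Jasper",
    "Annie": "Jasper",
    "Match A": "JJ",
    "Match B": "Martha",
}


def path_to_root(person):
    """List the person followed by each ancestor up to the top of the tree."""
    path = [person]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def generations_to_common_ancestor(a, b):
    """Return (common ancestor, generations up from a, generations up from b)."""
    path_b = path_to_root(b)
    for steps_a, ancestor in enumerate(path_to_root(a)):
        if ancestor in path_b:
            return ancestor, steps_a, path_b.index(ancestor)
    raise ValueError("no common ancestor in this tree")


def describe(steps_a, steps_b):
    """Turn two generation counts into a short label such as '1C1R'."""
    near, far = sorted((steps_a, steps_b))
    if near == 0:
        return "direct line"
    if near == 1:
        return "siblings" if far == 1 else "aunt/uncle and niece/nephew line"
    removed = f"{far - near}R" if far > near else ""
    return f"{near - 1}C{removed}"


ancestor, up_a, up_b = generations_to_common_ancestor("Match A", "Match B")
print(ancestor, up_a, up_b)  # Peter & Gladys 3 2
print(describe(up_a, up_b))  # 1C1R
```

Counting generations up to the shared ancestor on each side is all that is needed for full relationships, which is why it is enough to add only the people who lead to DNA matches.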

When you reach someone who is a DNA match to the target person, hover over their box, click “Enter Match cM”, enter the shared amount, and click “Save”.  There’s an option to enter % shared if you prefer.

The box will change colors and show the cM amount.

Continue until all of the known DNA matches in the family have been accounted for.  At this stage, my diagram looks like this:

Now we add “dummy” people where we think the target person might fit into the Pickle tree.  It’s possible that Doug Mayo is descended from Maisie, Gerald, or James Pickle. I’m not sure which generation he’s in, so I’ll add three for now.


In each spot where Doug might fit, I hover over the dummy person and select “Use as Hypothesis”.

 

The hypothesis will be automatically numbered and assigned a “score” based on how likely it is relative to the other hypotheses. If there’s only one hypothesis, the score will always be either “1” (possible) or “0” (not possible). Because the scores are based on comparisons to the other hypotheses and are not absolute scores, they will change as I add more hypotheses.
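To see why a lone hypothesis can only score 1 or 0, and why scores shift as hypotheses are added, here is a minimal Python sketch. It assumes that each hypothesis is scored by multiplying the probabilities of its individual matches and then dividing by the smallest nonzero result. That is my reading of the approach, not a description of WATO’s actual code, and the probabilities are invented for illustration.

```python
from math import prod


def relative_scores(match_probs):
    """match_probs maps each hypothesis to a list of per-match probabilities.

    Returns scores relative to the least likely hypothesis that is still possible.
    """
    combined = {h: prod(probs) for h, probs in match_probs.items()}
    possible = [value for value in combined.values() if value > 0]
    if not possible:
        return {h: 0 for h in combined}
    baseline = min(possible)
    return {h: round(v / baseline) if v > 0 else 0 for h, v in combined.items()}


# Hypothetical probabilities for two matches under each hypothesis.
print(relative_scores({"H1": [0.30, 0.10]}))  # {'H1': 1}
print(relative_scores({"H1": [0.30, 0.10], "H2": [0.60, 0.50]}))  # {'H1': 1, 'H2': 10}
print(relative_scores({"H1": [0.00, 0.10]}))  # {'H1': 0}
```

With only one hypothesis, there is nothing to compare against, so it is either its own baseline (score 1) or impossible (score 0).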

I think it’s also possible that Doug is descended from Jasper Pickle, either through a full sibling to JJ and Annie or through a half sibling.  First, I add a set of dummy people and hypotheses to represent the full sibling line.

Note that all of the hypotheses were automatically renumbered when new ones were added.  Don’t get too attached to the numbering at any one step of the process!

Then, I add another dummy child to Jasper and select “Define Half Relationships”.

Next, I use the tick boxes to specify which of Jasper’s children are half siblings to this new addition and click CLOSE.

Here is my finished hypothesis diagram:

Feel free to try this example for yourself and to add additional hypotheses.  For example, try Doug as a child, grandchild, or great grandchild of Martha, Willard, and/or Anise to see what happens.  I ended up with 18 hypotheses, like so:

 

Interpreting the Results

Using the last version of the tree, the first thing I notice is that some hypotheses (H1, H2, H4, H5, H7, H10, H13, H14, H15, and H16) have red flags with “Score = 0”. Those hypotheses are not possible given how much DNA Doug shares with his matches and what we currently know about the ranges for known relationships. I can “Remove Hypothesis” for those if I like (and the remaining hypotheses will be renumbered), or I can just ignore them.

The remaining hypotheses all have green flags and scores that are positive integers.  Once the score = 0 hypotheses are ruled out, the remaining ones are assigned scores starting with 1 for the least likely (H3 in this case) and scaling up from there.  From highest to lowest, the scores are: H6 (score = 293), H9 (288), H8 (204), H17 (201), H12 (8), H18 (4), H11 (2), and H3 (1).

What does this tell us?  What it doesn’t tell us is precisely where Doug fits into the tree. But it does give us some guidance for where to look next.  Four hypotheses have scores in the 200s, while the other viable hypotheses are all in single digits.  Those low-score hypotheses are not where I’d focus my attention, efforts in contacting family, and testing dollars.
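One way to put those scores in perspective is to convert them to shares of the total. The Python sketch below uses the eight scores listed above. Reading a share as a rough probability is an assumption on my part: it only holds if every hypothesis was equally plausible beforehand and the true placement is one of the hypotheses in the tree. The function name is hypothetical.

```python
# Scores of the eight viable hypotheses from the example.
scores = {"H6": 293, "H9": 288, "H8": 204, "H17": 201,
          "H12": 8, "H18": 4, "H11": 2, "H3": 1}


def score_shares(scores):
    """Express each score as a fraction of the sum of all scores."""
    total = sum(scores.values())
    return {h: s / total for h, s in scores.items()}


shares = score_shares(scores)
top_four = sum(sorted(shares.values(), reverse=True)[:4])

print(f"{shares['H6']:.1%}")  # 29.3%
print(f"{top_four:.1%}")      # 98.5%
```

Under those assumptions, the four high-scoring hypotheses account for nearly all of the total, which is why the single-digit ones are poor places to spend time and money.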

If money were tight, I’d focus solely on Jasper Pickle’s line, because that’s where three of the four highest ranked hypotheses are.  If time were of the essence, I’d try to test descendants of Jasper, Maisie, Gerald, and James.  With either approach, when the new results came in, I’d add them to the tree and re-evaluate the hypotheses.

 

But Wait!  There’s More!

Scroll down below the tree, and you’ll see a listing of the hypotheses, their scores, and a status summary for each.

 

Scroll further for a “Collated Match Data” table that summarizes the DNA matches, their relationships under each hypothesis, and the probability of each relationship given the cM amounts. (I couldn’t fit all 18 hypotheses into the screenshot.)

 

This table can give you insights into which matches are ruling out which hypotheses and sometimes can point to problems in your hypothesis tree.  For example, if a certain hypothesis has high probabilities for all of the matches except one, which has zero probability, it’s worth a check to make sure that the cM amounts and relationships (especially full vs half) were entered correctly.
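That check can be sketched in a few lines of Python, assuming you have copied the table into a dictionary by hand. The layout, the function name, and every name and number below are hypothetical. The sketch flags any hypothesis that is ruled out by exactly one match, without judging how high the other probabilities are.

```python
def single_match_conflicts(table):
    """table maps hypothesis -> {match name: probability}.

    Returns {hypothesis: match} where exactly one match has zero probability.
    """
    flagged = {}
    for hypothesis, probs in table.items():
        zeros = [match for match, p in probs.items() if p == 0]
        if len(zeros) == 1 and len(probs) > 1:
            flagged[hypothesis] = zeros[0]
    return flagged


# Hypothetical probabilities copied from a collated table.
table = {
    "H1": {"Match A": 0.45, "Match B": 0.0, "Match C": 0.38},
    "H2": {"Match A": 0.20, "Match B": 0.15, "Match C": 0.30},
    "H3": {"Match A": 0.0, "Match B": 0.0, "Match C": 0.10},
}

print(single_match_conflicts(table))  # {'H1': 'Match B'}
```

A flagged match is only a prompt to recheck that entry. The hypothesis may simply be wrong.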

Finally, down at the bottom of the page are links to the main testing companies, should you need to buy additional DNA tests to confirm your hypotheses.  Using these links won’t cost you anything extra and will help to subsidize the development of new tools for genealogy.

 

Housekeeping

WATO has some housekeeping features that let you save, share, and delete trees and switch among your saved trees.

 

With an account at DNA Painter, you can save multiple trees and switch among them at will.  The most recent tree will be stored in your browser memory, and saved trees are accessible in the “Switch tree” pulldown.

 

For More Help

If you’re on Facebook, join the WATO group for support, strategies, and new developments.

 

Other posts in this series can be found here:

5 thoughts on “Science the Heck Out of Your DNA — Part 7”

  1. Hi Leah – this is great!! Much easier than the old table-based method & aesthetically quite nice too! I was quickly able to build up a matrix of several hypotheses.

  2. Hi Leah,

    Will this tool work when the target is related to the DNA testers on both maternal and paternal sides? Example: Adam and Meghan are 3rd cousins. Adam’s mother is Meghan’s father’s 2nd cousin. We need to find Adam’s father who is a relative of Meghan’s mother. We suspect Adam is a 2C1R of Meghan’s. Also, Adam’s parents are predicted 4th cousins according to GEDmatch.

    1. It’s not designed for endogamy, so you’ll have to take the scores with a grain of salt. Hopefully, it can still guide you in the right direction.
