Genetic Communities Are Here!

On March 28, 2017, AncestryDNA rolled out a novel and exciting feature called Genetic Communities — GCs for short — as part of their standard DNA test (no additional fee).  They are like nothing we’ve seen before in genetic genealogy! As such, they’re a bit hard to describe at first: much more focused than an ethnicity group, which may date to hundreds or thousands of years in the past, and broader than just the DNA relatives on your match list, which typically trace back only a handful of generations.

 

What is a Genetic Community?

Genetic Communities are defined by clusters of people who share long IBD (identical-by-descent) segments of DNA. AncestryDNA’s White Paper on GCs doesn’t define “long”, but the scientific paper on which the method is based specifies that the total shared amount of DNA is 12 cM or more. (Because scientific papers are rather dense, I wrote a layperson’s summary that you can read here.) Members of a community may also match people outside the cluster, but their connections to people within the cluster are more numerous and stronger. You won’t match everyone in a GC to which you belong, but everyone in the group matches other people in it.

The image on the left is an IBD network, with the circles representing individual people and the lines shared DNA. Everyone is connected to some other members in the network but not necessarily all of them. On the right, we see the same network after a community detection analysis that clusters the network members based on the strength of their genetic relationships.
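To make the clustering idea a little more concrete, here is a minimal Python sketch run on a toy IBD network. It is not AncestryDNA's actual method: I've swapped in a simple weighted label propagation as a stand-in for their community detection analysis. The six people (A through F) and their shared-cM values are invented for illustration, and the only number taken from the discussion above is the 12 cM minimum.

```python
from collections import defaultdict

MIN_CM = 12  # threshold quoted from the scientific paper; everything else here is hypothetical

# (person, person, total shared cM) -- invented toy data, not real matches
TOY_EDGES = [
    ("A", "B", 40), ("A", "C", 35), ("B", "C", 50),   # one tightly connected group
    ("D", "E", 45), ("D", "F", 30), ("E", "F", 38),   # a second tightly connected group
    ("C", "D", 12),                                   # a weak link between the groups
    ("A", "F", 8),                                    # below the threshold, so ignored
]

def detect_communities(edges, min_cm=MIN_CM, max_rounds=100):
    """Cluster people with weighted label propagation (a stand-in technique)."""
    neighbors = defaultdict(dict)
    for a, b, cm in edges:
        if cm >= min_cm:
            neighbors[a][b] = cm
            neighbors[b][a] = cm

    # Everyone starts in a community of one, labeled with their own name.
    labels = {person: person for person in neighbors}

    for _ in range(max_rounds):
        changed = False
        for person in sorted(neighbors):
            # Add up shared cM per neighboring label.
            scores = defaultdict(float)
            for other, cm in neighbors[person].items():
                scores[labels[other]] += cm
            # Strongest label wins; ties keep the current label, then go alphabetical.
            best = min(scores, key=lambda lab: (-scores[lab], lab != labels[person], lab))
            if best != labels[person]:
                labels[person] = best
                changed = True
        if not changed:
            break

    groups = defaultdict(set)
    for person, label in labels.items():
        groups[label].add(person)
    return sorted(sorted(members) for members in groups.values())

print(detect_communities(TOY_EDGES))  # [['A', 'B', 'C'], ['D', 'E', 'F']]
```

Notice that C and D do share DNA, yet they land in different communities because each has stronger ties within their own group. That is the same behavior described above: members can match people outside the cluster, but their connections inside it are more numerous and stronger.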

 

The communities aren’t defined by ethnicity but rather by the relationship patterns of the people within them. Multiple GCs can be found within a given ethnicity. For example, there are 16 Irish GCs grouped into three broader clusters (Ulster, Connacht, and Munster).

Geographic ranges of the three broad clusters of Irish genetic communities. There are five Ulster (northern) communities, five Connacht (western) communities, and six Munster (southern) communities.

 

Conversely, the communities can cross ethnicities, such as an African American community with members who derive from African, European, and Native American groups.

Ethnicity estimates for two members of the “African Americans in Louisiana” community.

 

Communities also aren’t defined strictly by geography. The island of Sicily has five GCs, three of which have almost identical boundaries in western Sicily.  That is, groups of people who live side-by-side but are not intermarrying and forging genetic relationships with one another will form distinct communities.

The five communities on the island of Sicily. The three communities for “Sicilians in Western Sicily” have essentially identical geographic boundaries.

 

Historical context

But wait!  If the communities are defined by genetic connections, where do the maps come from? This is where the communities get really interesting. Because IBD segments necessarily mean shared population history, the migration pattern of a community is reflected in the family trees of its members. Once AncestryDNA’s algorithm has defined a genetic cluster, their computers scan the associated pedigrees of the community members for common birth locations and dates. The Ancestry member database is so large that even if some individuals in a cluster don’t have a tree, or have an incorrect tree, the patterns are still obvious. Some GCs can be quite specific (“Germans from Baden-Württemberg in the Dakotas”), and so far, all of the ones I’ve examined are remarkably accurate.

Each community is accompanied by a “Story” that traces the community through time.  They tend to start in the 1700–1800 range and track historical developments in that community in 25- or 50-year blocks of time. To illustrate, I will use my own community, “Acadians in Louisiana Cajun Country”.

Ancestral birth locations of the Acadians in Louisiana Cajun Country community during four 25-year time periods. The dotted lines lead to the final migration destination in south-central Louisiana.

 

Louisiana Cajuns are descended from French who first settled Acadia (now Nova Scotia, Canada) in the early 1600s. You can see from the upper panel in the set of migration maps that between 1700 and 1750, France-born people were still immigrating to the New World. The period of 1750–1775 marks a great tragedy in Acadian life. They were forcibly evicted by the British from their homeland in 1755, an event called Le Grand Dérangement (The Great Upheaval). The Acadians — those that survived — were shipped to locations around the Atlantic. Not until 1765 did the first arrivals settle Louisiana Cajun country. Over the next 20 years or so, word of the new paradise spread, and Acadians who had been separated from their families were able to reunite in what is now called Acadiana.

I know my own culture’s history, but if you were placed in a community whose history you don’t know, Ancestry’s historians have done a lot of the legwork. The “Story” that I mentioned earlier summarizes the important events in that community’s history.

The historical “story” of the Acadians. The snippet for each time period can be expanded for a fuller description.

 

This. Right. Here.

I will describe some of the practical uses of GCs in a bit, but this right here is what I consider the most important long-term contribution of this feature. Imagine that you’re not particularly interested in genealogy, that you only tested to find out your “ethnicity” or because someone gave you the test as a gift. And when you get your results, directly below your ethnicity estimate, the broad strokes of your family’s history are already painted for you.

The Genetic Communities are positioned directly below the Ethnicity Estimate. This tester was placed in two different communities.

 

You follow the links and see a timeline, an historical summary, and interactive maps.  If this won’t get you interested in building your own tree, nothing will.

 

A Community challenge (pun intended)

In fact, I’m so confident that GCs will attract DNA testers to genealogy that I propose a challenge: count how many of your top 100 DNA matches do not have trees and post the percentage in the comments.  In 6 months, we can do the same thing to see whether the numbers have improved.

To get a quick count, I copied all of the text of my first two pages of matches (100 total) into a word processor, then sorted by line. That grouped all of the lines that read “No family tree VIEW MATCH” together, where they’re easy to count.  Of my top 100 matches, 43 do not have trees attached to their DNA results, or 43%.  Let’s see how that changes over time.
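If you would rather not sort lines by hand, the same count can be sketched in a few lines of Python. This assumes you have pasted the copied text of your match pages into a string, and that each match contributes one line containing “VIEW MATCH”. The sample text below is made up, and the “212 People” style of line for matches with trees is my guess at the layout, so adjust it to whatever your copied text actually looks like.

```python
def count_no_tree(pasted_text):
    """Return (matches, without_tree, percent_without_tree) for pasted match-list text."""
    matches = 0
    without_tree = 0
    for line in pasted_text.splitlines():
        normalized = line.strip().lower()
        if "view match" in normalized:        # one such line per match (assumed layout)
            matches += 1
            if "no family tree" in normalized:
                without_tree += 1
    if matches == 0:
        raise ValueError("no 'VIEW MATCH' lines found in the pasted text")
    return matches, without_tree, 100 * without_tree / matches

# Hypothetical sample; real pages would have 50 matches each.
SAMPLE = """\
No family tree VIEW MATCH
212 People VIEW MATCH
No family tree VIEW MATCH
57 People VIEW MATCH
"""

print(count_no_tree(SAMPLE))  # (4, 2, 50.0)
```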

Right or wrong, I will give away an AncestryDNA test to one lucky person who contributes both “before” (by 15 April) and “after” (in October) data.  Follow this blog to get a reminder in 6 months.

 

Connections

A link to “Connections” near the top of your community page takes you to an estimate of how strongly connected you are to that community, a link to view your matches who are also in that community, and a list of surnames associated with that particular GC.

My “Connections” page. I have 95% confidence of belonging to the Acadians in Louisiana Cajun Country community. Of the 20 surnames listed, 15 are known to be in my direct ancestry.

 

Practical applications

Thus far, I’ve described the science behind GCs and the gee-whiz mapping and storytelling components, but what can we really do with them?  My favorite aspect of communities is the ability to filter based on their memberships. At the top of the DNA match list, the existing filters have a new button called Communities. Click it to get a pull-down menu of the communities you are in.  In my case, I have Acadians in Louisiana Cajun Country and Acadians, which is a larger cluster that includes other Acadian communities, like Acadians in Central Louisiana, Acadians in the Greater New Orleans Area, and Acadians in the Canadian Maritimes.

Genetic Communities can be used to filter DNA matches.

 

The Communities filter is a great complement to the Shared Matches tool.  Remember, not everyone in a community will match everyone else, so Shared Matches can be used to narrowly filter your matches to what we hope is one specific family lineage, while the GC filter will yield a larger group of people who share a population history.

For unknown parentage searches where the parents come from different GCs, the filtering will be invaluable. Here’s an example of an adoptee who is in three communities.

This adoptee is in three genetic communities: Settlers of the South Carolina, Georgia & Northeast Florida Coast (nested within a larger network called Early Settlers of Georgia & Florida), Early Settlers of the Northeast, and Sicilians.

 

We already knew that she had ancestry from all three populations, and we’ve been manually “starring” matches to filter them. Unfortunately, there’s only one star, so we’ve had to rejigger everything whenever we’ve switched from working on her Sicilian matches to her southern matches to her northeastern ones. Now we can switch among these groups with a click, and we don’t even have to do the legwork to assign them to groups in the first place.

Communities may also help a great deal by subdividing larger populations.  For example, an unknown parentage case I’m advising knows that her birth father was Cajun. Her GC tells us that not only was he Cajun, his family settled in the New Orleans area. Recall that Cajuns form three distinct GCs, although ultimately, we are all related multiple times over. Knowing that this searcher belongs to the New Orleans group will help to narrow her search within the large endogamous Cajun network.

 

Limitations

Of course, no tool is perfect. Not everyone will be assigned to a GC, although most will be. Of 82 kits that I examined, only 6.1% were not in at least one community. The rest break down as follows (all percentages are out of the full 82 kits): 42.7% were in one community, 31.7% were in two, 15.9% were in three, and 3.7% were in four. (The counts here include only the narrowest assignments. I am in Acadians in Louisiana Cajun Country, which is nested within Acadians.  I’m only counting that as one GC.)
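The five percentages above add up to roughly 100%, which suggests they are all shares of the full 82 kits. Here is a quick Python check of that arithmetic. The kit counts are not stated in the text; they are back-calculated from the percentages, so treat them as inferred rather than reported.

```python
TOTAL_KITS = 82

# Percentages as reported above, keyed by number of communities per kit.
reported_pct = {0: 6.1, 1: 42.7, 2: 31.7, 3: 15.9, 4: 3.7}

# Inferred (not reported) kit counts: the nearest whole number of kits for each share.
counts = {n: round(pct / 100 * TOTAL_KITS) for n, pct in reported_pct.items()}
print(counts)                 # {0: 5, 1: 35, 2: 26, 3: 13, 4: 3}
print(sum(counts.values()))   # 82

# The same counts expressed as shares of only the kits that were assigned to a GC.
assigned = TOTAL_KITS - counts[0]
share_of_assigned = {
    n: round(100 * c / assigned, 1) for n, c in counts.items() if n > 0
}
print(assigned, share_of_assigned)
```

The inferred counts sum to exactly 82 and round back to the reported percentages, so the figures hang together as shares of all kits. Measured against only the assigned kits, each share would come out a little higher.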

Similarly, some of us will not be in communities that we rightfully expected. For example, my father is German and Irish, but the only community I have is for Cajuns. His ancestors immigrated to New Orleans during a massive wave of resettlement in the mid- to late 1800s. I assume that the German–Irish gene pool in New Orleans is not sufficiently interwoven to create a network, and our connections to Ireland and Germany are too weak to place us in communities there. Hopefully, as more people test and more IBD connections are added to the global network, those of us who do not have a GC yet (or are missing GCs that we expect) will gain them.

Another oddity is that my son is in the Acadians in Louisiana Cajun Country GC, but my daughter is only in Acadians. They share similar amounts of DNA with my mother (their Cajun grandmother): 1550 cM and 1485 cM, respectively. I can’t quite wrap my head around why, because all of my daughter’s closest Cajun matches are in the more specific GC. Along the same lines, my son is in one of his paternal grandmother’s two GCs, but my daughter is in neither, despite sharing slightly more DNA with her Grammy than her brother does (1824 cM vs 1799 cM).  Neither of the kids shares a GC with their paternal grandfather, who’s in two.

 

What do you think?

I am really excited by GCs and can’t wait to see how other people use them. How many do you have? Do they make sense given what you know about your family’s history? How do you plan to use them? Share your thoughts in the comments.

101 thoughts on “Genetic Communities Are Here!”

  1. I am so excited about this!! Thank you for this great information!! I can’t wait to see how this helps in adoptee cases. I will send in my information.

    1. 70 with either no tree or trees with only a handful of people!!!
      I must descend from a group of private people. I have 764
      4th cousins or closer.

      I don’t understand the genetic communities at all since my brother, daughter and I all belong to different ones, despite sharing many of the same matches. (We do show up as brother – sister, mother-daughter, etc.) For instance one of the groups that my daughter was placed in listed 3 relatives that I match with also as examples of people that she matches within the group! None of our probabilities of belonging to a group were over 25%. I would guess this is due to having ancestors from all over the East Coast (Puritans as well as Colonial Va, Ga, NC and MD).

  2. % of “No Family Tree” for each kit I administer:

    Me – 37%
    Sibling “JM” – 40%
    Sibling “MM” – 36%
    Uncle “DK” – 49%
    Cousin “HM” – 47%
    Great Uncle “LWD” – 45%
    Great Uncle “RED” – 37%

  3. 29 out of top 100 have no tree connected to dna results. another metric 58 hints compared to 198 4th cousins or closer.

  4. Will this be automatically attached to the DNA test we have already taken? I don’t have to buy a new kit and take another one, right? I think this is exciting! For ME, being of Italian descent, I already see a map for my “line” that mirrors the march across the world as the Roman empire conquered all! 😀 (just teasing!)…. (but really, yes)

    1. Yes, it’ll be included with the test you’ve already taken, where the ethnicity estimate currently is. I think you’ll be surprised at how many Italian GCs there are.

  5. This is so incredibly exciting for the grandson of an adoptee born in another part of the US. I’ve been working on my grandmother’s tree for years and years, and I’m praying for at least one community to help narrow my search for her roots. Thanks for this excellent explanation of the new feature – I feel like a kid on Christmas Eve!

  6. Very interesting introduction to this new feature! I am hoping that when the update rolls out tomorrow I am assigned to GCs! I was shocked to find that 41% of my top 100 matches do not have trees! I had not expected the % to be that high. Here’s hoping your predictions are correct and that number starts to dwindle as more trees are added.

  7. Well, I counted 47 in my top 100 but I have to say that I’m also counting trees with 5 people or less (not that many) but I’ve found that many of these tiny trees have only “private” people or not much info at all. I am trying to find the identity of my maternal grandma’s father. No paper trails and my great grandmother could’ve been in a few different states so I was hoping that DNA would help. In 2 years I’ve managed to identify several surnames that are extremely common in our matches but still not even close to finding this elusive family branch. Maybe this will offer some hope.

    1. That’s what I did as well: counted all trees, whether tiny and/or private. I didn’t count unattached trees. I just wanted a quick metric that we could use again in a few months to see if there’s any improvement.

    2. I’m in the same boat, looking for my paternal grandmother’s family.
      I do have her surname which I hope is correct. Most potential matches don’t respond to my inquiries or if they do respond, don’t know anything. Good luck in your search, maybe someday…….

  8. Fascinating! I haven’t tested with Ancestry (unless I win your free kit) so I can’t play with this. But can you imagine what a huge ball of interwoven strands there will be for someone with an Ashkenazi Jewish background like mine?

    1. There are a few different Jewish groups based on geographic region of origin. It would be interesting to see where you’re placed.

  9. Genetic communities sound wonderful-perhaps a great way to circle my 208 4th cousins with some control. Having 4 distinct family groupings from Norway and 2 from Sweden is chaotic. The misc. mixture left over is also interesting to sort out. Yes for this brilliant idea and thanks.

  10. Before: 32/100 do NOT have linked trees
    (I guess I’m lucky, because I have seen numbers much higher than this.)
    But, to be fair, I have a handful of others attached to MY own tree as family testers.

  11. Before: 39 that say “no family trees,” though 24 of those do actually have trees not attached to their DNA, so I’m not sure whether to say 39 or 15 🙂

  12. I manage 14 trees. I counted up the No Trees on the first 2 pages of each test and they average overall 51%. Lowest on one test was 32% and highest test was 81%. Most were in the range of 45% – 55%

  13. I have 42 with no trees and see that it could have been worse. I expect to be more in an English in Colonial North Carolina GC.

  14. 43% do not have trees*
    Of the 57% that have trees, the following people statistics obtain:
    Mean = 763
    Median = 171
    Least = 2
    Most = 10,621
    *My father, siblings, children, & grandchildren were excluded.

  15. This is exciting, for sure! I have 42 without trees in my first 100 matches. Have 862 4th cousins or closer and 25 DNA Circles.

  16. When will this be rolled out for everyone?

    I have 4 ancestry kits:
    Kit 1: 52% without tree
    Kit 2: 39%
    Kit 3: 48%
    Kit 4: 42%

  17. Out of over 4300 matches on Ancestry.com my NFT/No Family Tree ratio is 55/100 (…not including a newly found/unbeknownst to me 1/2 sister who magically appeared approx. 2 years ago. I am 82% African(American) per __A.com__:

  18. I administer several, so I picked the one that I hadn’t already culled out the “no tree” matches – my tree R.M. had 40 no tree matches out of top 100 for 40%

  19. Before – 50% …. and I only have 82 4th cousins or closer and no circles, I am hoping I get a lot more of both within a few months when more aussies get theirs done

  20. Wow! This is fantastic! Maybe this will finally narrow down the possibilities of my grandmother’s biological family! Any news on if this is only for new Ancestry DNA kits, or will it be applied to those who have previously tested and already have kits on their account?

    Oh, yes of course. Out of my top 100, 41 do NOT have trees.

  21. Of my first 100 matches, 52 do not have a family tree and 48 do. But that doesn’t tell the whole story. Of the 48 that have family trees, 13 have less than 10 people on their tree! This has been very frustrating to me since I was adopted and the reason I tested was to find my birth family. I found out through Ancestry that I am 47% European Jewish and now know that it was my father who was Jewish. It is the Jewish segment that tends to have either no tree or very small trees.

  22. 55 of my top 100 DNA matches have trees, but I know there are more that just don’t have their DNA attached to their tree.

  23. Thanks for the great article. I am in Early Settlers of PA, OH, and IN and also in Settlers of Chesapeake Bay. Makes perfect sense from what I know of my family.

  24. For my father’s test:
    70/100 have no tree/linked tree. 🙁
    Mother’s test:
    51/100

    I just checked the genetic communities, and everyone was assigned to a community. Results are as expected, but I did think that I would see more.

  25. Me 43 with no trees in first 100
    Mom 44 with no trees in first 100
    Uncle 44 with no trees in first 100

    Surprising how close all mine are even though they aren’t all the same matches.

  26. Of my top 100 DNA matches, I have 39 without trees. So that is 39%. Wow I had not looked at it that way before. Interesting.

  27. As of 3/28/17 top 13 kits are mine and attached to public tree, six siblings

    IMTB 38 no tree, 6 private
    LLT 43 no tree, 8 private
    PET 31 no tree, 6 private
    HAKT 45 no tree, 6 private
    CLT 39 no tree, 5 private
    RAT 34 no tree, 6 private

  28. Of top 100 matches for me:

    58 No Trees
    1 Tree not available (not private, says not available)
    41 Trees (7 of these have 10 or less people in them)

  29. I stumbled across the GC this morning, but so far I have zero, and I have Acadian lineage. My family was one of the first settlers there, Trahan, and a large portion ended up in Louisiana. My branch (2G grandma) instead went west. So I don’t understand why I’m not at least on an Acadian community.

    1. It may be that your Acadian is too dilute to connect you to the GC. One 2-great grandmother would make you 1/16th Acadian (assuming she herself was “full-blooded”). That might not be enough to tie you into the community.

      By the way, Jeanne Trahan was my 10th great grandmother ten times over. Also my 11th great grandmother.

      1. We are related! All Trahans that came from Acadia are the same family.

        I checked out the GC FAQ, and came to the same conclusion. While most Trahans went south, my branch went west…and diluted. 🙂

  30. I’m excited, this will allow me to narrow down some possible genetic relatives, rather than looking at thousands of possibilities.

  31. 44% of my top 100 DNA matches are listed on Ancestry as having no tree. (At the end of March 2017)

  32. 49% of my 100 top DNA matches have no tree, as of 3/29/2017. Thanks for your great overview of the use of this new tool!

  33. I have done mtDNA, autosomal, and Y-chromosome (my brother’s sample), but all with FTDNA. I have an extensive family tree but none posted anywhere on the Internet. Would I be able to use my FTDNA tests (converted in some way to Ancestry ?) to participate? Or do I have to re-take with Ancestry? I, too, am Acadian–90-95% by my estimation. So the results would be pretty predictable. But I’d be curious. Thanks for this blog and all the information, made very understandable for laymen! And thanks for any suggestions.
    Claire Bettag

  34. 45 have no family tree.

    An easy way to do this is do a search for “view match” to see how many connections are on each page. I had 50 on each page. Then on each of those two pages I searched for “no family tree”. It took less than a minute.

  35. It has me in the Ulster Irish genetic community but that then begs the question what the difference is between the Ulster Irish and Scotch-Irish as everyone seems lumped together. The people of Ulster are definitely a different breed from the rest of Ireland and have a lot in common with lowland Scotland.
