On March 28, 2017, AncestryDNA rolled out a novel and exciting feature called Genetic Communities — GCs for short — as part of their standard DNA test (no additional fee). They are like nothing we’ve seen before in genetic genealogy! As such, they’re a bit hard to describe at first: much more focused than an ethnicity group, which may date to hundreds or thousands of years in the past, and broader than just the DNA relatives on your match list, which typically trace back only a handful of generations.
What is a Genetic Community?
Genetic Communities are defined by clusters of people who share long IBD (identical-by-descent) segments of DNA. AncestryDNA’s White Paper on GCs doesn’t define “long”, but the scientific paper on which the method is based specifies that the total shared amount of DNA is 12 cM or more. (Because scientific papers are rather dense, I wrote a layperson’s summary that you can read here.) Members of a community may also match people outside the cluster, but their connections to people within the cluster are more numerous and stronger. You won’t match everyone in a GC to which you belong, but everyone in the group matches other people in it.
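To make “more numerous and stronger” concrete, here is a minimal Python sketch. It is not AncestryDNA’s algorithm, which I haven’t seen; the names, the toy match data, and the cluster assignment are all made up for illustration. The only number taken from the paper is the 12 cM threshold. The sketch simply tallies, for each person in a given cluster, how many matches and how much shared DNA fall inside versus outside it.

```python
from collections import defaultdict

MIN_SHARED_CM = 12.0  # threshold reported in the scientific paper

# Hypothetical pairwise matches: (person_a, person_b, total shared cM)
matches = [
    ("ann", "bob", 45.0),
    ("ann", "cat", 30.0),
    ("bob", "cat", 22.0),
    ("cat", "dan", 14.0),
    ("dan", "eve", 60.0),
    ("ann", "eve", 8.0),  # below the threshold, so it is ignored
]


def build_network(pairs, min_cm=MIN_SHARED_CM):
    """Keep only matches at or above the threshold, stored in both directions."""
    network = defaultdict(dict)
    for a, b, cm in pairs:
        if cm >= min_cm:
            network[a][b] = cm
            network[b][a] = cm
    return network


def inside_vs_outside(network, person, cluster):
    """Return ((count, total cM) inside the cluster, (count, total cM) outside)."""
    links = network[person]
    inside = [cm for other, cm in links.items() if other in cluster]
    outside = [cm for other, cm in links.items() if other not in cluster]
    return (len(inside), sum(inside)), (len(outside), sum(outside))


network = build_network(matches)
cluster = {"ann", "bob", "cat"}  # an assumed cluster, not a computed one

for person in sorted(cluster):
    (n_in, cm_in), (n_out, cm_out) = inside_vs_outside(network, person, cluster)
    print(f"{person}: {n_in} matches ({cm_in} cM) inside, "
          f"{n_out} matches ({cm_out} cM) outside")
```

In this toy data, “cat” matches someone outside the cluster, but her ties inside it are both more numerous and larger, which is the pattern described above.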
The communities aren’t defined by ethnicity, but rather by the relationship patterns of the people within them. Multiple GCs can be found within a given ethnicity. For example, there are 16 Irish GCs grouped into three broader clusters (Ulster, Connacht, and Munster).
Conversely, the communities can cross ethnicities, such as an African American community with members who derive from African, European, and Native American groups.
Communities also aren’t defined strictly by geography. The island of Sicily has five GCs, three of which have almost identical boundaries in western Sicily. That is, groups of people who live side-by-side but are not intermarrying and forging genetic relationships with one another will form distinct communities.
But wait! If the communities are defined by genetic connections, where do the maps come from? This is where the communities get really interesting. Because IBD segments necessarily mean shared population history, the migration pattern of a community is reflected in the family trees of its members. Once AncestryDNA’s algorithm has defined a genetic cluster, their computers scan the associated pedigrees of the community members for common birth locations and dates. The Ancestry member database is so large that even if some individuals in a cluster don’t have a tree, or have an incorrect tree, the patterns are still obvious. Some GCs can be quite specific (“Germans from Baden-Württemberg in the Dakotas”), and so far, all of the ones I’ve examined are remarkably accurate.
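Here is a rough Python sketch of what “scanning pedigrees for common birth locations and dates” could look like. I’m guessing at the mechanics; the tree entries, the 25-year block width, and the minimum count are my own assumptions, not Ancestry’s. The point is only that a place shared by many trees stands out, while a stray or mistaken entry gets drowned out.

```python
from collections import Counter, defaultdict

# Hypothetical tree entries for the members of one cluster:
# (ancestor birth year, ancestor birth place). Made-up data.
ancestors = [
    (1705, "France"),
    (1710, "France"),
    (1722, "Acadia"),
    (1745, "Acadia"),
    (1748, "Acadia"),
    (1768, "Louisiana"),
    (1771, "Louisiana"),
    (1773, "Acadia"),
    (1774, "Ohio"),  # a stray (or simply wrong) entry
]


def block_start(year, width=25):
    """First year of the time block that contains `year`."""
    return year - year % width


def common_birthplaces(entries, width=25, min_count=2):
    """Per time block, list the birth places seen at least `min_count` times."""
    by_block = defaultdict(Counter)
    for year, place in entries:
        by_block[block_start(year, width)][place] += 1
    return {
        block: [(place, n) for place, n in counts.most_common() if n >= min_count]
        for block, counts in sorted(by_block.items())
    }


for block, places in common_birthplaces(ancestors).items():
    print(block, places)
```

With a real database, the minimum count would be far higher than 2, which is why a few missing or incorrect trees don’t hide the pattern.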
Each community is accompanied by a “Story” that traces the community through time. They tend to start in the 1700–1800 range and track historical developments in that community in 25- or 50-year blocks of time. To illustrate, I will use my own community, “Acadians in Louisiana Cajun Country”.
Louisiana Cajuns are descended from the French who first settled Acadia (now Nova Scotia, Canada) in the early 1600s. You can see from the upper panel in the set of migration maps that between 1700 and 1750, France-born people were still immigrating to the New World. The period of 1750–1775 marks a great tragedy in Acadian life. The Acadians were forcibly evicted by the British from their homeland in 1755, an event called Le Grand Dérangement (The Great Upheaval). Those who survived were shipped to locations around the Atlantic. Not until 1765 did the first arrivals settle Louisiana Cajun country. Over the next 20 years or so, word of the new paradise spread, and Acadians who had been separated from their families were able to reunite in what is now called Acadiana.
I know my own culture’s history, but if you are placed in a community whose history you don’t know, Ancestry’s historians have done a lot of legwork for you. The “Story” that I mentioned earlier summarizes the important events in a community’s history.
This. Right. Here.
I will describe some of the practical uses of GCs in a bit, but this right here is what I consider the most important long-term contribution of this feature. Imagine that you’re not particularly interested in genealogy, that you only tested to find out your “ethnicity” or because someone gave you the test as a gift. And when you get your results, directly below your ethnicity estimate, the broad strokes of your family’s history are already painted for you.
You follow the links and see a timeline, an historical summary, and interactive maps. If this won’t get you interested in building your own tree, nothing will.
A Community Challenge (Pun Intended)
In fact, I’m so confident that GCs will attract DNA testers to genealogy that I propose a challenge: count how many of your top 100 DNA matches do not have trees and post the percentage in the comments. In 6 months, we can do the same thing to see whether the numbers have improved.
To get a quick count, I copied all of the text of my first two pages of matches (100 total) into a word processor, then sorted by line. That grouped all of the lines that read “No family tree VIEW MATCH” together, where they’re easy to count. Of my top 100 matches, 43 do not have trees attached to their DNA results, or 43%. Let’s see how that changes over time.
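If you’d rather not sort in a word processor, a few lines of Python can do the same count. This is just a sketch: it assumes you’ve pasted the copied text into a string (or read it from a file) and that each match without a tree produces exactly one line containing “No family tree”. The sample text below is a made-up placeholder, not the real page layout.

```python
def percent_without_trees(pasted_text, total_matches, marker="No family tree"):
    """Count the lines containing `marker` and express them as a percentage."""
    treeless = sum(1 for line in pasted_text.splitlines() if marker in line)
    return treeless, 100.0 * treeless / total_matches


# Placeholder text standing in for four copied matches.
sample = """\
first match
No family tree VIEW MATCH
second match
(some tree details) VIEW MATCH
third match
No family tree VIEW MATCH
fourth match
(some tree details) VIEW MATCH
"""

count, percent = percent_without_trees(sample, total_matches=4)
print(f"{count} of 4 matches have no tree ({percent:.0f}%)")
```

For the challenge, paste your first two pages of matches and pass total_matches=100.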
Right or wrong, I will give away an AncestryDNA test to one lucky person who contributes both “before” (by 15 April) and “after” (in October) data. Follow this blog to get a reminder in 6 months.
A link to “Connections” near the top of your community page takes you to an estimate of how strongly connected you are to that community, a link to view your matches who are also in that community, and a list of surnames associated with that particular GC.
Thus far, I’ve described the science behind GCs and the gee-whiz mapping and storytelling components, but what can we really do with them? My favorite aspect of communities is the ability to filter based on their memberships. At the top of the DNA match list, the existing filters have a new button called Communities. Click it to get a pull-down menu of the communities you are in. In my case, I have Acadians in Louisiana Cajun Country and Acadians, which is a larger cluster that includes other Acadian communities, like Acadians in Central Louisiana, Acadians in the Greater New Orleans Area, and Acadians in the Canadian Maritimes.
The Communities filter is a great complement to the Shared Matches tool. Remember, not everyone in a community will match everyone else, so Shared Matches can be used to narrowly filter your matches to what we hope is one specific family lineage, while the GC filter will yield a larger group of people who share a population history.
For unknown parentage searches where the parents come from different GCs, the filtering will be invaluable. Here’s an example of an adoptee who is in three communities.
We already knew that she had ancestry from all three populations, and we’ve been manually “starring” matches to filter them. Unfortunately, there’s only one star, so we’ve had to rejigger everything whenever we’ve switched from working on her Sicilian matches to her southern matches to her northeastern ones. Now we can switch among these groups with a click, and we don’t even have to do the legwork to assign them to groups in the first place.
Communities may also help a great deal by subdividing larger populations. For example, the searcher in an unknown parentage case I’m advising knows that her birth father was Cajun. Her GC tells us that not only was he Cajun, but his family settled in the New Orleans area. Recall that Cajuns form three distinct GCs, although ultimately, we are all related multiple times over. Knowing that this searcher belongs to the New Orleans group will help to narrow her search within the large endogamous Cajun network.
Of course, no tool is perfect. Not everyone will be assigned to a GC, although most will be. Of 82 kits that I examined, only 6.1% were not in at least one community. Counting all 82 kits, 42.7% were in one community, 31.7% were in two, 15.9% were in three, and 3.7% were in four. (The counts here include only the narrowest assignments. I am in Acadians in Louisiana Cajun Country, which is nested within Acadians. I’m only counting that as one GC.)
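For anyone who wants to check the arithmetic, the percentages above work out cleanly when each one is taken against all 82 kits. The kit counts in this Python sketch are back-calculated from the percentages, so treat them as my reconstruction rather than raw data.

```python
TOTAL_KITS = 82

# Number of kits by how many (narrowest) communities they are in.
# Back-calculated from the percentages; an assumption, not raw data.
kits_by_gc_count = {0: 5, 1: 35, 2: 26, 3: 13, 4: 3}


def percentages(counts, total):
    """Share of `total` for each entry, rounded to one decimal place."""
    return {key: round(100 * n / total, 1) for key, n in counts.items()}


shares = percentages(kits_by_gc_count, TOTAL_KITS)
in_any = TOTAL_KITS - kits_by_gc_count[0]

print(shares)
print(f"{in_any} of {TOTAL_KITS} kits "
      f"({round(100 * in_any / TOTAL_KITS, 1)}%) are in at least one community")
```

The rounded shares add up to 100.1% rather than 100%, which is ordinary rounding error.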
Similarly, some of us will not be in communities that we rightfully expected. For example, my father is German and Irish, but the only community I have is for Cajuns. His ancestors immigrated to New Orleans during a massive wave of resettlement in the mid- to late 1800s. I assume that the German–Irish gene pool in New Orleans is not sufficiently interwoven to create a network, and our connections to Ireland and Germany are too weak to place us in communities there. Hopefully, as more people test and more IBD connections are added to the global network, those of us who do not have a GC yet (or are missing GCs that we expect) will gain them.
Another oddity is that my son is in the Acadians in Louisiana Cajun Country GC, but my daughter is only in Acadians. They share similar amounts of DNA with my mother (their Cajun grandmother): 1550 cM and 1485 cM, respectively. I can’t quite wrap my head around why, because all of my daughter’s closest Cajun matches are in the more specific GC. Along the same lines, my son is in one of his paternal grandmother’s two GCs, but my daughter is in neither, despite sharing slightly more DNA with her Grammy than her brother does (1799 cM vs 1824 cM). Neither of the kids shares a GC with their paternal grandfather, who’s in two.
What Do You Think?
I am really excited by GCs and can’t wait to see how other people use them. How many do you have? Do they make sense given what you know about your family’s history? How do you plan to use them? Share your thoughts in the comments.