How Are Our Databases Doing?

 

Periodically, I plot the autosomal database sizes to track trends in DNA testing over time. These plots are available for anyone to use in their genealogy presentations, with attribution.

New information was released at the recent RootsTech 2025 conference, so it’s time for an update!

Autosomal database sizes through March 2025

There are now more than 53 million tested DNA kits across the four main DNA testing companies: AncestryDNA, 23andMe, MyHeritage, and FamilyTreeDNA. Of course, not all of those represent unique individuals, because many of us have tested in multiple databases.

Long gone are the heady days of early 2018, when DNA kits were flying off the shelves. By my estimates, from February to March 2018, AncestryDNA alone was adding nearly 28,000 DNA kits per day. Had they continued to grow at that rate, their database would have more than 80 million people in it!
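As a rough check on that arithmetic, here is a minimal sketch of the straight-line extrapolation. The start and end dates are assumptions chosen for illustration, and the post does not state the starting database size, so it is left as a parameter. Only the rate of 28,000 kits per day comes from the estimate above.

```python
from datetime import date

# Estimated AncestryDNA growth rate for Feb-Mar 2018 (figure from the post).
KITS_PER_DAY = 28_000


def projected_size(start_size, start, end, kits_per_day=KITS_PER_DAY):
    """Linear extrapolation: starting size plus a constant daily rate."""
    days = (end - start).days
    return start_size + kits_per_day * days


# Assumed window for illustration: 1 March 2018 to 1 March 2025.
start = date(2018, 3, 1)
end = date(2025, 3, 1)

# Kits added over the window, ignoring whatever was already in the database.
added = projected_size(0, start, end)
print(f"{added:,}")  # 71,596,000
```

Under these assumptions, constant growth alone would have added roughly 72 million kits, so a starting database of several million would push the total past 80 million.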

Of course, growth like that was not sustainable; sales would have dropped off no matter what. After all, there are only so many people willing to test.

A graph of time and the curve in it.
Modified from CNX OpenStax, CC BY 4.0

However, the pattern we see does not reflect market saturation. Market saturation would look more like this, with a gradual tapering of growth until everyone who wanted a DNA test already had one.

 

Instead, we’re seeing a new normal. Since the inflection point in mid-2018, AncestryDNA’s growth has been slower than before but steady. 23andMe and MyHeritage both slowed around the same time but continued at a steady pace until roughly January 2023, when MyHeritage’s growth ticked up a little and 23andMe’s ticked down. FamilyTreeDNA and GEDmatch have both seen declines, although it’s hard to see on the graph because of the scale.

(Note that we don’t have enough recent data to gauge how 23andMe’s recent troubles have affected their sales.)
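To make the distinction between saturation and steady growth concrete, here is a small sketch comparing the two shapes. A logistic curve stands in for saturation and a straight line for the "new normal". Every number in it (capacity, rate, midpoint, slope) is made up for illustration and is not fitted to any company's data.

```python
import math


def logistic(t, capacity, rate, midpoint):
    """Classic S-shaped saturation curve that levels off at `capacity`."""
    return capacity / (1 + math.exp(-rate * (t - midpoint)))


def yearly_growth(sizes):
    """Difference between consecutive yearly sizes."""
    return [b - a for a, b in zip(sizes, sizes[1:])]


years = range(0, 11)

# Hypothetical saturating market: database size in millions of kits.
saturating = [logistic(t, capacity=30.0, rate=1.0, midpoint=3.0) for t in years]

# Hypothetical steady market: a constant 2 million kits added per year.
steady = [5.0 + 2.0 * t for t in years]

sat_growth = yearly_growth(saturating)
steady_growth = yearly_growth(steady)

for year, (s, l) in enumerate(zip(sat_growth, steady_growth), start=1):
    print(f"year {year:2d}: saturating +{s:5.2f}M, steady +{l:4.2f}M")
```

In the saturating case, the yearly additions shrink toward zero after the midpoint. In the steady case, they stay flat, which is closer to what the database plot shows after mid-2018.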

I suspect that the advent of forensic genetic genealogy (FGG) in 2018 and the repeated subsequent scandals caused many—but not all—consumers to rethink DNA testing.

What do you think?

 

31 thoughts on “How Are Our Databases Doing?”

  1. I actually waited until I retired before I took a DNA test. My daughter did 23andMe when it was first offered to the public. She shared her results with us, and given our medical backgrounds, we decided we did not want to know about things we cannot change. Everything changed when the same daughter put her DNA into GEDmatch and found a paternal half sister to me. She spent four days analyzing all the possibilities before she sat me down and gave me the results. No doubt, I have a half sister who had no idea that her mom’s ex-husband was not her biological father. I sent my new sister an email and phoned her when we got back to our home. She answered and told me she always wanted a sister. Then, she asked if I was a dog or a flower. That occurred in the summer of 2022.

  2. I think some people are testing to identify what part of the world they are from. So very many of my “matches” have no trees!

  3. One way to consider ‘saturation’ is to investigate the first derivative. When the rate of change of the data (Y axis – number of sales) changes from a positive number to negative, then ‘saturation’ is seen to have begun.

    In your theoretical graph, this point is close to directly above the letter “e” in the word time.

    With the Ancestry data, it seems to be near Jan. 2019.

    1. There was certainly an inflection in 2018 for most of the companies (MyHeritage is the exception), but the rate of change hasn’t continued to decline. It’s steady now rather than exponential.

  4. I had paid for eight members of my family to take DNA tests. Since then
    Six have asked me to remove

  5. I think there are two additional factors which will affect database size. Firstly, popularity which ebbs and flows and can cause deviations from your proposed model. Secondly, time (population change) in which people will be born and die and may result in newly interested people adding to databases. I guess the challenges may come from companies removing deceased individuals from their databases.

  6. Does the GEDmatch line end at Jul/2024? I would have expected more GEDmatch based on new daily matches.

  7. I definitely support genetic genealogy for serious criminal offences. Although unfamiliar with the related issues, I do recognize that forensic genealogy is extremely important for families and indigenous groups seeking to identify human remains.

  8. I did my DNA with Ancestry and have continued my subscription with them; however, they have so many mistakes in their data. For instance, they don’t seem to get surnames on census records typed correctly. If I can read them, why can’t they? I spend time sending them corrections, but I don’t think they make changes. It is frustrating. I also think some companies charge too much for their tests.

    1. I don’t believe they actually can “make changes” BUT your correction/addition shows up to subsequent viewers.

  9. In addition to individuals testing in multiple databases, there are people who test more than once in the same database. That’s primarily with 23andMe (because they kept introducing “new and improved” chips for people to test on). But that’s even the case with Ancestry. I have several matches on Ancestry who tested twice. You have to wonder why anyone would do that (spend money to get the exact same results they already have).

  10. I was on all testing sites, but deleted all my kits on GEDmatch and 23andMe a while ago. I have also begun to become more privacy focused lately and plan on deleting my kits and trees on the other sites. They’re all just terrible in my opinion and I will not be using them any longer.

  11. Interesting for sure. What I do know is that I created a New Genetic Community, specifically Albacete, Castilla La Mancha, Spain. I think that Ancestry is using that to replace its Sardinia, Italy Genetic Community that it had a couple of years ago? I’m sure that they will get more DNA Testing Kits Results on Ancestry now to reinforce as well as confirm the results from the DNA Testing Kits Results already available for that region? That’s the direction that I see Forensic Genealogy across the different companies going 🤔

    1. I’d be surprised if Albacete replaced Sardinia, as the two should be genetically distinct. But you’re correct that as they receive more DNA tests representing any given region, they’ll refine their genetic communities. To be clear, this isn’t forensic in nature; Ancestry prohibits forensic uses of its platform.

  12. With the way Ancestry is putting more and more of the DNA options behind paywalls, I bet we see fewer people willing to test with them. And with MH getting rid of uploads from other sites, there’s no reason to buy additional kits from Ancestry.

    1. Has Ancestry put any existing features behind a paywall? As I recall, the Pro Tools are all new features.

      1. LOTS of stuff that used to be free is now behind a paywall, and requires a paid membership to access. Examples include ThruLines, the SideView parental side info, and even shared matches (just three are shown).

  13. Sorry to be late to this discussion, but have you considered that the number of people in the databases is actually made up of two distinct groups?
    1. The Family Historians. Surely, after all these years and with the cost coming down so much, anybody in this group who wanted to test will have tested by now. The growth in this group has indeed saturated, and the only growth is made up of people new to the hobby (obsession?) and those who change their minds about testing. So basically a very small growth rate.
    2. The Ethnicity group. This group has a steady demand, probably correlating to advertising campaigns. This is currently the much larger group and is masking the saturation in the first group. By my admittedly very unscientific calculations, when I did my Ancestry test (2019), about one-third of my matches had substantial trees (> 100 people). Currently that’s running about one in ten.

    One can understand the testing companies’ pivot to Ethnicity since the potential market is basically most of the population of the USA, Canada, Australia, and New Zealand: all those whose ancestors came from somewhere else. Mine, like those of many Europeans, basically come from within a hundred miles or so of where I am now.

    Unfortunately for us, that means that the database growth is not as useful as it would appear, as about 90% of it is made up of people who have no useful trees.

    1. I think the market has been driven by ethnicity testers for the past decade. As you note, the family historians have already tested. The ethnicity people, though, can become family historians. And even if they never put up a tree, we can often figure out who they are and use them to advance our research anyway. The more, the merrier!
