The Autosomal Database Growth Graph is a popular feature of the DNA Geek site. But it’s wrong. And it’s been wrong for some time.
Let me explain. AncestryDNA and MyHeritage both report on their websites the number of people in their DNA databases. 23andMe reports the number of test kits sold. I’ve dutifully plotted those numbers (the only ones available) on the graph all these years, assuming that “kits sold” and “kits in database” would track each other fairly closely.
But I was wrong. To the tune of about 25%. Mea culpa.
Let’s review. As noted above, 23andMe has historically reported “kits sold” rather than “kits genotyped”. That little sleight of hand, while honest, was misleading about what we genealogists care about: the size of their database. However, in their Spring 2021 Investor Presentation, they came clean with the actual genotyped numbers, dating back to fiscal year 2017, which ended 31 March 2017.
At the time, I was stumped as to what to do. I didn’t want to redo the database growth graph until I knew whether they would continue to report “kits genotyped” going forward or whether they would stick to “kits sold”. Now that they’ve reported “kits genotyped” for three fiscal quarters in a row, it’s time for a revamp.
We can now compare how “off” we were in our understanding of the database. As you can see in the graph, the two lines are quite different. The number of kits sold (pale line) is roughly 25% higher than the number genotyped (dark line) in 2019 and 2020. The difference isn’t as stark in 2018, and the values are the same in 2017.
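For anyone who wants to check the arithmetic behind that “roughly 25%”, here is a minimal Python sketch of how I’d compute the gap. The function name and the sample values are my own illustrations, not 23andMe’s reported figures; plug in the real numbers from their presentation to reproduce the comparison.

```python
def overcount_pct(kits_sold: float, kits_genotyped: float) -> float:
    """Percent by which kits sold exceeds kits genotyped."""
    if kits_genotyped <= 0:
        raise ValueError("kits_genotyped must be positive")
    return (kits_sold - kits_genotyped) / kits_genotyped * 100


# Hypothetical round numbers, in millions, chosen only to illustrate a 25% gap.
# These are NOT the values from 23andMe's investor presentation.
example_sold = 10.0
example_genotyped = 8.0

gap = overcount_pct(example_sold, example_genotyped)
print(f"Kits sold overstates the database by {gap:.0f}%")
```

Note that the percentage is taken relative to kits genotyped, since that’s the number we genealogists actually care about.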
How did this happen?
Late 2016 is when genealogical DNA testing really took off. Around then, 23andMe began selling kits in some drugstores for about $30, not including the lab fee. My guess is two things happened. First, some people bought the $30 kit on impulse but balked at the $169 lab fee. Second, avid genealogists began buying kits in hopes that family and friends would test. (Hands up: Who has a couple of extra kits lying around? 💁‍♀️) Both situations resulted in kits that were sold but never sent back to be genotyped. I would never have guessed there were so many, certainly not 2 million!
I won’t make that mistake again.
Now, without further ado, here’s the updated graph, as of December 2, 2021. Based on recent trends, I estimate that AncestryDNA has 21.2 million kits, 23andMe 12.1 million, MyHeritage 5.4 million, GEDmatch 1.7 million, and FamilyTreeDNA 1.6 million. With only two data points for Living DNA, I’m not comfortable projecting their current size just yet.