Genealogical Database Sizes—August 2017 Update

It’s time once again for one of my favorite posts: an update of how many people are in the autosomal DNA databases!  AncestryDNA just announced that they have tested 5 million people!  I’ll refer you to Debbie Cruwys Kennett’s excellent blog post about AncestryDNA’s recent sales growth and their expansion into countries outside of the US.

I took the opportunity to check in with GEDmatch, and updated my graph of testing growth for them, as well.  They’re at 650,000, an increase of 22% since April!  The line in the graph for 23andMe is flat because they have not announced recent figures. FTDNA has never publicly revealed how many are in their autosomal DNA database; the estimate for them comes from the ISOGG wiki.

Also note that both AncestryDNA and FTDNA tests are on sale right now (AncestryDNA through the 15th, FTDNA through the end of August), so their databases should show the effects of those sales in the coming months.

Feel free to use this graph, with proper credit, in educational presentations to the genealogical community.

15 thoughts on “Genealogical Database Sizes—August 2017 Update”

  1. The statistics you claim here have never been released by FTDNA are on this webpage of theirs:
    https://www.familytreedna.com/why-ftdna.aspx
    It adds up to almost 2.7 million as of January 30, 2018. This is database size, not number of tests done as you are graphing. It doesn’t matter how many each company/organization has tested. Your graph is GIGO.

    Being one who is primarily interested in tracing my paternal line, I find it odd you ignore stats for Y-DNA, or mtDNA for those interested in their maternal line. Where are those graphs?

    1. There are two separate issues here:

      1) I emphatically disagree that “it doesn’t matter how many each company/organization has tested”. The larger the database, the more likely we are to find previously unknown DNA matches that will help with our genealogical questions. I graph the sizes of the autosomal databases to help people decide where to do that type of testing. Because there are several companies offering atDNA testing, and because database size is a primary factor in deciding where to test, an objective comparison of how many people are in each database is useful information.

      I don’t graph the yDNA and mtDNA database sizes because FTDNA is the only game in town for matching for those types of DNA. If you want yDNA and mtDNA matches, it’s FTDNA or nothing; database size is irrelevant.

      2) Let’s review the official FTDNA database numbers as of 30 Jan 2018, per the link you gave. They are:

      9,894 SURNAME PROJECTS
      549,773 unique surnames
      649,146 Y-DNA records in the database
      330,218 25-marker records in the database
      308,794 37-marker records in the database
      163,137 67-marker records in the database
      286,011 mtDNA records in the database
      127,072 FGS records in the database

      If I add up every single number listed, it sums to a little over 2.4 million (not 2.7 million as you said). BUT, that’s not an accurate representation of the sizes of their yDNA and mtDNA databases. First, some of those numbers (surname projects, unique surnames) aren’t DNA tests at all. Second, FTDNA is double-counting some of their tests. For example, everyone who has taken the Y-67 test is included in both the Y-37 and Y-25 totals. (That is, they’re counted three times.) Their total for yDNA records is 649,146 and their total for mtDNA is 286,011. Those are the numbers that matter for anyone “fishing” for DNA matches.
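
      The arithmetic above can be checked with a short Python sketch. The figures are copied from the list above; the variable names and the grouping into "not DNA tests" and "subsets of a larger total" are labels made up for illustration, and treating the FGS records as a subset of the mtDNA total is an assumption.

```python
# Figures copied from the FTDNA list above (as of 30 Jan 2018).
ftdna_counts = {
    "surname projects": 9_894,
    "unique surnames": 549_773,
    "Y-DNA records": 649_146,
    "25-marker records": 330_218,
    "37-marker records": 308_794,
    "67-marker records": 163_137,
    "mtDNA records": 286_011,
    "FGS records": 127_072,
}

# Adding up every number on the page, as the comment did.
naive_total = sum(ftdna_counts.values())

# Hypothetical grouping for illustration:
# entries that are not DNA tests at all ...
not_dna_tests = {"surname projects", "unique surnames"}
# ... and entries assumed to be already counted inside a larger total
# (the marker levels inside Y-DNA; FGS inside mtDNA is an assumption).
subsets = {
    "25-marker records",
    "37-marker records",
    "67-marker records",
    "FGS records",
}

# What remains are the totals that matter for matching.
matching_totals = {
    name: count
    for name, count in ftdna_counts.items()
    if name not in not_dna_tests and name not in subsets
}

print(f"Naive total: {naive_total:,}")
for name, count in matching_totals.items():
    print(f"{name}: {count:,}")
```

      The naive total comes to 2,424,045, while the two totals that matter for matching stay at 649,146 (yDNA) and 286,011 (mtDNA).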

  2. Hi, I was extremely interested and thankful to see this post (I have just subscribed). Thanks for it.

    Database size isn’t the only factor of interest of course. I live in Australia, and I tested with FTDNA several years ago, when, as far as I know, Ancestry wasn’t sending test kits to Australia. I later (last year) also tested with Ancestry. The interesting thing is that if I count 4th cousin or better matches (for which FTDNA has a slightly tighter criterion), I have twice as many matches with FTDNA as with Ancestry. This suggests to me that while Ancestry’s database is way bigger (about 10 times on your figures here), its database includes far fewer Australians, who are my closest matches.

    Therefore, if ever you were able to find it out, it would be extremely interesting to see the breakdown by continent for the different databases. My impression is that FTDNA has a greater percentage of non-Americans than Ancestry, and My Heritage has a greater percentage of Europeans.

    Do you have any thoughts?

    1. I agree that FTDNA and MyHeritage probably have better representation from non-English-speaking countries. I’ve heard (but can’t document) that Ancestry has tested more people in the UK than any other company. I don’t have even rumors about Australia. Unfortunately, none of the companies release country-by-country figures. I’ve actually been planning a survey to gauge how many DNA matches we get based on country of origin (or, those of our grandparents). The issue that’s held me back so far is that the matching criteria at the various companies are different, and I’m not sure how that will affect the numbers of matches. Thanks for the reminder, though, to look at that again.

      1. My experience is that it doesn’t matter how big the database is or which country tests more. I took the test to break through a brick wall and prove my research, but most of the people who test at Ancestry don’t even have a tree, which helps nothing.
        The best thing about Ancestry was the price, and the ability to download raw data and upload it to GEDmatch and MyHeritage. I also uploaded to Family Tree DNA, but I am so far undecided, so I will say the jury is out on that site. I expected more: surname managers don’t answer your emails, and one site that was a blank canvas happened to be for the surname I needed.
        I think it’s important to educate test takers as much as possible, as this blog is doing!
        I also give a big KISS to the Chromosome Painter.com; it made everything click for me. I am not affiliated with anyone.
        Warmest regards to all you genealogy junkies out there.

  3. Hey, I’ve seen your survey and completed it. But I want to share some interesting statistics. I live in Australia and my ancestors come from UK. I have tested at FTDNA, Ancestry and uploaded to My Heritage. When I analyse my match lists I get these results (I have used 20 cM as my criterion because that seems to be what Ancestry uses):

    No of matches:

    Ancestry 17,900 (approx)
    FTDNA 2214
    My Heritage 2836

    Number of matches above 20 cM total:

    Ancestry 116 (i.e. almost all matches were below 20 cM)
    FTDNA 2211 (i.e. only 3 matches were below 20 cM)
    My Heritage 88 (i.e. 97% were below 20 cM)

    Lowest total cM included in match list:

    Ancestry 6 cM
    FTDNA 19 cM
    My Heritage 8 cM

    That is a pretty amazing result, don’t you think? FTDNA, which ostensibly has the smallest database, has way more useful matches. Have you ever done the same for someone living elsewhere? It would be very revealing.
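
    For anyone who wants to repeat this with their own match lists, here is a rough Python sketch of the percentages. The counts are the ones listed above (the Ancestry total is approximate); the variable names are made up for the example, and it only restates the counts as shares without adjusting for how each company measures total cM.

```python
# (total matches, matches above 20 cM total) per company, from the lists above.
# The Ancestry total is approximate.
match_lists = {
    "Ancestry": (17_900, 116),
    "FTDNA": (2_214, 2_211),
    "My Heritage": (2_836, 88),
}

# Percentage of each match list that is above 20 cM, rounded to one decimal.
share_above_20cm = {
    company: round(100 * above / total, 1)
    for company, (total, above) in match_lists.items()
}

for company, share in share_above_20cm.items():
    total, above = match_lists[company]
    print(f"{company}: {above:,} of {total:,} matches above 20 cM ({share}%)")
```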

    Thanks.

  4. Sorry to keep bombarding your comments, but I can see where I got that wrong. Ancestry & FTDNA count different sized segments (Ancestry > 3cM, FTDNA > 1 cM according to ISOGG). So I wasn’t comparing like with like.

    But using FTDNA’s chromosome browser, I am able to calculate what any of FTDNA’s matches would be counted on Ancestry.

    I tried three random matches at the bottom of my FTDNA list (i.e. 20 cM, about 2200th on my match list). These worked out as follows:

    A: 18.99 cM on Ancestry, about 150th on their list.
    B: 12.63 cM on Ancestry, about 1000th on their list.
    C: 11.26 cM on Ancestry, about 1700th on their list.

    This seems to show that FTDNA still gives me more and better matches, though not nearly as much as I first said. I still think it shows that database size isn’t as important as we might think, in some locations at least.
