Every once in a while, untruths spread through the genetic genealogy world that have the potential to do great damage: damage to adoptees seeking their biological families, to genealogists hoping to tackle brick walls, and to a thriving new tech industry that enhances an established, traditional hobby. So, every once in a while, we need to correct the record.
The issue I’m concerned about today is the compound myth that some of the DNA companies are selling our data without our consent; that we are contractually required to give them free access to our data in order to test; and that our data is being uploaded to sites like GEDmatch, MyHeritage, and Family Tree DNA to re-identify us. None of those things is happening, and these false rumors need to be squelched whenever they appear. The genealogy community deserves the truth, not fear-mongering.
Let’s look at the facts.
FACT: DNA Companies Are Not Selling Our Data Without Our Permission
None of the major testing companies—AncestryDNA, 23andMe, MyHeritage, Family Tree DNA, or Living DNA—sell our data to third parties without our permission. Here are key sections of their respective policies:
AncestryDNA Terms and Conditions: “Any sharing of Genetic Information for scientific research is governed by our Informed Consent to Research, which only applies if you expressly agree to participate.”
Translation: None of these five companies sell or give away our genetic data without our explicit permission.
FACT: Company Research Programs Are Optional
All five of the main testing companies occasionally conduct research studies, either in-house or using outside collaborators, based on the genetic data of their customers. They only do so with the explicit, informed consent of each individual research participant, in line with universal research protocols for studies involving human subjects. These programs are always optional, and customers are under no obligation to participate in the research studies in order to take a DNA test.
AncestryDNA Informed Consent: “Your consent to participate in this research is completely voluntary and is not required to use any of our products or services. Even if you consent to participate in the research, you may withdraw your consent at any time.”
23andMe Research Consent Document: “Your participation in the 23andMe Research study is completely voluntary.” Later, the document says “Choosing not to give consent or withdrawing from 23andMe Research will not affect your access to your Genetic Information or to the Personal Genome Service.”
MyHeritage Informed Consent Agreement: “Participation in this Project is purely voluntary and you may withdraw your consent to participate in the Project at any time.”
Living DNA Research Consent: “You are in charge of how your DNA and the data derived from it will be used, so you can opt out of this project at any time.”
Family Tree DNA’s Informed Consent for Research is not published on their website.
Translation: These research programs are entirely optional. We do not have to participate to be able to take a DNA test, and if we decide to participate then change our minds, there are no repercussions.
FACT: Our Data Is Not Floating Around Unprotected
Remember that your data is never used for research without your explicit permission. When it’s used, it’s protected as fully as in the main customer databases at each company.
AncestryDNA Informed Consent: “we require our Collaborators and Collaborator Partners to use similar physical, technical, and administrative procedures [to our own privacy protections] to protect the Data and Biological Samples we share with them.”
23andMe Research Consent Document: “Some of these studies may be sponsored by or conducted on behalf of third parties, such as non-profit foundations, academic institutions or pharmaceutical companies.” Even when outside partners sponsor (pay for) research, as with 23andMe’s recent partnership with the pharmaceutical company GlaxoSmithKline, the research itself is done by 23andMe scientists, and only the summary results are shared with the research partner. Your genetic data is only shared outside the company if you also agree to the (optional) Individual Data Sharing Consent.
MyHeritage DNA Informed Consent Agreement: “[MyHeritage and its affiliates] use a combination of physical, technical, and administrative procedures to protect the privacy and security of information.”
Living DNA Research Consent: “The Living DNA Global Research Project operate under ISO:27001 for information security and take great care to ensure the integrity of your data so that the risk of loss or leakage is minimized.”
Family Tree DNA’s Informed Consent for Research is not published on their website.
Translation: No, your data will not end up in GEDmatch or another database to re-identify you. 23andMe and Living DNA do their research studies in-house, so our data doesn’t leave their company without additional, explicit permission from us. Collaborators with AncestryDNA and MyHeritage must treat our data as securely as the companies themselves do. (Of course, as with any data repository, whether online or offline, DNA or otherwise, unforeseen events could breach that privacy.)
A Cautionary Tale: GEDmatch and the Golden State Killer
In late April 2018, the world learned that US law enforcement used a geeky genealogy database called GEDmatch to catch a serial killer and rapist. GEDmatch had long served as a common meeting ground, where genealogists who tested at different companies could compare their results with one another. However, genetic data from a crime lab was uploaded to GEDmatch, and the so-called Golden State Killer was identified using techniques developed for adoptee searches.
It was an absolutely brilliant investigative strategy. It was also done without the informed consent of any of the 900,000 people in the GEDmatch database at the time. None of us (not even the guys who run GEDmatch, who are good people) were told how the database was being used, and none of us were given the chance to opt out. Genealogists working with law enforcement did it anyway. Others have since jumped on the bandwagon, and GEDmatch is now the de facto cop database. (The Terms of Service now explicitly say that law enforcement is using the database, but the owners of an unknown number of DNA profiles there have still not accepted those Terms; their data is being used in criminal investigations without their consent.)
I bring up this example because a loss of public trust in genealogy databases has consequences. The damage done by this breach is hard to quantify, but it exists. I still use GEDmatch, but I withdrew all of my DNA kits from public view when the Golden State Killer story broke. Anecdotally, people are more resistant to testing when I ask them. And I have reason to believe that DNA sales have taken a hit in the aftermath.
I track the database sizes over time. AncestryDNA, Family Tree DNA, and GEDmatch all appear to have experienced declines in growth since April, when the Golden State Killer story became public. MyHeritage does not seem to be affected, perhaps because their market is less US-centric. (23andMe hasn’t released database numbers since February.)
The graph below plots the database sizes over time, since 2012. All of the companies experienced steady or even exponential growth … until April. The black arrows show the inflection points that occurred around the time that the Golden State Killer was arrested.
The graph is not hard evidence that the companies are selling fewer tests because of privacy concerns. For one thing, I don’t have inside sales information, so I don’t know precisely when the inflections occurred. Also, other factors could have caused the declines (if they do in fact exist). The trends are certainly concerning, though.
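For readers who want to run the same kind of check on their own tracking data, here is a minimal sketch in Python of the comparison behind those inflection points: the average number of kits added per month before a cutoff date versus after it. The function names are my own, the numbers are hypothetical placeholders (not actual database sizes from any company), and the sketch assumes the cutoff date is itself one of the recorded data points.

```python
from datetime import date


def monthly_growth(points):
    """Average kits added per month between the first and last data points."""
    if len(points) < 2:
        raise ValueError("need at least two data points")
    (d0, n0), (d1, n1) = points[0], points[-1]
    months = (d1.year - d0.year) * 12 + (d1.month - d0.month)
    if months <= 0:
        raise ValueError("data points must span at least one month")
    return (n1 - n0) / months


def growth_before_after(series, cutoff):
    """Split a date-sorted series at the cutoff and compare growth rates.

    The cutoff point is included in both halves so that the two
    rates are measured from the same database size.
    """
    before = [p for p in series if p[0] <= cutoff]
    after = [p for p in series if p[0] >= cutoff]
    return monthly_growth(before), monthly_growth(after)


# Hypothetical placeholder values for illustration only, NOT real database sizes.
series = [
    (date(2018, 1, 1), 1_000_000),
    (date(2018, 4, 1), 1_300_000),
    (date(2018, 7, 1), 1_450_000),
]

before, after = growth_before_after(series, date(2018, 4, 1))
print(f"kits per month before cutoff: {before:,.0f}")
print(f"kits per month after cutoff:  {after:,.0f}")
```

A lower rate after the cutoff than before it is what shows up as an inflection on a graph. As noted above, a comparison like this cannot say why the rate changed, and with only a handful of publicly reported numbers the exact timing stays uncertain.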
The lesson is simple: A breach of trust at one site will have ripple effects throughout the industry, impacting even companies that are completely transparent about how our data is used. The impacts occur even when the ultimate use of the data is commendable. After all, who doesn’t want to catch serial killers? And who wouldn’t want to help cure cancer? As long as we consent.
False accusations against the companies only compound the damage and should be countered at every turn.
I don’t think anyone is being intentionally misleading when they propagate falsehoods against the companies. As far as I can tell, they genuinely believe that the companies are ‘selling our data without our permission’. Unfortunately, these rumors have gained traction in subsets of our community, despite the lack of evidence. The misinformation needs to be challenged.
False rumors about privacy damage the entire field of genetic genealogy, not just any one company that might be singled out in an accusation. After all, all five of the major testing companies have research programs, and they all have similar consent requirements and data protections. If people can be convinced that one company is selling our data behind our backs, why wouldn’t they believe the same of the other companies?
Genetic privacy absolutely should be a major focus of the genealogy community and the media. After all, there are real risks, and real rewards. If some people evaluate the facts and decide not to test, or not to upload their data to GEDmatch, so be it. That said, the general public deserves accurate information about the benefits and hazards involved in DNA testing so they can make informed choices. Too often, inflammatory media outlets and individuals hype the dangers and misrepresent what the companies do with our DNA results. We must all work to correct the record.
Updates to this post
12 November 2018: Clarified 23andMe’s policy of doing research in-house