Correcting the Record on Privacy

Every once in a while, untruths spread through the genetic genealogy world that have the potential to do great damage: damage to adoptees seeking their biological families, to genealogists hoping to tackle brick walls, and to a thriving new tech industry that enhances an established, traditional hobby. So, every once in a while, we need to correct the record.

The issue I’m concerned about today is the compound myth that some of the DNA companies are selling our data without our consent; that we are contractually required to give them free access to our data in order to test; and that our data is being uploaded to sites like GEDmatch, MyHeritage, and Family Tree DNA to re-identify us. None of those things is happening, and these false rumors need to be squelched whenever they appear. The genealogy community deserves the truth, not fear-mongering.

So, let’s look at the facts and sort this out once and for all.

FACT: DNA Companies Are Not Selling Our Data Without Our Permission

None of the major testing companies—AncestryDNA, 23andMe, MyHeritage, Family Tree DNA, or Living DNA—sell our data to third parties without our permission. Here are key sections of their respective policies:

AncestryDNA Terms and Conditions: “Any sharing of Genetic Information for scientific research is governed by our Informed Consent to Research, which only applies if you expressly agree to participate.”

23andMe Privacy Policy: “23andMe will not sell, lease, or rent your individual-level information to any third party or to a third party for research purposes without your explicit consent.”

MyHeritage Privacy Policy: “We will never sell or license DNA samples, DNA Results, DNA Reports or any other DNA information, to any third parties without your explicit informed consent.”

Family Tree DNA Privacy Policy: “We will never share your Genetic Information with pharmaceutical or insurance companies, employers, or third-party marketers without your express consent.”

Living DNA Privacy Policy: “The EU General Data Protection Regulations (which are known as GDPR) apply to us when we collect or use personal information. The regulations were introduced to protect people’s data.” Under GDPR rules, Living DNA cannot share any data with third parties without the explicit consent of the client.

Translation: None of these five companies sell or give away our genetic data without our explicit permission.

FACT: Company Research Programs Are Optional

All five of the main testing companies occasionally conduct research studies, either in-house or using outside collaborators, based on the genetic data of their customers. They only do so with the explicit, informed consent of each individual research participant, in line with universal research protocols for studies involving human subjects. These programs are always optional, and customers are under no obligation to participate in the research studies in order to take a DNA test.

AncestryDNA Informed Consent: “Your consent to participate in this research is completely voluntary and is not required to use any of our products or services. Even if you consent to participate in the research, you may withdraw your consent at any time.”

23andMe Research Consent Document: “Your participation in the 23andMe Research study is completely voluntary.” Later, the document says “Choosing not to give consent or withdrawing from 23andMe Research will not affect your access to your Genetic Information or to the Personal Genome Service.”

MyHeritage Informed Consent Agreement: “Participation in this Project is purely voluntary and you may withdraw your consent to participate in the Project at any time.”

Living DNA Research Consent: “You are in charge of how your DNA and the data derived from it will be used, so you can opt out of this project at any time.”

Family Tree DNA’s Informed Consent for Research is not published on their website.

Translation: These research programs are entirely optional. We do not have to participate to be able to take a DNA test, and if we decide to participate then change our minds, there are no repercussions.

FACT: Our Data Is Not Floating Around Unprotected

Remember that your data is never used for research without your explicit permission. When it’s used, it’s protected as fully as in the main customer databases at each company.

AncestryDNA Informed Consent: “we require our Collaborators and Collaborator Partners to use similar physical, technical, and administrative procedures [to our own privacy protections] to protect the Data and Biological Samples we share with them.”

23andMe Research Consent Document: “Some of these studies may be sponsored by or conducted on behalf of third parties, such as non-profit foundations, academic institutions or pharmaceutical companies.” Even when outside partners sponsor (pay for) research, as with 23andMe’s recent partnership with the pharmaceutical company GlaxoSmithKline, the research itself is done by 23andMe scientists, and only the summary results are shared with the research partner. Your genetic data is only shared outside the company if you also agree to the (optional) Individual Data Sharing Consent.

MyHeritage DNA Informed Consent Agreement: “[MyHeritage and its affiliates] use a combination of physical, technical, and administrative procedures to protect the privacy and security of information.”

Living DNA Research Consent: “The Living DNA Global Research Project operate under ISO:27001 for information security and take great care to ensure the integrity of your data so that the risk of loss or leakage is minimized.”

Family Tree DNA’s Informed Consent for Research is not published on their website.

Translation: No, your data will not end up in GEDmatch or another database to re-identify you. 23andMe and Living DNA do their research studies in-house, so our data doesn’t leave their companies without additional, explicit permission from us. Collaborators with AncestryDNA and MyHeritage must treat our data as securely as the companies themselves do. (Of course, as with any data repository, online or offline, DNA or otherwise, unforeseen events could breach that privacy.)

A Cautionary Tale: GEDmatch and the Golden State Killer

In late April 2018, the world learned that US law enforcement used a geeky genealogy database called GEDmatch to catch a serial killer and rapist. GEDmatch had long served as a common meeting ground, where genealogists who tested at different companies could compare results with one another. However, genetic data from a crime lab was uploaded to GEDmatch, and the so-called Golden State Killer was identified using techniques developed for adoptee searches.

It was an absolutely brilliant investigative strategy. It was also done without the informed consent of any of the 900,000 people in the GEDmatch database at the time. None of us (not even the guys who run GEDmatch, who are good people) were told how the database was being used, and none of us were given the chance to opt out. Genealogists working with law enforcement did it anyway. Others have since jumped on the bandwagon, and GEDmatch is now the de facto cop database. (The Terms of Service now explicitly say that law enforcement is using the database, but an unknown number of DNA profiles there have still not accepted those Terms; their data is being used in criminal investigations without their consent.)

UPDATE: On 31 January, 2019, Family Tree DNA announced that they are working with the FBI. They had changed their Terms of Service in the previous month without notifying their customers. They have since set up an opt-out system that automatically opts most of their customers in to law enforcement exposure without their explicit, informed consent. I no longer recommend Family Tree DNA as a trusted company for genetic genealogy.

I bring up this example because a loss of public trust in genealogy databases has consequences. The damage done by this breach is hard to quantify, but it exists. I still use GEDmatch, but I withdrew all of my DNA kits from public view when the Golden State Killer story broke. Anecdotally, people are more resistant to testing when I ask them. And I have reason to believe that DNA sales have taken a hit in the aftermath.

I track the database sizes over time. AncestryDNA, Family Tree DNA, and GEDmatch all appear to have experienced declines in growth since April, when the Golden State Killer story became public. MyHeritage does not seem to be affected, perhaps because their market is less US-centric. (23andMe hasn’t released database numbers since February.)

The graph below plots the database sizes over time, since 2012. All of the companies experienced steady or even exponential growth … until April. The black arrows show the inflection points that occurred around the time that the Golden State Killer was arrested.

The graph is not hard evidence that the companies are selling fewer tests because of privacy concerns. For one thing, I don’t have inside sales information, so I don’t know precisely when the inflections occurred. Also, other factors could have caused the declines (if they do in fact exist). The trends are certainly concerning, though.

The lesson is simple: A breach of trust at one site will have ripple effects throughout the industry, impacting even companies that are completely transparent about how our data is used. The impacts occur even when the ultimate use of the data is commendable. After all, who doesn’t want to catch serial killers? And who wouldn’t want to help cure cancer? As long as we consent.

False accusations against the companies only compound the damage and should be countered at every turn.

Facts Matter

I don’t think anyone is being intentionally misleading when they propagate falsehoods about the companies. Most seem to genuinely believe that the companies are ‘selling our data without our permission’. Unfortunately, these rumors have gained traction in subsets of our community, despite no evidence. The misinformation needs to be challenged.

False rumors about privacy damage the entire field of genetic genealogy, not just any one company that might be singled out in an accusation. After all, all five of the major testing companies have research programs, and they all have similar consent requirements and data protections. If people can be convinced that one company is selling our data behind our backs, why wouldn’t they believe the same of the other companies?

Genetic privacy absolutely should be a major focus of the genealogy community and the media. After all, there are real risks, and real rewards. If some people evaluate the facts and decide not to test, or not to upload their data to GEDmatch, so be it. That said, the general public deserves accurate information about the benefits and hazards involved in DNA testing so they can make informed choices. Too often, inflammatory media outlets and individuals hype the dangers and misrepresent what the companies do with our DNA results. We must all work to correct the record.

Updates to this post

  • 12 November 2018: Clarified 23andMe’s policy of doing research in-house.
  • 1 February 2019: Referenced the announcement that Family Tree DNA is working with the FBI.
  • 29 April 2019: Added reference to Family Tree DNA’s opt-out policy for law enforcement exposure and rescinded my personal endorsement for the company.

38 thoughts on “Correcting the Record on Privacy”

  1. Thank you for that information on the sharing of DNA data. I had many people telling me they would not now do a test because of their fear of unapproved sharing. Now I can tell them the truth. Much appreciated.

    1. You’re welcome. There will always be people who decline to test—and that’s their right—but they should have correct information to help them decide. Good luck with your family research!

  2. You said: “but I withdrew all of my DNA kits from public view when the Golden State Killer story broke. ”

    But they are still at GEDmatch? Why? Can you see public matches but they can’t see you?

  3. Why remove your data from public view? (That means it’s now a ‘research account’, yes? You can see others, but they can’t see you?) Are you afraid the police might find a criminal in the family or… that some other entity will find a way to abuse data, perhaps in ways we don’t yet know?

    1. The cops are already well aware of the criminals in my family! I withdrew from the public database for two reasons:
      (1) In my opinion, government agents rifling through my genetic data without my consent is an unconstitutional search under the 4th Amendment. Others are welcome to disagree, but I will not participate. Ironically, had LE set up a database based on consent, I would be working for them right now.
      (2) LE was able to use GEDmatch without the knowledge of its owners. We have no idea who else is using GEDmatch or for what purposes.

      1. So as I mentioned, a concern for future abuse by those other than LE. I doubt a case can be made against LE using public data…(esp. when they get away with going through people’s trash.) I believe I read that Blaine T Bettinger (genetic genealogy author, PhD in genetics) released his dna data publicly, indicating he wasn’t worried over what use could be made of it. The only disadvantageous use I can see (now) is by health insurance or employers wanting to deny coverage or employment due to dna potentials for disease/cost… assuming they could match for myriad diseases and owner identification is possible for NGOs and laws are not prohibiting that. Yet I believe there are some laws prohibiting/limiting such use already. Unfortunately, US administrations like the current one may make abuse possible, for a couple more years. I expect to see more protection from future administrations, not less.

        1. A strong case can be made against LE using genealogy databases on 4th Amendment grounds. If they can access our DNA held in a private database (GEDmatch is a privately owned corporation), then privacy no longer exists. Even convicts, who are forced to give DNA samples for the FBI’s CODIS database, have more genetic privacy.

          What Blaine chooses to do with his genetic data is entirely up to Blaine. He himself would agree that he does not speak for anyone else.

  4. I heard from a professional genealogist that the DNA match was that of the GSK’s relative and that led them to stake out the GSK’s property and obtain additional evidence that convicted him.

    1. They used GEDmatch to identify him, then collected DNA from his trash to do a comparison that they could use in court. He hasn’t been convicted yet. Unfortunately, if it turns out that using GEDmatch was an unconstitutional search, he won’t be convicted.

  5. We can talk all day about how the terms of service don’t allow these companies to sell your information or use your information in research without your consent, BUT the unfortunate reality with any data stored in a database is that it is susceptible to an eventual hack or breach. When our saved credit card numbers are leaked, it is inconvenient to reverse fraudulent charges and to get a new card issued, but it isn’t the end of the world. Even with the Equifax breach, which exposed the personal information of hundreds of millions of individuals, including addresses and Social Security numbers, it still isn’t the end of the world; we just have to be hyper-aware of what goes on our credit.

    BUT if your DNA is leaked, that’s it; there’s no reset button. Once your DNA is out there, there’s no putting the lid back on it. Who knows what future government might misuse that information, or how private companies will simply use your data for their own gain, without the need for your consent, if they obtained it online through a leak? This is why I deleted my DNA records off of Ancestry.com.

    1. Your DNA data without associated trait information—like disease history, behavioral details, etc.—is worthless.

        1. DNA is not trait information. We’ve known for more than 100 years that there is a difference between genetics (genotype) and observable traits (phenotype).

  6. “23andMe and Living DNA, thus far, do their research studies in-house, so our data doesn’t leave their companies.” FALSE. Did you just ignore 23andMe’s $300 million contract with GlaxoSmithKline, one of the world’s largest pharma companies, for exclusive rights to their data?!

    Databases like these are also huge targets for hacking. Just because your data isn’t used for research, doesn’t mean that it’s not out there.

    1. The research for GlaxoSmithKline is still conducted in-house by 23andMe. GSK is contracting for queries to be run against the database and gets summary level statistics in anonymized and aggregated form. For example (I am making this up as an illustration), 23% of customers who had high cholesterol and took statins reported that they did not respond well. This subset was more likely to have a certain result for a certain SNP. Your personal data would be just one data point out of many thousands.

      1. Thank you for that clarification, Ann. Do you know whether the in-house model will hold for future collaborations?

        1. Thanks, Ann. That was my interpretation of the Research Consent Document, but the press release for the Glaxo deal discusses data transfer and storage. It wasn’t clear from the press release whether genetic data was being transferred to Glaxo.

  7. So now you are one of those people who mine DNA match sites but you don’t share with anyone else. Better to just remove yourself completely. That’s how you play fair.

        1. I participated in the public matching database and paid for Tier 1 features for years, right up until the database was used by federal agents without anyone’s consent. As I said, direct your anger elsewhere; it won’t do any good here.

  8. You win. I don’t care what you do but I hope you don’t match me because I share with everyone who contacts me and unknowingly I would share with you but would you share with me? Not likely.

    1. Toni, I spend hundreds, if not thousands, of hours a year writing (free) blog posts, moderating (free) Facebook groups, and providing (free) search angel help for people conducting genealogy research. Your anger is woefully misdirected.

  9. Excellent article. I was surprised to see that FTDNA’s Informed Consent for Research isn’t published on their website, so I checked. Wouldn’t section 6 of their privacy statement cover it? Or am I missing something?

    1. Section 6 says that each research project has its own consent form. I italicized the relevant section in FTDNA’s privacy policy below (https://www.familytreedna.com/legal/privacy-statement):

      Using information for research with your consent
      You have the choice to participate in future genetic research opportunities.

      Consent process for research: Users wishing to participate in genetic research projects may indicate so and will be included in a candidate pool. If you become a potential candidate for a research project, you will be contacted to grant specific consent for every individual research project opportunity. Once given, consent cannot be revoked for research that has already occurred or is underway.

      If you no longer consent to FamilyTreeDNA Research: Consent for future research can be revoked by notifying FamilyTreeDNA that you wish to be removed from the candidate pool or by declining invitations to participate in specific research projects. Removing yourself from the research candidate pool means that you will not receive any invitations to participate in research projects.

      1. I agree that Section 6 is not as strong a blanket statement as offered by the other companies. But it does give an assurance that your data won’t be used for research without your consent, even though it’s on a project by project basis. And you have the option to revoke your consent. That’s enough to make me feel comfortable.

    1. Thanks. As with any company, law enforcement can request customer information via legal channels. 23andMe has yet to provide data in response to such a request. All of the requests to Ancestry were related to credit card fraud and identity theft and were not relevant to DNA testing.

      Whether law enforcement is surreptitiously submitting samples to the company databases is another issue.

  10. We have seen numerous examples in the world of a person’s characteristics being used to remove them from this planet: Holocausts and other killings. Dictators or nefarious groups couldn’t care less about the protection of the 4th Amendment, and if they take over the databases then it will be very easy to target someone just by knowing his or her DNA. Let’s say I want to travel to Spain and the government of Spain has decided that anyone with a German heritage of more than 50% should be refused entry. They could easily do this with those databases. That’s very scary.

  11. You cite the terms of service for these companies which is fine until they change them. (You include the GEDmatch example). Fact is that they can change their TOS at any time. My DNA would already be with them; nothing I can do about it. I can’t get it back!

    1. You’re right; I was a lot more trusting when I wrote that post a year ago. Companies can change their Terms of Service, and some have even outright violated them, although that exposes them to potential lawsuits. I no longer use or recommend companies that don’t comply with their own ToS.
