The 23andMe Hack

By now, you’ve probably heard that 23andMe was “hacked” by criminals who stole the data of up to 7 million users. Technically, it wasn’t a hack; 23andMe’s security systems weren’t breached. Rather, the criminals acquired emails and passwords exposed by lapses at other websites, then logged in to 23andMe accounts that used the same login credentials. This kind of attack is called credential stuffing.
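
For readers who like to see the mechanics, here is a minimal Python sketch of why password reuse is the weak point. Everything in it is hypothetical: the email addresses, the passwords, and the helper names `hash_password`, `make_account`, and `reused_logins` are made up for illustration, and nothing here reflects how 23andMe actually stores or checks credentials. It only shows that a login leaked from one site opens an account on another site when, and only when, the same email and password were used in both places.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    """Derive a salted hash, which a site might store instead of the password."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 10_000)


def make_account(password):
    """Create a (salt, stored_hash) record for a hypothetical account."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)


def reused_logins(accounts, leaked_pairs):
    """Return the emails whose leaked password also unlocks this site."""
    exposed = []
    for email, password in leaked_pairs:
        if email not in accounts:
            continue  # no account here under that email
        salt, stored = accounts[email]
        if hmac.compare_digest(hash_password(password, salt), stored):
            exposed.append(email)
    return exposed


# Hypothetical accounts at the targeted site.
site_accounts = {
    "alice@example.com": make_account("hunter2"),            # reused elsewhere
    "ellie@example.com": make_account("unique-to-this-site"),
}

# Hypothetical email/password pairs leaked from an unrelated website.
leaked_elsewhere = [
    ("alice@example.com", "hunter2"),
    ("ellie@example.com", "old-forum-password"),
    ("amy@example.com", "password123"),
]

print(reused_logins(site_accounts, leaked_elsewhere))
```

In this toy example only Alice's account is exposed, because she reused her password. Ellie's unique password protects her own login, though not, as discussed below, the data that her matches can see.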

What data exactly did the crooks get?  That’s a great question.  23andMe has been tight-lipped about those details, but we can guess.  For the customers whose login credentials were stolen, we can assume the criminals got (1) personal details (name, sex, age, etc.), (2) raw DNA data, (3) health and trait reports, (4) Ancestry Composition (ethnicity estimates), (5) DNA match data including shared segments, and (6) in some cases, street address.  Basically, anything accessible to a user from their own account was up for grabs.

That’s Not All

This is huge.  And it gets worse.  Unfortunately, even those of us with unique passwords were exposed, because chances are we share at least some DNA with someone who was compromised.  The criminals claim to have data from at least 7 million 23andMe customers.  I estimate that’s almost everyone who has opted into DNA matching at 23andMe.  That means you.  It means me.  It means all of us who use 23andMe for genealogy.

23andMe’s matching system is particularly vulnerable in this respect.  Not only can it show personal details of our matches and their shared segments with us, it allows us to see the DNA segments that our matches share with one another (assuming they have fully opted in to the DNA Relatives feature).  

In other words, if Alice, Ellie, and Amy are all related and Alice was hacked, the criminal could see all of the DNA segments that Alice shares with Ellie and Amy as well as all of the segments Ellie and Amy share with each other.  If Alice has any biomedically important traits on those segments, the criminals could theoretically infer some of the traits that Ellie and Amy have, too.

In a worst-case scenario, where Alice had 1,500 DNA matches whose shared segments are visible, the criminals would be able to see more than 1 million pairwise comparisons of people who had not been hacked. They could see whether or not the two people in each of those million-plus pairs matched one another and, if so, by how much and where in the genome.
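
The "more than 1 million" figure is simply the number of ways to pick two people out of 1,500. A short Python check, where the function name `pairwise_comparisons` is my own and the 1,500 figure is the match count from the scenario above:

```python
from math import comb


def pairwise_comparisons(match_count):
    """Number of distinct pairs among a hacked account's matches: n * (n - 1) / 2."""
    return comb(match_count, 2)


print(pairwise_comparisons(1500))  # 1124250
```

The count grows roughly with the square of the number of matches, so an account with twice as many visible matches exposes about four times as many comparisons.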

The more accounts that were hacked, the more data the criminals have on all of us.

Next Steps

What can you do to protect yourself?  You can change your password and enable two-factor authentication, but that’s shutting the barn door after the horses have bolted.  Your data is probably already out there.

23andMe can also implement changes to protect their customers.  They could add two-factor authentication directly to their site, like AncestryDNA and MyHeritage do, rather than requiring users to download a third-party app.  Raw data downloads should require email confirmation first.  The same security check should apply to bulk downloads of DNA Relatives data.

Over the long term, perhaps the best thing we can all do is to contact our government representatives to demand that our genetic information receive extra protections, over and above those that already exist for data in general.  This would be good for genealogists, and it would be good for the industry.  

DNA is special.  It contains intensely personal family and health information that can never be changed.  If your credit card is stolen, you can replace it.  If your phone number is leaked, you can screen your calls.  If your genetic data is compromised, it’s compromised forever.  

What’s more, there is currently no federal legislation in the US to stop a total stranger—or even the government—from analyzing your genome simply because they want to.  You shed DNA everywhere you go, and law enforcement considers it fair game because you “abandoned” it.

Layer that onto recent ethical lapses by leaders in forensic genetic genealogy, and genetic privacy does not exist in the US.  That needs to change.

US residents can find their federal and state representatives here.  Write to them now.

Image by OpenClipart-Vectors from Pixabay

31 thoughts on “The 23andMe Hack”

  1. “contact our government representatives to demand that our genetic information receive extra protections, over and above those that already exist for data in general. ”

    Alas, they can’t protect us. What does “extra protection” mean?

    1. It means acknowledging that our genomic data is protected under the Fourth Amendment and enacting laws to protect it from both the government and criminals.

  2. Is there any list of whose kits were exposed? Usually Dashlane and my Discover card alert me to breaches, but I would rather not rely on them.

    1. Not to my knowledge. 23andMe has said “If we learn that your data has been accessed without your authorization, we will contact you separately with more information.” It’s unclear from that statement if they will contact the matches of the impacted accounts.

  3. There’s no evidence of raw DNA data being stolen, why do you claim this and on what basis? The raw DNA data isn’t accessible via their general API, the querying of a single SNP allele isn’t repeatable 650,000 times to get all raw DNA data of a single person.

    If you request to download your raw DNA data, an email is sent, same if you request to download the relatives file. I highly doubt that the hackers would do so and risk that the owner of the account is warned by these emails that his account has been taken over.

    1. The hackers didn’t need a third-party API. They logged directly into the vulnerable accounts, and once there they could download the raw data. They could even have changed the email address on the account to cover their tracks. Whether they actually did any of that is a question we need answered.

  4. They cannot contact anyone because they can’t really distinguish between a genuine user accessing a lot of his/her DNA matches vs access by a bad actor.

    Once again, nothing was being hacked, the bad actor used account data from users who used the same email/password combination across several websites/apps and were previously hacked.

    IMO, these email/password combinations were either from the hack at GEDmatch or MyHeritage. It’s pretty easy to identify where a DNA kit was uploaded from at GEDmatch, so they will know where to use these credentials.

    BTW, I’m sure that these email/password combinations also work in part at Ancestry, FTDNA and MyHeritage, as users tend to use the same combination at different DNA websites (if they have tested somewhere there as well).

    There’s also no evidence that it involves 7 million users, or that 1 million users with AJ background have been affected. All of these are just speculations and rumors, and unless someone provides the data file so that people can cross-check it, that’s all they are: rumors.

    I also disagree that there’s a need to secure our DNA data more. 23andMe offers two-factor authentication, which ensures that no one can take over your account. So again, if the user doesn’t use this additional security, then why should 23andMe (or any other DNA testing company that offers this) be at fault?

    I’m more concerned about the data breach at GEDmatch, MyHeritage as both were confirmed by the respective companies.

    What the legislation should rather do is tighten the access of law enforcement where they use DNA information of people that have never given consent. The DNA Geek has written about this in the past here, where both GEDmatch and FTDNA violated T&C’s and have given access to LE. It should also always be that consent has to be given explicitly!

    1. A few points:
      • Please see the next comment. 23andMe is contacting people who were affected. Presumably they have access to the hacked files.
      • I’m not convinced the login credentials came from MyHeritage; the passwords in that incident were encrypted.
      • FTDNA would not be affected by credential stuffing, because the login is a kit number, not an email address. That’s annoyed me in the past, but now I appreciate the extra security.
      • I agree completely that we need laws to regulate forensic genetic genealogy. We disagree in that I’m advocating for more. All of the DNA testing companies will benefit from an environment where their customers feel safer. AncestryDNA, 23andMe, and FTDNA should be leading the charge for legislation. (I don’t know whether MyHeritage can lobby US congresspeople.)

  5. I was one of those whose data was breached, according to an email I received from 23andme. (Rumor has it that they targeted Ashkenazi Jews—surprise, surprise. When all else fails, persecute the Jews.) According to that email, all that was taken was what was in our “personal profile.” Assuming that is true (an assumption I am reluctant to make), all that was in my personal profile was my first name, last initial, and the surnames and birthplaces of some of my ancestors. I had not given permission to include my year of birth, and I’d not given any other personal information. I deleted all the ancestral info and changed my password. Aside from that, I have to hope for the best. To be honest, 23andme has been the least useful DNA site for me and I am tempted to delete it anyway. I’ll take any advice you or others have to secure whatever else may have been taken despite 23andme’s assertion otherwise.

    1. It is more likely that AJ were predominantly affected (rather than ‘targeted’) because they have a much larger base of DNA matches, providing a large attack surface of compromised accounts, whereas indigenous South/Central Americans (for example) tend to have only a few DNA matches.

      1. 23andMe caps matches at 1500 total, so that shouldn’t be a factor for AJ versus other Europeans. You’re right that people from under-tested populations had less exposure.

        1. Many AJ users at my company have bought the subscription and have north of 15k DNA matches.

          It’s also very easy to go around the 1,500 DNA matches limitation. I have never purchased the subscription and I have over 3,000 DNA matches for both myself and my mum’s DNA kit.

          I agree with Oliver’s comment. AJ’s are also a tight knit group, meaning that most AJ’s have almost only AJ DNA matches, with few exceptions.

          So it’s very easy to collect a lot of their profiles once you’re in.

        2. The Plus membership gives you 5,000 matches, not 15,000. Is there any evidence that AJ genealogists are more likely to purchase the subscription than other groups?

  6. Changing the email address (or rather the request of it) will send out an email to the original email address AFAIK.

    The hackers used the API for sure, how else can you scrape a lot of data in a short amount of time? Using humans wouldn’t scale at all. You can use many apps that are running in parallel and accessing different accounts all the while masking your IP address.

  7. Ok, point taken on 23andMe being able to identify atypical behavior that must have repeated in the same pattern over and over with many accounts. That’s a great achievement, given how many log files they had to go through and analyze to identify those patterns.

    Secondly, my main source for the email addresses is still GEDmatch (from their data breach), but to be fair I threw in MH as well. But thanks for the info that MH uses kit numbers, though I always log in via email to their service. So the option for both exists.

  8. 23andMe did NOT notify me initially, but they did send me an “update” on the situation.
    I am already suffering daily email spam through a data breach at one of my country’s main providers of mobile telephone services. But I used a different account for 23andMe and have not seen anything – so far: my previous experience has been of a delay of 3-12 months before things happen.
    Two-factor authentication is now fairly common elsewhere, so people should be amenable to this.
    Sharing of family data and the availability of information is critical to family history.
    But it also allows forms of theft or trolling. It is so sad.
    Hopefully we can find a point where protection is adequate, because already people are rightfully becoming more cautious, sometimes to the extent that their caution prevents them from allowing the system or others to find their family for them.

  9. Hello Ann, long time no speak. Yes, I’m referring to the data breach incident mentioned in the link you posted. I read it on Verogen’s blog at that time.

    1. In that case, the combo of email/password did not involve GEDmatch or MyHeritage except through the phishing attack that persuaded a few people to change their password at the phony website myheritaqe.com.

  10. Can 23andMe really have 4 million users from the UK?

    Social media and news outlets are buzzing with claims that data from millions of 23andMe users is on the Dark Web, including 4 million “mainly” UK users. Before sharing such information, it’s crucial to fact-check.

    As of February 2023, 23andMe has genotyped 12,200,000 people.

    So, 4 million would represent roughly a third of it.

    Most 23andMe users know that their DNA matches predominantly come from the US. British Genetic Genealogist Debbie Kennett notes that 23andMe isn’t useful for her as “most matches are from the US” (paraphrased).

    Let’s dive into our database, which contains information from our users’ 23andMe data, using the country code for the origin of maternal grandmothers, the most shared information. “GB” is the 3rd largest group at 23andMe, “IE” the 5th largest, representing 5.5% of all their DNA kits, or 669,245 kits from the UK.

    Read further at: https://www.facebook.com/yourDNAfamily/posts/pfbid0QgEQStJs4wx3arwrszyRsFTuw5QZDoD7wbeDBEfRa9v6Z8rxaYU1teMNARkRxeBUl

    Disclaimer:

    I’m the author of the “Your DNA family” app and this link leads to an article posted on our Facebook page. I’m not affiliated with 23andMe.

    1. To assess that claim, we need to know what is meant by “mainly UK.” If they mean of UK descent, I’m surprised it’s only 4 million. If they mean 4 million people living in the UK now, the claim isn’t even close to credible.

      23andMe has genotyped more than 14 million people now, but only about half (very rough guesstimate) are in DNA Relatives, so a max of about 7 million were affected by the hack.

  11. The “mainly” is from the person who wrote this article: https://www.msn.com/en-us/money/other/cybercrim-claims-fresh-23andme-batch-takes-leaked-records-to-5-million/ar-AA1iw8BF

    I’ve now found another article that was referenced in the MSN piece. It’s now clear that the hacker indeed claims 4 million people living in Great Britain:

    Yesterday, a threat actor named ‘Golem,’ who is allegedly behind the 23andMe attacks, leaked an additional 4.1 million data profiles of people in Great Britain and Germany on the BreachForums hacking forum.

    “This additional leak includes 4,011,607 lines of 23andMe data for people living in Great Britain.”

    Source: https://www.bleepingcomputer.com/news/security/hacker-leaks-millions-of-new-23andme-genetic-data-profiles/

    So yeah, completely BS. Debbie Kennett has posted that 23andMe has sold 250k DNA kits in the UK as of June 2020:

    “23andMe had sold 250,000 kits in the UK as of June 2020. See this report from the House of Commons Science and Technology Committee.”

    This is her source: https://publications.parliament.uk/pa/cm5802/cmselect/cmsctech/94/9403.htm

  12. It is in this case; see my updated comments in my Facebook post. There’s at least one person who purchased all the data and ran a lot of analysis on it, mostly crossing Y-DNA haplogroups vs location information (coming from the 4 grandparents in most cases).

    Also a clarification on the Ashkenazi Jews, the Chinese and British data files. So yeah, all his analysis makes sense except the part about the hacker’s claim to be able to partially reconstruct the DNA profiles of 14 million unhacked profiles by using segment information. That part doesn’t make sense IMO; maybe you can make more sense of it.

    The hacker claims he hacked 100,000 profiles. I guess what he means is that he got access to the DNA matches of 100k profiles.

    1. As much as I question some of the practices of Parabon’s genealogists, I think it’s highly improbable that they’re behind this hack.
