This post has been updated.
On April 25, 2018, news broke that Joseph James DeAngelo was arrested as the serial rapist and murderer who terrorized California in the 1970s and ’80s. The criminal was variously known as the East Area Rapist, the Original Night Stalker, the Visalia Ransacker, and the Diamond Knot Killer, with brutal rapes and murders committed in 10 counties across the state. Some people have similar fears of crimes carried out in abandoned places across the US thanks to the legacy of this man. Not until years after his last known attack in 1986 were DNA methods available to connect all of those crimes to the same man. Yet, for decades, no one knew who he was. The arrest was the result of newer DNA technology being brought to a very cold case that had resisted decades of prior investigation.
But what were those methods? The Sacramento County Sheriff and District Attorney have been very circumspect in describing what was done. Details are still trickling out, but it’s clear that they used the same methods that genetic genealogists use to identify unknown parents and grandparents.
Consider these quotes:
- “In some way, a DNA link was identified between the suspect in the case and an unknown member of the public who also shared various components of DNA with a bunch of other people, several dozen people. And so the Sheriff’s Department then had to take that information, whittle out the people that couldn’t possibly be suspects, and then zero in on people who could be suspects. Which is how they ended up at Mr DeAngelo’s door.” – Bob Moffitt in an NPR interview, 25 April 2018
- “It was DNA that was related. That’s all I can say,” [Sacramento County Sheriff Scott] Jones said. “I can’t say it’s a family member’s DNA. I can say there was, in employing this technology, there was a link between the DNA that we had and the potential for examining a universe of folks of which our guy was a member.” – Capital Public Radio, 25 April 2018
- “His DNA was not in a criminal database. I would say it that way.” and “The arrest warrant is under seal, and the mechanism of the DNA will eventually come out but as I say it’s really some innovative DNA work that led to him being a suspect and then the sample ultimately which identified him through his DNA.” – Sacramento County District Attorney Anne Marie Schubert, in an interview with Megyn Kelly, 26 April 2018
All point to a genealogy-based search, which was confirmed today by multiple news outlets, including the New York Times.
You probably have questions about how they did it, and I’ll answer to the best of my ability. I want to be clear about a few things first. I do unknown parentage searches every day, but I was not involved in this case. And although I love the science and laud the outcome here, I have deep misgivings about a genealogical database being used by law enforcement without the knowledge or consent of its participants.
Imagine that you are an adoptee who wants to find her biological parents. You would submit your DNA to one or more genealogy-oriented testing companies and hope to be matched to close relatives. In this case, “close” means they share enough DNA with you to be 3rd cousins or better. Sometimes you strike gold and find an uncle, cousin, half sibling, or even parent, but usually the search takes a lot of work.
Closer matches are better of course, but the strategy stays the same: you pore through the family trees of your closest matches looking for connections among them. If you have a few matches who are all descended from John Jacobs and Mary Mallone, it’s a good bet that you’re descended from them as well. You then flesh out the families of these probable ancestors looking for candidates who were in the right place at the right time to be the person you’re seeking.
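The tree-comparison strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling any genealogist or investigator actually used: apart from John Jacobs and Mary Mallone, every name, date, and place below is hypothetical, and the function names `shared_ancestors` and `plausible_candidates` are made up for this example.

```python
from collections import Counter

# Hypothetical input: each DNA match mapped to the ancestral couples
# found in that match's family tree.
match_trees = {
    "match_a": {"John Jacobs & Mary Mallone", "Peter Hale & Ann Rowe"},
    "match_b": {"John Jacobs & Mary Mallone", "Carl Voss & Ida Lund"},
    "match_c": {"John Jacobs & Mary Mallone"},
    "match_d": {"Samuel Pike & Ruth Dorr"},
}


def shared_ancestors(trees, min_matches=2):
    """Return (couple, count) pairs for couples found in at least
    `min_matches` of the matches' trees, most common first."""
    counts = Counter(couple for couples in trees.values() for couple in couples)
    return [(couple, n) for couple, n in counts.most_common() if n >= min_matches]


# Hypothetical descendants of the probable ancestors, built by
# fleshing out their families from ordinary genealogical records.
descendants = [
    {"name": "Person 1", "birth_year": 1948, "lived_in": {"California"}},
    {"name": "Person 2", "birth_year": 1951, "lived_in": {"Ohio"}},
    {"name": "Person 3", "birth_year": 1975, "lived_in": {"California"}},
]


def plausible_candidates(people, place, earliest_birth, latest_birth):
    """Keep only people who were in the right place at the right time."""
    return [
        person["name"]
        for person in people
        if place in person["lived_in"]
        and earliest_birth <= person["birth_year"] <= latest_birth
    ]


print(shared_ancestors(match_trees))
print(plausible_candidates(descendants, "California", 1940, 1960))
```

In practice the hard part is everything this sketch takes for granted: building accurate trees for each match and tracing every descendant of the shared couple.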
But how did the police get this guy’s DNA into a genealogy database, given that they didn’t know who he was in the first place?
We can use the case of the Jane Doe known as Buckskin Girl to understand the methodology. In 1981, a young woman was found murdered in a roadside ditch in Troy, Ohio. Her identity was unknown. Recently, a blood sample collected during her autopsy was rediscovered and subjected to whole genome sequencing. Although the sample had degraded over the years, the lab was able to recover more than half of her genome. That was enough to create a mock genealogy DNA test that could be uploaded to GEDmatch.com, a private, third-party database meant to be a common meeting ground for genealogists who tested at different commercial companies. Volunteers for the DNA Doe Project lucked upon a close match to Buckskin Girl who turned out to be a first cousin once removed. That cousin led them to the identity of Marcia King after 37 years of anonymity.
This process is almost certainly what happened with the Golden State Killer, except that the DNA sample came from crime scene evidence rather than an autopsy, and the initial matches may not have been as close. (I infer this from statements by law enforcement that they had to filter through a hundred or so candidates.) AncestryDNA, 23andMe, MyHeritage, and Family Tree DNA all denied that authorities had asked them to use their databases. Two days after the arrest, The Mercury News confirmed that the database used was GEDmatch.
GEDmatch has since issued a statement:
April 27, 2018 We understand that the GEDmatch database was used to help identify the Golden State Killer. Although we were not approached by law enforcement or anyone else about this case or about the DNA, it has always been GEDmatch’s policy to inform users that the database could be used for other uses, as set forth in the Site Policy (linked to the login page and https://www.gedmatch.com/policy.php). While the database was created for genealogical research, it is important that GEDmatch participants understand the possible uses of their DNA, including identification of relatives that have committed crimes or were victims of crimes. If you are concerned about non-genealogical uses of your DNA, you should not upload your DNA to the database and/or you should remove DNA that has already been uploaded. To delete your registration contact [redacted]@gmail.com
Two Ideas, Both Important
There are two key ideas vying for attention in these stories. One, the most obvious, is the incredible power of genetic genealogy to bring closure and justice to truly horrendous tragedies. If enough DNA is left behind, cases like these can be solved, even decades later. And knowing that they will be caught eventually may be enough to deter violent criminals in the first place.
The second involves ethical considerations that might seem secondary immediately after the arrest of so evil a man. Nonetheless, the long-term implications are worth considering. I wish I had answers, but I have mostly questions:
- Should law enforcement be accessing private DNA databases that were created by and for genealogists pursuing a hobby?
- Does this violate the 4th Amendment right against unreasonable searches and seizures? Or perhaps the General Data Protection Regulation (GDPR) set to take effect soon in the European Union?
- What happens when the wrong person is identified publicly?
- What happens when a case is solved using DNA that was put into a database by someone other than the tester?
- Should the genealogists doing this work have formal qualifications?
- Could fear of government overreach deter people from testing in the first place or cause them to delete their results? This would harm both genealogy and forensics.
- Worse, might people sue a database if they learn that their data was used for a purpose to which they hadn’t consented?
- Could defense attorneys use an “unreasonable search” argument to throw out evidence against their clients?
Perhaps the best solution would be for law enforcement to create a separate database composed only of people who have given explicit informed consent for forensic uses. The responses to both the Buckskin Girl and Golden State Killer cases have been largely positive, so a volunteer database would likely grow rapidly.
In the meantime, those of us doing genetic genealogy as a hobby or profession should ensure that everyone we ask to test or transfer into a database understands how the databases are being used first. In addition to our standard warnings that family secrets might be uncovered, we must tell them that their data might be used by government authorities.
What Do We Tell Our Relatives?
When considering where to test or transfer, the Terms of Service are important considerations. Here’s what the main genealogy databases have to say:
- AncestryDNA is a testing company that does not accept data from other sources: “Any saliva sample you provide is either your own or the saliva of a person for whom you are a parent or legal guardian.” A sample submitted by law enforcement would probably violate these Terms.
- 23andMe is a testing company that does not accept data from other sources for relative matching: “You are guaranteeing that any sample you provide is your saliva; if you are agreeing to these TOS on behalf of a person for whom you have legal authorization, you are confirming that the sample provided will be the sample of that person.” Law enforcement might be considered to have legal authorization over a forensic sample, but since 23andMe does not take transfers, getting a degraded sample into their database would be quite difficult.
- MyHeritage is a testing company that does accept data from other sources: “you represent that any DNA sample you provide and any information that you transfer or upload that associates an individual with his/her DNA Results are either your DNA or the DNA of a person for whom you are a legal guardian or have obtained legal authorization to provide their DNA to us.” Law enforcement might be considered to have legal authorization over a forensic sample and could transfer a file from a private lab into their database.
- FTDNA is a testing company that does accept data from other sources. They do not appear to have any restrictions on who may submit samples to their database. Law enforcement could transfer a file from a private lab into their database.
- GEDmatch is a third-party site that only accepts data transferred from other sources: “Please acknowledge that any sample you submit is either your DNA or the DNA of a person for whom you are a legal guardian or have obtained authorization to upload their DNA to GEDmatch.” (From the upload form rather than their Site Policy.) Law enforcement could transfer a file from a private lab into their database.
Ensuring that we and our relatives are fully informed and consent to how their DNA is used will protect us all. We now know for a fact that law enforcement is using the GEDmatch database, so that must be taken into consideration when deciding whether to transfer data there.
- Police used consumer genealogical websites to identify Golden State Killer suspect – Richard Winton, Joseph Serna, Paige St. John and Benjamin Oreskes, San Diego Union Tribune, April 26, 2018
- Relative’s DNA from genealogy websites cracked East Area Rapist case, DA’s office says – Sam Stanton and Ryan Lillis, Sacramento Bee, April 26, 2018
- Sacramento Police Say They Have Taken The Golden State Killer Into Custody – Bob Moffitt, NPR, April 25, 2018
- After searching for more than 40 years, authorities say an ex-cop is the Golden State Killer – Ray Sanchez, Elizabeth I. Johnson, Steve Almasy and Alanne Orjoux, CNN, April 26, 2018
- After Four Decades, Golden State Killer Suspect Arrested In Sacramento – Bob Moffitt, Capital Public Radio, April 25, 2018
- How a Genealogy Site Led to the Front Door of the Golden State Killer Suspect – New York Times
- ‘Buck Skin Girl’ Case Break Is Success of New DNA Doe Project – Seth Augenstein, Forensic Magazine, April 16, 2018
- The DNA Doe Project Q&A: What did the DNA Doe Project do that led to the identification of “Buckskin Girl” as Marcia King? – The DNA Doe Project Facebook Page
- How private is your DNA on ancestry websites? East Area Rapist case raises questions – Dale Kasler and Anita Chabria, Sacramento Bee, April 26, 2018
- A Serial Killer Was Caught Because Investigators Found His Family’s DNA On A Website – Dan Vergano and Virginia Hughes, BuzzFeed, April 26, 2018
- Here’s the ‘open-source’ genealogy DNA website that helped crack the Golden State Killer case – Matthias Gafni, The Mercury News, April 26 & 27, 2018
- How to find a killer using DNA and genealogy – Kitty Cooper, Kitty Cooper’s Blog, April 29, 2018
Updates to This Post
- 9 December 2020 — Some dates added to quotes for historical context.