This summer, GEDmatch was served with a warrant from an Orlando, Florida, court pertaining to a violent serial rapist. The warrant demanded access to GEDmatch’s entire database of users, including those who had not opted in to law-enforcement matching. A redacted version of that warrant is now available.
Here, I analyze some of the key points in the warrant that are relevant to genetic genealogy. First, a review of the events leading up to this warrant is in order. (A more comprehensive timeline is here.)
- 24 April 2018 — The Golden State Killer, a brutal rapist and murderer who had escaped justice for more than 30 years, was arrested after Barbara Rae-Venter identified him through genetic genealogy. The GEDmatch database had been used without GEDmatch’s knowledge.
- 8 May 2018 — Parabon NanoLabs announced a new Genetic Genealogy Service headed by CeCe Moore. Within 9 days, they had uploaded about 100 DNA profiles to GEDmatch.
- 24 May 2018 — GEDmatch changed their Terms of Service to specify that investigations of violent crimes, defined as homicide and sexual assault, were allowed.
- 31 January 2019 — We learned that Family Tree DNA had granted the FBI access to their database of more than a million customers as early as Fall 2018, in contravention of their Terms of Service. FTDNA changed their Terms of Service in December 2018 to allow law enforcement to use their database, but they failed to notify their customers, as required by that very same contract.
- 12 March 2019 — In response to backlash from the earlier Terms of Service controversy, FTDNA introduced an opt-out system for law-enforcement matching. However, they automatically opted in the majority of their customers by default.
- 14 May 2019 — We learned that the owner of GEDmatch had personally granted permission for Parabon to use the database to investigate an assault, although the Terms of Service at the time only allowed homicide and sexual assault investigations.
- 19 May 2019 — GEDmatch implemented a new opt-in system for law enforcement matching. All DNA kits were automatically opted out, and users could choose whether to make their kits available to law enforcement agents.
- 14 June 2019 — The warrant in question was signed by a judge in Orlando, Florida.
- 24 September 2019 — The US Department of Justice issued an interim policy for forensic genetic genealogy investigations that specified such searches could proceed “only in those [genetic genealogy] services that provide explicit notice to their service users and the public that law enforcement may use their service sites to investigate crimes or to identify unidentified human remains” (page 6).
- 5 November 2019 — We learned that GEDmatch was served and complied with a warrant in June or July to provide matching data for users who had not opted in to law enforcement matching.
Thus, on 18 May, law enforcement had access to the DNA profiles of approximately 1.2 million genealogists at GEDmatch. The very next day, they had none, setting the stage for the search warrant signed by a judge less than a month later. It required GEDmatch to turn over matching data of customers who were opted out of law enforcement matching. The warrant was submitted prior to the Department of Justice policy that would have prohibited them from using a database that did not explicitly permit law-enforcement access (e.g., the opted-out kits at GEDmatch).
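To make the effect of the 19 May change concrete, here is a minimal sketch of how an opt-in flag might gate law-enforcement searches. It is an illustration only: the `Kit` class, the `law_enforcement_opt_in` field, and the `searchable_kits` function are hypothetical names of my own, and nothing here reflects how GEDmatch actually stores or filters kits.

```python
from dataclasses import dataclass


@dataclass
class Kit:
    kit_id: str
    # Hypothetical flag. It defaults to False to mirror the 19 May 2019
    # change, when every kit was automatically opted out.
    law_enforcement_opt_in: bool = False


def searchable_kits(kits, law_enforcement):
    """Return the kits a search is allowed to be compared against.

    Assumption: ordinary genealogy searches see every kit, while
    law-enforcement searches see only kits whose owners opted in.
    """
    if not law_enforcement:
        return list(kits)
    return [k for k in kits if k.law_enforcement_opt_in]


# Made-up kits for illustration: only K2's owner has opted in.
kits = [Kit("K1"), Kit("K2", law_enforcement_opt_in=True), Kit("K3")]

print([k.kit_id for k in searchable_kits(kits, law_enforcement=True)])
print([k.kit_id for k in searchable_kits(kits, law_enforcement=False)])
```

Under a default like this, a law-enforcement search starts from an empty pool and grows only as individual users opt in. The warrant, in effect, asked for the result of the second call rather than the first.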
This Was a Cold Case
The details of the violent sexual assaults are redacted in the warrant (as they should be), and no dates are given. However, indirect evidence indicates that this was a cold case. First, although this warrant relates to crimes that took place in Orlando, Florida, DNA evidence linked the same perpetrator to assaults in Volusia County that were “also cold cases” (page 17).
Second, the original DNA testing was done using older technology called restriction fragment length polymorphism (RFLP).
The aforementioned [redacted] sexual battery cases were linked using restriction fragment length polymorphism (RLFP), which was DNA testing available in [redacted]. Since then, one (1) case (Orlando Police Department Case Number: [redacted]), has been converted over to Short Tandem Repeat (STR) analysis, which is the common method used today by the FBI CODIS Database and by the Florida Department of Law Enforcement (FDLE). (page 17)
RFLPs require more DNA and are less robust to degraded samples than the technology now used by the CODIS database. Precisely when Florida switched from RFLPs to STRs is unclear, but it appears to have been around the year 2000.
The perpetrator’s STR profile is in the CODIS database and “no matches have ever been made from these attempts” (page 17), suggesting that he hasn’t been active for at least 20 years. There was no urgency in accessing GEDmatch only weeks after the Terms of Service had changed.
The Suspect Was Not Identified with the CODIS Database
One of the forensic samples was reanalyzed for the CODIS markers and routinely used to search the CODIS and INTERPOL databases, with no success.
The aforementioned DNA/seminal fluid evidence was later entered into the Federal Bureau of Investigation’s (FBI) Combined DNA Index System (CODIS) Database and has been routinely run against the known and unknown offender’s lists. As of this date, no matches have ever been made from these attempts. The aforementioned DNA evidence was also entered into the International Criminal Police Organization, more commonly known as INTERPOL, Database, and has been routinely run against the known and unknown offenders with cooperating international police organizations. As of this date, no matches have ever been made from these attempts. (page 17)
This is reassuring. Genealogists need to know that traditional investigative resources available to law enforcement have been exhausted before agents co-opt hobbyist databases that contain highly private genetic information. Florida allows familial searching of the CODIS database at the state level, meaning that all of the potential information in CODIS had been extracted.
The Terms of Service Were Not Violated Originally
The perpetrator’s DNA was sent for analysis on 27 December 2018 (page 20) and had been uploaded to GEDmatch by mid-January 2019.
On January 16, 2019, Parabon NanoLabs, Inc. prepared a report on the DNA analyzed. (page 21)
At the time, sexual assault investigations were permitted in the GEDmatch database, and the site had not yet implemented the opt-in system for law enforcement cases. Therefore, the initial upload was within the Terms of Service at GEDmatch at the time. Why the investigation wasn’t pursued then, and why the detective didn’t have copies of the GEDmatch reports from the original upload, are open questions.
GEDmatch Was Under a Gag Order
GEDmatch was ordered not to reveal the existence of the warrant until the end of the investigation. The crime has not been solved.
PURSUANT TO F.S. 934.25(6) and 18 U.S.C. 2705(b) -YOU ARE ORDERED NOT TO DISCLOSE THE EXISTENCE OF THIS WARRANT TO THE CONCLUSION OF THE INVESTIGATION. ANY SUCH DISCLOSURE WILL IMPEDE THE INVESTIGATION AND THEREBY OBSTRUCT JUSTICE. (page 1)
So, while the warrant granted law-enforcement access to DNA kits that were not opted into law-enforcement matching—a violation of trust for the genealogists who chose not to opt in—GEDmatch was under court order to remain silent. We can’t fault GEDmatch for that.
GEDmatch Complied Almost Immediately
We can, however, fault them for not putting our genetic privacy first. GEDmatch was given 20 working days to fulfill the terms of the warrant.
I FURTHER COMMAND GEDmatch Inc. to provide the requested data to [the Detective] within 20 business days. (page 4)
According to The New York Times, GEDmatch provided the requested data the very next day. This is worrisome. They had ample time to ask a privacy lawyer whether the warrant was overly broad, and to challenge it if so, but they apparently chose not to.
The Warrant Was Overly Broad
The original upload of the forensic sample found two very good matches and seven more promising ones.
The genetic genealogy assessment resulted in two (2) promising matches from a genealogy perspective (>300 cM of shared DNA; second cousin or closer) and seven (7) potential helpful matches (70 cM – 300 cM; third cousin or closer). (page 21)
Again, aspects of this are reassuring. The detective had specific information he hoped to find at GEDmatch, namely nine DNA matches who were likely to be third cousins or closer. He knew the information existed because it had been available to investigators prior to May 19, 2019.
Had he only requested data for those nine matches, the warrant might have been reasonable. However, he demanded (1) the entire one-to-many match list (≥1,500 kits, depending on which version of the tool was used), the vast majority of which will be very distant cousins to the perpetrator, as well as (2) “all one-to-one matches” (page 2), which could be 30,000 or more if GEDmatch interpreted that request literally.
Thus, the search takes on aspects of a fishing expedition.
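The scale of the mismatch is easier to see with numbers. The sketch below applies the two centimorgan tiers quoted from the warrant (more than 300 cM, and 70 cM to 300 cM) and computes how small a share the nine useful matches are of the two list sizes mentioned above. The function names, the label for matches below 70 cM, and the sample cM values are my own inventions for illustration. Because 1,500 is a minimum and 30,000 is a rough estimate, the percentages are upper bounds.

```python
PROMISING_CM = 300  # warrant: ">300 cM", second cousin or closer
HELPFUL_CM = 70     # warrant: "70 cM - 300 cM", third cousin or closer


def classify_match(shared_cm: float) -> str:
    """Sort a match into the tiers described in the warrant.

    The "below the warrant's tiers" label is an assumption; the warrant
    does not name a category for matches under 70 cM.
    """
    if shared_cm > PROMISING_CM:
        return "promising"
    if shared_cm >= HELPFUL_CM:
        return "potentially helpful"
    return "below the warrant's tiers"


def share_of_request(useful: int, requested: int) -> float:
    """Fraction of the requested kits that fall in the useful tiers."""
    return useful / requested


# Made-up shared-cM values, not data from the case.
for cm in (350.0, 300.0, 70.0, 12.5):
    print(cm, classify_match(cm))

useful_matches = 2 + 7  # promising plus potentially helpful, per the warrant
print(f"{share_of_request(useful_matches, 1_500):.2%} of a 1,500-kit one-to-many list")
print(f"{share_of_request(useful_matches, 30_000):.2%} of 30,000 one-to-one matches")
```

Even under the most generous reading, well under one percent of the kits swept up by the request belong to the nine matches the detective already knew about.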
What’s more, the warrant requests data that a typical user cannot see, like the real names of testers who used aliases, when they last logged in to GEDmatch, and their “registered mobile numbers.”
PROPERTY TO BE PROVIDED BY GEDmatch, INC.
Information provided by GEDmatch indicates the following information is available to law enforcement upon service of proper process:
- GEDmatch kit number for all one-to-one matches
- Letter following GEDmatch kit number for one-to-one matches
- Email address for all one-to-one matches
- Real name associated with all one-to-one matches
- Alias associated with all one-to-one matches
- Date and time stamp of one-to-one matching profile’s creation date
- Most recent logins for all one-to-one matches
- Registered mobile number for all one-to-one matches
- Many-to-one report
(pages 2, 5, 39)
(Red emphasis mine.)
I’m not even sure what the second-to-last item means, because GEDmatch doesn’t ask users for their phone numbers. Does “registered mobile number” mean something else, like IP address? Or is GEDmatch collecting our phone numbers some other way?
As an aside, the wording here is odd. “Information provided by GEDmatch” suggests they had previously given the detective a list of data they were willing to give investigators. If true, it suggests that GEDmatch knew in advance that a warrant was coming.
A Focus on Original Testing Company
The warrant specifies that GEDmatch disclose the “letter following GEDmatch kit number for one-to-one matches” and “Kit Number Matches (to include the first character to show testing company)” (pages 2, 5, and 39). I suspect the detective is referring to the same thing in both places and doesn’t realize that the letter preceding older kit numbers (e.g., A123456 for a kit from Ancestry) is an integral part of the number and not something that needs to be requested specifically. Without that letter, it’s not a kit number. (Newer kit numbers have a two-letter prefix that does not correlate to testing company.)
The fact that he requested it, though, gives me pause. If he’s doing the investigative work at GEDmatch, why does he need to know? Is it a tell that he is planning to serve warrants on the original testing companies, too? (23andMe and AncestryDNA have both announced that they would contest warrants to access customer genetic data in their databases.)
A Seasoned Detective, New to Genetic Genealogy
The detective who wrote the warrant had been serving the public since 1995. He’s worked in drug enforcement, criminal investigations, assault & battery, robbery, and homicide units. He has also completed more than 600 hours of educational training during his career as an officer. His credentials as a homicide investigator are not in question. (I’ve made an editorial decision not to name him, although he is named in the warrant, because this blog post isn’t about him; it’s about the privacy issues surfaced by the warrant.)
He was, however, new to genetic genealogy. He had solved only one case involving genetic genealogy previously.
Your Affiant has successfully completed a genetic genealogy case (Orlando Police Department 2001-380051), the murder of Christine Franke. (page 8)
According to Wikipedia, the genetic genealogy component of the Franke case was performed by Parabon, not the detective himself. That explains the small errors throughout the warrant, like not knowing that the kit number includes a preceding letter, or referring to a “many-to-one report” instead of One-to-Many (page 2).
Later, the detective writes:
This report is now blocked by GEDmatch …. The data is still present in the system. The public has access to the report but law enforcement does not. (page 31)
That last sentence, “the public has access to the report”, is not true. The DNA data should have been uploaded as a research kit, meaning other users wouldn’t even know it was there and therefore couldn’t run reports on it. (I don’t think the detective was lying; I think he simply didn’t know … which is my point.)
And in the rape case at hand, he didn’t even know the kit number of his own sample. On page 31, he writes “The suspect DNA was assigned a yet to be identified kit number by GEDmatch” (emphasis mine).
Why didn’t the detective know the GEDmatch kit number for his own case? For that matter, why hadn’t he copied the one-to-many, matching segment search, and triangulation reports when they first became available?
All this makes one wonder whether the detective had sufficient experience in genetic genealogy to understand the privacy implications for the thousands or tens of thousands of innocent genealogists who would match the suspect in one-to-one comparisons. The vast majority of that data would be useless to him.
The Case Wasn’t a Slam Dunk
Parabon classified the case as “Level 3” (of five), with only a medium probability of being solved using genetic genealogy techniques.
Level 3: Medium probability of being solved by GG analysis
This case is expected to produce actionable information for your agency. It may even be possible to identify the unknown subject or narrow down their identity to a list of possibilities from within a specific extended family through GG analysis alone. However, this analysis has additional risk, either because 1) the number of unique, potentially informative matches is small, increasing the probability that the detailed family information may not be discoverable — e.g., due to adoption, or 2) a significant amount of family tree building will be required, which likely will not be able to be completed within a standard GG analysis. (page 21)
In other words, the case wasn’t a particularly promising candidate for genetic genealogy. The detective wasn’t going to identify the perpetrator based solely on the nine matches specified in the warrant. At best, he was going to have something to work with that might or might not lead anywhere in a cold case. The case remains unsolved 5 months after the warrant was signed.
Was that worth overriding the privacy settings of more than a million innocent people?
THOUGHTS ON THE WARRANT
There’s Only One Bad Guy Here
I shouldn’t have to say this, but I do. The only bad guy here is the perpetrator. He committed atrocious crimes: sexual battery with a deadly weapon, armed kidnapping, and armed burglary with a battery within (page 17), all of which are punishable by life sentences.
The detective is not a bad guy; he’s doing his job. The judge is not a bad guy; she’s doing hers. GEDmatch is not a bad guy; they were compelled by a warrant. But good people can sometimes do the wrong thing. And I believe that the warrant, as written, was the wrong thing.
Both Reassuring and Alarming
The warrant is reassuring and alarming at the same time. What’s reassuring is that the detective had exhausted more established methods, like CODIS markers, before attempting genetic genealogy. And he asked for information that had once been available, i.e., that the suspect’s kit had nine promising matches. That is specific information that seems like a reasonable request in a warrant.
On the other hand, he asked for a lot more than just those nine matches and a lot more than a normal user of GEDmatch would have. General users can’t see the real names of all matches nor when they last logged in, yet GEDmatch was compelled to provide that information. He also asked for “mobile numbers”, which GEDmatch does not collect to my knowledge.
Of greatest concern is that GEDmatch was given 20 business days to comply with the warrant, yet they turned over the requested information within 24 hours. That’s simply not enough time to have a lawyer read, analyze, and respond to a 42-page warrant on a novel type of search involving the genetic privacy of more than a million people. I will reiterate: the owners of GEDmatch are good people, but they made a mistake.
There was no urgency. This was a cold case. The perpetrator has apparently been inactive for two decades. There was no reason to believe that he would harm anyone new in the week or two needed for GEDmatch to consult a lawyer. For that matter, there was no reason to believe that harm would come by waiting for more people to voluntarily opt in to law-enforcement matching.
All this for a case that only had a moderate probability of being solved.
Will Another Warrant Happen?
The real question is, how many times has it already happened? The warrant’s logic is that the search was reasonable because someone on the genetic genealogy team had seen the match list. Hundreds of forensic kits had been uploaded to GEDmatch by 19 May, 2019. Detectives investigating any one of them could use the same argument for their own cases. Maybe they already have. After all, the only reason we know about the Orlando warrant is that the detective was overheard talking about it.
The Department of Justice interim policy for forensic genetic genealogy attempts to strike a balance between civil liberties and the power of genetic genealogy to make the public safer. It says that such searches should only be used for violent crime (unless the database itself permits otherwise), only after alternative methods have been considered, and only in databases that notify their users that law enforcement may be searching them. The latter should theoretically exclude the opted-out portion of GEDmatch’s database.
However, the Department of Justice policy only took effect on 1 November, 2019. Warrants served on a database prior to that would not be so constrained. And the guidelines don’t apply to state and local agencies unless they receive federal funding.
So, yes, I expect we’ll learn about more warrants on genealogy databases in due time.
Where Do We Go from Here?
The genealogy community continues to sunder as one group advocates for self-control of our data while the other thinks public safety outweighs individual privacy. Accusations fly. Discussion of the topic is being censored. Kit sales are down. And the two largest players in the industry, 23andMe and Ancestry, felt compelled to issue statements saying that they prioritize customer privacy within days of the GEDmatch warrant becoming public.
There is a way forward, though, and that’s informed consent: only expose kits to criminal investigations if the tester has granted permission. The testing companies already practice informed consent for their research programs; there’s no reason not to apply similar protocols to criminal investigations. For consent to be informed, however, the tester must be apprised of both pros and cons in an objective way, they should not be pressured to make a particular choice, and their decision should be respected absolutely.
We also need to step back and show respect for different choices: Right now, the civil liberties group is often accused of ‘protecting murderers and rapists’ rather than having their privacy concerns heard. As an analogy, those who opt out of biomedical research are not accused of being pro-cancer, even though the disease kills nearly 600,000 Americans per year. Why should opting out of criminal investigations be any different?