The GEDmatch Warrant

This summer, GEDmatch was served with a warrant from an Orlando, Florida, court pertaining to a violent serial rapist.  The warrant demanded access to GEDmatch’s entire database of users, including those who had not opted in to law-enforcement matching.  A redacted version of that warrant is now available.

Here, I analyze some of the key points in the warrant that are relevant to genetic genealogy.  First, a review of the events leading up to this warrant is in order.  (A more comprehensive timeline is here.)

 

The Timeline

  • 24 April 2018 — The Golden State Killer, a brutal rapist and murderer who had escaped justice for more than 30 years, was arrested after Barbara Rae-Venter identified him through genetic genealogy.  The GEDmatch database had been used without their knowledge.
  • 8 May 2018 — Parabon NanoLabs announced a new Genetic Genealogy Service headed by CeCe Moore.  Within 9 days, they had uploaded about 100 DNA profiles to GEDmatch.
  • 24 May 2018 — GEDmatch changed their Terms of Service to specify that investigations of violent crimes, defined as homicide and sexual assault, were allowed.
  • 31 January 2019 — We learned that Family Tree DNA had granted the FBI access to their database of more than a million customers as early as Fall 2018, in contravention of their Terms of Service.  FTDNA changed their Terms of Service in December 2018 to allow law enforcement to use their database, but they failed to notify their customers, as required by that very same contract.
  • 12 March 2019 — In response to backlash from the earlier Terms of Service controversy, FTDNA introduced an opt-out system for law-enforcement matching.  However, they automatically opted in the majority of their customers by default.
  • 14 May 2019 — We learned that the owner of GEDmatch had personally granted permission for Parabon to use the database to investigate an assault, although the Terms of Service at the time only allowed homicide and sexual assault investigations.
  • 19 May 2019 — GEDmatch implemented a new opt-in system for law enforcement matching.  All DNA kits were automatically opted out, and users could choose whether to make their kits available to law enforcement agents.
  • 14 June 2019 — The warrant in question was signed by a judge in Orlando, Florida.
  • 24 September 2019 — The US Department of Justice issued an interim policy for forensic genetic genealogy investigations that specified such searches could proceed “only in those [genetic genealogy] services that provide explicit notice to their service users and the public that law enforcement may use their service sites to investigate crimes or to identify unidentified human remains” (page 6).
  • 5 November 2019 — We learned that GEDmatch had been served with, and had complied with, a warrant in June or July requiring it to provide matching data for users who had not opted in to law enforcement matching.

Thus, on 18 May 2019, law enforcement had access to the DNA profiles of approximately 1.2 million genealogists at GEDmatch.  The very next day, they had none, setting the stage for the search warrant signed by a judge less than a month later.  It required GEDmatch to turn over matching data of customers who were opted out of law enforcement matching.  The warrant was submitted prior to the Department of Justice policy that would have prohibited investigators from using a database that did not explicitly permit law-enforcement access (e.g., the opted-out kits at GEDmatch).
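For readers who want to check the intervals, here is a minimal Python sketch that computes the gaps between three dates from the timeline above.  The variable names are my own, chosen for illustration; only the dates come from the timeline.

```python
from datetime import date

# Dates taken from the timeline; the variable names are illustrative only.
opt_in_launched = date(2019, 5, 19)    # GEDmatch opted all kits out by default
warrant_signed = date(2019, 6, 14)     # judge in Orlando signed the warrant
warrant_disclosed = date(2019, 11, 5)  # the warrant became public knowledge

# Subtracting two dates gives a timedelta; .days is the whole-day gap.
days_until_warrant = (warrant_signed - opt_in_launched).days
days_until_disclosure = (warrant_disclosed - warrant_signed).days

print(f"Opt-in launch to warrant: {days_until_warrant} days")
print(f"Warrant to disclosure: {days_until_disclosure} days")
```

The first gap comes to 26 days, consistent with "less than a month later," and the second to 144 days, or a little under five months.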

THE WARRANT

This Was a Cold Case

The details of the violent sexual assaults are redacted in the warrant (as they should be), and no dates are given.  However, indirect evidence indicates that this was a cold case.  First, although this warrant relates to crimes that took place in Orlando, Florida, DNA evidence linked the same perpetrator to assaults in Volusia County that were “also cold cases” (page 17).

Second, the original DNA testing was done using older technology called restriction fragment length polymorphism (RFLP).

The aforementioned [redacted] sexual battery cases were linked using restriction fragment length polymorphism (RLFP), which was DNA testing available in [redacted]. Since then, one (1) case (Orlando Police Department Case Number: [redacted]), has been converted over to Short Tandem Repeat (STR) analysis, which is the common method used today by the FBI CODIS Database and by the Florida Department of Law Enforcement (FDLE). (page 17)

RFLPs require more DNA and are less robust to degraded samples than the technology now used by the CODIS database.  Precisely when Florida switched from RFLPs to STRs is unclear, but it appears to have been around the year 2000.

The perpetrator’s STR profile is in the CODIS database and “no matches have ever been made from these attempts” (page 17), suggesting that he hasn’t been active for at least 20 years.  There was no urgency to accessing GEDmatch only weeks after the Terms of Service had changed.

 

The Suspect Was Not Identified with the CODIS Database

One of the forensic samples was reanalyzed for the CODIS markers and routinely used to search the CODIS and INTERPOL databases, with no success.

The aforementioned DNA/seminal fluid evidence was later entered into the Federal Bureau of Investigation’s (FBI) Combined DNA Index System (CODIS) Database and has been routinely run against the known and unknown offender’s lists. As of this date, no matches have ever been made from these attempts. The aforementioned DNA evidence was also entered into the International Criminal Police Organization, more commonly known as INTERPOL, Database, and has been routinely run against the known and unknown offenders with cooperating international police organizations. As of this date, no matches have ever been made from these attempts. (page 17)

This is reassuring.  Genealogists need to know that traditional investigative resources available to law enforcement have been exhausted before agents co-opt hobbyist databases that contain highly private genetic information.  Florida allows familial searching of the CODIS database at the state level, meaning that all of the potential information in CODIS had been extracted.

 

The Terms of Service Were Not Violated Originally

The perpetrator’s DNA was sent for analysis on 27 December 2018 (page 20) and had been uploaded to GEDmatch by mid-January 2019.

On January 16, 2019, Parabon NanoLabs, Inc. prepared a report on the DNA analyzed. (page 21)

At the time, sexual assault investigations were permitted in the GEDmatch database, and the site had not yet implemented the opt-in system for law enforcement cases.  Therefore, the initial upload was within the Terms of Service at GEDmatch at the time.  Why the investigation wasn’t pursued then, and why the detective didn’t have copies of the GEDmatch reports from the original upload, are open questions.

 

GEDmatch Was Under a Gag Order

GEDmatch was ordered not to reveal the existence of the warrant until the end of the investigation.  The crime has not been solved.

PURSUANT TO F.S. 934.25(6) and 18 U.S.C. 2705(b) -YOU ARE ORDERED NOT TO DISCLOSE THE EXISTENCE OF THIS WARRANT TO THE CONCLUSION OF THE INVESTIGATION. ANY SUCH DISCLOSURE WILL IMPEDE THE INVESTIGATION AND THEREBY OBSTRUCT JUSTICE. (page 1)

So, while the warrant granted law-enforcement access to DNA kits that were not opted into law-enforcement matching—a violation of trust for the genealogists who chose not to opt in—GEDmatch was under court order to remain silent.  We can’t fault GEDmatch for that.

 

GEDmatch Complied Almost Immediately

We can, however, fault them for not putting our genetic privacy first.  GEDmatch was given 20 working days to fulfill the terms of the warrant.

I FURTHER COMMAND GEDmatch Inc. to provide the requested data to [the Detective] within 20 business days. (page 4)

According to The New York Times, GEDmatch provided the requested data the very next day.  This is worrisome.  They had ample time to ask a privacy lawyer whether the warrant was overly broad, and to challenge it if so, but they apparently chose not to.

 

The Warrant Was Overly Broad

The original upload of the forensic sample found two very good matches and seven more promising ones.

The genetic genealogy assessment resulted in two (2) promising matches from a genealogy perspective (>300 cM of shared DNA; second cousin or closer) and seven (7) potential helpful matches (70 cM – 300 cM; third cousin or closer). (page 21)

Again, aspects of this are reassuring.  The detective had specific information he hoped to find at GEDmatch, namely nine DNA matches who were likely to be third cousins or closer.  He knew the information existed because it had been available to investigators prior to May 19, 2019.
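To make the two tiers concrete, here is a minimal Python sketch that sorts matches by the thresholds quoted from the warrant (more than 300 cM, and 70 to 300 cM).  The function name and the sample values are hypothetical; they are not taken from the warrant or from any GEDmatch tool.

```python
def classify_match(shared_cm: float) -> str:
    """Place a match into the tiers described in the warrant (page 21)."""
    if shared_cm > 300:
        # ">300 cM of shared DNA; second cousin or closer"
        return "promising"
    if shared_cm >= 70:
        # "70 cM - 300 cM; third cousin or closer"
        return "potentially helpful"
    # Anything smaller falls outside the nine matches the warrant describes.
    return "below the warrant's thresholds"


# Hypothetical shared-cM values, for illustration only.
for cm in (412.0, 300.0, 95.5, 22.3):
    print(f"{cm:6.1f} cM -> {classify_match(cm)}")
```

Most entries on a one-to-many list would fall into the last tier, which is why the scope of the request matters.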

Had he only requested data for those nine matches, the warrant might have been reasonable.  However, he demanded (1) the entire one-to-many match list (≥1,500 kits, depending on which version of the tool was used), the vast majority of which will be very distant cousins to the perpetrator, as well as (2) “all one-to-one matches” (page 2), which could be 30,000 or more if GEDmatch interpreted that request literally.

Thus, the search takes on aspects of a fishing expedition.

What’s more, the warrant requests data that a typical user cannot see, like the real names of testers who used aliases, when they last logged in to GEDmatch, and their “registered mobile numbers.”

PROPERTY TO BE PROVIDED BY GEDmatch, INC.

Information provided by GEDmatch indicates the following information is available to law enforcement upon service of proper process:

      • GEDmatch kit number for all one-to-one matches
      • Letter following GEDmatch kit number for one-to-one matches
      • Email address for all one-to-one matches
      • Real name associated with all one-to-one matches
      • Alias associated with all one-to-one matches
      • Date and time stamp of one-to-one matching profile’s creation date
      • Most recent logins for all one-to-one matches
      • Registered mobile number for all one-to-one matches
      • Many-to-one report (pages 2, 5, 39)

(Red emphasis mine.)

I’m not even sure what the second-to-last item means, because GEDmatch doesn’t ask users for their phone numbers.  Does “registered mobile number” mean something else, like IP address?  Or is GEDmatch collecting our phone numbers some other way?

As an aside, the wording here is odd.  “Information provided by GEDmatch” suggests they had previously given the detective a list of data they were willing to give investigators.  If true, it suggests that GEDmatch knew in advance that a warrant was coming.

 

A Focus on Original Testing Company

The warrant specifies that GEDmatch disclose the “letter following GEDmatch kit number for one-to-one matches” and “Kit Number Matches (to include the first character to show testing company)” (pages 2, 5, and 39).  I suspect the detective is referring to the same thing in both places and doesn’t realize that the letter preceding older kit numbers (e.g., A123456 for a kit from Ancestry) is an integral part of the number and not something that needs to be requested specifically.  Without that letter, it’s not a kit number.  (Newer kit numbers have a two-letter prefix that does not correlate to testing company.)

The fact that he requested it, though, gives me pause.  If he’s doing the investigative work at GEDmatch, why does he need to know?  Is it a tell that he is planning to serve warrants on the original testing companies, too?  (23andMe and AncestryDNA have both announced that they would contest warrants to access customer genetic data in their databases.)

A Seasoned Detective, New to Genetic Genealogy

The detective who wrote the warrant had been serving the public since 1995.  He’s worked in drug enforcement, criminal investigations, assault & battery, robbery, and homicide units.  He has also completed more than 600 hours of educational training during his career as an officer.  His credentials as a homicide investigator are not in question.  (I’ve made an editorial decision not to name him, although he is named in the warrant, because this blog post isn’t about him; it’s about the privacy issues surfaced by the warrant.)

He was, however, new to genetic genealogy.  He had solved only one case involving genetic genealogy previously.

Your Affiant has successfully completed a genetic genealogy case (Orlando Police Department 2001-380051), the murder of Christine Franke. (page 8)

According to Wikipedia, the genetic genealogy component of the Franke case was performed by Parabon, not the detective himself.  That explains the small errors throughout the warrant, like not knowing that the kit number includes a preceding letter, or referring to a “many-to-one report” instead of One-to-Many (page 2).

Later, the detective writes:

This report is now blocked by GEDmatch …. The data is still present in the system. The public has access to the report but law enforcement does not. (page 31)

That last sentence, “the public has access to the report”, is not true.  The DNA data should have been uploaded as a research kit, meaning other users wouldn’t even know it was there and therefore couldn’t run reports on it.  (I don’t think the detective was lying; I think he simply didn’t know … which is my point.)

And in the rape case at hand, he didn’t even know the kit number of his own sample.  On page 31, he writes “The suspect DNA was assigned a yet to be identified kit number by GEDmatch” (emphasis mine).

Why didn’t the detective know the GEDmatch kit number for his own case?  For that matter, why hadn’t he copied the one-to-many, matching segment search, and triangulation reports when they first became available?

All this makes one wonder whether the detective had sufficient experience in genetic genealogy to understand the privacy implications for the thousands or tens of thousands of innocent genealogists who would match the suspect in one-to-one comparisons.  The vast majority of that data would be useless to him.

 

The Case Wasn’t a Slam Dunk

Parabon classified the case as “Level 3” (of five), with only a medium probability of being solved using genetic genealogy techniques.

Level 3: Medium probability of being solved by GG analysis

This case is expected to produce actionable information for your agency. It may even be possible to identify the unknown subject or narrow down their identity to a list of possibilities from within a specific extended family through GG analysis alone. However, this analysis has additional risk, either because 1) the number of unique, potentially informative matches is small, increasing the probability that the detailed family information may not be discoverable — e.g., due to adoption, or 2) a significant amount of family tree building will be required, which likely will not be able to be completed within a standard GG analysis. (page 21)

In other words, the case wasn’t a particularly promising candidate for genetic genealogy.  The detective wasn’t going to identify the perpetrator based solely on the nine matches specified in the warrant.  At best, he was going to have something to work with that might or might not lead anywhere in a cold case.  The case remains unsolved 5 months after the warrant was signed.

Was that worth overriding the privacy settings of more than a million innocent people?

 

THOUGHTS ON THE WARRANT

There’s Only One Bad Guy Here

I shouldn’t have to say this, but I do.  The only bad guy here is the perpetrator.  He committed atrocious crimes:  sexual battery with a deadly weapon, armed kidnapping, and armed burglary with a battery within (page 17), all of which are punishable by life sentences.

The detective is not a bad guy; he’s doing his job.  The judge is not a bad guy; she’s doing hers.  GEDmatch is not a bad guy; they were compelled by a warrant.  But good people can sometimes do the wrong thing.  And I believe that the warrant, as written, was the wrong thing.

 

Both Reassuring and Alarming

The warrant is reassuring and alarming at the same time.  What’s reassuring is that the detective had exhausted more established methods, like CODIS markers, before attempting genetic genealogy.  And he asked for information that had once been available, i.e., that the suspect’s kit had nine promising matches.  That is specific information that seems like a reasonable request in a warrant.

On the other hand, he asked for a lot more than just those nine matches and a lot more than a normal user of GEDmatch would have.  General users can’t see the real names of all matches nor when they last logged in, yet GEDmatch was compelled to provide that information. He also asked for “mobile numbers”, which GEDmatch does not collect to my knowledge.

Of greatest concern is that GEDmatch was given 20 business days to comply with the warrant, yet they turned over the requested information within 24 hours.  That’s simply not enough time to have a lawyer read, analyze, and respond to a 42-page warrant on a novel type of search involving the genetic privacy of more than a million people.  I will reiterate:  the owners of GEDmatch are good people, but they made a mistake.

There was no urgency.  This was a cold case.  The perpetrator has apparently been inactive for two decades.  There was no reason to believe that he would harm anyone new in the week or two needed for GEDmatch to consult a lawyer.  For that matter, there was no reason to believe that harm would come by waiting for more people to voluntarily opt in to law-enforcement matching.

All this for a case that only had a moderate probability of being solved.

 

Will Another Warrant Happen?

The real question is, how many times has it already happened?  The warrant’s logic is that the search was reasonable because someone on the genetic genealogy team had seen the match list.  Hundreds of forensic kits had been uploaded to GEDmatch by 19 May, 2019.  Detectives investigating any one of them could use the same argument for their own cases.  Maybe they already have.  After all, the only reason we know about the Orlando warrant is that the detective was overheard talking about it.

The Department of Justice interim policy for forensic genetic genealogy attempts to strike a balance between civil liberties and the power of genetic genealogy to make the public safer.  It says that such searches should only be used for violent crime (unless the database itself permits otherwise), only after alternative methods have been considered, and only in databases that notify their users that law enforcement may be searching them.  The latter should theoretically exclude the opted-out portion of GEDmatch’s database.

However, the Department of Justice policy only took effect on 1 November, 2019.  Warrants served on a database prior to that would not be so constrained.  And the guidelines don’t apply to state and local agencies unless they receive federal funding.

So, yes, I expect we’ll learn about more warrants on genealogy databases in due time.

 

Where Do We Go from Here?

The genealogy community continues to sunder as one group advocates for self-control of our data while the other thinks public safety outweighs individual privacy.  Accusations fly.  Discussion of the topic is being censored.  Kit sales are down.  And the two largest players in the industry, 23andMe and Ancestry, felt compelled to issue statements saying that they prioritize customer privacy within days of the GEDmatch warrant becoming public.

There is a way forward, though, and that’s informed consent:  only expose kits to criminal investigations if the tester has granted permission.  The testing companies already practice informed consent for their research programs; there’s no reason not to apply similar protocols to criminal investigations.  For consent to be informed, however, the tester must be apprised of both pros and cons in an objective way, they should not be pressured to make a particular choice, and their decision should be respected absolutely.

We also need to step back and show respect for different choices:  Right now, the civil liberties group is often accused of ‘protecting murderers and rapists’ rather than having their privacy concerns heard.  As an analogy, those who opt out of biomedical research are not accused of being pro-cancer, even though the disease kills nearly 600,000 Americans per year.  Why should opting out of criminal investigations be any different?

25 thoughts on “The GEDmatch Warrant”

  1. Give law enforcement all the tools needed to catch criminals especially in murder cases. If anyone in my family tree is a felon, rapist, murderer, I’d still want them caught by justice.

  2. After this one I deleted my kits from GedMatch. If I cannot opt out with any assurance that GedMatch will attempt to protect the privacy of my data, I’m done.

  3. Another excellent analysis and I read the whole warrant myself (lots to read)!

    There are many excellent comments/suggestions etc in your article but this one hopefully puts the wrong accusations to an end:

    “ As an analogy, those who opt out of biomedical research are not accused of being pro-cancer, even though the disease kills nearly 600,000 Americans per year. Why should opting out of criminal investigations be any different?”

  4. “According to The New York Times, GEDmatch provided the requested data the very next day. This is worrisome. They had ample time to ask a privacy lawyer whether the warrant was overly broad, and to challenge it if so, but they apparently chose not to.” Did you ask if they consulted an attorney or are you just assuming? Did you confirm what was reported in the New York Times? It is your judgment that the founder of GEDmatch made a mistake. Why not make that explicitly clear rather than stating your opinion as fact?

    I am really tired of your rehashing this issue over and over. In fact, GEDmatch had a spike in new uploads after the GSK case broke. I interpret that to mean many people thought it was a good use of their DNA if it got these bad actors off the streets. Your continued hounding and rehashing of this issue borders on harassment.

    Please go back to thinking and writing about probability in genetic genealogy. You are very good at that!

    1. I encourage anyone who disapproves so passionately of what I choose to write about to unsubscribe from my blog.

    2. It does seem extremely obtuse to accuse someone of rehashing something on their blog site. Maybe there might be a better site for you, say, that already agrees with your foregone conclusion of an opinion.

      Hearing opinions that don’t necessarily agree with your own helps formulate a better argument on the subject matter. I’m assuming that you would like to hear both sides of the story? Or, is it only alright to talk about things if the opinion is already of your liking?

      1. Thank you, JJ. Pam Tabor made that comment 3.5 years ago. It’s instructive to note what we’ve learned since then:
        • GEDmatch was used by law enforcement in violation of the Terms of Service
        • GEDmatch sold their entire database of users (who contributed their DNA for free) to Verogen for $15 million
        • The detective behind this GEDmatch warrant lied to family members in another case to obtain their DNA
        • GEDmatch was hacked at least twice, exposing the entire database, including kits that were set to private/research status
        • GEDmatch began charging law enforcement $199 to upload to the database. They’ve since raised the price twice, to $550 and now $700 per upload.
        • GEDmatch changed their Terms of Service to expose all public kits to Doe cases (including criminal ones) regardless of user privacy settings.
        • Some kits that had been deleted from the system mysteriously reappeared.
        • Verogen, which owns GEDmatch, was purchased by a European company called QIAGEN for $150 million.
        • Meanwhile, both GEDmatch and FamilyTreeDNA have grown at a glacial pace compared to the rest of the industry. Since Pam’s comment, AncestryDNA has added more than 7 million people, 23andMe more than 3.6 million, MyHeritage more than 3 million, FamilyTreeDNA roughly 370,000, and GEDmatch about 500,000. I interpret that to mean many people thought it was a bad idea to hand their DNA over to companies that collaborate with law enforcement.
        (details here: https://thednageek.com/timeline-of-investigative-genetic-genealogy/)

  5. Leah provides an excellent and free (!) service to the genetic community.

    If you’re not happy Pam then indeed unsubscribe but please don’t dictate what she has to write about.

    That’s exactly the point about freedom of speech and expression.

    Leah, once again thank you for all the many hours you spend on researching the topics you write about and keeping us informed. Appreciated!

  6. It seems to me that law enforcement should be allowed to access the database with certain restrictions: they have exhausted all other avenues of identification first, they are not entitled to the entire database but are limited to 1st or 2nd degree relatives, and a professional genetic genealogist is employed to ascertain that what they deduce from this is accurate since they are law enforcement, not genealogists. I think 3rd preferably or at least 4th degree relatives is too wide in scope (I have found in my research that my accuracy at that level is diminished significantly). I think this needs to be written into the law and aggressively promoted by the genealogy companies. If they don’t people will pull their kits from the sites and their business will dry up. Personally I want to see the violent criminals caught and loved ones identified, but I don’t want to see my DNA information out there hither and yon where anybody could use it for whatever.

    1. The Department of Justice interim policy on forensic genetic genealogy addresses some of your concerns. The policy only allows such searches in databases that explicitly permit it, although it doesn’t limit the number of matches. What’s most concerning to me is that the policy allows surreptitious sampling of innocent people for genetic genealogy tests, which are highly personal.
      https://www.justice.gov/olp/page/file/1204386/download

    2. Barbara, do you mean cousin level (as in 1st or 2nd cousin)? The term degree is used differently in this context: a first degree relative is a parent/child or sibling (sharing 50% of their DNA); a second degree relative would be grandparent/grandchild or avuncular or half-sibling (sharing 25% of their DNA); a 3rd degree relative would be like a 1st cousin (sharing 12.5% of their DNA); and so on. Some cases have been solved with ancestors going back to the 1800’s if there are enough cousins.

      FTDNA is limiting LE match lists to the first 25 or so. I did post a query on a Facebook page asking if people would be more comfortable if they knew the GEDmatch match list would be limited. It didn’t seem to make a difference.

      For the record, I remain opted in to LE matching.

  7. I urge everyone to watch the movie ‘Minority Report’.

    No, this isn’t actually it but this can go so terribly wrong so suddenly when Law Enforcement can tap these databases and find nebulous connections and begin trying to ‘back into’ the perpetrator’s DNA.

    It has already happened in a case in Idaho where a distant cousin of the actual perpetrator was taken into custody (I don’t think he was, technically, ‘arrested’) and questioned for hours.

    He was, eventually cleared. And, to be fair, the DNA technology was still, relatively speaking, in its infancy. The technology has improved considerably.

    BUT, the risks are enormous. There will come a moment when a person is arrested on the basis of a weak connection and held and interrogated until he (yes, almost always male) ‘confesses’ just to make the interrogation stop. Those confessions are so common that all confessions following lengthy interrogations become suspect.

    1. There’s a genuine risk of innocent people being harassed by LE for information, but CODIS testing should clear anyone who is falsely accused.

  8. As an attorney, I find most of your analysis spot on. The warrant was a fishing expedition and entirely too broad. The detective also either lied or at a minimum misrepresented another aspect in the warrant: that it takes weeks for a gedmatch user to obtain a kit number. I deleted most of my information from gedmatch after this latest fiasco. In my opinion, gedmatch never had any intention of challenging the warrant. They wanted something to cover their backsides. Even worse, we would not have known about this had the detective not been bragging at a police conference.

    1. midnirdr, I wish there was a version of the warrant where I could search for a key word, but I didn’t spot a statement about it taking weeks to obtain a kit number. If you’re referring to this statement on p 31 “assigned a yet to be identified kit number”, I imagine the process went something like this. Parabon performed the DNA analysis of the crime scene evidence, the GEDmatch upload, and the initial assessment about its suitability for further investigation. They demonstrated this to the LE officer before proceeding with the contract, but the TOS changed in the meantime. That’s why the LE officer says he “witnessed” some of this but did not know the kit number.

  9. I meant out to 1st or 2nd cousins, 1st or 2nd aunts & uncles, nieces & nephews.

    Equally important I think is the genealogist as intermediary. It seems that police could present a warrant to the genealogy company, who assigns a genealogist to provide a maximum number of 5 or 10 close relatives if any are in the database, and if there are none close, then to deny any further exposure of the private data. It should be written into the law what is allowed so police, genealogy company and customers know exactly what is permitted. It should also be made explicit what police may do with the information that they are given.
