The Endogamy Files: What Is Endogamy?

Endogamy is a word that gets bandied around a lot in genetic genealogy circles, but what it means and how it affects our work is less clear. This post is the first in a series about what endogamy is, why it matters, how to detect it, and how to work with it.

Endogamy is the practice of mating within a specific group.  All human populations have practiced endogamy to one extent or another.  Some still do.  Endogamy can occur because the group is geographically isolated from other people, like Native Hawaiians were; because they prefer to marry within their religion, ethnicity, language, and/or social caste, as most cultures do; or for other reasons, like consolidating power among royalty.

Key to endogamy is that the group is small enough that, over time, marriages occur between cousins.  Not necessarily first or even second cousins (although that can occur), but between third, fourth, and more distant cousins.  Over and over.  And over.

It’s important to remember that endogamy is not incest, which is sexual relations between close relatives, like a father and daughter or uncle and niece.  Incest is associated with a substantial risk of early death or genetic disorders in the child, while marriages between even first cousins are much safer.

I’ll discuss the health implications of incest and endogamy in a later post.


Endogamy and Pedigree Collapse

Endogamy causes something called pedigree collapse, but not all pedigree collapse rises to the level of endogamy.

Consider the example below.  The home person (at the bottom) is the child of parents who were third cousins to one another.  That is, the parents shared a pair of great-great grandparents.  As a result, their child (the home person in the diagram) has 30 unique great-great-great grandparents instead of the expected 32.  One set of 3-great grandparents shows up twice in the child’s tree.  We say the pedigree is “collapsing” rather than doubling in number with each generation back, as we’d expect.
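
The count of 30 can be checked with a short Python sketch. It is only an illustration: the function name, the path notation (F for father, M for mother, read from the home person upward), and the choice of which ancestral couple is shared are all assumptions made for the example, not part of any genealogy tool.

```python
from itertools import product

def unique_ancestors(generation, merged):
    """Count distinct ancestors at a given generation above the home person.

    Each ancestor is a path such as "MFF" (mother's father's father).
    `merged` maps a duplicate path to the path of the same real person.
    """
    people = set()
    for steps in product("FM", repeat=generation):
        path = "".join(steps)
        for duplicate, canonical in merged.items():
            if path.startswith(duplicate):
                # Same person reached by a second route: use one identity.
                path = canonical + path[len(duplicate):]
                break
        people.add(path)
    return len(people)

# Hypothetical third-cousin parents: one pair of the mother's great-great
# grandparents is the same couple as one pair of the father's.
merged = {"MFFFF": "FFFFF", "MFFFM": "FFFFM"}

print(unique_ancestors(4, merged))  # great-great grandparents of the child
print(unique_ancestors(5, merged))  # great-great-great grandparents
```

Under these assumptions, generation 4 is still complete at 16 people, while generation 5 has 30 rather than 32, and every generation above that inherits the same shortfall.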

But pedigree collapse is not endogamy.  Pedigree collapse is one or a few isolated incidents of cousin marriage, while endogamy occurs repeatedly over many, many generations.

This is endogamy:

Note that this second diagram also incorporates birth year, so the generations are not aligned with one another vertically like they were in the hypothetical example above.  To see your own tree this way, use the Exploring Family Trees tool.

This is my mother’s tree.  She’s Cajun, a culture that was geographically and culturally isolated in southern Louisiana and, before that, in what is now Nova Scotia.  Cajuns have been marrying mainly within their own population since the 1600s.

My mother’s parents were fourth cousins.  I don’t think they knew, because my grandmother’s father was born out of wedlock.  My grandfather’s parents were third cousins; they definitely knew.  There’s no known incest in this tree, but the cousin marriages go on and on, back to the earliest settlers in Port-Royal, Acadia (now Annapolis Royal, Nova Scotia) in the early 1600s, because there simply weren’t a lot of options for marriage partners.

The closest cousin marriages I’ve identified in this tree are between first cousins.  Consider Isaure Marie Guidry (1863–1933), my mother’s great grandmother.  In the diagram above, she is the left-most female (pink) ancestor just below the horizontal line for 1850.  Her parents, Alexis Onésime Guidry and Palmire Dupré, were first cousins through their shared grandparents Louis–David Guidry and Marie Modeste Borda.

To complicate matters even more, Onésime had been widowed before marrying Palmire.  His previous wife, Celestine, was Palmire’s older sister, and Celestine had a daughter named Marie.  So Marie and Isaure were half sisters through their father Onésime, and first cousins through their mothers Celestine and Palmire.  (This combination is often termed three-quarter siblings.)  But they were also second cousins through Onésime and their mothers.  Each unique relationship path is shown in red below.


Technically, Marie and Isaure were second cousins twice over, once through Onésime and Celestine and once through Onésime and Palmire, but you get the picture.  It’s enough to make your head spin!


How Does Endogamy Affect Genetic Matching?

Isaure and Marie died more than 75 years before the advent of genetic genealogy using autosomal DNA, but what would their match to one another look like if we could analyze their genomes today?  We’d expect them to share about 1750 cM as half sisters, another 850 cM or so as first cousins, and roughly 200 cM twice over as double second cousins.  In many parts of their genomes, they’d match on both copies of a chromosome, much like full siblings do.  In fact, they might well be indistinguishable from full sisters using the methods we currently use for genealogy.
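
As a rough illustration, the arithmetic can be sketched in a few lines of Python. The sketch assumes that the expected amounts from separate relationship paths simply add up, and it uses the approximate figures quoted above. The variable names are made up for the example.

```python
# Approximate expected sharing (in cM) for each relationship path between
# Isaure and Marie, using the rough figures quoted in the text.
paths_cm = {
    "half sisters (via Onésime)": 1750,
    "first cousins (via Celestine and Palmire)": 850,
    "second cousins (first path)": 200,
    "second cousins (second path)": 200,
}

# Assumption: contributions from separate paths are roughly additive.
total_cm = sum(paths_cm.values())
print(f"Naive expected total: about {total_cm} cM")
```

The naive total comes to about 3000 cM. A total reported by a testing site would likely be somewhat lower, because most sites count a region matched on both chromosome copies only once.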

While Isaure and Marie are an extreme case, DNA matching is affected to some degree in all endogamous populations.  People who are no closer than fourth cousins might share enough DNA to be predicted as third cousins, because they’re picking up “extra” shared DNA through their other relationships.

For example, my mother shares 184 cM with D.M.  If you were to plug that number into the Shared cM Project (SCP) tool at DNA Painter, you’d see a combined probability of 89.1% that they were either in the second cousin group (38.8% chance) or the second cousin once removed group (50.3%).


In fact, their closest relationship is third cousins, who average only about 50 cM. On the other hand, Mom and D.M. are also third cousins once removed twice over, fourth cousins once removed, and fifth cousins … that we know of.  All those distant relationships add to the shared centimorgan tally.
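
To see how the distant relationships stack up, here is a hedged Python sketch. It makes several assumptions that are not from the match data itself: a total of 6800 cM counted across both chromosome copies (which reproduces the roughly 850 cM first-cousin and 50 cM third-cousin figures used in this post), every path running through a shared ancestral couple rather than a single ancestor, and simple additivity across paths. The function name is invented for the example.

```python
TOTAL_CM = 6800  # assumed genome length counted across both chromosome copies

def expected_cm(degree, removed=0):
    """Theoretical average cM shared by full nth cousins, r times removed."""
    return TOTAL_CM * 0.5 ** (2 * degree + 1 + removed)

# Known paths between Mom and D.M. as (cousin degree, times removed):
# third cousins, third cousins once removed (twice), fourth cousins once
# removed, and fifth cousins.
known_paths = [(3, 0), (3, 1), (3, 1), (4, 1), (5, 0)]

total = sum(expected_cm(degree, removed) for degree, removed in known_paths)
print(f"Third cousins alone: {expected_cm(3):.0f} cM")
print(f"All known paths:     {total:.0f} cM")
```

Under these assumptions, the known paths sum to roughly 116 cM, more than double the third-cousin figure on its own but still short of the observed 184 cM. The remaining gap could reflect relationships not yet identified, ordinary random variation in inheritance, or both.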

Thus, the overall effect of endogamy is to make many of our DNA matches appear to be more closely related than they really are.  This complicates everything, from basic relationship prediction to more advanced and powerful techniques, like the Leeds method and the What Are the Odds? tool.

In subsequent posts, we’ll address how to identify endogamy in a family tree, how to identify it using only DNA match lists, how to gauge how much endogamy is present, best practices for genetic genealogy, and some health implications.


A Fun Thought Experiment

Is the entire human population endogamous?  After all, we only mate (well, mate successfully) with other humans and have been doing so for tens of thousands of years, since the last archaic humans, like Denisovans and Neanderthals, died out.  Technically, we’re all (very distant) cousins, and all of our pedigrees collapse eventually.

What do you think?

57 thoughts on “The Endogamy Files: What Is Endogamy?”

  1. That’s interesting. Here in Canada earlier today, by chance, I was watching a mini history lesson about Acadia.

    I have a lot of East Frisians and have wondered about endogamy in that group. I haven’t heard about it… but I wouldn’t be too surprised if it’s a factor.

    1. The expulsion was barbaric. Thousands died, and families were separated forever.

      It would be interesting to look at endogamy in East Frisians. Unfortunately, not many Germans have tested yet.

  2. I have DNA matches which suggest endogamy. My daughter and I have matches where we share the same number of centimorgans, or very close amounts, with the match. I also have matches which are shared with both my parents.
    Looking forward to learning more

    1. If you and your daughter share just one segment with those matches, it might not be endogamy. It’s possible the same segment simply got passed down to you intact rather than being whittled down by crossing over. In an upcoming post, I’ll show how to gauge the amount of endogamy in your match list.

  3. Thank you for this very interesting post. I look forward to the rest of the series. I have encountered much difficulty over the years attempting to solve some NPEs in the New Mexican branch of my family tree as my efforts continue to be complicated by endogamy.

    I am grateful for the information you’ve posted about the amount of atDNA Isaure and Marie might have been expected to share. If I have read it correctly, you appear to be saying that the amount of DNA they would have been expected to share is roughly the sum of the amounts of shared DNA from each of their different relationships to each other. This is something I have often wondered but have had difficulty finding expressly stated. Thank you.

    1. In most cases, you can assume that the total shared amount of DNA is roughly the sum of the individual relationships (assuming you can identify all the relationships, and also allowing for the fact that DNA inheritance is messy). However with three-quarter siblings, there’s an added complication that I only alluded to in the post.

      Full siblings have spots in their genomes where they don’t match at all (roughly 25%), spots where they match on one of their two chromosome copies (roughly 50%; called “half identical”), and spots where they match on both (25%; called “fully identical”). Most testing sites treat half identical and fully identical segments exactly the same. In other words, a fully identical segment of 100 cM would only count as 100 cM toward the total, even though it’s really two 100-cM segments.

      The presence of fully identical regions (FIRs) is a characteristic of full siblings, but we also see them in three-quarter siblings to a lesser extent. I haven’t thought through how many FIRs Isaure and Marie would have had given the additional 2nd cousin relationships.

  4. Thanks for this, Leah. I am looking forward to the rest of the series. As you know, I have found endogamy a real stumbling block to using DNA to further my genealogy research.

  5. Pedigree collapse and endogamy get brought up pretty often, as you mentioned, when discussing DNA. Thank you for this clear post on how they are alike and how they are different. And the graphs are particularly helpful. I’ve been meaning to chart my own family’s pedigree collapse, so thanks for the reminder!

  6. Thank you for this series. I have French Canadians as well as some Germans in Southwest Virginia who have made my DNA matches difficult.

      1. I am really struggling. I suspected exactly what your article describes, but I have to be able to prove it. One family says I am their brother’s child, while my mother says this is impossible. The DNA is HIGH. If we are off by even 300 cM, it changes from Aunt to 1st cousin. How can I find out who my father is? Both are deceased.

        1. Sounds like you have some great matches and are on the right track. I’ll reach out to you by email.

  7. Thank you for this article. The line that It will make your head spin is 100% true! I have a lot of French Canadian and Colonial American ancestors- nothing surprises me on who’s related anymore!

  8. Leah, many thanks! This puts a finer point on my understanding of the topic. I posted this at the FTDNA Forum. Think I got your permission to do this previously.

  9. Thank you so much for this information! I’ve been struggling with the endogamy and pedigree collapse throughout my Dad’s paternal line for years. I’ve traced the group back to their arrival in New Amsterdam and their migration patterns from that point. I’ve narrowed it down to basically three groups that went separate directions but each group falls into the endogamy relationship pattern for a number of generations. My Dad’s branch managed to break out of the group around 1850 when they moved from Illinois to Minnesota. In addition to the endogamy, I believe there is at least one pedigree collapse in his direct line where two brothers or cousins (we can not be certain of whether they were brothers or cousins at this point!) but they married two sisters and as a result, the descendants of both lines show up as much closer dna connections… that could also be due to the endogamy within the groups!

    1. We see the same pattern over and over with immigrants and emigrants. People tended to migrate in groups of relatives and friends, and they often married within that group until they became established in their new locations.

  10. I have a situation with two families that immigrated to the US from Alsace-Lorraine. Siblings of one family had been marrying siblings of the other family for several generations. Now, in trying to solve an NPE by triangulating DNA matches, all roads lead to both families, though cM numbers are higher for matches to one family than the other. Would this situation be considered endogamy?

    1. The difference between endogamy and pedigree collapse is one of degree: an isolated cousin marriage is pedigree collapse. When it occurs repeatedly over many generations, it’s endogamy. There’s no sharp line between the two. Your case sounds like endogamy to me.

  11. “ENDOGAMY”… What an interesting topic!
    I thoroughly enjoyed reading about this topic and looking forward to learning more about this informative area of genealogy.

  12. I love your work, and follow it to the best I can.
    My parents have at least one shared relationship a few generations back, with the possibility of more. All my family lineages have been in America for 300+ years, and all came from Western Europe. In the earliest of times, there weren’t a lot of options for marriage.
    I have a problem of remembering where the match or matches are because of a brain injury. I don’t have a wall chart made yet. Ancestry doesn’t show a tree like you do. Is there a publicly available program that I could use that shows endogamy?

  13. My paternal grandparents came to America from Lebanon. I recently heard the expression “Everyone in Lebanon is related to each other.” My paternal DNA matches are so confusing, and I know I need to learn how endogamy affects DNA. Plus, the records list different names for my great grandparents.
    I have a lot to learn.

    1. That sounds a lot like the expression “All Cajuns are cousins”! I’d love to have a peek at some data from your matches (no names needed). I will send you an email.

  14. Great article! I have always been confused between pedigree collapse and endogamy – this was so helpful. I am struggling to identify my paternal great grandmother’s family because of endogamy – the geographical isolation kind from a rural area of NC. Not helping is that my great grandmother was born about 1832. I have identified a set of relatives via DNA I’m confident are on her line. A gentleman I believe to be my GG’s nephew or first cousin moved to Florida, and all the generations of his family stayed there. They did not continue to marry into the same lines from here in rural NC – I was so excited to discover them!!! I feel so close to figuring this out but just can’t get over the brick wall – just trying to read and learn all I can. Thx!

    1. I’m so glad the article helped. Endogamy is so hard to work with. I’m hoping we’ll have some new tools come online in the next year or two to help.

  15. Ooops – I just posted this on “A Major Update” in error I meant to post here…

    I used WATO v2 to root out a very probable 3xGGf who has eluded me for years. The person lived in the early 1800’s where endogamy was rampant in this tiny German town of cousins. I’ve built out the trees of dozens of 4th-6th closest cousin matches and using shared cm tool and a succession of WATO trees, came up with a tree with a common ancestor born in 1748 for 9 of the matches. Every node in the tree has a fully documented paper trail. WATO suggested a likely candidate who was already in my tree. (Sounds so simple here – I spent years on this.)

    To test the hypothesis, I added my own family using the newly hypothesized ancestral couple to the tree which now contained 15 matches in total. WATO gave me a single green hypothesis with a value of 1 in my uncle’s spot and ruled out the others. This implies there are no other possibilities, a jaw dropping moment.

    My concern is that excluding my family, and a couple known descendants of my 3xggm, the average cMs was about 30 for the 9 distant cousins tree (possibly a bloated number) and over 200 when my family members were included. My 3xggm was definitely a product of endogamy but her paramour much less so although not exempt. There was no further endogamy in their descendants nor in those of the other mutual matches. I recall reading in my travels that the effects of endogamy dissipate over time and am not sure if I have an endogamy issue or not; the documented endogamy predates my 3xggs.

    I tried to disprove the winning hypothesis by copying the WATO tree, flipping the target person to a different match and entering the corresponding cMs for that person, but it worked every time. (Several cousins shared their ancestry matches with me.) I’m not sure if WATO was capable of failing since I built the tree using mutual data. I’m at a loss to come up with a different way to challenge the conclusion.

    I believe I’ve successfully found the right person with WATO’s guidance, even without taking DNA into account. The location, age, timeline all make sense and before DNA, this would have been all the “proof available.” Even though the bulk of the matches are small, 15 simultaneous matches coexisted peacefully in 1 WATO tree and I was given a green light.

    Clearly, I was successful, but did I actually PROVE anything? WATO specifically says it’s less reliable at cM levels below 40 (which applies to the initial tree of 9 yet still identified a likely candidate). Once the average climbed over 200 did that make the question of endogamy go away or did I stack the deck with too many close relatives and merely cloak it.

    I want to be able to present and defend my conclusion with good data but I’m not sure what I can factually assert based on the results obtained from WATO. I am beyond thrilled that it most likely led me to my mystery ancestor, its stated purpose. The fact that I corroborated 15 separate DNA matches simultaneously and came up with a score of 1, indicating there are no other choices, holds weight and feels like an important data point, but is it proof? Do I need to consider whether this conclusion is tainted with the question of endogamy? I may be asking more of the tool than it was designed to do.
    Gotta Love WATO!!!

    PS – Even though I entered a birth date for my target in the 1940’s, I was getting hypotheses generated for ancestors in the 1800’s. Maybe I was pushing the limits of the tool with such a distant ancestor.

    Here’s the first WATO tree for the 9 distant matches

    Here’s a link to the 2nd cleaned up WATO tree which includes my family as descendants of the newly hypothesized ancestral couple:

    1. When your research question (3GGF) and target (Richard) aren’t the same person, you should only have one tested person representing the RQ person. In your case, you only get one possible hypothesis because you included Richard’s nieces/nephews and a 1C1R, as well as a few more distant cousins.

  16. Really helpful post. I have a question: Would it be accurate to describe a community like Barra in the Outer Hebrides of Scotland as endogamic? The surnames are quite limited. In 1790 the population was only 1,604. The other surrounding islands were much the same; Catholics and Protestants generally didn’t intermarry. I’m curious if it’s okay to make a generalization about that particular community based on its isolation, low population and limited surnames.


    1. I’m not familiar with that population, but it’s probably safe to assume that most isolated populations, especially if they were small, were endogamous.

  17. The tree of one of my paternal-paternal Great Grandmothers is interesting in this way and it bears out in autosomal matching as well. She should have 128 4th Great Grandparents but only has 128. She also has 62 3rd Great Grandparents as well. Most of this cousin marrying occurred in the early to mid 18th century near Botetourt, Virginia. I believe that because of this I get higher match estimates than are actual.

  18. My father is from a crypto-Sephardic Jewish family that engaged in endogamy for hundreds of years post-Inquisition and typically married 1st or 2nd cousins. However, they were known to marry siblings, half siblings, and other incestuous combinations, presumably if cousins were not available. My paternal grandparents were the last arranged marriage. Needless to say, my dad’s family tree is a huge, tangled web of disturbing connections.

  19. Hi yes it looks quite unusual and disturbing. I don’t have permission to share externally as I did it before and got dinged by ancestry as they somehow tracked it. The most interesting finding is that I share 1/3rd more dna with my paternal cousins than my maternal ones. I can share parts of my tree though because that has a share function if want to see it.

  20. Nice explanations and what great software to show the collapse!
    I would like to add that Irish genealogy which is crucial to many across the world is also subject to collapse and endogamy on a frequent basis according to experts. I have so many connections that are unfindable because of this. I have also begun to find the same effect happening in New York’s immigrant Irish population but I have only just started this project.
    However, I have discovered an effect it has with me personally. Often I do not get a match when I have a contact with names, location and dates in really close proximity to my Irish relatives. When I look at the visual results on GEDMatch there is sometimes a noticeable green and yellow area(s) between 20 – 45% of the length from the left hand side of Chromosome 6. I have some spectacular results from this and the green areas are huge but no match – a sign of endogamy I think. From over 50 of these results now I can safely tell people that if the effect is there then they have a high probability of Irish ancestry – probably from Tipperary. So, there is a useful side in this respect as many have not been able to make the transition back into Ireland.
    I have a result on my new fledgling website that has been out for a month and will be replaced by mark II this week with the same and maybe some more.

      1. Sorry, I never looked at numbers because GedMatch never gives a match and so does not give details. What I discovered since is that I did get a match on this zone which seemed long visually but is only 4cM. The estimate I gave is based on holding a ruler to the screen. Real scientific? 🙂
        I can email a bunch of screenshots if they would be of interest. I have put one example on the family history website (below) under the DNA / Genetics tab then Endogamy.

        1. GEDmatch should tell you the cMs if you lower the matching threshold. That said, anything below 7 cM is increasingly likely to be a false positive.

        2. Hello,
          Sorry for the confusion but GedMatch doesn’t record matches here so I don’t get numbers. It just looks like a match with solid bars of yellow and sometimes green but no match. That was what aroused my curiosity in the first place. Why would it look like a match visually but not be recorded as one?
          I regularly run my matches on 3cM when I am trying to know if a tree owner or suspected Irish descendant, is from Irish descent. The effect on Chr6 happens sometimes when I compare my result who are descended from people of south-west Ireland, generally but not in all cases. What I also see on more rare occasions, because as you explained so well in your Rootstech video I can often see masses of small matches with an occasional 6cM or better. Endogamous indications could be a pointer (not proof) that people with the same surname are from Ireland and not the English variety that stems from Norfolk England. This is important to people who think they may be Irish but have no idea where to look. Obviously better matches to people who are definitely 100% Irish is better but many people start with surnames in Irish genealogy even though they have DNA test results because there will be thousands of results with names from all over the world and most people don’t know where to go first when they start looking.

        3. In the One-to-One tool, make sure you tick the box to prevent hard breaks. If that doesn’t get GEDmatch to call the segment, you can try looking at the match in full resolution to guesstimate the start/stop points, then plug that into the cM Estimator tool at DNA Painter to see how big it is.

  21. Hi, I’m loving learning about genetic genealogy here! Can you point me to the follow up posts you mentioned in this particular blog? Thank you!

    1. I’m afraid the pandemic got in the way of my writing, so I haven’t followed up on the series. Thank you for the reminder!

  22. Thank you for this post. My head has been spinning for over a year due to endogamy. I have been on a quest to figure out who my great grandmother’s biological parents were. Because of an almost nonexistent paper trail, I used my father’s DNA thinking, “He is genetically closer, so easy breezy.” I was so wrong. My father’s grandparents were born in the heart of Appalachia. I am positive that my father’s grandmother and grandfather are fairly closely related, with endogamy from Virginia to the hills of Kentucky. Teasing my great grandmother’s family from her husband’s family, to isolate probable bio parents, has been impossible because so many of them are the same relatives. In this situation, would it be beneficial to have my father complete a mitochondrial DNA test? Would the test have enough specificity with the closer generations? I look forward to your future posts on dealing with endogamy. Thanks, again.

    1. Mitochondrial DNA is best used when you have some candidate women in mind. That said, it might be worth testing your father anyway to have for future reference. If you go that route, you’ll want to do the full mitochondrial test at FamilyTreeDNA.

  23. Hello Cousin,

    Furthering my knowledge of the Acadian side of my heritage and i love the Visualizer you presented in this blog post, it is how i stumbled upon this post.
    We appear to be related through your grandmother as 2nd & 3rd Cousins 11 to 12 times removed by way of Louis–David Guidry and Marie Modeste Borda.

    I look forward to your follow up on this series.

    Keep up the good work

  24. Mennonite endogamy is crazy to wade through. My tree has no identifiable relatives marrying as far back as the late 1700s. (DNA matches and paper trails confirm.) However there is extreme endogamy because the population was descended from a small group of families.

  25. Just stumbled on your site. It’s great! Do you know of any sources for Canary Islanders beyond the Baton Rouge sacramental and Father Hebert sacramental books? I am at a dead end on several Canary Islander lines.
