DNA Basics: DNA Is Not Just for Genealogy

DNA has been the best thing to hit genealogy since the invention of the census, but do you ever wonder what it’s doing when it’s not helping you identify relatives past and present?

Image credit: lisichik/Pixabay


DNA is often called “the molecule of life”, because it’s that important.  In fact, DNA is so important that it figures in three of the five criteria in the most common definition of life.

Living organisms:

    • acquire and use energy;
    • are made up of cells;
    • possess genetic information (DNA);
    • can reproduce (which involves copying DNA); and
    • are the product of evolution (when DNA changes).

To be honest, biologists love to debate the definition of life, so many will not agree with all of these criteria.  In fact, I don’t myself.  However, they comprise the most widely taught definition of life and are a good starting point for understanding how living organisms differ from non-living entities.


What Is DNA?

You can think of DNA as an instruction manual for cells.  It tells them what to do and when to do it.  More accurately, it’s the language in which the instruction manual is written.

DNA itself is a chemical, deoxyribonucleic acid.  It’s a very, very long chemical—a macromolecule in scientific jargon—made up of thousands or millions of individual subunits, called nucleotides or nucleobases.  (Technically, a ‘base’ is just part of a nucleotide, but we can use the terms interchangeably here.)

There are four DNA bases—adenine (A), cytosine (C), guanine (G), and thymine (T)—and they are connected end-to-end like beads on a string.  DNA is actually two such strands twisted around one another, like two-ply yarn.  We call this twisting structure a double helix and the bases across from one another base pairs.

Image credit: genome.gov/genetics-glossary/acgt


As a rule, the base A always pairs with T and C always pairs with G.  This is important, and we’ll come back to it later.  These paired bases are only weakly attached to one another, so the whole thing can “unzip” as needed then close back up.

The DNA in a single human cell would stretch about six feet if laid out in a line.  That length is unmanageable for a microscopic cell.  Instead, our 3.2 billion base pairs are organized into chromosomes, each of which is just a portion of our total DNA.  Humans have 23 pairs of chromosomes, 46 in total.  We inherit one complete set of 23 from each parent.

The chromosomes differ in size.  Most are numbered from 1 to 22, with chromosome 1 being the longest.  These are the so-called autosomes and give us the name of the most common DNA tests for genealogy: autosomal tests.

There are also two allosomes, X and Y, which influence sexual anatomy.  For that reason, these two are often called the sex chromosomes. Most people who have two X chromosomes develop as female and most who have the XY configuration develop as male.  (There are exceptions, which will be addressed in another post.)


DNA Can Be Reproduced

Most of the time, our DNA is loosely packed, so the cell can access the information in its code.  To continue the yarn analogy, it looks like this.

Image credit: Anita Smith/Pixabay


When the cell divides, though, it must first copy all of its DNA so that each daughter cell gets a full set.  Parceling a stringy mess precisely into each daughter cell would be difficult and prone to error.  For efficiency, after the chromosomes are copied, they condense into tight packages the way yarn is wound into skeins.

Image credits: annekarakash/Pixabay, genome.gov/13514624


In the right half of the image, note that each chromosome has a twin; there are two greens, two pinks, two blues, etc.  Those are the maternal and paternal copies of each chromosome.  If you look closely, you may also notice that each of those twins has two parallel parts.  Those are the duplicate copies.  Each daughter cell will get one of each and end up with the same genetic makeup as the parent cell.

A more complicated process occurs when sex cells are made.  In that case, each resulting egg or sperm only has one copy of each chromosome rather than a pair.  That’s a whole ‘nother story!

How is the DNA itself copied?  Recall that the bases always pair with one another in the same way:  A with T, C with G.  To make a copy, a specialized protein in the cell “unzips” the double helix, then another protein comes along and uses the base-pairing rules to create new paired strands for each half of the original.  This is called DNA replication.  It’s quite brilliant!

Image credit: wikipedia.org/wiki/DNA_replication#/media/File:DNA_replication_split.svg
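
For readers who like to tinker, the unzip-and-rebuild idea can be sketched in a few lines of Python.  This is only a toy model: the function names are made up for illustration, and it ignores real-world details such as the fact that the two strands run in opposite directions.

```python
# Base-pairing rules from the text: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Build the partner strand one base at a time."""
    return "".join(PAIRS[base] for base in strand)

def replicate(helix):
    """'Unzip' a double helix and give each half a newly built partner.

    `helix` is a (top, bottom) pair of strings.  Returns two helices,
    each made of one old strand and one new strand.
    """
    top, bottom = helix
    return [(top, complement(top)), (complement(bottom), bottom)]

original = ("ATGCCGTA", complement("ATGCCGTA"))
copies = replicate(original)

print(original)  # ('ATGCCGTA', 'TACGGCAT')
print(copies[0] == original and copies[1] == original)  # True
```

Because each base has only one possible partner, either half of the helix carries enough information to rebuild the whole thing.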


DNA Is Genetic Information

The sequence of bases on a strand of DNA is a code.  That code tells how to make proteins, which are the workhorses of the cell.  Each protein has a specific function, which it does over and over like a robot in a factory.  The bits of DNA that code for proteins are called genes.

Proteins are macromolecules, like DNA.  Unlike DNA, though, they are strings of amino acids rather than nucleic acids.  What’s more, DNA is stored in a compartment of the cell called the nucleus, while proteins are made out in the main part of the cell, the cytoplasm.

Image credit: Arek Socha/Pixabay


To get the genetic information from one place to the other, the cell uses an intermediary called messenger RNA, or mRNA.  (Yes, this is the same mRNA made famous by the covid vaccines.)

RNA is similar to DNA.  For our purposes, the key difference is that RNA uses a base called uracil (U) instead of thymine (T).  To create RNA from the master template DNA, the code is transcribed using base-pairing rules much like those in replication, except that A on the DNA pairs with U in the new RNA.  This is analogous to how a genealogist might transcribe a hand-written document into a typewritten version; the content is exactly the same, but the format is different.

Partial transcription of a probate document.
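
The cell's version of transcription can be sketched the same way.  The snippet below is a simplified illustration, not how anyone would analyze real data; the `transcribe` name and the example sequence are invented, and strand direction is ignored.

```python
# Pairing rules for building RNA from a DNA template strand.
# Same as DNA replication, except A on the DNA pairs with U in the RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}

def transcribe(template):
    """Return the mRNA built against a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template)

template = "TACGGCATT"          # the DNA strand being read
partner = "ATGCCGTAA"           # its partner strand in the double helix
mrna = transcribe(template)

print(mrna)                               # AUGCCGUAA
print(mrna == partner.replace("T", "U"))  # True
```

Notice that the mRNA ends up reading just like the template's partner strand, with U standing in for T: same content, different format.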

After an mRNA molecule is made, it moves out of the nucleus into the cytoplasm, where it is decoded to create proteins.  This decoding process is called translation, because the “language” of nucleic acids (DNA and RNA) is being converted to the “language” of amino acids (proteins).
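
Here is a rough sketch of that decoding step.  It relies on a detail not covered above: the cell reads mRNA three bases at a time, and each three-base "word" (a codon) stands for one amino acid or a stop signal.  The table below is a tiny excerpt of the standard genetic code, and the `translate` function is a made-up name for illustration.

```python
# A small excerpt of the standard genetic code (the full table has 64 codons).
CODONS = {
    "AUG": "Met", "CCG": "Pro", "GGC": "Gly", "UUU": "Phe", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna):
    """Read mRNA three bases at a time and list the amino acids.

    Stops at the first stop codon.  Raises KeyError for any codon
    missing from the excerpted table above.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

print(translate("AUGCCGGGCUAA"))  # ['Met', 'Pro', 'Gly']
```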

The cell can regulate which proteins are being made when and in what quantities by controlling how much mRNA is floating around.  mRNA breaks down naturally within minutes or hours, so if the nucleus stops transcribing it, the mRNA levels in the cytoplasm will decline and the protein will no longer be made.

Circling back to the covid vaccine briefly, the fragile nature of mRNA is why the Pfizer and Moderna versions need to be stored in ultracold freezers.  At room temperature, the mRNA will break down rapidly.  Degraded mRNA isn’t harmful—your cells are full of it all the time—but it won’t work as a vaccine.

Image credit: Hakan German/Pixabay


DNA Can Mutate

The third aspect of DNA that makes it integral to life is that it can change.  Usually, it’s copied faithfully from one generation to the next.  Occasionally, though, errors occur.  An A might get replaced with a C; a single-base difference like that is called a single nucleotide polymorphism, or SNP.  A run of 10 ACT repeats might become 11.  Entire genes can be duplicated or lost.  They can even move from one chromosome to another.
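
Treating DNA as a string of letters makes these kinds of changes easy to picture.  The helper functions below are hypothetical names invented for this sketch, and the sequences are made up.

```python
def substitute(seq, position, new_base):
    """Swap a single base, as in a SNP."""
    return seq[:position] + new_base + seq[position + 1:]

def expand_repeat(seq, unit):
    """Add one extra copy of a repeated unit at the start of its first run."""
    start = seq.find(unit)
    return seq[:start] + unit + seq[start:]

def duplicate(seq, start, end):
    """Copy the stretch seq[start:end] and insert it right after itself."""
    return seq[:end] + seq[start:end] + seq[end:]

print(substitute("GATTACA", 1, "C"))        # GCTTACA

repeats = "GG" + "ACT" * 10 + "CC"
longer = expand_repeat(repeats, "ACT")
print(repeats.count("ACT"), longer.count("ACT"))  # 10 11

print(duplicate("AAGENEAA", 2, 6))          # AAGENEGENEAA
```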

Mutation gets a bad rap, but without it, every living thing on Earth would be genetically identical.  In truth, without mutation, life would have gone extinct billions of years ago when conditions changed.

Yes, some mutations are harmful, but the vast majority of them have no effect at all, and occasionally they are beneficial.  A mutation might make a protein more efficient.  Duplication can allow one copy of a gene to develop a whole new function without losing the advantages of the original.  A key mutation even allows some people to drink milk as adults, simply because we keep making lactase, a protein whose production shut off at weaning in our ancestors.

Photo credit: Eiliv-Sonas Aceron/Unsplash


Beneficial mutations are more likely to get passed on.  As they become more common, we say the population evolves.

Different populations will evolve in different ways, depending on conditions.  If your culture doesn’t raise dairy animals, producing lactase into adulthood would waste energy.  Dark skin is an advantage in sunny climates but not overcast ones.

Populations can evolve by chance, too.  Remember, most mutations have no effect, and they can become more or less frequent by luck of the draw.  Many of the genetic differences used in our ethnicity estimates are of this type.
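
The luck of the draw can be simulated, too.  The sketch below uses a deliberately simple model that is an assumption of this example, not something described above: every generation has the same number of gene copies, and each copy is drawn at random from the previous generation.  The function name and the numbers are invented for illustration.

```python
import random

def drift(freq, pop_size, generations, seed=0):
    """Track the frequency of a no-effect mutation over time.

    freq        -- starting share of gene copies carrying the mutation
    pop_size    -- number of gene copies in every generation (assumed fixed)
    generations -- how many generations to simulate
    seed        -- fixes the random draws so a run can be repeated
    """
    rng = random.Random(seed)
    history = [freq]
    for _ in range(generations):
        # Each copy in the next generation carries the mutation with
        # probability equal to its current frequency.
        carriers = sum(rng.random() < freq for _ in range(pop_size))
        freq = carriers / pop_size
        history.append(freq)
    return history

# Two populations that start out identical but draw different "luck".
for seed in (1, 2):
    run = drift(freq=0.5, pop_size=100, generations=50, seed=seed)
    print(f"population {seed}: ended at {run[-1]:.2f}")
```

Nothing in the model favors the mutation or works against it, yet separate populations generally end up with different frequencies.  Try a smaller `pop_size` and the swings get bigger.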


DNA and Genealogy

Without mutations, we couldn’t do genetic genealogy.  Everyone would have the exact same mitochondrial DNA.  All men would have identical Y chromosomes.  Autosomal DNA couldn’t distinguish a 5th cousin from an identical twin.  And “ethnicity” would be meaningless.

You can thank your DNA for carrying and reproducing genetic information, but not always copying it perfectly.  You can thank DNA for your life.



It might seem odd to add a disclaimer to a post about science, but biology is complicated, and there are exceptions to much of what is written here.  Some cells don’t have a nucleus.  Not all genes make proteins; others make more than one version.  Someone can be born with a Y chromosome and female anatomy or lack a Y and have male anatomy.  Single sentences in this post could form the basis of an entire graduate-level university course.

Consider what’s written here as a basic overview for the layperson.  If something’s not clear, please ask about it in the comments.

9 thoughts on “DNA Basics: DNA Is Not Just for Genealogy”

  1. Can we put a percentage on the size of elements of the genome that are frequently tested in genetic genealogy (Y, X, At, Mt) ?

    1. Are you asking about which “types” are used most frequently? Roughly 30–35 million people have done atDNA tests (which include the X chromosome). According to FamilyTreeDNA, the main tester for yDNA and mtDNA, they have 767,156 yDNA records and 202,965 mtDNA records as of today. Of course, it’s not quite that straightforward, but those numbers are a good ballpark.

      1. Sorry, my english ain’t that great, i’ll try to rephrase.

        How much of the genome does the ydna occupy in percentage ?

        Also same question for the (x, mt and at), can we put a percentage on their part of the genome ?

        1. The Y is about 57 million base pairs. The X is about 156 million. The mitochondrial genome is about 16,500 bp. And the entire genome is about 3.1 billion. That should give you a rough estimate.

        2. Thx for the info, it’s exactly what i wanted to know 🙂

          How many base pairs does the autosomal part contain?
