📅 March 2026 ⏱ ≈ 10 min read 🟡 Intermediate

How DNA Works

Your body contains roughly 37 trillion cells, and coiled inside nearly every one that has a nucleus is about 2 metres of DNA, a molecule that carries the complete instructions for building and running a human being. Written in an alphabet of just four chemical letters, it is one of the most information-dense storage media known.

Structure of DNA

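DNA (deoxyribonucleic acid) is a polymer: a long chain of repeating units called nucleotides. Each nucleotide consists of three parts: a phosphate group, a deoxyribose sugar, and one of four nitrogenous bases (adenine, thymine, guanine, or cytosine).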

Nucleotides link together via phosphodiester bonds between the phosphate of one and the sugar of the next, forming the "backbone" of one strand. Two antiparallel strands coil around each other held together by hydrogen bonds between the bases — this is the famous double helix, first described by Watson and Crick in 1953 based on Rosalind Franklin's X-ray crystallography data.

Double helix dimensions: One full helical turn spans 10 base pairs (bp) and about 3.4 nm in length. The helix diameter is ~2 nm. A human genome contains ~3.2 billion base pairs per haploid set — uncoiled, one set would stretch ~1 m.
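These dimensions can be cross-checked with a little arithmetic. The sketch below assumes an idealised helix with exactly 10 bp per 3.4 nm turn, which gives a rise of 0.34 nm per base pair. The constant and function names are illustrative only.

```python
# Idealised helix geometry, using the figures quoted in the text.
BP_PER_TURN = 10            # base pairs per helical turn
TURN_LENGTH_NM = 3.4        # length of one turn, in nanometres
HAPLOID_GENOME_BP = 3.2e9   # base pairs in one haploid human genome


def dna_length_m(base_pairs: float) -> float:
    """Return the stretched-out length, in metres, of a helix with this many base pairs."""
    rise_per_bp_nm = TURN_LENGTH_NM / BP_PER_TURN   # 0.34 nm per base pair
    return base_pairs * rise_per_bp_nm * 1e-9       # convert nm to m


haploid_m = dna_length_m(HAPLOID_GENOME_BP)
print(f"One haploid set: {haploid_m:.2f} m")      # about 1.09 m
print(f"Diploid cell:    {2 * haploid_m:.2f} m")  # about 2.18 m
```

The diploid figure agrees with the roughly 2 metres of DNA per cell mentioned in the introduction.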

The Base-Pairing Rules

The two DNA strands are complementary: the sequence of one strand determines the sequence of the other, via strict base-pairing rules first deduced from Erwin Chargaff's measurements in 1950:

A — T (Adenine–Thymine)
Paired by 2 hydrogen bonds. Adenine (purine, double-ring) pairs only with Thymine (pyrimidine, single-ring).
G — C (Guanine–Cytosine)
Paired by 3 hydrogen bonds. Guanine (purine) pairs only with Cytosine (pyrimidine). This extra bond makes G-C pairs stronger.
5'–ATGCAGTCG–3'  (Template strand)
   |||||||||
3'–TACGTCAGC–5'  (Complementary strand)
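To make the bond counts concrete, here is a small sketch that tallies the hydrogen bonds holding a duplex together, given the sequence of one strand. The function name is made up for illustration, and the count ignores base stacking and the other forces that also stabilise real DNA.

```python
# Hydrogen bonds per base pair: A-T pairs have 2, G-C pairs have 3.
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def count_hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds between a strand and its complement."""
    return sum(H_BONDS[base] for base in strand.upper())


print(count_hydrogen_bonds("ATGCAGTCG"))  # the 9-bp example duplex: 23
```

A G-C-rich stretch of the same length scores higher, which is one simple way to see why such regions are harder to pull apart.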

Because the base-pairing rules are so rigid, knowing the sequence of one strand tells you the sequence of the other exactly. This property is what makes DNA replication and information transfer possible.
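Since the rules are a fixed one-to-one mapping, deriving the partner strand is a simple lookup. A minimal sketch, with illustrative function names:

```python
# Pairing rules as a translation table: A<->T, G<->C.
COMPLEMENT = str.maketrans("ATGC", "TACG")


def complement(strand: str) -> str:
    """Return the partner bases in the same left-to-right order.

    If the input is written 5'->3', the result reads 3'->5'.
    """
    return strand.upper().translate(COMPLEMENT)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written 5'->3', the usual convention."""
    return complement(strand)[::-1]


template = "ATGCAGTCG"
print(complement(template))          # TACGTCAGC
print(reverse_complement(template))  # CGACTGCAT
```

The two functions differ only in reading direction, which matters because the strands are antiparallel.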

DNA Replication

Before a cell divides, it must copy all of its DNA so each daughter cell gets a complete genome. DNA replication is semi-conservative: each new double helix consists of one original strand and one newly synthesised strand.
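One consequence of semi-conservative copying is that the two original strands are never destroyed: they simply end up in different duplexes. The toy model below tracks strands only as labels ("original" or "new") and is a bookkeeping sketch, not a biochemical simulation.

```python
def replicate(duplexes):
    """Perform one round of semi-conservative replication.

    Each duplex is a pair of strand labels. Both strands act as templates,
    and each is paired with a freshly made strand.
    """
    next_generation = []
    for strand_a, strand_b in duplexes:
        next_generation.append((strand_a, "new"))
        next_generation.append((strand_b, "new"))
    return next_generation


population = [("original", "original")]
for generation in range(1, 4):
    population = replicate(population)
    with_original = sum("original" in duplex for duplex in population)
    print(f"generation {generation}: {len(population)} duplexes, "
          f"{with_original} contain an original strand")
```

The number of duplexes doubles each round, while the number containing an original strand stays at two.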

1. Unwinding (helicase): The enzyme helicase breaks the hydrogen bonds between base pairs, unzipping the double helix at a "replication fork". Energy comes from ATP hydrolysis.
2. Priming (primase): Primase synthesises a short RNA primer (~10 nucleotides) that provides the 3'-OH end DNA polymerase needs to start building.
3. Synthesis (DNA polymerase III): In bacteria, DNA Pol III reads the template strand 3'→5' and adds complementary nucleotides 5'→3' at about 1,000 bases per second. The leading strand is synthesised continuously; the lagging strand is made in short Okazaki fragments.
4. Sealing (DNA ligase): DNA polymerase I removes the RNA primers and replaces them with DNA, and ligase then joins the Okazaki fragments into a continuous strand. Error rate: ~1 mistake per 10⁹ to 10¹⁰ base pairs, thanks to proofreading and mismatch repair.
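To put an error rate of one mistake per 10⁹ to 10¹⁰ base pairs in perspective, the sketch below multiplies it out for a haploid human genome of 3.2 billion bp (the figure used elsewhere in this article). It assumes errors are independent and equally likely at every position, which is a simplification.

```python
HAPLOID_GENOME_BP = 3.2e9  # base pairs copied per haploid genome


def expected_errors(genome_bp: float, bp_per_error: float) -> float:
    """Expected number of replication mistakes when copying genome_bp base pairs."""
    return genome_bp / bp_per_error


for bp_per_error in (1e9, 1e10):
    errors = expected_errors(HAPLOID_GENOME_BP, bp_per_error)
    print(f"1 error per {bp_per_error:.0e} bp -> about {errors:.1f} errors per genome copy")
```

In other words, copying billions of letters typically introduces only a handful of mistakes, or none at all.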
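Speed of replication: Bacterial DNA polymerase adds ~1,000 nucleotides/second, and E. coli copies its 4.6 million bp genome in about 40 minutes from a single origin. Human replication forks are slower, at roughly 50 nucleotides/second, so the human genome (3.2 billion bp) relies on ~30,000 replication origins. These fire in a staggered programme rather than all at once, and copying the genome still takes about 8 hours.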
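The E. coli figure can be sanity-checked with a back-of-the-envelope calculation. The sketch assumes that the single origin launches two replication forks travelling in opposite directions around the circular chromosome, each at a constant ~1,000 nucleotides/second. The function and parameter names are illustrative.

```python
def replication_minutes(genome_bp: float, forks: int, nt_per_second: float) -> float:
    """Minutes needed if `forks` replication forks each copy DNA at a constant rate."""
    seconds = genome_bp / (forks * nt_per_second)
    return seconds / 60


# E. coli: one origin, assumed to produce two forks moving in opposite directions.
ecoli = replication_minutes(genome_bp=4.6e6, forks=2, nt_per_second=1000)
print(f"E. coli estimate: {ecoli:.0f} min")  # about 38 min
```

The estimate lands close to the quoted 40 minutes, which suggests that fork speed alone accounts for most of the bacterial replication time.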

Transcription — DNA to mRNA

Cells don't use DNA directly to make proteins. Instead, the relevant section of DNA is first copied into a single-stranded messenger RNA (mRNA) molecule — a process called transcription, performed by RNA polymerase.

RNA differs from DNA in two ways: it uses ribose (not deoxyribose) as its sugar, and instead of Thymine (T) it has Uracil (U), which pairs with Adenine.

DNA template:    3'–TACGCATGG–5'
                      ↓ RNA Polymerase
mRNA transcript: 5'–AUGCGUACC–3'

Rule: T → A, A → U, G → C, C → G
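The pairing rule above maps directly onto a translation table. A minimal sketch, assuming the template is supplied in the 3'→5' direction as in the example (the function name is illustrative):

```python
# Template DNA base -> RNA base: T->A, A->U, G->C, C->G.
TEMPLATE_TO_RNA = str.maketrans("TAGC", "AUCG")


def transcribe(template_3_to_5: str) -> str:
    """Build the mRNA (read 5'->3') from a template strand written 3'->5'."""
    return template_3_to_5.upper().translate(TEMPLATE_TO_RNA)


print(transcribe("TACGCATGG"))  # AUGCGUACC
```

Real transcription also involves promoters, start sites, and termination signals, none of which are modelled here.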

In eukaryotes (animals, plants, fungi), the pre-mRNA is processed in the nucleus: introns (non-coding segments) are spliced out, a 5'-cap and poly-A tail are added, and the mature mRNA exits to the cytoplasm.

Translation — mRNA to Protein

Ribosomes read the mRNA strand in groups of three nucleotides called codons. Each codon specifies one amino acid, or a start or stop signal. This mapping is the genetic code.
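Reading three letters at a time from a four-letter alphabet gives 4³ = 64 possible codons. The sketch below enumerates them and sets aside the three stop codons (UAA, UAG, UGA), leaving 61 codons to cover the 20 standard amino acids. That surplus is why several codons can specify the same amino acid.

```python
from itertools import product

BASES = "UCAG"

# Every ordered triple of the four RNA bases.
codons = ["".join(triple) for triple in product(BASES, repeat=3)]
print(len(codons))  # 64

STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]
print(len(sense_codons))  # 61
```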

Codon (mRNA)             Amino acid      Note
AUG                      Methionine      Start codon; translation begins here
UUU / UUC                Phenylalanine
GAA / GAG                Glutamic acid
GGU / GGC / GGA / GGG    Glycine         Four synonymous codons
CCU / CCC / CCA / CCG    Proline
UAA / UAG / UGA          (stop)          Terminates translation

Transfer RNA (tRNA) molecules carry the correct amino acid and have an anticodon that base-pairs with the mRNA codon. The ribosome catalyses the formation of peptide bonds between successive amino acids, building the polypeptide chain which then folds into a functional protein.
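Because codon and anticodon pair in antiparallel fashion, the anticodon is the reverse complement of the codon under RNA pairing rules. The sketch below assumes strict Watson-Crick pairing and ignores the "wobble" flexibility and modified bases found in real tRNAs. The function name is illustrative.

```python
# RNA pairing rules: A<->U, G<->C.
RNA_COMPLEMENT = str.maketrans("AUGC", "UACG")


def anticodon(codon: str) -> str:
    """Return the anticodon (written 5'->3') that pairs with an mRNA codon (5'->3')."""
    return codon.upper().translate(RNA_COMPLEMENT)[::-1]


print(anticodon("AUG"))  # CAU
```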

1. Initiation: The ribosome assembles on the mRNA at the start codon (AUG). The first tRNA, carrying methionine, docks in the P-site.
2. Elongation: The next tRNA enters the A-site, a peptide bond forms between the amino acids, the ribosome translocates by one codon, and the uncharged tRNA exits. The cycle repeats at up to ~20 amino acids per second.
3. Termination: A stop codon is reached. A release factor triggers hydrolysis of the last tRNA–polypeptide bond, and the ribosome disassembles. The protein chain then folds spontaneously (often with chaperone help).
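The three stages can be mirrored in a short loop. This is a simplified sketch: it uses only the codons listed in the table earlier in this section, treats the first AUG it finds as the start site (real ribosomes use additional sequence context), and the example mRNA is hypothetical.

```python
# Partial codon table: only the codons listed in the article's table.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe", "UUC": "Phe",
    "GAA": "Glu", "GAG": "Glu",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "CCU": "Pro", "CCC": "Pro", "CCA": "Pro", "CCG": "Pro",
}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def translate(mrna: str) -> list[str]:
    """Translate from the first AUG to the first in-frame stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")                    # initiation: locate the start codon
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):    # elongation: one codon per step
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:                # termination: release the chain
            break
        peptide.append(CODON_TABLE.get(codon, "???"))  # "???" = not in partial table
    return peptide


# Hypothetical mRNA: a two-base leader, then AUG UUU GAA GGC CCG UAA, then trailing bases.
print("-".join(translate("GCAUGUUUGAAGGCCCGUAAGG")))  # Met-Phe-Glu-Gly-Pro
```

Extending `CODON_TABLE` to all 61 sense codons would turn this into a complete, if idealised, translator.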

Genes and the Genome

A gene is a sequence of DNA that encodes a functional molecule — usually a protein, sometimes a functional RNA. The complete set of DNA in an organism is its genome.

Key numbers for the human genome:

"Junk DNA" isn't junk: Non-coding DNA was long dismissed as "junk", a label that dates from the 1970s and predates the Human Genome Project. The ENCODE project (2012) reported that about 80% of the genome shows some biochemical activity, including regulatory elements, chromatin structure anchors, and long non-coding RNAs that influence gene expression. How much of that activity is biologically functional is still debated.

Mutations

A mutation is a permanent change in the DNA sequence. Mutations are the raw material of evolution — without them, all life would be genetically identical. Most are neutral; some are harmful; rare ones are beneficial.

Types of mutation

Original:  ...AAU GAG CCG UGA...
              Asn Glu Pro STOP

Substitution (G→A at position 4):

Mutant:    ...AAU AAG CCG UGA...
              Asn Lys Pro STOP   ← one amino acid changed
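The example above can be reproduced in a few lines. This sketch compares the original and mutant fragments base by base and translates both, using a lookup limited to the codons that appear in the example. The function names are illustrative.

```python
# Only the codons that appear in the example.
CODON_TABLE = {"AAU": "Asn", "GAG": "Glu", "AAG": "Lys", "CCG": "Pro", "UGA": "STOP"}


def codons(seq: str) -> list[str]:
    """Split a sequence into consecutive three-letter codons."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (1-based position, old base, new base) where two equal-length sequences differ."""
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


original = "AAUGAGCCGUGA"
mutant = "AAUAAGCCGUGA"

print(differences(original, mutant))
print([CODON_TABLE[c] for c in codons(original)])
print([CODON_TABLE[c] for c in codons(mutant)])
```

A single-base change that swaps one amino acid for another, as here, is called a missense mutation.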

CRISPR — Editing the Code

CRISPR-Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats) is a molecular tool repurposed from bacterial immune systems that allows scientists to cut DNA at a precise location and edit the sequence. Jennifer Doudna and Emmanuelle Charpentier were awarded the 2020 Nobel Prize in Chemistry for its development as a gene-editing tool.

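A guide RNA (gRNA) is designed to match the target DNA sequence. Guided by the gRNA, the Cas9 protein finds the matching sequence in the genome and cuts both strands of the double helix. The cell then repairs the break using one of two pathways: non-homologous end joining (NHEJ), which rejoins the cut ends and often introduces small insertions or deletions that disable the gene, or homology-directed repair (HDR), which copies a supplied DNA template to make a precise change.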
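Target recognition can be sketched as a string search. The example adds one detail not covered above: the widely used Cas9 from Streptococcus pyogenes only cuts where the matched sequence is immediately followed by an "NGG" motif (the PAM), and it cuts about 3 bp upstream of that motif. The sequences and the function name are invented for illustration, and the search checks one strand only and requires an exact match, whereas real guide design must also consider the opposite strand and near-matches elsewhere in the genome.

```python
import re


def find_cut_sites(genome: str, protospacer: str) -> list[int]:
    """Return 0-based cut positions for exact target matches followed by an NGG PAM.

    `protospacer` is the target written as DNA. Each returned position is the index
    of the first base to the right of the break, 3 bp upstream of the PAM.
    """
    # Lookahead so that overlapping candidate sites are all reported.
    pattern = re.compile(f"(?={re.escape(protospacer)}[ACGT]GG)")
    return [m.start() + len(protospacer) - 3 for m in pattern.finditer(genome)]


# Hypothetical sequences, for illustration only.
target = "GATTACAGATTACAGATTAC"                  # 20-nt target
genome = "CCTTA" + target + "TGG" + "AACCGG"     # target followed by the PAM "TGG"

for cut in find_cut_sites(genome, target):
    print(genome[:cut] + " | " + genome[cut:])
```

If the same target appears without a trailing NGG, the function returns an empty list, reflecting the PAM requirement.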

Medical applications (2025–26): The first CRISPR therapy, Casgevy (exagamglogene autotemcel), was approved for sickle cell disease and β-thalassaemia in the UK in November 2023. US approval followed in December 2023 for sickle cell disease and in January 2024 for β-thalassaemia. Dozens more trials are underway for hereditary blindness, cancers, and HIV.

Try It Yourself

The cellular automata simulation shows how complex self-replicating patterns can emerge from simple binary rules — a beautiful analogy for how genetic information unfolds:

🎮 Game of Life — Self-Replicating Patterns →

Reaction-diffusion systems model the kind of chemical signalling that controls gene expression in developing embryos (Turing patterns):

🧪 Reaction-Diffusion Simulation →