In elementary algebra you learn to solve equations—find the numbers that make them true. Galois theory asks a stranger, deeper question: can you swap the solutions around without anyone noticing?
That question, which sounds almost like a prank, turns out to be one of the most powerful ideas in mathematics. It explains why we have a quadratic formula but no general quintic formula, connects polynomial roots to the geometry of the icosahedron, and even suggests a useful analogy for why deep neural networks work when they do. This post walks through the main ideas, starting from the simplest possible example.
01 The "Hello World" Example: $\mathbb{Q}(i)$
Consider the equation $x^2 + 1 = 0$. Over the rationals $\mathbb{Q}$, this has no solutions. So we expand the universe: adjoin a root, call it $i$, and build the extension field
$$\mathbb{Q}(i) = \{\,a + bi : a, b \in \mathbb{Q}\,\}.$$
This is our stage. The rationals $\mathbb{Q}$ sit inside it as the base field—the part of the world we already understand. The new element $i$ is defined entirely by the constraint $i^2 = -1$.
Now here's the key observation. There are two numbers in $\mathbb{Q}(i)$ that square to $-1$: the element $i$, and the element $-i$. From the viewpoint of $\mathbb{Q}$ alone, these two are algebraically identical—no polynomial test with rational coefficients can tell them apart. They are algebraic twins.
What Is an Automorphism?
An automorphism of $\mathbb{Q}(i)$ over $\mathbb{Q}$ is a bijection $\sigma\colon \mathbb{Q}(i) \to \mathbb{Q}(i)$ that:
1. Preserves structure: $\;\sigma(\alpha + \beta) = \sigma(\alpha) + \sigma(\beta)$ and $\sigma(\alpha \cdot \beta) = \sigma(\alpha) \cdot \sigma(\beta)$.
2. Fixes the base field: $\;\sigma(q) = q$ for all $q \in \mathbb{Q}$.
Since every element in $\mathbb{Q}(i)$ looks like $a + bi$ and $\sigma$ must fix $a$ and $b$ (they're rational), the entire map is pinned down by a single choice: where does $i$ go?
Applying $\sigma$ to the defining relation $i^2 = -1$ gives $\sigma(i)^2 = -1$, so $\sigma(i)$ must itself be a square root of $-1$. The only candidates are $i$ and $-i$. That gives us exactly two automorphisms:
Identity: $\sigma(i) = i$. Everything stays put.
Conjugation: $\sigma(i) = -i$. This sends $a + bi \mapsto a - bi$—a reflection across the real axis.
The set of these two maps forms the Galois group $\operatorname{Gal}(\mathbb{Q}(i)/\mathbb{Q}) \cong C_2$.
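The argument above can be checked mechanically. The following is a minimal sketch, assuming elements $a + bi$ are stored as pairs of `Fraction`s; the helper names `extend` and `preserves_structure` are hypothetical, introduced only for illustration. It extends a candidate choice of $\sigma(i)$ to all of $\mathbb{Q}(i)$ and tests the structure-preservation rules on a finite sample, which is evidence rather than a proof.

```python
from fractions import Fraction

# An element a + b*i of Q(i) is stored as a pair (a, b) of Fractions.

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, using i^2 = -1
    return (u[0] * v[0] - u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def extend(image_of_i):
    """Map a + b*i to a + b*image_of_i; rationals are fixed by construction."""
    def sigma(u):
        return add((u[0], Fraction(0)), mul((u[1], Fraction(0)), image_of_i))
    return sigma

def preserves_structure(sigma, samples):
    """Check sigma(u + v) and sigma(u * v) against the rules on all sample pairs."""
    return all(
        sigma(add(u, v)) == add(sigma(u), sigma(v))
        and sigma(mul(u, v)) == mul(sigma(u), sigma(v))
        for u in samples for v in samples
    )

# A finite grid of test elements (it includes i itself, as the pair (0, 1)).
samples = [(Fraction(a), Fraction(b, 2)) for a in range(-2, 3) for b in range(-2, 3)]

# Candidate images for i: two square roots of -1 and two impostors.
candidates = {
    "i": (Fraction(0), Fraction(1)),
    "-i": (Fraction(0), Fraction(-1)),
    "2i": (Fraction(0), Fraction(2)),
    "1": (Fraction(1), Fraction(0)),
}

valid = [name for name, target in candidates.items()
         if preserves_structure(extend(target), samples)]
print(valid)  # ['i', '-i']
```

Only the two square roots of $-1$ survive; the impostors already fail on the single product $i \cdot i$.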
Why the Swap Works
It's worth checking that conjugation doesn't break anything. The defining relation says $i \cdot i = -1$. Replace $i$ with $-i$:
$$(-i)\cdot(-i) = (-1)^2\, i^2 = i^2 = -1.$$
The swap is legal precisely because the algebraic twin obeys the same rules. This isn't a coincidence—it's the entire point.
02 The Cascade Effect
At first glance, defining a map on an infinite field sounds daunting. But there's a beautiful shortcut.
If $K = \mathbb{Q}(\alpha_1, \ldots, \alpha_n)$, then any automorphism $\sigma$ fixing $\mathbb{Q}$ is completely determined by the values $\sigma(\alpha_1), \ldots, \sigma(\alpha_n)$.
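The idea is simple: every element of $K$ is a polynomial expression in the generators with rational coefficients. Structure preservation and base-fixing together force $\sigma$ to propagate through the polynomial like falling dominoes. For a single generator $\alpha$ (several generators work the same way, term by term):
$$\sigma\Big(\sum_{k} q_k\, \alpha^k\Big) = \sum_{k} \sigma(q_k)\, \sigma(\alpha)^k = \sum_{k} q_k\, \sigma(\alpha)^k, \qquad q_k \in \mathbb{Q}.$$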
Once you decide where the generators go, the fate of the entire infinite field is sealed.
Indistinguishability
This cascade has a striking consequence. If $P(x)$ is any polynomial with rational coefficients and $P(i) = 0$, then applying $\sigma$ to both sides immediately gives $P(-i) = 0$.
No algebraic test using rational coefficients can distinguish $i$ from $-i$.
A quick sanity check: the polynomial $P(x) = x^3 + i$ has $i$ as a root but not $-i$. Contradiction? No—the coefficient $i$ is not in $\mathbb{Q}$. The theorem only applies to base-field polynomials. Applying conjugation to $x^3 + i = 0$ transforms the equation itself into $x^3 - i = 0$.
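Both the cascade and the sanity check can be verified numerically. This is a small sketch using Python's built-in `complex` type, which is exact here only because every intermediate value is a small integer. The example polynomial $x^4 + 3x^2 + 2$ and the helper names are illustrative choices, not taken from the paper.

```python
def evaluate(coeffs, z):
    """Horner evaluation; coefficients run from the highest degree down."""
    result = 0
    for c in coeffs:
        result = result * z + c
    return result

def conjugate_coeffs(coeffs):
    """Apply conjugation to every coefficient of a polynomial."""
    return [complex(c).conjugate() for c in coeffs]

i = complex(0, 1)

# Rational (here integer) coefficients: x^4 + 3x^2 + 2 = (x^2 + 1)(x^2 + 2).
rational_poly = [1, 0, 3, 0, 2]

# Cascade: conjugating the value equals evaluating at the conjugate.
z = complex(2, -3)
cascade_holds = (evaluate(rational_poly, z).conjugate()
                 == evaluate(rational_poly, z.conjugate()))

# Indistinguishability: i is a root, so -i must be one too.
both_roots = evaluate(rational_poly, i) == 0 and evaluate(rational_poly, -i) == 0

# A coefficient outside Q: x^3 + i.
gaussian_poly = [1, 0, 0, i]
value_at_i = evaluate(gaussian_poly, i)          # 0, so i is a root
value_at_minus_i = evaluate(gaussian_poly, -i)   # 2i, so -i is not a root

# Conjugating the coefficients gives x^3 - i, which does have -i as a root.
value_conjugated = evaluate(conjugate_coeffs(gaussian_poly), -i)

print(cascade_holds, both_roots, value_at_i, value_at_minus_i, value_conjugated)
```

The last line is the point of the sanity check: conjugation carries a root of $x^3 + i$ to a root of the conjugated polynomial $x^3 - i$, not to another root of the original.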
03 Why the Reals Are Rigid
Can we play the same game with $\mathbb{R}$?
Attempt 1—Flip signs: Try $\sigma(x) = -x$. This preserves addition, but $\sigma(1 \cdot 1) = -1$ while $\sigma(1)\cdot\sigma(1) = 1$. Multiplication breaks.
Attempt 2—Scale: Try $\sigma(x) = \alpha x$. Then $\sigma(xy) = \alpha xy$ but $\sigma(x)\sigma(y) = \alpha^2 xy$, forcing $\alpha = 1$ or $\alpha = 0$.
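More deeply: every positive real is a square in $\mathbb{R}$, while no negative real is. Any automorphism sends squares to squares, so it sends positive numbers to positive numbers and therefore preserves the ordering. It also fixes every rational, and since each real is squeezed between rationals arbitrarily close to it, an order-preserving map that fixes $\mathbb{Q}$ must fix every real. The upshot: $\mathbb{R}$ has no non-trivial automorphisms; it is rigid. Galois theory is the study of fields that are "floppy" enough to admit non-trivial swaps.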
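The two failed attempts are easy to replay in code. This sketch uses a handful of rationals as stand-ins for real numbers (an assumption made so the arithmetic is exact), and the helper `is_multiplicative` is a hypothetical name.

```python
from fractions import Fraction

def is_multiplicative(sigma, samples):
    """Check sigma(x * y) == sigma(x) * sigma(y) on every pair of samples."""
    return all(sigma(x * y) == sigma(x) * sigma(y) for x in samples for y in samples)

# Rational sample points standing in for real numbers.
samples = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]

def flip(x):
    return -x

def scale(alpha):
    return lambda x: alpha * x

flip_ok = is_multiplicative(flip, samples)
scale_ok = {alpha: is_multiplicative(scale(Fraction(alpha)), samples)
            for alpha in (0, 1, 2, -1)}

print(flip_ok)   # False
print(scale_ok)  # {0: True, 1: True, 2: False, -1: False}
```

Scaling by $0$ passes the multiplication test but is not a bijection, so the identity ($\alpha = 1$) is the only scaling that qualifies as an automorphism.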
04 The Galois Group Is Not Just Permutations
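A polynomial of degree $n$ has $n$ roots (counted with multiplicity, over a large enough field), and an automorphism permutes them, so when the roots are distinct the Galois group always lives inside the symmetric group $S_n$. But it is often much smaller, because roots can be locked together by algebraic relations.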
The Cyclotomic Example
Take the 5th cyclotomic polynomial $\Phi_5(x) = x^4 + x^3 + x^2 + x + 1$, whose roots are $\zeta, \zeta^2, \zeta^3, \zeta^4$ with $\zeta = e^{2\pi i/5}$. Four roots, so naively up to $4! = 24$ symmetries. But the Galois group has only 4 elements.
Why? The roots are powers of each other. If $\tau$ sends $\zeta \mapsto \zeta^2$, then it's forced to send $\zeta^2 = \zeta \cdot \zeta \mapsto \zeta^2 \cdot \zeta^2 = \zeta^4$. Now try swapping $\zeta$ and $\zeta^2$ while leaving $\zeta^3$ alone: the swap demands $\tau(\zeta^2) = \zeta$, but the multiplication rule forces $\tau(\zeta^2) = \tau(\zeta)^2 = \zeta^4$. Since $\zeta \neq \zeta^4$, this is a contradiction.
The roots move as a rigid constellation, not a loose bag of marbles. Hidden algebraic relations drastically cut down the number of valid rearrangements. The Galois group equals the full symmetric group $S_n$ only when no algebraic relations exist among the roots beyond those already forced by the polynomial's coefficients.
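The count of 4 can be recovered by brute force. The sketch below represents each root $\zeta^k$ by its exponent $k$ modulo 5 and filters all 24 permutations by the multiplication rule. It tests only a necessary condition; that every survivor really is an automorphism is a standard fact about cyclotomic fields that the code takes on trust.

```python
from itertools import permutations

N = 5
exponents = [1, 2, 3, 4]  # the root zeta^k is represented by its exponent k mod 5

def respects_products(tau):
    """Check tau(z^a * z^b) == tau(z^a) * tau(z^b) whenever z^(a+b) is again a root."""
    for a in exponents:
        for b in exponents:
            s = (a + b) % N
            if s != 0 and tau[s] != (tau[a] + tau[b]) % N:
                return False
    return True

survivors = []
for images in permutations(exponents):
    tau = dict(zip(exponents, images))
    if respects_products(tau):
        survivors.append(tau)

print(len(survivors))                  # 4
print([tau[1] for tau in survivors])   # [1, 2, 3, 4]: each map is fixed by where zeta goes

# The forbidden swap from the text: zeta <-> zeta^2, other roots left alone.
swap = {1: 2, 2: 1, 3: 3, 4: 4}
print(respects_products(swap))         # False
```

Each surviving map has the form $\zeta^k \mapsto \zeta^{mk}$ for a fixed $m \in \{1, 2, 3, 4\}$, so the whole permutation is determined by the image of $\zeta$.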
05 Structure, Constraint, and Symmetry
A philosophical principle runs through all of this:
Structure = Constraint → Consistency → Symmetry.
When we build a field extension, we impose strict rules: addition and multiplication must still work, the base field must be respected. These constraints are not limitations—they're what make the whole enterprise stable. They guarantee that two different computation paths always arrive at the same answer (path independence).
It is precisely this rigidity that makes symmetry meaningful. In a chaotic set with no structure, "symmetry" is vacuous—there's no shape to preserve. Symmetry only becomes interesting when there are rules that a transformation might break but doesn't. The automorphisms of a field extension thread this needle: they rearrange elements while leaving every algebraic relationship intact.
06 The AIT Connection: Symmetry as Leftover Ambiguity
Here the paper this post is based on (see Further Reading) makes an elegant connection to Algorithmic Information Theory (AIT). The field $\mathbb{Q}(i)$ can be specified by a very short "program":
Start with $\mathbb{Q}$. Adjoin an element $x$ subject to $x^2 + 1 = 0$.
Formally, this is the quotient construction $\mathbb{Q}(i) \cong \mathbb{Q}[x]/(x^2+1)$. In Kolmogorov complexity terms, this is a highly compressed description—a short program generating an infinite structure.
But notice what the program doesn't specify: which root gets the name "$i$." The constraint $x^2 + 1 = 0$ pins down a relation ($x^2 = -1$), not a unique identity. The leftover ambiguity, the freedom to choose the $+$ root or the $-$ root, is exactly the Galois group:
$$\operatorname{Gal}(\mathbb{Q}(i)/\mathbb{Q}) = \{\mathrm{id},\ i \mapsto -i\} \cong C_2, \qquad \log_2 |C_2| = 1.$$
One bit of unresolved choice. In this reading, Galois symmetry measures the degrees of freedom that remain after the structural constraints have fixed everything they can.
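To make the "short program" concrete, here is a minimal sketch of arithmetic in $\mathbb{Q}[x]/(x^2+1)$, assuming polynomials are stored as coefficient lists with the lowest degree first. The helper names `reduce_mod` and `mul` are hypothetical. The code never decides which root $x$ "really" is; it only enforces the relation.

```python
from fractions import Fraction
from math import log2

# x^2 + 1 as coefficients, lowest degree first: 1 + 0*x + 1*x^2 (monic).
MODULUS = [Fraction(1), Fraction(0), Fraction(1)]

def reduce_mod(poly, modulus=MODULUS):
    """Remainder of poly modulo a monic modulus."""
    poly = list(poly)
    deg = len(modulus) - 1
    while len(poly) > deg:
        lead = poly.pop()            # coefficient of the current top power
        shift = len(poly) - deg      # top power equals x^shift * x^deg
        for k in range(deg):
            # Replace x^deg by minus the lower-order part of the modulus.
            poly[shift + k] -= lead * modulus[k]
    return poly + [Fraction(0)] * (deg - len(poly))

def mul(p, q):
    """Multiply two coefficient lists, then reduce modulo x^2 + 1."""
    product = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            product[i + j] += a * b
    return reduce_mod(product)

x = [Fraction(0), Fraction(1)]
minus_x = [Fraction(0), Fraction(-1)]
minus_one = [Fraction(-1), Fraction(0)]

# Both x and -x satisfy the defining relation, so either could be called "i".
roots = [r for r in (x, minus_x) if mul(r, r) == minus_one]
bits_of_ambiguity = log2(len(roots))
print(len(roots), bits_of_ambiguity)  # 2 1.0
```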
07 Solvability: When Symmetry Can Be Unwound
Now we reach the historical punchline. Why does the Galois group matter?
Galois's great insight was that an equation is solvable by radicals—its roots expressible using $+, -, \times, \div$ and $n$-th roots—if and only if its Galois group is a solvable group: one admitting a subnormal series with abelian factors.
Each abelian layer can be refined into cyclic steps, and each cyclic step corresponds to extracting a single $n$-th root (once the needed roots of unity are available). The quadratic, cubic, and quartic formulas all exist because $S_2$, $S_3$, and $S_4$ are solvable: their symmetry can be peeled off one commutative layer at a time.
But $S_5$ contains the alternating group $A_5$, which is non-abelian and simple (its only normal subgroups are the trivial group and $A_5$ itself). It's a monolithic block of symmetry that cannot be factored into abelian layers. This is why there is no general quintic formula: the symmetry group refuses to be unwound into sequential radical extractions.
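One way to watch the layers peel off is to compute the derived series, which repeatedly replaces a group by its commutator subgroup; a finite group is solvable exactly when this series reaches the trivial group. The following is a brute-force sketch in plain Python, with permutations stored as tuples of images. All function names are illustrative, and the approach is only practical for small $n$.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """Close a set of permutations under composition (enough in a finite group)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def derived_subgroup(group, n):
    """Subgroup generated by all commutators a b a^-1 b^-1."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, n)

def derived_series_orders(n):
    """Orders of S_n, its derived subgroup, and so on, until the series stops shrinking."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        nxt = derived_subgroup(group, n)
        if len(nxt) == len(group):
            return orders
        orders.append(len(nxt))
        group = nxt

for n in (2, 3, 4, 5):
    print(n, derived_series_orders(n))
# 2 [2, 1]
# 3 [6, 3, 1]
# 4 [24, 12, 4, 1]
# 5 [120, 60]
```

For $n \le 4$ the orders fall to 1, one abelian layer at a time. For $n = 5$ the series stalls at order 60, which is $A_5$ reproducing itself.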
Solvability = Compositionality
This is perhaps the paper's deepest reframing. A formula like the quadratic formula is a sequential program: compute the discriminant, take a square root, do some arithmetic. This step-by-step procedure works because the symmetry group decomposes into layers.
Solvable groups are compositional: they have internal hierarchy that can be exploited one piece at a time. Non-solvable groups are monolithic: $A_5$ is an indivisible block. There's no way to peel it apart, so there is no sequence of radicals that can reconstruct the roots.
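The "sequential program" reading of the quadratic formula can be written out literally. A minimal sketch, using `cmath.sqrt` so that negative discriminants are handled; the function name `quadratic_roots` is an illustrative choice.

```python
import cmath

def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c (a != 0), computed as three sequential steps."""
    discriminant = b * b - 4 * a * c            # step 1: arithmetic in the base field
    root_of_disc = cmath.sqrt(discriminant)     # step 2: one square-root extraction
    return ((-b + root_of_disc) / (2 * a),      # step 3: arithmetic again
            (-b - root_of_disc) / (2 * a))

print(quadratic_roots(1, -3, 2))  # roots 2 and 1
print(quadratic_roots(1, 0, 1))   # roots i and -i
```

The single radical in step 2 matches the single $C_2$ layer of the Galois group; swapping the sign of the square root swaps the two roots.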
08 The "Nuclear Option": Inventing New Primitives
The Abel–Ruffini theorem says the quintic can't be solved with radicals. But it doesn't say the roots don't exist—only that our standard toolbox is too weak.
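The Bring Radical. Any quintic can be reduced, via a Tschirnhaus transformation, to the Bring–Jerrard form $x^5 + x + a = 0$. For real $a$, the Bring radical $BR(a)$ is defined as the unique real root of this equation. Adjoin it to the toolbox alongside ordinary radicals and all quintics become expressible.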
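As a new primitive, the Bring radical is perfectly computable even though it has no radical expression. Here is a numerical sketch for real $a$, assuming plain bisection is acceptable: $x^5 + x + a$ is strictly increasing (its derivative $5x^4 + 1$ is positive), so it has exactly one real root and bisection cannot miss it. The name `bring_radical` and the iteration count are illustrative choices.

```python
def bring_radical(a, iterations=200):
    """Approximate the unique real root of x^5 + x + a = 0 by bisection."""
    def f(x):
        return x ** 5 + x + a

    # f is negative at -(|a| + 1) and positive at |a| + 1, so the root is bracketed.
    lo, hi = -(abs(a) + 1.0), abs(a) + 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

print(bring_radical(-2))  # close to 1, since 1 + 1 - 2 = 0
print(bring_radical(2))   # close to -1
```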
Elliptic Modular Functions. A deeper approach: the symmetry of the quintic is governed by $A_5$, isomorphic to the rotation group of the icosahedron. Standard radicals correspond to circular (cyclic) geometry. To match icosahedral symmetry, Hermite and Kronecker showed that elliptic modular functions—which naturally carry this richer symmetry—can solve the quintic.
Match the symmetry of the tool to the symmetry of the problem. Radicals have abelian symmetry; they fail on non-abelian problems. Upgrade the tool to one with the right internal symmetry, and the "unsolvable" becomes solvable.
09 The Continuous Frontier: From Lie to Learning
The story extends beyond polynomials. Sophus Lie asked: can we build a Galois theory for differential equations? The answer is yes, and the same structural dichotomy appears:
If the symmetry group of a differential equation (now a Lie group) is solvable, the equation can be solved by quadratures: stacking integrals in a "reduction of order" procedure. If the Lie group is simple (like $SL_2(\mathbb{C})$, the differential Galois group of the Airy equation $y'' = xy$), the solution defines an irreducible new function that can't be decomposed into such elementary steps.
The Deep Learning Connection
Tomaso Poggio and collaborators have argued that deep neural networks can avoid the curse of dimensionality when the target function is compositional, that is, decomposable into a hierarchy of simpler functions: $F(x) = f_L(f_{L-1}(\ldots f_1(x)\ldots))$. Approximating a generic high-dimensional function requires resources (parameters and, correspondingly, samples) that grow exponentially with the input dimension, whereas compositional functions built from low-dimensional pieces can be approximated by deep networks at polynomial cost.
The parallel to Galois theory is striking:
| Domain | "Solvable" means… | "Unsolvable" means… |
|---|---|---|
| Algebra | Group decomposes into abelian layers → formula by radicals | $A_5$ is monolithic → no radical formula |
| Calculus | Lie group is solvable → solution by quadratures | Simple Lie group → irreducible special function |
| Learning | Target is compositional → efficient approximation | Monolithic structure → curse of dimensionality |
In every case, solvability is decomposability.
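A rough counting exercise conveys the size of the gap in the learning row. This is only an analogy under stated assumptions, not the approximation theorem itself: suppose each input (and each intermediate value) is quantized to `m` levels, and compare tabulating an arbitrary function of `d` inputs with tabulating a binary tree of `d - 1` two-input nodes. Both function names are hypothetical.

```python
def dense_table_entries(d, m):
    """Entries needed to tabulate an arbitrary function of d inputs, m levels each."""
    return m ** d

def binary_tree_table_entries(d, m):
    """Entries needed for a binary tree of d - 1 two-input nodes, each an m-by-m table."""
    return (d - 1) * m ** 2

for d in (2, 4, 8, 16):
    print(d, dense_table_entries(d, 10), binary_tree_table_entries(d, 10))
# 2 100 100
# 4 10000 300
# 8 100000000 700
# 16 10000000000000000 1500
```

The monolithic count grows exponentially in `d`, while the compositional count grows linearly, which is the same shape of gap the table describes.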
10 The Staircase of Understanding
The paper's final synthesis uses the Kolmogorov Structure Function to visualize the progress of an intelligent agent. When remaining error is plotted against model complexity, the curve doesn't descend smoothly; it forms a staircase, revealing two alternating modes, composition and discovery, which play out in a repeating three-phase cycle:
Phase 1—Composition (the slope). The agent rearranges existing primitives to minimize error. This is like solving equations within a fixed radical toolkit. Eventually it hits a plateau—a "monolithic barrier" where the current language isn't expressive enough.
Phase 2—Discovery (the drop). The agent identifies a recurrent irreducible pattern and compiles it into a new primitive—a new cognitive core. Mathematically, this is like defining the Bring radical or the Airy function. A small increase in descriptive complexity yields a massive drop in error.
Phase 3—Re-composition (the new slope). With the expanded toolkit, the agent re-enters compositional search, combining old and new primitives. The cycle repeats.
Intelligence is not finding one formula. It is a cycle of inventing new primitives and composing them.
Galois theory teaches us that the composition phase is the "easy" part. The history of mathematics—and of AI—is defined by the hard part: the moments when someone (or something) discovers the new atomic operations that make the previously unsolvable, solvable again.
∞ Further Reading
- The full paper: BCOM WP0052 on Zenodo (CC BY 4.0)
- J. S. Milne, Fields and Galois Theory—free course notes
- E. Artin, Galois Theory (Notre Dame lectures, 1971)
- T. Poggio, H. Mhaskar, et al., "Why and when can deep-but not shallow-networks avoid the curse of dimensionality: A review" (2017)
- G. Ruffini, "Structured Dynamics in the Algorithmic Agent" (2024)
- T. Poggio, Thirty Brief Lectures on Foundations of Deep Learning (Draft, 2026)