Number Theory

Number theory is the study of the integers $Z$ and their generalisations — divisibility, primes, congruences, Diophantine equations, algebraic integers, and the deep arithmetic of rational points on varieties. It is the oldest branch of pure mathematics (Euclid’s Elements, c. 300 BC, proved the infinitude of primes and the fundamental theorem of arithmetic) and the source of some of the most consequential modern results: Andrew Wiles’s 1995 proof of Fermat’s Last Theorem, the prime number theorem (Hadamard / de la Vallée Poussin 1896), the proofs of the Weil conjectures (Deligne 1974), and the partial results toward the Langlands program. It is also the source of practically every cryptographic primitive deployed at scale — RSA, elliptic-curve cryptography, lattice-based post-quantum schemes, isogeny-based protocols — and so connects deep abstract mathematics to the most quotidian engineering. This note covers divisibility and the Euclidean algorithm, primes and the distribution theorems, congruences, Fermat-Euler-Wilson, multiplicative functions, Dirichlet series and $L$ -functions, quadratic reciprocity, an algebraic-number-theory primer, elliptic curves, modular forms, the Wiles proof outline, and cryptographic applications.

1. Divisibility and the Euclidean algorithm

1.1 Divisibility

For $a, b \in Z$ with $b \neq 0$, $b ∣ a$ ($b$ divides $a$) means $a = bk$ for some $k \in Z$. The divisibility relation is reflexive, transitive, and antisymmetric on $Z_{\geq 1}$.

1.2 Greatest common divisor

For $a, b \in Z$ not both zero, $\gcd(a, b)$ is the largest positive integer dividing both. By convention $\gcd(0, 0) = 0$; for $a \neq 0$, $\gcd(a, 0) = |a|$ follows from the definition.

Bézout’s identity (Étienne Bézout 1779): there exist $x, y \in Z$ with $ax + by = \gcd(a, b)$. The coefficients are not unique but the gcd is.

1.3 Euclidean algorithm

Euclid c. 300 BC. Compute $\gcd(a, b)$ by iteration: $a = b q_{1} + r_{1}, b = r_{1} q_{2} + r_{2}, r_{1} = r_{2} q_{3} + r_{3}, \dots$ with $0 \leq r_{i} < r_{i-1}$ at each step. The last nonzero remainder is $\gcd(a, b)$. Back-substitution through these equations yields the Bézout coefficients.

Complexity: at most $O(\log \min(a, b))$ iterations (Gabriel Lamé 1844 showed the worst case is consecutive Fibonacci numbers). Each iteration costs at most $O(\log^{2} \max(a, b))$ bit operations naively, or $O(M(n))$ with fast multiplication, where $M(n)$ is the cost of multiplying $n$-bit integers. Total: $O(n^{2})$ bit operations on $n$-bit inputs with schoolbook arithmetic (a step with quotient $q$ costs only $O(n \log q)$, and the quotient sizes sum to $O(n)$), improvable to $O(n \log^{2} n \log \log n)$ via Schönhage’s half-gcd.
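The back-substitution can be folded into the forward loop. The following is a minimal Python sketch of the extended Euclidean algorithm under that approach; the function name `extended_gcd` is an illustrative choice, not taken from any particular library.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g."""
    # Loop invariants: old_r == a*old_x + b*old_y and r == a*x + b*y.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise so the gcd is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y
```

For instance, `extended_gcd(240, 46)` returns a triple `(g, x, y)` with `g == 2` and `240*x + 46*y == 2`.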

1.4 Least common multiple

$\operatorname{lcm}(a, b) = |ab| / \gcd(a, b)$.

2. Primes and the fundamental theorem

2.1 Primes

A prime is an integer $p \geq 2$ whose only positive divisors are $1$ and $p$ . Smallest primes: $2, 3, 5, 7, 11, 13, 17, 19, 23, 29, \dots$ .

Euclid’s theorem: there are infinitely many primes. Proof: given primes $p_{1}, \dots, p_{n}$ , the number $N = p_{1} \dots p_{n} + 1$ is either prime or has a prime factor not among the $p_{i}$ . Either way we get a new prime.

2.2 Fundamental theorem of arithmetic

Every integer $n \geq 2$ factors uniquely (up to order) as a product of primes: $n = p_{1}^{a_{1}} p_{2}^{a_{2}} \dots p_{k}^{a_{k}} .$ Existence is straightforward by induction. Uniqueness uses Euclid’s lemma: if $p$ is prime and $p ∣ ab$ , then $p ∣ a$ or $p ∣ b$ .

2.3 Mersenne, Fermat, and other special primes

Mersenne primes: $M_{p} = 2^{p} - 1$ prime, which forces $p$ to be prime. As of 2024, 52 are known; the largest known prime is the Mersenne prime $M_{136279841}$ (found 2024). Conjecturally there are infinitely many.
Fermat primes: $F_{n} = 2^{2^{n}} + 1$ . Only $F_{0} = 3, F_{1} = 5, F_{2} = 17, F_{3} = 257, F_{4} = 65537$ known to be prime; $F_{5} = 4294967297 = 641 \cdot 6700417$ (Euler 1732).
Twin primes: pairs $(p, p + 2)$. Conjecturally infinitely many (Polignac 1849). Yitang Zhang 2013 Annals of Math 179: there are infinitely many prime pairs differing by at most $70 \times 10^{6}$. The Polymath8 project and James Maynard reduced this bound to 246.
Sophie Germain primes: $p$ such that $2 p + 1$ is also prime.

3. Prime distribution

3.1 Prime counting function

$π(x)$ = number of primes $\leq x$. Sample values: $π(10) = 4$, $π(100) = 25$, $π(10^{6}) = 78498$, $π(10^{10}) = 455052511$.
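The smaller sample values can be reproduced with a sieve of Eratosthenes. This is a minimal sketch, assuming memory for one byte per integer up to `x`; the name `prime_count` is illustrative, and the approach is not practical for $x = 10^{10}$.

```python
def prime_count(x: int) -> int:
    """Count primes <= x with a byte-array sieve of Eratosthenes."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p <= x:
        if is_prime[p]:
            # Cross out p*p, p*p + p, ... in a single slice assignment.
            count = len(range(p * p, x + 1, p))
            is_prime[p * p :: p] = bytes(count)
        p += 1
    return sum(is_prime)
```

Calling `prime_count(10**6)` takes well under a second on typical hardware and agrees with the tabulated value 78498.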

3.2 Prime number theorem

Conjectured by Gauss (age 14, 1792) and Legendre (1798); proven independently by Jacques Hadamard 1896 and Charles-Jean de la Vallée Poussin 1896: $π(x) \sim \frac{x}{\log x}, x \to \infty.$

Equivalently $π(x) \sim \operatorname{Li}(x) = \int_{2}^{x} dt / \log t$. The error term is best stated using the latter: $π(x) = \operatorname{Li}(x) + O(x e^{-c \sqrt{\log x}})$ for some constant $c > 0$ unconditionally (de la Vallée Poussin), and $π(x) = \operatorname{Li}(x) + O(\sqrt{x} \log x)$ under the Riemann hypothesis (Helge von Koch 1901).

3.3 The Riemann hypothesis

Riemann 1859 conjectured that all non-trivial zeros of $ζ(s)$ have real part $1/2$. Equivalent: $π(x) - \operatorname{Li}(x) = O(\sqrt{x} \log x)$. Eighth Hilbert problem (1900) and one of the Clay Millennium Prize problems. Verified for the first $10^{13}$ zeros (van de Lune et al. and successors); widely believed.

3.4 Other distribution results

Chebyshev (1852): $0.92 \cdot x / \log x < π(x) < 1.11 \cdot x / \log x$ for all sufficiently large $x$.
Bertrand’s postulate (Chebyshev 1850, conjectured Bertrand 1845): for every $n \geq 2$ there is a prime in $(n, 2n)$.
Dirichlet’s theorem on primes in APs (Peter Gustav Lejeune Dirichlet 1837): for $\gcd(a, q) = 1$, there are infinitely many primes $\equiv a \pmod{q}$, distributed equally among residue classes coprime to $q$: $π(x; q, a) \sim \frac{1}{φ(q)} \cdot \frac{x}{\log x}.$
Green-Tao theorem (Ben Green and Terence Tao 2008 Annals of Math 167): the primes contain arbitrarily long arithmetic progressions.

4. Congruences

4.1 Definition

$a \equiv b \pmod{n}$ means $n ∣ (a - b)$. The relation is an equivalence relation; the quotient $Z / nZ$ is a finite commutative ring of order $n$. The multiplicative group of units is $(Z / nZ)^{\times} = \{a \bmod n : \gcd(a, n) = 1\}$, of order $φ(n)$.

4.2 Chinese Remainder Theorem

Sun Tzu c. 4th century AD (in special cases); general statement clear by the 13th century. For pairwise coprime $n_{1}, \dots, n_{k}$ : $Z / (n_{1} \dots n_{k}) Z ≅ Z / n_{1} Z \times \dots \times Z / n_{k} Z$ via $a \mapsto (a mod n_{1}, \dots, a mod n_{k})$ . The system $x \equiv a_{i} (mod n_{i})$ has a unique solution mod $\prod n_{i}$ .

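Algorithmic version: with $N = \prod n_{i}$, Bézout coefficients for the coprime pair $n_{i}$, $N / n_{i}$ give $y_{i}$ with $(N / n_{i}) y_{i} \equiv 1 \pmod{n_{i}}$; then $x = \sum_{i} a_{i} (N / n_{i}) y_{i} \bmod N$ is the explicit solution.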
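The construction can be sketched in a few lines of Python. This is an illustrative implementation assuming pairwise coprime moduli and Python 3.8 or later (for `pow` with a negative exponent, which computes the modular inverse); the name `crt` is a hypothetical helper.

```python
from math import prod

def crt(residues: list[int], moduli: list[int]) -> int:
    """Solve x = residues[i] (mod moduli[i]) for pairwise coprime moduli."""
    N = prod(moduli)
    x = 0
    for a_i, n_i in zip(residues, moduli):
        M_i = N // n_i
        # Inverse of M_i modulo n_i; raises ValueError if they are not coprime.
        y_i = pow(M_i, -1, n_i)
        x += a_i * M_i * y_i
    return x % N
```

The classical example $x \equiv 2 \pmod 3$, $x \equiv 3 \pmod 5$, $x \equiv 2 \pmod 7$ gives `crt([2, 3, 2], [3, 5, 7]) == 23`.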

4.3 Fermat’s little theorem

Pierre de Fermat 1640 (no proof); first proof by Euler 1736. For prime $p$ and $a \in Z$ with $p ∤ a$ : $a^{p - 1} \equiv 1 (mod p) .$ Equivalently $a^{p} \equiv a (mod p)$ for all $a$ .

4.4 Euler’s theorem

Leonhard Euler 1763. For $\gcd(a, n) = 1$: $a^{φ(n)} \equiv 1 \pmod{n}.$ Generalises Fermat’s little theorem (case $n = p$). The Euler totient is $φ(n) = |\{1 \leq k \leq n : \gcd(k, n) = 1\}|$.

4.5 Wilson’s theorem

John Wilson 1770 (conjecture), Lagrange 1771 (proof). For prime $p$: $(p - 1)! \equiv -1 \pmod{p}.$ The converse also holds, so this is a primality test. It is computationally inefficient ($O(p)$ multiplications, versus $O((\log p)^{c})$ bit operations for AKS) but theoretically elegant.

4.6 Order of an element

The order of $a \in (Z / n Z)^{\times}$ is the smallest $k \geq 1$ with $a^{k} \equiv 1 (mod n)$ . Divides $φ (n)$ . For prime $p$ , $(Z / p Z)^{\times}$ is cyclic of order $p - 1$ (Gauss); a generator is a primitive root mod $p$ .
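The definitions of order and primitive root translate directly into a brute-force search. The sketch below is for small moduli only; `multiplicative_order` and `primitive_root` are illustrative names, and `primitive_root` assumes without checking that its argument is prime.

```python
from math import gcd

def multiplicative_order(a: int, n: int) -> int:
    """Smallest k >= 1 with a**k == 1 (mod n); requires gcd(a, n) == 1."""
    if n < 2 or gcd(a, n) != 1:
        raise ValueError("need n >= 2 and gcd(a, n) == 1")
    k, x = 1, a % n
    while x != 1:
        x = x * a % n
        k += 1
    return k

def primitive_root(p: int) -> int:
    """Smallest primitive root modulo a prime p (primality is assumed)."""
    for g in range(1, p):
        if multiplicative_order(g, p) == p - 1:
            return g
    raise ValueError("no primitive root found; is p prime?")
```

Modulo 7, the element 2 has order 3 while 3 has order 6, so 3 is a primitive root.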

5. Multiplicative functions

5.1 Definitions

$f : Z_{\geq 1} \to C$ is multiplicative if $f(mn) = f(m) f(n)$ whenever $\gcd(m, n) = 1$. Completely multiplicative if the condition holds for all $m, n$.

5.2 Key examples

$1 (n) = 1$ for all $n$ .
$n^{s}$ for any complex $s$ .
$φ (n) = n \prod_{p ∣ n} (1 - 1/ p)$ — Euler’s totient.
$σ_{k} (n) = \sum_{d ∣ n} d^{k}$ . Special cases: $σ_{0} = d$ (number of divisors), $σ_{1} = σ$ (sum of divisors).
$μ (n)$ — Möbius function: $μ (1) = 1$ , $μ (n) = (- 1)^{k}$ if $n$ is a product of $k$ distinct primes, $μ (n) = 0$ if $n$ has a squared prime factor.
$Λ(n)$, the von Mangoldt function (not multiplicative): $Λ(p^{k}) = \log p$ for prime powers $p^{k}$ with $k \geq 1$, $Λ(n) = 0$ otherwise.

5.3 Dirichlet convolution

For arithmetic functions $f, g$: $(f * g)(n) = \sum_{d ∣ n} f(d) g(n / d).$ This is associative and commutative, with identity $ε(n) = [n = 1]$. The arithmetic functions form a commutative ring under pointwise addition and Dirichlet convolution; its units are the functions with $f(1) \neq 0$, and the multiplicative functions form a subgroup of this unit group.

5.4 Möbius inversion

August Ferdinand Möbius 1832. If $g = 1 * f$ (i.e., $g (n) = \sum_{d ∣ n} f (d)$ ), then $f = μ * g$ (i.e., $f (n) = \sum_{d ∣ n} μ (d) g (n / d)$ ). Equivalently $μ = 1^{- 1}$ in the Dirichlet convolution ring.

Examples:

$\sum_{d ∣ n} φ (d) = n$ , so $φ = μ * Id$ .
$\sum_{d ∣ n} μ (d) = [n = 1]$ .
$\sum_{d ∣ n} Λ(d) = \log n$.
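The first two identities can be checked numerically. The following is a naive Python sketch meant only for small $n$; the helper names (`divisors`, `convolve`, `mobius`, `phi`) are illustrative.

```python
from math import gcd

def divisors(n: int) -> list[int]:
    """All positive divisors of n, by direct search."""
    return [d for d in range(1, n + 1) if n % d == 0]

def convolve(f, g):
    """Dirichlet convolution: (f * g)(n) = sum of f(d) * g(n // d) over d | n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))

def mobius(n: int) -> int:
    """Moebius function via trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:  # squared prime factor
                return 0
            result = -result
        p += 1
    if n > 1:  # one remaining prime factor
        result = -result
    return result

def phi(n: int) -> int:
    """Euler totient, straight from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

With `one = lambda n: 1` and `ident = lambda n: n`, the function `convolve(mobius, ident)` agrees with `phi`, and `convolve(mobius, one)` is the convolution identity $ε$.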

6. Dirichlet series and $L$ -functions

6.1 Dirichlet series

For an arithmetic function $f$ : $D (f, s) = \sum_{n = 1}^{\infty} \frac{f ( n )}{n ^{s}} .$ Converges absolutely in a right half-plane $Re (s) > σ_{a}$ for some $σ_{a} \in [- \infty, \infty]$ .

Convolution becomes multiplication: $D (f * g, s) = D (f, s) D (g, s)$ .

6.2 Riemann zeta function

$ζ (s) = D (1, s) = \sum 1/ n^{s}$ . Euler product (1737): $ζ (s) = \prod_{p} (1 - p^{- s})^{- 1}, Re (s) > 1.$ This is the analytic translation of unique factorisation.

See complex-analysis for analytic continuation, functional equation, and the Riemann hypothesis.

6.3 Dirichlet $L$ -functions

For a Dirichlet character $χ : (Z / q Z)^{\times} \to C^{\times}$ (extended by zero): $L (s, χ) = \sum_{n = 1}^{\infty} \frac{χ ( n )}{n ^{s}} = \prod_{p} \frac{1}{1 - χ ( p ) p ^{- s}}, Re (s) > 1.$

Every $L(s, χ)$ extends meromorphically to $C$, and is entire if $χ$ is non-principal (for principal $χ$ there is a simple pole at $s = 1$). The non-vanishing $L(1, χ) \neq 0$ for non-principal $χ$ is the key step in Dirichlet’s theorem on primes in APs. Generalised Riemann hypothesis (GRH): all non-trivial zeros lie on $Re(s) = 1/2$.

6.4 Dedekind zeta function

For a number field $K$ : $ζ_{K} (s) = \sum_{a} \frac{1}{N ( a ) ^{s}} = \prod_{p} \frac{1}{1 - N ( p ) ^{- s}}$ summing over nonzero ideals and primes of $O_{K}$ . Generalises $ζ = ζ_{Q}$ . Class number formula relates the residue at $s = 1$ to the class number, regulator, discriminant, and unit group of $K$ .

6.5 The Selberg class and the Langlands program

Atle Selberg 1992 axiomatised the class of “nice” $L$-functions (functional equation, analytic continuation, Euler product, Ramanujan-Petersson bound). The Langlands program (Robert Langlands 1967 letter to Weil) predicts a vast unification: $L$-functions of automorphic representations of $GL_{n}(A_{K})$ correspond to $L$-functions of $n$-dimensional Galois representations. Many cases are now theorems (Wiles for $GL_{2} / Q$, much of $GL_{n}$ in the function-field case via Drinfeld-Lafforgue, and the geometrization of the local correspondence by Fargues-Scholze 2021).

7. Quadratic reciprocity

7.1 Legendre symbol

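For odd prime $p$ and $a \in Z$ with $p ∤ a$: $(\frac{a}{p}) = \begin{cases} +1 & \text{if } a \text{ is a quadratic residue (QR) mod } p, \\ -1 & \text{if } a \text{ is a quadratic non-residue (QNR) mod } p. \end{cases}$ Euler’s criterion: $(\frac{a}{p}) \equiv a^{(p - 1)/2} \pmod{p}$.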

7.2 Quadratic reciprocity

Conjectured Euler-Legendre 18th century; first proof Gauss 1796 (age 18). Gauss produced 8 distinct proofs; over 240 published proofs now exist.

For distinct odd primes $p, q$ : $(\frac{p}{q}) (\frac{q}{p}) = (- 1)^{\frac{p - 1}{2} \cdot \frac{q - 1}{2}} .$ Supplements: $(\frac{- 1}{p}) = (- 1)^{(p - 1) /2}$ and $(\frac{2}{p}) = (- 1)^{(p^{2} - 1) /8}$ .

7.3 Jacobi symbol

Extends Legendre to all odd positive $n$: $(\frac{a}{n}) = \prod_{i} (\frac{a}{p_{i}})^{e_{i}}$ for $n = \prod p_{i}^{e_{i}}$. Reciprocity extends. For composite $n$, the value $+1$ no longer guarantees that $a$ is a square mod $n$. Computable in $O(\log^{2} n)$ bit operations without factoring $n$; central in primality testing (Solovay-Strassen).
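The factoring-free computation uses only reduction mod $n$, the supplement for $2$, and reciprocity to swap the arguments. The following Python sketch shows the standard binary algorithm; the function name `jacobi` is illustrative.

```python
def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n, without factoring n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    a %= n
    result = 1
    while a != 0:
        # Pull out factors of 2 using (2/n) = (-1)^((n^2 - 1)/8).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Reciprocity: swapping flips the sign iff both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    # If the loop ended with n > 1, the original arguments shared a factor.
    return result if n == 1 else 0
```

For prime $n$ this agrees with Euler’s criterion, which makes a convenient cross-check.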

7.4 Higher reciprocity

Cubic reciprocity (Gauss / Eisenstein), biquadratic reciprocity, and the general statement subsumed by class field theory (Takagi 1920, Artin 1927): the abelian extensions of a number field are classified by ideal class groups / idele class groups. Quadratic reciprocity emerges as the case of quadratic extensions of $Q$ and the Artin reciprocity isomorphism. Modern non-abelian generalisation: the Langlands program.

8. Algebraic number theory primer

8.1 Algebraic integers and number fields

A number field $K$ is a finite-degree field extension of $Q$ . The ring of integers $O_{K}$ is the set of $α \in K$ satisfying a monic polynomial in $Z [x]$ .

Examples:

$K = Q$ : $O_{K} = Z$ .
$K = Q (i)$ : $O_{K} = Z [i]$ (Gaussian integers).
$K = Q(\sqrt{2})$: $O_{K} = Z[\sqrt{2}]$.
$K = Q(\sqrt{-5})$: $O_{K} = Z[\sqrt{-5}]$. Not a UFD: $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$.
$K = Q (ω)$ with $ω = e^{2 πi / n}$ : $O_{K} = Z [ω]$ — cyclotomic field.

8.2 Failure of unique factorisation; ideals

The Gaussian integers $Z[i]$ are a UFD (Gauss). But many $O_{K}$ are not, as the example $Z[\sqrt{-5}]$ above shows; the same failure in cyclotomic rings invalidated Lamé’s 1847 attempted proof of FLT. Ernst Eduard Kummer’s insight (1847): factor ideals (his “ideal numbers”) instead of elements. In $O_{K}$ every nonzero ideal factors uniquely as a product of prime ideals, i.e. $O_{K}$ is a Dedekind domain.

8.3 Class group

The ideal class group $Cl (K) = I (K) / P (K)$ , where $I (K)$ is the group of fractional ideals and $P (K)$ the principal fractional ideals, is a finite abelian group. The class number $h_{K} = ∣ Cl (K) ∣$ measures the failure of unique factorisation of elements.

$h_{K} = 1$ iff $O_{K}$ is a PID iff UFD.
For $K = Q(\sqrt{-d})$ with $d > 0$ squarefree: $h_{K} = 1$ for exactly nine values $d \in \{1, 2, 3, 7, 11, 19, 43, 67, 163\}$ (Heegner 1952, Baker 1966, Stark 1967).
Gauss class number problem for real quadratic fields ($K = Q(\sqrt{d})$, $d > 0$): conjecturally infinitely many have $h = 1$; still open.

8.4 Units

$O_{K}^{\times}$ is described by Dirichlet’s unit theorem (Dirichlet 1846): $O_{K}^{\times} ≅ μ (K) \times Z^{r_{1} + r_{2} - 1}$ where $μ (K)$ is the group of roots of unity in $K$ (finite), $r_{1}$ = number of real embeddings, $r_{2}$ = number of conjugate pairs of complex embeddings.

For real quadratic $K = Q(\sqrt{d})$: $O_{K}^{\times} ≅ \{\pm 1\} \times Z$, generated by $-1$ and a fundamental unit. Computing the fundamental unit is essentially equivalent to solving Pell’s equation $x^{2} - d y^{2} = \pm 1$ (with $\pm 4$ on the right when $d \equiv 1 \pmod{4}$).
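A standard way to solve Pell’s equation is through the continued fraction expansion of $\sqrt{d}$. The sketch below is a minimal Python version that handles only the $+1$ equation $x^{2} - d y^{2} = 1$; the name `pell_fundamental` is illustrative, and the resulting unit $x + y\sqrt{d}$ may be a power of the fundamental unit rather than the fundamental unit itself.

```python
from math import isqrt

def pell_fundamental(d: int) -> tuple[int, int]:
    """Smallest positive (x, y) with x*x - d*y*y == 1, for non-square d > 0."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    # State of the continued fraction of sqrt(d): (sqrt(d) + m) / den.
    m, den, a = 0, 1, a0
    # Consecutive convergents p/q of sqrt(d).
    p_prev, p = 1, a0
    q_prev, q = 0, 1
    while p * p - d * q * q != 1:
        m = den * a - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q
```

For $d = 61$ the smallest solution is already ten digits long, which illustrates how irregularly the size of the fundamental unit grows with $d$.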

8.5 Splitting of primes

For a prime $p \in Z$ and a number field $K$: $p O_{K} = \mathfrak{p}_{1}^{e_{1}} \dots \mathfrak{p}_{g}^{e_{g}}$ with $\sum e_{i} f_{i} = [K : Q]$, where $e_{i}$ is the ramification index and $f_{i}$ the residue degree of $\mathfrak{p}_{i}$. Behaviour types: totally split ($g = [K : Q]$, all $e_{i} = f_{i} = 1$), inert ($g = 1$, $e_{1} = 1$, $f_{1} = [K : Q]$), ramified (some $e_{i} > 1$).

For quadratic $K = Q(\sqrt{d})$ and odd $p$: $p$ splits, is inert, or ramifies according as $(\frac{D_{K}}{p})$ is $+1$, $-1$, or $0$, where $D_{K}$ is the discriminant (Legendre/Kronecker symbol). Quadratic reciprocity governs the splitting.

9. Elliptic curves

9.1 Definition

An elliptic curve over a field $k$ is a smooth projective curve of genus $1$ with a marked $k$-rational point. For char $k \neq 2, 3$ it has the affine form $E : y^{2} = x^{3} + ax + b, Δ = -16(4a^{3} + 27b^{2}) \neq 0.$ The smoothness condition is $Δ \neq 0$.

9.2 Group law

The set of $k$ -rational points $E (k)$ is an abelian group with the marked point $O$ (at infinity) as identity. Geometrically: $P + Q + R = O$ iff $P, Q, R$ are collinear (counted with multiplicity).

Explicit formulas (chord-tangent construction):

Negation: $-(x, y) = (x, -y)$, so $-P$ has the same $x$-coordinate and negated $y$.
For $P = (x_{1}, y_{1}), Q = (x_{2}, y_{2})$ with $P \neq -Q$: slope $λ = (y_{2} - y_{1}) / (x_{2} - x_{1})$ if $P \neq Q$, $λ = (3 x_{1}^{2} + a) / (2 y_{1})$ if $P = Q$. Then $x_{3} = λ^{2} - x_{1} - x_{2}$, $y_{3} = λ(x_{1} - x_{3}) - y_{1}$.
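These formulas carry over verbatim to $k = F_{p}$ with division replaced by modular inversion. The following is a Python sketch under that assumption; points are `(x, y)` tuples, `None` stands for $O$, and the names `ec_add` and `ec_mul` are illustrative. It requires Python 3.8 or later for `pow(..., -1, p)`.

```python
def ec_add(P, Q, a: int, p: int):
    """Add two points on y^2 = x^3 + a*x + b over F_p (b is not needed)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P == -Q, including doubling a point with y == 0
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow(2 * y1 % p, -1, p) % p  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p  # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k: int, P, a: int, p: int):
    """Compute k*P for k >= 0 by double-and-add."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, a, p)
        addend = ec_add(addend, addend, a, p)
        k >>= 1
    return result
```

On the toy curve $y^{2} = x^{3} + 2x + 2$ over $F_{17}$, the point $G = (5, 1)$ doubles to $(6, 3)$ and satisfies $19 G = O$.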

9.3 Mordell-Weil theorem

Louis Mordell 1922 for $E / Q$ , André Weil 1929 for general number fields: $E (K) ≅ E (K)_{tors} \oplus Z^{r}$ for a number field $K$ . Here $E (K)_{tors}$ is the (finite) torsion subgroup and $r$ is the rank.

Torsion is classified: Barry Mazur 1977 (Inventiones) for $K = Q$, only 15 possible groups. The largest cyclic one is $Z/12$; the largest overall is $Z/2 \times Z/8$ (order 16).
Rank: no algorithm is known that provably computes it in general. The largest known rank for $E / Q$ is $\geq 28$ (Elkies 2006). Whether ranks over $Q$ are unbounded is open. Bhargava-Skinner-Zhang show that $E / Q$ has rank $0$ or $1$ for a majority of curves (in a precise sense over the family of elliptic curves ordered by height).

9.4 Birch-Swinnerton-Dyer conjecture

Bryan Birch and Peter Swinnerton-Dyer 1965, based on numerical computations on the EDSAC. Statement: for $E / Q$ , the rank of $E (Q)$ equals the order of vanishing of $L (s, E)$ at $s = 1$ : $ord_{s = 1} L (s, E) = rank (E (Q)) .$ The leading coefficient is given by a precise formula involving the Tamagawa numbers, regulator, $∣ E (Q)_{tors} ∣^{2}$ , and the Tate-Shafarevich group $\Sha (E)$ .

Status: the rank statement is known when the analytic rank $ord_{s = 1} L(s, E)$ is $0$ or $1$ (Coates-Wiles 1977 for CM curves; Gross-Zagier 1986 and Kolyvagin 1989 in general). Open for higher ranks. Clay Millennium Prize problem.

9.5 Reduction mod $p$ and the Hasse bound

For $E / Q$ and a prime $p$ of good reduction: $|E(F_{p})| = p + 1 - a_{p}, |a_{p}| \leq 2\sqrt{p}$ (Helmut Hasse 1933, proved as an analogue of the Riemann hypothesis for curves over finite fields).

The function $a_{p}$ governs the $L$ -function: $L (s, E) = \prod_{p} (1 - a_{p} p^{- s} + p \cdot p^{- 2 s})^{- 1}$ for primes of good reduction, with adjusted Euler factors at bad primes.
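For small $p$ the quantity $a_{p}$ can be found by direct enumeration. The Python sketch below counts points naively in $O(p)$ time (practical point counting uses Schoof-type algorithms instead); the name `count_points` is illustrative.

```python
def count_points(a: int, b: int, p: int) -> int:
    """Number of points on y^2 = x^3 + a*x + b over F_p, including infinity."""
    # sqrt_count[v] = number of y in F_p with y*y == v.
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    affine = sum(sqrt_count[(x * x * x + a * x + b) % p] for x in range(p))
    return affine + 1  # add the point at infinity

def trace_of_frobenius(a: int, b: int, p: int) -> int:
    """a_p = p + 1 - |E(F_p)|."""
    return p + 1 - count_points(a, b, p)
```

For $y^{2} = x^{3} + 2x + 2$ over $F_{17}$ this gives 19 points, so $a_{17} = -1$, comfortably inside the Hasse interval.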

10. Modular forms

10.1 Definition

A modular form of weight $k$ for $SL_{2} (Z)$ is a holomorphic function $f : H \to C$ (upper half-plane) with:

$f(\frac{az + b}{cz + d}) = (cz + d)^{k} f(z)$ for $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_{2}(Z)$.
Bounded as $Im (z) \to \infty$ (“holomorphic at the cusp”).

A cusp form additionally vanishes at the cusp.

The space $M_{k}$ of weight- $k$ modular forms is finite-dimensional; $dim M_{k}$ is given by an explicit formula (zero unless $k$ is even non-negative).

10.2 Eisenstein series

$E_{k}(z) = \frac{1}{2} \sum_{(m, n) \neq (0, 0)} (mz + n)^{-k}, k \geq 4 \text{ even}.$ Modular form of weight $k$. The space $M_{k}$ for $SL_{2}(Z)$ is spanned by the products $E_{4}^{a} E_{6}^{b}$ with $4a + 6b = k$.

10.3 Discriminant cusp form

$Δ(z) = η(z)^{24} = q \prod_{n = 1}^{\infty} (1 - q^{n})^{24}, q = e^{2πiz}$ (some authors include an extra factor $(2π)^{12}$). Weight-12 cusp form; it spans the space of weight-12 cusp forms for $SL_{2}(Z)$. The Fourier coefficients $τ(n)$ are the Ramanujan tau function. Ramanujan-Petersson conjecture: $|τ(p)| \leq 2 p^{11/2}$, proved by Deligne 1974 as a consequence of the Weil conjectures.
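The first few values of $τ(n)$ follow from expanding the product as a truncated power series. This is a simple Python sketch using exact integer arithmetic; the name `tau_coefficients` is illustrative, and the method is quadratic in the truncation order.

```python
def tau_coefficients(n_max: int) -> list[int]:
    """Return [tau(1), ..., tau(n_max)] from q * prod_{n>=1} (1 - q^n)^24."""
    # coeffs[k] holds the q^k coefficient of the product, truncated
    # to degrees 0 .. n_max - 1.
    coeffs = [0] * n_max
    coeffs[0] = 1
    for n in range(1, n_max):
        for _ in range(24):
            # Multiply in place by (1 - q^n), highest degree first so that
            # coeffs[k - n] is still the old value when it is read.
            for k in range(n_max - 1, n - 1, -1):
                coeffs[k] -= coeffs[k - n]
    # The leading factor q shifts indices by one: tau(m) = coeffs[m - 1].
    return coeffs
```

The output starts $1, -24, 252, -1472, 4830, -6048$, and $τ(6) = τ(2) τ(3)$ illustrates the multiplicativity of Hecke eigenform coefficients.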

10.4 Hecke operators

Erich Hecke 1937. Operators $T_{n}$ acting on modular forms. Their eigenforms have multiplicative Fourier coefficients. The simultaneous eigenforms (Hecke eigenforms) are the “atoms” — they give rise to $L$ -functions: $L (s, f) = \sum_{n = 1}^{\infty} \frac{a _{n}}{n ^{s}} = \prod_{p} \frac{1}{1 - a _{p} p ^{- s} + p ^{k - 1} p ^{- 2 s}} .$

10.5 Modularity theorem (Taniyama-Shimura-Weil)

Statement: every elliptic curve $E / Q$ is modular: there is a weight-2 cusp form $f$ for $Γ_{0} (N)$ ( $N$ = conductor of $E$ ) such that $L (s, E) = L (s, f)$ .

Wiles 1995 Annals of Math 141 proved the semistable case (sufficient for FLT). Breuil-Conrad-Diamond-Taylor 2001 Journal of the AMS 14 completed the general case.

11. Fermat’s Last Theorem

11.1 Statement

Pierre de Fermat 1637 marginal note in his copy of Diophantus: $x^{n} + y^{n} = z^{n}$ has no solution in positive integers for $n \geq 3$ . The note’s claim of a proof was almost certainly mistaken.

11.2 Historical attempts

Fermat: proof for $n = 4$ (infinite descent).
Euler 1770: $n = 3$ (with gap in unique factorisation, later filled).
Dirichlet, Legendre: $n = 5$ .
Lamé 1839: $n = 7$ (very intricate).
Kummer 1850s: $n$ regular prime — succeeded for many $n$ but failed at irregular primes.
20th century: computational verification up to $n = 4 \times 10^{6}$ (Buhler-Crandall-Sompolski 1992 to $10^{6}$, extended by Buhler-Crandall-Ernvall-Metsänkylä 1993).

11.3 Wiles’s proof outline

The reduction was discovered by Gerhard Frey 1986 and made precise by Ribet 1990 (epsilon conjecture).

Frey curve: if $a^{p} + b^{p} = c^{p}$ is a hypothetical solution with $p \geq 5$ prime, form the elliptic curve $E_{a, b, c} : y^{2} = x (x - a^{p}) (x + b^{p})$ .
Ribet’s theorem: $E_{a, b, c}$ would be “not modular” in a specific sense (the corresponding mod- $p$ Galois representation would not arise from a modular form of weight $2$ and the right level).
Modularity theorem (Wiles): every semistable $E / Q$ is modular.
Contradiction: $E_{a, b, c}$ is semistable, hence modular, contradicting Ribet’s theorem.

The proof of modularity occupies hundreds of pages and uses deformation theory of Galois representations, $R = T$ theorems (Wiles, Taylor-Wiles), and a Galois-cohomological obstruction calculus. A widely accessible exposition: Cornell, Silverman, Stevens (eds.) Modular Forms and Fermat’s Last Theorem 1997.

12. Cryptography

12.1 RSA

Rivest-Shamir-Adleman 1977 (and Clifford Cocks 1973, independently at GCHQ, declassified 1997). Choose primes $p, q$ , compute $n = pq$ and $φ (n) = (p - 1) (q - 1)$ . Pick public exponent $e$ coprime to $φ (n)$ and private $d \equiv e^{- 1} (mod φ (n))$ . Public key $(n, e)$ ; private key $d$ .

Encryption: $c = m^{e} mod n$ . Decryption: $m = c^{d} mod n$ , valid by Euler’s theorem.
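The whole scheme fits in a few lines with toy parameters. The sketch below is textbook RSA in Python with no padding and tiny primes, so it is illustrative only and not secure; the function names are hypothetical, and `pow(e, -1, phi)` needs Python 3.8 or later.

```python
from math import gcd

def rsa_keygen(p: int, q: int, e: int):
    """Return ((n, e), d) for distinct primes p, q (primality is assumed)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime to phi(n)")
    d = pow(e, -1, phi)  # private exponent: inverse of e modulo phi(n)
    return (n, e), d

def rsa_encrypt(m: int, public_key) -> int:
    n, e = public_key
    return pow(m, e, n)

def rsa_decrypt(c: int, public_key, d: int) -> int:
    n, _ = public_key
    return pow(c, d, n)
```

With $p = 61$, $q = 53$, $e = 17$ one gets $n = 3233$ and $d = 2753$; the message $m = 65$ encrypts to $c = 2790$ and decrypts back to 65.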

Security rests on the hardness of integer factorisation. Best classical algorithm: General Number Field Sieve (Pollard 1988, Buhler-Lenstra-Pomerance 1993), with heuristic subexponential complexity $\exp(((64/9)^{1/3} + o(1)) (\log n)^{1/3} (\log \log n)^{2/3})$. Current key sizes: 2048-3072 bit for long-term security, 4096-bit for high-security.

12.2 Elliptic-curve cryptography

Neal Koblitz 1987 / Victor Miller 1985 (independent). Use $E (F_{p})$ for cryptographic primitives. Discrete log problem: given $P, Q \in E (F_{p})$ , find $k$ with $Q = k P$ .

Best generic attack: Pollard rho, $O(\sqrt{n})$ group operations where $n = |E(F_{p})|$. No subexponential attack is known for “generic” curves (unlike $(Z / pZ)^{\times}$).

Practical curves: P-256 (NIST), secp256k1 (Bitcoin, Ethereum), Curve25519 (Daniel Bernstein 2005, used in TLS 1.3 / Signal). 256-bit ECC gives roughly the same security as 3072-bit RSA, with much smaller keys and faster operations.

12.3 Pairing-based cryptography

Bilinear pairings on elliptic curves (Weil pairing, Tate pairing). Enable identity-based encryption (Boneh-Franklin 2001), short signatures (BLS: Boneh-Lynn-Shacham 2001), and many zk-SNARK constructions (Groth16, KZG commitments). Used in Ethereum precompiles (BN254, BLS12-381).

12.4 Lattice-based post-quantum

Shor’s algorithm (Peter Shor 1994) breaks RSA and ECC on a sufficiently large quantum computer. NIST PQC standardisation 2022-2024 selected lattice-based schemes:

CRYSTALS-Kyber (key encapsulation), based on Module-LWE.
CRYSTALS-Dilithium (signatures), based on Module-LWE + Module-SIS.
FALCON (signatures), based on NTRU lattices.

Hardness: Shortest Vector Problem (SVP) and Closest Vector Problem (CVP) on Euclidean lattices. Best classical algorithms: BKZ lattice reduction (Schnorr 1987, refined by Chen-Nguyen 2011), running time exponential in the lattice dimension.

12.5 Isogeny-based cryptography

Use isogenies between supersingular elliptic curves. SIDH (Jao-De Feo 2011) was broken in 2022 (Castryck-Decru attack, using Kani’s theorem). SQISign (De Feo-Kohel-Leroux-Petit-Wesolowski 2020) is unaffected by that attack and is a candidate in NIST’s additional call for digital signature schemes. CSIDH (Castryck-Lange-Martindale-Panny-Renes 2018) is a slower alternative that also remains unbroken.

12.6 Other crypto primitives from number theory

Diffie-Hellman key exchange (1976): $g^{ab}$ from $g^{a}, g^{b}$ . Multiplicative group of $F_{p}^{\times}$ or elliptic curve.
DSA / ECDSA: digital signatures.
Schnorr signatures: simpler than ECDSA and provably secure in the random-oracle model under the discrete-log assumption; used in Bitcoin Taproot (BIP340).
Paillier cryptosystem: additively homomorphic; foundation of older private-set-intersection protocols.
Hash-to-curve: maps inputs to elliptic-curve points; foundational for BLS signatures, VRFs.

13. Diophantine approximation

13.1 Liouville’s theorem

For an irrational algebraic $α$ of degree $d \geq 2$, there is a constant $c = c(α) > 0$ such that for all rationals $p / q$ with $q \geq 1$: $|α - \frac{p}{q}| > \frac{c}{q^{d}}.$ Used by Joseph Liouville 1844 to construct the first transcendental numbers (Liouville constant $\sum_{n \geq 1} 10^{-n!}$).

13.2 Roth’s theorem

Klaus Roth 1955 (Fields 1958): for any algebraic irrational $α$ and $ε > 0$, there are only finitely many $p / q$ with $|α - \frac{p}{q}| < \frac{1}{q^{2 + ε}}.$ The exponent $2$ is best possible (Dirichlet’s approximation theorem shows $|α - p / q| < 1 / q^{2}$ has infinitely many solutions for any irrational $α$). The result is ineffective: the proof gives no bound on the size of the solutions $p / q$.

13.3 Schmidt’s subspace theorem

Wolfgang Schmidt 1972: generalises Roth to simultaneous Diophantine approximations. Its far-reaching consequences include finiteness theorems, with bounds on the number (though not the size) of solutions, for $S$-unit equations and norm-form equations.

13.4 Faltings’s theorem (Mordell conjecture)

Gerd Faltings 1983 Inventiones Mathematicae 73 (Fields 1986): a smooth projective curve of genus $\geq 2$ over a number field has only finitely many rational points. Conjectured by Mordell 1922. Bombieri 1990 gave a different proof; Vojta 1991 gave a third. Effective bounds remain open.

14. Computational number theory

14.1 Primality testing

Fermat test: $a^{n - 1} \equiv 1 (mod n)$ — necessary but not sufficient (Carmichael numbers fool it).
Miller-Rabin (Gary Miller 1976, Michael Rabin 1980): probabilistic, $O((\log n)^{3})$ bit operations per round; a composite $n$ passes a round with probability $\leq 1/4$.
Solovay-Strassen (1977): probabilistic via Jacobi symbol.
AKS (Agrawal-Kayal-Saxena 2002 Annals of Math 160): first deterministic polynomial-time primality test. $\tilde{O}((\log n)^{15/2})$ in the published analysis, later improved to $\tilde{O}((\log n)^{6})$ (Lenstra-Pomerance). Theoretical; in practice slower than Miller-Rabin.
ECPP (Atkin 1986, Goldwasser-Kilian 1986): generates a primality certificate. Used for very large primes.
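As an illustration of the probabilistic tests above, here is a minimal Python sketch of Miller-Rabin with random bases. The name `is_probable_prime`, the default of 20 rounds, and the small trial-division prefilter are illustrative choices; a `True` result means "probably prime", not a proof.

```python
import random

def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin test; a composite n survives with probability <= 4**-rounds."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):  # quick trial division by small primes
        if n % p == 0:
            return n == p
    # Write n - 1 = d * 2**s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True
```

The Carmichael number $252601 = 41 \cdot 61 \cdot 101$ passes the Fermat test to base 2 but is rejected here.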

14.2 Integer factorisation

Trial division: $O(\sqrt{n})$ divisions.
Pollard rho (1975): $O (n^{1/4})$ expected, low memory.
Pollard $p - 1$ (1974): exploits smooth $p - 1$ .
Quadratic Sieve (Pomerance 1981): subexponential.
Number Field Sieve (Pollard 1988, GNFS 1993): subexponential $\exp(((64/9)^{1/3} + o(1)) (\log n)^{1/3} (\log \log n)^{2/3})$. Current record: RSA-250 (829 bits) factored 2020 by Boudot et al.
Shor’s algorithm (1994): polynomial-time on a quantum computer. Demonstrated Shor factorisations to date are of very small numbers (such as 15 and 21); the barrier is engineering, not theory.
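Of the classical methods listed, Pollard rho is the shortest to write down. The Python sketch below uses Floyd cycle detection with the iteration $x \mapsto x^{2} + c$; the starting value 2, the retry loop over $c$, and the names `pollard_rho` and `find_factor` are illustrative choices, and `find_factor` assumes its input is composite.

```python
from math import gcd

def pollard_rho(n: int, c: int = 1):
    """One rho attempt on composite n; returns a nontrivial factor or None."""
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n          # tortoise: one step
        y = (y * y + c) % n          # hare: two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d if d != n else None     # d == n means this c failed

def find_factor(n: int) -> int:
    """Retry with c = 1, 2, 3, ... until a factor appears (n must be composite)."""
    c = 1
    while True:
        d = pollard_rho(n, c)
        if d is not None:
            return d
        c += 1
```

On $8051 = 83 \cdot 97$ the first attempt returns 97 after three iterations, and the same routine splits the Fermat number $F_{5}$ almost instantly.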

14.3 Discrete logarithm

In $F_{p}^{\times}$: index calculus, similar complexity to GNFS for factoring. In small-characteristic finite fields ($F_{2^{n}}$, $F_{3^{n}}$): quasi-polynomial algorithms (Barbulescu-Gaudry-Joux-Thomé, Eurocrypt 2014).

In generic groups (including most elliptic curves): Pollard rho, $O(\sqrt{n})$ group operations. No subexponential attack known for properly chosen elliptic curves.

14.4 Algorithmic tools

Lenstra-Lenstra-Lovász (LLL) lattice basis reduction (Lenstra-Lenstra-Lovász 1982): polynomial-time approximation of shortest vector. Foundation of practical lattice cryptanalysis and many integer-relation algorithms.
Coppersmith’s method (1996): finding small roots of polynomials mod $N$ , leveraged in lattice attacks on RSA with small private exponent or small messages.
PARI/GP, SageMath, Magma, FLINT: software for computational number theory.

15. Open problems

Riemann hypothesis and the generalised RH for $L$ -functions.
Goldbach’s conjecture (1742): every even $n \geq 4$ is a sum of two primes. Vinogradov 1937 proved every sufficiently large odd $n$ is a sum of three primes; Helfgott 2013 ternary Goldbach for all odd $n \geq 7$ .
Twin prime conjecture: infinitely many $(p, p + 2)$ both prime. Polymath8 and Maynard show that prime gaps of at most 246 occur infinitely often.
abc conjecture (Oesterlé-Masser 1985): for $a + b = c$ coprime, $c ≪_{ε} rad (ab c)^{1 + ε}$ . Shinichi Mochizuki’s 2012 IUT proof is widely disputed.
Birch-Swinnerton-Dyer conjecture.
Langlands program in full generality.
Class number 1 problem for real quadratic fields.
Catalan’s conjecture: $8$ and $9$ are the only consecutive perfect powers. No longer open: proved by Mihailescu in 2002.
Schanuel’s conjecture: transcendence-theoretic; implies many open transcendence results.

16. Connections to other libraries

Algebra: number theory is a major consumer of group theory (Galois groups), ring theory (Dedekind domains), and homological algebra (group cohomology).
Algebraic geometry: schemes over $Z$ , étale cohomology, arithmetic schemes; see algebraic-geometry-foundations.
Complex analysis: $ζ$ , $L$ -functions, modular forms; see complex-analysis.
Combinatorics: additive combinatorics (Gowers, Tao); see graph-theory.
Computer science: cryptography, complexity (factoring is in NP $\cap$ co-NP but not known P or NP-complete), pseudo-random number generation.
Physics: quasi-crystal diffraction (Meyer sets and Pisot numbers); quantum chaos and the statistics of zeta zeros on the critical line.
