Functional Analysis — Banach + Hilbert Spaces, Operators, Spectral Theory, PDEs

Functional analysis is the study of infinite-dimensional vector spaces equipped with topology — normed spaces, Banach spaces, Hilbert spaces — and the continuous linear maps between them. It is the language of partial differential equations, quantum mechanics, signal processing, kernel methods, and modern probability. This note maps the territory: spaces, operators, spectral theory, distributions, Sobolev spaces, RKHS, and connections to applied domains. SI units where physical quantities appear.


1. Topological vector spaces and norms

1.1 Vector spaces

Take a real or complex vector space X. We need a topology compatible with addition and scalar multiplication. The cleanest case: a norm.

A norm ||·||: X → [0,∞) satisfies:

  1. ||x|| ≥ 0, and ||x|| = 0 ⇔ x = 0 (positive definite).
  2. ||αx|| = |α| · ||x|| (homogeneous).
  3. ||x + y|| ≤ ||x|| + ||y|| (triangle).

A norm induces a metric d(x,y) = ||x - y|| and hence a topology.

1.2 Banach and Hilbert spaces

  • Banach space: a normed vector space that is complete under its norm — every Cauchy sequence converges in the space. Stefan Banach’s 1932 Théorie des opérations linéaires is the foundational text.
  • Inner product space: ⟨·,·⟩: X × X → C linear in the first argument, conjugate-symmetric, positive-definite. Induces norm ||x|| = sqrt(⟨x,x⟩).
  • Hilbert space: complete inner-product space. Named after David Hilbert; the modern abstract definition is due to John von Neumann (1929, “Allgemeine Eigenwerttheorie Hermitescher Funktionaloperatoren”, Math. Ann. 102).

Every Hilbert space is Banach. Not every Banach space is Hilbert: the norm must satisfy the parallelogram law

||x + y||² + ||x - y||² = 2(||x||² + ||y||²)

for it to come from an inner product (Jordan-von Neumann 1935).
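
The parallelogram law is easy to test numerically. The sketch below (helper names `p_norm` and `parallelogram_defect` are illustrative, not from any library) evaluates the defect for the ℓ^1, ℓ^2 and sup norms on R^2; only the ℓ^2 norm should give zero.

```python
def p_norm(x, p):
    """l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def parallelogram_defect(x, y, p):
    """||x+y||^2 + ||x-y||^2 - 2(||x||^2 + ||y||^2) in the l^p norm."""
    s = [a + b for a, b in zip(x, y)]
    d = [a - b for a, b in zip(x, y)]
    lhs = p_norm(s, p) ** 2 + p_norm(d, p) ** 2
    rhs = 2 * (p_norm(x, p) ** 2 + p_norm(y, p) ** 2)
    return lhs - rhs

x, y = [1.0, 0.0], [0.0, 1.0]
defects = {p: parallelogram_defect(x, y, p) for p in (1, 2, float("inf"))}
for p, d in defects.items():
    print(p, d)
```

A single pair with nonzero defect is enough to show that a norm does not come from an inner product; a zero defect on one pair proves nothing by itself.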


2. Canonical Banach spaces

2.1 L^p spaces

For a measure space (Ω, Σ, μ) and 1 ≤ p < ∞:

L^p(Ω) = { f measurable : ||f||_p = (∫ |f|^p dμ)^{1/p} < ∞ } / a.e. equality

For p = ∞: ||f||_∞ = ess sup |f|. All L^p are Banach; only L^2 is Hilbert (with ⟨f,g⟩ = ∫ f ḡ dμ).

  • L^1: integrable functions. On a finite measure space, L^p ⊆ L^1 for every p ≥ 1.
  • L^2: square-integrable; the centrepiece of analysis — Fourier theory, quantum mechanics, signal processing.
  • L^∞: essentially bounded.

Hölder’s inequality: ||fg||_1 ≤ ||f||_p · ||g||_q with 1/p + 1/q = 1. Minkowski’s inequality: triangle inequality for ||·||_p.
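
A quick numerical check of Hölder's inequality for the counting measure on a finite set (so integrals become sums). The function name `holder_gap` and the random test data are assumptions made for illustration.

```python
import random

def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_gap(f, g, p):
    """||f||_p ||g||_q - ||fg||_1 with 1/p + 1/q = 1 (requires p > 1); Hoelder says this is >= 0."""
    q = p / (p - 1.0)
    fg = [a * b for a, b in zip(f, g)]
    return p_norm(f, p) * p_norm(g, q) - p_norm(fg, 1)

random.seed(0)
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]
gaps = [holder_gap(f, g, p) for p in (1.5, 2.0, 3.0)]
equality_case = holder_gap(f, f, 2.0)   # p = q = 2 and g = f: Cauchy-Schwarz with equality
print(gaps, equality_case)
```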

2.2 Sequence spaces

ℓ^p = { (x_n) : Σ |x_n|^p < ∞ }    (Banach; Hilbert only for p=2)
c_0   = { (x_n) : x_n → 0 }         (sup norm; Banach, not reflexive)
c     = convergent sequences         (Banach)

2.3 Continuous function spaces

C[a,b] with sup norm ||f||_∞ = max |f|: Banach but not Hilbert.

2.4 Hölder spaces

C^{0,α}(Ω) for 0 < α ≤ 1:

||f||_{C^{0,α}} = ||f||_∞ + sup_{x≠y} |f(x) - f(y)| / |x - y|^α

α = 1 is Lipschitz. Used in elliptic regularity theory (Schauder estimates).

2.5 BV (bounded variation)

Functions with finite total variation. Banach; central to image processing (Rudin-Osher-Fatemi 1992 ROF total-variation denoising) and conservation-law theory.

2.6 Sobolev spaces

Treated in detail in §6. Briefly: W^{k,p}(Ω) = functions whose weak derivatives up to order k are in L^p(Ω). H^k = W^{k,2}, Hilbert.


3. Hilbert space geometry

3.1 Orthogonality and projection

x ⊥ y if ⟨x,y⟩ = 0. For closed subspace M ⊆ H, every x ∈ H decomposes uniquely as x = m + m^⊥ with m ∈ M, m^⊥ ∈ M^⊥. The map x ↦ m is the orthogonal projection P_M, self-adjoint and idempotent.
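
In finite dimensions the decomposition can be computed directly. The following is a minimal sketch in R^3, assuming M is given by linearly independent spanning vectors; `orthonormalise` and `project` are illustrative helper names.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def orthonormalise(vectors):
    """Gram-Schmidt; assumes the input vectors are linearly independent."""
    basis = []
    for v in vectors:
        w = list(v)
        for e in basis:
            c = dot(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        n = dot(w, w) ** 0.5
        basis.append([wi / n for wi in w])
    return basis

def project(x, basis):
    """P_M x = sum_k <x, e_k> e_k for an orthonormal basis (e_k) of M."""
    m = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)
        m = [mi + c * ei for mi, ei in zip(m, e)]
    return m

M = orthonormalise([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
x = [3.0, -1.0, 2.0]
m = project(x, M)
m_perp = [a - b for a, b in zip(x, m)]   # component in the orthogonal complement
print(m, m_perp)
```

The residual `m_perp` is orthogonal to every basis vector of M, and applying `project` twice changes nothing, which are the two defining properties stated above.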

3.2 Orthonormal bases

A set {e_α} is orthonormal if ⟨e_α, e_β⟩ = δ_{αβ}. Complete if span{e_α} is dense. Every separable Hilbert space has a countable orthonormal basis; every separable infinite-dim Hilbert space is isometrically isomorphic to ℓ^2.

Classical orthonormal systems on intervals (after normalisation):

  • Fourier: e_n(x) = e^{inx}/sqrt(2π) on [-π, π]. See fft-spectral.
  • Hermite: H_n(x) e^{-x²/2} on R (weight e^{-x²} natural).
  • Legendre: P_n on [-1,1].
  • Chebyshev: T_n(cos θ) = cos(nθ), orthogonal on [-1,1] with weight (1 - x²)^{-1/2}.
  • Laguerre: L_n(x) e^{-x/2} on [0,∞).
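
As a concrete check of one of these systems, the sketch below builds the Gram matrix of the first four Legendre polynomials in L^2(-1,1) by composite Simpson quadrature. The expected values are ⟨P_m, P_n⟩ = 2/(2n+1) δ_{mn}, so the normalised functions are sqrt((2n+1)/2) P_n. Function names are illustrative.

```python
def legendre(n, x):
    """P_n(x) via Bonnet's recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def inner(f, g, a=-1.0, b=1.0, m=2000):
    """L^2(a,b) inner product of real functions by composite Simpson (m even)."""
    h = (b - a) / m
    total = f(a) * g(a) + f(b) * g(b)
    for i in range(1, m):
        x = a + i * h
        total += (4 if i % 2 else 2) * f(x) * g(x)
    return total * h / 3

gram = [[inner(lambda x: legendre(i, x), lambda x: legendre(j, x))
         for j in range(4)] for i in range(4)]
for row in gram:
    print(["%.6f" % v for v in row])
```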

3.3 Riesz representation theorem

For Hilbert H, every bounded linear functional f: H → C has the form f(x) = ⟨x, y_f⟩ for a unique y_f ∈ H, with ||f|| = ||y_f||. The map y ↦ ⟨·,y⟩ is a conjugate-linear isometry H → H^*.

3.4 Weak convergence

x_n ⇀ x if ⟨x_n, y⟩ → ⟨x, y⟩ for all y ∈ H. Every bounded sequence in H has a weakly convergent subsequence (a consequence of Banach-Alaoglu, §5). Crucial for PDE existence proofs.
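
A standard example of weak but not strong convergence is x_n(t) = sin(nt) in L^2(0, 2π). The sketch below pairs x_n against one fixed element g(t) = t (chosen only for illustration; testing against a single g does not prove weak convergence) and compares with the norm, which stays constant.

```python
import math

def simpson(f, a, b, m=20000):
    """Composite Simpson rule with m (even) subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pairing(n, g):
    """<x_n, g> in L^2(0, 2*pi) for x_n(t) = sin(n t) and real g."""
    return simpson(lambda t: math.sin(n * t) * g(t), 0.0, 2 * math.pi)

g = lambda t: t
pairings = {n: pairing(n, g) for n in (1, 10, 100)}      # exact value is -2*pi/n, tending to 0
norms_sq = {n: simpson(lambda t: math.sin(n * t) ** 2, 0.0, 2 * math.pi)
            for n in (1, 10, 100)}                        # exact value is pi for every n
print(pairings)
print(norms_sq)
```

The pairings shrink like 1/n while ||x_n||² stays at π, so the sequence cannot converge to 0 in norm.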


4. Bounded linear operators

4.1 Definition and norm

T: X → Y linear is bounded if

||T|| = sup{ ||Tx||_Y : ||x||_X ≤ 1 } < ∞

For linear maps, bounded ⇔ continuous.

B(X,Y) is the space of bounded operators, a Banach space under the operator norm whenever Y is complete; B(X) = B(X,X). For Hilbert H, B(H) is a C*-algebra.
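
For a real matrix acting between Euclidean spaces, the operator norm is the square root of the largest eigenvalue of TᵀT. A minimal sketch using power iteration follows; it assumes T is nonzero and that the all-ones start vector is not orthogonal to the dominant eigenvector, and `operator_norm` is an illustrative name.

```python
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def operator_norm(T, iters=200):
    """Euclidean operator norm: power iteration on T^T T, then ||T x|| at the maximiser."""
    Tt = transpose(T)
    x = [1.0] * len(T[0])
    for _ in range(iters):
        x = matvec(Tt, matvec(T, x))
        n = sum(t * t for t in x) ** 0.5
        x = [t / n for t in x]
    Tx = matvec(T, x)
    return sum(t * t for t in Tx) ** 0.5

T = [[3.0, 0.0], [4.0, 5.0]]     # T^T T = [[25, 20], [20, 25]], eigenvalues 45 and 5
norm_T = operator_norm(T)
print(norm_T)                    # compare with sqrt(45)
```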

4.2 Adjoint

For T ∈ B(H), the adjoint T* is the unique operator with ⟨Tx, y⟩ = ⟨x, T*y⟩ for all x,y. Properties: (αT + βS)* = ᾱT* + β̄S*, (TS)* = S*T*, T** = T, ||T*|| = ||T||, ||T*T|| = ||T||² (the C*-identity).
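
On C^n with the inner product linear in the first argument, the adjoint is the conjugate transpose. The sketch below checks the defining identity on one arbitrarily chosen matrix and pair of vectors (all values are illustrative).

```python
def inner(u, v):
    """<u, v>, linear in the first argument and conjugate-linear in the second."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose of a complex matrix."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

T = [[1 + 2j, 3j], [2 - 1j, 4 + 0j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 3j]
lhs = inner(matvec(T, x), y)            # <Tx, y>
rhs = inner(x, matvec(adjoint(T), y))   # <x, T*y>
print(lhs, rhs)
```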

4.3 Special classes

  • Self-adjoint (Hermitian): T = T*. Real spectrum.
  • Normal: TT* = T*T. Spectral theorem applies.
  • Unitary: T*T = TT* = I. Preserves inner product.
  • Positive: ⟨Tx, x⟩ ≥ 0. Has a unique positive square root.
  • Projection (orthogonal): T = T* = T².
  • Isometry: ||Tx|| = ||x||. Unitary if also surjective.

4.4 Compact operators

K ∈ B(X,Y) is compact if it maps bounded sets to relatively compact (precompact) sets. Equivalently: for every bounded sequence (x_n), the image sequence (Kx_n) has a convergent subsequence.

Properties:

  • Compact operators form a closed two-sided ideal K(X) in B(X).
  • Limit (in operator norm) of finite-rank operators ⇒ compact. (Converse holds in Hilbert space — not in every Banach; counter-example: Per Enflo 1973.)
  • For Hilbert H and self-adjoint compact K: countable real eigenvalues λ_n → 0 and an orthonormal eigenbasis (Hilbert-Schmidt theorem).

Sub-classes:

  • Hilbert-Schmidt T: ||T||_{HS}² = Σ ||T e_n||² < ∞ (independent of basis). Forms a Hilbert space.
  • Trace class T: ||T||_1 = tr(|T|) < ∞. A strictly stronger condition: every trace-class operator is Hilbert-Schmidt, and ||T||_{HS}² = tr(T*T). Dual of compact operators is trace class.

Hilbert-Schmidt and trace-class operators show up everywhere in kernel methods and quantum statistical mechanics. See kernel-methods.


5. The four cornerstones

5.1 Hahn-Banach theorem

Hans Hahn 1927, Stefan Banach 1929: any bounded linear functional defined on a subspace of a normed space extends to the whole space with the same norm. Consequence: the dual X* “sees” enough, in the sense that X* separates points of X.

Geometric / separation forms: any two disjoint convex sets (one open) can be separated by a hyperplane. Foundation of duality, optimisation, weak topology.

5.2 Open Mapping Theorem

A bounded surjective linear operator between Banach spaces is open. Corollary (Banach Isomorphism Theorem): bijective bounded linear operator between Banach spaces has bounded inverse.

5.3 Closed Graph Theorem

For Banach spaces X and Y, a linear operator defined on all of X with closed graph in X × Y is bounded. Hugely useful: it lets you prove boundedness by checking graph closure, which is often easier than checking continuity directly.

5.4 Uniform Boundedness Principle (Banach-Steinhaus)

A pointwise-bounded family of bounded operators on a Banach space is uniformly norm-bounded. Hugo Steinhaus 1927. Underlies many “if it works on a dense set, it works everywhere” arguments.

5.5 Banach-Alaoglu

The closed unit ball of X* is compact in the weak* topology. Foundational for variational PDE theory: extract weakly convergent subsequences from energy-bounded sequences.


6. Sobolev spaces and embeddings

6.1 Weak derivatives

For f ∈ L^1_{loc}(Ω), g is the weak α-th partial derivative if

∫ f · ∂^α φ dx = (-1)^|α| ∫ g · φ dx     for all φ ∈ C_c^∞(Ω).
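
The definition can be tested numerically in one dimension. The sketch below takes f(x) = |x| on (-1, 1) with candidate weak derivative g(x) = sign(x), and compares both sides of the identity for one smooth compactly supported test function. The particular φ and the midpoint quadrature are assumptions of this example; one test function illustrates the identity but does not prove it.

```python
import math

def bump(x):
    """Standard smooth bump supported in (-1, 1)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def phi(x):
    return bump(x) * (x + 0.3)           # asymmetric test function, still smooth with compact support

def dphi(x):
    if abs(x) >= 1:
        return 0.0
    db = bump(x) * (-2.0 * x / (1.0 - x * x) ** 2)   # derivative of the bump
    return db * (x + 0.3) + bump(x)

def midpoint(f, a, b, m=20000):
    h = (b - a) / m
    return h * sum(f(a + (i + 0.5) * h) for i in range(m))

f = abs
g = lambda x: 1.0 if x > 0 else -1.0

# Integrate separately on each side of the kink at 0.
lhs = midpoint(lambda x: f(x) * dphi(x), -1, 0) + midpoint(lambda x: f(x) * dphi(x), 0, 1)
rhs = -(midpoint(lambda x: g(x) * phi(x), -1, 0) + midpoint(lambda x: g(x) * phi(x), 0, 1))
print(lhs, rhs)
```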

6.2 Definition

W^{k,p}(Ω) = { f ∈ L^p(Ω) : ∂^α f ∈ L^p(Ω) for all |α| ≤ k }
||f||_{W^{k,p}} = (Σ_{|α| ≤ k} ||∂^α f||_p^p)^{1/p}

H^k(Ω) := W^{k,2}(Ω) is Hilbert.

W_0^{k,p}(Ω) = closure of C_c^∞(Ω) in the W^{k,p} norm: functions vanishing on ∂Ω in a weak sense.

6.3 Sobolev embedding theorem (Sergei Sobolev 1938)

For Ω ⊂ R^n Lipschitz, with kp < n:

W^{k,p}(Ω) ↪ L^{q}(Ω)   for   1/q = 1/p - k/n

For kp = n: embedding into all L^q, q < ∞. For kp > n with k - n/p not an integer: embedding into C^{m, α} with m + α = k - n/p, m a non-negative integer and 0 < α < 1.
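
The exponent bookkeeping is mechanical, so a small helper can classify the three regimes. This is a sketch only: the name `sobolev_target` and its return convention are invented for illustration, and the integer case of k - n/p is deliberately not handled.

```python
from fractions import Fraction

def sobolev_target(k, p, n):
    """Classify the embedding of W^{k,p} on a Lipschitz domain in R^n.

    Returns ("Lq", q) when kp < n, ("all Lq", None) when kp = n,
    and ("Holder", (m, alpha)) when kp > n and k - n/p is not an integer.
    """
    k, p, n = Fraction(k), Fraction(p), Fraction(n)
    if k * p < n:
        return ("Lq", 1 / (1 / p - k / n))
    if k * p == n:
        return ("all Lq", None)
    s = k - n / p                        # total Hoelder regularity m + alpha
    m = s.numerator // s.denominator     # integer part
    alpha = s - m
    if alpha == 0:
        raise ValueError("k - n/p is an integer: case not covered by this sketch")
    return ("Holder", (m, alpha))

print(sobolev_target(1, 2, 3))   # H^1 in R^3
print(sobolev_target(2, 2, 3))   # H^2 in R^3
print(sobolev_target(1, 2, 2))   # H^1 in R^2, borderline
```

For example, H^1 in three dimensions lands in L^6, and H^2 in three dimensions lands in C^{0,1/2}.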

Rellich-Kondrachov compactness: for bounded Ω and kp < n, the embedding W^{k,p}(Ω) ↪ L^r(Ω) is compact for every r strictly below the critical exponent q (bounded sets are precompact in the target space). Essential for nonlinear PDE.

6.4 Trace theorem

For f ∈ H^1(Ω), the boundary trace f|_{∂Ω} exists in H^{1/2}(∂Ω). Defines boundary conditions weakly.

6.5 Poincaré inequality

For f ∈ W_0^{1,p}(Ω), Ω bounded:

||f||_p ≤ C(Ω,p) · ||∇f||_p

Underlies coercivity of variational problems.


7. Lax-Milgram and weak solutions

7.1 Lax-Milgram theorem (Peter Lax & Arthur Milgram 1954)

Let H be Hilbert, a: H × H → R bilinear with:

  • Continuity: |a(u,v)| ≤ M ||u|| ||v||.
  • Coercivity: a(u,u) ≥ α ||u||² for some α > 0.

Then for every f ∈ H* there is a unique u ∈ H with a(u,v) = f(v) for all v ∈ H, and ||u|| ≤ ||f||/α.

7.2 Application

For the Dirichlet problem -Δu = f on Ω with u = 0 on ∂Ω:

  • H = H_0^1(Ω).
  • a(u,v) = ∫_Ω ∇u · ∇v dx.
  • Coercivity from Poincaré.

Lax-Milgram gives existence and uniqueness of the weak solution. This is the entire backbone of variational PDE — FEM (see ode-pde-solver-catalog) is the discretisation of exactly this weak form.

7.3 Galerkin approximation

Replace H by finite-dim subspace H_n and solve a(u_n, v) = f(v) for v ∈ H_n. Convergence follows from Céa’s lemma: ||u - u_n|| ≤ (M/α) inf_{w ∈ H_n} ||u - w||.
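
A minimal one-dimensional sketch of the Galerkin method: piecewise-linear hat functions on a uniform mesh for -u'' = f on (0,1) with zero boundary values. The load integrals are approximated by nodal quadrature (an assumption of this example, which makes the system coincide with the classical finite-difference scheme), and the test problem with exact solution sin(πx) is chosen for illustration.

```python
import math

def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm (no pivoting; adequate for this symmetric positive definite matrix)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def galerkin_poisson_1d(f, n):
    """P1 Galerkin for -u'' = f on (0,1), u(0) = u(1) = 0, with n interior nodes."""
    h = 1.0 / (n + 1)
    nodes = [(i + 1) * h for i in range(n)]
    diag = [2.0 / h] * n                 # a(phi_i, phi_i)
    off = [-1.0 / h] * n                 # a(phi_i, phi_{i+1}) and a(phi_i, phi_{i-1})
    load = [h * f(x) for x in nodes]     # f(phi_i) by nodal quadrature
    return nodes, solve_tridiagonal(off, diag, off, load)

f = lambda x: math.pi ** 2 * math.sin(math.pi * x)   # exact solution: sin(pi x)
nodes, u_h = galerkin_poisson_1d(f, 99)
max_err = max(abs(u - math.sin(math.pi * x)) for x, u in zip(nodes, u_h))
print(max_err)
```

Doubling the number of nodes should reduce `max_err` by roughly a factor of four, the second-order nodal convergence usual for this scheme.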


8. Distributions

8.1 Test functions and distributions

Laurent Schwartz (Fields Medal 1950 for distribution theory) developed the modern framework.

D(Ω) = C_c^∞(Ω) with the inductive limit topology (sequence convergent if supports in fixed compact set + uniform convergence of all derivatives).

Distribution: continuous linear functional on D(Ω). Space D'(Ω).

Every L^1_{loc} function defines a distribution by ⟨f, φ⟩ = ∫ f φ. Examples beyond functions:

  • Dirac delta δ_a: ⟨δ_a, φ⟩ = φ(a).
  • Principal value pv(1/x): ⟨pv(1/x), φ⟩ = lim_{ε→0} ∫_{|x|>ε} φ(x)/x dx.
  • Derivatives of Dirac δ^{(k)}: ⟨δ^{(k)}, φ⟩ = (-1)^k φ^{(k)}(0).

8.2 Distributional derivative

Every distribution is infinitely differentiable: ⟨∂T, φ⟩ = -⟨T, ∂φ⟩. Heaviside step H has H' = δ.

8.3 Tempered distributions

Test space S(R^n) = Schwartz functions (rapidly decreasing smooth). Dual S' = tempered distributions. The Fourier transform extends from S to S':

⟨F(T), φ⟩ = ⟨T, F(φ)⟩

Useful Fourier transforms in S':

  • F(δ) = 1, F(1) = (2π)^n δ.
  • F(x^α) = i^|α| (2π)^n ∂^α δ.
  • F(e^{ia·x}) ∝ δ(ξ - a).

9. Spectral theory

9.1 Spectrum

For T ∈ B(X), the spectrum is

σ(T) = { λ ∈ C : T - λI is not invertible in B(X) }

Decomposes:

  • Point spectrum σ_p(T): eigenvalues.
  • Continuous spectrum σ_c(T): T - λI injective with dense range but not surjective.
  • Residual spectrum σ_r(T): T - λI injective but range not dense.

The spectrum is non-empty, compact, contained in {|λ| ≤ ||T||}. Spectral radius: r(T) = lim ||T^n||^{1/n} ≤ ||T||.
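
The gap between r(T) and ||T|| shows up already for a 2×2 Jordan-type matrix. The sketch below evaluates ||T^n||^{1/n} in the max-row-sum operator norm (any operator norm gives the same limit); the matrix is chosen for illustration.

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def norm_inf(A):
    """Operator norm induced by the sup norm on vectors: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def gelfand_estimates(T, powers):
    """Return {n: ||T^n||^(1/n)} for the requested powers n."""
    out, P, n = {}, [row[:] for row in T], 1
    for target in sorted(powers):
        while n < target:
            P = matmul(P, T)
            n += 1
        out[target] = norm_inf(P) ** (1.0 / target)
    return out

T = [[0.5, 1.0], [0.0, 0.5]]     # only eigenvalue is 0.5, but ||T|| = 1.5
estimates = gelfand_estimates(T, [1, 10, 100, 400])
print(estimates)
```

The estimates decrease from 1.5 towards the spectral radius 0.5, but slowly, which is typical for non-normal operators.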

9.2 Compact self-adjoint operators

Hilbert-Schmidt theorem: for K = K* compact on Hilbert H, there is an orthonormal basis {e_n} of eigenvectors with real eigenvalues λ_n → 0, and K = Σ λ_n ⟨·, e_n⟩ e_n.

This is the abstract version of “every Hermitian matrix diagonalises by an orthonormal basis”, lifted to infinite dimensions but only for compact operators.

9.3 Spectral theorem for bounded self-adjoint operators

Multiplication operator form: for T = T* bounded on Hilbert H, there is a measure space (Ω, Σ, μ), a measurable function f: Ω → R, and a unitary U: H → L^2(Ω, μ) with

U T U^* = M_f       (multiplication by f).

Projection-valued measure form: there exists a unique spectral measure E: B(σ(T)) → projections with

T = ∫_{σ(T)} λ dE(λ)

These let you define g(T) = ∫ g(λ) dE(λ) for bounded Borel g — the functional calculus.

9.4 Unbounded self-adjoint operators

Most physical operators (position, momentum, Hamiltonians) are unbounded. Defined on a dense domain D(T) ⊂ H. Symmetric: ⟨Tx, y⟩ = ⟨x, Ty⟩ for x,y ∈ D(T). Self-adjoint: also D(T*) = D(T). Spectral theorem extends.

9.5 Stone’s theorem

Marshall Stone 1932: one-parameter strongly continuous unitary groups {U_t} on H are in bijection with self-adjoint operators via U_t = e^{itA}. The infinitesimal generator A = (1/i) dU_t/dt |_{t=0} is self-adjoint.

This is the abstract justification for unitary time evolution in quantum mechanics: iħ ∂ψ/∂t = Hψ ⇒ ψ(t) = e^{-iHt/ħ} ψ(0).
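
For a finite-dimensional Hamiltonian everything is bounded and the group can be computed explicitly. The sketch below uses a 2×2 real symmetric H in units with ħ = 1 (both choices are assumptions for illustration) and a truncated Taylor series for the exponential, then checks norm preservation and the group law.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def expm(A, terms=40):
    """Matrix exponential by truncated Taylor series; adequate for small matrices of modest norm."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def U(t, H):
    """U_t = exp(-i t H), in units with hbar = 1."""
    return expm([[-1j * t * h for h in row] for row in H])

H = [[1.0, 0.5], [0.5, -1.0]]            # real symmetric, hence self-adjoint
psi0 = [1.0 + 0j, 0.0j]
Ut = U(0.7, H)
psi_t = [sum(Ut[i][j] * psi0[j] for j in range(2)) for i in range(2)]
norm_sq = sum(abs(z) ** 2 for z in psi_t)   # stays equal to 1 under unitary evolution
print(norm_sq)
```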

9.6 Hille-Yosida theorem

Einar Hille 1948 / Kosaku Yosida 1948: characterises generators of strongly continuous (C_0) semigroups T_t = e^{tA}. The Laplacian generates the heat semigroup; the Stokes operator generates the Stokes semigroup, on which local well-posedness theory for Navier-Stokes is built. Foundation of evolution equations.

9.7 Spectral mapping theorem

For T self-adjoint (or normal) and g continuous on σ(T), σ(g(T)) = g(σ(T)).


10. Fredholm theory

10.1 Fredholm operators

T ∈ B(X,Y) is Fredholm if ker T and coker T are finite-dimensional and range is closed. Index:

ind(T) = dim ker T - dim coker T

The index is invariant under compact perturbations: ind(T + K) = ind(T) for compact K.

10.2 Fredholm alternative

For compact K on Hilbert H and the equation (I - K) x = y:

  • Either (I - K) is invertible (unique solution for every y), or
  • (I - K) has nontrivial kernel, and solutions exist iff y ⊥ ker(I - K*).

This generalises the finite-dimensional dichotomy for a square system Ax = y: either it is uniquely solvable for every y, or it is solvable exactly when y ⊥ ker A*.

10.3 Atiyah-Singer

Far-reaching: the index of an elliptic operator on a compact manifold is computed topologically (Atiyah-Singer 1963; Fields-class theorem). Outside the scope here.


11. Semigroups and evolution equations

11.1 C_0 semigroups

A family {T_t : t ≥ 0} on Banach X is a strongly continuous (C_0) semigroup if:

  • T_0 = I.
  • T_{t+s} = T_t T_s.
  • T_t x → x as t → 0^+ for every x ∈ X.

Generator A = lim_{t→0} (T_t - I)/t on its natural domain.

11.2 Examples

  • Heat semigroup on L^2(R^n): T_t f(x) = ∫ f(y) (4πt)^{-n/2} e^{-|x-y|²/(4t)} dy. Generator: Δ (the kernel for (1/2)Δ would be (2πt)^{-n/2} e^{-|x-y|²/(2t)}). Smooths instantly.
  • Schrödinger group: e^{itΔ} — unitary, not smoothing.
  • Wave: a related cosine family with generator Δ (for u_tt = Δu).
  • OU semigroup: Ornstein-Uhlenbeck, generator (1/2)Δ - x·∇; invariant Gaussian. See stochastic-calculus.
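
The heat-kernel formula can be sanity-checked in one dimension: applied to a centred Gaussian density of variance s, the kernel as written above returns a centred Gaussian density of variance s + 2t, which is the behaviour of solutions of u_t = Δu. The quadrature window and grid size below are assumptions of this sketch.

```python
import math

def heat_kernel(x, y, t):
    """One-dimensional heat kernel (4 pi t)^(-1/2) exp(-|x-y|^2 / (4t))."""
    return math.exp(-(x - y) ** 2 / (4.0 * t)) / math.sqrt(4.0 * math.pi * t)

def apply_heat(f, t, x, L=12.0, m=4000):
    """(T_t f)(x) by the midpoint rule on [-L, L]; assumes f decays fast enough to truncate."""
    h = 2.0 * L / m
    ys = (-L + (i + 0.5) * h for i in range(m))
    return h * sum(heat_kernel(x, y, t) * f(y) for y in ys)

def gaussian(var):
    """Density of a centred normal distribution with the given variance."""
    return lambda y: math.exp(-y * y / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

f0 = gaussian(1.0)
t, x = 0.5, 0.8
numeric = apply_heat(f0, t, x)
exact = gaussian(1.0 + 2.0 * t)(x)      # variance grows by 2t for this kernel
print(numeric, exact)
```

The same check gives the semigroup law for free: evolving for time t and then s adds 2t and then 2s to the variance, the same as evolving once for t + s.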

11.3 Contraction semigroups

||T_t|| ≤ 1. Hille-Yosida: A generates a contraction C_0 semigroup iff A is densely defined, closed, and (0,∞) ⊂ ρ(A) with ||(λ - A)^{-1}|| ≤ 1/λ.


12. Reproducing Kernel Hilbert Spaces

12.1 Definition

Nachman Aronszajn “Theory of reproducing kernels” (Trans. Amer. Math. Soc. 68, 1950): a Hilbert space H of functions on X is RKHS if point evaluation f ↦ f(x) is bounded for each x. By Riesz, there exists K_x ∈ H with f(x) = ⟨f, K_x⟩. The reproducing kernel is K(x,y) = ⟨K_x, K_y⟩.

12.2 Mercer’s theorem

James Mercer 1909: for K continuous, symmetric, positive-definite on compact X,

K(x,y) = Σ_n λ_n φ_n(x) φ_n(y)

with λ_n ≥ 0, {φ_n} orthonormal in L^2, convergence absolute and uniform.

12.3 Kernel machines

The “kernel trick” of SVMs replaces inner products with K(x,y) — effectively working in the RKHS. Common kernels:

Kernel            Formula
Linear            x · y
Polynomial        (x · y + c)^d
Gaussian (RBF)    exp(-||x - y||² / (2σ²))
Matérn            parameter ν; family of GP kernels
Laplace           exp(-||x - y|| / σ)
ANOVA             Σ exp(-(x_i - y_i)²/σ²)

See gaussian-processes for the GP/Bayesian-regression view and kernel-methods for kernel-machine algorithms.
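
A few of these kernels written out, together with a Cholesky-based check that a Gram matrix is positive definite. The bandwidth convention exp(-||x - y||²/(2σ²)) for the Gaussian kernel, the small diagonal jitter, and the sample points are assumptions of this sketch.

```python
import math

def linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial(x, y, c=1.0, d=2):
    return (linear(x, y) + c) ** d

def gaussian_rbf(x, y, sigma=1.0):
    """exp(-||x - y||^2 / (2 sigma^2)); the bandwidth convention is an assumption here."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def gram(kernel, points):
    return [[kernel(p, q) for q in points] for p in points]

def cholesky(A, jitter=1e-10):
    """Lower-triangular L with L L^T = A + jitter*I; raises ValueError if a pivot is not positive."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                s += jitter
                if s <= 0:
                    raise ValueError("matrix is not positive definite")
                L[i][i] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.5, 1.5)]
K = gram(gaussian_rbf, points)
L = cholesky(K)          # succeeds: the Gaussian Gram matrix of distinct points is positive definite
print(K[0])
```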

12.4 Representer theorem

Kimeldorf-Wahba 1971: the solution to regularised loss minimisation in an RKHS lies in the span of kernels at the training points. Reduces an infinite-dim optimisation to an n-dim one — the entire reason RKHS works computationally.
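
Kernel ridge regression is perhaps the simplest instance: for squared loss, the coefficients of the kernel expansion solve one n×n linear system. The sketch below assumes a one-dimensional Gaussian kernel, the objective (1/n) Σ (f(x_i) - y_i)² + λ||f||², and toy data sampled from sin; all names and parameter values are illustrative.

```python
import math

def rbf(x, y, sigma=0.5):
    return math.exp(-(x - y) ** 2 / (2.0 * sigma ** 2))

def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_kernel_ridge(xs, ys, lam=1e-6, kernel=rbf):
    """Solve (K + lam * n * I) alpha = y, so that f = sum_i alpha_i K(., x_i)."""
    n = len(xs)
    A = [[kernel(xi, xj) + (lam * n if i == j else 0.0) for j, xj in enumerate(xs)]
         for i, xi in enumerate(xs)]
    return solve(A, ys)

def predict(x, xs, alpha, kernel=rbf):
    return sum(a * kernel(x, xi) for a, xi in zip(alpha, xs))

xs = [-1.0 + 0.5 * i for i in range(11)]     # training inputs on [-1, 4]
ys = [math.sin(x) for x in xs]
alpha = fit_kernel_ridge(xs, ys)
print(predict(1.25, xs, alpha), math.sin(1.25))
```

The fitted function is stored as n numbers `alpha`, even though the optimisation is posed over the whole RKHS.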


13. Operator algebras

13.1 Banach algebras and C*-algebras

A Banach algebra is a Banach space with a compatible associative product. With an involution * satisfying ||a*a|| = ||a||² it is a C*-algebra.

Gelfand-Naimark theorem (1943): every commutative C*-algebra with unit is isometrically isomorphic to C(X) for some compact Hausdorff X (its spectrum). Non-commutative: every C*-algebra embeds as a closed *-subalgebra of B(H) for some Hilbert H.

13.2 von Neumann algebras

Francis Murray & John von Neumann 1936-1943: a *-subalgebra of B(H) closed in the weak-operator topology. Equivalent to its double commutant (M = M''). Factor types I, II, III; classification major theme of mid-20th-century math. Modern: free probability (Voiculescu), Connes’ work (Fields 1982).


14. Quantum mechanics formalism

This is the canonical applied home of functional analysis.

  • States: unit vectors ψ in Hilbert space H (up to phase); or density operators ρ (positive trace-class, tr ρ = 1) for mixed states.
  • Observables: self-adjoint operators on H. Spectrum = possible measurement outcomes.
  • Expectation: ⟨A⟩_ψ = ⟨ψ, Aψ⟩.
  • Born rule: probability of measuring λ ∈ B is ⟨ψ, E(B) ψ⟩ where E is the spectral measure of A.
  • Time evolution: ψ(t) = U_t ψ(0) with U_t = e^{-iHt/ħ} unitary (Stone). Hamiltonian H self-adjoint.
  • Canonical commutation: [X, P] = iħI, realisable only by unbounded operators; the Stone-von Neumann theorem (1931) shows the Weyl form of the relations has a unique irreducible representation up to unitary equivalence.
  • Compact symmetries / coherent states / second quantisation: built on Fock space F(H) = ⊕_n H^{⊗_s n}.

15. Pseudo-differential operators and microlocal analysis

A pseudo-differential operator of order m has symbol a(x, ξ) with |∂_x^α ∂_ξ^β a| ≤ C_{αβ} (1 + |ξ|)^{m-|β|}:

(P f)(x) = (2π)^{-n} ∫ e^{ix·ξ} a(x,ξ) F(f)(ξ) dξ

Lars Hörmander (Fields 1962) systematised the theory in his four-volume The Analysis of Linear Partial Differential Operators (1983-1985). Used in PDE regularity, index theory, and propagation of singularities.


16. Famous classical theorems

  • Stone-Weierstrass (1885 / 1937): a unital subalgebra of C(K), K compact Hausdorff, that separates points (and, for complex scalars, is closed under complex conjugation) is dense in the sup norm.
  • Arzelà-Ascoli (1883/1895): bounded equicontinuous family in C(K) is relatively compact (uniform convergence).
  • Mazur’s theorem: a convex set in a normed space has the same closure under the norm and weak topologies; in particular, norm-closed convex sets are weakly closed.
  • Krein-Milman (1940): a compact convex set in a locally convex space is the closed convex hull of its extreme points.
  • Stampacchia variational inequality (1964): for closed convex K ⊂ H and coercive bilinear a, there exists unique u ∈ K with a(u, v - u) ≥ f(v - u) for all v ∈ K. Foundation of variational inequalities / obstacle problems.
  • Banach fixed point theorem (1922): contraction on complete metric space has unique fixed point. Picard iteration, ODE existence.
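
The Banach fixed point theorem is constructive: iterating the map converges from any starting point. A minimal sketch, using g = cos on [0, 1] (a contraction there, since |g'| ≤ sin 1 < 1); the function name and stopping rule are illustrative choices.

```python
import math

def fixed_point(g, x0, tol=1e-12, max_iter=10_000):
    """Iterate x_{k+1} = g(x_k) until successive iterates differ by less than tol."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new, k
        x = x_new
    raise RuntimeError("no convergence within max_iter")

x_star, iterations = fixed_point(math.cos, 1.0)
print(x_star, iterations)
```

Picard iteration for ODEs is the same loop with g replaced by the integral operator u ↦ u_0 + ∫ F(s, u(s)) ds on a space of continuous functions.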

17. Connections to applied mathematics

  • PDE theory: variational formulations, semigroup theory, energy methods — all functional-analytic. See pde-methods, ode-pde-solver-catalog.
  • Signal processing: L^2 Fourier theory, wavelets, frames. See fft-spectral.
  • Probability theory: L^2 random variables, conditional expectation as projection, Karhunen-Loève (compact-operator decomposition of covariance). See probability-fundamentals.
  • Optimisation in Hilbert spaces: convex analysis, subdifferentials, proximal operators. See convex-optimization.
  • Inverse problems: regularisation in Hilbert spaces, Tikhonov, SVD-based truncation. See svd-pca-spectral.
  • Numerical PDEs: FEM is Galerkin in H^1. See numerical-methods-reference.

18. Adjacent