Learn Next — Math Recommendation Graph
If you’ve worked through one Math note, what should you read next to gain the most leverage? This guide is a learning-path overlay on top of the per-topic notes and the two _compare_* synthesis notes (optimization methods, probability frameworks). The recommendations follow the natural dependency structure of mathematics: linear algebra and probability are foundational; calculus + ODE/PDE + optimization are the engines; measure theory + functional analysis + complex analysis sit above as the rigorizing layer.
How to use this guide
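For each per-topic note, you get one to three “next” recommendations, each tagged with one of the roles below. A few entries are instead tagged (adjacent), meaning a sibling topic at the same level, and “cross-library” marks a target note that lives outside the Math library: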
- (foundation) — a layer below this one you need to make full sense of it.
- (extension) — same idea generalized to harder cases (curved spaces, infinite-dimensional, dependent variables).
- (application) — how the topic lands in Compute, Engineering, or Finance.
- (synthesis) — a _compare_* note that ties this topic into a wider decision space.
The closing Reading paths section composes them into named multi-step tracks (ML researcher, Quant trader, Control engineer, Statistician, Pure mathematician).
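One way to make the overlay concrete is to treat it as a tagged directed graph. The sketch below is only an illustration: `RECOMMENDATIONS` and `next_notes` are hypothetical names that do not exist in the library, and only a handful of edges from the lists in this guide are encoded.

```python
# Hypothetical encoding of a few recommendations from this guide as a tagged graph.
# Keys are source notes; values are (target note, tag) pairs copied from the lists below.
RECOMMENDATIONS = {
    "linear-algebra-essentials": [
        ("numerical-linear-algebra", "extension"),
        ("svd-pca-spectral", "application"),
        ("eigenvalue-problems", "extension"),
    ],
    "bayesian-inference": [
        ("mcmc-sampling", "application"),
        ("variational-inference", "extension"),
        ("_compare_probability-frameworks", "synthesis"),
    ],
    "mcmc-sampling": [
        ("markov-chains-and-hmm", "foundation"),
        ("variational-inference", "extension"),
        ("stochastic-calculus", "extension"),
    ],
}


def next_notes(note, tag=None):
    """Return the recommended next notes for `note`, optionally filtered by tag."""
    edges = RECOMMENDATIONS.get(note, [])
    return [target for target, edge_tag in edges if tag is None or edge_tag == tag]


if __name__ == "__main__":
    print(next_notes("bayesian-inference"))
    print(next_notes("mcmc-sampling", tag="foundation"))
```

Filtering by tag answers the two most common questions separately: "what do I need underneath this?" (foundation) and "where does this go next?" (extension or application).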
Foundational track — linear algebra + calculus + probability
From linear-algebra-essentials
- → numerical-linear-algebra (extension): What changes when matrices live in float32 — conditioning, stability, BLAS, QR, Cholesky, iterative methods (CG, GMRES).
- → svd-pca-spectral (application): The single most-used matrix decomposition in ML / statistics / signal processing.
- → eigenvalue-problems (extension): The deeper structural theory — Jordan forms, perturbation theory, generalized eigenvalues, sparse iterative methods (Lanczos, Arnoldi).
From numerical-linear-algebra
- → eigenvalue-problems (extension): The numerically hard cases that motivate Lanczos, Arnoldi, and Krylov-subspace methods.
- → ode-numerical-methods (application): Most ODE/PDE solvers reduce to numerical linear algebra at every timestep.
- → cuda-triton-gpu-programming (application, cross-library): GEMM, GEMV, batched matmul — where numerical linear algebra meets GPUs.
From svd-pca-spectral
- → eigenvalue-problems (foundation): The singular values of A are the square roots of the eigenvalues of the symmetric (Hermitian) matrix A^*A, where A^* is the conjugate transpose; spectral methods generalize the SVD to operators.
- → variational-inference (application): Probabilistic PCA, factor analysis, VAEs are SVD’s probabilistic descendants.
- → rag-embeddings-vector-search (application, cross-library): Embeddings are SVD-like; HNSW, IVF, ScaNN are the engineering layer.
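The link between the SVD and the symmetric eigenvalue problem in the eigenvalue-problems recommendation above can be checked by hand on a small case. The following is a minimal sketch for a real 2x2 matrix using only the standard library; the helper names and the example matrix `A` are assumptions chosen for illustration.

```python
import math


def gram(a):
    """Return A^T A for a real 2x2 matrix given as nested lists."""
    (p, q), (r, s) = a
    return [[p * p + r * r, p * q + r * s],
            [p * q + r * s, q * q + s * s]]


def symmetric_eigenvalues_2x2(m):
    """Closed-form eigenvalues (largest first) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return mean + radius, mean - radius


def singular_values_2x2(a):
    """Singular values of A as square roots of the eigenvalues of A^T A."""
    lam_max, lam_min = symmetric_eigenvalues_2x2(gram(a))
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(lam_max), math.sqrt(max(lam_min, 0.0))


A = [[3.0, 0.0], [4.0, 5.0]]  # illustrative matrix, det(A) = 15
sigma_1, sigma_2 = singular_values_2x2(A)
```

Two sanity checks follow from standard identities: the product of the singular values equals |det A|, and the sum of their squares equals the squared Frobenius norm. Production SVD routines generally avoid forming A^T A explicitly because doing so squares the condition number; this sketch does it only to expose the relationship.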
From multivariate-calculus
- → tensor-calculus (extension): Multivariate calculus generalized to higher-rank objects — the language of GR, elasticity, and modern ML autodiff.
- → gradient-descent-variants (application): What you do once you can take a multivariate gradient — SGD, Adam, AdamW, Lion, Muon.
- → convex-optimization (application): The full optimization theory — KKT, duality, Lagrangians — built on multivariate calculus.
From tensor-calculus
- → lie-groups-so3-se3 (extension): The curved-space generalization — manifolds, tangent spaces, exponential maps.
- → pde-methods (application): Continuum mechanics, GR, electromagnetism — every PDE in physics is tensor-valued.
- → cfd-deep (application, cross-library): Navier-Stokes is tensor calculus operationalized.
From probability-fundamentals
- → probability-distributions (extension): The 30+ distributions you’ll actually use, with conjugate priors, MGFs, and parameter estimation.
- → hypothesis-testing-mle (application): What probability turns into when you have data — likelihoods, p-values, confidence intervals.
- → _compare_probability-frameworks (synthesis): The frequentist / Bayesian / likelihoodist / information-theoretic / causal framings, side by side.
From probability-distributions
- → bayesian-inference (application): Conjugate priors are how distributions become tractable inference engines.
- → copulas-and-dependence (extension): Distributions for vectors — separating marginal distribution from dependence structure.
- → probability-distribution-zoo (foundation): The catalog you keep open while reading the rest.
Probability + statistics track
From bayesian-inference
- → mcmc-sampling (application): How you actually compute posteriors when conjugacy fails — MH, Gibbs, HMC, NUTS, parallel-tempering.
- → variational-inference (extension): The deterministic alternative to MCMC — ELBO, mean-field, normalizing flows, amortized inference.
- → _compare_probability-frameworks (synthesis): Where Bayesian sits relative to frequentist, likelihoodist, and information-theoretic.
From mcmc-sampling
- → markov-chains-and-hmm (foundation): The Markov-chain theory underneath MCMC — irreducibility, aperiodicity, mixing time.
- → variational-inference (extension): When MCMC is too slow, VI trades exact-in-the-limit for scalable.
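- → stochastic-calculus (extension): Langevin samplers discretize an SDE, and HMC and Riemann-manifold HMC simulate Hamiltonian dynamics with random momentum refreshes, so the continuous-time theory explains why they work.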
From hypothesis-testing-mle
- → bayesian-inference (extension): The Bayesian alternative — posterior probabilities, Bayes factors, credible intervals.
- → causal-inference (application): Hypothesis testing on interventions, not just correlations: Rubin potential outcomes and Pearl's do-calculus.
- → _compare_probability-frameworks (synthesis): How frequentist NHST compares to other inferential frames.
From causal-inference
- → graph-theory (foundation): DAGs are the language of causal models; d-separation is a graph property.
- → bayesian-inference (foundation): Most identifiability proofs go through Bayesian conditioning.
- → _compare_probability-frameworks (synthesis): Where causal sits in the framework hierarchy.
From variational-inference
- → information-theory (foundation): VI maximizes the ELBO, which equals the log-evidence minus KL(q || p), where p is the true posterior; without KL divergence the whole machinery is opaque.
- → mcmc-sampling (adjacent): The sister technique you compare against on every problem.
- → transformer-architecture (application, cross-library): VAEs, normalizing flows, and diffusion models all live here.
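The ELBO identity quoted in the information-theory recommendation above can be verified numerically on a toy model. This is a minimal sketch with a three-state discrete latent variable; the joint probabilities and the variational distribution `q` are made-up numbers, not taken from any note.

```python
import math

# Toy joint probabilities p(x, z) for one fixed observation x and z in {0, 1, 2}.
# These numbers are illustrative assumptions.
joint = [0.10, 0.25, 0.05]
q = [0.2, 0.5, 0.3]  # an arbitrary variational distribution over z

evidence = sum(joint)                       # p(x) = sum_z p(x, z)
posterior = [p / evidence for p in joint]   # p(z | x)

# ELBO = E_q[log p(x, z) - log q(z)]
elbo = sum(qz * (math.log(pz) - math.log(qz)) for qz, pz in zip(q, joint))

# KL(q || p(z | x)) = E_q[log q(z) - log p(z | x)]
kl = sum(qz * math.log(qz / post) for qz, post in zip(q, posterior))

log_evidence = math.log(evidence)

if __name__ == "__main__":
    print(elbo, log_evidence - kl)
```

Because the KL term is non-negative, the ELBO never exceeds the log-evidence, and the gap closes exactly when `q` equals the posterior.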
From markov-chains-and-hmm
- → time-series-and-hmm (extension): The application-side view — state-space models, Kalman, particle filters, switching dynamics.
- → mcmc-sampling (application): MCMC = Markov chains designed to have a target stationary distribution.
- → reinforcement-learning-theory (application): MDPs are controlled Markov chains; HMMs are observation models.
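The statement that MCMC is a Markov chain designed around a target stationary distribution can be made concrete on a finite state space. The sketch below builds the exact transition matrix of a Metropolis sampler with a symmetric nearest-neighbour proposal on a ring; the target vector, the ring proposal, and the function names are all assumptions chosen for illustration.

```python
def metropolis_transition_matrix(target):
    """Exact Metropolis transition matrix with a symmetric +/-1 proposal on a ring.

    Assumes at least three states with strictly positive target probabilities.
    """
    n = len(target)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in ((i - 1) % n, (i + 1) % n):
            accept = min(1.0, target[j] / target[i])  # Metropolis acceptance ratio
            P[i][j] += 0.5 * accept                   # each neighbour proposed with prob 1/2
        P[i][i] = 1.0 - sum(P[i])                     # rejected proposals stay put
    return P


def step(dist, P):
    """Advance a distribution over states by one step of the chain."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]


target = [0.1, 0.2, 0.3, 0.4]
P = metropolis_transition_matrix(target)

# The target is left unchanged by one step of the chain (stationarity).
after_one_step = step(target, P)

# Starting from a point mass, repeated steps approach the target.
dist = [1.0, 0.0, 0.0, 0.0]
for _ in range(500):
    dist = step(dist, P)
```

Stationarity here follows from detailed balance: the probability flow between neighbours i and j is 0.5 * min(target[i], target[j]) in both directions.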
From stochastic-calculus
- → derivatives-and-quant-finance (application, cross-library): Black-Scholes, Heston, SABR, Hull-White — everything in derivatives pricing is SDEs.
- → measure-theory-and-integration (foundation): Ito’s lemma without measure theory is incantation; with it, it’s a theorem.
- → _compare_probability-frameworks (synthesis): Where measure-theoretic probability sits.
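For reference, the statement of Itô's lemma that the measure-theory recommendation above refers to can be written as follows for a one-dimensional diffusion. The symbols (drift \mu, volatility \sigma, Brownian motion W_t, and a twice-differentiable function f) are standard notation assumed here, not notation taken from the notes.

```latex
dX_t = \mu(X_t, t)\,dt + \sigma(X_t, t)\,dW_t
\quad\Longrightarrow\quad
df(X_t, t) = \left( \frac{\partial f}{\partial t}
  + \mu \frac{\partial f}{\partial x}
  + \tfrac{1}{2}\sigma^2 \frac{\partial^2 f}{\partial x^2} \right) dt
  + \sigma \frac{\partial f}{\partial x}\, dW_t
```

The second-derivative term is the part that the ordinary chain rule misses; it comes from the quadratic variation of Brownian motion, which is where the measure-theoretic construction does its work.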
From gaussian-processes
- → bayesian-inference (foundation): GPs are Bayesian non-parametrics; without Bayes the kernel + posterior story doesn’t compose.
- → variational-inference (extension): Sparse GPs and inducing-point methods are VI on GP posteriors.
- → _compare_optimization-methods (synthesis): Bayesian optimization (GP-EI) is the leading surrogate-based optimizer.
From copulas-and-dependence
- → portfolio-construction-and-risk-deep (application, cross-library): Gaussian, t, and Archimedean copulas are how multi-asset portfolios actually model tail dependence.
- → probability-distributions (foundation): The marginal distributions copulas glue together.
Optimization track
From convex-optimization
- → gradient-descent-variants (application): First-order methods — the workhorse of every ML pipeline.
- → _compare_optimization-methods (synthesis): The full taxonomy — exact, convex first-order, non-convex stochastic, evolutionary, Bayesian, derivative-free.
- → combinatorial-optimization (extension): The integer / discrete world — LP relaxation, branch-and-bound, MIP, SDP relaxations.
From gradient-descent-variants
- → _compare_optimization-methods (synthesis): Where SGD / Adam / Lion / Muon / Shampoo sit on the optimizer landscape.
- → riemannian-optimization (extension): Gradient descent on curved manifolds — Stiefel, Grassmann, SPD, Lie groups.
- → fine-tuning-rlhf (application, cross-library): Where these optimizers actually get used at scale.
From combinatorial-optimization
- → graph-theory (foundation): TSP, max-flow, min-cut, matching — combinatorial optimization is graph theory with objective functions.
- → convex-optimization (foundation): LP relaxations and SDP relaxations are the modern attack on NP-hard problems.
- → _compare_optimization-methods (synthesis): The exact-MIP corner of the optimizer landscape.
From riemannian-optimization
- → lie-groups-so3-se3 (foundation): SO(3), SE(3), Stiefel — the manifolds you actually optimize over.
- → _compare_optimization-methods (synthesis): Where Riemannian methods sit relative to Euclidean first-order.
- → gnc (application, cross-library): Attitude estimation and pose optimization happen on SO(3).
Calculus + ODE/PDE track
From ode-numerical-methods
- → pde-methods (extension): ODEs in space + time — the natural generalization.
- → ode-pde-solver-catalog (application): The catalog of Runge-Kutta, BDF, Rosenbrock, IMEX, symplectic methods.
- → cfd-deep (application, cross-library): Where ODE/PDE methods land in industrial fluid simulation.
From pde-methods
- → functional-analysis (foundation): Sobolev spaces, weak solutions, Lax-Milgram — what makes FEM rigorous.
- → tensor-calculus (foundation): Elasticity, Maxwell, Navier-Stokes are tensor-valued PDEs.
- → fem-fea (application, cross-library): The engineering operationalization of PDE methods.
From fft-spectral
- → pde-methods (application): Spectral methods (Fourier, Chebyshev) for PDEs.
- → information-theory (adjacent): Discrete Fourier ↔ entropy via Parseval; channel-capacity analyses use FFT-style spectral decomposition.
- → signal-processing-dsp (application, cross-library): Where FFT actually gets used — filtering, compression, modulation.
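The idea behind Fourier spectral methods for PDEs is that differentiation becomes multiplication by i·k in frequency space. The following is a minimal sketch using a naive O(n²) DFT from the standard library rather than a real FFT; the function names and the test function sin(x) are assumptions for illustration.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(n^2)), enough for a small demo."""
    n = len(values)
    return [sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(coeffs):
    """Inverse of dft above."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def spectral_derivative(samples):
    """Differentiate a 2*pi-periodic function sampled on an even number of uniform points."""
    n = len(samples)
    coeffs = dft(samples)
    for k in range(n):
        wavenumber = k if k < n // 2 else k - n  # map index to signed frequency
        if k == n // 2:
            wavenumber = 0                        # drop the Nyquist mode for odd derivatives
        coeffs[k] *= 1j * wavenumber
    return [value.real for value in idft(coeffs)]


n = 16
grid = [2 * math.pi * j / n for j in range(n)]
derivative = spectral_derivative([math.sin(x) for x in grid])
max_error = max(abs(d - math.cos(x)) for d, x in zip(derivative, grid))
```

For a smooth periodic function such as this one, the error is at the level of floating-point rounding even on 16 points, which is the property that makes spectral discretizations attractive for smooth PDE solutions.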
Discrete + algebraic track
From graph-theory
- → combinatorial-optimization (application): Most graph problems become combinatorial optimization with weights and constraints.
- → causal-inference (application): DAGs, d-separation, structural causal models — graphs as the language of causality.
- → distributed-systems-fundamentals (application, cross-library): Consensus, gossip, leader election — graph algorithms on cluster topology.
From group-theory-and-representation
- → lie-groups-so3-se3 (extension): Continuous groups — the bridge from finite groups to differential geometry.
- → number-theory (application): Group theory underpins modular arithmetic, finite fields, and modern crypto.
- → inorganic-chemistry (application, cross-library): Point groups and character tables run all of molecular spectroscopy.
From number-theory
- → cryptography-fundamentals (application, cross-library): RSA, ECC, lattice-based PQC — number theory operationalized as security.
- → group-theory-and-representation (foundation): Finite-field arithmetic is group theory; elliptic curves are group structures.
- → algebraic-geometry-foundations (extension): Schemes, varieties, sheaves — the modern algebraic-geometry view of number theory.
From algebraic-geometry-foundations
- → complex-analysis (foundation): Algebraic curves over ℂ are Riemann surfaces; complex analysis is the bridge.
- → number-theory (adjacent): Arithmetic geometry — number theory done with geometric tools.
Rigor / measure-theoretic track
From measure-theory-and-integration
- → functional-analysis (extension): L^p spaces, bounded operators, spectral theory — measure theory’s natural continuation.
- → probability-fundamentals (application): The measure-theoretic foundation of probability — σ-algebras, Radon-Nikodym, conditional expectation.
- → stochastic-calculus (application): Ito integration is the canonical measure-theoretic construction.
From functional-analysis
- → pde-methods (application): Sobolev spaces, distributions, weak formulations — what makes modern PDE theory work.
- → measure-theory-and-integration (foundation): L^p and Hilbert-space theory are measure theory restated.
- → complex-analysis (adjacent): Hardy spaces, operator theory on Hilbert spaces.
From complex-analysis
- → fft-spectral (application): Residue calculus + contour integration underlie many spectral and transform methods.
- → algebraic-geometry-foundations (extension): Riemann surfaces and algebraic curves over ℂ.
From lie-groups-so3-se3
- → group-theory-and-representation (foundation): Representation theory is what makes Lie groups computable.
- → riemannian-optimization (application): Optimization on Lie groups uses the exponential map you learned here.
- → spacecraft-attitude-control (application, cross-library): SO(3) is what every attitude-control system runs on.
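The exponential map mentioned in the riemannian-optimization recommendation above takes a rotation vector in the Lie algebra so(3) to a rotation matrix in SO(3). This is a minimal sketch of Rodrigues' formula with plain nested lists; the function names are hypothetical, and the small-angle branch simply returns the identity rather than a higher-order series.

```python
import math


def hat(w):
    """Map a 3-vector to its skew-symmetric matrix, so that hat(w) v = w x v."""
    x, y, z = w
    return [[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def so3_exp(omega):
    """Rodrigues' formula: R = I + sin(theta) K + (1 - cos(theta)) K^2."""
    theta = math.sqrt(sum(c * c for c in omega))
    identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    if theta < 1e-12:
        return identity  # crude small-angle fallback, adequate for this demo
    K = hat([c / theta for c in omega])
    K2 = matmul(K, K)
    s, c = math.sin(theta), 1.0 - math.cos(theta)
    return [[identity[i][j] + s * K[i][j] + c * K2[i][j] for j in range(3)] for i in range(3)]


# A quarter turn about the z-axis should send the x-axis to the y-axis.
R = so3_exp([0.0, 0.0, math.pi / 2])
rotated_x = [sum(R[i][j] * v for j, v in enumerate([1.0, 0.0, 0.0])) for i in range(3)]
```

The result stays orthogonal by construction, which is why optimizers on SO(3) update through the exponential map instead of adding increments to matrix entries.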
From information-theory
- → variational-inference (application): ELBO, KL, mutual information are the variational toolbox.
- → cryptography-fundamentals (application, cross-library): One-time pad, perfect secrecy, channel capacity — Shannon’s other career.
- → _compare_probability-frameworks (synthesis): Where the info-theoretic frame sits.
From reinforcement-learning-theory
- → markov-chains-and-hmm (foundation): MDPs are controlled Markov chains.
- → _compare_optimization-methods (synthesis): Policy-gradient and actor-critic methods (PPO, SAC, DDPG) are where stochastic optimization meets control.
- → fine-tuning-rlhf (application, cross-library): RLHF, GRPO, DPO — RL applied to language models.
Tier 3 reference notes
The Tier 3 catalogs (probability-distribution-zoo, optimization-algorithm-taxonomy, ode-pde-solver-catalog, numerical-methods-reference, lie-group-catalog, sampling-algorithms-catalog, named-inequalities-catalog, special-functions-catalog, statistical-distributions-catalog-extended, convolutional-kernel-zoo) are lookup material, not reading material. Keep them open beside the Tier 1/2 notes.
Reading paths
ML Researcher Track
For someone whose work is training models and writing papers:
linear-algebra-essentials → multivariate-calculus → probability-fundamentals → probability-distributions → gradient-descent-variants → information-theory → variational-inference → _compare_optimization-methods → transformer-architecture
Quant Finance Trader Track
For someone pricing derivatives or managing portfolio risk:
probability-fundamentals → probability-distributions → measure-theory-and-integration → stochastic-calculus → copulas-and-dependence → derivatives-and-quant-finance → options-pricing-deep → market-microstructure-and-hft → _compare_pricing-models
Control Engineer Track
For someone designing controllers for aerospace, robotics, or industrial systems:
linear-algebra-essentials → multivariate-calculus → ode-numerical-methods → convex-optimization → lie-groups-so3-se3 → riemannian-optimization → reinforcement-learning-theory → _compare_control-strategies
Statistician / Causal Inference Track
For someone whose job is rigorous inference from data:
probability-fundamentals → probability-distributions → hypothesis-testing-mle → bayesian-inference → causal-inference → mcmc-sampling → variational-inference → _compare_probability-frameworks
Pure Mathematician Track
For someone whose goal is mathematical depth rather than application:
linear-algebra-essentials → measure-theory-and-integration → functional-analysis → complex-analysis → group-theory-and-representation → number-theory → algebraic-geometry-foundations → lie-groups-so3-se3
Adjacent libraries — when you’ve finished this library
- Compute — every Math topic has a Compute landing: transformer-architecture needs linear algebra, distributed-systems-fundamentals needs probability, cryptography-fundamentals needs number theory, cuda-triton-gpu-programming needs numerical linear algebra.
- Engineering — calculus + ODE/PDE + optimization land here: fem-fea, cfd-deep, _compare_control-strategies, structural-dynamics.
- Finance — probability + stochastic calculus land here: derivatives-and-quant-finance, options-pricing-deep, _compare_pricing-models, _compare_risk-measures.
- Languages — math-DSL notation (Julia, R, Lean, Coq).
Notes
This is an opinionated synthesis. A theoretical statistician and a CUDA-kernel engineer will use the same Math library very differently, and that is intentional. The recommendations come from the actual cross-reference structure of the per-topic notes and the two _compare_* syntheses, plus the canonical paths through graduate ML, quant-finance, and applied-math curricula.