Learn Next — Math Recommendation Graph

If you’ve worked through one Math note, what should you read next to gain the most leverage? This guide is a learning-path overlay on top of the per-topic notes and the two _compare_* synthesis notes (optimization methods, probability frameworks). The recommendations follow the natural dependency structure of mathematics: linear algebra and probability are foundational; calculus + ODE/PDE + optimization are the engines; measure theory + functional analysis + complex analysis sit above as the rigorizing layer.

How to use this guide

For each per-topic note, you get one to three “next” recommendations tagged:

  • (foundation) — a layer below this one you need to make full sense of it.
  • (extension) — same idea generalized to harder cases (curved spaces, infinite-dimensional, dependent variables).
  • (application) — how the topic lands in Compute, Engineering, or Finance.
  • (synthesis) — a _compare_* note that ties this topic into a wider decision space.

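Two further labels appear in the lists below: (adjacent) marks a sibling topic at the same level, and “cross-library” marks a note that lives in another library.

The closing Reading paths section composes the recommendations into named multi-step tracks (ML Researcher, Quant Finance Trader, Control Engineer, Statistician / Causal Inference, Pure Mathematician).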
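One way to picture the guide is as a labeled directed graph: each note points to its recommended follow-ups, and each edge carries a tag. The sketch below is illustrative only. It copies a handful of entries from this guide; the dictionary layout and the helper name `next_notes` are assumptions, not part of the library.

```python
# Illustrative only: a few edges copied from this guide, stored as
# note -> list of (recommended note, tag). The layout is an assumption.
RECOMMENDATIONS = {
    "linear-algebra-essentials": [
        ("numerical-linear-algebra", "extension"),
        ("svd-pca-spectral", "application"),
        ("eigenvalue-problems", "extension"),
    ],
    "svd-pca-spectral": [
        ("eigenvalue-problems", "foundation"),
        ("variational-inference", "application"),
    ],
    "variational-inference": [
        ("information-theory", "foundation"),
    ],
}


def next_notes(note, tag=None):
    """Return the recommended follow-ups for `note`, optionally filtered by tag."""
    edges = RECOMMENDATIONS.get(note, [])
    return [target for target, edge_tag in edges if tag is None or edge_tag == tag]


extensions = next_notes("linear-algebra-essentials", tag="extension")
print(extensions)
```

Filtering by tag is what lets a reader ask "what sits below this note?" (foundation) separately from "where does it get used?" (application).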


Foundational track — linear algebra + calculus + probability

From linear-algebra-essentials

  • numerical-linear-algebra (extension): What changes when matrices live in float32 — conditioning, stability, BLAS, QR, Cholesky, iterative methods (CG, GMRES).
  • svd-pca-spectral (application): The single most-used matrix decomposition in ML / statistics / signal processing.
  • eigenvalue-problems (extension): The deeper structural theory — Jordan forms, perturbation theory, generalized eigenvalues, sparse iterative methods (Lanczos, Arnoldi).
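To make the float32 point concrete, here is a minimal sketch that solves the same ill-conditioned system in both precisions. The choice of a 5×5 Hilbert matrix and the variable names are assumptions made for illustration; the exact error values depend on the platform's LAPACK build.

```python
import numpy as np

n = 5
i, j = np.indices((n, n))
H = 1.0 / (i + j + 1)              # Hilbert matrix: a classic ill-conditioned example
x_true = np.ones(n)
b = H @ x_true                     # right-hand side built in float64

# Solve the same system in double and in single precision.
x64 = np.linalg.solve(H, b)
x32 = np.linalg.solve(H.astype(np.float32), b.astype(np.float32))

norm_true = np.linalg.norm(x_true)
err64 = np.linalg.norm(x64 - x_true) / norm_true
err32 = np.linalg.norm(x32.astype(np.float64) - x_true) / norm_true
cond = np.linalg.cond(H)

print(f"condition number ~ {cond:.2e}")
print(f"relative error float64: {err64:.2e}")
print(f"relative error float32: {err32:.2e}")
```

The rule of thumb this illustrates is that the attainable relative error scales roughly with the condition number times the unit roundoff, so the same matrix that is harmless in float64 can lose most of its digits in float32.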

From numerical-linear-algebra

  • eigenvalue-problems (extension): The numerically hard cases that motivate Lanczos, Arnoldi, and Krylov-subspace methods.
  • ode-numerical-methods (application): Most ODE/PDE solvers reduce to numerical linear algebra at every timestep.
  • cuda-triton-gpu-programming (application, cross-library): GEMM, GEMV, batched matmul — where numerical linear algebra meets GPUs.

From svd-pca-spectral

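  • eigenvalue-problems (foundation): The singular values of A are the square roots of the eigenvalues of the symmetric matrix AᵀA (A*A, with * the conjugate transpose, in the complex case); spectral methods generalize the SVD to operators.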
  • variational-inference (application): Probabilistic PCA, factor analysis, VAEs are SVD’s probabilistic descendants.
  • rag-embeddings-vector-search (application, cross-library): Embeddings are SVD-like; HNSW, IVF, ScaNN are the engineering layer.
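The link between the SVD and the symmetric eigenvalue problem can be checked numerically. This is a small sketch on a random real matrix (the shape and seed are arbitrary assumptions): the squared singular values of A should match the eigenvalues of AᵀA up to rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))

# Singular values come back in descending order.
singular_values = np.linalg.svd(A, compute_uv=False)

# eigvalsh returns eigenvalues of a symmetric matrix in ascending order,
# so reverse them to line up with the singular values.
eigenvalues = np.linalg.eigvalsh(A.T @ A)[::-1]

max_gap = np.max(np.abs(singular_values**2 - eigenvalues))
print(max_gap)
```

In practice one computes the SVD directly rather than forming AᵀA, because forming the product squares the condition number.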

From multivariate-calculus

  • tensor-calculus (extension): Multivariate calculus generalized to higher-rank objects — the language of GR, elasticity, and modern ML autodiff.
  • gradient-descent-variants (application): What you do once you can take a multivariate gradient — SGD, Adam, AdamW, Lion, Muon.
  • convex-optimization (application): The full optimization theory — KKT, duality, Lagrangians — built on multivariate calculus.

From tensor-calculus

  • lie-groups-so3-se3 (extension): The curved-space generalization — manifolds, tangent spaces, exponential maps.
  • pde-methods (application): Continuum mechanics, GR, electromagnetism — every PDE in physics is tensor-valued.
  • cfd-deep (application, cross-library): Navier-Stokes is tensor calculus operationalized.

From probability-fundamentals

  • probability-distributions (extension): The 30+ distributions you’ll actually use, with conjugate priors, MGFs, and parameter estimation.
  • hypothesis-testing-mle (application): What probability turns into when you have data — likelihoods, p-values, confidence intervals.
  • _compare_probability-frameworks (synthesis): The frequentist / Bayesian / likelihoodist / information-theoretic / causal framings, side by side.

From probability-distributions

  • bayesian-inference (application): Conjugate priors are how distributions become tractable inference engines.
  • copulas-and-dependence (extension): Distributions for vectors — separating marginal distribution from dependence structure.
  • probability-distribution-zoo (foundation): The catalog you keep open while reading the rest.
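As a small illustration of why conjugate priors make inference tractable, the sketch below performs a Beta-Binomial update in closed form. The prior Beta(2, 2) and the data (7 successes in 10 trials) are made-up numbers, and the function names are hypothetical.

```python
from fractions import Fraction


def beta_binomial_update(alpha, beta, successes, trials):
    """Posterior Beta parameters after observing `successes` in `trials`."""
    return alpha + successes, beta + (trials - successes)


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)


prior = (2, 2)  # hypothetical Beta(2, 2) prior on a success probability
posterior = beta_binomial_update(*prior, successes=7, trials=10)
posterior_mean = beta_mean(*posterior)
print(posterior, posterior_mean)
```

The whole posterior is obtained by adding counts to the prior parameters; no integration or sampling is needed, which is the sense in which conjugacy turns a distribution into an inference engine.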

Probability + statistics track

From bayesian-inference

  • mcmc-sampling (application): How you actually compute posteriors when conjugacy fails — MH, Gibbs, HMC, NUTS, parallel-tempering.
  • variational-inference (extension): The deterministic alternative to MCMC — ELBO, mean-field, normalizing flows, amortized inference.
  • _compare_probability-frameworks (synthesis): Where Bayesian sits relative to frequentist, likelihoodist, and information-theoretic.

From mcmc-sampling

  • markov-chains-and-hmm (foundation): The Markov-chain theory underneath MCMC — irreducibility, aperiodicity, mixing time.
  • variational-inference (extension): When MCMC is too slow, VI trades exact-in-the-limit for scalable.
  • stochastic-calculus (extension): Langevin, HMC, and Riemann manifold HMC are SDEs in disguise.

From hypothesis-testing-mle

  • bayesian-inference (extension): The Bayesian alternative — posterior probabilities, Bayes factors, credible intervals.
  • causal-inference (application): Hypothesis testing on interventions, not just correlations, via Rubin potential outcomes and Pearl's do-calculus.
  • _compare_probability-frameworks (synthesis): How frequentist NHST compares to other inferential frames.

From causal-inference

From variational-inference

  • information-theory (foundation): VI maximizes the ELBO, which equals log-evidence minus KL(q || posterior), where the posterior is p(z | x); without KL divergence the whole machinery is opaque.
  • mcmc-sampling (adjacent): The sister technique you compare against on every problem.
  • transformer-architecture (application, cross-library): VAEs, normalizing flows, and diffusion models all live here.
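The ELBO identity mentioned above can be verified on a toy model. The sketch assumes a latent variable with three states and a fixed observation x; the joint probabilities and the approximating distribution q are made-up numbers chosen only for illustration.

```python
import numpy as np

# Joint p(x, z) for one fixed observation x and z in {0, 1, 2} (made-up values).
joint = np.array([0.10, 0.25, 0.05])
# An arbitrary approximating distribution q(z).
q = np.array([0.2, 0.5, 0.3])

evidence = joint.sum()                 # p(x) = sum_z p(x, z)
log_evidence = np.log(evidence)
posterior = joint / evidence           # p(z | x)

elbo = np.sum(q * (np.log(joint) - np.log(q)))       # E_q[log p(x, z) - log q(z)]
kl = np.sum(q * (np.log(q) - np.log(posterior)))     # KL(q || p(z | x))

identity_gap = abs(elbo - (log_evidence - kl))
print(elbo, log_evidence, kl, identity_gap)
```

Because the KL term is non-negative, the ELBO never exceeds the log-evidence, and maximizing it over q is the same as shrinking the KL divergence to the true posterior.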

From markov-chains-and-hmm

  • time-series-and-hmm (extension): The application-side view — state-space models, Kalman, particle filters, switching dynamics.
  • mcmc-sampling (application): MCMC = Markov chains designed to have a target stationary distribution.
  • reinforcement-learning-theory (application): MDPs are controlled Markov chains; HMMs are observation models.
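To see what "designed to have a target stationary distribution" means, the sketch below builds the transition matrix of a Metropolis chain on four states and checks stationarity directly, without sampling. The target probabilities and the uniform proposal are assumptions chosen for illustration.

```python
import numpy as np

target = np.array([0.1, 0.2, 0.3, 0.4])   # made-up target distribution
n = len(target)

P = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        if i != j:
            proposal = 1.0 / (n - 1)                    # symmetric: uniform over the other states
            accept = min(1.0, target[j] / target[i])    # Metropolis acceptance probability
            P[i, j] = proposal * accept
    P[i, i] = 1.0 - P[i].sum()                          # rejected proposals stay in place

# Stationarity: target @ P should reproduce target.
stationary_gap = np.max(np.abs(target @ P - target))

# Convergence: after many steps, the chain started in state 0 is close to the target.
long_run = np.linalg.matrix_power(P, 200)[0]
print(stationary_gap, long_run)
```

The acceptance rule enforces detailed balance, which is what guarantees the target is stationary; irreducibility and aperiodicity then give convergence from any starting state.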

From stochastic-calculus

From gaussian-processes

  • bayesian-inference (foundation): GPs are Bayesian non-parametrics; without Bayes the kernel + posterior story doesn’t compose.
  • variational-inference (extension): Sparse GPs and inducing-point methods are VI on GP posteriors.
  • _compare_optimization-methods (synthesis): Bayesian optimization (GP-EI) is the leading surrogate-based optimizer.

From copulas-and-dependence


Optimization track

From convex-optimization

From gradient-descent-variants

  • _compare_optimization-methods (synthesis): Where SGD / Adam / Lion / Muon / Shampoo sit on the optimizer landscape.
  • riemannian-optimization (extension): Gradient descent on curved manifolds — Stiefel, Grassmann, SPD, Lie groups.
  • fine-tuning-rlhf (application, cross-library): Where these optimizers actually get used at scale.
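A minimal sketch of gradient descent on a curved manifold, using the unit sphere as the simplest case: minimize the Rayleigh quotient xᵀAx subject to ‖x‖ = 1. The test matrix (eigenvalues 1, 2, 3, 4), step size, and iteration count are assumptions picked so the example converges quickly; normalization is used as the retraction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Symmetric test matrix with known eigenvalues 1, 2, 3, 4.
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T
A = (A + A.T) / 2                      # remove rounding asymmetry

x = rng.standard_normal(4)
x /= np.linalg.norm(x)                 # start on the unit sphere

step = 0.1
for _ in range(500):
    egrad = 2.0 * A @ x                # Euclidean gradient of x^T A x
    rgrad = egrad - (x @ egrad) * x    # project onto the tangent space at x
    x = x - step * rgrad               # step along the negative Riemannian gradient
    x /= np.linalg.norm(x)             # retract back onto the sphere

rayleigh = x @ A @ x
print(rayleigh)
```

The two extra steps compared with Euclidean gradient descent, tangent-space projection and retraction, are the pattern that carries over to Stiefel, Grassmann, and SPD manifolds with different projection and retraction formulas.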

From combinatorial-optimization

  • graph-theory (foundation): TSP, max-flow, min-cut, matching — combinatorial optimization is graph theory with objective functions.
  • convex-optimization (foundation): LP relaxations and SDP relaxations are the modern attack on NP-hard problems.
  • _compare_optimization-methods (synthesis): The exact-MIP corner of the optimizer landscape.

From riemannian-optimization

  • lie-groups-so3-se3 (foundation): SO(3), SE(3), Stiefel — the manifolds you actually optimize over.
  • _compare_optimization-methods (synthesis): Where Riemannian methods sit relative to Euclidean first-order.
  • gnc (application, cross-library): Attitude estimation and pose optimization happen on SO(3).
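For readers who have not met SO(3) in code, the following sketch implements the exponential map from a rotation vector to a rotation matrix via the Rodrigues formula. The function names `hat` and `so3_exp` are illustrative choices, not names taken from the notes.

```python
import numpy as np


def hat(omega):
    """Map a 3-vector to the skew-symmetric matrix K with K @ v == cross(omega, v)."""
    wx, wy, wz = omega
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])


def so3_exp(omega):
    """Rodrigues formula: exponential map from a rotation vector to SO(3)."""
    omega = np.asarray(omega, dtype=float)
    theta = np.linalg.norm(omega)
    K = hat(omega)
    if theta < 1e-12:
        return np.eye(3) + K           # first-order approximation near the identity
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))


# Rotation by 90 degrees about the z-axis should send the x-axis to the y-axis.
R = so3_exp([0.0, 0.0, np.pi / 2])
rotated = R @ np.array([1.0, 0.0, 0.0])
print(rotated)
```

Optimizers on SO(3) typically take a step in the tangent space (a rotation vector) and then apply this map, so the iterate stays a valid rotation without any re-orthogonalization.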

Calculus + ODE/PDE track

From ode-numerical-methods

  • pde-methods (extension): ODEs in space + time — the natural generalization.
  • ode-pde-solver-catalog (application): The catalog of Runge-Kutta, BDF, Rosenbrock, IMEX, symplectic methods.
  • cfd-deep (application, cross-library): Where ODE/PDE methods land in industrial fluid simulation.
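As a reference point for the Runge-Kutta family mentioned above, here is a minimal sketch of the classical fourth-order method applied to y' = -y with y(0) = 1. The test problem, step count, and function names are assumptions for illustration.

```python
import math


def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, t0, y0, t_end, steps):
    """Advance from t0 to t_end with a fixed number of equal RK4 steps."""
    h = (t_end - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


# Test problem with known solution y(t) = exp(-t).
y_end = integrate(lambda t, y: -y, 0.0, 1.0, 1.0, 10)
error = abs(y_end - math.exp(-1.0))
print(y_end, error)
```

Explicit methods like this one suit non-stiff problems; the implicit families in the catalog (BDF, Rosenbrock) exist because stiff systems force explicit schemes into impractically small steps.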

From pde-methods

  • functional-analysis (foundation): Sobolev spaces, weak solutions, Lax-Milgram — what makes FEM rigorous.
  • tensor-calculus (foundation): Elasticity, Maxwell, Navier-Stokes are tensor-valued PDEs.
  • fem-fea (application, cross-library): The engineering operationalization of PDE methods.

From fft-spectral

  • pde-methods (application): Spectral methods (Fourier, Chebyshev) for PDEs.
  • information-theory (adjacent): Parseval's theorem carries signal energy between the time and frequency domains, and channel-capacity analyses use FFT-style spectral decomposition.
  • signal-processing-dsp (application, cross-library): Where FFT actually gets used — filtering, compression, modulation.
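The core move of a Fourier spectral method is to differentiate by multiplying by i·k in frequency space. A minimal sketch on a periodic grid follows; the grid size and the test function sin(3x) are arbitrary assumptions.

```python
import numpy as np

n = 64
x = 2.0 * np.pi * np.arange(n) / n          # periodic grid on [0, 2*pi)
u = np.sin(3.0 * x)

# With sample spacing 1/n, fftfreq returns integer wavenumbers for this domain.
k = np.fft.fftfreq(n, d=1.0 / n)

# Differentiate in frequency space: multiply each mode by i*k, then transform back.
du = np.real(np.fft.ifft(1j * k * np.fft.fft(u)))

err = np.max(np.abs(du - 3.0 * np.cos(3.0 * x)))
print(err)
```

For smooth periodic functions the error of this derivative falls faster than any power of the grid spacing, which is why spectral methods are attractive for PDEs on periodic domains.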

Discrete + algebraic track

From graph-theory

  • combinatorial-optimization (application): Most graph problems become combinatorial optimization with weights and constraints.
  • causal-inference (application): DAGs, d-separation, structural causal models — graphs as the language of causality.
  • distributed-systems-fundamentals (application, cross-library): Consensus, gossip, leader election — graph algorithms on cluster topology.

From group-theory-and-representation

  • lie-groups-so3-se3 (extension): Continuous groups — the bridge from finite groups to differential geometry.
  • number-theory (application): Group theory underpins modular arithmetic, finite fields, and modern crypto.
  • inorganic-chemistry (application, cross-library): Point groups and character tables run all of molecular spectroscopy.

From number-theory

From algebraic-geometry-foundations

  • complex-analysis (foundation): Algebraic curves over ℂ are Riemann surfaces; complex analysis is the bridge.
  • number-theory (adjacent): Arithmetic geometry — number theory done with geometric tools.

Rigor / measure-theoretic track

From measure-theory-and-integration

  • functional-analysis (extension): L^p spaces, bounded operators, spectral theory — measure theory’s natural continuation.
  • probability-fundamentals (application): The measure-theoretic foundation of probability — σ-algebras, Radon-Nikodym, conditional expectation.
  • stochastic-calculus (application): The Itô integral is built as an L² limit of integrals of simple processes, a construction that relies on measure theory throughout.

From functional-analysis

  • pde-methods (application): Sobolev spaces, distributions, weak formulations — what makes modern PDE theory work.
  • measure-theory-and-integration (foundation): L^p and Hilbert-space theory are measure theory restated.
  • complex-analysis (adjacent): Hardy spaces, operator theory on Hilbert spaces.

From complex-analysis

  • fft-spectral (application): Residue calculus + contour integration underlie many spectral and transform methods.
  • algebraic-geometry-foundations (extension): Riemann surfaces and algebraic curves over ℂ.

From lie-groups-so3-se3

From information-theory

From reinforcement-learning-theory


Tier 3 reference notes

The Tier 3 catalogs (probability-distribution-zoo, optimization-algorithm-taxonomy, ode-pde-solver-catalog, numerical-methods-reference, lie-group-catalog, sampling-algorithms-catalog, named-inequalities-catalog, special-functions-catalog, statistical-distributions-catalog-extended, convolutional-kernel-zoo) are lookup material, not reading material. Keep them open beside the Tier 1/2 notes.


Reading paths

ML Researcher Track

For someone whose work is training models and writing papers:

linear-algebra-essentials → multivariate-calculus → probability-fundamentals → probability-distributions → gradient-descent-variants → information-theory → variational-inference → _compare_optimization-methods → transformer-architecture

Quant Finance Trader Track

For someone pricing derivatives or managing portfolio risk:

probability-fundamentals → probability-distributions → measure-theory-and-integration → stochastic-calculus → copulas-and-dependence → derivatives-and-quant-finance → options-pricing-deep → market-microstructure-and-hft → _compare_pricing-models

Control Engineer Track

For someone designing controllers for aerospace, robotics, or industrial systems:

linear-algebra-essentials → multivariate-calculus → ode-numerical-methods → convex-optimization → lie-groups-so3-se3 → riemannian-optimization → reinforcement-learning-theory → _compare_control-strategies

Statistician / Causal Inference Track

For someone whose job is rigorous inference from data:

probability-fundamentals → probability-distributions → hypothesis-testing-mle → bayesian-inference → causal-inference → mcmc-sampling → variational-inference → _compare_probability-frameworks

Pure Mathematician Track

For someone whose goal is mathematical depth rather than application:

linear-algebra-essentials → measure-theory-and-integration → functional-analysis → complex-analysis → group-theory-and-representation → number-theory → algebraic-geometry-foundations → lie-groups-so3-se3


Adjacent libraries — when you’ve finished this library

Notes

This is opinionated synthesis. A theoretical statistician and a CUDA-kernel engineer will use the same Math library very differently — that is intentional. The recommendations come from the actual cross-reference structure of the per-topic notes and the two _compare_* syntheses, plus the canonical paths through graduate ML, quant-finance, and applied-math curricula.