Electronic Structure & Computational Materials

Computational materials science predicts properties from first principles. Electronic structure — the distribution of electrons among nuclei — controls bonding, mechanical strength, band gap, magnetism, optical absorption, transport (electrical + thermal), reactivity, and stability. This note covers the methods (DFT through QMC), the software (VASP through MatterSim), and the 2024–26 data + ML revolution (Materials Project, GNoME, MatterGen, A-Lab).


At a Glance — Why Electronic Structure Matters

Property | Determined by | Method of choice
Cohesive energy, lattice constant | Total energy minimization | DFT-GGA
Band gap (semiconductors, insulators) | Single-particle excitations | GW, HSE06 hybrid (DFT-PBE underestimates by ~50%)
Optical absorption | Two-particle (electron-hole) excitations | BSE, TD-DFT
Magnetic moments, ordering | Spin-polarized DFT | DFT+U, DMFT for strongly-correlated
Phonons, thermal expansion, IR/Raman | Lattice dynamics | DFPT, finite-difference
Electrical transport | Band velocity + electron-phonon | Boltzmann (BoltzTraP), EPW
Chemical reactivity, catalysis | Reaction paths + barriers | DFT + NEB transition state
Strongly-correlated oxides (cuprates, ruthenates) | Hubbard physics | DMFT, QMC
Charge transfer, bonding character | Real-space decomposition | Bader, ELF, COHP
Microstructure evolution (dendrites, spinodal) | Phase-field PDEs | Cahn-Hilliard, Allen-Cahn
Time-dependent dynamics (atomic motion) | Newton’s equations on PES | MD (classical, AIMD, MLIP-driven)

The hierarchy: quantum chemistry / DFT (electrons explicit, ~10² atoms, ps) → MLIPs (DFT-accurate forces, ~10⁵ atoms, ns) → classical MD (empirical force fields, ~10⁷ atoms, μs) → phase-field / FEM (continuum, mm scale, hours).

Cross-links: crystallography-phase-diagrams for crystal structure inputs; characterization-methods for experimental validation; numerical-linear-algebra for diagonalization; eigenvalue-problems for Kohn-Sham solve; linear-algebra-essentials for basis transformations; organic-chemistry-foundations for DFT-in-chemistry; semiconductor-materials for band-gap engineering; cuda-triton-gpu-programming for GPU codes; transformer-architecture for MLIP equivariant networks; fem-fea for multi-scale handoff.


Quantum Mechanics Refresher

Schrödinger Equation (time-independent, non-relativistic)

H Ψ = E Ψ

For N electrons + M nuclei, H contains kinetic energy (electrons + nuclei), electron-nucleus attraction, electron-electron repulsion, nucleus-nucleus repulsion. The many-body wave function Ψ(r₁σ₁, ..., r_Nσ_N; R₁, ..., R_M) is intractable beyond ~6 electrons exactly — every approximation method below tackles this curse of dimensionality.

Born-Oppenheimer Approximation (Born & Oppenheimer, 1927)

Nuclei are at least ~1800× heavier than electrons (a proton is ~1836× the electron mass) → electrons adjust almost instantaneously to nuclear positions. Decouple: solve the electronic Schrödinger equation at fixed nuclei to get the potential energy surface (PES) E({R_I}). Nuclei then move classically (MD) or quantum-mechanically (path integral) on this PES. The approximation breaks down near conical intersections and in non-adiabatic dynamics, which need methods such as Tully surface hopping or Ehrenfest dynamics.

Pauli Principle & Slater Determinant (Slater, 1929)

Fermionic Ψ is antisymmetric under particle exchange. Simplest antisymmetric ansatz: a Slater determinant of single-particle orbitals φ_i(r,σ). Hartree-Fock finds the variationally optimal single determinant. Multi-determinant expansions (configuration interaction) capture correlation.


Wave-Function Methods (Quantum Chemistry)

Hartree-Fock (HF; Hartree 1928, Fock 1930)

Mean-field: each electron sees the averaged potential of the others, solved by self-consistent field iteration. Captures ~99% of the total energy but misses electron correlation, the instantaneous Coulomb avoidance that HF averages over. The remaining errors of ~1 eV/atom are comparable to bond energies, so HF is unreliable for reaction energies, but it is cheap and a building block for the methods below.

Post-Hartree-Fock

  • MP2 (Møller-Plesset 1934): second-order perturbation, scales O(N⁵). Good for weak interactions, dispersion.
  • CCSD (Coupled Cluster Singles + Doubles, Čížek 1966): O(N⁶).
  • CCSD(T) “gold standard”: adds perturbative triples. O(N⁷). Chemical accuracy (~1 kcal/mol) for small molecules. Reference for benchmarking DFT functionals.
  • CASSCF / CASPT2 / MRCI: multi-reference for static correlation (bond breaking, diradicals).
  • DMRG (White 1992): density-matrix renormalization, treats large active spaces; used for strongly-correlated molecules and 1D-2D lattices.

Codes: Gaussian, ORCA (Neese), MOLPRO, NWChem, PSI4, Q-Chem, MOLCAS, BAGEL.


Density Functional Theory (DFT) — The Workhorse

Foundational Theorems

  • Hohenberg-Kohn I (1964): ground-state energy is a unique functional of the electron density n(r). The wave function Ψ(r₁...r_N) (3N variables) is replaced by n(r) (3 variables) without loss for ground-state properties. Walter Kohn shared the 1998 Nobel Prize in Chemistry with John Pople.
  • Hohenberg-Kohn II (1964): variational principle on density.
  • Kohn-Sham (1965): map interacting system onto fictitious non-interacting one with same density. Solve single-particle equations [−½∇² + v_eff(r)] φ_i = ε_i φ_i where v_eff = v_ext + v_H + v_xc. Exact in principle — the exchange-correlation functional E_xc[n] hides all the many-body physics and must be approximated.

The Jacob’s Ladder of Functionals (Perdew & Schmidt 2001)

  1. LDA — Local Density Approximation (Kohn-Sham 1965; Ceperley-Alder 1980 parameterization): E_xc = ∫ ε_xc(n) n dr using uniform electron gas. Overbinds, predicts bond lengths ~1–3% too short. Surprisingly OK for metals and structural properties.
  2. GGA — Generalized Gradient Approximation: adds dependence on ∇n. PBE (Perdew-Burke-Ernzerhof, 1996) is the canonical GGA; ~10,000 citations/year. Lattice constants typically ~1% too large. PW91 (Perdew-Wang 1991), BLYP (Becke-Lee-Yang-Parr 1988) for chemistry.
  3. meta-GGA: adds kinetic-energy density τ(r). TPSS (Tao-Perdew-Staroverov-Scuseria 2003). SCAN (Sun-Ruzsinszky-Perdew 2015) — satisfies all 17 known exact constraints; excellent across diverse bonding. r²SCAN (Furness 2020) — numerically stable refinement now the default meta-GGA in many production workflows.
  4. Hybrid functionals: mix in fraction of exact HF exchange. B3LYP (Becke 1993; 20% HF) dominates molecular chemistry. PBE0 (Adamo-Barone 1999; 25% HF). HSE06 (Heyd-Scuseria-Ernzerhof 2003/2006) screens long-range HF for solids — fixes most band gaps. Cost: ~10–100× pure GGA.
  5. Range-separated hybrids: CAM-B3LYP (Yanai 2004), ωB97X-D (Chai-Head-Gordon 2008). Better for charge-transfer states + Rydberg excitations.
  6. Double-hybrids (rung 5): hybrid + MP2-like correlation. B2PLYP (Grimme 2006). Approach CCSD(T) accuracy on benchmark sets.

DFT Corrections & Extensions

  • DFT+U (Anisimov-Zaanen-Andersen 1991; Liechtenstein-Anisimov-Zaanen 1995): Hubbard correction for localized d/f electrons. Standard for transition-metal oxides (NiO, MnO, CeO₂), rare earths, actinides. Free parameter U (or U-J), typically 3–7 eV; can be computed self-consistently (linear-response, Cococcioni-de Gironcoli 2005).
  • DFT-D dispersion (Grimme): D2 (2006), D3 (2010), D3(BJ) Becke-Johnson damping (2011), D4 (2019). Adds C₆/R⁶ + C₈/R⁸ pair-wise corrections; essential for van der Waals (layered materials, MOFs, biomolecules). Alternatives: Tkatchenko-Scheffler vdW-TS (2009), vdW-DF (Dion-Rydberg-Schröder-Hyldgaard-Lundqvist 2004), rVV10.
  • Self-interaction correction (SIC): Perdew-Zunger (1981). DFT spuriously interacts an electron with itself; SIC removes this for localized states.

The Band Gap Problem

Kohn-Sham eigenvalues are not true excitation energies. PBE underestimates band gaps by ~30–50% (Si: 0.6 eV vs 1.17 eV experiment; GaN: 1.6 vs 3.4). Causes: (i) derivative discontinuity of E_xc, (ii) self-interaction error, (iii) lack of non-local screening. Fixes: hybrids (HSE06 typically within 0.3 eV of experiment), GW, scissor-shift (cheap hack), DFT+U (for correlated systems).


DFT Software Landscape

Plane-Wave Codes (periodic systems, condensed matter)

Code | License | Strengths
VASP (Kresse, Hafner, Furthmüller; U Vienna; 1993+) | Commercial | Most cited DFT code; PAW + plane waves; mature high-throughput pipelines
Quantum ESPRESSO (Giannozzi et al; 2009+) | GPL | Free; norm-conserving + USPP + PAW; strong DFPT phonons; EPW
ABINIT (Gonze, UCLouvain; 1997+) | GPL | Free; many-body GW + BSE; DFPT; cutting-edge methods
CASTEP (Cambridge; commercial) | Commercial | Strong NMR + experimental property tooling
CP2K (Hutter; mixed Gaussian + plane-wave) | GPL | Excellent for AIMD of liquids/biomolecules
GPAW (DTU) | GPL | PAW + real-space + plane-wave; tightly coupled with ASE

LCAO / Gaussian Codes (molecules; finite + periodic)

Code | License | Strengths
Gaussian (Pople; G09, G16) | Commercial | Quantum chemistry standard; CCSD(T), basis-set conventions
ORCA (Neese, MPI) | Academic free | Hybrids + CC + multi-reference; explicit correlation F12
NWChem (PNNL) | Open source (ECL 2.0) | Plane-wave AIMD + CC + DFT; HPC scaling
FHI-aims (Blum, Scheffler; 2009) | Commercial/academic | Numeric atom-centered orbitals; all-electron; molecules + solids
CRYSTAL (Torino) | Commercial | Gaussian basis for solids; B3LYP solid-state pioneer
ADF / BAND (SCM, Amsterdam) | Commercial | Slater orbitals; relativistic ZORA; transition-metal chemistry
DFTB+ (Aradi et al) | LGPL | Approximate tight-binding DFT; semi-empirical speed

Real-Space + Finite-Element

  • Octopus (Marques, Castro, Rubio): real-space TDDFT for laser-matter, optical absorption.
  • DFT-FE (Motamarri-Das-Subramanian, 2020): finite-element DFT for very large systems on GPUs; the 2023 Gordon Bell Prize run on Frontier treated roughly 620,000 electrons.
  • PARSEC, RMG.

GPU Acceleration (2024–26)

  • VASP 6.4+ GPU port via OpenACC; ~3–5× speedup on H100 vs CPU node.
  • Quantum ESPRESSO 7.3 CUDA Fortran + NVIDIA Grace-Hopper optimized; routine 10× speedups for plane-wave SCF + DFPT.
  • CP2K GPU offload for DBCSR sparse matrix kernels.
  • AMS / BAND AMD ROCm + NVIDIA CUDA.
  • NVIDIA cuTENSOR / cuQuantum (cuTensorNet) for the tensor contractions that dominate coupled-cluster codes.
  • PySCF + GPU4PySCF (Wu et al 2024): Python quantum chemistry with NVIDIA backend.

Numerics — What You Tune

Pseudopotentials

Core electrons are chemically inert → freeze them and replace with effective potential acting on valence only.

  • Norm-conserving (Hamann-Schlüter-Chiang 1979; Troullier-Martins 1991; Hamann ONCV 2013): preserve scattering properties; safest, transferable; modern ONCV pseudos are accurate + efficient.
  • Ultrasoft (USPP) (Vanderbilt 1990): drop norm-conservation, reduce cutoff dramatically.
  • PAW — Projector Augmented Wave (Blöchl 1994): reconstructs all-electron wave function inside augmentation sphere; near-all-electron accuracy at pseudopotential cost. Dominant choice (VASP default; QE supports).
  • Pseudopotential libraries: PseudoDojo (van Setten 2018), SSSP (Standard Solid State Pseudopotentials, Prandini-Marrazzo 2018), GBRV (Garrity-Bennett-Rabe-Vanderbilt 2014).

Plane-Wave Cutoff (E_cut)

ψ_k(r) = Σ_G c_{k+G} exp(i(k+G)·r); truncate at ½|k+G|² < E_cut (Hartree atomic units; in Rydberg units the condition reads |k+G|² < E_cut). Typical 40–80 Ry for USPP/PAW, 80–150 Ry for norm-conserving. Always converge total energy + forces vs cutoff.
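
To make the truncation rule concrete, the sketch below counts the plane waves kept for a given cutoff. It assumes a simple-cubic cell and Hartree atomic units; the function name `count_plane_waves` and the cell size are illustrative choices, not taken from any DFT code.

```python
import math

def count_plane_waves(a_bohr, ecut_hartree, k=(0.0, 0.0, 0.0)):
    """Count G-vectors of a simple-cubic cell with 0.5*|k+G|^2 < E_cut.

    a_bohr: lattice constant in bohr; ecut_hartree: cutoff in Hartree;
    k: Cartesian k-point in bohr^-1. All names here are illustrative.
    """
    b = 2.0 * math.pi / a_bohr                 # reciprocal-lattice spacing
    g_max = math.sqrt(2.0 * ecut_hartree)      # radius of the cutoff sphere
    n_max = int((g_max + max(abs(c) for c in k)) / b) + 1  # safe index bound
    count = 0
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            for n3 in range(-n_max, n_max + 1):
                q = (k[0] + b * n1, k[1] + b * n2, k[2] + b * n3)
                if 0.5 * (q[0] ** 2 + q[1] ** 2 + q[2] ** 2) < ecut_hartree:
                    count += 1
    return count

# With a = 2*pi bohr the G-vectors are integer triples, so the count is easy to check by hand.
print(count_plane_waves(2.0 * math.pi, 0.6))   # prints 7 (origin + 6 nearest neighbours)
```

The basis size grows roughly as E_cut^(3/2) times the cell volume, which is why the cutoff is one of the two main cost knobs alongside the k-grid.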

k-Point Sampling — Brillouin Zone

The Brillouin zone is the primitive cell of the reciprocal lattice. Integrate over occupied k-states by sampling on a discrete grid: Monkhorst-Pack (1976) uniform meshes. Metals need denser k-grids + smearing (Methfessel-Paxton, Marzari-Vanderbilt cold smearing, Gaussian, Fermi-Dirac). Insulators converge faster. Always check k-grid convergence; high-symmetry path Γ → X → W → K → Γ → L for FCC band-structure plots.
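
A Monkhorst-Pack mesh is simple to generate by hand. The sketch below uses the original 1976 definition u_r = (2r − q − 1)/(2q) for r = 1…q along each reciprocal axis, in fractional coordinates; the function name is illustrative, and symmetry reduction to the irreducible wedge (which production codes apply) is left out.

```python
from itertools import product

def monkhorst_pack(q1, q2, q3):
    """Return the q1*q2*q3 Monkhorst-Pack k-points in fractional coordinates."""
    def axis(q):
        # u_r = (2r - q - 1) / (2q), r = 1..q  (Monkhorst & Pack 1976)
        return [(2 * r - q - 1) / (2 * q) for r in range(1, q + 1)]
    return list(product(axis(q1), axis(q2), axis(q3)))

grid = monkhorst_pack(2, 2, 2)
weights = [1.0 / len(grid)] * len(grid)   # equal weights before symmetry reduction
for kpt in grid:
    print(kpt)
```

Even grids defined this way avoid Γ while odd grids include it, which matters when comparing convergence runs across codes that apply different default shifts.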

Bloch’s Theorem (Bloch, 1929)

In a periodic crystal, eigenstates take the form ψ_{nk}(r) = e^{ik·r} u_{nk}(r) with u_{nk} cell-periodic. Reduces infinite-crystal problem to one Brillouin zone parameterized by k. Foundation of solid-state physics.
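
Bloch's theorem can be checked numerically on the smallest possible model. The sketch below assumes a 1D nearest-neighbour tight-binding ring (hopping t, lattice constant 1), which is a toy model chosen for illustration and does not appear in the text above. On this ring the Bloch form reduces to ψ_n = e^{ikn}/√N with constant u_k, and the expected band is ε(k) = −2t cos k.

```python
import cmath
import math

def apply_ring_hamiltonian(psi, t=1.0):
    """Apply H to psi on a periodic ring: (H psi)_j = -t (psi_{j+1} + psi_{j-1})."""
    n = len(psi)
    return [-t * (psi[(j + 1) % n] + psi[(j - 1) % n]) for j in range(n)]

def bloch_state(n_sites, m):
    """Bloch wave with allowed wave vector k = 2*pi*m/N on an N-site ring."""
    k = 2.0 * math.pi * m / n_sites
    norm = 1.0 / math.sqrt(n_sites)
    return k, [norm * cmath.exp(1j * k * j) for j in range(n_sites)]

def bloch_residual(n_sites, m, t=1.0):
    """Max component of |H psi - eps(k) psi|; ~0 if the Bloch wave is an eigenstate."""
    k, psi = bloch_state(n_sites, m)
    eps = -2.0 * t * math.cos(k)
    h_psi = apply_ring_hamiltonian(psi, t)
    return max(abs(h - eps * p) for h, p in zip(h_psi, psi))

for m in range(8):
    print(m, bloch_residual(8, m))
```

The residuals sit at round-off level for every allowed k, so the N-site eigenproblem is fully labelled by k; real codes exploit the same structure to solve one small problem per k-point.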


Reading What DFT Gives You

Band Structure

ε_n(k) along high-symmetry lines. Read off direct vs indirect gap, effective masses (curvature), spin-orbit splittings (relativistic), Dirac/Weyl crossings.

Density of States (DOS) + Projected DOS (PDOS)

g(ε) = Σ_n,k δ(ε − ε_n(k)). Integrate to get electron count; partial decomposition onto atomic orbitals reveals which atoms + orbitals contribute at the Fermi level (transition-metal d-bands, ligand p-bands).
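
In practice the delta functions are broadened. The sketch below replaces each δ by a normalized Gaussian and checks that the DOS integrates to the number of bands. The toy band ε(k) = −2 cos k, the k-point weights of 1/N_k (so g is per unit cell), and all function names are assumptions made for illustration.

```python
import math

def gaussian_dos(bands_per_k, k_weights, energies, sigma):
    """g(E) = sum_{n,k} w_k * Gaussian(E - eps_n(k)), with width sigma."""
    prefactor = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    dos = []
    for energy in energies:
        total = 0.0
        for weight, bands in zip(k_weights, bands_per_k):
            for eps in bands:
                total += weight * prefactor * math.exp(-0.5 * ((energy - eps) / sigma) ** 2)
        dos.append(total)
    return dos

def trapezoid(y, x):
    """Trapezoid-rule integral of samples y on grid x."""
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1))

# Toy band: 1D tight-binding chain sampled on a uniform k-grid (one band).
n_k = 200
k_points = [-math.pi + 2.0 * math.pi * (i + 0.5) / n_k for i in range(n_k)]
bands_per_k = [[-2.0 * math.cos(k)] for k in k_points]
k_weights = [1.0 / n_k] * n_k

energies = [-4.0 + 0.01 * i for i in range(801)]
dos = gaussian_dos(bands_per_k, k_weights, energies, sigma=0.05)
n_states = trapezoid(dos, energies)   # should be close to 1, the number of bands
print(n_states)
```

The broadening width trades resolution against k-grid noise; tetrahedron integration is the usual alternative when sharp features such as van Hove singularities matter.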

COHP / COBI (Crystal Orbital Hamilton Population / Crystal Orbital Bond Index)

Introduced by Dronskowski and Blöchl (1993); extends molecular-orbital bonding analysis to extended solids. Energy-resolved bonding (negative COHP) / antibonding (positive COHP) contributions. Software: LOBSTER (Maintz, Deringer, Tchougréeff, Dronskowski; projects plane-wave output from VASP/QE onto local orbitals). Companion: COOP (Crystal Orbital Overlap Population; Hughbanks-Hoffmann 1983).

Bader Charge Analysis (Bader, QTAIM; monograph 1990)

Partition real-space density into atomic basins separated by zero-flux surfaces. Yields integrated atomic charges + volumes without arbitrary basis-set choice. Implementations: Bader code (Henkelman group, UT Austin), Critic2 (Otero-de-la-Roza), AIM in many codes.

Electron Localization Function (ELF)

Becke-Edgecombe (1990). Scalar field in [0, 1] showing covalent bonds, lone pairs, atomic shells. 0.5 = electron-gas-like; ~1 = strong localization (bond, lone pair).

STM / AFM Simulation

Tersoff-Hamann (1985): the STM tunneling current is proportional to the sample's local DOS at the tip position, integrated over the bias window near the Fermi level. Codes: STMpw, p4vasp, Critic2.


Beyond-DFT — When DFT Isn’t Enough

GW Approximation (Hedin, 1965)

Self-energy Σ = i G W where G is Green’s function, W is screened Coulomb. Predicts quasi-particle (single-particle excitation) energies → band gaps within ~0.1–0.3 eV of experiment.

  • G₀W₀: one-shot, starts from DFT; depends on starting functional (PBE vs HSE differ).
  • scGW: self-consistent in G + W.
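  • QSGW (Quasi-particle Self-Consistent GW; Faleev-van Schilfgaarde-Kotani 2004): self-consistency in an optimized static one-particle Hamiltonian, which removes the dependence on the DFT starting point; generally among the most accurate GW flavors, though gaps tend to come out slightly too large.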
  • Codes: BerkeleyGW, ABINIT, VASP, FHI-aims, TURBOMOLE.

Bethe-Salpeter Equation (BSE) — Optical Absorption

Two-particle Green’s function captures electron-hole interaction = excitons (Salpeter & Bethe 1951). Required for optical spectra of insulators with strong excitonic binding (LiF, MoS₂ monolayer, organics). Codes: BerkeleyGW, exciting, Yambo.

TD-DFT (Runge-Gross 1984)

Time-dependent extension of Kohn-Sham. Linear-response TDDFT for excitation spectra (cheap, good for valence excitations in molecules; fails for charge-transfer + Rydberg with pure functionals). Real-time TDDFT (RT-TDDFT) propagates orbitals in time → nonlinear optics, strong-field ionization, attosecond dynamics. Codes: Octopus, NWChem RT-TDDFT, Gaussian, GPAW.

DMFT — Dynamical Mean-Field Theory (Metzner-Vollhardt 1989; Georges-Kotliar 1992; Georges-Kotliar-Krauth-Rozenberg review 1996)

Maps lattice Hubbard model onto self-consistent Anderson impurity problem solved exactly. Captures Mott metal-insulator transitions, Hund’s metals, heavy fermions — strongly-correlated physics where DFT+U is insufficient. DFT+DMFT combines DFT with DMFT for materials (cuprates, ruthenates, manganites, iron pnictides, plutonium δ-phase).

Codes + impurity solvers: TRIQS (Saclay), eDMFTF (Haule), DMFTwDFT, AMULET, w2dynamics, ALPS/ALPSCore. Solvers: CT-HYB (Werner 2006), CT-INT, NRG, ED.

Quantum Monte Carlo (QMC)

Stochastic sampling of many-body wave function. Reference accuracy beyond CCSD(T) for some systems.

  • Variational MC (VMC): minimize energy of trial wave function (Jastrow-Slater, backflow).
  • Diffusion MC (DMC): project trial ψ to ground state in imaginary time; fixed-node error from trial nodal surface.
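  • AFQMC (Auxiliary-Field QMC; Zhang-Krakauer 2003): the phaseless approximation controls the phase problem, generalizing the constrained-path approach; complementary to DMC.
  • FCIQMC (Booth-Thom-Alavi 2009): stochastic sampling of the full-CI wave function in Slater-determinant space.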
  • Codes: QMCPACK (Oak Ridge; ECP-funded), CASINO (Cambridge), TurboRVB, NECI, CHAMP. GPU-accelerated, runs on Frontier + Aurora exascale machines.

Molecular Dynamics

Classical MD

Integrate Newton’s equations m_I d²R_I/dt² = −∇_I U({R}) on an empirical PES. Thermostats: Nosé-Hoover (1984), Langevin, Berendsen (deprecated), velocity-rescaling. Barostats: Parrinello-Rahman (1981), Martyna-Tobias-Klein (1994).
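
Most MD engines integrate these equations with a velocity-Verlet-type scheme. The sketch below shows it for a single 1D degree of freedom in the NVE ensemble (no thermostat); the harmonic force and all names are illustrative assumptions, not any code's API.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate m * d2x/dt2 = force(x) with velocity Verlet; returns [(x, v), ...]."""
    f = force(x)
    trajectory = []
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass      # half-kick with the old force
        x = x + dt * v_half                   # drift
        f = force(x)                          # new force at the updated position
        v = v_half + 0.5 * dt * f / mass      # half-kick with the new force
        trajectory.append((x, v))
    return trajectory

def total_energy(x, v, mass=1.0, k_spring=1.0):
    """Kinetic + harmonic potential energy for the toy oscillator below."""
    return 0.5 * mass * v * v + 0.5 * k_spring * x * x

# Toy system: harmonic oscillator with m = k = 1, started at x = 1, v = 0.
traj = velocity_verlet(1.0, 0.0, lambda x: -x, 1.0, 0.01, 10_000)
x_end, v_end = traj[-1]
print(total_energy(x_end, v_end))   # stays close to the initial 0.5
```

The scheme is time-reversible and symplectic, so the energy error stays bounded rather than drifting, which is the main reason it is preferred over higher-order non-symplectic integrators for long runs.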

Codes:

Code | Domain | Notes
LAMMPS (Plimpton, Sandia; 1995+) | Materials, polymers, granular | Most flexible; thousands of force-field styles; pair-style interface for ML potentials
GROMACS (Berendsen group; 1991+) | Biomolecular MD | Highly optimized; GPU-accelerated
AMBER (Kollman-Case; 1981+) | Biomolecular | AMBER force fields
CHARMM (Karplus group; 1983) | Biomolecular | Founding code; CHARMM force fields
NAMD (UIUC; Schulten group) | Biomolecular, large systems | Scales to 10⁹ atoms
OpenMM (Pande, Stanford) | Python-friendly biomolecular | Custom forces; PyTorch integration for ML
ESPResSo (Holm group) | Soft matter, polyelectrolytes | Lattice Boltzmann + MD
HOOMD-blue (Glotzer group, U Michigan) | Self-assembly, colloids | GPU-native from the start

Classical Force Fields

  • Lennard-Jones 6-12 (Jones 1924): noble gases, vdW.
  • Buckingham 6-exp: similar with exponential repulsion.
  • Stillinger-Weber (1985): Si, three-body term enforces tetrahedral geometry.
  • Tersoff (1988) + Brenner / REBO (1990): bond-order potentials for covalent C/Si/Ge.
  • ReaxFF (van Duin, Goddard 2001): reactive bond-order; handles bond breaking/forming; combustion, catalysis.
  • EAM — Embedded Atom Method (Daw-Baskes 1984): metals; embedding energy + pair term.
  • MEAM — Modified EAM (Baskes 1992): adds angular dependence.
  • COMPASS (Sun 1998; commercial): polymers, organics.
  • OPLS-AA (Jorgensen 1996, OPLS3e 2019): biomolecular, drug-like.
  • AMBER ff14SB / ff19SB (Maier 2015, Tian 2020): proteins.
  • CHARMM36 / 36m (Best 2012, Huang 2017): proteins + lipids.
  • GAFF (Wang 2004; GAFF2 2020): general AMBER force field for small molecules.
  • SPC, TIP3P, TIP4P/Ew, TIP5P, OPC: water models.
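
As a concrete instance of the simplest entry in this list, the sketch below evaluates the Lennard-Jones 12-6 pair energy U(r) = 4ε[(σ/r)¹² − (σ/r)⁶] and the corresponding radial force in reduced units. Function names are illustrative; cutoffs, shifts and neighbour lists, which any production code adds, are omitted.

```python
def lj_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones 12-6 pair energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_force(r, epsilon=1.0, sigma=1.0):
    """Radial force F = -dU/dr; positive values are repulsive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r

r_min = 2.0 ** (1.0 / 6.0)            # minimum of the well, in units of sigma
print(lj_energy(r_min), lj_force(r_min))
```

At r = 2^(1/6) σ the energy equals −ε and the force vanishes, which gives a quick sanity check when porting parameters between codes that use different conventions (for example R_min instead of σ).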

Ab-Initio Molecular Dynamics (AIMD)

  • Car-Parrinello MD (Car & Parrinello, Phys. Rev. Lett. 1985): treat electronic degrees of freedom as fictitious dynamical variables on extended Lagrangian; both nuclei and electrons evolve simultaneously. Avoids full SCF at each step. Foundational paper, ~25,000 citations.
  • Born-Oppenheimer MD (BOMD): full SCF convergence at each step; longer time-steps, simpler. Now more common on modern hardware.
  • Codes: CPMD, Quantum ESPRESSO pw + cp, VASP MD, CP2K, NWChem AIMD, GPAW. Time-scale: typically 10–100 ps with ~10²–10³ atoms; integration step ~0.1 fs (10⁻¹⁶ s) for Car-Parrinello and ~0.5–1 fs for BOMD.

Enhanced Sampling

Direct MD only crosses barriers of a few kT within accessible simulation times; higher barriers turn transitions into rare events. Enhanced methods overcome this:

  • Umbrella sampling (Torrie-Valleau 1977): bias along reaction coordinate, reweight via WHAM (Kumar 1992).
  • Metadynamics (Laio & Parrinello, PNAS 2002): history-dependent Gaussian bias on collective variables; well-tempered MetaD (Barducci 2008); ~5000 citations on original paper. PLUMED implementation.
  • Replica Exchange MD / Parallel Tempering (Sugita-Okamoto 1999): exchange replicas at different temperatures to cross barriers.
  • Steered MD (Izrailev-Schulten 1997): pull along coordinate at constant velocity.
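  • Free-Energy Perturbation (Zwanzig 1954): ΔF = −kT ln ⟨exp(−ΔU/kT)⟩₀, with ΔU = U₁ − U₀ and the average taken over configurations sampled in the reference state 0.
  • Thermodynamic Integration: ΔF = ∫₀¹ ⟨∂U/∂λ⟩_λ dλ, with the average taken at each fixed coupling parameter λ.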
  • Adaptive Biasing Force (Darve-Pohorille 2001).
  • String method (E-Ren-Vanden-Eijnden 2002) for minimum free-energy paths.
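
The Zwanzig free-energy perturbation formula from the list above can be tested on a case with a known answer. The sketch below assumes two 1D harmonic wells, U₀ = ½k₀x² and U₁ = ½k₁x², for which ΔF = (kT/2) ln(k₁/k₀) exactly; configurations are drawn directly from the Boltzmann distribution of state 0 instead of from an MD run. All names and parameter values are illustrative.

```python
import math
import random

def fep_delta_f(delta_u_samples, kT):
    """Zwanzig estimator: dF = -kT * ln < exp(-dU/kT) >_0, via log-sum-exp for stability."""
    exponents = [-du / kT for du in delta_u_samples]
    shift = max(exponents)
    mean_exp = sum(math.exp(e - shift) for e in exponents) / len(exponents)
    return -kT * (shift + math.log(mean_exp))

def harmonic_example(k0=1.0, k1=2.0, kT=1.0, n_samples=200_000, seed=0):
    """Return (FEP estimate, exact result) for switching spring constant k0 -> k1."""
    rng = random.Random(seed)
    sigma0 = math.sqrt(kT / k0)                 # Boltzmann width of state 0
    delta_u = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma0)              # sample from state 0
        delta_u.append(0.5 * (k1 - k0) * x * x) # U1(x) - U0(x)
    exact = 0.5 * kT * math.log(k1 / k0)
    return fep_delta_f(delta_u, kT), exact

estimate, exact = harmonic_example()
print(estimate, exact)
```

The estimator converges well here because the two states overlap strongly; when they do not, the exponential average is dominated by rare samples, which is why staged λ windows, BAR/MBAR or thermodynamic integration are used in practice.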

PLUMED (Bonomi, Tribello, 2014, 2019) — universal plugin for biasing on top of MD engines.


Machine-Learning Interatomic Potentials (MLIPs) — The 2024–26 Frontier

The goal: DFT-quality forces at force-field speed, learning E({R}) from DFT training data.

Pre-Equivariant Era

  • Gaussian Approximation Potentials (GAP) — Bartók-Csányi 2010, with SOAP descriptors. Mature; still strong for small datasets.
  • Spectral Neighbor Analysis Potential (SNAP) — Thompson 2015; LAMMPS native.
  • Moment Tensor Potentials (MTP) — Shapeev 2016.
  • ACE — Atomic Cluster Expansion — Drautz 2019; systematically improvable.

Neural-Network Potentials

  • Behler-Parrinello NNP (2007): foundational atom-centered symmetry-function descriptors; n2p2, RuNNer, AENET implementations.
  • SchNet (Schütt et al, NeurIPS 2017): continuous-filter convolutions on graphs; the first widely-used neural force field.
  • DimeNet / DimeNet++ (Klicpera-Günnemann 2020): directional message passing with angular information.
  • PaiNN (Schütt 2021): equivariant features without explicit angular features.
  • GemNet (Klicpera 2021): triplet + quadruplet interactions; OC20 leaderboard top.

Equivariant Graph Neural Networks (the leap)

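  • NequIP (Batzner et al, Nature Communications 2022): E(3)-equivariant features via tensor products of spherical harmonics; notably data-efficient (the paper reports outperforming earlier models with up to three orders of magnitude fewer training data) and generalizes well out of sample.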
  • Allegro (Musaelian et al, Nature Communications 2023): strictly local + equivariant; massively parallel.
  • MACE: Higher-Order Equivariant Message Passing (Batatia, Kovács, Simm et al, NeurIPS 2022): combines ACE + equivariant message passing; state-of-the-art transferability as of 2024–26. MACE-MP-0 (Batatia et al, arXiv 2023) is a universal foundation model covering 89 elements.
  • Equiformer / Equiformer-V2 (Liao-Smidt 2022, 2023): equivariant transformers; SE(3) attention.
  • NequIP-OC / FAIR-Chem family: production training on Open Catalyst datasets.

2024–26 Universal Foundation Models for Materials

  • Orb (Orbital Materials, Neumann et al, 2024): pretrained universal MLIP; fast inference; good zero-shot transfer.
  • MatterSim (Microsoft Research, Yang et al, 2024 preprint → 2025): foundation MLIP trained on millions of DFT calculations across the periodic table; finite-temperature and pressure sampling baked in.
  • UMA — Universal Models for Atoms (Meta FAIR Chemistry 2025): jointly trained on OC20, OC22, ODAC, OMat — handles inorganic crystals + catalysts + adsorbates + MOFs in one model.
  • CHGNet (Deng-Zhong-Chen-Ceder, Nature Machine Intelligence 2023): includes magnetic moments + charges; covers the Materials Project chemical space.
  • M3GNet (Chen-Ong, Nature Computational Science 2022): three-body interactions; broad coverage.
  • GNoME potentials (Merchant et al 2023): the GNN that drove the 2.2M-crystal discovery.
  • SevenNet (Park 2024): graph + equivariance scaled.

Inference: serve via LAMMPS pair_style mliap, ASE, OpenMM, JAX-MD, TorchMD, n2p2. Deployment is now routine: ~10⁵–10⁶ atom MD at DFT-comparable accuracy; quoted throughputs of 1–100 ns/day on a single H100 depend strongly on model architecture and system size.

Active Learning + Uncertainty Quantification

MLIPs fail silently outside training distribution. Mitigations:

  • Active learning loops: train → run MD → detect high-uncertainty structures → re-DFT → re-train. Frameworks: FLARE (Vandermause-Kozinsky 2020; on-the-fly Gaussian-process learning with mapped GPs), hyperactive learning (HAL), pyiron + ASE workflows, MACE+EvoMacs.
  • UQ: deep ensembles (Lakshminarayanan 2017), MVE, Gaussian-process posteriors (FLARE/MGP), evidential regression (Soleimany 2021), conformal prediction.
  • Practical heuristics: ensemble-disagreement on forces, NequIP/MACE committee.
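
The ensemble-disagreement heuristic amounts to a few lines of arithmetic. The sketch below assumes forces from a committee of models are already available as nested lists, `committee_forces[model][atom] = (fx, fy, fz)`; the function names and the threshold value are hypothetical and would need calibration against held-out DFT errors.

```python
import math

def force_disagreement(committee_forces):
    """Per-atom committee spread: sqrt( mean_m |F_(m,i) - <F_i>|^2 )."""
    n_models = len(committee_forces)
    n_atoms = len(committee_forces[0])
    spreads = []
    for i in range(n_atoms):
        mean = [sum(committee_forces[m][i][c] for m in range(n_models)) / n_models
                for c in range(3)]
        variance = sum(
            (committee_forces[m][i][c] - mean[c]) ** 2
            for m in range(n_models) for c in range(3)
        ) / n_models
        spreads.append(math.sqrt(variance))
    return spreads

def needs_dft_label(committee_forces, threshold):
    """Flag a structure for a reference DFT calculation if any atom exceeds the threshold."""
    return max(force_disagreement(committee_forces)) > threshold

# Two models agree on atom 0 and disagree strongly on atom 1 (made-up numbers).
forces = [
    [(0.1, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (-1.0, 0.0, 0.0)],
]
print(force_disagreement(forces))             # [0.0, 1.0]
print(needs_dft_label(forces, threshold=0.2)) # True
```

Taking the maximum over atoms rather than the mean keeps a single badly extrapolated local environment from being averaged away in a large cell.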

Lattice Dynamics + Properties

Density Functional Perturbation Theory (DFPT)

Linear response to perturbations (Baroni-de Gironcoli-Dal Corso-Giannozzi, Rev. Mod. Phys. 2001). Yields phonons (any q without supercell), dielectric tensor (ε∞ + ε⁰), Born effective charges, Raman + IR intensities, piezoelectric tensors, electron-phonon coupling.

Codes: Quantum ESPRESSO ph.x, ABINIT, VASP DFPT, phonopy + phono3py (frozen-phonon supercell alternative; Togo).

Wannier Functions + Wannier90

Maximally-Localized Wannier Functions (MLWFs) — Marzari-Vanderbilt (1997), Souza-Marzari-Vanderbilt (2001). Real-space localized orbitals via gauge transformation of Bloch states. Enable:

  • Interpolated band structures + Fermi surfaces
  • Electron-phonon matrix elements on dense grids (EPW; Giustino-Cohen-Louie 2007)
  • Anomalous Hall + Berry curvature
  • Disentangled bands in metals
  • Tight-binding models for downstream methods

Wannier90 (Mostofi-Yates-Pizzi-Marzari, 2008, 2014, 2020) — the universal post-processing tool. WannierTools (Wu-Zhang-Soluyanov) for topological invariants.

Transport

  • Boltzmann Transport Equation (semiclassical): f(k,r,t) distribution function with collision integral. BoltzTraP / BoltzTraP2 (Madsen-Singh; Madsen-Carrete-Verstraete 2018) — solves BTE in constant-relaxation-time approximation for thermoelectric Seebeck, conductivity, ZT.
  • EPW (Electron-Phonon Wannier; Poncé-Margine-Verdi-Giustino 2016): ab-initio carrier mobility, superconducting T_c via Eliashberg, polarons.
  • AMSET (Ganose-Park-Jain 2021): ab-initio scattering rates (POP, ADP, IMP, PIE) for thermoelectrics.
  • ShengBTE / almaBTE / FourPhonon (Li-Carrete-Mingo 2014): three-phonon (+ four-phonon) lattice thermal conductivity.

Phase-Field Models — Mesoscale

Continuum PDEs for microstructure evolution; order parameters represent phases (composition c(r,t) or non-conserved η(r,t)).

  • Cahn-Hilliard equation (Cahn & Hilliard, J. Chem. Phys. 1958): ∂c/∂t = ∇·(M ∇μ) with μ = δF/δc = ∂f/∂c − κ∇²c, where f(c) is the bulk free-energy density and κ the gradient-energy coefficient. Conserved order parameter; describes spinodal decomposition. The founding paper of phase-field; ~25,000 citations.
  • Allen-Cahn equation (Allen & Cahn 1979): ∂η/∂t = −L δF/δη. Non-conserved (grain growth, phase ordering).
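  • Karma-Rappel model (Karma & Rappel 1998): quantitative dendritic solidification via the thin-interface limit, which reproduces sharp-interface kinetics at computationally tractable interface widths; the anti-trapping current for alloy solidification was added later (Karma 2001).
  • Multi-order-parameter grain-growth models (Krill-Chen 2002; Moelans 2008): multi-grain phase-field for polycrystalline grain growth.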
  • Steinbach multi-phase field (1996+): MICRESS commercial code.
  • Frameworks: MOOSE / MARMOT (INL; finite-element, multi-physics — heavily used for nuclear materials), PRISMS-PF (Michigan; matrix-free FE, adaptive mesh, GPU), OpenPhase (Steinbach), MMSP, Mu-PRO, FEniCS-based custom solvers.
  • Applications: dendritic + columnar solidification (additive manufacturing), spinodal decomposition (Cu-Ni-Fe alloys), Mn-Bi precipitates, grain growth + recrystallization, eutectics, fracture phase-field (Bourdin-Francfort-Marigo 2000), battery electrode microstructure.
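
A minimal 1D Cahn-Hilliard solver shows the conserved dynamics described above. The sketch assumes a double-well bulk free energy f(c) = ¼(c² − 1)², constant mobility, periodic boundaries, and plain explicit Euler time stepping on a finite-difference grid; production codes use semi-implicit spectral or finite-element schemes instead, and every name and parameter value here is illustrative.

```python
import random

def laplacian(field, dx):
    """Second-order periodic finite-difference Laplacian in 1D."""
    n = len(field)
    return [(field[(i + 1) % n] - 2.0 * field[i] + field[(i - 1) % n]) / dx ** 2
            for i in range(n)]

def cahn_hilliard_step(c, dt, dx=1.0, mobility=1.0, kappa=1.0):
    """One explicit Euler step of dc/dt = M * lap(mu), mu = c^3 - c - kappa * lap(c)."""
    lap_c = laplacian(c, dx)
    mu = [ci ** 3 - ci - kappa * li for ci, li in zip(c, lap_c)]
    lap_mu = laplacian(mu, dx)
    return [ci + dt * mobility * li for ci, li in zip(c, lap_mu)]

def run_spinodal(n=64, steps=5000, dt=0.01, seed=0):
    """Start from small noise around c = 0 and return (initial mean, final field)."""
    rng = random.Random(seed)
    c = [0.05 * (rng.random() - 0.5) for _ in range(n)]
    initial_mean = sum(c) / n
    for _ in range(steps):
        c = cahn_hilliard_step(c, dt)
    return initial_mean, c

initial_mean, c_final = run_spinodal()
print(initial_mean, sum(c_final) / len(c_final), max(abs(x) for x in c_final))
```

The mean composition is conserved to round-off while the small initial fluctuations grow into domains near c = ±1; the explicit time step is limited by the fourth-order term (roughly dt ∝ dx⁴), which is the practical motivation for semi-implicit schemes.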

CALPHAD thermodynamic input bridges DFT → phase-field via free-energy functions (already covered separately in crystallography-phase-diagrams).

Numerics + Discretization

  • Spectral methods (FFT-based) for periodic domains — Khachaturyan-Shatalov 1969 elastic kernel; micromechanics phase-field via FFT (Chen-Wang 1996).
  • Finite-element for complex geometries + adaptive refinement (MOOSE/MARMOT, PRISMS-PF, FEniCS).
  • Time integration (often with adaptive step size): backward-Euler, IMEX, semi-implicit Fourier-spectral.
  • Coupling with CALPHAD: TC-PRISMA, OpenCalphad, pycalphad — pull G(c,T) curves from thermodynamic databases directly into phase-field free-energy expressions.

Multi-Scale + Coupled Workflows

Real materials problems span 9+ orders of magnitude in length and time. The contemporary practice:

  1. Electronic structure (DFT/GW/QMC) — meV/atom energies, lattice constants, defect formation energies, migration barriers, electron-phonon couplings, magnetic exchange constants.
  2. Hand-off to MLIPs or classical FF — finite-temperature thermodynamics, mechanical properties under strain, diffusion coefficients, melting points (Z-method, two-phase coexistence), thermal conductivity (Green-Kubo, NEMD).
  3. Hand-off to kinetic Monte Carlo (KMC): rare-event dynamics on the time-scale of seconds to hours, such as thin-film growth, surface reactions, vacancy/dopant diffusion. SPPARKS (Sandia), KMCLib, lattice KMC rate catalogs built from nudged-elastic-band (NEB) barriers (Henkelman-Uberuaga-Jónsson 2000).
  4. Hand-off to phase-field / dislocation dynamics — microstructure, plasticity. MDDP / ParaDiS / MoDELib for discrete dislocation dynamics.
  5. Hand-off to continuum FEM — component-level stress + thermal analysis. See fem-fea.

The bridges (CALPHAD G-functions, ML-trained constitutive laws, atomistically-informed mobility tensors) are where most current research effort sits.

Crystal-Structure Prediction (CSP)

Find the global minimum on the PES for a given chemical composition (or for a fixed atom count).

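  • USPEX (Oganov-Glass 2006): evolutionary algorithm; used in the prediction of superconducting H₃S (Duan et al 2014) + many high-pressure phases.
  • CALYPSO (Wang-Lv-Zhu-Ma 2010): particle-swarm optimization; used in the clathrate-hydride predictions that include LaH₁₀ (Peng et al 2017).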
  • AIRSS — Ab Initio Random Structure Searching (Pickard-Needs 2006, 2011): random sensible initial configurations + DFT relaxation; deceptively simple, very effective at high pressure.
  • Random-search + ML-screening (GNoME 2023 paradigm): generate candidates with GNN, validate with DFT.
  • XtalOpt, GASP-python, PyXtal.

High-Throughput DFT + Materials Databases

Databases (state as of 2024–26)

Database | Custodian | Size (2024–26) | Notes
Materials Project | LBNL (Jain, Persson) | ~155,000 inorganic crystals; band structures, elastic, dielectric, piezo, X-ray | Pymatgen + atomate2; freely accessible REST API
AFLOWlib | Duke (Curtarolo) | ~3.5M entries with phase diagrams | High-throughput VASP since 2012
OQMD (Open Quantum Materials Database) | Northwestern (Wolverton) | ~1.2M formation energies | DFT thermochemistry focus
NOMAD Repository + Archive | FAIRmat / MPCDF | ~13M raw + ~140M total entries; multi-code | EU FAIR-data hub
JARVIS-NIST | NIST (Choudhary) | ~80k crystals + 2D materials + DFT-D3 + ML | JARVIS-Tools
ICSD (Inorganic Crystal Structure Database) | FIZ Karlsruhe (commercial) | ~290k experimentally-determined structures | Experimental reference
AFLOW | Duke | ~3.5M | Workflow-engine sibling of AFLOWlib
Computational Materials Repository (CMR) | DTU | NN-database for ASE | Specialized 2D + catalysis sets
Materials Cloud | EPFL + MARVEL | Multi-tenant; AiiDA-archived workflows | FAIR + reproducible
Open Catalyst Project (OC20 / OC22 / OC25) | Meta FAIR + CMU | ~265M DFT calculations for catalysis | Largest single-domain dataset
Alexandria | Marques group | ~5M PBE + r²SCAN | Bridges MP + OQMD + AFLOW
OMat24 | Meta FAIR | ~118M DFT structures | Foundation for UMA pretraining

High-Throughput Workflow Engines

  • pymatgen (Ong, Comput. Mater. Sci. 2013) — Python materials informatics library.
  • atomate / atomate2 (Mathew 2017; Ganose et al 2024) — Materials Project workflow library.
  • AiiDA (Pizzi-Cepellotti-Sabatini-Marzari 2016; AiiDA-core 2.0 2022) — provenance-tracked workflow engine.
  • FireWorks (Jain 2015) — workflow manager.
  • ASE — Atomic Simulation Environment (Larsen-Mortensen et al 2017) — universal Python interface to ~40 calculators.

ML for Materials Discovery

  • Property prediction: SchNet, MEGNet (Chen-Ye-Zuo-Zheng-Ong 2019), CGCNN (Xie-Grossman 2018), ALIGNN (Choudhary-DeCost 2021), Matbench leaderboard.
  • Generative models: CDVAE (Xie et al, ICLR 2022), DiffCSP, FlowMM (Miller et al 2024), MatterGen (Zeni et al, Microsoft Research; arXiv 2023, Nature 2025): a diffusion model conditioned on composition + property targets that generates novel inorganic crystals with a high success rate after DFT relaxation.
  • GNoME — Graph Networks for Materials Exploration (Merchant et al, Nature 2023, Google DeepMind): 2.2 million predicted-stable inorganic crystals via GNN-driven active learning over a 380M structure search; ~380k below the convex hull. 736 experimentally synthesized (overlap with prior literature contested). A massive leap in scale; integrated with Materials Project.
  • A-Lab Berkeley (Szymanski, Ceder et al, Nature 2023): autonomous synthesis lab that takes computationally predicted targets (Materials Project and GNoME stability data), plans + executes solid-state syntheses, characterizes via XRD, and reported 41 of 58 attempted targets synthesized (~70%). Validation controversy (Leeman et al, PRX Energy 2024): debate over whether the successes are truly novel phases vs. known or disordered compounds; the community response refined characterization standards.
  • MatterSim (Microsoft 2024–25): foundation-MLIP enabling temperature + pressure-dependent property prediction at scale.
  • Active learning closing the loop: ML-pred → DFT-verify → synth-attempt → characterize → update ML. Approaching genuine self-driving materials laboratories.
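
The "DFT-verify" step in these loops usually comes down to an energy-above-hull check. The sketch below does this for a binary A-B system only, with entries given as (fraction of B, formation energy per atom); the numbers are made up for illustration and the function names are hypothetical. For real multi-component systems a tested library implementation, such as the phase-diagram tools in pymatgen, is the safer choice.

```python
def lower_hull(entries):
    """Lower convex hull of (x, E_f) points via Andrew's monotone chain."""
    best = {}
    for x, e in entries:                      # keep the lowest energy per composition
        best[x] = min(e, best.get(x, e))
    hull = []
    for p in sorted(best.items()):
        while len(hull) >= 2:
            (x1, e1), (x2, e2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - e1) - (e2 - e1) * (p[0] - x1)
            if cross <= 0:                    # middle point lies on or above the chord
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def hull_energy(hull, x):
    """Linear interpolation of the hull at composition x."""
    for (x1, e1), (x2, e2) in zip(hull, hull[1:]):
        if x1 <= x <= x2:
            return e1 + (e2 - e1) * (x - x1) / (x2 - x1)
    raise ValueError("composition outside the hull range")

def energy_above_hull(entries, x, e):
    """Distance of a candidate (x, e) above the hull built from `entries`."""
    return e - hull_energy(lower_hull(entries), x)

# Hypothetical formation energies in eV/atom; elements at x = 0 and x = 1 define zero.
known = [(0.0, 0.0), (1.0, 0.0), (0.5, -1.0), (0.25, -0.3)]
print(lower_hull(known))
print(energy_above_hull(known, 0.25, -0.3))   # about 0.2 eV/atom above the hull
```

A candidate on the hull returns zero; positive values measure the driving force for decomposition into the neighbouring hull phases, and screening pipelines typically apply a small tolerance rather than demanding exactly zero.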

Industrial / Commercial Software

Vendor / Stack | Coverage
Schrödinger Materials Science Suite | DFT (Jaguar), MD (Desmond), QM/MM, OPLS, polymers, batteries
Materials Studio (Dassault / BIOVIA) | CASTEP + DMol3 + Forcite; UI-driven; widely licensed in industry
AMS / SCM Amsterdam Modeling Suite | ADF, BAND, DFTB, ReaxFF, COSMO-RS; relativistic chemistry strength
Matlantis (Preferred Networks + ENEOS, Tokyo) | Cloud MLIP service using PFP (Preferred Potential), >70 elements, finance-friendly subscription
Polymerize | Polymer R&D ML platform
Kebotix | Self-driving lab + ML for specialty chemicals
Mat3ra (formerly Exabyte.io) | Cloud DFT platform
Citrine Informatics | Materials data + Bayesian optimization platform
Albert / Uncountable / Enthought | Materials informatics tooling


Citations

  • Born, Oppenheimer. Ann. Phys. 1927.
  • Slater. Phys. Rev. 1929 (determinant).
  • Bloch. Z. Phys. 1929.
  • Hartree. Proc. Camb. Philos. Soc. 1928; Fock. Z. Phys. 1930.
  • Hohenberg, Kohn. Phys. Rev. 1964.
  • Kohn, Sham. Phys. Rev. 1965.
  • Hedin. Phys. Rev. 1965 (GW).
  • Cahn, Hilliard. J. Chem. Phys. 1958.
  • Allen, Cahn. Acta Metall. 1979.
  • Monkhorst, Pack. Phys. Rev. B 1976.
  • Car, Parrinello. Phys. Rev. Lett. 1985.
  • Stillinger, Weber. Phys. Rev. B 1985.
  • Tersoff. Phys. Rev. B 1988.
  • Becke. J. Chem. Phys. 1993 (B3LYP).
  • Perdew, Burke, Ernzerhof. Phys. Rev. Lett. 1996 (PBE).
  • Heyd, Scuseria, Ernzerhof. J. Chem. Phys. 2003, 2006 (HSE06).
  • Sun, Ruzsinszky, Perdew. Phys. Rev. Lett. 2015 (SCAN).
  • Vanderbilt. Phys. Rev. B 1990 (USPP).
  • Blöchl. Phys. Rev. B 1994 (PAW).
  • Marzari, Vanderbilt. Phys. Rev. B 1997 (MLWFs).
  • Baroni et al. Rev. Mod. Phys. 2001 (DFPT).
  • Laio, Parrinello. PNAS 2002 (metadynamics).
  • Schütt et al. NeurIPS 2017 (SchNet).
  • Batzner et al. Nat. Commun. 2022 (NequIP).
  • Batatia et al. NeurIPS 2022 (MACE).
  • Merchant et al. Nature 2023 (GNoME).
  • Szymanski et al. Nature 2023 (A-Lab).
  • Chen, Ong. Nat. Comput. Sci. 2022 (M3GNet); Deng et al. Nat. Mach. Intell. 2023 (CHGNet).
  • Zeni et al. Nature 2025 (MatterGen; arXiv preprint 2023).
  • Yang et al. arXiv 2024 (MatterSim).
  • van Duin, Goddard. J. Phys. Chem. A 2001 (ReaxFF).
  • Jain et al. APL Materials 2013 (Materials Project).
  • Curtarolo et al. Comput. Mater. Sci. 2012 (AFLOW).
  • Pizzi et al. Comput. Mater. Sci. 2016 (AiiDA).