Lean — Reference

Source: https://lean-lang.org/

Lean

  • Created: 2013 (Lean 1, by Leonardo de Moura at Microsoft Research). Lean 4 first released in stable form in 2021; this note covers Lean 4 only — Lean 3 is unmaintained as of 2023.
  • Latest stable: Lean 4 v4.x — actively released on a roughly monthly cadence (e.g., v4.29.x April 2024 per https://github.com/leanprover/lean4/releases). The community ecosystem (mathlib4) is what most users actually pin against.
  • Owner / steward: Lean FRO (Lean Focused Research Organization, https://lean-fro.org/), a 501(c)(3) since 2023, funded by Simons Foundation, Sloan Foundation, Microsoft, AWS. Originated at Microsoft Research; Leonardo de Moura now leads from Amazon AWS.
  • Paradigms: Functional, dependently-typed, with strict evaluation. Doubles as a general-purpose programming language and an interactive theorem prover.
  • Typing: Dependent type theory — the Calculus of Inductive Constructions (CIC) with universe polymorphism, quotient types, and propositional extensionality. Definitional proof irrelevance for Prop.
  • Memory: Garbage-collected with reference counting + cycle detector; in-place updates (“destructive update via uniqueness”) when refcount = 1, giving Lean 4 unusually good imperative-style perf for a pure FP language.
  • Compilation: Compiles to C via Lean’s own compiler, then C → native via clang/gcc. Has its own bytecode interpreter for tactic-time execution.
  • Primary domains: Formalized mathematics (mathlib4 — the largest formal math library in any system), program verification (AWS Cedar, Aeneas/Rust verification), DSLs and proof-engineering research.
  • Official docs: https://lean-lang.org/https://leanprover.github.io/theorem_proving_in_lean4/https://leanprover-community.github.io/

At a glance

Lean 4 is the rare language that is both a serious programming language and a frontline theorem prover. It is competitive with OCaml/Haskell on raw perf for normal code thanks to its FBIP (Functional But In-Place) reference-counting compiler, and competitive with Coq/Agda for proofs thanks to its powerful elaborator and macro system. mathlib4 (https://github.com/leanprover-community/mathlib4) crossed 1.5M+ lines in 2024 and is the de-facto standard library for math. The project is stewarded by the Lean FRO. Source: https://lean-lang.org/, https://lean-fro.org/about/.

Getting started

  • Install: elan is the version manager (analogous to rustup). Install once, then per-project lean-toolchain files pin a specific Lean version.

    curl https://raw.githubusercontent.com/leanprover/elan/master/elan-init.sh -sSf | sh

    See https://leanprover-community.github.io/get_started.html.

  • Hello world:

    def main : IO Unit := IO.println "Hello, Lean 4!"
  • Smallest proof:

    theorem add_zero (n : Nat) : n + 0 = n := rfl
  • Project layout (Lake):

    lake new myproj          -- creates lakefile.lean / Main.lean / MyProj.lean
    lake build               -- compile
    lake exe myproj          -- run

    lakefile.lean declares dependencies; lean-toolchain pins the compiler version.

  • Editor: VS Code + the official Lean 4 extension is the canonical environment — it shows the infoview with goals, expected types, and proof state inline. Emacs has lean4-mode.

  • REPL / proof env: No standalone REPL; the editor’s interactive mode is the REPL — every cursor position has its own typing context displayed in the infoview.

Basics

  • Types & literals: Nat, Int, Float, Char, String, Bool, Unit, Prop, Type, Type u (universes). Numeric literals via OfNat.

  • Variables / scoping: let x := e1; e2 and let x := e1 blocks. def, theorem, instance, example are top-level.

  • Control flow: if c then a else b, match ... with, do-notation works for any monad including IO. Imperative-feeling do is a major Lean-4 selling point.

  • Functions:

    def fib : Nat → Nat
      | 0 => 0
      | 1 => 1
      | n+2 => fib n + fib (n+1)

    Lambda: fun x => x + 1 or fun | 0 => 0 | n+1 => n.

  • Strings: UTF-8 String; s!"hello {name}" for interpolation, ++ for concat.

  • Collections: List α, Array α (the workhorse — uses FBIP for O(1) updates), HashMap, RBMap, Std.HashSet.

Intermediate

  • Type system depth: Full CIC. Universe polymorphism is explicit: def id.{u} (α : Type u) (a : α) : α := a. Prop vs Type: Prop is impredicative and proof-irrelevant; Type u is predicative. Inductive families, mutual inductives, and nested inductives. Type classes are first-class (resolved by the elaborator). See https://leanprover.github.io/theorem_proving_in_lean4/dependent_type_theory.html.
  • Modules: Files are namespaces. import Foo.Bar loads compiled .olean files. namespace / end for nesting.
  • Error handling: Option, Except ε α, EIO. MonadError, MonadLift. Panics are panic! (last resort).
  • Concurrency: Task α and IO.asTask for green-thread-like parallel computation; IO.Mutex, IO.Channel. The runtime uses a thread pool. (https://leanprover-community.github.io/blog/posts/lean4-multithreading/)
  • I/O: IO α monad. IO.FS.*, IO.Process.*, IO.println, IO.eprintln. MonadLiftT lets IO actions appear in any sub-monad.
  • Stdlib highlights: Std4 (now mostly merged into core/Lean.Data) — HashMap, RBTree, BinomialHeap, Parser combinators. The core is small; mathlib4 is what most projects depend on.

Advanced

  • Memory / GC: Reference counting + cycle collector. The compiler tracks unique references and rewrites pure Array.set etc. into in-place mutation when safe — see Sebastian Ullrich & Leonardo de Moura, Counting Immutable Beans, IFL 2019 (https://leanprover.github.io/papers/beans.pdf).
  • Concurrency: As above — Task, IO.Mutex, IO.Channel. Lean 4 is genuinely usable for medium-scale concurrent server code.
  • FFI: @[extern "c_function_name"] attribute on a Lean declaration binds it to a C function. The C side gets an opaque lean_object* and uses Lean’s runtime API. Reference: https://lean-lang.org/lean4/doc/dev/ffi.html.
  • Reflection / metaprogramming: Lean 4 has the most powerful and ergonomic macro system of the dependently-typed family. macro, elab, syntax declarations + the MacroM and TermElabM monads let you define new syntactic categories with full hygiene (https://leanprover.github.io/lean4/doc/macro_overview.html). Whole DSLs (e.g., the do notation, simp configuration syntax) are implemented as macros.
  • Performance tools: set_option profiler true for elab/compile profiling; lean --profile; perf on the C output.
  • Tactics & proof automation:
    • simp — the primary rewrite tactic; simp_rw, simp only [lemma].
    • omega — decision procedure for linear arithmetic over Nat/Int.
    • decide — runs the kernel on a decidable proposition.
    • polyrith (mathlib) — polynomial arithmetic via Sage.
    • linarith, nlinarith, positivity, gcongr (mathlib).
    • grind — the new general-purpose congruence/instantiation solver shipped with recent Lean 4 (https://lean-lang.org/blog/).
    • exact?, apply?, rw? — interactive lemma search.

God mode

  • Macro & elaborator framework: syntax (name := myStx) "myop " term : term declares new syntax; macro_rules or elab_rules give it meaning. Full hygiene via MonadQuotation. You can introduce notation, custom binders, even new tactic combinators. (https://leanprover.github.io/theorem_proving_in_lean4/tactics.html, https://leanprover.github.io/lean4/doc/macro_overview.html)
  • partial vs total: Functions must be proven terminating or marked partial. partial defs do not unfold for the kernel — they’re opaque to proofs. unsafe is even stronger: opts out of value-soundness, only valid because the type has an Inhabited instance.
  • Kernel: Tiny trusted core checks CIC terms with definitional equality including beta/eta/iota/delta/proof-irrelevance for Prop. Universe polymorphism is part of the kernel. Nested inductives are compiled to mutual inductives by the elaborator before the kernel sees them.
  • Dependent pattern matching: Lean’s equation compiler handles overlapping patterns by inserting casesOn/recOn and equality witnesses. When patterns can’t be made non-overlapping the compiler emits a match elaboration error — sometimes you need to refactor or use match h : x with.
  • @[extern] and C compilation: Lean compiles each definition to a C function; @[extern] swaps in a hand-written C implementation while keeping the Lean signature for type checking. This is how String.append, Nat.add (for big nats), arrays etc. are fast.
  • Native compiler architecture: Lean source → IR → “λ_pure” / “λ_rc” (FBIP-aware IRs) → C → native. The Counting Immutable Beans paper documents this pipeline.
  • Performance vs Coq/Agda: Lean 4 compiled code is typically 1-2 orders of magnitude faster than Coq’s native_compute and far faster than Agda’s GHC backend for non-trivial programs, because Lean is built around being a compiled language from day one.

Idioms & style

  • Naming: UpperCamelCase for types/structures/inductives; lowerCamelCase for definitions and theorems; snake_case is not used (mathlib enforces this — https://leanprover-community.github.io/contribute/style.html). Theorem names read like sentences: Nat.add_comm, List.map_append.
  • Formatter / linter: No autoformatter equivalent to gofmt; mathlib enforces style via PR review and a Mathlib.Tactic.Linter.* suite (unusedHavesSuffices, simpNF, docPrime, etc.).
  • Idiomatic patterns: Prefer simp lemmas (@[simp]-tagged) over manual rewrites; structure proofs with calc blocks; use type classes for ad-hoc polymorphism; lean on omega/decide early.
  • What experts look for: Short, declarative tactic blocks (one-liners with simp [*]; omega are admired); @[simp] discipline (no looping rewrite rules); mathlib-style naming (add_comm, not addComm in the math library — but the core uses addComm; mathlib uses add_comm for historical reasons).

Ecosystem

Gotchas

  • Lean 3 vs Lean 4 are different languages. Lean 3 syntax (begin ... end tactic blocks, := vs :, meta keyword) does not work in Lean 4. mathlib3 was machine-translated to mathlib4 then human-cleaned; Lean 3 is unmaintained.
  • Universe-polymorphism inference: When a definition fails with “universe metavariable”, you usually need to be explicit: def foo.{u, v} ....
  • simp looping: Adding a poorly-oriented @[simp] lemma can make simp non-terminate. The simp linter catches some of this, not all.
  • decide is too eager: It runs the kernel; on big propositions this can take minutes. Use Nat.reduceMul and friends for static numeric checks.
  • partial defs don’t reduce in proofs. If you want to prove things about a function, define it total (with a termination proof) or use WellFoundedRecursion.
  • Tactic backtraces are noisy: Lean errors include macro/elaborator stacks; learn to read them top-down.
  • Tooling versions matter: A mathlib commit pins a specific Lean toolchain. Updating is a chore handled by lake update plus lake exe cache get (mathlib’s prebuilt-.olean cache).
  • Type vs Prop mistakes: if h : p then ... else ... where p : Prop requires [Decidable p]. Forgetting this gives confusing errors.

Citations