MPC for Robots

Model Predictive Control (MPC) for robots is the practice of repeatedly solving a finite-horizon optimal-control problem (OCP) on-line, applying the first control to the plant, then re-solving from the new measured state. The textbook recipe is identical to process-industry MPC (see [[Engineering/mpc-control]]), but the implementation is different in kind: robot dynamics are nonlinear and underactuated, state-dim is high (40+ for humanoids), the loop closes at 100-1000 Hz, and intermittent contact makes the dynamics hybrid. This note collects the robot-specific design, solver, and deployment knowledge that the process-MPC literature does not cover.

1. At a glance

Why robot MPC is its own subfield, not just “MPC with a different plant”:

Nonlinear rigid-body dynamics. The model ẋ = f(x,u) comes from Newton-Euler / Lagrangian mechanics (see [[Robotics/dynamics-rigid-body]]). Linear MPC formulations (Camacho 1995; Maciejowski 2002) used in process control do not generalise without re-linearisation every step, which is expensive when the Jacobian is 40×40.
High state-dim. A 7-DOF arm has 14 states; a quadruped (floating base
- 12 joints) has 36 states; a humanoid (floating base + 28-32 joints) sits at 60-70 states. Naive QP scales O(N³·n³) with horizon N and state-dim n.
Fast update rate. Manipulation MPC: 50-200 Hz. Legged locomotion MPC: 100-500 Hz inner / 30-100 Hz outer. Drone NMPC: 100-1000 Hz. A control cycle of 5 ms leaves no time for general-purpose IPM that costs 50 ms.
Hybrid dynamics from contact. Each foot independently makes/breaks contact; force is constrained to a friction cone in stance and is zero in swing. This is mixed-integer in the strict reading; practical robotics MPC uses (a) prescribed contact schedule (gait), (b) complementarity constraints (Posa, Cantu, Tedrake 2014), or (c) smoothed contact models (Howell, Le Cleach’, Manchester 2022).

Two solver families dominate:

DDP / iLQR family (Mayne 1966; Tassa, Erez, Todorov 2012). Exploits the Bellman recursion to factor the Hessian; cost is O(N·n³) per iteration instead of O((N·n)³). Unconstrained at heart; constraints handled by penalty, augmented Lagrangian, or projected line-search. Toolchains: Crocoddyl (LAAS+Edinburgh 2020), OCS2 (ETH 2017), Drake’s TrajectoryOpt.
Direct-collocation + SQP / IPM (Bock 1984; Hager 2000). Treats both x and u as decision variables; dynamics become equality constraints. Hard inequality constraints are first-class. Real-time-iteration (RTI) variant (Diehl, Bock, Schlöder 2005) runs one SQP step per control cycle and warm-starts from the previous solution — this is what makes embedded NMPC tractable. Toolchains: acados (Verschueren, Frey, Kirches 2018-2021), FORCES Pro (Domahidi, Jerez 2014), CasADi + IPOPT (Andersson 2019; Wächter & Biegler 2006) for offline.

The choice is mostly cultural: legged-robotics labs in Europe (LAAS, ETH, Edinburgh, IIT) lean DDP via Crocoddyl/OCS2; aerial-robotics labs and many manipulation groups use acados; offline trajectory studies use CasADi.

2. First principles

A finite-horizon continuous-time OCP looks like:

minimise   ∫₀^T L(x(t), u(t)) dt + φ(x(T))
   x(·),u(·)
subject to ẋ(t) = f(x(t), u(t)),       t ∈ [0,T]
           x(0) = x̂                     (current measured state)
           g(x(t), u(t)) ≤ 0           (inequality, e.g. friction cone, joint lim)
           h(x(t), u(t)) =  0          (equality, e.g. contact-foot-fixed)
           x(T) ∈ X_f                  (terminal set, often {x_goal})

L is the stage cost, φ the terminal cost, f the dynamics. For receding- horizon MPC, this OCP is re-solved at every control step from the latest x̂ and the first control u*(0) is applied. Discretisation converts the continuous-time problem to an N-stage NLP.

Three discretisation flavours are used in robotics:

Single shooting. Decision variables are only u₀, …, u_{N-1}; states are integrated by a numerical scheme (RK4 typical for robots — symplectic integrators preferred for energy-conserving long horizons). Hessian becomes dense and ill-conditioned for long horizons; rarely used in practice for N > 30.

Multiple shooting (Bock & Plitt 1984). Discretise the horizon into N intervals; both x_k and u_k are decision variables, with shooting-gap equality constraints x_{k+1} = F(x_k, u_k) (RK4 step). Parallelisable across intervals, much better-conditioned, and the structure exploitable by Riccati- type linear solvers. acados, FORCES Pro, and Crocoddyl’s MPC mode all use multiple shooting.

Direct collocation. State and control are represented by polynomials within each interval; the dynamics ẋ = f(x,u) is enforced at collocation points (Legendre-Gauss-Radau pseudospectral — Hager 2000; Hermite-Simpson — Hargraves & Paris 1987). State and control become joint decision variables. Used by TrajOpt (Schulman 2013), PSOPT, Drake. Higher accuracy per node but larger NLP; better for offline trajectory optimisation than for inner-loop MPC.

DDP / iLQR. A dynamic-programming method that linearises f and quadratises L around a trial trajectory and computes a feedback policy δu_k = K_k δx_k + d_k via a backward Riccati pass, then a forward roll-out with line-search. iLQR (Li & Todorov 2004; Tassa 2012) drops the second-order dynamics term of true DDP (Mayne 1966) — cheaper, almost the same convergence in practice. Variants:

SLQ (Sideris & Bobrow 2005, used in ETH’s OCS2) — continuous-time DDP.
FDDP (Mastalli et al. 2020, in Crocoddyl) — “feasibility-driven” DDP that does not require dynamic feasibility at every iterate; converges from arbitrary initial guesses including non-rolled-out ones.
iLQG (Todorov & Li 2005) — DDP with control noise.
AL-iLQR (Howell, Jackson, Manchester 2019, ALTRO) — augmented- Lagrangian wrapping iLQR for hard inequality constraints.

Real-time iteration (RTI). Diehl, Bock, Schlöder 2005 observed that solving the NLP to convergence every cycle is wasteful when the previous cycle’s solution is a good warm-start. RTI runs one full SQP step (linearise + solve QP + take step) per control cycle, exploiting the fact that the warm- started QP is well-conditioned. This is the bedrock of all real-time NMPC on robots; “MPC at 100 Hz” essentially means “one RTI step per 10 ms”.

3. Worked examples

(a) Cart-pole swing-up via iLQR

The canonical demo (running example in Tassa, Erez, Todorov 2012). State x = [x, θ, ẋ, θ̇] (cart position [m], pole angle [rad], cart velocity [m/s], pole angular velocity [rad/s]); control u is force on cart [N]. Goal: invert the pole from θ=π (hanging) to θ=0 (up) and stabilise.

Cost design:

Horizon N = 100, dt = 0.02 s → T = 2 s.
Stage cost: L(x,u) = x' Q x + u' R u with Q = diag(1, 10, 0.1, 0.1), R = 0.001.
Terminal: φ(x_N) = x_N' Q_f x_N with Q_f = 100·I₄.

One iLQR iteration:

Forward pass. Roll out from x̂ with current control sequence u^{(i)}; record (x_k, u_k) and cost-to-go.
Linearise & quadratise. Compute A_k = ∂f/∂x, B_k = ∂f/∂u via finite- difference (or analytic — see Pinocchio’s algorithmic derivatives for the rigid-body case). Compute first/second derivatives of L.
Backward pass. Riccati recursion for V_k = optimal cost-to-go quadratic approximation; compute feedback K_k and feedforward d_k.
Forward roll-out with line-search. Try α = 1 first; if cost increases, halve to α = 0.5, 0.25, …, with min α ≈ 2⁻¹⁰. Updates: u_k = u_k^{(i)} + α d_k + K_k (x_k - x_k^{(i)}).
Convergence test. Stop when |dV_α=1| < 10⁻⁶ or ||d_k||_∞ < 10⁻⁴.

Empirically converges in ~15 iterations from u ≡ 0 initial guess. Tassa’s implementation runs ~50 ms in MATLAB; modern C++ Crocoddyl does it in <1 ms.

(b) Quadruped trot via SRBD MPC

The Di Carlo & Wensing 2018 IROS convex MPC, deployed on MIT Cheetah 3 and nearly every academic quadruped since.

Model — Single Rigid Body Dynamics (SRBD). Replace the full 18-DOF floating-base + 12-joint system with the centroidal inertia of the trunk and the ground-reaction-forces (GRFs) at the feet. State x ∈ R¹² = [p, Θ, ṗ, ω], where p ∈ R³ is base position [m], Θ ∈ R³ is roll-pitch-yaw [rad], ṗ ∈ R³ is COM linear velocity [m/s], ω ∈ R³ is base angular velocity [rad/s]. Control u ∈ R¹² = [f₁, f₂, f₃, f₄], one 3D force per foot [N].

Dynamics (linearised, key trick — Di Carlo 2018):

m p̈ = Σᵢ fᵢ - m g
I_world ω̇ = Σᵢ (rᵢ - p) × fᵢ
Θ̇       = T(Θ) ω                  (Euler-rate map)

Linearise the rotational inertia by holding I_world fixed at the current orientation each MPC step — this turns the dynamics affine in u and produces a QP rather than an NLP.

Constraints.

Friction cone (linearised pyramid): |fᵢ_x| ≤ μ fᵢ_z, |fᵢ_y| ≤ μ fᵢ_z, with μ ≈ 0.4-0.6 depending on surface.
Stance/swing schedule from a prescribed gait clock: fᵢ_z = 0 when foot i is in swing; fᵢ_z ≥ f_min ≈ 5 N when in stance.
Force magnitude: fᵢ_z ≤ f_max ≈ 250 N (Cheetah 3 actuator limit).

Horizon & rate. N = 50 nodes, dt = 0.01 s → 0.5 s lookahead. Re-solve at 100 Hz. Solve time on Intel i7 NUC: 3-8 ms with qpOASES warm-started; ANYmal deployments report 5-15 ms.

The MPC output is reference foot forces. A lower-loop whole-body QP at 500- 1000 Hz tracks these forces while satisfying the full kinematic/dynamic manipulator constraints (see [[Robotics/legged-robotics]] for the WBC).

(c) 7-DOF manipulator collision-avoidance MPC

Setup: Franka Emika Panda (7 revolute joints). State x ∈ R¹⁴ = [q; q̇], joint positions [rad] and velocities [rad/s]. Control u ∈ R⁷ is joint torque [N·m] or joint acceleration depending on whether the inner loop is torque or velocity (Franka exposes both via libfranka).

Model. Full rigid-body dynamics M(q) q̈ + C(q,q̇) q̇ + g(q) = τ, with M, C, g computed by Pinocchio’s CRBA/RNEA (Featherstone 2008 algorithms).

Cost.

Tracking: ||p_ee(q) - p_des(t)||² weighted ~100 in [m²].
Posture: ||q - q_rest||² · w_posture with w_posture = 0.01 (regulariser).
Effort: ||τ||² · 10⁻⁴.
Terminal: tracking weight ×10.

Constraints.

Joint limits q_min ≤ q ≤ q_max (Panda: q1∈[-2.9, 2.9], q4∈[-3.07, -0.07], etc., from the official URDF).
Velocity limits |q̇| ≤ q̇_max (Panda: 2.17 rad/s on q1).
Torque limits |τ| ≤ τ_max (Panda: 87 N·m on q1-4, 12 N·m on q5-7).
Signed-distance ≥ d_safe ≈ 0.03 m to a list of spherical obstacle primitives. SDF linearised around current iterate (Schulman’s TrajOpt approach 2013); use HPP-FCL or libccd for the SD computation.

Horizon & rate. N = 20, dt = 0.05 s → 1 s lookahead. 50 Hz outer. acados with RTI + partial condensing solves in 2-5 ms on a standard i7 CPU.

4. Solver families & toolchains

Toolchain	Author(s) / year	Language	Method	Typical deployment
Crocoddyl	Mastalli et al., LAAS-CNRS + Edinburgh, 2020 (ICRA)	C++ / Python	DDP, FDDP, AL	Talos humanoid, ANYmal, MIT-Cheetah replicas, HRP-2, SOLO
acados	Verschueren, Frey, Kirches et al., 2018-2021	C (codegen)	RTI-SQP + HPIPM/qpOASES	TIAGo, Jackal UGV, autonomous racing, drones, Crazyflie
OCS2	Farshidian et al., ETH RSL, 2017	C++	SLQ (cts-time DDP)	ANYmal C / D, ANYmal-on-wheels, ballbot
CasADi + IPOPT	Andersson, Gillis, Horn, Rawlings, Diehl 2019; Wächter & Biegler 2006	Python / MATLAB / C++	Direct collocation + IPM	Offline trajopt, MHE for SLAM, research prototypes
TrajOpt	Schulman, Ho, Lee, Awwal, Bradlow, Abbeel 2013 (IJRR)	C++ / Python	Sequential convex programming	Kinematic trajectory optimisation, motion planning
FORCES Pro	Domahidi & Jerez 2014, embotech	C codegen, commercial	RTI-SQP + interior-point	Embedded automotive, drones; ABB / Mercedes deployments
drake/MathematicalProgram	Tedrake et al., MIT, 2014+	C++ / Python	Multi-backend (SNOPT, IPOPT, Gurobi, MOSEK)	Research; Atlas / Spot at TRI; benchmark studies
ALTRO	Howell, Jackson, Manchester 2019	Julia / C++	AL-iLQR	Aerospace research, REx Lab
HPIPM	Frison, Diehl 2020	C	Riccati QP / IPM	Internal solver inside acados
PSOPT	Becerra 2010	C++	Pseudospectral collocation	Aerospace, offline studies

A few practical notes:

Crocoddyl is the dominant choice when you have analytical rigid-body derivatives via Pinocchio (Carpentier et al. 2019); the two libraries share authors and integrate tightly. FDDP convergence from infeasible warm-starts is genuinely useful in MPC because the previous-step solution may violate the new shooting gaps after a state jump.
acados is the most-deployed embedded NMPC framework as of 2024-2025. Codegens C with no runtime dependencies; HPIPM exploits the multiple- shooting block-tridiagonal Hessian structure. The Python interface is excellent for development; the generated C runs anywhere.
OCS2 has a small footprint but is older and less actively maintained than Crocoddyl; ANYbotics inherited it via Marco Hutter’s lab.
CasADi is the right answer when “off-line” — designing controllers, identifying parameters, computing reference trajectories — and the wrong answer for inner-loop MPC because IPOPT is too slow per iteration.

5. Design heuristics

Real-time MPC succeeds or fails on a handful of pragmatic choices:

Keep the horizon short. T = 0.3-1.0 s is normal for legged and aerial robots. Manipulation: 0.5-2 s. Process MPC’s “few minutes” horizons are irrelevant — robot dynamics are stiff at the second-scale.
dt ≥ 5 ms. RK4 with sub-millisecond dt explodes the NLP size with little benefit. If you need fine integration, run a finer integrator inside each shooting interval (multiple-shooting “implicit RK”).
Use RTI, not full SQP-to-convergence. Diehl 2005’s result is that one SQP step from a warm-start is nearly optimal when the plant is stable; multiple steps are wasteful when the next solve is 10 ms away.
Warm-start always. Initialise (x, u) with the shifted previous solution: u_k ← u_{k+1}^prev, x_k ← x_{k+1}^prev, then re-roll the dynamics. The first cold solve is allowed to take 10-100× longer.
If the QP is > 50 % of the period, you’ll miss deadlines. Profile your worst-case, not your mean. Use HPIPM partial-condensing N₂ parameter to trade between dense-QP size and Riccati passes.
Cost tuning is empirical. Start with Q on positions only; add velocity damping when oscillating; add control rate (Δu) cost when chattering. Bryson’s rule (Q_ii = 1/x_max²) gives a starting point, never a final one.
For contact-rich tasks, prefer contact-implicit MPC (Posa, Cantu, Tedrake 2014 IJRR; Howell, Le Cleach’, Manchester 2022) over hand-tuned mode switching. The complementarity formulation is harder to solve but handles unanticipated contacts.
SRBD is the right fidelity for legged locomotion MPC. Whole-body MPC at 100 Hz is barely tractable on a flagship CPU (10-30 ms solve), so reserve whole-body for either (a) offline trajectory generation or (b) a slower outer loop (20-50 Hz) feeding an SRBD inner loop.
Outer-loop MPC + inner-loop tracking is the default architecture. MPC produces a reference (forces, accelerations, or joint trajectories); a faster (500-1 kHz) operational-space or whole-body QP tracks it. The inner loop sees current state and reacts to disturbances the MPC missed.
Use augmented Lagrangian, not penalty. Penalty methods need ever-growing weights; AL converges with bounded multipliers and is what Crocoddyl/ALTRO use.

6. Numerical conditioning

Robot MPC NLPs are routinely ill-conditioned, for understood reasons:

Large state-dim with rotational coordinates. Quaternions (4 components, unit-norm) and rotation matrices (9 components, orthogonal) are over-parameterised. The KKT matrix becomes structurally singular along the manifold tangent. Tassa 2012 §III.C documents the symptom: backward- pass Hessian Q_uu becomes indefinite at iterates with large angle errors.
Standard fix: regularisation. Add λI to Q_uu (and recursively the Riccati matrices) until positive-definite. λ adapted Levenberg-Marquardt style: increase 10× on non-PD or non-descending step; decrease 10× on successful step. Crocoddyl, OCS2, and FORCES all do this internally.
Quaternion drift. Integrating ̇q = ½ Ω(ω) q with RK4 drifts off the unit-norm manifold within a few hundred steps. Two remedies: (a) after every integration step, normalise q ← q / ||q||; (b) use the error-state formulation — represent state as nominal quaternion plus tangent-space error δθ ∈ R³, integrate δθ linearly, occasionally fold into the nominal (Forster, Carlone, Dellaert, Scaramuzza 2015 — the IMU-preintegration paper introduced this notation to robotics).
SO(3) Jacobians. The tangent-space derivative of a rotation w.r.t. ω is not constant; use the right-Jacobian Jr(θ) (Sola 2018 micro-Lie tutorial) to map between body and world rates. Crocoddyl and Pinocchio implement these.
Floating-base coordinates. A free-floating rigid-body has 6 unactuated DOFs. The mass matrix M(q) couples joints to base; its inverse is dense. Use CRBA (Composite Rigid Body Algorithm, Walker & Orin 1982) for M and RNEA (Featherstone 2008) for the bias term. Pinocchio’s spatial-algebra implementation gives both with analytic derivatives at near-Eigen speed.
Scaling. Position (m), velocity (m/s), angle (rad), torque (N·m) have hugely different magnitudes. Hessian condition number explodes without pre-scaling. Either (a) rescale states by D_x = diag(1/x_typ) before passing to the solver, or (b) tune Q and R until well-conditioned — practical heuristic: aim for condition number < 10⁴.
Inequality slacks. Hard inequalities with active sets near the iterate produce abrupt Hessian rank changes. Relax with slack variables penalised quadratically; gives a smoother NLP at the cost of small constraint violation (Hannemann-Tamás & Marquardt 2010).

7. Failure modes & debugging

QP solver diverges or returns infeasible. Almost always indefinite Hessian or a too-tight constraint. Crank regularisation; sanity-check that Q_uu is PD by eigenvalue; if not, increase trust-region radius or back off line-search.
Solution looks smooth in sim, unstable on hardware. Model mismatch. Either (a) add online system ID (see [[Engineering/system-identification]] — recursive least-squares on inertia or friction parameters), or (b) wrap the MPC with a robust inner loop (H∞ — see [[Engineering/h-infinity-robust]], or [[Engineering/sliding-mode-control]] for the actuator-level disturbance rejection).
Foot slips during trot/walk. Friction-cone constraint in MPC is tight but the low-level WBC commands forces outside the cone because the MPC reference is on the boundary. Tighten the low-level cone (μ_lowlevel = 0.8 μ_mpc) so disturbances do not push it over.
Late deadline misses (jitter). Profile worst-case, not mean. Options in order of severity: (i) decrease N or increase dt; (ii) switch from full SQP to RTI (one step); (iii) drop hard inequalities to slacked-soft; (iv) switch from interior-point to active-set QP if the active set is small and stable.
Local-minima trap on hard manoeuvres (humanoid stand-up from prone; parkour vault; tight squeeze). DDP is a local method. Remedies: (a) multi-start from several initial guesses; (b) seed with a sampling-based planner (CHOMP — Ratliff 2009, STOMP — Kalakrishnan 2011, RRT-Connect for geometric seeding); (c) use a global trajectory optimiser offline (Mixed-Integer Convex — Deits & Tedrake 2014 for footstep planning).
NaN propagation. Almost always integrator instability after a divergent SQP step. Cap the step length and add NaN guards in the forward pass; Crocoddyl does this by default.
Cost blows up far from goal. Quadratic terminal cost is unbounded. Either (a) saturate the cost (min(cost, cost_max)), or (b) use a log-barrier or Huber-loss terminal.

8. Hardware platforms & deployments

ANYmal C / D (ANYbotics, Hutter lab spin-out, 2019+) — OCS2-based MPC at 100 Hz with whole-body QP tracking at 400 Hz. Stair-climbing demonstrations 2020-2023 ran the full Crocoddyl whole-body pipeline at 50 Hz with Pinocchio analytical derivatives.
MIT Cheetah 3 & Mini Cheetah (Kim, Wensing, Bledt, Katz, Di Carlo, 2018-2019) — convex SRBD MPC at 30-40 Hz on an Intel NUC; enabled backflips, stair-climb without prior maps, and the open-hardware Mini Cheetah platform that spawned dozens of academic clones.
Boston Dynamics Spot (commercial 2020+) — proprietary stack believed to be MPC-based; engineers have publicly described an SRBD outer loop and whole-body QP inner loop. Atlas parkour (2018-2022 videos) uses a multi-fidelity stack with a high-level footstep planner described in Kuindersma et al. 2016 (Autonomous Robots) — the DRC Atlas paper — and newer work cited in BD blog posts.
Unitree Go1 / Go2 / B1 (2022+) — hybrid MPC + RL. The locomotion policy is RL-trained but inherits the SRBD MPC structure as either an initial-condition supervisor or a residual reference.
PAL Robotics TIAGo — whole-body Crocoddyl-based MPC (Mastalli, Merkt, Marti-Saumell, Stouraitis, Goepp, et al. 2022) for mobile manipulation; runs at 50 Hz on the onboard CPU.
Skydio R1 / 2 / X10 drones (2018+) — onboard NMPC for trajectory tracking and obstacle avoidance; published details are sparse but the architecture is widely described in talks as RTI-SQP at ~200 Hz.
Crazyflie / RotorS / PX4 community — acados-based NMPC controllers (Foehn et al. 2021 — time-optimal planning for quadrotor waypoint flight) run at 100-200 Hz on Cortex-M4 / Pi-Zero-class hardware.
Tesla Optimus (2022+) — Tesla’s publicly shown talks reference SQP-MPC for locomotion; specifics undisclosed.
Apptronik Apollo, Figure 01/02, 1X Neo, Agility Digit — humanoid commercial fleet (2024-2025); locomotion stacks are believed to be a mixture of MPC + learned policies, similar to Unitree’s pattern.

9. Case studies

(a) MIT Cheetah 3 convex MPC (Di Carlo & Wensing, IROS 2018)

The crisp insight: the rotational part of rigid-body dynamics is what makes quadruped dynamics nonlinear, and you can dodge it by linearising the inertia tensor around the current orientation at every MPC step.

I_world(t) ≈ R(Θ̂(t₀)) I_body R(Θ̂(t₀))ᵀ            (frozen each MPC step)

With I_world held constant over the horizon, both linear and angular dynamics become linear in the GRFs, and the OCP collapses to a QP. The QP has 12·N decision variables (N=10-50) and friction-cone inequalities; qpOASES solves it in <1 ms on the Cheetah 3 embedded Intel NUC. Re-solve at 30-40 Hz.

Consequences for capability:

Backflip from a standing start. The MPC is replanned every cycle, so there is no offline trajectory; the gait scheduler triggers the airborne phase by switching all four feet to swing, and the MPC handles the ballistic dynamics with the same code path.
Blind stair-climb. No terrain map; the MPC’s foot-position references come from the gait clock + body-velocity command, and contact events are detected on-board.

The paper became the de-facto reference architecture; ANYmal, Spot, Unitree all use variants. The main known limitation: the linearised inertia tensor introduces ~5° tilt-tracking error at aggressive turns; whole-body MPC recovers this at higher computational cost.

(b) ANYmal stair-climbing with whole-body MPC (Mastalli et al. 2023)

Crocoddyl FDDP solving the full floating-base + 12-joint dynamics:

State dim: 6 (base pose) + 12 (joints) + 6 (base twist) + 12 (joint velocities) = 36; with the unit-quaternion this becomes a 37-dim ambient representation, 36-dim tangent.
Control dim: 12 joint torques.
Horizon: 0.5 s at 20 ms ⇒ N = 25.
Pinocchio’s analytic derivatives of CRBA, RNEA, and contact Jacobians feed Crocoddyl’s backward pass — no finite differences.
Solves at 8-20 ms on an i7-class CPU; deployed at 50 Hz with the previous- cycle solution warm-starting.

The lesson: with analytic derivatives + FDDP + warm-start, whole-body MPC is just tractable at 50 Hz on present hardware. Without any of those three levers it isn’t.

(c) Atlas parkour (Boston Dynamics, 2018-2022 videos)

Specifics are proprietary, but the architecture is publicly described in Kuindersma, Permenter, Tedrake (Autonomous Robots 2016, “Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid robot”), the canonical DRC Atlas paper, and updated in BD’s 2021 blog post “Picking Up Momentum”:

High-level: footstep planner over discrete contact sequences (mixed- integer convex; Deits & Tedrake 2014 IROS).
Middle-level: SRBD MPC at 100 Hz producing a reference centroidal trajectory and contact wrenches.
Low-level: whole-body QP at 1 kHz tracking the wrench reference while satisfying joint, torque, and contact-cone constraints (Kuindersma 2016 §V).
State estimation via factor-graph EKF using IMU + joint encoders + vision.

The 2021 parkour videos add learned components — most plausibly to choose contact strategies or to seed the trajectory optimisation — but the runtime controller remains MPC-based.

10. Cross-references

[[Engineering/mpc-control]] — general (process-industry, linear) MPC.
[[Robotics/state-space-lqr]] — LQR is the one-step / infinite-horizon linear special case of MPC; the Riccati machinery in iLQR is identical.
[[Robotics/dynamics-rigid-body]] — the rigid-body model M(q) q̈ + C q̇ + g = τ that every robotics MPC formulation depends on.
[[Robotics/legged-robotics]] — SRBD, whole-body QP, gait scheduling that wrap legged-robot MPC.
[[Robotics/humanoid-balance]] — capture-point / DCM control as a simplified MPC.
[[Robotics/multirotor-design]] — NMPC on drones is mature; acados is the go-to.
[[Robotics/rl-for-control]] — hybrid MPC + RL (Wang, Levine, Abbeel 2018; ETH ANYmal-RL stack) is the active frontier.
[[Robotics/trajectory-generation]] — offline trajectory optimisation (CasADi, TrajOpt) is the close cousin.
[[Robotics/impedance-control]] — common as the low-level layer beneath MPC.
[[Engineering/sliding-mode-control]] — robust inner loop wrap for MPC.
[[Engineering/system-identification]] — online ID closes the model- mismatch loop.
[[Engineering/h-infinity-robust]] — alternative robust inner loop.
[[Engineering/adaptive-control]] — for model-parameter drift.

11. Citations

Founding methods

Mayne (1966). “A second-order gradient method.” Int. J. Control 3(1):85-95. — Original DDP.
Bock & Plitt (1984). “A multiple shooting algorithm for direct solution of OCPs.” IFAC World Congress, Budapest.
Hargraves & Paris (1987). “Direct trajectory optimization using NLP and collocation.” JGCD 10(4):338-342.
Hager (2000). “Runge-Kutta methods in optimal control and the transformed adjoint system.” Numer. Math. 87:247-282.
Li & Todorov (2004). “Iterative LQR for nonlinear biological movement systems.” ICINCO.
Todorov & Li (2005). “Generalized iLQG for locally-optimal feedback of constrained nonlinear stochastic systems.” ACC.
Diehl, Bock, Schlöder (2005). “A real-time iteration scheme for NLP in optimal feedback control.” SIAM J. Control Optim. 43(5):1714-1736. — RTI.

Modern robotics MPC

Tassa, Erez, Todorov (2012). “Synthesis and stabilization of complex behaviors through online trajectory optimization.” IROS. — Canonical iLQR-for-robotics.
Schulman et al. (2013). “Finding locally optimal, collision-free trajectories with sequential convex optimization.” IJRR. — TrajOpt.
Posa, Cantu, Tedrake (2014). “A direct method for trajectory optimization of rigid bodies through contact.” IJRR 33(1):69-81. — Contact-implicit.
Deits & Tedrake (2014). “Footstep planning on uneven terrain with mixed-integer convex optimization.” Humanoids.
Kuindersma et al. (2016). “Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid.” Autonomous Robots 40:429-455.
Di Carlo, Wensing, Katz, Bledt, Kim (2018). “Dynamic locomotion in MIT Cheetah 3 through convex MPC.” IROS.
Bledt, Wensing, Kim (2017). “Policy-regularized MPC for MIT Cheetah.” IROS.
Howell, Jackson, Manchester (2019). “ALTRO: a fast solver for constrained trajectory optimization.” IROS.
Mastalli et al. (2020). “Crocoddyl: an efficient and versatile framework for multi-contact optimal control.” ICRA.
Mastalli et al. (2022). “A direct-indirect hybridization approach to control-limited DDP.” Autonomous Robots.
Howell, Le Cleach’, Manchester (2022). “Trajectory optimization with optimization-based dynamics.” RA-L.
Foehn, Romero, Scaramuzza (2021). “Time-optimal planning for quadrotor waypoint flight.” Science Robotics 6(56).
Sleiman, Farshidian, Minniti, Hutter (2021). “A unified MPC framework for whole-body dynamic locomotion and manipulation.” RA-L.
Farshidian et al. (2017). “An efficient optimal planning and control framework for quadrupedal locomotion.” ICRA. — OCS2.

Solver infrastructure

Wächter & Biegler (2006). “Implementation of an interior-point filter line-search algorithm.” Math. Program. 106(1):25-57. — IPOPT.
Domahidi & Jerez (2014). “FORCES Pro: code generation for embedded optimization.” embotech.
Andersson, Gillis, Horn, Rawlings, Diehl (2019). “CasADi: a framework for NLP and OC.” Math. Program. Comput. 11:1-36.
Frison & Diehl (2020). “HPIPM: a high-performance QP framework for MPC.” IFAC World Congress.
Verschueren et al. (2021). “acados — a modular open-source framework for fast embedded OC.” Math. Program. Comput. 14:147-183.

Rigid-body machinery

Walker & Orin (1982). “Efficient dynamic computer simulation of robotic mechanisms.” ASME J. Dyn. Syst. 104(3):205-211. — CRBA.
Featherstone (2008). Rigid Body Dynamics Algorithms. Springer. — CRBA, RNEA, ABA.
Forster, Carlone, Dellaert, Scaramuzza (2015). “IMU preintegration on manifold for VIO.” RSS. — Error-state formulation.
Sola, Deray, Atchuthan (2018). “A micro Lie theory for state estimation in robotics.” arXiv:1812.01537. — SO(3) / SE(3) notation.
Carpentier et al. (2019). “Pinocchio — fast rigid-body dynamics with analytical derivatives.” SII.

Process-MPC roots (historical bridge)

Camacho & Bordons (1995). Model Predictive Control in the Process Industry. Springer.
Maciejowski (2002). Predictive Control with Constraints. Prentice Hall.
Rawlings, Mayne, Diehl (2017). Model Predictive Control: Theory, Computation, and Design, 2nd ed. Nob Hill.

Compendium

Explorer

MPC for Robots — Robotics Reference