MPC for Robots

Model Predictive Control (MPC) for robots is the practice of repeatedly solving a finite-horizon optimal-control problem (OCP) on-line, applying the first control to the plant, then re-solving from the new measured state. The textbook recipe is identical to process-industry MPC (see [[Engineering/mpc-control]]), but the implementation is different in kind: robot dynamics are nonlinear and underactuated, state-dim is high (40+ for humanoids), the loop closes at 100-1000 Hz, and intermittent contact makes the dynamics hybrid. This note collects the robot-specific design, solver, and deployment knowledge that the process-MPC literature does not cover.

1. At a glance

Why robot MPC is its own subfield, not just “MPC with a different plant”:

  • Nonlinear rigid-body dynamics. The model ẋ = f(x,u) comes from Newton-Euler / Lagrangian mechanics (see [[Robotics/dynamics-rigid-body]]). Linear MPC formulations (Camacho 1995; Maciejowski 2002) used in process control do not generalise without re-linearisation every step, which is expensive when the Jacobian is 40×40.
  • High state-dim. A 7-DOF arm has 14 states; a quadruped (floating base
    • 12 joints) has 36 states; a humanoid (floating base + 28-32 joints) sits at 60-70 states. Naive QP scales O(N³·n³) with horizon N and state-dim n.
  • Fast update rate. Manipulation MPC: 50-200 Hz. Legged locomotion MPC: 100-500 Hz inner / 30-100 Hz outer. Drone NMPC: 100-1000 Hz. A control cycle of 5 ms leaves no time for general-purpose IPM that costs 50 ms.
  • Hybrid dynamics from contact. Each foot independently makes/breaks contact; force is constrained to a friction cone in stance and is zero in swing. This is mixed-integer in the strict reading; practical robotics MPC uses (a) prescribed contact schedule (gait), (b) complementarity constraints (Posa, Cantu, Tedrake 2014), or (c) smoothed contact models (Howell, Le Cleach’, Manchester 2022).

Two solver families dominate:

  • DDP / iLQR family (Mayne 1966; Tassa, Erez, Todorov 2012). Exploits the Bellman recursion to factor the Hessian; cost is O(N·n³) per iteration instead of O((N·n)³). Unconstrained at heart; constraints handled by penalty, augmented Lagrangian, or projected line-search. Toolchains: Crocoddyl (LAAS+Edinburgh 2020), OCS2 (ETH 2017), Drake’s TrajectoryOpt.
  • Direct-collocation + SQP / IPM (Bock 1984; Hager 2000). Treats both x and u as decision variables; dynamics become equality constraints. Hard inequality constraints are first-class. Real-time-iteration (RTI) variant (Diehl, Bock, Schlöder 2005) runs one SQP step per control cycle and warm-starts from the previous solution — this is what makes embedded NMPC tractable. Toolchains: acados (Verschueren, Frey, Kirches 2018-2021), FORCES Pro (Domahidi, Jerez 2014), CasADi + IPOPT (Andersson 2019; Wächter & Biegler 2006) for offline.

The choice is mostly cultural: legged-robotics labs in Europe (LAAS, ETH, Edinburgh, IIT) lean DDP via Crocoddyl/OCS2; aerial-robotics labs and many manipulation groups use acados; offline trajectory studies use CasADi.

2. First principles

A finite-horizon continuous-time OCP looks like:

minimise   ∫₀^T L(x(t), u(t)) dt + φ(x(T))
   x(·),u(·)
subject to ẋ(t) = f(x(t), u(t)),       t ∈ [0,T]
           x(0) = x̂                     (current measured state)
           g(x(t), u(t)) ≤ 0           (inequality, e.g. friction cone, joint lim)
           h(x(t), u(t)) =  0          (equality, e.g. contact-foot-fixed)
           x(T) ∈ X_f                  (terminal set, often {x_goal})

L is the stage cost, φ the terminal cost, f the dynamics. For receding- horizon MPC, this OCP is re-solved at every control step from the latest x̂ and the first control u*(0) is applied. Discretisation converts the continuous-time problem to an N-stage NLP.

Three discretisation flavours are used in robotics:

Single shooting. Decision variables are only u₀, …, u_{N-1}; states are integrated by a numerical scheme (RK4 typical for robots — symplectic integrators preferred for energy-conserving long horizons). Hessian becomes dense and ill-conditioned for long horizons; rarely used in practice for N > 30.

Multiple shooting (Bock & Plitt 1984). Discretise the horizon into N intervals; both x_k and u_k are decision variables, with shooting-gap equality constraints x_{k+1} = F(x_k, u_k) (RK4 step). Parallelisable across intervals, much better-conditioned, and the structure exploitable by Riccati- type linear solvers. acados, FORCES Pro, and Crocoddyl’s MPC mode all use multiple shooting.

Direct collocation. State and control are represented by polynomials within each interval; the dynamics ẋ = f(x,u) is enforced at collocation points (Legendre-Gauss-Radau pseudospectral — Hager 2000; Hermite-Simpson — Hargraves & Paris 1987). State and control become joint decision variables. Used by TrajOpt (Schulman 2013), PSOPT, Drake. Higher accuracy per node but larger NLP; better for offline trajectory optimisation than for inner-loop MPC.

DDP / iLQR. A dynamic-programming method that linearises f and quadratises L around a trial trajectory and computes a feedback policy δu_k = K_k δx_k + d_k via a backward Riccati pass, then a forward roll-out with line-search. iLQR (Li & Todorov 2004; Tassa 2012) drops the second-order dynamics term of true DDP (Mayne 1966) — cheaper, almost the same convergence in practice. Variants:

  • SLQ (Sideris & Bobrow 2005, used in ETH’s OCS2) — continuous-time DDP.
  • FDDP (Mastalli et al. 2020, in Crocoddyl) — “feasibility-driven” DDP that does not require dynamic feasibility at every iterate; converges from arbitrary initial guesses including non-rolled-out ones.
  • iLQG (Todorov & Li 2005) — DDP with control noise.
  • AL-iLQR (Howell, Jackson, Manchester 2019, ALTRO) — augmented- Lagrangian wrapping iLQR for hard inequality constraints.

Real-time iteration (RTI). Diehl, Bock, Schlöder 2005 observed that solving the NLP to convergence every cycle is wasteful when the previous cycle’s solution is a good warm-start. RTI runs one full SQP step (linearise + solve QP + take step) per control cycle, exploiting the fact that the warm- started QP is well-conditioned. This is the bedrock of all real-time NMPC on robots; “MPC at 100 Hz” essentially means “one RTI step per 10 ms”.

3. Worked examples

(a) Cart-pole swing-up via iLQR

The canonical demo (running example in Tassa, Erez, Todorov 2012). State x = [x, θ, ẋ, θ̇] (cart position [m], pole angle [rad], cart velocity [m/s], pole angular velocity [rad/s]); control u is force on cart [N]. Goal: invert the pole from θ=π (hanging) to θ=0 (up) and stabilise.

Cost design:

  • Horizon N = 100, dt = 0.02 s → T = 2 s.
  • Stage cost: L(x,u) = x' Q x + u' R u with Q = diag(1, 10, 0.1, 0.1), R = 0.001.
  • Terminal: φ(x_N) = x_N' Q_f x_N with Q_f = 100·I₄.

One iLQR iteration:

  1. Forward pass. Roll out from x̂ with current control sequence u^{(i)}; record (x_k, u_k) and cost-to-go.
  2. Linearise & quadratise. Compute A_k = ∂f/∂x, B_k = ∂f/∂u via finite- difference (or analytic — see Pinocchio’s algorithmic derivatives for the rigid-body case). Compute first/second derivatives of L.
  3. Backward pass. Riccati recursion for V_k = optimal cost-to-go quadratic approximation; compute feedback K_k and feedforward d_k.
  4. Forward roll-out with line-search. Try α = 1 first; if cost increases, halve to α = 0.5, 0.25, …, with min α ≈ 2⁻¹⁰. Updates: u_k = u_k^{(i)} + α d_k + K_k (x_k - x_k^{(i)}).
  5. Convergence test. Stop when |dV_α=1| < 10⁻⁶ or ||d_k||_∞ < 10⁻⁴.

Empirically converges in ~15 iterations from u ≡ 0 initial guess. Tassa’s implementation runs ~50 ms in MATLAB; modern C++ Crocoddyl does it in <1 ms.

(b) Quadruped trot via SRBD MPC

The Di Carlo & Wensing 2018 IROS convex MPC, deployed on MIT Cheetah 3 and nearly every academic quadruped since.

Model — Single Rigid Body Dynamics (SRBD). Replace the full 18-DOF floating-base + 12-joint system with the centroidal inertia of the trunk and the ground-reaction-forces (GRFs) at the feet. State x ∈ R¹² = [p, Θ, ṗ, ω], where p ∈ R³ is base position [m], Θ ∈ R³ is roll-pitch-yaw [rad], ṗ ∈ R³ is COM linear velocity [m/s], ω ∈ R³ is base angular velocity [rad/s]. Control u ∈ R¹² = [f₁, f₂, f₃, f₄], one 3D force per foot [N].

Dynamics (linearised, key trick — Di Carlo 2018):

m p̈ = Σᵢ fᵢ - m g
I_world ω̇ = Σᵢ (rᵢ - p) × fᵢ
Θ̇       = T(Θ) ω                  (Euler-rate map)

Linearise the rotational inertia by holding I_world fixed at the current orientation each MPC step — this turns the dynamics affine in u and produces a QP rather than an NLP.

Constraints.

  • Friction cone (linearised pyramid): |fᵢ_x| ≤ μ fᵢ_z, |fᵢ_y| ≤ μ fᵢ_z, with μ ≈ 0.4-0.6 depending on surface.
  • Stance/swing schedule from a prescribed gait clock: fᵢ_z = 0 when foot i is in swing; fᵢ_z ≥ f_min ≈ 5 N when in stance.
  • Force magnitude: fᵢ_z ≤ f_max ≈ 250 N (Cheetah 3 actuator limit).

Horizon & rate. N = 50 nodes, dt = 0.01 s → 0.5 s lookahead. Re-solve at 100 Hz. Solve time on Intel i7 NUC: 3-8 ms with qpOASES warm-started; ANYmal deployments report 5-15 ms.

The MPC output is reference foot forces. A lower-loop whole-body QP at 500- 1000 Hz tracks these forces while satisfying the full kinematic/dynamic manipulator constraints (see [[Robotics/legged-robotics]] for the WBC).

(c) 7-DOF manipulator collision-avoidance MPC

Setup: Franka Emika Panda (7 revolute joints). State x ∈ R¹⁴ = [q; q̇], joint positions [rad] and velocities [rad/s]. Control u ∈ R⁷ is joint torque [N·m] or joint acceleration depending on whether the inner loop is torque or velocity (Franka exposes both via libfranka).

Model. Full rigid-body dynamics M(q) q̈ + C(q,q̇) q̇ + g(q) = τ, with M, C, g computed by Pinocchio’s CRBA/RNEA (Featherstone 2008 algorithms).

Cost.

  • Tracking: ||p_ee(q) - p_des(t)||² weighted ~100 in [m²].
  • Posture: ||q - q_rest||² · w_posture with w_posture = 0.01 (regulariser).
  • Effort: ||τ||² · 10⁻⁴.
  • Terminal: tracking weight ×10.

Constraints.

  • Joint limits q_min ≤ q ≤ q_max (Panda: q1∈[-2.9, 2.9], q4∈[-3.07, -0.07], etc., from the official URDF).
  • Velocity limits |q̇| ≤ q̇_max (Panda: 2.17 rad/s on q1).
  • Torque limits |τ| ≤ τ_max (Panda: 87 N·m on q1-4, 12 N·m on q5-7).
  • Signed-distance ≥ d_safe ≈ 0.03 m to a list of spherical obstacle primitives. SDF linearised around current iterate (Schulman’s TrajOpt approach 2013); use HPP-FCL or libccd for the SD computation.

Horizon & rate. N = 20, dt = 0.05 s → 1 s lookahead. 50 Hz outer. acados with RTI + partial condensing solves in 2-5 ms on a standard i7 CPU.

4. Solver families & toolchains

ToolchainAuthor(s) / yearLanguageMethodTypical deployment
CrocoddylMastalli et al., LAAS-CNRS + Edinburgh, 2020 (ICRA)C++ / PythonDDP, FDDP, ALTalos humanoid, ANYmal, MIT-Cheetah replicas, HRP-2, SOLO
acadosVerschueren, Frey, Kirches et al., 2018-2021C (codegen)RTI-SQP + HPIPM/qpOASESTIAGo, Jackal UGV, autonomous racing, drones, Crazyflie
OCS2Farshidian et al., ETH RSL, 2017C++SLQ (cts-time DDP)ANYmal C / D, ANYmal-on-wheels, ballbot
CasADi + IPOPTAndersson, Gillis, Horn, Rawlings, Diehl 2019; Wächter & Biegler 2006Python / MATLAB / C++Direct collocation + IPMOffline trajopt, MHE for SLAM, research prototypes
TrajOptSchulman, Ho, Lee, Awwal, Bradlow, Abbeel 2013 (IJRR)C++ / PythonSequential convex programmingKinematic trajectory optimisation, motion planning
FORCES ProDomahidi & Jerez 2014, embotechC codegen, commercialRTI-SQP + interior-pointEmbedded automotive, drones; ABB / Mercedes deployments
drake/MathematicalProgramTedrake et al., MIT, 2014+C++ / PythonMulti-backend (SNOPT, IPOPT, Gurobi, MOSEK)Research; Atlas / Spot at TRI; benchmark studies
ALTROHowell, Jackson, Manchester 2019Julia / C++AL-iLQRAerospace research, REx Lab
HPIPMFrison, Diehl 2020CRiccati QP / IPMInternal solver inside acados
PSOPTBecerra 2010C++Pseudospectral collocationAerospace, offline studies

A few practical notes:

  • Crocoddyl is the dominant choice when you have analytical rigid-body derivatives via Pinocchio (Carpentier et al. 2019); the two libraries share authors and integrate tightly. FDDP convergence from infeasible warm-starts is genuinely useful in MPC because the previous-step solution may violate the new shooting gaps after a state jump.
  • acados is the most-deployed embedded NMPC framework as of 2024-2025. Codegens C with no runtime dependencies; HPIPM exploits the multiple- shooting block-tridiagonal Hessian structure. The Python interface is excellent for development; the generated C runs anywhere.
  • OCS2 has a small footprint but is older and less actively maintained than Crocoddyl; ANYbotics inherited it via Marco Hutter’s lab.
  • CasADi is the right answer when “off-line” — designing controllers, identifying parameters, computing reference trajectories — and the wrong answer for inner-loop MPC because IPOPT is too slow per iteration.

5. Design heuristics

Real-time MPC succeeds or fails on a handful of pragmatic choices:

  • Keep the horizon short. T = 0.3-1.0 s is normal for legged and aerial robots. Manipulation: 0.5-2 s. Process MPC’s “few minutes” horizons are irrelevant — robot dynamics are stiff at the second-scale.
  • dt ≥ 5 ms. RK4 with sub-millisecond dt explodes the NLP size with little benefit. If you need fine integration, run a finer integrator inside each shooting interval (multiple-shooting “implicit RK”).
  • Use RTI, not full SQP-to-convergence. Diehl 2005’s result is that one SQP step from a warm-start is nearly optimal when the plant is stable; multiple steps are wasteful when the next solve is 10 ms away.
  • Warm-start always. Initialise (x, u) with the shifted previous solution: u_k ← u_{k+1}^prev, x_k ← x_{k+1}^prev, then re-roll the dynamics. The first cold solve is allowed to take 10-100× longer.
  • If the QP is > 50 % of the period, you’ll miss deadlines. Profile your worst-case, not your mean. Use HPIPM partial-condensing N₂ parameter to trade between dense-QP size and Riccati passes.
  • Cost tuning is empirical. Start with Q on positions only; add velocity damping when oscillating; add control rate (Δu) cost when chattering. Bryson’s rule (Q_ii = 1/x_max²) gives a starting point, never a final one.
  • For contact-rich tasks, prefer contact-implicit MPC (Posa, Cantu, Tedrake 2014 IJRR; Howell, Le Cleach’, Manchester 2022) over hand-tuned mode switching. The complementarity formulation is harder to solve but handles unanticipated contacts.
  • SRBD is the right fidelity for legged locomotion MPC. Whole-body MPC at 100 Hz is barely tractable on a flagship CPU (10-30 ms solve), so reserve whole-body for either (a) offline trajectory generation or (b) a slower outer loop (20-50 Hz) feeding an SRBD inner loop.
  • Outer-loop MPC + inner-loop tracking is the default architecture. MPC produces a reference (forces, accelerations, or joint trajectories); a faster (500-1 kHz) operational-space or whole-body QP tracks it. The inner loop sees current state and reacts to disturbances the MPC missed.
  • Use augmented Lagrangian, not penalty. Penalty methods need ever-growing weights; AL converges with bounded multipliers and is what Crocoddyl/ALTRO use.

6. Numerical conditioning

Robot MPC NLPs are routinely ill-conditioned, for understood reasons:

  • Large state-dim with rotational coordinates. Quaternions (4 components, unit-norm) and rotation matrices (9 components, orthogonal) are over-parameterised. The KKT matrix becomes structurally singular along the manifold tangent. Tassa 2012 §III.C documents the symptom: backward- pass Hessian Q_uu becomes indefinite at iterates with large angle errors.
  • Standard fix: regularisation. Add λI to Q_uu (and recursively the Riccati matrices) until positive-definite. λ adapted Levenberg-Marquardt style: increase 10× on non-PD or non-descending step; decrease 10× on successful step. Crocoddyl, OCS2, and FORCES all do this internally.
  • Quaternion drift. Integrating ̇q = ½ Ω(ω) q with RK4 drifts off the unit-norm manifold within a few hundred steps. Two remedies: (a) after every integration step, normalise q ← q / ||q||; (b) use the error-state formulation — represent state as nominal quaternion plus tangent-space error δθ ∈ R³, integrate δθ linearly, occasionally fold into the nominal (Forster, Carlone, Dellaert, Scaramuzza 2015 — the IMU-preintegration paper introduced this notation to robotics).
  • SO(3) Jacobians. The tangent-space derivative of a rotation w.r.t. ω is not constant; use the right-Jacobian Jr(θ) (Sola 2018 micro-Lie tutorial) to map between body and world rates. Crocoddyl and Pinocchio implement these.
  • Floating-base coordinates. A free-floating rigid-body has 6 unactuated DOFs. The mass matrix M(q) couples joints to base; its inverse is dense. Use CRBA (Composite Rigid Body Algorithm, Walker & Orin 1982) for M and RNEA (Featherstone 2008) for the bias term. Pinocchio’s spatial-algebra implementation gives both with analytic derivatives at near-Eigen speed.
  • Scaling. Position (m), velocity (m/s), angle (rad), torque (N·m) have hugely different magnitudes. Hessian condition number explodes without pre-scaling. Either (a) rescale states by D_x = diag(1/x_typ) before passing to the solver, or (b) tune Q and R until well-conditioned — practical heuristic: aim for condition number < 10⁴.
  • Inequality slacks. Hard inequalities with active sets near the iterate produce abrupt Hessian rank changes. Relax with slack variables penalised quadratically; gives a smoother NLP at the cost of small constraint violation (Hannemann-Tamás & Marquardt 2010).

7. Failure modes & debugging

  • QP solver diverges or returns infeasible. Almost always indefinite Hessian or a too-tight constraint. Crank regularisation; sanity-check that Q_uu is PD by eigenvalue; if not, increase trust-region radius or back off line-search.
  • Solution looks smooth in sim, unstable on hardware. Model mismatch. Either (a) add online system ID (see [[Engineering/system-identification]] — recursive least-squares on inertia or friction parameters), or (b) wrap the MPC with a robust inner loop (H∞ — see [[Engineering/h-infinity-robust]], or [[Engineering/sliding-mode-control]] for the actuator-level disturbance rejection).
  • Foot slips during trot/walk. Friction-cone constraint in MPC is tight but the low-level WBC commands forces outside the cone because the MPC reference is on the boundary. Tighten the low-level cone (μ_lowlevel = 0.8 μ_mpc) so disturbances do not push it over.
  • Late deadline misses (jitter). Profile worst-case, not mean. Options in order of severity: (i) decrease N or increase dt; (ii) switch from full SQP to RTI (one step); (iii) drop hard inequalities to slacked-soft; (iv) switch from interior-point to active-set QP if the active set is small and stable.
  • Local-minima trap on hard manoeuvres (humanoid stand-up from prone; parkour vault; tight squeeze). DDP is a local method. Remedies: (a) multi-start from several initial guesses; (b) seed with a sampling-based planner (CHOMP — Ratliff 2009, STOMP — Kalakrishnan 2011, RRT-Connect for geometric seeding); (c) use a global trajectory optimiser offline (Mixed-Integer Convex — Deits & Tedrake 2014 for footstep planning).
  • NaN propagation. Almost always integrator instability after a divergent SQP step. Cap the step length and add NaN guards in the forward pass; Crocoddyl does this by default.
  • Cost blows up far from goal. Quadratic terminal cost is unbounded. Either (a) saturate the cost (min(cost, cost_max)), or (b) use a log-barrier or Huber-loss terminal.

8. Hardware platforms & deployments

  • ANYmal C / D (ANYbotics, Hutter lab spin-out, 2019+) — OCS2-based MPC at 100 Hz with whole-body QP tracking at 400 Hz. Stair-climbing demonstrations 2020-2023 ran the full Crocoddyl whole-body pipeline at 50 Hz with Pinocchio analytical derivatives.
  • MIT Cheetah 3 & Mini Cheetah (Kim, Wensing, Bledt, Katz, Di Carlo, 2018-2019) — convex SRBD MPC at 30-40 Hz on an Intel NUC; enabled backflips, stair-climb without prior maps, and the open-hardware Mini Cheetah platform that spawned dozens of academic clones.
  • Boston Dynamics Spot (commercial 2020+) — proprietary stack believed to be MPC-based; engineers have publicly described an SRBD outer loop and whole-body QP inner loop. Atlas parkour (2018-2022 videos) uses a multi-fidelity stack with a high-level footstep planner described in Kuindersma et al. 2016 (Autonomous Robots) — the DRC Atlas paper — and newer work cited in BD blog posts.
  • Unitree Go1 / Go2 / B1 (2022+) — hybrid MPC + RL. The locomotion policy is RL-trained but inherits the SRBD MPC structure as either an initial-condition supervisor or a residual reference.
  • PAL Robotics TIAGo — whole-body Crocoddyl-based MPC (Mastalli, Merkt, Marti-Saumell, Stouraitis, Goepp, et al. 2022) for mobile manipulation; runs at 50 Hz on the onboard CPU.
  • Skydio R1 / 2 / X10 drones (2018+) — onboard NMPC for trajectory tracking and obstacle avoidance; published details are sparse but the architecture is widely described in talks as RTI-SQP at ~200 Hz.
  • Crazyflie / RotorS / PX4 community — acados-based NMPC controllers (Foehn et al. 2021 — time-optimal planning for quadrotor waypoint flight) run at 100-200 Hz on Cortex-M4 / Pi-Zero-class hardware.
  • Tesla Optimus (2022+) — Tesla’s publicly shown talks reference SQP-MPC for locomotion; specifics undisclosed.
  • Apptronik Apollo, Figure 01/02, 1X Neo, Agility Digit — humanoid commercial fleet (2024-2025); locomotion stacks are believed to be a mixture of MPC + learned policies, similar to Unitree’s pattern.

9. Case studies

(a) MIT Cheetah 3 convex MPC (Di Carlo & Wensing, IROS 2018)

The crisp insight: the rotational part of rigid-body dynamics is what makes quadruped dynamics nonlinear, and you can dodge it by linearising the inertia tensor around the current orientation at every MPC step.

I_world(t) ≈ R(Θ̂(t₀)) I_body R(Θ̂(t₀))ᵀ            (frozen each MPC step)

With I_world held constant over the horizon, both linear and angular dynamics become linear in the GRFs, and the OCP collapses to a QP. The QP has 12·N decision variables (N=10-50) and friction-cone inequalities; qpOASES solves it in <1 ms on the Cheetah 3 embedded Intel NUC. Re-solve at 30-40 Hz.

Consequences for capability:

  • Backflip from a standing start. The MPC is replanned every cycle, so there is no offline trajectory; the gait scheduler triggers the airborne phase by switching all four feet to swing, and the MPC handles the ballistic dynamics with the same code path.
  • Blind stair-climb. No terrain map; the MPC’s foot-position references come from the gait clock + body-velocity command, and contact events are detected on-board.

The paper became the de-facto reference architecture; ANYmal, Spot, Unitree all use variants. The main known limitation: the linearised inertia tensor introduces ~5° tilt-tracking error at aggressive turns; whole-body MPC recovers this at higher computational cost.

(b) ANYmal stair-climbing with whole-body MPC (Mastalli et al. 2023)

Crocoddyl FDDP solving the full floating-base + 12-joint dynamics:

  • State dim: 6 (base pose) + 12 (joints) + 6 (base twist) + 12 (joint velocities) = 36; with the unit-quaternion this becomes a 37-dim ambient representation, 36-dim tangent.
  • Control dim: 12 joint torques.
  • Horizon: 0.5 s at 20 ms ⇒ N = 25.
  • Pinocchio’s analytic derivatives of CRBA, RNEA, and contact Jacobians feed Crocoddyl’s backward pass — no finite differences.
  • Solves at 8-20 ms on an i7-class CPU; deployed at 50 Hz with the previous- cycle solution warm-starting.

The lesson: with analytic derivatives + FDDP + warm-start, whole-body MPC is just tractable at 50 Hz on present hardware. Without any of those three levers it isn’t.

(c) Atlas parkour (Boston Dynamics, 2018-2022 videos)

Specifics are proprietary, but the architecture is publicly described in Kuindersma, Permenter, Tedrake (Autonomous Robots 2016, “Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid robot”), the canonical DRC Atlas paper, and updated in BD’s 2021 blog post “Picking Up Momentum”:

  • High-level: footstep planner over discrete contact sequences (mixed- integer convex; Deits & Tedrake 2014 IROS).
  • Middle-level: SRBD MPC at 100 Hz producing a reference centroidal trajectory and contact wrenches.
  • Low-level: whole-body QP at 1 kHz tracking the wrench reference while satisfying joint, torque, and contact-cone constraints (Kuindersma 2016 §V).
  • State estimation via factor-graph EKF using IMU + joint encoders + vision.

The 2021 parkour videos add learned components — most plausibly to choose contact strategies or to seed the trajectory optimisation — but the runtime controller remains MPC-based.

10. Cross-references

  • [[Engineering/mpc-control]] — general (process-industry, linear) MPC.
  • [[Robotics/state-space-lqr]] — LQR is the one-step / infinite-horizon linear special case of MPC; the Riccati machinery in iLQR is identical.
  • [[Robotics/dynamics-rigid-body]] — the rigid-body model M(q) q̈ + C q̇ + g = τ that every robotics MPC formulation depends on.
  • [[Robotics/legged-robotics]] — SRBD, whole-body QP, gait scheduling that wrap legged-robot MPC.
  • [[Robotics/humanoid-balance]] — capture-point / DCM control as a simplified MPC.
  • [[Robotics/multirotor-design]] — NMPC on drones is mature; acados is the go-to.
  • [[Robotics/rl-for-control]] — hybrid MPC + RL (Wang, Levine, Abbeel 2018; ETH ANYmal-RL stack) is the active frontier.
  • [[Robotics/trajectory-generation]] — offline trajectory optimisation (CasADi, TrajOpt) is the close cousin.
  • [[Robotics/impedance-control]] — common as the low-level layer beneath MPC.
  • [[Engineering/sliding-mode-control]] — robust inner loop wrap for MPC.
  • [[Engineering/system-identification]] — online ID closes the model- mismatch loop.
  • [[Engineering/h-infinity-robust]] — alternative robust inner loop.
  • [[Engineering/adaptive-control]] — for model-parameter drift.

11. Citations

Founding methods

  • Mayne (1966). “A second-order gradient method.” Int. J. Control 3(1):85-95. — Original DDP.
  • Bock & Plitt (1984). “A multiple shooting algorithm for direct solution of OCPs.” IFAC World Congress, Budapest.
  • Hargraves & Paris (1987). “Direct trajectory optimization using NLP and collocation.” JGCD 10(4):338-342.
  • Hager (2000). “Runge-Kutta methods in optimal control and the transformed adjoint system.” Numer. Math. 87:247-282.
  • Li & Todorov (2004). “Iterative LQR for nonlinear biological movement systems.” ICINCO.
  • Todorov & Li (2005). “Generalized iLQG for locally-optimal feedback of constrained nonlinear stochastic systems.” ACC.
  • Diehl, Bock, Schlöder (2005). “A real-time iteration scheme for NLP in optimal feedback control.” SIAM J. Control Optim. 43(5):1714-1736. — RTI.

Modern robotics MPC

  • Tassa, Erez, Todorov (2012). “Synthesis and stabilization of complex behaviors through online trajectory optimization.” IROS. — Canonical iLQR-for-robotics.
  • Schulman et al. (2013). “Finding locally optimal, collision-free trajectories with sequential convex optimization.” IJRR. — TrajOpt.
  • Posa, Cantu, Tedrake (2014). “A direct method for trajectory optimization of rigid bodies through contact.” IJRR 33(1):69-81. — Contact-implicit.
  • Deits & Tedrake (2014). “Footstep planning on uneven terrain with mixed-integer convex optimization.” Humanoids.
  • Kuindersma et al. (2016). “Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid.” Autonomous Robots 40:429-455.
  • Di Carlo, Wensing, Katz, Bledt, Kim (2018). “Dynamic locomotion in MIT Cheetah 3 through convex MPC.” IROS.
  • Bledt, Wensing, Kim (2017). “Policy-regularized MPC for MIT Cheetah.” IROS.
  • Howell, Jackson, Manchester (2019). “ALTRO: a fast solver for constrained trajectory optimization.” IROS.
  • Mastalli et al. (2020). “Crocoddyl: an efficient and versatile framework for multi-contact optimal control.” ICRA.
  • Mastalli et al. (2022). “A direct-indirect hybridization approach to control-limited DDP.” Autonomous Robots.
  • Howell, Le Cleach’, Manchester (2022). “Trajectory optimization with optimization-based dynamics.” RA-L.
  • Foehn, Romero, Scaramuzza (2021). “Time-optimal planning for quadrotor waypoint flight.” Science Robotics 6(56).
  • Sleiman, Farshidian, Minniti, Hutter (2021). “A unified MPC framework for whole-body dynamic locomotion and manipulation.” RA-L.
  • Farshidian et al. (2017). “An efficient optimal planning and control framework for quadrupedal locomotion.” ICRA. — OCS2.

Solver infrastructure

  • Wächter & Biegler (2006). “Implementation of an interior-point filter line-search algorithm.” Math. Program. 106(1):25-57. — IPOPT.
  • Domahidi & Jerez (2014). “FORCES Pro: code generation for embedded optimization.” embotech.
  • Andersson, Gillis, Horn, Rawlings, Diehl (2019). “CasADi: a framework for NLP and OC.” Math. Program. Comput. 11:1-36.
  • Frison & Diehl (2020). “HPIPM: a high-performance QP framework for MPC.” IFAC World Congress.
  • Verschueren et al. (2021). “acados — a modular open-source framework for fast embedded OC.” Math. Program. Comput. 14:147-183.

Rigid-body machinery

  • Walker & Orin (1982). “Efficient dynamic computer simulation of robotic mechanisms.” ASME J. Dyn. Syst. 104(3):205-211. — CRBA.
  • Featherstone (2008). Rigid Body Dynamics Algorithms. Springer. — CRBA, RNEA, ABA.
  • Forster, Carlone, Dellaert, Scaramuzza (2015). “IMU preintegration on manifold for VIO.” RSS. — Error-state formulation.
  • Sola, Deray, Atchuthan (2018). “A micro Lie theory for state estimation in robotics.” arXiv:1812.01537. — SO(3) / SE(3) notation.
  • Carpentier et al. (2019). “Pinocchio — fast rigid-body dynamics with analytical derivatives.” SII.

Process-MPC roots (historical bridge)

  • Camacho & Bordons (1995). Model Predictive Control in the Process Industry. Springer.
  • Maciejowski (2002). Predictive Control with Constraints. Prentice Hall.
  • Rawlings, Mayne, Diehl (2017). Model Predictive Control: Theory, Computation, and Design, 2nd ed. Nob Hill.