Legged Robot Dynamics & Control — Robotics Reference

See also (Tier 3 family index): Legged Morphologies

Paired with [[Robotics/legged-locomotion]]. This note is the theory half — floating-base hybrid dynamics, centroidal MPC, whole-body QP. legged-locomotion.md is the platforms + control-stack survey half — Boston Dynamics / Cassie / ANYmal / Unitree / humanoid wave, ZMP/LIP/capture-point, RL policies + sim-to-real. Both intentional; both link back here.

Scope. Wheels work on smooth roads; legs work everywhere else — rubble, stairs, ladders, mud, snow, the inside of a building designed for humans. The price is dynamics that are simultaneously floating-base, underactuated, hybrid (continuous stance and flight phases punctuated by instantaneous contact-impulse events), and constrained by unilateral contacts and friction cones. This note covers the engineering practice that converged over 2018–2026 around centroidal-dynamics MPC, whole-body QP control, and proprioceptive QDD actuation — the recipe used by the MIT Cheetah lineage, Boston Dynamics Spot, ANYmal, Unitree, Cassie/Digit, and the current humanoid wave (Figure, Optimus, Apollo, 1X NEO, Unitree G1/H1). Manipulator dynamics live in [[Robotics/dynamics-rigid-body]]; here we deal with what happens when the base floats and contacts come and go.

## 1. At a glance

A legged robot is a robot whose locomotion relies on discrete, switching contacts with the ground rather than continuous rolling support. The instant a foot leaves or strikes the ground the structure of the equations of motion changes: degrees of freedom that were constrained become free, and impulsive forces redistribute momentum in zero elapsed time. The control problem is therefore hybrid continuous–discrete, not a single ODE.

Locomotion classes by foot count (in increasing difficulty / decreasing static stability):

  • Hexapod / octopod (six or eight legs, e.g. Boston Dynamics RHex, PhantomX) — at any instant ≥ 3 feet on the ground; a statically stable support polygon is almost always available. Heavy, slow, but trivially balanced.
  • Quadruped (Spot, ANYmal, Unitree Go2/B2, MIT Cheetah, DeepRobotics X20) — supports either static (creeping crawl, 3 feet down) or dynamic (trot/bound/gallop, 2 or 0 feet down) gaits. The current production sweet spot: payload to ~ 15 kg, top speeds 2–5 m/s, runtime 0.5–2 h.
  • Biped / humanoid (Cassie, Digit, Atlas, Optimus Gen 2, Figure 02, Unitree H1/G1, 1X NEO Gamma, Apptronik Apollo) — at most two feet in contact; no statically stable phase exists while walking. Hardest control problem. Cassie completed a 5 km outdoor run (2021); Unitree H1 reached 3.3 m/s (2024).
  • Monoped (Raibert hopper, 1986; ETH ARES) — the canonical research platform: the simplest dynamics that capture the balance + thrust + flight-phase foot-placement trinity that Raibert showed is sufficient for legged locomotion.

Control paradigm shift (2018–2026). The previous generation (Honda ASIMO, HRP-2/3/4) relied on ZMP-tracking + statically stable gaits — slow, flat-ground, fragile. The current generation is a hierarchical layer-cake: centroidal-dynamics convex MPC (Di Carlo 2018) issues body-level wrenches at 50–200 Hz, a whole-body QP projects them into joint torques at 500 Hz–1 kHz, and reinforcement-learning policies (Hwangbo 2019, Lee 2020, Margolis 2024) trained in simulators (Isaac Lab, RaiSim) increasingly replace or augment the model-based stack.

Where it sits in the design stack. Floating-base dynamics + kinematics supply the model; QDD actuators supply the torque; MPC supplies the regulator; impedance supplies the contact compliance; invariant-EKF supplies the state estimate. This note ties them together into a working locomotion controller.

First ask before applying: How statically stable is the gait? If ≥ 3 feet down → ZMP-margin and quasi-static IK suffice. If ≤ 2 feet down → centroidal MPC. Is the base floating? If yes → use Pinocchio/Crocoddyl floating-base APIs, not fixed-base RBD. Is the contact torque-controlled? If yes → impedance-style force commands; if no (purely position-controlled feet) → admittance-style force tracking via spring estimation. RL or model-based? Model-based generalises better; RL is more robust to terrain noise and unmodelled compliance. The state of the art is hybrid — model-based MPC plus a learned residual policy.

## 2. First principles

### 2.1 Hybrid dynamics

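The configuration is split into base (6 unactuated DoF in SE(3)) and joints (n_a actuated):

$$q \;=\; (q_b,\, q_j), \qquad q_b \in SE(3), \quad q_j \in \mathbb{R}^{n_a}, \qquad \dot q \in \mathbb{R}^{6+n_a}$$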

Between contact events the equations of motion are the standard floating-base manipulator equation augmented with the contact wrenches:

$$M(q)\,\ddot q + h(q,\dot q) \;=\; S^\top τ + \sum_{i \in \mathcal{C}} J_{c,i}^\top λ_i$$

  • M(q) ∈ ℝ^{(6+n_a)×(6+n_a)} joint-space inertia matrix (symmetric positive definite; it is the actuation map S^\top, not M, that has no authority over the 6 base rows)
  • h = C(q,\dot q)\dot q + g(q) Coriolis + gravity vector
  • S = [0_{n_a × 6} \; I_{n_a}] selection matrix encoding base under-actuation
  • J_{c,i} 3×(6+n_a) (point contact) or 6×(6+n_a) (line/area contact) contact Jacobian of foot i in the active contact set \mathcal{C}
  • λ_i ground-reaction force/wrench at foot i, constrained to the friction cone \|λ_{xy}\| ≤ μ λ_z and λ_z ≥ 0

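At a contact event (foot-down or foot-up) the active set \mathcal{C} changes; impulsive forces redistribute the generalised momentum. With Λ the contact impulse and J_c the stacked Jacobian of the contacts active after the event:

$$M(q)\,(\dot q^{+} - \dot q^{-}) \;=\; J_c^\top Λ, \qquad J_c\,\dot q^{+} = 0$$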

The Bauchau-Park 2001 / Hurmuzlu 2004 impact map fixes \dot q^{+} so that the new contacts are no-slip. The discontinuity in \dot q is the defining feature of legged dynamics — it is what makes simulation hard (you have to detect zero-crossings; see MuJoCo solver = Newton) and what makes control hard (each touchdown injects an exogenous impulse the regulator must reject).

### 2.2 Under-actuation

S has rank n_a < 6 + n_a. Six directions of generalised force — the base wrench (3 forces + 3 torques about the centre of mass) — cannot be commanded directly. The only way to change body angular momentum is through the contact wrench at the feet:

$$\dot L_{\text{CoM}} \;=\; \sum_{i \in \mathcal{C}} (r_i - r_{\text{CoM}}) \times F_i$$

(the gravity cross-product vanishes about the CoM). With no feet on the ground (flight phase) the angular momentum is conserved: cats land on their feet because they reshape their inertia, not because they apply external torque.

### 2.3 Centroidal dynamics (Orin & Goswami 2008)

Project the 6+n_a-dimensional dynamics onto the centroidal frame (origin at CoM, axes aligned with world):

$$\begin{aligned}
m\,\ddot r_{\text{CoM}} &= \sum_i F_i + m\,g \quad \text{(Newton)} \\
\dot L_{\text{CoM}} &= \sum_i (r_i - r_{\text{CoM}}) \times F_i \quad \text{(Euler)}
\end{aligned}$$

The 6D **centroidal momentum** `h_G = (m\,\dot r_{\text{CoM}}, L_{\text{CoM}})` evolves under contact forces and gravity only. This is the model used by nearly every modern legged MPC: 6 ODEs instead of 30+, convex in the contact forces (linear in `F_i` once the contact location is fixed), and it captures the only quantities the contacts can change.

### 2.4 Zero-Moment Point (ZMP) — Vukobratović 1972

The **ZMP** is the point on the support surface where the net moment of the ground reaction force has zero horizontal component. For point contacts at `r_i` on flat ground:

$$p_{\text{ZMP}} \;=\; \frac{\hat z \times \sum_i \left( r_i \times F_{i,z}\,\hat z \right)}{\sum_i F_{i,z}} \;=\; \frac{\sum_i r_{i,xy}\,F_{i,z}}{\sum_i F_{i,z}}$$

Equivalently, in terms of the motion (ground at z = 0, `g` here the gravity magnitude):

$$p_{\text{ZMP},x} \;=\; x_{\text{CoM}} - \frac{z_{\text{CoM}}\,\ddot x_{\text{CoM}}}{\ddot z_{\text{CoM}} + g} - \frac{\dot L_{\text{CoM},y}}{m\,(\ddot z_{\text{CoM}} + g)}, \qquad p_{\text{ZMP},y} \;=\; y_{\text{CoM}} - \frac{z_{\text{CoM}}\,\ddot y_{\text{CoM}}}{\ddot z_{\text{CoM}} + g} + \frac{\dot L_{\text{CoM},x}}{m\,(\ddot z_{\text{CoM}} + g)}$$

For a flat horizontal ground the ZMP coincides with the **Centre of Pressure (CoP)**. The fundamental balance theorem: the robot **does not tip** iff the ZMP lies strictly inside the **support polygon** (convex hull of foot–ground contacts). Once it reaches the boundary, the foot rolls about its edge and the robot falls.

ZMP-tracking was the dominant control method 1995–2015 (Honda ASIMO, HRP-2/3/4). It is exact for *flat ground* under the *Linear Inverted Pendulum* assumption and fails for [[Robotics/legged-robotics|dynamic gaits with flight phases]] (where the ZMP is undefined).

### 2.5 Linear Inverted Pendulum (LIP) — Kajita 2001

Assume (i) CoM at constant height `h_{\text{CoM}}`, (ii) all mass concentrated at the CoM, (iii) massless telescopic leg, (iv) one foot in contact at point `x_{\text{ZMP}}`. Then the horizontal CoM obeys the **LIP equation**:

$$\ddot x_{\text{CoM}} \;=\; \frac{g}{h_{\text{CoM}}}\,(x_{\text{CoM}} - x_{\text{ZMP}})$$

an unstable second-order linear ODE with natural frequency `ω_{\text{LIP}} = \sqrt{g/h_{\text{CoM}}}`.
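
Because the LIP equation is linear with a fixed ZMP, it has a closed-form solution in hyperbolic functions. The sketch below evaluates it; the function name `lip_state` and the sample numbers are illustrative assumptions, not taken from any particular controller.

```python
import math


def lip_state(x0, v0, x_zmp, h_com, t, g=9.81):
    """Closed-form LIP solution for a fixed ZMP: returns (x, v) at time t."""
    omega = math.sqrt(g / h_com)          # natural frequency of the pendulum
    d0 = x0 - x_zmp                       # initial CoM offset from the ZMP
    c, s = math.cosh(omega * t), math.sinh(omega * t)
    x = x_zmp + d0 * c + (v0 / omega) * s
    v = d0 * omega * s + v0 * c
    return x, v


# Illustrative rollout: the offset grows exponentially, i.e. the mode is unstable.
for t in (0.0, 0.1, 0.2, 0.3):
    x, v = lip_state(0.10, 0.50, 0.0, 0.85, t)
    print(f"t={t:.1f} s  x={x:.3f} m  v={v:.3f} m/s")
```

The rapid growth over a few tenths of a second is why the ZMP (or the next footstep) has to be moved within a single step time.
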
Closed-form orbital integral:

$$E_{\text{LIP}} \;=\; \tfrac{1}{2}\dot x_{\text{CoM}}^2 - \tfrac{1}{2}\omega_{\text{LIP}}^2\,(x_{\text{CoM}} - x_{\text{ZMP}})^2 \quad \text{(conserved within a step)}$$

This is the workhorse model for biped footstep planning and capture-point analysis.

### 2.6 Capture Point (Pratt 2006)

The **instantaneous capture point** is where the swing foot must land *now* to bring the CoM to rest in one step (without changing CoP further):

$$x_{\text{CP}} \;=\; x_{\text{CoM}} + \frac{\dot x_{\text{CoM}}}{\omega_{\text{LIP}}}$$

The locus of all `x_{\text{CP}}` over a planning horizon is the **N-step capture region** — finite area for stable bipeds, shrinking under disturbance. Push-recovery controllers (Atlas DRC, Cassie outdoor running) compute CP every cycle and step into it.

### 2.7 Single-Rigid-Body Dynamics (SRBD) — Di Carlo 2018

For quadrupeds with low leg mass (legs ≲ 15 % of body mass — typical for MIT Cheetah / ANYmal / Unitree class), approximate the robot as **one rigid body** of mass `m`, inertia `I_B`, plus massless legs that just locate the contact forces. State `x = (r_{\text{CoM}}, \Theta_B, v_{\text{CoM}}, \omega_B) \in \mathbb{R}^{12}` (Euler angles `\Theta_B`); input `u = (F_1, F_2, F_3, F_4) \in \mathbb{R}^{12}`. Linearised about small roll and pitch:

$$\begin{bmatrix} \dot r \\ \dot \Theta \\ \dot v \\ \dot \omega \end{bmatrix}
\;=\;
\underbrace{\begin{bmatrix}
0 & 0 & I_3 & 0 \\
0 & 0 & 0 & R_z(\psi) \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{bmatrix}}_{A}
\begin{bmatrix} r \\ \Theta \\ v \\ \omega \end{bmatrix}
+
\underbrace{\begin{bmatrix}
0 & \cdots & 0 \\
0 & \cdots & 0 \\
I_3/m & \cdots & I_3/m \\
I_B^{-1}[r_1 - r_{\text{CoM}}]_\times & \cdots & I_B^{-1}[r_4 - r_{\text{CoM}}]_\times
\end{bmatrix}}_{B(r_i)} u + g$$

This is the model the convex-MPC literature solves at 50–200 Hz on commodity x86. The friction cone enters as 4 linear inequalities per foot (the **pyramidal approximation** of the Coulomb cone, with `μ ≈ 0.6`).
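
The bottom two block rows of the SRBD model (force to linear acceleration, moment to angular acceleration) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the inertia is taken as diagonal in the world frame, the gyroscopic term is dropped as in the linearised model, and the foot positions and helper names are hypothetical.

```python
def cross(a, b):
    """3-D cross product of two sequences."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def srbd_accelerations(mass, inertia_diag, r_com, feet, forces, g=9.81):
    """Return (linear_acc, angular_acc) of the single rigid body.

    inertia_diag: assumed diagonal world-frame inertia (kg m^2).
    feet, forces: lists of 3-tuples; a swing foot simply carries zero force.
    """
    net_force = [sum(f[k] for f in forces) for k in range(3)]
    lin_acc = (net_force[0] / mass, net_force[1] / mass, net_force[2] / mass - g)
    net_torque = [0.0, 0.0, 0.0]
    for r_i, f_i in zip(feet, forces):
        lever = [r_i[k] - r_com[k] for k in range(3)]   # r_i - r_CoM
        tau_i = cross(lever, f_i)
        for k in range(3):
            net_torque[k] += tau_i[k]
    ang_acc = tuple(net_torque[k] / inertia_diag[k] for k in range(3))
    return lin_acc, ang_acc


# Hypothetical 25 kg quadruped standing on four symmetric feet.
M, G = 25.0, 9.81
FEET = [(0.2, 0.15, 0.0), (0.2, -0.15, 0.0), (-0.2, 0.15, 0.0), (-0.2, -0.15, 0.0)]
STAND = [(0.0, 0.0, M * G / 4)] * 4
print(srbd_accelerations(M, (0.05, 0.20, 0.22), (0.0, 0.0, 0.3), FEET, STAND))
```

With the weight shared evenly both accelerations vanish; shifting the load to the front feet alone produces a large pitch acceleration, which is the kind of imbalance the MPC's force allocation has to avoid.
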
### 2.8 Gaits

A **gait** is a periodic schedule of contact-set transitions. Standard quadruped gaits and their duty factors `D` (fraction of cycle a foot is in stance):

- **Walk** — `D ≈ 0.75`, lateral sequence FL→HL→FR→HR. Statically stable; preferred at v < 0.4 m/s.
- **Trot** — `D ≈ 0.5`, diagonal pairs in phase (FL+HR, FR+HL). Two feet down, dynamic. v ∈ 0.4–2 m/s.
- **Pace** — `D ≈ 0.5`, lateral pairs in phase (FL+HL, FR+HR). Energy-efficient at moderate speed (camels, Spot fast walk).
- **Bound** — `D ≈ 0.4`, front pair + rear pair alternating, with flight phase. v ∈ 2–4 m/s (Mini Cheetah top speed 3.7 m/s).
- **Gallop** — `D ≈ 0.3`, all four feet phased; two flight phases per cycle (transverse / rotary). Fastest. MIT Cheetah 2 reached about 6 m/s (with a bounding gait).

Hoyt-Taylor 1981 (horses) and Bertram 2000 (energetic cost of transport): gait switches happen at the speed where metabolic cost equalises across gaits — replicated for legged robots by Kim 2019 (Mini Cheetah optimisation).

### 2.9 Contact-implicit vs contact-explicit MPC

Two camps in 2026:

- **Contact-explicit (the dominant practice)** — a *contact schedule* (which feet are in contact, at which times) is fixed by an upstream gait planner; the MPC then solves a convex QP over forces conditional on that schedule. Used by MIT Cheetah, Spot, Unitree, ANYmal stock controllers. Pros: convex, fast (1–5 ms), deterministic. Cons: schedule is a heuristic; bad schedule → fall.
- **Contact-implicit (research)** — contact location and timing are decision variables of the optimisation. Formulated as a Mathematical Program with Complementarity Constraints (MPCC); Posa-Tedrake 2014, Mordatch 2012, Patel 2019. Pros: discovers new gaits, navigates clutter. Cons: non-convex, expensive (100 ms+), local optima. Used in research and offline trajectory generation, not real-time.
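
A contact-explicit MPC needs the gait turned into a per-knot table of stance flags. A minimal sketch of that step is shown here, assuming each leg touches down at its phase offset and stays in stance for a fraction `D` of the cycle; the leg order (FL, FR, HL, HR) and the function names are illustrative choices.

```python
def in_stance(phase, offset, duty):
    """True if a leg with this phase offset is in stance at cycle phase in [0, 1)."""
    return (phase - offset) % 1.0 < duty


def contact_schedule(offsets, duty, n_knots):
    """One cycle sampled at n_knots: a list of per-leg stance-flag tuples."""
    return [tuple(in_stance(k / n_knots, off, duty) for off in offsets)
            for k in range(n_knots)]


# Leg order FL, FR, HL, HR. Trot: diagonal pairs in phase, D = 0.5.
TROT_OFFSETS = (0.0, 0.5, 0.5, 0.0)
for row in contact_schedule(TROT_OFFSETS, 0.5, 8):
    print("".join("X" if flag else "." for flag in row))
```

Each printed row is one MPC knot; an `X` marks a foot whose force is a free decision variable, a `.` marks a foot whose force is pinned to zero.
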

The current frontier is **predictive sampling MPC** (MuJoCo MPC, Howell 2022) — Monte-Carlo rollout of trajectories with a learned or hand-coded sampler, no gradients needed, runs at ~ 100 Hz on commodity CPU. Demonstrated for whole-body humanoid balance and quadruped acrobatics. Bridges the gap between explicit and implicit contact reasoning.

## 3. Practical math and worked examples

### Example A — LIP-based footstep prediction (biped)

Setup: humanoid mass 35 kg, CoM at `h_{\text{CoM}} = 0.85` m, gravity `g = 9.81` m/s². Compute the natural frequency:

$$\omega_{\text{LIP}} \;=\; \sqrt{9.81 / 0.85} \;=\; 3.397\ \text{rad/s}$$

CoM measured at `x_{\text{CoM}} = 0.10` m relative to current stance foot, moving forward at `\dot x_{\text{CoM}} = 0.50` m/s. Instantaneous capture point:

$$x_{\text{CP}} \;=\; 0.10 + \frac{0.50}{3.397} \;=\; 0.247\ \text{m}$$

Placing the next footstep at `x_{\text{CP}} = 0.247` m in front of the current stance foot brings the CoM to rest in one stride (assuming the new step is taken instantly and the swing leg can reach). Practical step time ~ 0.35 s — adjust the target by `\dot x_{\text{CoM}} \cdot T_{\text{swing}} = 0.175` m to compensate for travel during swing, yielding a *predictive* capture point at 0.247 + 0.175 = 0.422 m. This is the heart of Pratt's capture-point walking and the recovery step Atlas takes after a shove.

### Example B — Quadruped convex MPC sizing

Robot: Unitree A1 class, mass `m = 25` kg, inertia `I_B = \mathrm{diag}(0.05, 0.20, 0.22)` kg·m², 12 joints (3 per leg × 4 legs). SRBD state dim `n = 13` (3 pos + 4 quaternion + 3 lin-vel + 3 ang-vel) or 12 with Euler angles. Input `n_u = 12` (3 contact forces × 4 feet). Choose horizon `T = 0.5` s with sample `Δt = 0.04` s → `N = 12` knots. Decision variables per QP: `(N+1) n + N\,n_u = 13×13 + 12×12 = 313`; with two friction-cone slacks per foot per knot we add 8×12 = 96 → 409 ≈ 400 total.
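
The bookkeeping in this example (the variable count above and the constraint count that follows) is easy to get wrong by one knot, so a small tally helps. The sketch below assumes a sparse formulation in which states stay as decision variables; the function and key names are hypothetical.

```python
def mpc_qp_size(n_state=13, n_input=12, n_knots=12, n_feet=4,
                slacks_per_foot=2, friction_rows_per_foot=5):
    """Tally decision variables and constraint rows for the Example B QP."""
    variables = (n_knots + 1) * n_state + n_knots * n_input
    slacks = slacks_per_foot * n_feet * n_knots
    constraints = {
        "dynamics": n_knots * n_state,                        # equality rows
        "friction": friction_rows_per_foot * n_feet * n_knots,
        "schedule": n_input * n_knots,                        # foot-on/off rows
        "unilateral": n_feet * n_knots,                       # F_z >= 0
    }
    return {
        "variables": variables,
        "variables_with_slacks": variables + slacks,
        "constraints": constraints,
        "constraints_total": sum(constraints.values()),
    }


size = mpc_qp_size()
print(size["variables"], size["variables_with_slacks"], size["constraints_total"])
```

Changing `n_knots` or `n_state` (12 for Euler angles) shows how quickly the problem grows with horizon length.
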
Constraints: dynamics equality (12×13 = 156), friction pyramidal inequality (5 per foot × 4 feet × 12 knots = 240), foot-on/off equality (12 forces × 12 knots = 144), unilateral `F_z ≥ 0` (4 feet × 12 knots = 48). Total 588 ≈ 600 constraints. OSQP on x86 Ryzen 7 5800X solves this in **1.5–3 ms** (Di Carlo 2018 reported 1.4 ms on Intel i7 with custom warm start). Update rate 100–200 Hz easily fits within the 500 Hz–1 kHz inner-loop budget. ROS-package `ocs2_quadrupedal_controller` and `quad-sdk` provide reference implementations.

### Example C — Swing-leg trajectory (Bézier)

The swing foot must traverse `Δx = 0.30` m forward, clear `Δz = 0.10` m above the terrain, and land with zero horizontal velocity relative to the ground in `T_{\text{swing}} = 0.40` s. Use a 6-point, 5th-order Bézier in z:

$$z(s) \;=\; \sum_{k=0}^{5} \binom{5}{k}\,(1-s)^{5-k}\,s^k\,P_k, \qquad s = t / T_{\text{swing}} \in [0, 1]$$

Control points (heights, m): `P_0 = 0` (lift-off), `P_1 = 0.05`, `P_2 = 0.10`, `P_3 = 0.10`, `P_4 = 0.05`, `P_5 = 0` (touchdown). Because `P_2 - 2P_1 + P_0 = 0` and `P_5 - 2P_4 + P_3 = 0`, the end accelerations vanish, `\ddot z(0) = \ddot z(T) = 0`, so lift-off and touchdown are free of acceleration steps. Note that the curve itself peaks at `z(0.5) = 2.5/32 ≈ 0.078` m, below the 0.10 m control-point height; scale all `P_k` by 1.28 if the full 0.10 m clearance is required.

Forward axis is a simple linear or cubic blend; for **velocity matching** at touchdown set the foot's body-frame velocity to `\dot x_{\text{foot}}(T) = -v_{\text{body}}` so the foot is stationary relative to ground at first contact, which reduces slip and horizontal impact (Raibert's "touchdown angle" trick, 1986).

For Unitree A1 trotting at 1 m/s, the swing-foot vertical speed at lift-off and touchdown is `|\dot z| = 5 (P_1 - P_0)/T = 0.625` m/s — well within the 2.5 m/s peak the QDD actuator can drive.

### Example D — Friction-cone feasibility check

A Unitree Go2 (15 kg) standing still distributes its weight `W = 147.15` N evenly across four feet → `F_z = 36.8` N/foot. To resist a sideways push, the horizontal force each foot must produce is bounded by `\sqrt{F_x^2 + F_y^2} \le μ F_z`. With `μ = 0.6` (dry rubber on concrete), max lateral force per foot = `0.6 × 36.8 = 22.1` N → total lateral capacity `4 × 22.1 = 88.3` N.
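
The same friction budget can be recomputed for other masses and surfaces with a short helper. This is a sketch assuming evenly loaded stance feet on flat ground; `lateral_capacity` and its `inscribed_pyramid` flag (which applies the 1/√2 reduction of a pyramid inscribed in the Coulomb cone) are illustrative names.

```python
import math


def lateral_capacity(mass, mu, n_feet=4, g=9.81, inscribed_pyramid=False):
    """Total horizontal force (N) the stance feet can resist before slipping."""
    f_z = mass * g / n_feet              # normal load per foot, even split
    per_foot = mu * f_z                  # Coulomb limit per foot
    if inscribed_pyramid:
        per_foot /= math.sqrt(2.0)       # conservative 4-sided pyramid
    return n_feet * per_foot


for mu in (0.6, 0.3):
    cone = lateral_capacity(15.0, mu)
    pyramid = lateral_capacity(15.0, mu, inscribed_pyramid=True)
    print(f"mu={mu}: cone {cone:.1f} N, inscribed pyramid {pyramid:.1f} N")
```
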
The robot can withstand a steady-state side push of ~ 88 N (~ 9 kg of pressing force) before any foot slips. Reduce `μ` to 0.3 (wet tile) and the budget halves to 44 N — explaining why bath-floor demos of legged robots fail far short of expected limits. An inscribed pyramidal approximation in the MPC tightens this further by `\sqrt 2 / 2 ≈ 0.71`, so a conservative MPC budget is closer to 62 N (dry) / 31 N (wet).

### Example E — Reflected inertia at a Mini Cheetah hip

QDD hip module: motor rotor inertia `I_m = 6.4 × 10^{-5}` kg·m², gear ratio `N = 6`, torque constant `K_t = 0.087` N·m/A, peak current `i_q = 40` A. Reflected inertia to the joint output:

$$I_{\text{reflected}} \;=\; N^2 I_m \;=\; 36 × 6.4\!\times\!10^{-5} \;=\; 2.3 × 10^{-3}\ \text{kg·m}^2$$

Compare to the leg link inertia (~ 5 × 10⁻³ kg·m²) — the motor contributes ~ 30 % of total joint inertia, low enough to preserve backdrivability, so motor current maps transparently to output torque: `τ_{\text{output}} = N · K_t · i_q = 6 × 0.087 × 40 = 20.9` N·m peak. No torque sensor needed — the current measurement *is* the torque measurement to within the friction/cogging error budget (~ 5–8 %).

### Example F — Whole-body QP layout (quadruped, MIT Cheetah-style)

The whole-body controller (WBC) takes the MPC's *desired body wrench* and the *desired swing-leg accelerations* and resolves them into joint torques `τ ∈ \mathbb{R}^{12}` subject to dynamics, contacts, friction, and joint torque limits.
Single-priority QP form (Bledt 2017):

$$\min_{\ddot q, \tau, λ} \tfrac{1}{2}\,\| J_b \ddot q - \ddot x_b^{\text{des}} \|_{W_b}^2 + \tfrac{1}{2}\,\| J_{\text{sw}} \ddot q - \ddot x_{\text{sw}}^{\text{des}} \|_{W_{\text{sw}}}^2 + \tfrac{1}{2}\,\|λ - λ^{\text{MPC}}\|_{W_λ}^2$$

subject to:

- floating-base dynamics: `M(q)\ddot q + h(q,\dot q) = S^\top τ + \sum_i J_{c,i}^\top λ_i`
- no-slip contact: `J_c \ddot q + \dot J_c \dot q = 0`
- friction cone: `\| λ_{i,xy} \|_\infty \le μ λ_{i,z}`
- unilateral: `λ_{i,z} \ge 0`
- torque limit: `|τ_j| \le τ_{\max,j}`

Decision dim: 18 (q̈) + 12 (τ) + 12 (λ) = 42; for a typical quadruped this solves in 100–300 µs via qpOASES with warm-start, well within the 1 ms budget.

Task hierarchy (when needed): Khatib's nullspace projection or Kanoun 2009 cascaded QPs — first solve contact + dynamics, then swing-foot in the remaining nullspace, then body-pose tracking. For most quadruped tasks the single-QP weighted form is good enough; cascaded QPs are reserved for humanoids juggling many simultaneous tasks (manipulation + locomotion + balance).

### 3.1 RL policy training (canonical PPO recipe)

The 2024 "blueprint" RL recipe (Margolis 2024, Walk These Ways 2023) used by most public quadruped/biped RL papers:

- **Algorithm** — PPO (Schulman 2017), 4096–8192 parallel envs on Isaac Gym/Lab, single RTX 4090 or A100.
- **Policy network** — 3-layer MLP, 512 → 256 → 128 hidden units, ELU activation. Inputs: joint pos/vel, base ang-vel, gravity vector in body frame, commanded velocity (3D), gait phase (sin/cos of 4 leg phases). Output: 12 joint position targets fed to PD with stiff `K_p = 25` N·m/rad, `K_d = 0.5` N·m·s/rad.
- **Reward** — weighted sum: linear-velocity tracking (+ 1.0), angular-velocity tracking (+ 0.5), action smoothness (−0.01), joint accel (−2.5e-7), foot air-time (+ 1.0 if 0.2 < t_air < 0.5 s), survival (+ 0.15/step), contact-force bound (−1e-3 × ‖F‖), termination (−200 if base z < 0.3 m).
- **Domain randomisation** — friction U(0.4, 1.2), payload mass U(−1, 3) kg, motor strength U(0.8, 1.2), Kp/Kd ± 20 %, observation latency U(0, 25) ms, push every 8 s with random ± 1 m/s base velocity.
- **Curriculum** — terrain difficulty climbs based on rolling success rate; start flat, then slopes, stairs, gaps, slippery (sand).
- **Training budget** — 12 000–24 000 PPO iterations, ~ 24 000 env-steps/iter → ~ 5×10⁸ environment steps total. Wall-clock: 6–12 h on a single GPU. Sim-to-real transfer rate: ~ 90 % succeed in simple terrain, ~ 60 % on stairs without adaptation policy.

The trained policy runs at 50 Hz on the robot's CPU (8 ms inference budget on Cortex-A78); the underlying joint PD then closes at 1 kHz. The recipe is now the *default* shipping mode on Unitree Go2 (firmware ≥ 1.0.20) and recent ANYmal-X releases.

## 4. Design heuristics

**Actuation choice — the single biggest design decision:**

- **Quasi-Direct-Drive (QDD)** — low gear ratio (N ≈ 4–10), high-pole-count BLDC, backdrivable, proprioceptive (no torque sensor needed). MIT Mini Cheetah, Unitree A1/Go2/B2, ANYmal X, MJBots qdd100, T-Motor AK series. Bandwidth > 200 Hz, impact tolerance excellent. The dominant choice 2018–2026.
- **Series-Elastic Actuator (SEA)** — spring in series with output (Pratt-Williamson 1995). Cassie, Digit, Valkyrie. Filters impacts, but spring resonance ~ 10–30 Hz caps control bandwidth. Output torque is measured by spring deflection — cheap, accurate.
- **Hydraulic** — Boston Dynamics BigDog, AlphaDog, original Atlas DRC. Highest power density (~ 500 W/kg vs ~ 150 W/kg for QDD), but oil leaks, infrastructure (pump, accumulator), and noise killed it commercially. Spot (2019) and Atlas (2024 electric refresh) replaced it.
- **Tendon-driven** — Cassie hip, surgical robots, anthropomorphic hands. Decouples motor mass from joint mass; bandwidth limited by cable stiffness.
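
The QDD trade-off in the list above comes down to two scalings in the gear ratio `N`: output torque grows linearly, reflected rotor inertia grows quadratically. The sketch below uses the Example E numbers; the function names are illustrative.

```python
def qdd_output_torque(gear_ratio, k_t, i_q):
    """Joint torque (N m) estimated from q-axis current, ignoring friction and cogging."""
    return gear_ratio * k_t * i_q


def reflected_inertia(gear_ratio, rotor_inertia):
    """Rotor inertia (kg m^2) as seen at the joint output."""
    return gear_ratio ** 2 * rotor_inertia


# Example E values: K_t = 0.087 N m/A, 40 A peak, rotor inertia 6.4e-5 kg m^2.
for n in (6, 10, 50, 100):
    tau = qdd_output_torque(n, 0.087, 40.0)
    inertia = reflected_inertia(n, 6.4e-5)
    print(f"N={n:3d}  peak torque {tau:6.1f} N m  reflected inertia {inertia:.4f} kg m^2")
```

At high ratios the reflected inertia dwarfs the link inertia, which is why current-based torque sensing and impact tolerance degrade once `N` leaves the QDD range.
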

**Sensor must-haves:**

- **Joint encoders** — 12–18-bit absolute (no homing); typical AS5048A on QDD or BiSS-C on cobot joints.
- **Joint torque** — either (a) explicit torque sensor (KUKA iiwa, ANYmal C harmonic-drive strain gauge) or (b) motor-current estimation `τ ≈ N \cdot K_t \cdot i_q` for QDD (Mini Cheetah / Unitree). The latter has ~ 5–10 % error but no extra hardware.
- **IMU** — Bosch BMI088 (Spot, Go2), VectorNav VN-100 (ANYmal C), Xsens MTi-630 (Digit). 100–800 Hz update; accel ± 4 g, gyro ± 250 °/s. Mounted close to base CoM.
- **Foot contact** — strain-gauge load cell (ANYmal foot), capacitive (Spot), or contact estimation from joint torque + kinematics (Mini Cheetah, Unitree). The latter is the price you pay for proprioceptive design.
- **Exteroception** — Realsense D435i, Velodyne / Ouster LiDAR for stair / terrain (Spot CAM, ANYmal Navigation Module). Drives the local map; control loop does not consume vision directly.

**Compute target.** Real-world breakdown (Spot, ANYmal-C, Unitree B2): one **real-time control PC** (x86 quad-core, RT-PREEMPT or Xenomai, 500 Hz–1 kHz inner loop) plus one **perception PC / GPU SoC** (Jetson AGX Orin or NUC + RTX A2000, 30 Hz vision + planning) connected over EtherCAT or low-jitter Ethernet. Tesla Optimus uses a single FSD-class custom SoC.

**Stability margins.** Rule of thumb: keep the ZMP **≥ 5 cm** inside the support polygon at all times during quasi-static motion. Dynamic gaits replace this with an MPC terminal cost on the capture point or centroidal momentum. Friction-cone half-angle: design for `μ_design = 0.5` on indoor floors, `μ_design = 0.3` for outdoor mud / leaves; outdoor robots universally include a friction-estimation layer (Bloesch 2013) that adapts the cone.

**Gait selection by speed (Hoyt-Taylor 1981 generalised to robots).** For a 25 kg quadruped:

- v < 0.4 m/s → walk (D = 0.75), peak τ < 60 % rated;
- 0.4 ≤ v < 2.0 m/s → trot (D = 0.5);
- v ≥ 2 m/s → bound or gallop (D < 0.5, flight phases).
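
The speed bands above translate directly into a selector. The sketch adds a small hysteresis band so the gait does not chatter when the commanded speed hovers near a threshold; the hysteresis width and all names are assumptions for illustration, not part of the rule of thumb.

```python
# (name, lower speed bound, upper speed bound) in m/s, from the bands above.
GAIT_BANDS = [
    ("walk", 0.0, 0.4),
    ("trot", 0.4, 2.0),
    ("bound_or_gallop", 2.0, float("inf")),
]


def select_gait(speed, current=None, hysteresis=0.1):
    """Pick a gait for a speed magnitude; keep the current gait inside a widened band."""
    if speed < 0.0:
        raise ValueError("speed must be a non-negative magnitude")
    if current is not None:
        for name, low, high in GAIT_BANDS:
            if name == current and low - hysteresis <= speed < high + hysteresis:
                return current
    for name, low, high in GAIT_BANDS:
        if low <= speed < high:
            return name
    raise ValueError("no gait band covers this speed")


print(select_gait(0.3), select_gait(0.45), select_gait(0.45, current="walk"))
```
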

**Friction-cone pyramidal approximation.** For convex MPC the Coulomb cone `\sqrt{F_x^2 + F_y^2} \le μ F_z` is approximated by a 4-sided pyramid `|F_x| \le μ' F_z`, `|F_y| \le μ' F_z`. Choosing `μ' = μ/\sqrt{2} ≈ 0.71 μ` inscribes the pyramid in the cone (conservative); choosing `μ' = μ` circumscribes it and admits diagonal forces up to `\sqrt{2}\, μ F_z`, beyond the true cone. Use an 8-sided pyramid for high-speed running where corner directions matter.

**Battery sizing.** Energy density for current LiPo / lithium-NMC cells is 200–260 Wh/kg; packs land lower once enclosure and BMS are included (Tesla Optimus 2.3 kWh / 13 kg → 177 Wh/kg at pack level). Continuous power for a 25 kg trotting quadruped at 1 m/s is roughly `P = m g v · \text{CoT} = 25 × 9.81 × 1 × 0.45 ≈ 110` W; peak (jumping, sprinting) reaches 800–1500 W. For 90-minute runtime at 110 W average + 200 W vision/compute → 465 Wh → ~ 2.5 kg pack. Compute domination is real: a Jetson AGX Orin at 60 W draws roughly half the locomotion budget at low speed.

**Mass distribution.** Concentrate mass close to the centre and high (above the hips) so the legs swing rapidly with low rotational inertia — the cheetah body plan. Avoid distal mass: a 100 g sensor on the foot raises swing-leg inertia by ~ 30 % and is the single most common cause of slow gait limits in research robots. The Mini Cheetah pushed this further by putting motors at the hip and using a belt transmission to the knee (similar to Honda ASIMO's tendon-driven ankles, but inverted for the leg).

**Leg length and step height.** Maximum step height ≈ `L_{\text{leg}} × (1 - \cos θ_{\text{knee,max}})`. For a Mini Cheetah-class leg (L = 0.4 m, θ_max = 150°) the formula evaluates to 0.4 × 1.87 ≈ 0.75 m, which exceeds the leg length and is only a loose geometric bound; practical clearance with reserve for re-planning is ≈ 0.6 × L ≈ 0.24 m. Humanoids with 0.45 m thigh + 0.4 m shin can step up onto a 0.4 m surface, just barely (the Unitree H1 release video specifically demonstrates 0.3 m kerb step-up).

**State estimator.** Floating-base pose cannot be measured directly. Use **contact-aided invariant-EKF** (Hartley 2020): treat each stance foot as a fixed landmark, fuse encoder kinematics + IMU pre-integration.
Yields 10–30 mm position drift per minute, sub-degree attitude error indoors. The Lie-group formulation handles attitude singularities the standard EKF cannot. **Simulation for training.** RL pipelines are trained in **Isaac Lab / Isaac Gym** (NVIDIA, 4096–8192 parallel envs on a single RTX 4090) or **RaiSim** (ETH, legged-specific, fastest single-CPU). Domain randomisation is mandatory: actuator delay U(0, 20) ms, friction U(0.4, 1.2), mass ± 20 %, latency in observation U(0, 20) ms. Lee 2020 and Margolis 2024 are the canonical recipes. Sim-to-real gap is the dominant failure mode: every successful deployment includes an **adaptation policy** (Kumar 2021 RMA, Margolis 2023 Walk These Ways) or **system-identification step** on the real hardware. ## 5. Components & sourcing ### Commercial legged robots (as of 2026-05) | Robot | Class | Mass | Payload | Top speed | Battery runtime | List price (USD) | | --- | --- | --- | --- | --- | --- | --- | | Boston Dynamics Spot | Quadruped | 32 kg | 14 kg | 1.6 m/s | 90 min | 75 000 | | Boston Dynamics Atlas (electric, 2024) | Humanoid | 89 kg | 11 kg | 2.5 m/s | ~ 60 min | research only | | ANYbotics ANYmal D | Quadruped | 50 kg | 10 kg | 1.0 m/s | 2 h | ~ 150 000 | | ANYbotics ANYmal X (Ex-proof) | Quadruped | 65 kg | 15 kg | 1.0 m/s | 1.5 h | ~ 250 000 | | Unitree Go2 (EDU) | Quadruped | 15 kg | 8 kg | 3.7 m/s | 1–2 h | 16 000 | | Unitree B2 | Quadruped | 60 kg | 40 kg | 6.0 m/s | 4–5 h | ~ 100 000 | | Unitree H1 | Humanoid | 47 kg | 30 kg | 3.3 m/s | 2 h | 90 000 | | Unitree G1 | Humanoid | 35 kg | 5 kg | 2.0 m/s | 2 h | 16 000 | | DeepRobotics Lite3 | Quadruped | 12 kg | 7 kg | 2.5 m/s | 1.5 h | ~ 15 000 | | DeepRobotics Jueying X20 | Quadruped | 50 kg | 20 kg | 3.5 m/s | 4 h | ~ 80 000 | | Agility Robotics Digit v4 | Humanoid (biped) | 64 kg | 16 kg | 1.5 m/s | 2 h | ~ 250 000 | | Tesla Optimus Gen 2 | Humanoid | 57 kg | 20 kg | 2.3 m/s | (undisclosed) | not for sale | | Figure 02 | Humanoid | 70 kg | 25 kg | 1.2 m/s | 5 
h | enterprise pilot | | 1X NEO Gamma | Humanoid | 30 kg | 20 kg | 1.5 m/s | 4 h | enterprise pilot | | Apptronik Apollo | Humanoid | 73 kg | 25 kg | 1.5 m/s | 4 h | enterprise pilot | ### Research / open-design platforms - **MIT Mini Cheetah / Cheetah 3** (Bledt 2018, Katz 2019) — open-design proprioceptive quadruped; the basis of nearly all current low-cost quadrupeds. CAD released; the controllers (`mit-biomimetics/Cheetah-Software`) are MIT-licensed C++. - **Open Dynamic Robot Initiative (ODRI) Solo8 / Solo12** (Grimminger 2020, MPI Tübingen) — fully open-hardware 8- and 12-DoF quadrupeds, ~ €8 k BoM, used by ~ 30 research groups. - **Stanford Doggo** (Kau 2019) — sub-$3 k educational quadruped, brushless QDD, fully open. - **Cassie** (Agility Robotics, 2017) — bipedal research platform; demonstrated 5 km outdoor run; controllers (Castillo, Apgar) widely cited. - **HRP-2/3/4/5** (Kawada / AIST) — Japanese humanoid research lineage; ZMP-control reference. - **WALK-MAN** (IIT, 2017) — DRC-era humanoid; SEA + impedance control. ### Actuator modules | Module | N | Peak τ (N·m) | Continuous τ (N·m) | Peak speed (rad/s) | Mass (kg) | Approx. cost | | --- | --- | --- | --- | --- | --- | --- | | MJBots qdd100 beta3 | 6 : 1 | 17 | 6 | 50 | 0.49 | $ 575 | | T-Motor AK80-9 | 9 : 1 | 18 | 9 | 22 | 0.49 | $ 350 | | T-Motor AK70-10 | 10 : 1 | 25 | 8 | 50 | 0.52 | $ 500 | | T-Motor AK60-6 | 6 : 1 | 9 | 3 | 45 | 0.32 | $ 280 | | Unitree A1 motor | 6.33 : 1 | 33.5 | 12 | 21 | 0.61 | $ 400 (cell) | | Unitree GO-M8010-6 | 6.33 : 1 | 23 | 8 | 30 | 0.53 | $ 350 | | ANYdrive (ANYbotics) | harmonic | 40 | 12 | 20 | 1.3 | proprietary | | MIT Mini Cheetah module | 6 : 1 | 17 | 6.9 | 40 | 0.5 | open design | ### Software stacks | Stack | Language | Robot targets | Open? 
| License | Notes | | --- | --- | --- | --- | --- | --- | | Boston Dynamics SDK | gRPC / Python / C++ | Spot, Atlas | partial | proprietary | High-level only; no joint torque API | | Unitree SDK (legged_control, unitree_ros2) | C++ / Python | A1, Go1/2, B1/2, H1, G1 | yes | BSD | Joint torque access, ROS 2 bridges | | ANYbotics SDK (`anymal_d_sdk`) | C++ / Python | ANYmal C/D/X | partial | proprietary core + open API | Twin simulator free | | `ocs2` (ETH RSL) | C++ | quadruped, biped | yes | BSD-3 | NMPC framework, SQP/iLQR backends | | `OpenSoT` / `XBotControl` (IIT) | C++ | WALK-MAN, Centauro | yes | LGPL | Hierarchical QP WBC | | Crocoddyl (LAAS-CNRS) | C++ / Python | floating-base | yes | BSD-3 | DDP solver; integrated with Pinocchio | | Champ | C++ / ROS 2 | quadruped (generic) | yes | BSD | URDF-driven controller; education-focused | | MIT Cheetah Software | C++ | Mini Cheetah, A1 | yes | MIT | Reference convex MPC implementation | | Agility Robotics Digit SDK | Python / C++ | Cassie, Digit | partial | proprietary | LowLevel API exposes torque | | Isaac Lab | Python | training only | yes | BSD-3 | NVIDIA GPU sim, 4096+ parallel envs | | RaiSim | C++ / Python | training | partial (commercial) | RaiSim Tech | Fastest single-CPU legged sim | | MuJoCo Menagerie | XML (MJCF) | sim models | yes | Apache 2.0 | Validated models for Spot, Go2, H1, ANYmal, Cassie | ### Simulators (quick triage) - **Isaac Lab** — GPU, 1000s of parallel envs, the de-facto RL training stack 2024-26. - **MuJoCo 3.x** — high-fidelity contact, free, the de-facto evaluation sim for RL papers. - **RaiSim** — fastest CPU legged sim; the ETH-internal default. - **Bullet / PyBullet** — older, slower, but still common in education and OpenAI Gym legacy code. - **Gazebo Sim (Garden, Harmonic)** — full ROS 2 integration; good for whole-system tests, weaker on contact. - **Drake** — hydroelastic contact, formal-method-friendly; the choice for TRI manipulation and some humanoid work. ## 6. 
Reference data ### Quadruped gait phase parameters | Gait | Duty factor D | Phase offsets (rel. FL) FR, HL, HR | Typical v (m/s) | CoT (range) | | --- | --- | --- | --- | --- | | Crawl / lateral walk | 0.75 | 0.50, 0.25, 0.75 | 0.0–0.5 | 0.6–1.5 | | Walk (trot-walk) | 0.65 | 0.50, 0.50, 0.00 | 0.3–0.8 | 0.5–1.0 | | Trot | 0.50 | 0.50, 0.50, 0.00 | 0.5–2.5 | 0.4–0.7 | | Flying trot | 0.40 | 0.50, 0.50, 0.00 | 1.5–3.5 | 0.5–0.8 | | Pace | 0.50 | 0.00, 0.50, 0.50 | 0.8–2.5 | 0.4–0.6 | | Bound | 0.40 | 0.00, 0.50, 0.50 | 2.0–4.0 | 0.6–1.0 | | Gallop (transverse) | 0.30 | 0.10, 0.55, 0.45 | 3.0–6.0 | 0.7–1.2 | CoT = cost of transport = `P / (m g v)` (dimensionless). MIT Cheetah 3 achieved 0.45 trotting; humans ~ 0.2; horses ~ 0.1. ### MPC horizons and rates (deployed) | System | Model | Horizon T | Δt | Knots N | Solver | Update rate | | --- | --- | --- | --- | --- | --- | --- | | MIT Cheetah 3 / Mini Cheetah | SRBD linear | 0.5 s | 0.04 s | 12 | qpOASES | 30 Hz (outer), WBC 500 Hz | | Boston Dynamics Spot (public est.) 
| centroidal | ~ 0.4 s | 0.025 s | 16 | proprietary | 200 Hz |
| ANYmal C (ocs2 NMPC) | full body + contact | 1.0 s | 0.015 s | 60 | SQP/iLQR | 60–100 Hz |
| Cassie (Apgar 2018) | LIP + ALIP | 0.8 s | 0.04 s | 20 | Gurobi | 50 Hz |
| Digit (Agility internal) | centroidal | 1.0 s | 0.04 s | 25 | proprietary | 200 Hz |
| Unitree Go2 stock | SRBD | 0.4 s | 0.04 s | 10 | OSQP | 100 Hz |
| Atlas (Kuindersma 2016) | whole-body | 0.5 s | 0.05 s | 10 | Mosek | 333 Hz |

### Software stack license / openness matrix

| Stack | Source | License | Torque API | Sim included | RL pipeline |
| --- | --- | --- | --- | --- | --- |
| Boston Dynamics SDK | closed | proprietary | no | sim-only Choreographer | none |
| Unitree SDK | open | BSD | yes | URDF + MuJoCo | community |
| ANYbotics SDK | partial | proprietary core | partial | TwinSDK | none |
| MIT Cheetah Software | open | MIT | yes | LCM bridge | none |
| `ocs2` | open | BSD-3 | yes | Gazebo / RaiSim | hooks |
| Crocoddyl + Pinocchio | open | BSD-3 | yes | Pinocchio | hooks |
| Isaac Lab | open | BSD-3 | yes (sim) | itself | first-class |
| RaiSim | partial | commercial | yes (sim) | itself | first-class |
| Champ | open | BSD | yes | Gazebo | none |

### QDD-vs-other actuator trade-offs

| Property | QDD | SEA | Harmonic-drive | Hydraulic |
| --- | --- | --- | --- | --- |
| Bandwidth | 200–500 Hz | 30–80 Hz | 50–150 Hz | 100–300 Hz |
| Backdrivability | excellent | good | poor | poor |
| Torque transparency | excellent | via spring | poor | with sensor |
| Impact tolerance | excellent | excellent | poor | excellent |
| Specific power | ~ 150 W/kg | ~ 100 W/kg | ~ 200 W/kg | ~ 500 W/kg |
| Cost (per joint) | $300–600 | $1–3 k | $2–5 k | $5–15 k |
| Infrastructure | battery | battery | battery | pump + lines |
| Used by | Spot, ANYmal-X, Cheetah, Unitree | Cassie, Digit | KUKA iiwa, ABB | legacy BD, JCB |

### Typical real-time loop budget (production legged robot)

| Layer | Rate | Compute | Solver / model | Latency |
| --- | --- | --- | --- | --- |
| Vision / perception | 30 Hz | GPU (Jetson AGX, RTX A2000) | depth + grid map | 33 ms |
| Footstep / gait planner | 5–20 Hz | x86 core | A* / RRT* on grid | 50–200 ms |
| Centroidal MPC | 50–200 Hz | x86 core (RT) | OSQP / qpOASES / HPIPM | 1–5 ms |
| Whole-body QP | 500–1000 Hz | x86 core (RT) | qpOASES / eiQuadProg | 0.3–1 ms |
| State estimator | 500–1000 Hz | x86 core (RT) | InvEKF | 0.2–0.5 ms |
| Joint torque loop | 10–40 kHz | per-joint MCU (STM32G4, F405) | current PI | 25–100 µs |
| IMU | 800–4000 Hz | dedicated bus (SPI/CAN) | — | 0.25–1 ms |

The bottleneck is rarely raw compute; it is I/O and scheduling jitter on the RT loop. Linux with the PREEMPT_RT patch, or Xenomai, routinely delivers < 50 µs jitter on the 1 kHz control loop; the ≥ 1 ms jitter of a standard Linux kernel is enough to break high-bandwidth impedance control.

### Floating-base configuration counts (worked legged robots)

| Robot | n_q | n_v | Joints / leg | Notes |
| --- | --- | --- | --- | --- |
| Mini Cheetah | 19 | 18 | 3 × 4 = 12 | Abad-hip-knee |
| Unitree A1 / Go2 | 19 | 18 | 3 × 4 = 12 | Abad-hip-knee |
| Spot | 19 | 18 | 3 × 4 = 12 | + 1 optional arm 7-DoF |
| ANYmal C | 19 | 18 | 3 × 4 = 12 | Has parallel kinematic knee |
| Cassie | 27 | 26 | 5 × 2 = 10 (springs) | 4 springs as virtual DoF |
| Digit | 30 | 28 | 6 × 2 + 4 × 2 arms | 12 leg + 8 arm + 4 base |
| Atlas (electric, 2024) | 35 | 34 | 6 × 2 + 7 × 2 + neck | 28 actuated + 6 floating |
| Unitree H1 | 26 | 25 | 5 × 2 + 4 × 2 + 1 | 19 actuated |
| Unitree G1 | 30 | 29 | 6 × 2 + 5 × 2 + 1 | 23 actuated |

The floating-base 6 DoF is invariant across all platforms (3 translation + 3 rotation in SE(3)). Because the orientation is stored as a 4-component quaternion, the configuration vector is one longer than the velocity vector: `n_q = n_v + 1` whenever every other joint is single-DoF.
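The counting rule behind the table can be sketched in a few lines. This is a minimal illustration, assuming a quaternion-parameterised floating base and single-DoF joints only; the function name `floating_base_counts` is hypothetical and not taken from any SDK.

```python
from typing import Tuple

BASE_NQ = 7  # floating base configuration: 3 position + 4 quaternion components
BASE_NV = 6  # floating base velocity: 3 linear + 3 angular components


def floating_base_counts(n_joints: int) -> Tuple[int, int]:
    """Return (n_q, n_v) for a floating-base robot with `n_joints` 1-DoF joints.

    Assumption: every non-base joint contributes exactly one configuration
    coordinate and one velocity coordinate (revolute or prismatic).
    """
    if n_joints < 0:
        raise ValueError("n_joints must be non-negative")
    return BASE_NQ + n_joints, BASE_NV + n_joints


if __name__ == "__main__":
    # Joint counts taken from the table above.
    for name, joints in [("Mini Cheetah", 12), ("Unitree H1", 19), ("Unitree G1", 23)]:
        n_q, n_v = floating_base_counts(joints)
        print(f"{name}: n_q={n_q}, n_v={n_v}")
```

Models with passive springs or closed kinematic loops add coordinates beyond the actuated joints, so the joint count passed in should be the number of generalized joints in the model, not the number of motors.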
### CoT comparison (cost of transport)

| Walker | Mass | Speed | CoT |
| --- | --- | --- | --- |
| Human | 70 kg | 1.4 m/s | 0.2 |
| Horse (trot) | 500 kg | 5 m/s | 0.1 |
| Cassie (Cassie Energetics 2020) | 32 kg | 1.0 m/s | 0.46 |
| Digit | 64 kg | 1.0 m/s | 0.4 |
| Mini Cheetah (trot) | 9 kg | 1.5 m/s | 0.45 |
| ANYmal C (trot) | 50 kg | 1.0 m/s | 0.42 |
| Atlas (electric, 2024 est.) | 89 kg | 1.5 m/s | ~ 0.5 |
| Boston Dynamics Spot | 32 kg | 1.0 m/s | ~ 0.5 |

Robot CoT is roughly 2–3 × biological; closing that gap is the open mechanical-design problem.

## 7. Failure modes & debugging

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Fall at gait transition | Phase reset glitch; force commanded with feet still mid-swing | Add hold-down phase; align contact-schedule to FSM |
| Foot slip on tile / ice | Friction estimate too high in MPC | Lower `μ` assumption; run online friction estimator (Bloesch 2013) |
| MPC infeasible at limit pose | Friction cone + dynamics infeasible together | Soft-constrain friction (slack with high cost); fall back to recovery gait |
| Sagging gait at low battery | Voltage sag → reduced `K_t · V` → reduced peak torque | Current-limit gait planner; monitor pack voltage; bigger pack |
| Joint suddenly hot | Magnet thermal demagnetisation (standard NdFeB grades have a working limit of ~ 80 °C, high-temperature grades ~ 150 °C; the Curie point itself is above 300 °C) | Thermistor per motor; throttle when > 60 °C |
| Encoder slip after impact | Mechanical encoder coupling slipped | Re-zero against known pose (e.g. tucked stand); switch to bonded magnetic encoder |
| Policy works in sim, falls on hardware | Sim-to-real gap (actuator delay, friction, COM offset) | Domain randomise during training; add adaptation policy (Kumar 2021 RMA, Margolis 2023) |
| Foot fails to detect contact | Toe steps on power cable or carpet edge | Add per-foot F/T or strain gauge; tune contact threshold |
| Whole-body QP misses deadline | Too many tasks / inequalities | Reduce hierarchy; drop to convex SRBD MPC for the outer loop |
| Cascading impact at landing | Foot velocity not matched to body velocity | Raibert touch-down angle; impedance with low `K_d` on swing-to-stance transition |
| Push knocks robot over despite CP step | Wind / step out of reach | N-step capture; allow second / third recovery step; lower CoM during disturbance |
| Pinch / cable wrap during gait | Mechanical design issue at limit posture | Gait planner aware of joint-limit boundary; mechanical stops |
| Stair climbing fails | Vision mis-detection or contact mis-prediction | Conservative climb mode; tactile fallback (Spot stair-climber) |
| Battery ≪ rated runtime | Cold ambient (LiPo capacity drops 30 % at 0 °C) | Pre-heat pack; over-spec |
| Yaw drift over time | IMU bias not corrected without magnetometer or vision | Add magnetometer / VIO loop closure; reset on known landmarks |
| Trot oscillates ("bouncing") | MPC dropping into resonance with leg stiffness | Increase MPC damping cost; widen swing-foot trajectory |
| Foot sticks momentarily on lift | Plantar tactile residual force | Add lift-off impulse; vertical jerk limit during late stance |
| Yaw oscillation while standing | Coupling between yaw torque and contact normal force; small `μ` margin | Add yaw-rate damping in MPC; widen stance |
| ZMP jumps at heel-strike | Discrete contact normal-force impulse | Filter ZMP with 5–10 ms low-pass; switch to centroidal mode |
| Robot refuses to fall on push (then crashes seconds later) | Capture point outside reach but controller did not abandon planned gait | Trigger emergency stepping mode at `‖x_{CP}‖ > L_{leg}` |
| Knee inverts during loaded stance | Trajectory enters joint limit; redundancy nullspace ignored | Use task-priority QP with joint-limit avoidance as highest-priority task |
| Trot freezes after touchdown | Contact detected too early; MPC commits forces before foot lands | Use force threshold AND velocity threshold AND kinematic prediction |
| Robot wobbles after stair step-up | Plant inertia changed (CoM ascended); MPC linearisation stale | Re-linearise about new operating point each MPC iteration |

## 8. Case studies

### 8.1 MIT Mini Cheetah → the proprioceptive QDD recipe (Katz 2019, Bledt 2018)

The Mini Cheetah (9 kg, 12 × QDD modules at N ≈ 6, peak τ 17 N·m per joint) crystallised the modern legged stack. The control architecture (open-source on GitHub as `mit-biomimetics/Cheetah-Software`) is the template that every subsequent quadruped (Spot's electric variants, ANYmal C/D/X, all Unitree models) borrows from:

1. **State estimator** at 1 kHz — contact-aided EKF on IMU + leg kinematics, body pose drift < 30 mm/min indoors.
2. **Convex MPC** (Di Carlo 2018) at 30 Hz — SRBD model, 0.5 s horizon, qpOASES solver, outputs 12 contact-force components (4 feet × 3 axes) per knot.
3. **Whole-body controller** at 500 Hz — hierarchical task-priority QP: contact-force tracking → body pose → swing-foot pose, projects to 12 joint torques.
4. **Joint torque loop** at 40 kHz on a TI DRV8323-class driver — current PI, dead-time compensation.

This four-tier stack is the **default** for every modern quadruped. It is the reason MIT Mini Cheetah did back-flips on benches in 2019 and Unitree Go2 ships a similar capability in a $16 000 retail product in 2024.
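The 12 contact-force components that the convex MPC tier emits per knot are only useful if each foot's force stays inside its friction cone. The sketch below shows one common way to express that as linear inequalities (a four-sided friction pyramid plus unilateral and peak-force bounds). It is an illustration under stated assumptions, not the Cheetah-Software implementation: flat ground with z as the contact normal, forces ordered `[f_x, f_y, f_z]` per foot, and the names `friction_pyramid`, `satisfies` and the `f_z_max` default are hypothetical.

```python
from typing import List, Sequence, Tuple


def friction_pyramid(
    mu: float, n_feet: int = 4, f_z_max: float = 200.0
) -> Tuple[List[List[float]], List[float]]:
    """Build (A, b) with A @ f <= b for a stacked force vector f of length 3 * n_feet.

    Per foot: |f_x| <= mu * f_z, |f_y| <= mu * f_z, 0 <= f_z <= f_z_max.
    """
    per_foot = [
        ([1.0, 0.0, -mu], 0.0),      #  f_x - mu * f_z <= 0
        ([-1.0, 0.0, -mu], 0.0),     # -f_x - mu * f_z <= 0
        ([0.0, 1.0, -mu], 0.0),      #  f_y - mu * f_z <= 0
        ([0.0, -1.0, -mu], 0.0),     # -f_y - mu * f_z <= 0
        ([0.0, 0.0, -1.0], 0.0),     # unilateral contact: f_z >= 0
        ([0.0, 0.0, 1.0], f_z_max),  # actuator limit: f_z <= f_z_max
    ]
    A: List[List[float]] = []
    b: List[float] = []
    for foot in range(n_feet):
        for coeffs, bound in per_foot:
            row = [0.0] * (3 * n_feet)
            row[3 * foot : 3 * foot + 3] = coeffs  # place block on the diagonal
            A.append(row)
            b.append(bound)
    return A, b


def satisfies(f: Sequence[float], A, b, tol: float = 1e-9) -> bool:
    """Check every inequality row of A @ f <= b."""
    return all(
        sum(a * x for a, x in zip(row, f)) <= bound + tol
        for row, bound in zip(A, b)
    )


if __name__ == "__main__":
    A, b = friction_pyramid(mu=0.6)
    gentle = [10.0, 0.0, 50.0] * 4  # 10 N tangential, 50 N normal per foot
    harsh = [40.0, 0.0, 50.0] * 4   # 40 N tangential exceeds 0.6 * 50 N
    print(satisfies(gentle, A, b), satisfies(harsh, A, b))
```

In a QP solver these rows would be passed as the inequality block for every knot in the horizon; lowering `mu` shrinks the feasible set, which is why an over-optimistic friction estimate shows up as foot slip rather than as solver infeasibility.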
### 8.2 Boston Dynamics Spot → from hydraulic to electric production (2005 → 2024)

Boston Dynamics's quadruped lineage — **BigDog** (DARPA, 2005, gasoline-hydraulic, 109 kg, carried 150 kg over rough terrain), **AlphaDog/LS3** (2012, larger), **Spot Classic** (2015, hydraulic prototype) — culminated in **Spot** (2019, electric SEA, 32 kg, commercial). The 2024 Atlas reveal completed the hydraulic-to-electric migration even for the humanoid line. Public technical communications (Boston Dynamics blog, 2020–2024) reveal:

- 200 Hz convex centroidal MPC, 1 kHz WBC, 333 Hz torque loop (numbers via Kuindersma 2016 + later updates).
- Custom SEA modules; spring deflection serves as torque sensor and provides natural impact filtering.
- Stair-climber and dynamic-replan modes use the same MPC with switched cost weights.
- ≥ 1500 units sold (Boston Dynamics 2023 disclosure) — the first legged robot to clear the production threshold.

Spot is the proof-of-concept that legged robots have moved from research demo to deployable industrial product.

### 8.3 Unitree H1 + G1 → the sub-$20 k humanoid (2023–2024)

Until 2023 humanoids were either research artefacts (HRP-4 ~ $400 k, Atlas not for sale) or aspirational (Tesla Optimus, Figure 02 not yet shipping). Unitree's H1 (47 kg, 19 DoF, 90 k USD) and G1 (35 kg, 23 DoF, 16 k USD) collapsed that price ceiling by an order of magnitude. Key choices:

- **All-QDD actuation** — same module family as their quadrupeds (GO-M8010 derivatives), proprioceptive, no SEAs.
- **No torque sensors** — current-based τ estimation, accepted ~ 5 % error.
- **MuJoCo-compatible URDF released** — third-party RL training viable from day one (Unitree provided Isaac Lab examples by mid-2024).
- **Reported 3.3 m/s walking** (H1, 2024) — faster than Cassie's 2.1 m/s record (2022), comparable to slow human running.

Unitree's pricing forced an entire industry response.
By 2026-05 every humanoid maker (Tesla, Figure, 1X, Apptronik) is targeting comparable price-points within 24 months, signalling that humanoids have entered the same "consumer-ish, mass-produced" regime quadrupeds entered in 2019.

### 8.4 Cassie's 5 km outdoor run → biped RL at human-comparable speed (Castro 2022)

Agility Robotics' Cassie (32 kg, 5-DoF/leg + 2 passive springs, SEA-driven) became the first bipedal robot to autonomously complete a continuous outdoor 5 km run in May 2022 at the Oregon State University Coliseum (53 minutes, which works out to about 10.6 min/km, or roughly 1.6 m/s averaged over the course). The controller (Castro et al., IROS 2022) is a single deep-RL policy trained in MuJoCo:

- **Policy** — 3-layer MLP, 256 → 256 hidden, ELU activations.
- **Action** — joint position targets at 40 Hz, passed to a 2 kHz joint PD with `K_p = 50` N·m/rad, `K_d = 1.5` N·m·s/rad.
- **State input** — 47-dim: base orient (quat), base ang-vel, joint pos × 14, joint vel × 14, command velocity, gait clock signals.
- **Reward** — velocity tracking + foot air-time bonus + smoothness penalty; trained with PPO, ~ 5×10⁸ sim steps.
- **Sim-to-real** — actuator delay randomisation, payload randomisation, push-recovery curriculum.
- **Compute on robot** — 50 Hz inference on the i7 control PC, 8 ms per step.

The run validated three claims: (1) pure RL can replace the LIP/ZMP stack on a real biped, (2) SEA hardware is *not* a prerequisite for compliance, because the policy synthesises virtual compliance, (3) sample-efficient RL with good randomisation closes the sim-to-real gap on hardware that previously required years of hand-tuned control. Every subsequent humanoid RL paper (Optimus walking demos, Figure 02, Unitree H1) draws on this template, with newer variants adding teacher-student distillation (Margolis 2023, Radosavovic 2024) and language-conditioned policies (Lin 2024).

## 9. Cross-references

- `[[Robotics/dynamics-rigid-body]]` — floating-base manipulator equation, RNEA/CRBA/ABA underlying every legged controller
- `[[Robotics/kinematics-dh]]` — forward / inverse leg kinematics, Jacobians
- `[[Robotics/motors-electric]]` — QDD actuator sizing, BLDC torque-speed curves, back-EMF
- `[[Robotics/sensors-pose-motion]]` — IMU specifications (BMI088, VN-100), bias modelling
- `[[Robotics/impedance-control]]` — joint and Cartesian impedance for swing-leg compliance and contact transitions
- `[[Robotics/state-space-lqr]]` — LQR design for LIP balance, attitude inner loop
- `[[Robotics/path-planning]]` — global footstep planning, terrain-aware A*
- `[[Robotics/bayesian-estimation]]` — invariant-EKF, contact-aided floating-base estimation
- `[[Robotics/mobile-base-wheeled]]` — companion mobility regime (wheels vs legs decision matrix)
- `[[Robotics/pid-control]]` — joint-level current and velocity loops
- `[[Engineering/mpc-control]]` — receding-horizon optimal control theory, OSQP / qpOASES interfaces
- `[[Engineering/vibration-dynamics]]` — leg structural resonance, SEA spring mode
- `[[Engineering/gears-power-transmission]]` — gear-ratio choice, harmonic vs planetary vs cycloidal
- `[[Engineering/classical-control]]` — frequency-domain margins for inner loops
- `[[Languages/Tier3/robotics-control]]` — URDF / MJCF / SDF, ROS 2 controller interfaces

## 10. Citations

**Foundations**

- Raibert, M. H. (1986). *Legged Robots That Balance*. MIT Press. (Canonical text; 1-leg, 2-leg, 4-leg hoppers.)
- Vukobratović, M. & Borovac, B. (2004). "Zero-Moment Point — Thirty-five years of its life." *International Journal of Humanoid Robotics* 1(1), 157–173.
- Kajita, S., Kanehiro, F., Kaneko, K., Yokoi, K. & Hirukawa, H. (2001). "The 3D Linear Inverted Pendulum Mode: a simple modeling for a biped walking pattern generation." *IROS*.
- Pratt, J., Carff, J., Drakunov, S. & Goswami, A. (2006). "Capture Point: A Step toward Humanoid Push Recovery." *IEEE-RAS Humanoids*.
- Orin, D. E. & Goswami, A. (2008). "Centroidal Momentum Matrix of a humanoid robot." *IROS*.
- Sentis, L. & Khatib, O. (2005). "Synthesis of Whole-Body Behaviors through Hierarchical Control of Behavioral Primitives." *International Journal of Humanoid Robotics* 2(4).
- Khatib, O. (1987). "A Unified Approach for Motion and Force Control of Robot Manipulators: The Operational Space Formulation." *IEEE Journal of Robotics and Automation* 3(1).
- Kanoun, O., Lamiraux, F. & Wieber, P.-B. (2011). "Kinematic Control of Redundant Manipulators: Generalizing the Task-Priority Framework to Inequality Tasks." *IEEE Trans. Robotics* 27(4).

**Model-based control**

- Wensing, P. M., Wang, A., Seok, S., Otten, D., Lang, J. & Kim, S. (2017). "Proprioceptive Actuator Design in the MIT Cheetah." *IEEE Trans. Robotics* 33(3).
- Di Carlo, J., Wensing, P. M., Katz, B., Bledt, G. & Kim, S. (2018). "Dynamic Locomotion in the MIT Cheetah 3 Through Convex Model-Predictive Control." *IROS*.
- Bledt, G., Powell, M. J., Katz, B., Di Carlo, J., Wensing, P. M. & Kim, S. (2018). "MIT Cheetah 3: Design and Control of a Robust, Dynamic Quadruped Robot." *IROS*.
- Katz, B., Di Carlo, J. & Kim, S. (2019). "Mini Cheetah: A Platform for Pushing the Limits of Dynamic Quadruped Control." *ICRA*.
- Kuindersma, S., Deits, R., Fallon, M., Valenzuela, A., Dai, H., Permenter, F., Koolen, T., Marion, P. & Tedrake, R. (2016). "Optimization-based locomotion planning, estimation, and control design for the Atlas humanoid robot." *Autonomous Robots* 40(3).
- Mastalli, C. et al. (2020). "Crocoddyl: An Efficient and Versatile Framework for Multi-Contact Optimal Control." *ICRA*.
- Tedrake, R. (2024). *Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation*. Online textbook, <https://underactuated.mit.edu>.

**Learning-based control**

- Hwangbo, J., Lee, J., Dosovitskiy, A., Bellicoso, D., Tsounis, V., Koltun, V. & Hutter, M. (2019). "Learning agile and dynamic motor skills for legged robots." *Science Robotics* 4(26).
- Lee, J., Hwangbo, J., Wellhausen, L., Koltun, V. & Hutter, M. (2020). "Learning quadrupedal locomotion over challenging terrain." *Science Robotics* 5(47).
- Kumar, A., Fu, Z., Pathak, D. & Malik, J. (2021). "RMA: Rapid Motor Adaptation for Legged Robots." *RSS*.
- Margolis, G. B., Yang, G., Paigwar, K., Chen, T. & Agrawal, P. (2024). "Rapid Locomotion via Reinforcement Learning." *IJRR* 43(4).
- Margolis, G. B. & Agrawal, P. (2023). "Walk These Ways: Tuning Robot Control for Generalization with Multiplicity of Behavior." *CoRL*.

**State estimation**

- Bloesch, M., Hutter, M., Hoepflinger, M. A., Leutenegger, S., Gehring, C., Remy, C. D. & Siegwart, R. (2013). "State Estimation for Legged Robots — Consistent Fusion of Leg Kinematics and IMU." *RSS*.
- Hartley, R., Ghaffari, M., Eustice, R. M. & Grizzle, J. W. (2020). "Contact-Aided Invariant Extended Kalman Filtering for Robot State Estimation." *IJRR* 39(4).

**Mechanics / actuation**

- Pratt, G. A. & Williamson, M. M. (1995). "Series Elastic Actuators." *IROS*.
- Hurmuzlu, Y. & Marghitu, D. B. (1994). "Rigid Body Collisions of Planar Kinematic Chains With Multiple Contact Points." *International Journal of Robotics Research* 13(1).
- Grimminger, F. et al. (2020). "An Open Torque-Controlled Modular Robot Architecture for Legged Locomotion Research." *RA-L* 5(2).

**Standards / SDKs / documentation (accessed 2026-05)**

- Boston Dynamics Spot SDK 4.x documentation, <https://dev.bostondynamics.com>.
- Unitree SDK (`unitree_sdk2`, `unitree_ros2`), <https://github.com/unitreerobotics>.
- ANYbotics ANYmal Documentation Portal (login-walled).
- NVIDIA Isaac Lab documentation, <https://isaac-sim.github.io/IsaacLab>.
- MuJoCo Menagerie, <https://github.com/google-deepmind/mujoco_menagerie>.
- ETH Robotic Systems Lab `ocs2`, <https://github.com/leggedrobotics/ocs2>.
- Agility Robotics Digit Developer Documentation, <https://docs.agilityrobotics.com>.

**Historical / context**

- Hoyt, D. F. & Taylor, C. R. (1981). "Gait and the energetics of locomotion in horses." *Nature* 292.
- Bertram, J. E. A. & Ruina, A. (2001). "Multiple walking speed-frequency relations are predicted by constrained optimization." *Journal of Theoretical Biology* 209(4).
- Honda Motor Co. (2000). "The Honda Humanoid Robot ASIMO." Technical white paper.
- Buchli, J., Kalakrishnan, M., Mistry, M., Pastor, P. & Schaal, S. (2009). "Compliant quadruped locomotion over rough terrain." *IROS*.
- Bledt, G., Wensing, P. M. & Kim, S. (2017). "Policy-Regularized Model Predictive Control to Stabilize Diverse Quadrupedal Gaits for the MIT Cheetah." *IROS*.
- Posa, M., Cantu, C. & Tedrake, R. (2014). "A direct method for trajectory optimization of rigid bodies through contact." *IJRR* 33(1).
- Mordatch, I., Todorov, E. & Popović, Z. (2012). "Discovery of complex behaviors through contact-invariant optimization." *ACM Trans. Graphics* 31(4).
- Howell, T., Gileadi, N., Tunyasuvunakool, S., Zakka, K., Erez, T. & Tassa, Y. (2022). "Predictive Sampling: Real-time Behaviour Synthesis with MuJoCo." arXiv:2212.00541.
- Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. (2017). "Proximal Policy Optimization Algorithms." arXiv:1707.06347.
- Castro, A., Apgar, T., Castillo, G., Hereid, A., Fern, A. & Hurst, J. (2022). "Sim-to-Real Learning of All Common Bipedal Gaits via Periodic Reward Composition." *IROS*.
- Radosavovic, I., Xiao, T., Zhang, B., Darrell, T., Malik, J. & Sreenath, K. (2024). "Real-World Humanoid Locomotion with Reinforcement Learning." *Science Robotics* 9.
- Kim, S., Bledt, G., Powell, M. J., Katz, B. & Kim, S. (2019). "Cost-Optimal Gait Selection in MIT Mini Cheetah." *ICRA Workshop*.
- Apgar, T., Clary, P., Green, K., Fern, A. & Hurst, J. (2018). "Fast online trajectory optimization for the bipedal robot Cassie." *RSS*.