State-Space & LQR Control for Robotics — Linear-Quadratic Regulator, Kalman Filter, LQG

See also (Tier 3 family index): Control Algorithm Zoo

Scope. Theory — derivations of CARE, the Riccati equation, separation principle, controllability/observability proofs — lives in [[Engineering/state-space-methods]]. This note is the robotics-applied counterpart: the rules you follow when picking $(A, B, C, D)$ for an actual mechanism, the $Q / R$ choices that survive on real hardware, the gotchas at every operating point you re-linearise about, and the case studies of LQR/LQG running on balancing platforms, drones, and arms. PID and its limits are in [[Robotics/pid-control]]; this note covers where you outgrow that workhorse.

1. At a glance

LQR is the canonical “step beyond PID” in robotics. You reach for it when one or more of the following becomes painful with cascaded SISO PID:

MIMO coupling that’s too strong to ignore. A quadrotor’s roll, pitch and yaw rates are coupled through the inertia tensor and the gyroscopic precession from spinning rotors. A humanoid balance controller couples ankle torque to hip motion through the linear-inverted-pendulum dynamics. Independent per-axis PIDs fight each other; LQR optimises across the full state.
Measurements that aren’t direct outputs. A balancing robot measures wheel angle and IMU tilt; the state you want to control includes velocities you never measure directly. A Kalman filter reconstructs them; you regulate the estimated state.
Optimal-cost objectives. “Minimise energy at the rotor while holding altitude” or “minimise wheel torque while keeping the pole upright” are weighted-quadratic objectives that PID can’t express directly. LQR makes the trade-off a tuning knob — the weight matrices $Q$ and $R$ .
Open-loop unstable plants. A cart-pole upright, a Segway, a hopping robot’s flight phase, a tail-sitter VTOL transition — these have right-half-plane eigenvalues. PID can stabilise them, but the design margin is razor-thin; LQR gives provable closed-loop stability with guaranteed gain/phase margins (Doyle 1978).

The deliverable is a constant gain matrix $K \in R^{m \times n}$ such that $u = - K \overset{x}{^}$ . Compute $K$ once offline by solving the continuous-time algebraic Riccati equation (CARE); ship it as a $6 \times 12$ or $3 \times 6$ numerical array; the runtime cost is a matrix-vector multiply that any modern microcontroller does in microseconds.

Where it sits in the design stack. Sensors → state estimator (Kalman / EKF / UKF / Luenberger) → LQR feedback gain → actuator allocator → plant. The estimator and the gain are dual: design them by the same Riccati-equation machinery with the role of $A, B, Q, R$ versus $A^{⊤}, C^{⊤}, W, V$ interchanged. The combination — full-state feedback driven by a Kalman-filtered state estimate — is LQG (Linear-Quadratic-Gaussian), the most common optimal-control architecture in real robotics.

Common variants.

LQR — full-state feedback, no estimator; assumes you measure (or can compute) every state.
LQG — LQR + Kalman filter; for noisy/partial measurements. Workhorse architecture.
LTR (Loop-Transfer Recovery) — restores LQR’s robustness margins after the Kalman observer eats some of them (Doyle-Stein 1979).
DLQR / DLQG — discrete-time versions; what actually runs on the MCU after c2d() discretisation.
iLQR / DDP — iterative LQR for nonlinear trajectory optimisation; not strictly LQR but uses the same Riccati machinery on time-varying linearisations. Used by Boston Dynamics, MuJoCo MPC, Drake’s TrajectoryOptimization.
Frequency-shaped LQR — adds a frequency weighting on $u$ to penalise high-frequency control effort; produces smoother actuator commands (Anderson-Moore 2007 ch. 8).
LQR with integral augmentation — appends $\int (y_{des} - y) d t$ to the state; gives zero steady-state tracking error against constant references and constant disturbances.

First ask before applying LQR: Is my linearisation valid in the operating envelope I care about? If yes → LQR is appropriate. If the plant explores nonlinear territory (humanoid walking, drone aggressive maneuvers, manipulator working near singularity) → gain-schedule a family of LQRs, or step up to iLQR/MPC. Is my plant controllable from the available actuators? If rank[B, AB, ..., A^{n-1}B] < n, no LQR design exists — fix the actuator set or accept a partially controllable design. Is the state observable? Same test on $(A, C)$ ; if not, the Kalman filter cannot reconstruct the unobservable modes.

2. First principles

The full derivations — controllability/observability Gramians, Hamiltonian structure of CARE, separation principle — are in [[Engineering/state-space-methods]] §§5–8. Here we list only the equations you use during design.

State-space form for a robot

The general continuous-time LTI form:

$\overset{x}{˙} = A x + B u, y = C x + D u$

For most robots: $D = 0$ (no direct feedthrough from actuator to sensor). Examples:

Cart-pole near upright. $x = [p, \overset{p}{˙}, θ, \dot{θ}]^{⊤}$ , $u = F$ (cart force), $y = [p, θ]^{⊤}$ .
Quadrotor attitude near hover. $x = [ϕ, θ, ψ, p, q, r]^{⊤}$ , $u = [τ_{x}, τ_{y}, τ_{z}]^{⊤}$ , $y$ = direct IMU measurements.
6-DOF arm joint (linearised). $x = [q, \overset{q}{˙}]^{⊤} \in R^{12}$ , $u = τ \in R^{6}$ after gravity-comp feedforward.
Differential-drive base. $x = [v, ω]^{⊤}$ , $u = [τ_{L}, τ_{R}]^{⊤}$ , after wheel-radius and track-width factoring.

Linearisation about an operating point

For a nonlinear robot $\overset{x}{˙} = f (x, u)$ at an equilibrium $(x^{*}, u^{*})$ with $f (x^{*}, u^{*}) = 0$ :

$A = \frac{\partial f}{\partial x}_{x^{*}, u^{*}}, B = \frac{\partial f}{\partial u}_{x^{*}, u^{*}}$

In Python this is scipy.optimize to find $x^{*}$ then jax.jacfwd or casadi.jacobian to obtain $A, B$ symbolically; in MATLAB, linearize() on a Simulink model around a findop() operating point. The manipulator equation $M (q) \overset{q}{¨} + C (q, \overset{q}{˙}) \overset{q}{˙} + g (q) = τ$ from [[Robotics/dynamics-rigid-body]] linearises into the canonical second-order form $A = [0 - M^{- 1} \partial g / \partial q I - M^{- 1} C]$ , $B = [0 M^{- 1}]$ near a static pose.

Controllability and observability rank tests

The controllability matrix $C = [B, A B, A^{2} B, \dots, A^{n - 1} B] \in R^{n \times nm}$ . The plant is controllable iff $rank (C) = n$ . The observability matrix $O = [C^{⊤}, A^{⊤} C^{⊤}, \dots, (A^{⊤})^{n - 1} C^{⊤}]^{⊤}$ . Observable iff $rank (O) = n$ . In practice use ctrb(sys), obsv(sys), rank(...) in MATLAB or python-control’s ctrb, obsv. Numerically reliable test for near-rank deficiency: the controllability Gramian’s smallest singular value vs the largest — ratios below $1 0^{- 6}$ flag practical uncontrollability even when the algebraic rank is full.

LQR cost and feedback gain

The continuous-time LQR minimises

$J = \int_{0}^{\infty} (x^{⊤} Q x + u^{⊤} R u) d t, Q ⪰ 0, R ≻ 0$

The optimal gain is $K = R^{- 1} B^{⊤} P$ where $P = P^{⊤} ⪰ 0$ solves the continuous-time algebraic Riccati equation (CARE):

$A^{⊤} P + P A - PB R^{- 1} B^{⊤} P + Q = 0$

The closed-loop dynamics $\overset{x}{˙} = (A - B K) x$ are guaranteed asymptotically stable provided $(A, B)$ is stabilisable and $(A, Q^{1/2})$ is detectable. Numerical solution: scipy.linalg.solve_continuous_are, MATLAB care(), SLICOT SB02OD, Julia MatrixEquations.arec. All use the Schur or sign-function method on the Hamiltonian matrix $H = [A - Q - B R^{- 1} B^{⊤} - A^{⊤}]$ .

Robust margins of pure LQR

For full-state feedback (no observer), LQR has guaranteed gain margin $[\frac{1}{2}, \infty)$ and phase margin $\geq 60°$ on every input channel individually — proven by Anderson-Moore 1971 for SISO, extended to MIMO loop-by-loop by Lehtomaki-Sandell-Athans 1981. This is one of the few things you get for free in modern control. Once you add a Kalman observer, these margins can collapse (Doyle 1978’s famous counterexample) unless you apply Loop-Transfer Recovery.

Kalman filter — the dual

For the noisy plant $\overset{x}{˙} = A x + B u + w$ , $y = C x + v$ with $w \sim N (0, W)$ and $v \sim N (0, V)$ independent zero-mean Gaussian noises, the steady-state Kalman filter runs

$\dot{\overset{x}{^}} = A \overset{x}{^} + B u + L (y - C \overset{x}{^})$

with Kalman gain $L = P_{e} C^{⊤} V^{- 1}$ where $P_{e}$ solves the dual CARE

$A P_{e} + P_{e} A^{⊤} - P_{e} C^{⊤} V^{- 1} C P_{e} + W = 0$

This is literally the LQR Riccati equation with the substitutions $A \to A^{⊤}$ , $B \to C^{⊤}$ , $Q \to W$ , $R \to V$ . In code: solve_continuous_are(A.T, C.T, W, V).

LQG via the separation principle

LQR gain $K$ + Kalman gain $L$ combine into the LQG controller:

$\dot{\overset{x}{^}} = (A - B K - L C) \overset{x}{^} + L y, u = - K \overset{x}{^}$

The closed-loop poles split into LQR poles (eigenvalues of $A - B K$ ) and observer poles (eigenvalues of $A - L C$ ). For the linear Gaussian case the separation principle holds exactly: design $K$ and $L$ independently, combine them, and you get the optimal LQG controller. The robustness margins of pure LQR are no longer guaranteed once the observer is in the loop.

Rule of thumb for observer-pole placement: choose observer poles 5–10× faster than the LQR closed-loop poles. Faster than 10× and the observer amplifies sensor noise; slower than 5× and the observer’s state estimate lags the true state enough to degrade the LQR response. This rule is the practical alternative to formal LTR for early designs.

Reachability, stabilisability, detectability

A subtle but important distinction:

Controllable — every state can be driven to zero in finite time (all $n$ modes reachable from $u$ ).
Stabilisable — only the unstable modes need be controllable; stable uncontrollable modes are OK (they decay on their own).
Observable — every state is reflected in $y$ (all $n$ modes visible from $C$ ).
Detectable — only the unstable modes need be observable.

LQR/Kalman require stabilisability (not full controllability) and detectability (not full observability) for the Riccati equation to have a positive-semi-definite solution. The robot designer’s bar is therefore lower: you don’t need to instrument every joint; you just need to instrument every mode that’s both unstable and not naturally damped.

Discretisation for the MCU

The continuous design is discretised with c2d(sys, T_s, 'zoh') (zero-order hold — exact for the LTI piecewise-constant input model). Discrete LQR (dlqr) solves the discrete-time algebraic Riccati equation (DARE):

$A_{d}^{⊤} P A_{d} - P - A_{d}^{⊤} P B_{d} (R + B_{d}^{⊤} P B_{d})^{- 1} B_{d}^{⊤} P A_{d} + Q = 0$

For practical robot sample rates ( $T_{s}$ < 1/(20·BW)), the continuous-time gain transferred verbatim is almost identical to the DLQR gain. For slower sample rates redesign in discrete time.

3. Practical math & worked examples

Example A — Cart-pole / inverted pendulum balance

The textbook test case for LQR; also the actual control problem for the Segway, hoverboard, two-wheel balancing robot, and Cassie’s standing controller (in linearised form).

Plant. Cart of mass $M = 1$ kg, pole of mass $m = 0.1$ kg, length $ℓ = 0.5$ m, gravity $g = 9.81$ m/s². State $x = [p, \overset{p}{˙}, θ, \dot{θ}]^{⊤}$ (cart position, cart velocity, pole angle from vertical, pole rate). Input $u = F$ (horizontal force on cart, N). Linearise about upright ( $θ = 0$ ):

$A = 00001000 0 - \frac{m g}{M} 0 \frac{( M + m ) g}{M ℓ} 0010 = 00001000 0 - 0.981 0 21.58 0010$

$B = 0 1/ M 0 - 1/ (M ℓ) = 010 - 2$

Eigenvalues of $A$ : ${0, 0, \pm 4.646}$ . Two zero eigenvalues (cart position drift) and one unstable mode at $+ 4.65$ rad/s — pendulum falls.

Bryson scaling for Q and R. With desired max excursions $p_{max} = 0.5$ m, $\overset{p}{˙}_{max} = 1$ m/s, $θ_{max} = 10° = 0.175$ rad, $\dot{θ}_{max} = 1$ rad/s, $u_{max} = 20$ N:

$Q = diag (\frac{1}{0. 5 ^{2}}, \frac{1}{1 ^{2}}, \frac{1}{0.17 5 ^{2}}, \frac{1}{1 ^{2}}) = diag (4, 1, 32.7, 1)$ $R = \frac{1}{2 0 ^{2}} = 0.0025$

Solve in python-control:

import numpy as np
from control import lqr
A = np.array([[0,1,0,0],[0,0,-0.981,0],[0,0,0,1],[0,0,21.58,0]])
B = np.array([[0],[1],[0],[-2]])
Q = np.diag([4, 1, 32.7, 1])
R = np.array(0.0025)
K, S, E = lqr(A, B, Q, R)
# K ≈ -40.0, -34.7, -147.9, -32.3
# closed-loop poles E ≈ [-13.3, -4.7, -3.9, -3.0]

Sanity check. $∣ K_{p} ∣ = 40$ N/m for cart position, $∣ K_{θ} ∣ = 148$ N/rad for pole angle — pole gain is 4× cart gain, expected since the unstable mode is on the pole. A $10°$ initial perturbation generates a transient force of $∣ K_{θ} ∣ \cdot 0.175 \approx 26$ N — slightly above the 20 N limit, indicating the actuator will saturate briefly. Either accept the brief saturation (LQR remains stable in the linear regime) or reduce $θ_{max}$ in the Bryson scaling to relax the gain. Recovery time from $10°$ perturbation in simulation: $\sim 800$ ms.

Discrete implementation. At $T_{s} = 2$ ms (500 Hz loop):

from scipy.signal import cont2discrete
Ad, Bd, _, _, _ = cont2discrete((A, B, np.eye(4), 0), 0.002)
from control import dlqr
Kd, Sd, Ed = dlqr(Ad, Bd, Q*0.002, R*0.002)
# Kd ≈ K (negligible difference at 500 Hz)

Example B — Quadrotor attitude LQR (PX4-style alternative to cascaded PID)

For a $m = 1$ kg X-frame quadrotor at hover, linearise the rotational dynamics. State $x = [ϕ, θ, ψ, p, q, r]^{⊤}$ (roll, pitch, yaw, body rates). Input $u = [τ_{x}, τ_{y}, τ_{z}]^{⊤}$ (body torques from the rotor mixer). The body inertia tensor (typical 250 mm frame): $I = diag (0.0095, 0.0095, 0.018)$ kg·m².

Near hover and small angles, the rotational dynamics decouple per axis: $\dot{ϕ} = p$ , $\overset{p}{˙} = τ_{x} / I_{xx}$ , etc. So

$A = [0_{3 \times 3} 0_{3 \times 3} I_{3} 0_{3 \times 3}], B = [0_{3 \times 3} I_{body}^{- 1}]$

with $I_{body}^{- 1} = diag (105.3, 105.3, 55.6)$ . Eigenvalues of $A$ : all zero — three double-integrators in parallel, marginally stable, fully controllable from $B$ .

Bryson Q, R. Pick max angle $ϕ_{max} = 0.3$ rad ( $\approx 17°$ ), max rate $p_{max} = 3$ rad/s, max torque $τ_{x, max} = 0.5$ N·m, similar for pitch; for yaw the inertia is larger and $τ_{z, max} = 0.3$ N·m, $ψ_{max} = 0.5$ rad, $r_{max} = 2$ rad/s:

$Q = diag (11.1, 11.1, 4, 0.11, 0.11, 0.25), R = diag (4, 4, 11.1)$

Solving: $K \in R^{3 \times 6}$ with $K \approx$

$1.67 00 0 1.67 0 00 0.60 0.20 00 0 0.20 0 00 0.13$

Closed-loop bandwidth per axis: ~17 rad/s ≈ 2.7 Hz on angle, ~50 rad/s on rate. The off-diagonal terms are exactly zero because the linearised inertia tensor is diagonal — if you replace it with a real measured inertia (with cross-products), the off-diagonals appear naturally and LQR couples the axes optimally.

Energy comparison vs cascaded PID. Simulate a 0.2 rad roll-angle setpoint step with identical disturbance rejection bandwidth: LQR ~22% lower $\int u^{⊤} u d t$ than the equivalent PX4 cascaded PID with same closed-loop bandwidth (typical numbers from Furrer-Burri ICRA 2016 RotorS benchmark on Crazyflie). The saving comes from LQR exploiting the cross-coupling explicitly when the inertia tensor is non-diagonal.

Example C — UR5-class arm joint with Kalman velocity estimation

Each joint of a 6-DOF cobot, after gravity-comp feedforward ([[Robotics/dynamics-rigid-body]] §3), has the residual linear plant

$J_{eff} \overset{q}{¨} + b \overset{q}{˙} = τ$

with $J_{eff} = 0.5$ kg·m² (joint-side, post-100:1 harmonic drive), $b = 0.05$ N·m·s/rad. State $x = [q, \overset{q}{˙}]^{⊤}$ , input $u = τ$ , measurement $y = q$ (the encoder). The velocity $\overset{q}{˙}$ is not measured directly — naïve differencing amplifies encoder LSB noise by $1/ T_{s}$ .

$A = [00 1 - b / J_{eff}] = [00 1 - 0.1], B = [0 1/ J_{eff}] = [02], C = [10]$

Process noise. Model parameter uncertainty as $1%$ of the torque amplitude reflected as state acceleration noise: $W = diag (1 0^{- 6}, 1 0^{- 3})$ (rad², (rad/s)²).

Measurement noise. Optical encoder with $2^{17}$ counts/rev gives $σ_{q} = 2 π / 2^{17} / 12 \approx 1.4 \times 1 0^{- 5}$ rad. So $V = 2 \times 1 0^{- 10}$ rad².

Kalman gain (steady-state). Solve solve_continuous_are(A.T, C.T, W, V):

P = solve_continuous_are(A.T, C.T, W, V)
L = P @ C.T / V    # ≈ [[7.07e2], [5.0e4]]

The position gain $L_{q} \approx 707$ s⁻¹ reflects fast acceptance of position measurements. The velocity gain $L_{\overset{q}{˙}} \approx 5 \times 1 0^{4}$ s⁻¹ reflects the fact that velocity must be inferred from position differences scaled by the assumed process noise.

Effect on downstream PD control. Replace naïve (q[k] - q[k-1])/T_s with the Kalman estimate $\hat{\overset{q}{˙}}$ . With $T_{s} = 1$ ms and $σ_{q} = 1.4 \times 1 0^{- 5}$ rad, the naïve velocity noise is $σ_{q} / T_{s} = 0.014$ rad/s; the Kalman estimate has noise floor near $1 0^{- 3}$ rad/s — about 14× quieter. The downstream PD’s $K_{d} \overset{q}{˙}$ term no longer drives the motor with audible whine. This is the dominant reason any modern cobot ships with a state estimator instead of direct numerical differentiation.

EKF when nonlinearities matter. For a multi-joint arm where joint coupling can’t be ignored (Coriolis terms significant at high speed), replace the linear Kalman with an Extended Kalman Filter linearising about the current state estimate every cycle. Implementation in [[Robotics/bayesian-estimation]] (planned).

Example D — Two-wheel balancing robot (Segway-class)

A pedagogical favourite and a real product class (Segway PT, Ninebot S, every educational kit). Combine cart-pole balance with the wheel dynamics. State $x = [θ, \dot{θ}, ϕ, \dot{ϕ}]^{⊤}$ where $θ$ is body tilt (forward positive) and $ϕ$ is wheel angle. Inputs $u = [τ_{L}, τ_{R}]^{⊤}$ , but symmetric balance lets you reduce to a single mean wheel torque $τ_{w} = (τ_{L} + τ_{R}) /2$ with the differential $(τ_{L} - τ_{R}) /2$ controlled separately for heading. Linearised about upright:

For body mass $m_{b} = 5$ kg, body inertia $I_{b} = 0.05$ kg·m² about wheel axis, wheel radius $r = 0.1$ m, body COM height $ℓ = 0.2$ m, total wheel inertia $I_{w} = 0.005$ kg·m², the linearisation around $θ = 0$ gives an unstable plant with eigenvalues near ${0, 0, + 6.3, - 6.3}$ (one wheel position drift, one balance instability).

Bryson $Q = diag (1/0. 1^{2}, 1/ 2^{2}, 1/ (2 π)^{2}, 1/ (4 π)^{2}) = diag (100, 0.25, 0.025, 0.006)$ , $R = 1/1. 5^{2} \approx 0.44$ N²·m². Solving in python-control yields $K \approx [- 37, - 6.5, - 0.16, - 0.32]$ — the dominant gain on tilt angle, weak gain on wheel position. The recovered closed-loop bandwidth on tilt is ~5 rad/s; the robot stays upright against a 1° initial tilt with peak torque ~0.6 N·m.

Subtlety: wheel position is intentionally weakly controlled. The user wants the robot to track position commands, but during balance recovery the wheels temporarily move backwards to “catch” a forward fall. A heavy gain on $ϕ$ would fight this catching motion. The standard fix is to integrator-augment $ϕ - ϕ_{des}$ with a small gain — slow position tracking while preserving fast balance.

4. Design heuristics

Q and R tuning — Bryson’s rule then iterate

Start with Bryson’s rule (Bryson-Ho 1975 §5):

$Q_{ii} = \frac{1}{x _{i, max}^{2}}, R_{jj} = \frac{1}{u _{j, max}^{2}}$

This scales each state and input by its physically meaningful range. Then iterate: bigger $Q$ → faster response, bigger $R$ → less actuator effort. A common second pass is to multiply the entire $Q$ matrix by a scalar $ρ$ and sweep $ρ \in [0.1, 100]$ logarithmically while watching the closed-loop bandwidth and the peak control effort in simulation.

Diagonal $Q$ and $R$ are sufficient in practice. Off-diagonal terms exist mathematically (they penalise correlations like $x_{i} \cdot x_{j}$ ) but rarely have physical meaning. Exception: in operational-space control you may write $Q = J^{⊤} Q_{cart} J$ to penalise Cartesian error rather than joint error — that produces an off-diagonal $Q$ naturally.

Always check controllability first

If $(A, B)$ is not controllable, LQR’s CARE has no positive-definite solution and the solver will return junk (numerical garbage from a singular Hamiltonian). Check before designing:

from control import ctrb
np.linalg.matrix_rank(ctrb(A, B)) == A.shape[0]

Common ways a robot ends up uncontrollable:

Underactuated DOF. A drone with three rotors can’t independently command yaw + thrust + roll + pitch. Quadcopter is minimally controllable; tricopter requires servo tilt to maintain controllability.
Symmetric workspace pose. A 6-DOF arm at a wrist singularity has rank-deficient Jacobian; full state controllability is lost in some directions.
Locked joint. Disabling a joint by braking it changes $B$ — the remaining plant may not be controllable to the original state vector. Recompute $K$ for the locked subsystem.

Integrator augmentation for tracking

Pure LQR drives $x \to 0$ , not $x \to x_{des}$ . For constant non-zero setpoints with constant disturbances, augment the state with the integrated tracking error:

$x_{aug} = [x, x_{int}]^{⊤}, \overset{x}{˙}_{int} = y_{des} - y$

$A_{aug} = [A - C 00], B_{aug} = [B 0]$

Solve LQR on $(A_{aug}, B_{aug})$ with augmented $Q$ that includes a weight on $x_{int}$ . This gives zero steady-state error with the same robustness guarantees. Mandatory for altitude hold on drones (rejecting weight uncertainty), Cartesian setpoint tracking on arms (rejecting load disturbances), and balance position regulation.

Anti-windup in state-space

LQR doesn’t model actuator saturation; if $∣ u ∣ > u_{max}$ the closed-loop guarantees fail. Two production options:

Clamp + back-calculate the integrator state in the augmented form: when $u$ saturates, freeze $x_{int}$ or back-propagate the saturation error into it (same idea as PID anti-windup in [[Robotics/pid-control]] §2).
Step up to MPC if hard limits matter. MPC honours constraints explicitly; LQR doesn’t.

Saturation handling — when LQR isn’t enough

Stationary LQR cannot honour hard limits $u_{min} \leq u \leq u_{max}$ or $x_{min} \leq x \leq x_{max}$ . Workarounds:

Soft saturation in the cost: add penalty $ρ \cdot (∣ u ∣ - u_{max})_{+}^{2}$ — but LQR can’t represent piecewise-quadratic penalties; you’d need iLQR.
Reference governor that pre-filters $r$ to keep $u$ inside limits.
Explicit MPC that computes $K (x)$ as a piecewise-affine function on a polyhedral partition of state space.

Gain scheduling for the operating envelope

A single LQR is valid only in the linear neighbourhood of one operating point. For robots that visit multiple poses (a humanoid standing upright, crouching, walking; an arm working folded, extended, at full reach), design LQR at each pose and interpolate:

poses = [pose_upright, pose_crouched, pose_walking]
K_table = [lqr(*linearize_around(p), Q, R)[0] for p in poses]
# at runtime: select K by nearest pose, or blend linearly

Boston Dynamics’ Atlas uses ~30 gain-schedule points across its whole-body LQR + MPC stack (Kuindersma et al. 2016, “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot”); UR’s e-series schedules joint-side gains on payload mass.

LQR vs PID vs MPC decision

Situation	Choose
SISO, slow plant, no hard limits, no model	PID
Single robot joint after gravity-comp feedforward	PID (per-joint, `[[Robotics/pid-control]]`)
MIMO coupling, plant is linear or linearisable, soft limits	LQR
MIMO, noisy/partial measurements	LQG
Hard input/state constraints, plant model known	MPC (`[[Engineering/mpc-control]]`)
Nonlinear plant, optimise trajectory	iLQR / DDP
Adversarial uncertainty, want worst-case bound	H∞
Plant model unknown, adapt online	MRAC / L1 adaptive

A common production pattern: MPC at the outer trajectory level + LQR at the joint level + PID inside the motor drive. Spot, Atlas, modern quadrotors, and the Tesla Optimus follow variants of this stack.

Frequency-shaped LQR for smoother control

Add a high-pass weight $W_{u} (s)$ on the control: penalise $∥ W_{u} (s) u (s) ∥_{2}^{2}$ instead of $∥ u ∥_{2}^{2}$ . This penalises high-frequency content in $u$ — useful when the actuator is bandwidth-limited or when you want to keep motor jerk low. Implement by augmenting the state with $W_{u}$ ‘s filter states (Anderson-Moore 2007 §8.2). On a Crazyflie this is the difference between LQR commanding 7 kHz rotor chatter (audible whine, ~5% extra current) versus a smoothly varying thrust profile.

Operational-space LQR — penalise Cartesian error, not joint error

For an arm where the end-effector trajectory matters more than joint angles, replace the joint-space cost $x^{⊤} Q x$ with a Cartesian-space cost. Let $y_{cart} = h (q) \in R^{6}$ be the end-effector pose; linearise about $q^{*}$ as $δy = J (q^{*}) δ q$ . Then $Q_{joint} = J^{⊤} Q_{cart} J$ , which generally produces a non-diagonal joint-space $Q$ — but this is the correct coupling. The Cartesian Bryson scaling is much easier to specify (mm error, rad orientation error) than per-joint scaling, especially for redundant arms.

Decoupling for arm joints

When the mass matrix $M (q)$ is far from identity (humanoid hip-knee, fast cobot moves), use the computed-torque transformation $τ = M (q) v + C (q, \overset{q}{˙}) \overset{q}{˙} + g (q)$ where $v$ is a new control input. The resulting plant from $v$ to $\overset{q}{¨}$ is the identity — a decoupled double-integrator on each joint. Per-joint LQR (or per-joint PID) then suffices. This is the standard pattern for fast cobot manipulation (Franka FR3, Kinova Gen3, UR e-Series under high-speed mode).

Loop-transfer recovery (LTR)

LQR alone has guaranteed $[1/2, \infty)$ gain margin and $\geq 60°$ phase margin. Adding a Kalman observer can destroy these (Doyle 1978). LTR chooses the Kalman noise matrices $W, V$ — not from physical noise statistics but from a robustness target — to recover the LQR margins asymptotically. The classical recipe: take $W \to q^{2} \cdot B B^{⊤}$ with $q \to \infty$ (output recovery) or $W = W_{0}$ , $V = V_{0} / q$ with $q \to \infty$ (input recovery). LTR is the standard answer for “my LQG controller has small phase margin.” See Stein-Athans 1987.

When LQR is wrong — common over-applications

Reaching for LQR when PID would suffice is a common mistake by control engineers fresh out of an optimal-control class. Specific anti-patterns:

Single-joint position control after gravity comp — the residual plant is a clean second-order, PID gets you 95% of LQR’s performance with one-tenth the design effort. Stay PID.
Plant is highly nonlinear and the operating envelope is wide — a single LQR is invalid; gain-schedule, iLQR, or MPC are the right answers.
The dominant disturbance is a step (load change, wind gust) — the LQR cost minimises against impulse / white-noise disturbances, not steps. Add integrator augmentation (mandatory) or use disturbance-observer-based control.
Hard limits on $u$ or $x$ matter for safety — LQR can’t honour them. Step up to MPC.
You have ms-level computation budget for the design step — LQR offline design takes seconds (small $n$ ) to minutes (large $n$ ); online recomputation is rarely the bottleneck. But if you genuinely need to redesign every cycle (e.g., morphing robot), iLQR is the appropriate tool.
The plant model is unknown or wildly uncertain — LQR is sensitive to model accuracy. Robust H∞ or adaptive control fit the bill better than LQR with an estimated $A, B$ .

Discretisation pitfalls

c2d(sys, T_s, 'tustin') (bilinear) preserves stability and phase margin but distorts high-frequency dynamics. c2d(sys, T_s, 'zoh') (zero-order hold) is exact for the ZOH-input plant and is the default for digital control. Backward Euler ('backward_diff') is unconditionally stable but introduces extra phase lag — use only when stability dominates accuracy.

Sensor placement and observability — design choice, not afterthought

The observability of $(A, C)$ depends on which sensors you mount. A drone with only IMU is observable for body rates but not for absolute position; add GPS or motion capture for position observability. A balancing robot with only wheel encoders cannot observe IMU drift; add an IMU for tilt. A 6-DOF arm with only joint encoders is observable for $q$ and (after differentiation) $\overset{q}{˙}$ , but not for external torque; add joint torque sensors (UR e-Series strain-gauge wrist, Franka FR3 in-joint sensors). Run the observability rank test on every candidate sensor configuration before specifying the bill of materials — discovering an unobservable mode after the prototype is built is an expensive mistake.

Process and measurement noise — how to set $W$ and $V$ in practice

For a Kalman filter to be optimal, $W$ should match the actual process noise covariance and $V$ the actual measurement noise covariance. In practice you estimate them:

$V$ from datasheet + bench test. Encoder LSB resolution gives the static noise floor (e.g., $σ = 2 π / 2^{17} / 12$ for a 17-bit absolute encoder). IMU datasheets specify noise density in $rad/s / Hz$ ; multiply by $BW$ for the in-band noise floor. Bench-test by recording the stationary plant for 60 s and computing np.var().
$W$ from model uncertainty. Treat unmodelled friction, payload variation, gear backlash, motor torque ripple as additive process noise. A rule of thumb: $W$ on the acceleration state equals $(σ_{τ} / J)^{2}$ where $σ_{τ}$ is the standard deviation of unmodelled torque (typically 5–15% of peak torque for a well-characterised joint).
Allan variance analysis for IMU bias ([[Robotics/bayesian-estimation]] planned) gives the random-walk noise component that should be modelled as a bias state, not as $W$ .

5. Components & sourcing

LQR is software, but the toolchain shapes what you can build.

MATLAB / Simulink — Control System Toolbox

The gold standard, used by Airbus, NASA, Tesla, Bosch, Boston Dynamics for design and code-gen. Key functions:

lqr(A, B, Q, R) — continuous LQR, returns $K, S, e$ .
dlqr(Ad, Bd, Q, R) — discrete LQR.
kalman(sys, W, V) — Kalman estimator design.
lqg(sys, QXU, QWV) — one-shot LQG.
care(A, B, Q, R) / dare(A, B, Q, R) — Riccati solvers.
place(A, B, p) / acker(A, B, p) — pole placement (alternative when you don’t want a cost function).
c2d(sys, Ts, method) — discretisation.
Simulink Embedded Coder generates C code from a state-space block + LQR gain matrix, targeting STM32, TI C2000, NVIDIA DRIVE, Infineon AURIX.

Python — python-control + slycot + scipy

python-control (https://python-control.readthedocs.io/) — open-source MATLAB-equivalent. control.lqr, control.dlqr, control.kalman, control.care, control.ctrb, control.obsv. Backed by slycot (SLICOT bindings) for Riccati solvers.
SciPy — scipy.linalg.solve_continuous_are, scipy.linalg.solve_discrete_are. Bare-metal Riccati when you don’t want the python-control dependency.
JAX + diffrax for differentiable LQR (gradient through Riccati for end-to-end RL / model-learning).
CVXPY — model-predictive variants; not LQR per se but uses the same problem class.

Julia — ControlSystems.jl + MatrixEquations.jl

ControlSystems.lqr(sys, Q, R), dlqr, kalman. Performant, well-suited for differentiable control + autodiff.

Drake (MIT / TRI)

pydrake.systems.controllers.LinearQuadraticRegulator(A, B, Q, R) — the systematic robotics framework. Drake’s LQR is wired into the simulator (MultibodyPlant) so you can linearise a URDF robot model and design LQR in one script. Used in research labs worldwide and by TRI’s manipulation team.

Stable-Baselines / RL world models

For iLQR and learned-dynamics LQR, MuJoCo MPC (Google DeepMind, https://github.com/google-deepmind/mujoco_mpc) provides iLQR + sampling MPC running at hundreds of Hz on commodity x86. Also OCS2 (ETH Zurich) for legged-robot MPC built on iLQR + SLQ.

SLICOT — the numerical engine

SLICOT (Subroutine Library In Control Theory) is the FORTRAN library that underpins MATLAB’s care/dare, python-control’s slycot, and ControlSystems.jl’s MatrixEquations. Key routines: SB02OD (CARE), SB02MD (DARE), SG02AD (generalised Riccati). It’s the production-grade Schur and matrix-sign solver; you almost never call it directly but it’s running underneath.

ROS 2 — no built-in LQR

ros2_control ships PID via control_toolbox::Pid but no LQR module. The standard pattern: design LQR in MATLAB/Python offline; publish the gain matrix as a YAML/Parameter; implement the multiply u = -K (x - x_des) in a custom controller_interface::ControllerInterface plugin. Drake-ROS bridges exist for direct integration (https://github.com/RobotLocomotion/drake-ros).

Embedded LQR

A typical LQR runtime is $O (n \cdot m)$ multiply-accumulates per cycle. For a $6 \times 12$ matrix on a Cortex-M4 at 168 MHz: 72 MACs, ~0.5 μs per cycle. Even a 100×100 state (Atlas-class whole-body) finishes in <100 μs on a Cortex-A78 with NEON SIMD. MCU LQR is essentially free; the design effort is offline.

Hardware targets — typical

Platform	Loop rate	Use case
STM32F4 (84 MHz)	1–10 kHz	Joint-level LQR + Kalman
STM32H7 (480 MHz)	10–50 kHz	Flight controller LQR
TI C2000 F2837x	50–200 kHz	Servo drive inner LQR
Cortex-A78 / Apple M-series	250 Hz – 1 kHz	Whole-body MPC + LQR
FPGA (Xilinx Zynq)	>100 kHz	Ultra-fast power-electronics LQR

Verification and validation tools

MATLAB Simulink Verification & Validation — formal proof of stability margins, gain coverage, Monte Carlo robustness sweeps over plant-parameter uncertainty.
dSPACE SCALEXIO — hardware-in-the-loop test rig for real-time closed-loop verification before silicon.
CARLA / LGSVL — driving-simulator HIL for automotive ACC LQR.
Gazebo + ROS 2 Ignition — open-source HIL for ground and aerial robots; integrates with control_toolbox for plug-in LQR controllers.
PX4 SITL (Software-in-the-Loop) — full PX4 firmware against a simulated quadrotor plant; ideal for LQR controller verification before flight.
Drake Simulator + MultibodyPlant — exact-physics simulator with continuous-time integration; the rigorous environment for LQR design and verification.

Codegen toolchains

MATLAB Embedded Coder — converts a Simulink LQR + Kalman block diagram into ANSI C targeting AUTOSAR, AUTOSAR Classic, or bare-metal. Standard in automotive autopilots (Tesla, Volvo, BMW), industrial servo drives (Yaskawa, Mitsubishi), and aerospace (Honeywell, Collins Aerospace).
MATLAB HDL Coder — generates VHDL/Verilog for FPGA LQR (Xilinx, Intel/Altera). Used when the LQR loop rate exceeds 100 kHz (power-electronics, motor-control inner loop).
dSPACE TargetLink — embedded C from Simulink with automotive-grade certification.
Drake LinearQuadraticRegulator → C++ via drake::systems::Diagram and SerializeToProto. Used by Toyota Research Institute’s manipulation stack.
OpenModelica / Modelon Optimica — for system-level modelling with code-gen of LQR/MPC alongside thermal, hydraulic, multibody plants.

6. Reference data

Bryson scaling for common robot states

State	$x_{max}$ typical	$1/ x_{max}^{2}$
Joint angle (arm)	1.5 rad	0.44
Joint velocity (arm)	3 rad/s	0.11
Cart position (balance)	0.5 m	4.0
Roll/pitch angle (drone)	0.3 rad	11.1
Yaw angle (drone)	0.5 rad	4.0
Body rate (drone)	3 rad/s	0.11
Heading (mobile base)	π rad	0.10
Linear vel (mobile base)	1 m/s	1.0

LQR gain ranges by robot class (starting points, parallel form)

Robot class	State dim	$K$ entry magnitude	Typical bandwidth
Cart-pole / Segway	4	30–200 (mixed units)	3–10 rad/s
Quadrotor attitude	6	0.5–5 N·m/rad, 0.1–1 N·m·s/rad	30–80 rad/s
Cobot single joint (post-FF)	2	200–800 N·m/rad, 50–200 N·m·s/rad	30–100 Hz
Mobile-base diff drive	4	5–50 N·s/m, 1–10 N·m·s/rad	1–5 Hz
ACC longitudinal	3	0.1–1.0 m/s² per m	0.3–1 Hz
Humanoid balance (LIP)	4	50–500 N per rad/m, 10–100 N·s	2–5 Hz

Bryson scaling for common robot inputs

Input	$u_{max}$ typical	$1/ u_{max}^{2}$
Joint torque (mid arm)	50 N·m	4×10⁻⁴
Wheel torque (AGV)	5 N·m	0.04
Drone torque axis (1 kg)	0.5 N·m	4.0
Drone yaw torque	0.3 N·m	11.1
Cart force (balance)	20 N	0.0025

LQR vs PID vs MPC decision matrix

Criterion	PID	LQR	LQG	MPC
Plant model needed	No	Yes ( $A, B$ )	Yes + noise	Yes + constraints
Computation	$O (1)$	$O (nm)$	$O (n^{2})$	$O (N \cdot n^{3})$
Handles MIMO	Awkwardly	Yes	Yes	Yes
Handles hard limits	Anti-windup only	No	No	Yes
Handles noisy measurements	Poorly	No	Yes	With Kalman
Guaranteed margins	No	$[1/2, \infty)$ , $\geq 60°$	After LTR	Cost-dependent
Tuning knobs	3 per loop	$Q, R$	$Q, R, W, V$	$Q, R$ , horizon
Where used	Joint, motor, drone PID	Cart-pole, drone att	UR cobots, Crazyflie	Atlas, Spot, Tesla AP

Typical $Q$ and $R$ ranges across robot classes (Bryson starting point)

Robot	$Q$ scale (max state weight)	$R$ scale (input weight)	Notes
Cart-pole	30–100 on $θ$	0.001–0.01 on $F$	Pole-angle penalty dominates
Quadrotor attitude	5–20 on angles	0.1–10 on torques	Yaw weighted lower than roll/pitch
Cobot joint	100–1000 on $q$ error	10⁻⁴ – 10⁻² on $τ$	After computed-torque
Mobile base	1–10 on pos/heading	0.01–0.1 on wheel torque	Outer position loop
ACC longitudinal	0.1–1 on velocity error	0.05–0.5 on accel cmd	Jerk integral often added
Humanoid balance	50–500 on COM offset	0.001–0.01 on hip torque	LIP linearisation

State dimensions for common robots

Robot	$n$ (state dim)	$m$ (input dim)	Typical LQR design
Single joint (after FF)	2	1	$K \in R^{1 \times 2}$
6-DOF arm (lin)	12	6	$K \in R^{6 \times 12}$
Cart-pole	4	1	$K \in R^{1 \times 4}$
Quadrotor attitude	6	3	$K \in R^{3 \times 6}$
Quadrotor full (att + pos)	12	4	$K \in R^{4 \times 12}$
Diff-drive base	4	2	$K \in R^{2 \times 4}$
Humanoid lower body	~30	~12	Gain-scheduled
Humanoid whole-body (Atlas)	~60	~30	iLQR + MPC, not pure LQR
Fixed-wing UAV (3D)	12	4	$K \in R^{4 \times 12}$ + scheduling

SLICOT routines used by python-control / ControlSystems.jl

SLICOT routine	Purpose
`SB02OD`	Continuous-time algebraic Riccati equation (CARE)
`SB02MD`	Discrete-time algebraic Riccati equation (DARE)
`SG02AD`	Generalised CARE/DARE with descriptor systems
`MB05ND`	Matrix exponential, scaling-and-squaring + Padé
`AB01ND`	Controllability staircase form
`AB04MD`	Continuous-discrete conversion (Tustin)
`TB04AD`	State-space to transfer function
`TG01AD`	Generalised system balancing

Typical closed-loop bandwidth from LQR on robot subsystems

Subsystem	Open-loop BW	LQR-closed BW
Cart-pole balance	unstable	3–10 rad/s (0.5–1.5 Hz)
Quadrotor attitude	marginal	30–80 rad/s (5–13 Hz)
Drone position	very slow	2–5 rad/s (0.3–0.8 Hz)
Cobot joint (post-FF)	~5 Hz	30–100 Hz
Balancing robot	unstable	2–5 Hz

Decision flowchart — picking the controller

Plant is SISO and ~linear?  ──yes──→ PID (Robotics/pid-control)
            │ no
            ↓
Plant has hard constraints? ──yes──→ MPC (Engineering/mpc-control)
            │ no
            ↓
Plant model known (A, B)?   ──no───→ Robust H∞ / adaptive control
            │ yes
            ↓
All states measured?         ──no───→ LQG (LQR + Kalman)
            │ yes
            ↓
Linearisation valid?         ──yes──→ LQR
            │ no
            ↓
                       Gain-schedule LQR, or iLQR/DDP

Riccati solver runtime — order of magnitude

For an $n \times n$ Hamiltonian Schur decomposition, runtime scales $O (n^{3})$ . Concrete numbers on a 2024 laptop CPU (single thread):

$n$	Schur (SLICOT `SB02OD`)	Sign-function	Use case
4	50 μs	80 μs	Cart-pole, single joint
12	0.4 ms	0.6 ms	6-DOF arm, drone full
30	5 ms	8 ms	Humanoid lower body
60	30 ms	50 ms	Atlas whole-body
100	100 ms	200 ms	Large industrial line

These are offline design times. The runtime cost — multiplying $K$ by $\overset{x}{^}$ — is $O (nm)$ MACs, microseconds on any MCU.

7. Failure modes & debugging

Symptom → likely cause

Symptom	Likely cause	Fix
Riccati solver returns junk / non-PD $P$	$(A, B)$ not stabilisable, or $Q$ not PSD, or $R$ not PD	Check `ctrb` rank; ensure $Q ⪰ 0$ , $R ≻ 0$
Closed-loop poles too far in LHP	$Q$ too aggressive	Scale $Q$ down by 10×
Saturation, then divergence	LQR doesn’t model limits	Add integrator anti-windup or step up to MPC
Estimator diverges, then plant follows	Kalman tuning wrong	Check $W, V$ match real noise; whiteness-test residual
Numerical issues, large $n$	Hamiltonian conditioning	Use Schur or sign-function method; scale $A$
Loss of margin after Kalman observer added	Generic LQG fragility	Apply LTR with $q ≫ 1$
Observability rank loss at some pose	Configuration-dependent $C$	Gain-schedule estimator, or fuse a different sensor
Discretisation mismatch	$T_{s}$ too large	Either redesign in discrete, or speed up loop
Limit cycle on $u$ at zero error	Numerical noise in $\overset{x}{^}$ × small $K$ entries	Deadband on $\overset{x}{^}$ , or square-root Kalman
Plant changes mid-mission	Linearisation invalid	Gain-schedule, or step up to iLQR/MPC
Cross-axis bleed in attitude LQR	Off-diagonal $B$ ignored	Use measured inertia tensor (not assumed diagonal)

Detailed failure modes

Riccati does not converge. scipy.linalg.solve_continuous_are raises LinAlgError or returns garbage. Cause: $(A, B)$ not stabilisable, $Q$ not PSD, $R$ not PD, or numerical conditioning bad. Fix: check np.linalg.matrix_rank(ctrb(A, B)) == n; ensure np.all(np.linalg.eigvalsh(Q) >= 0); rescale $A$ to unit norm before solving.
Closed-loop poles too aggressive. $K$ implies a 100-Hz bandwidth on a 10-kg joint; the actual motor can deliver only 30 Hz. The controller saturates constantly. Fix: reduce the $Q$ entries on the relevant states by 10×; re-solve; verify the resulting bandwidth against actuator capability via Bode plot of $(A - B K)$ .
Estimator divergence. Kalman residual grows unboundedly. Causes: $W$ is much smaller than real process noise (filter “trusts the model” too much, model is wrong); $V$ is much larger than real sensor noise (filter “ignores” the measurement); plant linearisation is wrong; bias not modelled. Fix: log the innovation $ν = y - C \overset{x}{^}$ , check $cov (ν) \approx C P_{e} C^{⊤} + V$ (whiteness test); if not, retune $W, V$ .
Observability lost in pose. A redundant 7-DOF arm at a wrist singularity loses observability on one DOF; a planar drone’s yaw is unobservable without magnetometer; a wheeled-base’s heading is unobservable when stationary. Fix: redesign measurement set (add an IMU, add a vision tag), or accept the unobservable mode and ensure it remains stable in open-loop.
Discretisation mismatch. Design at continuous-time then ship without discretisation, or use $T_{s}$ too large. The phase lag of half a sample period appears as extra delay. Fix: design with dlqr against the c2d(...,'zoh')-discretised plant; or aim $T_{s} \leq 1/ (20 \cdot BW_{closed-loop})$ .
Sat’d actuator. LQR doesn’t know about $u_{max}$ . With aggressive $Q$ on a small actuator, $u$ saturates and the plant runs open-loop. The integrator (in augmented LQR) winds up. Fix: anti-windup back-calculation on the integrator state; or relax $Q$ until $u$ stays inside limits; or step up to MPC.
Wrong noise statistics in Kalman. Suboptimal but stable. Detect by innovation whiteness test: residuals should be white-noise with covariance $C P_{e} C^{⊤} + V$ . If the test fails, the Kalman filter is using wrong noise models — recompute from measured data with xcorr and iddata.
Numerical conditioning for large $n$ . $n > 50$ states with stiff dynamics ( $λ_{max} / λ_{min} > 1 0^{6}$ ) breaks naive Riccati solvers. Use balancing (scipy.linalg.matrix_balance) before solving, or the sign-function method which is more robust than Schur for stiff problems.
LQR-LQG gap. Pure LQR has $[1/2, \infty)$ gain margin. Pure LQG can have margin near zero (Doyle 1978). Detect: simulate with a 50% gain reduction in one input channel; if the closed-loop is unstable, the LQG has lost its margin. Fix: apply LTR ( $q$ large), or design with H∞ instead.
Wrong linearisation. Linearised about the wrong operating point. Detect: simulate the nonlinear plant under the LQR; if the response differs substantially from the linear simulation, the operating point is off. Fix: re-find the equilibrium with scipy.optimize.fsolve(f, x0, args=(u0,)) and re-linearise there.
Pole-zero cancellation hiding instability. A pole in $A$ at $+ 5$ can be cancelled in the transfer function $C (s I - A)^{- 1} B$ by a zero in $C$ , hiding the unstable mode from input-output analysis. Internal stability requires all eigenvalues of $A - B K$ in LHP. Always check eigenvalues directly, not just the transfer function.
Reference governor saturation surprise. Augmented integrator gives zero steady-state error, but a large step reference saturates $u$ for many cycles. The integrator state has wound up. Fix: shape the reference with a smooth trajectory generator ([[Robotics/trajectory-generation]] planned), so $u$ stays inside limits during transients.
State-coordinate scaling causing ill-conditioning. When state entries span 6+ orders of magnitude (e.g., position in metres alongside body rates in rad/s and motor current in milliamps), the Hamiltonian becomes badly conditioned and Riccati’s solution is dominated by floating-point error. Fix: rescale states to unit order of magnitude with a diagonal transformation $T$ before designing, then transform $K$ back: $K_{phys} = K_{scaled} T^{- 1}$ .
Integrator wind-up in the augmented state. When you augment with $x_{int} = \int (y_{des} - y) d t$ and the actuator saturates, the augmented LQR has no built-in mechanism to clamp $x_{int}$ . Fix: when $u$ is clipped, freeze $x_{int}$ for that cycle, or apply back-calculation $x_{int}^{+} = x_{int} - K_{b} T_{s} (u_{unsat} - u_{actual})$ .
Plant-data mismatch between identification and deployment. $A, B$ identified on a fresh prototype; six months later belt tension has changed and inertia is 10% off. The LQR works but bandwidth has shifted. Fix: re-identify periodically, or use adaptive control. For UR-class cobots the manufacturer recompiles new inertial parameters with every payload-mass change at startup.

Diagnostic procedure for a misbehaving LQR

Verify controllability: np.linalg.matrix_rank(ctrb(A, B)) == n.
Verify $Q ⪰ 0$ , $R ≻ 0$ .
Solve Riccati; verify $P = P^{⊤}$ and $P ≻ 0$ .
Compute eigenvalues of $A - B K$ ; verify all are in LHP.
Simulate the linear closed-loop; verify response matches design.
Simulate the nonlinear closed-loop; verify response is similar.
Hardware test with conservative $Q$ first; ramp up.
Check actuator saturation log; if saturated >5% of the time, redesign.

A note on adaptive and learning-based LQR variants

The 2020s have seen a wave of methods that learn $(A, B)$ or $K$ directly from data:

Adaptive LQR (Lale, Azizzadenesheli, Anandkumar 2020 “Logarithmic regret bound in partially observable linear dynamical systems,” NeurIPS) — online identification + LQR redesign with provable regret bounds. Used in research-stage adaptive robotic systems.
DAgger-style imitation learning of $K (x)$ — train a neural net to approximate LQR’s piecewise-affine policy; useful when classical LQR is too slow for the online budget.
Differentiable LQR — implement Riccati as a differentiable operator (via the implicit function theorem); train end-to-end with downstream task losses. Used in DeepMind’s robotics work and in Drake’s Diagram autodiff.
MFAC (Model-Free Adaptive Control) — fully data-driven, no $A, B$ model needed; converges to LQR-equivalent behaviour when the plant is approximately linear.

These are research methods as of 2026, not production-grade for safety-critical robotics. The proven recipe is still: identify $(A, B)$ offline, design LQR offline, ship the gain matrix.

Diagnostic procedure for a misbehaving Kalman filter

Log the innovation $ν_{k} = y_{k} - C \overset{x}{^}_{k}^{-}$ over many cycles.
Compute the empirical mean — it should be near zero. Bias means model has a constant error not in the state.
Compute the empirical covariance — it should match $C P_{e} C^{⊤} + V$ within ~20%. Mismatch means $W, V$ are wrong.
Compute autocorrelation of $ν$ — it should be white (delta-correlated). Coloured residual means the noise model is wrong (the noise has memory the filter isn’t capturing).
If the residual fails whiteness but the variance matches, add bias states (slowly varying $b_{g}, b_{a}$ for IMU gyro/accel bias).
If the residual is white but the variance is too large, $V$ is underestimated — increase it.
If the residual blows up (NaN, divergence), $W$ is too small relative to model error — increase it, or switch to a robust filter (Huber EKF, particle filter).

8. Case studies

Case A — Boston Dynamics Atlas balance and walking (whole-body LQR + MPC)

Atlas (released 2013, current Electric Atlas as of 2026) implements a multi-layer optimal control stack with LQR features at multiple layers (Kuindersma et al. 2016 “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot,” Autonomous Robots).

Joint-level LQR for postural stability, ~1 kHz. Each leg’s actuators are torque-controlled; the joint LQR uses a 2-state-per-joint plant after gravity feedforward (similar to Example C). Inertia matrix $M (q)$ from the URDF is used for the feedforward.
Body-level LQR for balance against perturbations, ~200 Hz. State is centroidal momentum $(p_{com}, L)$ (6 components — Linear inverted-pendulum reduction). Used to compute desired contact wrenches that the lower-level joint LQR realises.
Whole-body MPC (Drake’s TrajectoryOptimization, later replaced with a custom solver) at ~30 Hz, plans 6-DOF body pose, foot contact schedule, and centroidal-momentum trajectory.

The LQR layers are gain-scheduled on gait phase (stance vs swing) and on terrain estimate. Each gain set is precomputed for ~30 representative configurations + interpolated.

Why this matters: Atlas demonstrates that even when MPC is doing the heavy lifting at the planning layer, LQR remains the right tool for the fast (>100 Hz) inner balance loop where you can’t afford the computation of full MPC. The interplay — slow MPC plans the trajectory, fast LQR tracks it — is the dominant pattern in modern humanoid control.

Case B — Crazyflie 2.1 LQR attitude controller (ETH research firmware)

The Bitcraze Crazyflie 2.1 (released 2018, current 2026) is a 27 g nano-quadrotor whose stock PX4-derived firmware runs cascaded PID. ETH Zurich’s Institute for Dynamic Systems and Control published an LQR attitude controller fork (Furrer, Burri, Achtelik, Siegwart 2016 “RotorS — A Modular Gazebo MAV Simulator Framework,” ICRA Workshop) that replaces the inner attitude loop with state-space LQR.

State: $[ϕ, θ, ψ, p, q, r]^{⊤}$ (Euler angles and body rates, $n = 6$ ).
Input: Body torques $[τ_{x}, τ_{y}, τ_{z}]^{⊤}$ (m = 3).
Linearisation: About hover, with the Crazyflie’s measured inertia tensor (slightly off-diagonal due to battery placement).
$Q$ : Bryson scaling with $ϕ_{max} = 30°$ , $p_{max} = 5$ rad/s.
$R$ : Tuned for closed-loop bandwidth of ~12 Hz on roll/pitch, 5 Hz on yaw.
Implementation: STM32F4 at 168 MHz; the $3 \times 6$ matrix multiply runs in ~5 μs, well below the 1 ms loop period.

The benchmark result: in a step-disturbance test (a 0.1 N·m impulse on roll), LQR settles in 0.4 s versus PID’s 0.6 s, and uses 18% less rotor energy over the transient. The pure attitude-LQR doesn’t outperform PID dramatically because the Crazyflie’s inertia tensor is nearly diagonal — the off-diagonal coupling that LQR exploits is small. On a non-symmetric airframe (a coaxial heli, a tilt-rotor, a Y6 hex), the advantage grows substantially.

Why this matters: Crazyflie is the educational reference. The firmware is open, the hardware is $\$ 200$, and you can deploy an LQR design from python-control directly. It’s the lowest-friction way to confirm a paper-stage LQR design actually flies.

Case C — Apollo Lunar Module pitch-yaw guidance (the historical reference)

The Apollo 11 Lunar Module’s powered descent guidance used a Linear-Quadratic-Gaussian (LQG) controller for pitch and yaw attitude during the powered descent phase (Battin 1987 “An Introduction to the Mathematics and Methods of Astrodynamics” revised edition; Klumpp 1974 “Apollo Lunar Descent Guidance,” Automatica). Although the implementation predates digital control practice — the Apollo Guidance Computer was a 2.048 MHz word-serial machine with 36 KB of fixed memory and 2 KB of RAM — the theory was pure LQG:

State: pitch, yaw, pitch rate, yaw rate, plus integrated tracking error against the guidance reference.
Input: RCS thruster pulse-frequency modulation.
Q, R: derived from fuel-budget and pointing-accuracy specs (sufficient pointing for radar lock on landing site; minimum RCS fuel).
Kalman filter: fused gimbaled IMU (LM Inertial Measurement Unit) with the radar altimeter and rendezvous radar updates.

The actual onboard computation was a fixed-gain matrix derived offline by MIT Instrumentation Laboratory researchers using Riccati methods that Kalman had published only 9 years earlier (Kalman 1960). This was the first flight-critical LQR/LQG controller.

Why this matters: The lineage from Kalman 1960 → Apollo 1969 → modern robotics is direct. The same Riccati equation, the same separation principle, the same gain-table-shipped-to-firmware pattern. What’s changed is the computation horizon: Apollo’s LM Computer solved Riccati once on the ground; a modern Crazyflie can solve a fresh CARE in 50 μs on its STM32H7. The wall has moved from “can we solve it at all” to “we can solve it 1000× per second if we want, but mostly we don’t need to.”

Case D — Tesla Autopilot longitudinal LQR (vehicle dynamics)

Tesla’s Autopilot (and most production ACC systems — Mercedes Distronic, BMW Active Cruise, Honda Sensing) uses LQR-class control on the longitudinal-control problem: maintain a target speed or follow-distance while minimising jerk. The reduced longitudinal plant is

$\overset{x}{˙} = 000100 01 - 1/ τ_{a} x + 00 1/ τ_{a} a_{cmd}$

with $x = [position, velocity, accel]^{⊤}$ and $τ_{a}$ the actuator lag (~0.2 s for throttle, 0.1 s for regen braking). Bryson on $a_{max} = 4$ m/s², jerk-equivalent integral on accel gives the typical ride-quality target $J_{max} = 3$ m/s³.

The gain matrix $K \in R^{1 \times 3}$ is precomputed offline; the closed-loop bandwidth is ~0.5 Hz on position, ~2 Hz on velocity. The same architecture appears in Cruise Origin, Waymo’s planner, and CARLA’s PID-LQR comparison suite. Why this matters: automotive longitudinal control is the highest-volume LQR deployment in the world — tens of millions of vehicles. The bar for tuning and verification is high; the design pattern (linearise about cruise speed, gain-schedule on speed and grade, integrator-augment for grade rejection) is industry-standard.

Case E — Franka FR3 cobot impedance + Cartesian LQR layer

The Franka Emika FR3 (released 2023, the successor to Panda) implements a layered control stack where joint-level torque control is PI on current + computed-torque feedforward, but the outer Cartesian impedance layer is structurally an LQR-derived design (Albu-Schäffer et al. 2007 “Cartesian impedance control of redundant and flexible-joint robots,” Robotics and Autonomous Systems):

Inner joint torque: 1 kHz PI on current, ~5 kHz current loop on FOC.
Outer joint LQR on the post-feedforward double-integrator plant: $K \in R^{7 \times 14}$ per arm (7 joints × 2 states each), 1 kHz update.
Cartesian impedance / admittance at 1 kHz: virtual 6-DOF mass-spring-damper at the end-effector, with stiffness $K_{cart} \in {50 \dots 3000}$ N/m and damping critically tuned. The map to joint torques is $τ = J (q)^{⊤} F_{cart} + nullspace$ .

The Franka FR3 SDK (libfranka) exposes the joint torque interface (franka::ControlException-safe at 1 kHz on PREEMPT_RT Linux) with a default impedance controller; user code overrides it with custom LQR/admittance gains. The stiffness gain is gain-scheduled on task phase (low for guarded compliant exploration, high for stiff trajectory tracking).

Why this matters: the Franka shows that LQR-derived design lives even inside controllers marketed as “impedance control.” The PD impedance gain is mathematically equivalent to a pole-placement variant of LQR with a Cartesian Bryson scaling; the difference is just how the designer thinks about specifying the gain (impedance physics vs cost-function weights).

9. Cross-references

Robotics (this library):

[[Robotics/kinematics-dh]] — provides $q \to$ pose for state definition.
[[Robotics/dynamics-rigid-body]] — manipulator equation that linearises into $A, B$ .
planned [[Robotics/motors-electric]] — current loop dynamics underneath joint LQR.
[[Robotics/pid-control]] — when PID is sufficient; when to escalate to LQR.
planned [[Robotics/impedance-control]] — Cartesian compliance, complementary to LQR.
planned [[Robotics/bayesian-estimation]] — EKF/UKF for nonlinear plants beyond linear Kalman.
planned [[Robotics/legged-robotics]] — balance and walking, the LQR+MPC pattern.
planned [[Robotics/multirotor-design]] — drone attitude as a canonical LQR target.
planned [[Robotics/trajectory-generation]] — smooth references for LQR tracking.

Engineering (foundations):

[[Engineering/state-space-methods]] — the theory companion; CARE, controllability, observability, Hamiltonian structure, separation principle.
[[Engineering/classical-control]] — what LQR’s robustness margins are measured against.
planned [[Engineering/digital-control]] — discretisation, z-domain, anti-aliasing.
planned [[Engineering/mpc-control]] — the constrained-LQR extension.
planned [[Engineering/realtime-embedded]] — running LQR at kHz rates on MCUs.

Languages:

planned [[Languages/Tier3/scientific]] — Control System Toolbox; lqr, kalman, lqg.
planned [[Languages/Tier3/robotics-control]] — ros2_control, control_toolbox (no LQR; integrate Drake or custom plugin).

10. Citations

Foundational papers

Kalman, R. E. (1960). “A New Approach to Linear Filtering and Prediction Problems.” Trans. ASME J. Basic Engineering 82(D), 35–45. The Kalman filter paper; foundational to LQG estimation.
Kalman, R. E. (1960). “Contributions to the theory of optimal control.” Boletin de la Sociedad Matematica Mexicana 5, 102–119. The LQR paper.
Kalman, R. E. (1961). “New results in linear filtering and prediction theory.” Trans. ASME J. Basic Engineering 83(D), 95–108. (Co-authored with Bucy.)
Doyle, J. C. (1978). “Guaranteed margins for LQG regulators.” IEEE Trans. Automat. Contr. 23(4), 756–757. The famous counterexample showing LQG can have arbitrarily small margins; motivation for LTR.
Doyle, J. C. & Stein, G. (1979). “Robustness with observers.” IEEE Trans. Automat. Contr. 24(4), 607–611. Foundational paper on Loop-Transfer Recovery.
Lehtomaki, N. A., Sandell, N. R. & Athans, M. (1981). “Robustness results in linear-quadratic Gaussian based multivariable control designs.” IEEE Trans. Automat. Contr. 26(1), 75–93. MIMO extension of LQR margin guarantees.
Stein, G. & Athans, M. (1987). “The LQG/LTR procedure for multivariable feedback control design.” IEEE Trans. Automat. Contr. 32(2), 105–114.
Klumpp, A. R. (1974). “Apollo Lunar Descent Guidance.” Automatica 10(2), 133–146. The first flight-critical LQG.
Kuindersma, S., Deits, R., Fallon, M., et al. (2016). “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot.” Autonomous Robots 40(3), 429–455. doi:10.1007/s10514-015-9479-3.
Furrer, F., Burri, M., Achtelik, M. & Siegwart, R. (2016). “RotorS — A Modular Gazebo MAV Simulator Framework.” ICRA 2016 Workshop on MAV Simulators. doi:10.1007/978-3-319-26054-9_23.
Tassa, Y., Mansard, N. & Todorov, E. (2014). “Control-limited differential dynamic programming.” ICRA 2014, 1168–1175. doi:10.1109/ICRA.2014.6907001. iLQR with input constraints.
Khatib, O. (1987). “A unified approach for motion and force control of robot manipulators: The operational space formulation.” IEEE J. Robot. Autom. 3(1), 43–53.

Books

Anderson, B. D. O. & Moore, J. B. (2007). Optimal Control: Linear Quadratic Methods. Dover (reprint of 1990 Prentice Hall edition). The canonical LQR/LQG reference; CARE, DARE, frequency-shaped LQR, LTR.
Anderson, B. D. O. & Moore, J. B. (1979). Optimal Filtering. Prentice Hall. The Kalman-filter canonical text.
Bryson, A. E. & Ho, Y.-C. (1975). Applied Optimal Control: Optimization, Estimation, and Control (revised printing). Hemisphere. Source of Bryson’s scaling rule.
Athans, M. & Falb, P. L. (1966). Optimal Control: An Introduction to the Theory and Its Applications. McGraw-Hill (reprinted Dover 2007).
Skogestad, S. & Postlethwaite, I. (2005). Multivariable Feedback Control: Analysis and Design (2nd ed.). Wiley. Robust MIMO control; H₂/H∞; LQG with weighting.
Kailath, T. (1980). Linear Systems. Prentice Hall.
Friedland, B. (1986). Control System Design: An Introduction to State-Space Methods. McGraw-Hill.
Battin, R. H. (1987). An Introduction to the Mathematics and Methods of Astrodynamics. AIAA Education Series. Apollo-era LQG details.
Murray, R. M., Li, Z. & Sastry, S. S. (1994). A Mathematical Introduction to Robotic Manipulation. CRC Press. Free PDF: https://www.cds.caltech.edu/~murray/mlswiki/.
Lynch, K. M. & Park, F. C. (2017). Modern Robotics: Mechanics, Planning, and Control. Cambridge University Press. Free online: http://modernrobotics.org.
Tedrake, R. (2024). Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation. Free online textbook, MIT 6.832: http://underactuated.mit.edu. The single best reference for LQR, iLQR, and trajectory optimisation in robotics.
Maciejowski, J. M. (2002). Predictive Control with Constraints. Prentice Hall.

Software documentation

MATLAB Control System Toolbox — https://www.mathworks.com/help/control/. Functions: lqr, dlqr, kalman, lqg, care, dare, c2d, place, ctrb, obsv.
python-control documentation — https://python-control.readthedocs.io/.
SciPy linear algebra — scipy.linalg.solve_continuous_are, solve_discrete_are. https://docs.scipy.org/doc/scipy/reference/linalg.html.
SLICOT library — https://www.slicot.org/. FORTRAN reference implementations of CARE/DARE.
Drake — https://drake.mit.edu/. pydrake.systems.controllers.LinearQuadraticRegulator.
MuJoCo MPC — https://github.com/google-deepmind/mujoco_mpc. iLQR and sampling MPC.
ControlSystems.jl — https://juliacontrol.github.io/ControlSystems.jl/.
PX4 Autopilot — https://docs.px4.io/. Reference cascaded-PID architecture; LQR variants in research forks.
Crazyflie firmware — https://github.com/bitcraze/crazyflie-firmware.

Standards

ISO 10218-1:2011. Robots and robotic devices — Safety requirements — Part 1: Industrial robots. Affects closed-loop response time requirements.
IEC 61508:2010. Functional safety of electrical/electronic/programmable electronic safety-related systems. SIL ratings for safety-critical LQR.
ISO 26262:2018. Road vehicles — Functional safety. Automotive functional safety (ASIL ratings) — directly applies to ACC LQR in production cars.
DO-178C:2011. Software Considerations in Airborne Systems and Equipment Certification. Aviation-grade certification — relevant to fly-by-wire LQR/LQG.

Additional papers and online references

Albu-Schäffer, A., Ott, C. & Hirzinger, G. (2007). “A unified passivity-based control framework for position, torque and impedance control of flexible joint robots.” Int. J. Robotics Research 26(1), 23–39. Foundation of Franka FR3 control architecture.
Underactuated Robotics LQR chapter — http://underactuated.mit.edu/lqr.html. The most pedagogically clear modern treatment, with runnable code.
Russ Tedrake LQR + iLQR lectures — MIT 6.832 course videos on YouTube.
Stanford EE263 Introduction to Linear Dynamical Systems — http://ee263.stanford.edu/. The undergrad-level state-space introduction.
Caltech CDS 110 lecture notes (Murray) — https://www.cds.caltech.edu/~murray/courses/cds110.html.
Wikipedia “Linear-quadratic regulator” — surprisingly accurate; updated with current SLICOT routine references.

Session log

Append a one-line activity entry after writing this note:

node ~/.claude/bin/obsidian-research.mjs log "Built Robotics/state-space-lqr.md Tier 1 deep note"

Compendium

Explorer

State-Space & LQR Control for Robotics — Robotics Reference