State-Space & LQR Control for Robotics — Linear-Quadratic Regulator, Kalman Filter, LQG
See also (Tier 3 family index): Control Algorithm Zoo
Scope. Theory — derivations of CARE, the Riccati equation, separation principle, controllability/observability proofs — lives in
[[Engineering/state-space-methods]]. This note is the robotics-applied counterpart: the rules you follow when picking for an actual mechanism, the choices that survive on real hardware, the gotchas at every operating point you re-linearise about, and the case studies of LQR/LQG running on balancing platforms, drones, and arms. PID and its limits are in[[Robotics/pid-control]]; this note covers where you outgrow that workhorse.
1. At a glance
LQR is the canonical “step beyond PID” in robotics. You reach for it when one or more of the following becomes painful with cascaded SISO PID:
- MIMO coupling that’s too strong to ignore. A quadrotor’s roll, pitch and yaw rates are coupled through the inertia tensor and the gyroscopic precession from spinning rotors. A humanoid balance controller couples ankle torque to hip motion through the linear-inverted-pendulum dynamics. Independent per-axis PIDs fight each other; LQR optimises across the full state.
- Measurements that aren’t direct outputs. A balancing robot measures wheel angle and IMU tilt; the state you want to control includes velocities you never measure directly. A Kalman filter reconstructs them; you regulate the estimated state.
- Optimal-cost objectives. “Minimise energy at the rotor while holding altitude” or “minimise wheel torque while keeping the pole upright” are weighted-quadratic objectives that PID can’t express directly. LQR makes the trade-off a tuning knob — the weight matrices and .
- Open-loop unstable plants. A cart-pole upright, a Segway, a hopping robot’s flight phase, a tail-sitter VTOL transition — these have right-half-plane eigenvalues. PID can stabilise them, but the design margin is razor-thin; LQR gives provable closed-loop stability with guaranteed gain/phase margins (Doyle 1978).
The deliverable is a constant gain matrix such that . Compute once offline by solving the continuous-time algebraic Riccati equation (CARE); ship it as a or numerical array; the runtime cost is a matrix-vector multiply that any modern microcontroller does in microseconds.
Where it sits in the design stack. Sensors → state estimator (Kalman / EKF / UKF / Luenberger) → LQR feedback gain → actuator allocator → plant. The estimator and the gain are dual: design them by the same Riccati-equation machinery with the role of versus interchanged. The combination — full-state feedback driven by a Kalman-filtered state estimate — is LQG (Linear-Quadratic-Gaussian), the most common optimal-control architecture in real robotics.
Common variants.
- LQR — full-state feedback, no estimator; assumes you measure (or can compute) every state.
- LQG — LQR + Kalman filter; for noisy/partial measurements. Workhorse architecture.
- LTR (Loop-Transfer Recovery) — restores LQR’s robustness margins after the Kalman observer eats some of them (Doyle-Stein 1979).
- DLQR / DLQG — discrete-time versions; what actually runs on the MCU after
c2d()discretisation. - iLQR / DDP — iterative LQR for nonlinear trajectory optimisation; not strictly LQR but uses the same Riccati machinery on time-varying linearisations. Used by Boston Dynamics, MuJoCo MPC, Drake’s TrajectoryOptimization.
- Frequency-shaped LQR — adds a frequency weighting on to penalise high-frequency control effort; produces smoother actuator commands (Anderson-Moore 2007 ch. 8).
- LQR with integral augmentation — appends to the state; gives zero steady-state tracking error against constant references and constant disturbances.
First ask before applying LQR: Is my linearisation valid in the operating envelope I care about? If yes → LQR is appropriate. If the plant explores nonlinear territory (humanoid walking, drone aggressive maneuvers, manipulator working near singularity) → gain-schedule a family of LQRs, or step up to iLQR/MPC. Is my plant controllable from the available actuators? If rank[B, AB, ..., A^{n-1}B] < n, no LQR design exists — fix the actuator set or accept a partially controllable design. Is the state observable? Same test on ; if not, the Kalman filter cannot reconstruct the unobservable modes.
2. First principles
The full derivations — controllability/observability Gramians, Hamiltonian structure of CARE, separation principle — are in [[Engineering/state-space-methods]] §§5–8. Here we list only the equations you use during design.
State-space form for a robot
The general continuous-time LTI form:
For most robots: (no direct feedthrough from actuator to sensor). Examples:
- Cart-pole near upright. , (cart force), .
- Quadrotor attitude near hover. , , = direct IMU measurements.
- 6-DOF arm joint (linearised). , after gravity-comp feedforward.
- Differential-drive base. , , after wheel-radius and track-width factoring.
Linearisation about an operating point
For a nonlinear robot at an equilibrium with :
In Python this is scipy.optimize to find then jax.jacfwd or casadi.jacobian to obtain symbolically; in MATLAB, linearize() on a Simulink model around a findop() operating point. The manipulator equation from [[Robotics/dynamics-rigid-body]] linearises into the canonical second-order form , near a static pose.
Controllability and observability rank tests
The controllability matrix . The plant is controllable iff . The observability matrix . Observable iff . In practice use ctrb(sys), obsv(sys), rank(...) in MATLAB or python-control’s ctrb, obsv. Numerically reliable test for near-rank deficiency: the controllability Gramian’s smallest singular value vs the largest — ratios below flag practical uncontrollability even when the algebraic rank is full.
LQR cost and feedback gain
The continuous-time LQR minimises
The optimal gain is where solves the continuous-time algebraic Riccati equation (CARE):
The closed-loop dynamics are guaranteed asymptotically stable provided is stabilisable and is detectable. Numerical solution: scipy.linalg.solve_continuous_are, MATLAB care(), SLICOT SB02OD, Julia MatrixEquations.arec. All use the Schur or sign-function method on the Hamiltonian matrix .
Robust margins of pure LQR
For full-state feedback (no observer), LQR has guaranteed gain margin and phase margin on every input channel individually — proven by Anderson-Moore 1971 for SISO, extended to MIMO loop-by-loop by Lehtomaki-Sandell-Athans 1981. This is one of the few things you get for free in modern control. Once you add a Kalman observer, these margins can collapse (Doyle 1978’s famous counterexample) unless you apply Loop-Transfer Recovery.
Kalman filter — the dual
For the noisy plant , with and independent zero-mean Gaussian noises, the steady-state Kalman filter runs
with Kalman gain where solves the dual CARE
This is literally the LQR Riccati equation with the substitutions , , , . In code: solve_continuous_are(A.T, C.T, W, V).
LQG via the separation principle
LQR gain + Kalman gain combine into the LQG controller:
The closed-loop poles split into LQR poles (eigenvalues of ) and observer poles (eigenvalues of ). For the linear Gaussian case the separation principle holds exactly: design and independently, combine them, and you get the optimal LQG controller. The robustness margins of pure LQR are no longer guaranteed once the observer is in the loop.
Rule of thumb for observer-pole placement: choose observer poles 5–10× faster than the LQR closed-loop poles. Faster than 10× and the observer amplifies sensor noise; slower than 5× and the observer’s state estimate lags the true state enough to degrade the LQR response. This rule is the practical alternative to formal LTR for early designs.
Reachability, stabilisability, detectability
A subtle but important distinction:
- Controllable — every state can be driven to zero in finite time (all modes reachable from ).
- Stabilisable — only the unstable modes need be controllable; stable uncontrollable modes are OK (they decay on their own).
- Observable — every state is reflected in (all modes visible from ).
- Detectable — only the unstable modes need be observable.
LQR/Kalman require stabilisability (not full controllability) and detectability (not full observability) for the Riccati equation to have a positive-semi-definite solution. The robot designer’s bar is therefore lower: you don’t need to instrument every joint; you just need to instrument every mode that’s both unstable and not naturally damped.
Discretisation for the MCU
The continuous design is discretised with c2d(sys, T_s, 'zoh') (zero-order hold — exact for the LTI piecewise-constant input model). Discrete LQR (dlqr) solves the discrete-time algebraic Riccati equation (DARE):
For practical robot sample rates ( < 1/(20·BW)), the continuous-time gain transferred verbatim is almost identical to the DLQR gain. For slower sample rates redesign in discrete time.
3. Practical math & worked examples
Example A — Cart-pole / inverted pendulum balance
The textbook test case for LQR; also the actual control problem for the Segway, hoverboard, two-wheel balancing robot, and Cassie’s standing controller (in linearised form).
Plant. Cart of mass kg, pole of mass kg, length m, gravity m/s². State (cart position, cart velocity, pole angle from vertical, pole rate). Input (horizontal force on cart, N). Linearise about upright ():
Eigenvalues of : . Two zero eigenvalues (cart position drift) and one unstable mode at rad/s — pendulum falls.
Bryson scaling for Q and R. With desired max excursions m, m/s, rad, rad/s, N:
Solve in python-control:
import numpy as np
from control import lqr
A = np.array([[0,1,0,0],[0,0,-0.981,0],[0,0,0,1],[0,0,21.58,0]])
B = np.array([[0],[1],[0],[-2]])
Q = np.diag([4, 1, 32.7, 1])
R = np.array(0.0025)
K, S, E = lqr(A, B, Q, R)
# K ≈ -40.0, -34.7, -147.9, -32.3
# closed-loop poles E ≈ [-13.3, -4.7, -3.9, -3.0]Sanity check. N/m for cart position, N/rad for pole angle — pole gain is 4× cart gain, expected since the unstable mode is on the pole. A initial perturbation generates a transient force of N — slightly above the 20 N limit, indicating the actuator will saturate briefly. Either accept the brief saturation (LQR remains stable in the linear regime) or reduce in the Bryson scaling to relax the gain. Recovery time from perturbation in simulation: ms.
Discrete implementation. At ms (500 Hz loop):
from scipy.signal import cont2discrete
Ad, Bd, _, _, _ = cont2discrete((A, B, np.eye(4), 0), 0.002)
from control import dlqr
Kd, Sd, Ed = dlqr(Ad, Bd, Q*0.002, R*0.002)
# Kd ≈ K (negligible difference at 500 Hz)Example B — Quadrotor attitude LQR (PX4-style alternative to cascaded PID)
For a kg X-frame quadrotor at hover, linearise the rotational dynamics. State (roll, pitch, yaw, body rates). Input (body torques from the rotor mixer). The body inertia tensor (typical 250 mm frame): kg·m².
Near hover and small angles, the rotational dynamics decouple per axis: , , etc. So
with . Eigenvalues of : all zero — three double-integrators in parallel, marginally stable, fully controllable from .
Bryson Q, R. Pick max angle rad (), max rate rad/s, max torque N·m, similar for pitch; for yaw the inertia is larger and N·m, rad, rad/s:
Solving: with
Closed-loop bandwidth per axis: ~17 rad/s ≈ 2.7 Hz on angle, ~50 rad/s on rate. The off-diagonal terms are exactly zero because the linearised inertia tensor is diagonal — if you replace it with a real measured inertia (with cross-products), the off-diagonals appear naturally and LQR couples the axes optimally.
Energy comparison vs cascaded PID. Simulate a 0.2 rad roll-angle setpoint step with identical disturbance rejection bandwidth: LQR ~22% lower than the equivalent PX4 cascaded PID with same closed-loop bandwidth (typical numbers from Furrer-Burri ICRA 2016 RotorS benchmark on Crazyflie). The saving comes from LQR exploiting the cross-coupling explicitly when the inertia tensor is non-diagonal.
Example C — UR5-class arm joint with Kalman velocity estimation
Each joint of a 6-DOF cobot, after gravity-comp feedforward ([[Robotics/dynamics-rigid-body]] §3), has the residual linear plant
with kg·m² (joint-side, post-100:1 harmonic drive), N·m·s/rad. State , input , measurement (the encoder). The velocity is not measured directly — naïve differencing amplifies encoder LSB noise by .
Process noise. Model parameter uncertainty as of the torque amplitude reflected as state acceleration noise: (rad², (rad/s)²).
Measurement noise. Optical encoder with counts/rev gives rad. So rad².
Kalman gain (steady-state). Solve solve_continuous_are(A.T, C.T, W, V):
P = solve_continuous_are(A.T, C.T, W, V)
L = P @ C.T / V # ≈ [[7.07e2], [5.0e4]]The position gain s⁻¹ reflects fast acceptance of position measurements. The velocity gain s⁻¹ reflects the fact that velocity must be inferred from position differences scaled by the assumed process noise.
Effect on downstream PD control. Replace naïve (q[k] - q[k-1])/T_s with the Kalman estimate . With ms and rad, the naïve velocity noise is rad/s; the Kalman estimate has noise floor near rad/s — about 14× quieter. The downstream PD’s term no longer drives the motor with audible whine. This is the dominant reason any modern cobot ships with a state estimator instead of direct numerical differentiation.
EKF when nonlinearities matter. For a multi-joint arm where joint coupling can’t be ignored (Coriolis terms significant at high speed), replace the linear Kalman with an Extended Kalman Filter linearising about the current state estimate every cycle. Implementation in [[Robotics/bayesian-estimation]] (planned).
Example D — Two-wheel balancing robot (Segway-class)
A pedagogical favourite and a real product class (Segway PT, Ninebot S, every educational kit). Combine cart-pole balance with the wheel dynamics. State where is body tilt (forward positive) and is wheel angle. Inputs , but symmetric balance lets you reduce to a single mean wheel torque with the differential controlled separately for heading. Linearised about upright:
For body mass kg, body inertia kg·m² about wheel axis, wheel radius m, body COM height m, total wheel inertia kg·m², the linearisation around gives an unstable plant with eigenvalues near (one wheel position drift, one balance instability).
Bryson , N²·m². Solving in python-control yields — the dominant gain on tilt angle, weak gain on wheel position. The recovered closed-loop bandwidth on tilt is ~5 rad/s; the robot stays upright against a 1° initial tilt with peak torque ~0.6 N·m.
Subtlety: wheel position is intentionally weakly controlled. The user wants the robot to track position commands, but during balance recovery the wheels temporarily move backwards to “catch” a forward fall. A heavy gain on would fight this catching motion. The standard fix is to integrator-augment with a small gain — slow position tracking while preserving fast balance.
4. Design heuristics
Q and R tuning — Bryson’s rule then iterate
Start with Bryson’s rule (Bryson-Ho 1975 §5):
This scales each state and input by its physically meaningful range. Then iterate: bigger → faster response, bigger → less actuator effort. A common second pass is to multiply the entire matrix by a scalar and sweep logarithmically while watching the closed-loop bandwidth and the peak control effort in simulation.
Diagonal and are sufficient in practice. Off-diagonal terms exist mathematically (they penalise correlations like ) but rarely have physical meaning. Exception: in operational-space control you may write to penalise Cartesian error rather than joint error — that produces an off-diagonal naturally.
Always check controllability first
If is not controllable, LQR’s CARE has no positive-definite solution and the solver will return junk (numerical garbage from a singular Hamiltonian). Check before designing:
from control import ctrb
np.linalg.matrix_rank(ctrb(A, B)) == A.shape[0]Common ways a robot ends up uncontrollable:
- Underactuated DOF. A drone with three rotors can’t independently command yaw + thrust + roll + pitch. Quadcopter is minimally controllable; tricopter requires servo tilt to maintain controllability.
- Symmetric workspace pose. A 6-DOF arm at a wrist singularity has rank-deficient Jacobian; full state controllability is lost in some directions.
- Locked joint. Disabling a joint by braking it changes — the remaining plant may not be controllable to the original state vector. Recompute for the locked subsystem.
Integrator augmentation for tracking
Pure LQR drives , not . For constant non-zero setpoints with constant disturbances, augment the state with the integrated tracking error:
Solve LQR on with augmented that includes a weight on . This gives zero steady-state error with the same robustness guarantees. Mandatory for altitude hold on drones (rejecting weight uncertainty), Cartesian setpoint tracking on arms (rejecting load disturbances), and balance position regulation.
Anti-windup in state-space
LQR doesn’t model actuator saturation; if the closed-loop guarantees fail. Two production options:
- Clamp + back-calculate the integrator state in the augmented form: when saturates, freeze or back-propagate the saturation error into it (same idea as PID anti-windup in
[[Robotics/pid-control]]§2). - Step up to MPC if hard limits matter. MPC honours constraints explicitly; LQR doesn’t.
Saturation handling — when LQR isn’t enough
Stationary LQR cannot honour hard limits or . Workarounds:
- Soft saturation in the cost: add penalty — but LQR can’t represent piecewise-quadratic penalties; you’d need iLQR.
- Reference governor that pre-filters to keep inside limits.
- Explicit MPC that computes as a piecewise-affine function on a polyhedral partition of state space.
Gain scheduling for the operating envelope
A single LQR is valid only in the linear neighbourhood of one operating point. For robots that visit multiple poses (a humanoid standing upright, crouching, walking; an arm working folded, extended, at full reach), design LQR at each pose and interpolate:
poses = [pose_upright, pose_crouched, pose_walking]
K_table = [lqr(*linearize_around(p), Q, R)[0] for p in poses]
# at runtime: select K by nearest pose, or blend linearlyBoston Dynamics’ Atlas uses ~30 gain-schedule points across its whole-body LQR + MPC stack (Kuindersma et al. 2016, “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot”); UR’s e-series schedules joint-side gains on payload mass.
LQR vs PID vs MPC decision
| Situation | Choose |
|---|---|
| SISO, slow plant, no hard limits, no model | PID |
| Single robot joint after gravity-comp feedforward | PID (per-joint, [[Robotics/pid-control]]) |
| MIMO coupling, plant is linear or linearisable, soft limits | LQR |
| MIMO, noisy/partial measurements | LQG |
| Hard input/state constraints, plant model known | MPC ([[Engineering/mpc-control]]) |
| Nonlinear plant, optimise trajectory | iLQR / DDP |
| Adversarial uncertainty, want worst-case bound | H∞ |
| Plant model unknown, adapt online | MRAC / L1 adaptive |
A common production pattern: MPC at the outer trajectory level + LQR at the joint level + PID inside the motor drive. Spot, Atlas, modern quadrotors, and the Tesla Optimus follow variants of this stack.
Frequency-shaped LQR for smoother control
Add a high-pass weight on the control: penalise instead of . This penalises high-frequency content in — useful when the actuator is bandwidth-limited or when you want to keep motor jerk low. Implement by augmenting the state with ‘s filter states (Anderson-Moore 2007 §8.2). On a Crazyflie this is the difference between LQR commanding 7 kHz rotor chatter (audible whine, ~5% extra current) versus a smoothly varying thrust profile.
Operational-space LQR — penalise Cartesian error, not joint error
For an arm where the end-effector trajectory matters more than joint angles, replace the joint-space cost with a Cartesian-space cost. Let be the end-effector pose; linearise about as . Then , which generally produces a non-diagonal joint-space — but this is the correct coupling. The Cartesian Bryson scaling is much easier to specify (mm error, rad orientation error) than per-joint scaling, especially for redundant arms.
Decoupling for arm joints
When the mass matrix is far from identity (humanoid hip-knee, fast cobot moves), use the computed-torque transformation where is a new control input. The resulting plant from to is the identity — a decoupled double-integrator on each joint. Per-joint LQR (or per-joint PID) then suffices. This is the standard pattern for fast cobot manipulation (Franka FR3, Kinova Gen3, UR e-Series under high-speed mode).
Loop-transfer recovery (LTR)
LQR alone has guaranteed gain margin and phase margin. Adding a Kalman observer can destroy these (Doyle 1978). LTR chooses the Kalman noise matrices — not from physical noise statistics but from a robustness target — to recover the LQR margins asymptotically. The classical recipe: take with (output recovery) or , with (input recovery). LTR is the standard answer for “my LQG controller has small phase margin.” See Stein-Athans 1987.
When LQR is wrong — common over-applications
Reaching for LQR when PID would suffice is a common mistake by control engineers fresh out of an optimal-control class. Specific anti-patterns:
- Single-joint position control after gravity comp — the residual plant is a clean second-order, PID gets you 95% of LQR’s performance with one-tenth the design effort. Stay PID.
- Plant is highly nonlinear and the operating envelope is wide — a single LQR is invalid; gain-schedule, iLQR, or MPC are the right answers.
- The dominant disturbance is a step (load change, wind gust) — the LQR cost minimises against impulse / white-noise disturbances, not steps. Add integrator augmentation (mandatory) or use disturbance-observer-based control.
- Hard limits on or matter for safety — LQR can’t honour them. Step up to MPC.
- You have ms-level computation budget for the design step — LQR offline design takes seconds (small ) to minutes (large ); online recomputation is rarely the bottleneck. But if you genuinely need to redesign every cycle (e.g., morphing robot), iLQR is the appropriate tool.
- The plant model is unknown or wildly uncertain — LQR is sensitive to model accuracy. Robust H∞ or adaptive control fit the bill better than LQR with an estimated .
Discretisation pitfalls
c2d(sys, T_s, 'tustin') (bilinear) preserves stability and phase margin but distorts high-frequency dynamics. c2d(sys, T_s, 'zoh') (zero-order hold) is exact for the ZOH-input plant and is the default for digital control. Backward Euler ('backward_diff') is unconditionally stable but introduces extra phase lag — use only when stability dominates accuracy.
Sensor placement and observability — design choice, not afterthought
The observability of depends on which sensors you mount. A drone with only IMU is observable for body rates but not for absolute position; add GPS or motion capture for position observability. A balancing robot with only wheel encoders cannot observe IMU drift; add an IMU for tilt. A 6-DOF arm with only joint encoders is observable for and (after differentiation) , but not for external torque; add joint torque sensors (UR e-Series strain-gauge wrist, Franka FR3 in-joint sensors). Run the observability rank test on every candidate sensor configuration before specifying the bill of materials — discovering an unobservable mode after the prototype is built is an expensive mistake.
Process and measurement noise — how to set and in practice
For a Kalman filter to be optimal, should match the actual process noise covariance and the actual measurement noise covariance. In practice you estimate them:
- from datasheet + bench test. Encoder LSB resolution gives the static noise floor (e.g., for a 17-bit absolute encoder). IMU datasheets specify noise density in ; multiply by for the in-band noise floor. Bench-test by recording the stationary plant for 60 s and computing
np.var(). - from model uncertainty. Treat unmodelled friction, payload variation, gear backlash, motor torque ripple as additive process noise. A rule of thumb: on the acceleration state equals where is the standard deviation of unmodelled torque (typically 5–15% of peak torque for a well-characterised joint).
- Allan variance analysis for IMU bias (
[[Robotics/bayesian-estimation]]planned) gives the random-walk noise component that should be modelled as a bias state, not as .
5. Components & sourcing
LQR is software, but the toolchain shapes what you can build.
MATLAB / Simulink — Control System Toolbox
The gold standard, used by Airbus, NASA, Tesla, Bosch, Boston Dynamics for design and code-gen. Key functions:
lqr(A, B, Q, R)— continuous LQR, returns .dlqr(Ad, Bd, Q, R)— discrete LQR.kalman(sys, W, V)— Kalman estimator design.lqg(sys, QXU, QWV)— one-shot LQG.care(A, B, Q, R)/dare(A, B, Q, R)— Riccati solvers.place(A, B, p)/acker(A, B, p)— pole placement (alternative when you don’t want a cost function).c2d(sys, Ts, method)— discretisation.- Simulink Embedded Coder generates C code from a state-space block + LQR gain matrix, targeting STM32, TI C2000, NVIDIA DRIVE, Infineon AURIX.
Python — python-control + slycot + scipy
- python-control (https://python-control.readthedocs.io/) — open-source MATLAB-equivalent.
control.lqr,control.dlqr,control.kalman,control.care,control.ctrb,control.obsv. Backed by slycot (SLICOT bindings) for Riccati solvers. - SciPy —
scipy.linalg.solve_continuous_are,scipy.linalg.solve_discrete_are. Bare-metal Riccati when you don’t want the python-control dependency. - JAX + diffrax for differentiable LQR (gradient through Riccati for end-to-end RL / model-learning).
- CVXPY — model-predictive variants; not LQR per se but uses the same problem class.
Julia — ControlSystems.jl + MatrixEquations.jl
ControlSystems.lqr(sys, Q, R),dlqr,kalman. Performant, well-suited for differentiable control + autodiff.
Drake (MIT / TRI)
pydrake.systems.controllers.LinearQuadraticRegulator(A, B, Q, R) — the systematic robotics framework. Drake’s LQR is wired into the simulator (MultibodyPlant) so you can linearise a URDF robot model and design LQR in one script. Used in research labs worldwide and by TRI’s manipulation team.
Stable-Baselines / RL world models
For iLQR and learned-dynamics LQR, MuJoCo MPC (Google DeepMind, https://github.com/google-deepmind/mujoco_mpc) provides iLQR + sampling MPC running at hundreds of Hz on commodity x86. Also OCS2 (ETH Zurich) for legged-robot MPC built on iLQR + SLQ.
SLICOT — the numerical engine
SLICOT (Subroutine Library In Control Theory) is the FORTRAN library that underpins MATLAB’s care/dare, python-control’s slycot, and ControlSystems.jl’s MatrixEquations. Key routines: SB02OD (CARE), SB02MD (DARE), SG02AD (generalised Riccati). It’s the production-grade Schur and matrix-sign solver; you almost never call it directly but it’s running underneath.
ROS 2 — no built-in LQR
ros2_control ships PID via control_toolbox::Pid but no LQR module. The standard pattern: design LQR in MATLAB/Python offline; publish the gain matrix as a YAML/Parameter; implement the multiply u = -K (x - x_des) in a custom controller_interface::ControllerInterface plugin. Drake-ROS bridges exist for direct integration (https://github.com/RobotLocomotion/drake-ros).
Embedded LQR
A typical LQR runtime is multiply-accumulates per cycle. For a matrix on a Cortex-M4 at 168 MHz: 72 MACs, ~0.5 µs per cycle. Even a 100×100 state (Atlas-class whole-body) finishes in <100 µs on a Cortex-A78 with NEON SIMD. MCU LQR is essentially free; the design effort is offline.
Hardware targets — typical
| Platform | Loop rate | Use case |
|---|---|---|
| STM32F4 (84 MHz) | 1–10 kHz | Joint-level LQR + Kalman |
| STM32H7 (480 MHz) | 10–50 kHz | Flight controller LQR |
| TI C2000 F2837x | 50–200 kHz | Servo drive inner LQR |
| Cortex-A78 / Apple M-series | 250 Hz – 1 kHz | Whole-body MPC + LQR |
| FPGA (Xilinx Zynq) | >100 kHz | Ultra-fast power-electronics LQR |
Verification and validation tools
- MATLAB Simulink Verification & Validation — formal proof of stability margins, gain coverage, Monte Carlo robustness sweeps over plant-parameter uncertainty.
- dSPACE SCALEXIO — hardware-in-the-loop test rig for real-time closed-loop verification before silicon.
- CARLA / LGSVL — driving-simulator HIL for automotive ACC LQR.
- Gazebo + ROS 2 Ignition — open-source HIL for ground and aerial robots; integrates with control_toolbox for plug-in LQR controllers.
- PX4 SITL (Software-in-the-Loop) — full PX4 firmware against a simulated quadrotor plant; ideal for LQR controller verification before flight.
- Drake
Simulator+MultibodyPlant— exact-physics simulator with continuous-time integration; the rigorous environment for LQR design and verification.
Codegen toolchains
- MATLAB Embedded Coder — converts a Simulink LQR + Kalman block diagram into ANSI C targeting AUTOSAR, AUTOSAR Classic, or bare-metal. Standard in automotive autopilots (Tesla, Volvo, BMW), industrial servo drives (Yaskawa, Mitsubishi), and aerospace (Honeywell, Collins Aerospace).
- MATLAB HDL Coder — generates VHDL/Verilog for FPGA LQR (Xilinx, Intel/Altera). Used when the LQR loop rate exceeds 100 kHz (power-electronics, motor-control inner loop).
- dSPACE TargetLink — embedded C from Simulink with automotive-grade certification.
- Drake
LinearQuadraticRegulator→ C++ viadrake::systems::DiagramandSerializeToProto. Used by Toyota Research Institute’s manipulation stack. - OpenModelica / Modelon Optimica — for system-level modelling with code-gen of LQR/MPC alongside thermal, hydraulic, multibody plants.
6. Reference data
Bryson scaling for common robot states
| State | typical | |
|---|---|---|
| Joint angle (arm) | 1.5 rad | 0.44 |
| Joint velocity (arm) | 3 rad/s | 0.11 |
| Cart position (balance) | 0.5 m | 4.0 |
| Roll/pitch angle (drone) | 0.3 rad | 11.1 |
| Yaw angle (drone) | 0.5 rad | 4.0 |
| Body rate (drone) | 3 rad/s | 0.11 |
| Heading (mobile base) | π rad | 0.10 |
| Linear vel (mobile base) | 1 m/s | 1.0 |
LQR gain ranges by robot class (starting points, parallel form)
| Robot class | State dim | entry magnitude | Typical bandwidth |
|---|---|---|---|
| Cart-pole / Segway | 4 | 30–200 (mixed units) | 3–10 rad/s |
| Quadrotor attitude | 6 | 0.5–5 N·m/rad, 0.1–1 N·m·s/rad | 30–80 rad/s |
| Cobot single joint (post-FF) | 2 | 200–800 N·m/rad, 50–200 N·m·s/rad | 30–100 Hz |
| Mobile-base diff drive | 4 | 5–50 N·s/m, 1–10 N·m·s/rad | 1–5 Hz |
| ACC longitudinal | 3 | 0.1–1.0 m/s² per m | 0.3–1 Hz |
| Humanoid balance (LIP) | 4 | 50–500 N per rad/m, 10–100 N·s | 2–5 Hz |
Bryson scaling for common robot inputs
| Input | typical | |
|---|---|---|
| Joint torque (mid arm) | 50 N·m | 4×10⁻⁴ |
| Wheel torque (AGV) | 5 N·m | 0.04 |
| Drone torque axis (1 kg) | 0.5 N·m | 4.0 |
| Drone yaw torque | 0.3 N·m | 11.1 |
| Cart force (balance) | 20 N | 0.0025 |
LQR vs PID vs MPC decision matrix
| Criterion | PID | LQR | LQG | MPC |
|---|---|---|---|---|
| Plant model needed | No | Yes () | Yes + noise | Yes + constraints |
| Computation | ||||
| Handles MIMO | Awkwardly | Yes | Yes | Yes |
| Handles hard limits | Anti-windup only | No | No | Yes |
| Handles noisy measurements | Poorly | No | Yes | With Kalman |
| Guaranteed margins | No | , | After LTR | Cost-dependent |
| Tuning knobs | 3 per loop | , horizon | ||
| Where used | Joint, motor, drone PID | Cart-pole, drone att | UR cobots, Crazyflie | Atlas, Spot, Tesla AP |
Typical and ranges across robot classes (Bryson starting point)
| Robot | scale (max state weight) | scale (input weight) | Notes |
|---|---|---|---|
| Cart-pole | 30–100 on | 0.001–0.01 on | Pole-angle penalty dominates |
| Quadrotor attitude | 5–20 on angles | 0.1–10 on torques | Yaw weighted lower than roll/pitch |
| Cobot joint | 100–1000 on error | 10⁻⁴ – 10⁻² on | After computed-torque |
| Mobile base | 1–10 on pos/heading | 0.01–0.1 on wheel torque | Outer position loop |
| ACC longitudinal | 0.1–1 on velocity error | 0.05–0.5 on accel cmd | Jerk integral often added |
| Humanoid balance | 50–500 on COM offset | 0.001–0.01 on hip torque | LIP linearisation |
State dimensions for common robots
| Robot | (state dim) | (input dim) | Typical LQR design |
|---|---|---|---|
| Single joint (after FF) | 2 | 1 | |
| 6-DOF arm (lin) | 12 | 6 | |
| Cart-pole | 4 | 1 | |
| Quadrotor attitude | 6 | 3 | |
| Quadrotor full (att + pos) | 12 | 4 | |
| Diff-drive base | 4 | 2 | |
| Humanoid lower body | ~30 | ~12 | Gain-scheduled |
| Humanoid whole-body (Atlas) | ~60 | ~30 | iLQR + MPC, not pure LQR |
| Fixed-wing UAV (3D) | 12 | 4 | + scheduling |
SLICOT routines used by python-control / ControlSystems.jl
| SLICOT routine | Purpose |
|---|---|
SB02OD | Continuous-time algebraic Riccati equation (CARE) |
SB02MD | Discrete-time algebraic Riccati equation (DARE) |
SG02AD | Generalised CARE/DARE with descriptor systems |
MB05ND | Matrix exponential, scaling-and-squaring + Padé |
AB01ND | Controllability staircase form |
AB04MD | Continuous-discrete conversion (Tustin) |
TB04AD | State-space to transfer function |
TG01AD | Generalised system balancing |
Typical closed-loop bandwidth from LQR on robot subsystems
| Subsystem | Open-loop BW | LQR-closed BW |
|---|---|---|
| Cart-pole balance | unstable | 3–10 rad/s (0.5–1.5 Hz) |
| Quadrotor attitude | marginal | 30–80 rad/s (5–13 Hz) |
| Drone position | very slow | 2–5 rad/s (0.3–0.8 Hz) |
| Cobot joint (post-FF) | ~5 Hz | 30–100 Hz |
| Balancing robot | unstable | 2–5 Hz |
Decision flowchart — picking the controller
Plant is SISO and ~linear? ──yes──→ PID (Robotics/pid-control)
│ no
↓
Plant has hard constraints? ──yes──→ MPC (Engineering/mpc-control)
│ no
↓
Plant model known (A, B)? ──no───→ Robust H∞ / adaptive control
│ yes
↓
All states measured? ──no───→ LQG (LQR + Kalman)
│ yes
↓
Linearisation valid? ──yes──→ LQR
│ no
↓
Gain-schedule LQR, or iLQR/DDP
Riccati solver runtime — order of magnitude
For an Hamiltonian Schur decomposition, runtime scales . Concrete numbers on a 2024 laptop CPU (single thread):
Schur (SLICOT SB02OD) | Sign-function | Use case | |
|---|---|---|---|
| 4 | 50 µs | 80 µs | Cart-pole, single joint |
| 12 | 0.4 ms | 0.6 ms | 6-DOF arm, drone full |
| 30 | 5 ms | 8 ms | Humanoid lower body |
| 60 | 30 ms | 50 ms | Atlas whole-body |
| 100 | 100 ms | 200 ms | Large industrial line |
These are offline design times. The runtime cost — multiplying by — is MACs, microseconds on any MCU.
7. Failure modes & debugging
Symptom → likely cause
| Symptom | Likely cause | Fix |
|---|---|---|
| Riccati solver returns junk / non-PD | not stabilisable, or not PSD, or not PD | Check ctrb rank; ensure , |
| Closed-loop poles too far in LHP | too aggressive | Scale down by 10× |
| Saturation, then divergence | LQR doesn’t model limits | Add integrator anti-windup or step up to MPC |
| Estimator diverges, then plant follows | Kalman tuning wrong | Check match real noise; whiteness-test residual |
| Numerical issues, large | Hamiltonian conditioning | Use Schur or sign-function method; scale |
| Loss of margin after Kalman observer added | Generic LQG fragility | Apply LTR with |
| Observability rank loss at some pose | Configuration-dependent | Gain-schedule estimator, or fuse a different sensor |
| Discretisation mismatch | too large | Either redesign in discrete, or speed up loop |
| Limit cycle on at zero error | Numerical noise in × small entries | Deadband on , or square-root Kalman |
| Plant changes mid-mission | Linearisation invalid | Gain-schedule, or step up to iLQR/MPC |
| Cross-axis bleed in attitude LQR | Off-diagonal ignored | Use measured inertia tensor (not assumed diagonal) |
Detailed failure modes
-
Riccati does not converge.
scipy.linalg.solve_continuous_areraisesLinAlgErroror returns garbage. Cause: not stabilisable, not PSD, not PD, or numerical conditioning bad. Fix: checknp.linalg.matrix_rank(ctrb(A, B)) == n; ensurenp.all(np.linalg.eigvalsh(Q) >= 0); rescale to unit norm before solving. -
Closed-loop poles too aggressive. implies a 100-Hz bandwidth on a 10-kg joint; the actual motor can deliver only 30 Hz. The controller saturates constantly. Fix: reduce the entries on the relevant states by 10×; re-solve; verify the resulting bandwidth against actuator capability via Bode plot of .
-
Estimator divergence. Kalman residual grows unboundedly. Causes: is much smaller than real process noise (filter “trusts the model” too much, model is wrong); is much larger than real sensor noise (filter “ignores” the measurement); plant linearisation is wrong; bias not modelled. Fix: log the innovation , check (whiteness test); if not, retune .
-
Observability lost in pose. A redundant 7-DOF arm at a wrist singularity loses observability on one DOF; a planar drone’s yaw is unobservable without magnetometer; a wheeled-base’s heading is unobservable when stationary. Fix: redesign measurement set (add an IMU, add a vision tag), or accept the unobservable mode and ensure it remains stable in open-loop.
-
Discretisation mismatch. Design at continuous-time then ship without discretisation, or use too large. The phase lag of half a sample period appears as extra delay. Fix: design with
dlqragainst thec2d(...,'zoh')-discretised plant; or aim . -
Sat’d actuator. LQR doesn’t know about . With aggressive on a small actuator, saturates and the plant runs open-loop. The integrator (in augmented LQR) winds up. Fix: anti-windup back-calculation on the integrator state; or relax until stays inside limits; or step up to MPC.
-
Wrong noise statistics in Kalman. Suboptimal but stable. Detect by innovation whiteness test: residuals should be white-noise with covariance . If the test fails, the Kalman filter is using wrong noise models — recompute from measured data with
xcorrandiddata. -
Numerical conditioning for large . states with stiff dynamics () breaks naive Riccati solvers. Use balancing (
scipy.linalg.matrix_balance) before solving, or the sign-function method which is more robust than Schur for stiff problems. -
LQR-LQG gap. Pure LQR has gain margin. Pure LQG can have margin near zero (Doyle 1978). Detect: simulate with a 50% gain reduction in one input channel; if the closed-loop is unstable, the LQG has lost its margin. Fix: apply LTR ( large), or design with H∞ instead.
-
Wrong linearisation. Linearised about the wrong operating point. Detect: simulate the nonlinear plant under the LQR; if the response differs substantially from the linear simulation, the operating point is off. Fix: re-find the equilibrium with
scipy.optimize.fsolve(f, x0, args=(u0,))and re-linearise there. -
Pole-zero cancellation hiding instability. A pole in at can be cancelled in the transfer function by a zero in , hiding the unstable mode from input-output analysis. Internal stability requires all eigenvalues of in LHP. Always check eigenvalues directly, not just the transfer function.
-
Reference governor saturation surprise. Augmented integrator gives zero steady-state error, but a large step reference saturates for many cycles. The integrator state has wound up. Fix: shape the reference with a smooth trajectory generator (
[[Robotics/trajectory-generation]]planned), so stays inside limits during transients. -
State-coordinate scaling causing ill-conditioning. When state entries span 6+ orders of magnitude (e.g., position in metres alongside body rates in rad/s and motor current in milliamps), the Hamiltonian becomes badly conditioned and Riccati’s solution is dominated by floating-point error. Fix: rescale states to unit order of magnitude with a diagonal transformation before designing, then transform back: .
-
Integrator wind-up in the augmented state. When you augment with and the actuator saturates, the augmented LQR has no built-in mechanism to clamp . Fix: when is clipped, freeze for that cycle, or apply back-calculation .
-
Plant-data mismatch between identification and deployment. identified on a fresh prototype; six months later belt tension has changed and inertia is 10% off. The LQR works but bandwidth has shifted. Fix: re-identify periodically, or use adaptive control. For UR-class cobots the manufacturer recompiles new inertial parameters with every payload-mass change at startup.
Diagnostic procedure for a misbehaving LQR
- Verify controllability:
np.linalg.matrix_rank(ctrb(A, B)) == n. - Verify , .
- Solve Riccati; verify and .
- Compute eigenvalues of ; verify all are in LHP.
- Simulate the linear closed-loop; verify response matches design.
- Simulate the nonlinear closed-loop; verify response is similar.
- Hardware test with conservative first; ramp up.
- Check actuator saturation log; if saturated >5% of the time, redesign.
A note on adaptive and learning-based LQR variants
The 2020s have seen a wave of methods that learn or directly from data:
- Adaptive LQR (Lale, Azizzadenesheli, Anandkumar 2020 “Logarithmic regret bound in partially observable linear dynamical systems,” NeurIPS) — online identification + LQR redesign with provable regret bounds. Used in research-stage adaptive robotic systems.
- DAgger-style imitation learning of — train a neural net to approximate LQR’s piecewise-affine policy; useful when classical LQR is too slow for the online budget.
- Differentiable LQR — implement Riccati as a differentiable operator (via the implicit function theorem); train end-to-end with downstream task losses. Used in DeepMind’s robotics work and in Drake’s
Diagramautodiff. - MFAC (Model-Free Adaptive Control) — fully data-driven, no model needed; converges to LQR-equivalent behaviour when the plant is approximately linear.
These are research methods as of 2026, not production-grade for safety-critical robotics. The proven recipe is still: identify offline, design LQR offline, ship the gain matrix.
Diagnostic procedure for a misbehaving Kalman filter
- Log the innovation over many cycles.
- Compute the empirical mean — it should be near zero. Bias means model has a constant error not in the state.
- Compute the empirical covariance — it should match within ~20%. Mismatch means are wrong.
- Compute autocorrelation of — it should be white (delta-correlated). Coloured residual means the noise model is wrong (the noise has memory the filter isn’t capturing).
- If the residual fails whiteness but the variance matches, add bias states (slowly varying for IMU gyro/accel bias).
- If the residual is white but the variance is too large, is underestimated — increase it.
- If the residual blows up (NaN, divergence), is too small relative to model error — increase it, or switch to a robust filter (Huber EKF, particle filter).
8. Case studies
Case A — Boston Dynamics Atlas balance and walking (whole-body LQR + MPC)
Atlas (released 2013, current Electric Atlas as of 2026) implements a multi-layer optimal control stack with LQR features at multiple layers (Kuindersma et al. 2016 “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot,” Autonomous Robots).
- Joint-level LQR for postural stability, ~1 kHz. Each leg’s actuators are torque-controlled; the joint LQR uses a 2-state-per-joint plant after gravity feedforward (similar to Example C). Inertia matrix from the URDF is used for the feedforward.
- Body-level LQR for balance against perturbations, ~200 Hz. State is centroidal momentum (6 components — Linear inverted-pendulum reduction). Used to compute desired contact wrenches that the lower-level joint LQR realises.
- Whole-body MPC (Drake’s TrajectoryOptimization, later replaced with a custom solver) at ~30 Hz, plans 6-DOF body pose, foot contact schedule, and centroidal-momentum trajectory.
The LQR layers are gain-scheduled on gait phase (stance vs swing) and on terrain estimate. Each gain set is precomputed for ~30 representative configurations + interpolated.
Why this matters: Atlas demonstrates that even when MPC is doing the heavy lifting at the planning layer, LQR remains the right tool for the fast (>100 Hz) inner balance loop where you can’t afford the computation of full MPC. The interplay — slow MPC plans the trajectory, fast LQR tracks it — is the dominant pattern in modern humanoid control.
Case B — Crazyflie 2.1 LQR attitude controller (ETH research firmware)
The Bitcraze Crazyflie 2.1 (released 2018, current 2026) is a 27 g nano-quadrotor whose stock PX4-derived firmware runs cascaded PID. ETH Zurich’s Institute for Dynamic Systems and Control published an LQR attitude controller fork (Furrer, Burri, Achtelik, Siegwart 2016 “RotorS — A Modular Gazebo MAV Simulator Framework,” ICRA Workshop) that replaces the inner attitude loop with state-space LQR.
- State: (Euler angles and body rates, ).
- Input: Body torques (m = 3).
- Linearisation: About hover, with the Crazyflie’s measured inertia tensor (slightly off-diagonal due to battery placement).
- : Bryson scaling with , rad/s.
- : Tuned for closed-loop bandwidth of ~12 Hz on roll/pitch, 5 Hz on yaw.
- Implementation: STM32F4 at 168 MHz; the matrix multiply runs in ~5 µs, well below the 1 ms loop period.
The benchmark result: in a step-disturbance test (a 0.1 N·m impulse on roll), LQR settles in 0.4 s versus PID’s 0.6 s, and uses 18% less rotor energy over the transient. The pure attitude-LQR doesn’t outperform PID dramatically because the Crazyflie’s inertia tensor is nearly diagonal — the off-diagonal coupling that LQR exploits is small. On a non-symmetric airframe (a coaxial heli, a tilt-rotor, a Y6 hex), the advantage grows substantially.
Why this matters: Crazyflie is the educational reference. The firmware is open, the hardware is \200$, and you can deploy an LQR design from python-control directly. It’s the lowest-friction way to confirm a paper-stage LQR design actually flies.
Case C — Apollo Lunar Module pitch-yaw guidance (the historical reference)
The Apollo 11 Lunar Module’s powered descent guidance used a Linear-Quadratic-Gaussian (LQG) controller for pitch and yaw attitude during the powered descent phase (Battin 1987 “An Introduction to the Mathematics and Methods of Astrodynamics” revised edition; Klumpp 1974 “Apollo Lunar Descent Guidance,” Automatica). Although the implementation predates digital control practice — the Apollo Guidance Computer was a 2.048 MHz word-serial machine with 36 KB of fixed memory and 2 KB of RAM — the theory was pure LQG:
- State: pitch, yaw, pitch rate, yaw rate, plus integrated tracking error against the guidance reference.
- Input: RCS thruster pulse-frequency modulation.
- Q, R: derived from fuel-budget and pointing-accuracy specs (sufficient pointing for radar lock on landing site; minimum RCS fuel).
- Kalman filter: fused gimbaled IMU (LM Inertial Measurement Unit) with the radar altimeter and rendezvous radar updates.
The actual onboard computation was a fixed-gain matrix derived offline by MIT Instrumentation Laboratory researchers using Riccati methods that Kalman had published only 9 years earlier (Kalman 1960). This was the first flight-critical LQR/LQG controller.
Why this matters: The lineage from Kalman 1960 → Apollo 1969 → modern robotics is direct. The same Riccati equation, the same separation principle, the same gain-table-shipped-to-firmware pattern. What’s changed is the computation horizon: Apollo’s LM Computer solved Riccati once on the ground; a modern Crazyflie can solve a fresh CARE in 50 µs on its STM32H7. The wall has moved from “can we solve it at all” to “we can solve it 1000× per second if we want, but mostly we don’t need to.”
Case D — Tesla Autopilot longitudinal LQR (vehicle dynamics)
Tesla’s Autopilot (and most production ACC systems — Mercedes Distronic, BMW Active Cruise, Honda Sensing) uses LQR-class control on the longitudinal-control problem: maintain a target speed or follow-distance while minimising jerk. The reduced longitudinal plant is
with and the actuator lag (~0.2 s for throttle, 0.1 s for regen braking). Bryson on m/s², jerk-equivalent integral on accel gives the typical ride-quality target m/s³.
The gain matrix is precomputed offline; the closed-loop bandwidth is ~0.5 Hz on position, ~2 Hz on velocity. The same architecture appears in Cruise Origin, Waymo’s planner, and CARLA’s PID-LQR comparison suite. Why this matters: automotive longitudinal control is the highest-volume LQR deployment in the world — tens of millions of vehicles. The bar for tuning and verification is high; the design pattern (linearise about cruise speed, gain-schedule on speed and grade, integrator-augment for grade rejection) is industry-standard.
Case E — Franka FR3 cobot impedance + Cartesian LQR layer
The Franka Emika FR3 (released 2023, the successor to Panda) implements a layered control stack where joint-level torque control is PI on current + computed-torque feedforward, but the outer Cartesian impedance layer is structurally an LQR-derived design (Albu-Schäffer et al. 2007 “Cartesian impedance control of redundant and flexible-joint robots,” Robotics and Autonomous Systems):
- Inner joint torque: 1 kHz PI on current, ~5 kHz current loop on FOC.
- Outer joint LQR on the post-feedforward double-integrator plant: per arm (7 joints × 2 states each), 1 kHz update.
- Cartesian impedance / admittance at 1 kHz: virtual 6-DOF mass-spring-damper at the end-effector, with stiffness N/m and damping critically tuned. The map to joint torques is .
The Franka FR3 SDK (libfranka) exposes the joint torque interface (franka::ControlException-safe at 1 kHz on PREEMPT_RT Linux) with a default impedance controller; user code overrides it with custom LQR/admittance gains. The stiffness gain is gain-scheduled on task phase (low for guarded compliant exploration, high for stiff trajectory tracking).
Why this matters: the Franka shows that LQR-derived design lives even inside controllers marketed as “impedance control.” The PD impedance gain is mathematically equivalent to a pole-placement variant of LQR with a Cartesian Bryson scaling; the difference is just how the designer thinks about specifying the gain (impedance physics vs cost-function weights).
9. Cross-references
Robotics (this library):
[[Robotics/kinematics-dh]]— provides pose for state definition.[[Robotics/dynamics-rigid-body]]— manipulator equation that linearises into .- planned
[[Robotics/motors-electric]]— current loop dynamics underneath joint LQR. [[Robotics/pid-control]]— when PID is sufficient; when to escalate to LQR.- planned
[[Robotics/impedance-control]]— Cartesian compliance, complementary to LQR. - planned
[[Robotics/bayesian-estimation]]— EKF/UKF for nonlinear plants beyond linear Kalman. - planned
[[Robotics/legged-robotics]]— balance and walking, the LQR+MPC pattern. - planned
[[Robotics/multirotor-design]]— drone attitude as a canonical LQR target. - planned
[[Robotics/trajectory-generation]]— smooth references for LQR tracking.
Engineering (foundations):
[[Engineering/state-space-methods]]— the theory companion; CARE, controllability, observability, Hamiltonian structure, separation principle.[[Engineering/classical-control]]— what LQR’s robustness margins are measured against.- planned
[[Engineering/digital-control]]— discretisation, z-domain, anti-aliasing. - planned
[[Engineering/mpc-control]]— the constrained-LQR extension. - planned
[[Engineering/realtime-embedded]]— running LQR at kHz rates on MCUs.
Languages:
- planned
[[Languages/Tier3/scientific]]— Control System Toolbox;lqr,kalman,lqg. - planned
[[Languages/Tier3/robotics-control]]— ros2_control, control_toolbox (no LQR; integrate Drake or custom plugin).
10. Citations
Foundational papers
- Kalman, R. E. (1960). “A New Approach to Linear Filtering and Prediction Problems.” Trans. ASME J. Basic Engineering 82(D), 35–45. The Kalman filter paper; foundational to LQG estimation.
- Kalman, R. E. (1960). “Contributions to the theory of optimal control.” Boletin de la Sociedad Matematica Mexicana 5, 102–119. The LQR paper.
- Kalman, R. E. (1961). “New results in linear filtering and prediction theory.” Trans. ASME J. Basic Engineering 83(D), 95–108. (Co-authored with Bucy.)
- Doyle, J. C. (1978). “Guaranteed margins for LQG regulators.” IEEE Trans. Automat. Contr. 23(4), 756–757. The famous counterexample showing LQG can have arbitrarily small margins; motivation for LTR.
- Doyle, J. C. & Stein, G. (1979). “Robustness with observers.” IEEE Trans. Automat. Contr. 24(4), 607–611. Foundational paper on Loop-Transfer Recovery.
- Lehtomaki, N. A., Sandell, N. R. & Athans, M. (1981). “Robustness results in linear-quadratic Gaussian based multivariable control designs.” IEEE Trans. Automat. Contr. 26(1), 75–93. MIMO extension of LQR margin guarantees.
- Stein, G. & Athans, M. (1987). “The LQG/LTR procedure for multivariable feedback control design.” IEEE Trans. Automat. Contr. 32(2), 105–114.
- Klumpp, A. R. (1974). “Apollo Lunar Descent Guidance.” Automatica 10(2), 133–146. The first flight-critical LQG.
- Kuindersma, S., Deits, R., Fallon, M., et al. (2016). “Optimization-based locomotion planning, estimation, and control design for the atlas humanoid robot.” Autonomous Robots 40(3), 429–455. doi:10.1007/s10514-015-9479-3.
- Furrer, F., Burri, M., Achtelik, M. & Siegwart, R. (2016). “RotorS — A Modular Gazebo MAV Simulator Framework.” ICRA 2016 Workshop on MAV Simulators. doi:10.1007/978-3-319-26054-9_23.
- Tassa, Y., Mansard, N. & Todorov, E. (2014). “Control-limited differential dynamic programming.” ICRA 2014, 1168–1175. doi:10.1109/ICRA.2014.6907001. iLQR with input constraints.
- Khatib, O. (1987). “A unified approach for motion and force control of robot manipulators: The operational space formulation.” IEEE J. Robot. Autom. 3(1), 43–53.
Books
- Anderson, B. D. O. & Moore, J. B. (2007). Optimal Control: Linear Quadratic Methods. Dover (reprint of 1990 Prentice Hall edition). The canonical LQR/LQG reference; CARE, DARE, frequency-shaped LQR, LTR.
- Anderson, B. D. O. & Moore, J. B. (1979). Optimal Filtering. Prentice Hall. The Kalman-filter canonical text.
- Bryson, A. E. & Ho, Y.-C. (1975). Applied Optimal Control: Optimization, Estimation, and Control (revised printing). Hemisphere. Source of Bryson’s scaling rule.
- Athans, M. & Falb, P. L. (1966). Optimal Control: An Introduction to the Theory and Its Applications. McGraw-Hill (reprinted Dover 2007).
- Skogestad, S. & Postlethwaite, I. (2005). Multivariable Feedback Control: Analysis and Design (2nd ed.). Wiley. Robust MIMO control; H₂/H∞; LQG with weighting.
- Kailath, T. (1980). Linear Systems. Prentice Hall.
- Friedland, B. (1986). Control System Design: An Introduction to State-Space Methods. McGraw-Hill.
- Battin, R. H. (1987). An Introduction to the Mathematics and Methods of Astrodynamics. AIAA Education Series. Apollo-era LQG details.
- Murray, R. M., Li, Z. & Sastry, S. S. (1994). A Mathematical Introduction to Robotic Manipulation. CRC Press. Free PDF: https://www.cds.caltech.edu/~murray/mlswiki/.
- Lynch, K. M. & Park, F. C. (2017). Modern Robotics: Mechanics, Planning, and Control. Cambridge University Press. Free online: http://modernrobotics.org.
- Tedrake, R. (2024). Underactuated Robotics: Algorithms for Walking, Running, Swimming, Flying, and Manipulation. Free online textbook, MIT 6.832: http://underactuated.mit.edu. The single best reference for LQR, iLQR, and trajectory optimisation in robotics.
- Maciejowski, J. M. (2002). Predictive Control with Constraints. Prentice Hall.
Software documentation
- MATLAB Control System Toolbox — https://www.mathworks.com/help/control/. Functions:
lqr,dlqr,kalman,lqg,care,dare,c2d,place,ctrb,obsv. - python-control documentation — https://python-control.readthedocs.io/.
- SciPy linear algebra —
scipy.linalg.solve_continuous_are,solve_discrete_are. https://docs.scipy.org/doc/scipy/reference/linalg.html. - SLICOT library — https://www.slicot.org/. FORTRAN reference implementations of CARE/DARE.
- Drake — https://drake.mit.edu/.
pydrake.systems.controllers.LinearQuadraticRegulator. - MuJoCo MPC — https://github.com/google-deepmind/mujoco_mpc. iLQR and sampling MPC.
- ControlSystems.jl — https://juliacontrol.github.io/ControlSystems.jl/.
- PX4 Autopilot — https://docs.px4.io/. Reference cascaded-PID architecture; LQR variants in research forks.
- Crazyflie firmware — https://github.com/bitcraze/crazyflie-firmware.
Standards
- ISO 10218-1:2011. Robots and robotic devices — Safety requirements — Part 1: Industrial robots. Affects closed-loop response time requirements.
- IEC 61508:2010. Functional safety of electrical/electronic/programmable electronic safety-related systems. SIL ratings for safety-critical LQR.
- ISO 26262:2018. Road vehicles — Functional safety. Automotive functional safety (ASIL ratings) — directly applies to ACC LQR in production cars.
- DO-178C:2011. Software Considerations in Airborne Systems and Equipment Certification. Aviation-grade certification — relevant to fly-by-wire LQR/LQG.
Additional papers and online references
- Albu-Schäffer, A., Ott, C. & Hirzinger, G. (2007). “A unified passivity-based control framework for position, torque and impedance control of flexible joint robots.” Int. J. Robotics Research 26(1), 23–39. Foundation of Franka FR3 control architecture.
- Underactuated Robotics LQR chapter — http://underactuated.mit.edu/lqr.html. The most pedagogically clear modern treatment, with runnable code.
- Russ Tedrake LQR + iLQR lectures — MIT 6.832 course videos on YouTube.
- Stanford EE263 Introduction to Linear Dynamical Systems — http://ee263.stanford.edu/. The undergrad-level state-space introduction.
- Caltech CDS 110 lecture notes (Murray) — https://www.cds.caltech.edu/~murray/courses/cds110.html.
- Wikipedia “Linear-quadratic regulator” — surprisingly accurate; updated with current SLICOT routine references.
Session log
Append a one-line activity entry after writing this note:
node ~/.claude/bin/obsidian-research.mjs log "Built Robotics/state-space-lqr.md Tier 1 deep note"