Robot Communication Buses — CAN, EtherCAT, RS-485, I²C, SPI
See also (Tier 3 family index): Connector Families
1. At a glance
A modern robot is a small distributed real-time computer system. A six-axis cobot contains 50–200 silicon endpoints — joint drives, brake controllers, encoders, torque sensors, force/torque wrist sensors, end-effector tools, cameras, IMUs, safety I/O, teach pendant, system controller, supervisory PLC. None of them are useful in isolation. The communication buses that move data between them are the load-bearing structure of the machine: under-spec the bus and the control loops stall, jitter, or drop frames; over-spec it and the machine costs three times what it should.
The bus hierarchy in a real robot stacks across roughly six decades of latency:
- Sub-microsecond, intra-board (chip-to-chip on the same PCB): SPI, I²C, parallel buses, MIPI CSI-2 for cameras, PCIe between SoC and accelerator. Bandwidth in the 1 Mbps – 40 Gbps range; latency dominated by silicon, not wire.
- Sub-millisecond, board-to-board real-time (joints, drives, safety I/O): CAN-FD, EtherCAT, SERCOS III, POWERLINK, PROFINET IRT. Bandwidth 1 Mbps – 1 Gbps; latency 10 µs – 1 ms with bounded jitter; this is where motion-control loops live.
- Millisecond, supervisory (top-level coordination, vision, ROS 2 messaging): plain Ethernet TCP/IP, DDS, OPC-UA, MQTT, HTTP/gRPC. Bandwidth 100 Mbps – 100 Gbps; latency 0.5 ms – 100 ms; jitter usually not bounded.
- Wireless / external (teach pendant, remote ops, fleet manager, V2X): Wi-Fi 6/6E/7, 5G URLLC, BLE 5.4, UWB, LoRa, Zigbee/Thread/Matter. Latency 1 ms – seconds.
The “first ask” before specifying any bus: what control loop runs on it, and at what closed-loop bandwidth? If a current loop runs at 16 kHz on the drive’s MCU it stays on-chip via SPI to the gate driver — the field bus only carries setpoints at 1–8 kHz. If position control is centralised on a PLC, the field bus is in the control loop and its jitter directly limits achievable gains.
The second ask: how many wires, how long, and which connectors? Cabling is a major cost driver and the dominant cause of field failure. Daisy-chained Ethernet-class buses (EtherCAT, POWERLINK) almost always beat star-wired equivalents on cable cost, mass, and reliability for distributed joints.
2. First principles
Topology and signaling
Topologies in industrial use: bus (multidrop on one cable; CAN, RS-485), star (one cable per node back to a switch; classic Ethernet, PROFINET), tree (switched Ethernet with multiple switches), line / daisy-chain (each node has two ports, frame walks through; EtherCAT, POWERLINK with hub IP, PROFINET MRP), ring (closed daisy-chain; redundancy via MRP, HSR, PRP), point-to-point (SPI, MIPI CSI-2, USB).
Bus impedance and reflection. A transmission line carries a signal cleanly only when source, line, and termination impedances match. CAN and RS-485 cabling is nominally 120 Ω; PROFIBUS Type A cable is 150 Ω; 100BASE-TX Ethernet uses Cat 5e cable at 100 Ω; USB at 90 Ω; HDMI / DisplayPort at 100 Ω differential. A 120 Ω termination resistor at each physical end of a CAN bus absorbs the wave; missing or mid-cable termination produces reflections that the receiver sees as multiple bit transitions, causing CRC errors. Stubs (un-terminated branches) must be kept much shorter than λ/10 of the highest frequency component to avoid reflections and radiated emissions.
Differential signaling dominates anywhere wires leave the PCB. A pair of conductors carries equal-and-opposite voltage swings; the receiver looks at the difference, so any common-mode noise (ground bounce, EMI pickup, induced 50/60 Hz mains hum) cancels. CAN, RS-485, RS-422, USB, Ethernet 100BASE-TX / 1000BASE-T / 100BASE-T1, MIPI D-PHY, LVDS, HDMI, DisplayPort, PCIe, SATA — all differential. Single-ended buses (SPI, I²C, parallel) cannot survive cable runs longer than tens of centimetres before crosstalk and ground bounce destroy signal integrity.
Dominant vs recessive states (CAN-specific). On a CAN bus, a “dominant” 0 bit actively drives the two wires apart (CAN_H to ~3.5 V, CAN_L to ~1.5 V, differential ≈ 2 V); a “recessive” 1 bit leaves the drivers off so both wires settle at ~2.5 V (differential ≈ 0 V, set by the termination and internal biasing). Multiple transmitters are effectively wired-AND onto the bus: any node that drives dominant overrides every node sending recessive. This is the mechanism behind CAN’s lossless bitwise arbitration, and the reason a stuck-dominant transmitter (shorted PHY, latched driver) kills the entire bus until it is physically disconnected.
Determinism vs throughput
The single distinction that organises industrial buses is determinism: can you guarantee, in the worst case, that a frame sent at time t arrives by time t + Δ, with Δ small and Δ_jitter bounded? Closed-loop control needs determinism — every missed or jittered sample is an injected disturbance. Plain Ethernet under TCP gives high average throughput but unbounded worst-case latency under load. Industrial Ethernet variants (EtherCAT, PROFINET IRT, POWERLINK, TSN) bolt determinism onto Ethernet by (a) restricting topology, (b) using time slots, (c) replacing the MAC stack with a hardware processor, or (d) using a single managing-node-issues-token scheme.
Time-triggered vs event-triggered
- Time-triggered buses run on a global schedule — every node knows when it may transmit. EtherCAT, POWERLINK, TTEthernet, FlexRay, and TSN with IEEE 802.1Qbv all schedule transmissions in time slots driven by a synchronised clock. Latency and jitter are bounded by construction.
- Event-triggered buses transmit on demand, arbitrating contention with priority. CAN is the canonical event-triggered bus: nodes start transmitting when the bus goes idle; when two start together, bitwise dominant-vs-recessive arbitration on the message ID picks the winner without corrupting its frame, and the losers retry afterwards. The highest-priority message has a guaranteed worst-case latency of one maximum-length frame time (the frame already on the bus) plus arbitration; lower priorities can be starved indefinitely under load.
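The arbitration rule for event-triggered CAN can be sketched in a few lines. This is a toy model, not a controller implementation: it assumes all contenders start in the same bit time, treats the bus as the wired-AND of the transmitted bits (dominant 0 overrides recessive 1), and only covers the identifier field. The function name `arbitrate` and the example IDs are illustrative.

```python
def arbitrate(ids, id_bits=11):
    """Return the identifier that wins when all nodes start transmitting together.

    The bus is modelled as a wired-AND: a dominant 0 from any node overrides
    recessive 1s. A node that sends recessive but reads dominant backs off.
    """
    contenders = set(ids)
    for bit in range(id_bits - 1, -1, -1):        # the MSB goes on the wire first
        sent = {i: (i >> bit) & 1 for i in contenders}
        bus = min(sent.values())                  # 0 (dominant) wins on the wire
        contenders = {i for i in contenders if sent[i] == bus}
    assert len(contenders) == 1, "identifiers on a CAN bus must be unique"
    return contenders.pop()


winner = arbitrate([0x181, 0x081, 0x701])
print(hex(winner))  # 0x81: the numerically lowest ID has the highest priority
```

The winner's frame is never disturbed, which is why the scheme is called lossless; the cost is that a busy low ID can hold off higher IDs for as long as it keeps transmitting.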
OSI layer mapping
| OSI layer | Industrial bus example |
|---|---|
| 1 — Physical | RS-485, ISO 11898-2 CAN PHY, 100BASE-TX, 100BASE-T1, MIPI D-PHY |
| 2 — Data link | CAN frame + arbitration, EtherCAT DLPDU, Ethernet MAC |
| 3 — Network | IP (only on Ethernet-class); none on CAN, EtherCAT data path |
| 4 — Transport | TCP/UDP; no transport layer on CAN, RS-485 raw |
| 7 — Application | CANopen, J1939, DeviceNet, EtherCAT mailbox (CoE, EoE, FoE), OPC-UA, MQTT |
Clock synchronisation
Distributed control needs distributed time. IEEE 1588 Precision Time Protocol (PTP) and its gPTP profile (IEEE 802.1AS) deliver sub-microsecond synchronisation across switched Ethernet by exchanging timestamped sync messages and measuring link delays. EtherCAT Distributed Clocks (DC) achieve sub-100 ns by latching the frame’s arrival timestamp in every slave ASIC and propagating the reference master’s time on every cycle. The slowest practical sync (NTP over the internet) gives milliseconds — fine for logging, useless for joint coordination.
The minimum sync precision required scales with the highest closed-loop bandwidth that crosses node boundaries. If two joints are coordinated at 1 kHz with 10 % tolerance on phase, the sync error budget is ~100 µs (10 % of one cycle). For a 10 kHz inner loop crossing nodes, budget shrinks to ~10 µs — and PTP becomes necessary, not optional. For 100 µs hard-real-time motion, EtherCAT DC is essentially the only practical choice.
Frame structure and overhead
Field-bus efficiency depends on how much of each frame is payload vs overhead. Classic CAN with an 11-bit ID: 1-bit SOF + 11-bit ID + 1 RTR + 6 control + 0–64 data bits + 15 CRC + 1 CRC delim + 1 ACK + 1 ACK delim + 7 EOF + 3 IFS = 111 bits to transport 8 bytes (64 bits), i.e. ~58 % efficiency before bit-stuffing. Stuffing inserts one extra bit after every five consecutive same-polarity bits; in the worst case that adds 24 bits, for 135 bits and ~47 % efficiency. CAN-FD with 64-byte payload pushes efficiency to ~70 %. EtherCAT achieves ~80 % at the application layer because the master packs many slaves’ I/O into one Ethernet frame and the overhead is amortised across them.
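The Classic CAN bit count can be checked with a short calculation. The sketch below assumes an 11-bit identifier and uses the usual worst-case stuffing bound (one stuff bit per four bits after the first, over the SOF-to-CRC region); the function name is illustrative.

```python
def classic_can_bits(data_bytes, worst_case_stuffing=False):
    """Bit count of a Classic CAN data frame with an 11-bit identifier."""
    if not 0 <= data_bytes <= 8:
        raise ValueError("Classic CAN carries 0 to 8 data bytes")
    data_bits = 8 * data_bytes
    # SOF + ID + RTR + control + data + CRC: the region subject to bit stuffing
    stuffed_region = 1 + 11 + 1 + 6 + data_bits + 15
    # CRC delimiter + ACK slot + ACK delimiter + EOF + interframe space
    fixed_tail = 1 + 1 + 1 + 7 + 3
    stuff_bits = (stuffed_region - 1) // 4 if worst_case_stuffing else 0
    return stuffed_region + fixed_tail + stuff_bits


for stuffed in (False, True):
    bits = classic_can_bits(8, stuffed)
    print(bits, f"{64 / bits:.0%}")
# 111 58%
# 135 47%
```

Real traffic lands between the two figures, since worst-case stuffing needs a specific bit pattern.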
Bit timing and synchronisation point
CAN, RS-485, and most asynchronous serial protocols have no separate clock wire: receivers recover bit timing from edges in the data. A nominal bit time on CAN is divided into segments (SYNC, PROP_SEG, PHASE_SEG1, PHASE_SEG2); the sample point is at the boundary between PHASE_SEG1 and PHASE_SEG2, typically 75 – 87.5 % of the bit time. A 250 kbps CAN bus has a 4 µs bit time; a sample point at 75 % gives 3 µs of propagation budget. Because arbitration and the ACK slot require a bit to reach the far node and the response to come back before the sample point, that budget must cover the round trip plus transceiver delays: at about 200 m/µs in copper it allows roughly 250–300 m of cable. Pushing to 1 Mbps shrinks the budget to 0.75 µs and the maximum bus length collapses to ~40 m.
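A rough length estimate follows from the same budget once the round trip is counted. In this sketch the 150 ns per-node delay is an assumed figure for transceiver plus controller latency, not a datasheet value, and the result is an upper bound rather than a design limit.

```python
def max_bus_length_m(bitrate_bps, sample_point=0.75,
                     node_delay_ns=150.0, velocity_m_per_us=200.0):
    """Rough upper bound on CAN bus length from the propagation budget.

    node_delay_ns is an ASSUMED delay per node; the signal passes through
    two nodes and travels the cable twice before the sample point.
    """
    bit_time_us = 1e6 / bitrate_bps
    budget_us = sample_point * bit_time_us
    cable_round_trip_us = budget_us - 2 * node_delay_ns / 1000.0
    return max(0.0, cable_round_trip_us / 2 * velocity_m_per_us)


for rate in (1_000_000, 250_000, 125_000):
    print(rate, round(max_bus_length_m(rate)), "m")
```

With these assumptions the estimate is about 45 m at 1 Mbps and a few hundred metres at 125–250 kbps, in line with the usual rule-of-thumb tables once some margin is taken off.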
3. Buses by tier — depth
3.1 Intra-board (chip-to-chip on one PCB)
- SPI (Serial Peripheral Interface, de-facto Motorola). Four wires: SCLK, MOSI (master-out-slave-in), MISO, /CS. Master-slave only, no addressing — each slave needs its own /CS line. Clock 1 MHz to 50+ MHz. Full-duplex. No acknowledgement, no error detection in the bus standard itself — drivers add CRC at the protocol layer. Used for IMUs (Bosch BMI088, TDK ICM-42688-P), high-speed ADCs (ADI AD7768), NOR/NAND flash, FPGA configuration, motor-control gate drivers (TI DRV8353), magnetic encoders (AMS AS5048A).
- I²C (Inter-Integrated Circuit, NXP/Philips UM10204). Two wires: SDA, SCL — open-drain with pull-up resistors (typically 4.7–10 kΩ to V_CC). Multi-master capable, addressable (7-bit or 10-bit). Standard 100 kHz, fast 400 kHz, fast+ 1 MHz, high-speed 3.4 MHz. Clock stretching by slow slaves is part of the protocol. Used for EEPROM, environmental sensors (Bosch BME280/680), ToF sensors (ST VL53L5CX), temperature sensors (TI TMP117), I/O expanders (NXP PCA9555), and many smart sensors that don’t need SPI bandwidth.
- UART / RS-232. Asynchronous serial, single wire per direction. 300 bps to 12+ Mbps. Single point-to-point or with addressing on top (Modbus ASCII/RTU). Debug consoles, GPS modules (u-blox ZED-F9P), GSM modems, low-rate inter-MCU.
- Parallel buses — legacy LCD controllers, classic memory-mapped peripherals, high-speed ADC/DAC data interfaces. Replaced almost everywhere by SPI, LVDS, or memory-controller IP.
- MIPI CSI-2 for camera-to-SoC. Differential D-PHY (or higher-speed C-PHY) 2–4 lanes at 1–6 Gbps per lane. Standardised by the MIPI Alliance; supported by every modern SoC (NVIDIA Jetson Orin, NXP i.MX 8M Plus, Qualcomm RB5/RB6, TI Sitara).
- PCIe — between SoC, GPU, NVMe, and FPGA. Lanes of 8 / 16 / 32 GT/s (Gen3 / Gen4 / Gen5). The dominant interconnect inside a robot’s compute box once you go beyond a single MCU.
- USB — 2.0 (480 Mbps), 3.x (5 / 10 / 20 Gbps), 4 (40 Gbps). Used for compute-to-LIDAR (Ouster, Velabit), compute-to-camera (Intel RealSense, Stereolabs ZED), compute-to-motor-driver (VESC, ODrive over CDC), debugger interfaces (J-Link, ST-Link).
3.2 Board-to-board real-time
- CAN (Controller Area Network, ISO 11898). Two-wire differential (CAN_H, CAN_L), 120 Ω termination at each end, multidrop bus topology. Classic CAN: 1 Mbps maximum on short buses, 125 kbps at the canonical 500 m. CAN-FD (ISO 11898-1:2015 and -2:2016): up to 5 Mbps (some PHYs 8 Mbps) in the data phase, 64-byte payloads (vs 8 bytes Classic), CRC widened to 17/21 bit. CAN-XL (CiA 610-1) extends payload to 2048 bytes and rates to 10+ Mbps but is still emerging.
- CANopen (CiA 301). Application layer on CAN: PDOs (process data objects — cyclic real-time data, max 8 bytes Classic CAN), SDOs (service data objects — confirmed configuration access), NMT (network management), SYNC, heartbeat / node guarding, EMCY (emergency). Object dictionary at fixed indices; profile families CiA 401 generic I/O, CiA 402 drives and motion, CiA 404 measurement.
- DeviceNet (ODVA, CIP family on CAN). Same physical layer as CAN, different application layer. Predominantly Rockwell-installed-base in North America.
- J1939 / ISO 11783. Heavy-vehicle CAN application layer. Used in trucks, buses, agricultural equipment (CLAAS, John Deere), construction equipment. PGNs (parameter group numbers) and SPNs (suspect parameter numbers) define the data dictionary.
- RS-485 / RS-422. Differential serial — TIA/EIA-485-A. Up to 32 nodes (more with low-loading drivers) on a bus 1200 m long at 100 kbps, or 12 m at 10 Mbps. Mostly carries Modbus RTU (Modicon, 1979) and proprietary protocols. Cheap, robust, ubiquitous; no built-in clock distribution or determinism beyond what the protocol gives.
- EtherCAT (Ethernet for Control Automation Technology, IEC 61158 Type 12 + IEC 61784-3 safety). 100 Mbps Ethernet (gigabit variants emerging in 2024–2025), processed on-the-fly by each slave ASIC (Beckhoff ET1100/ET1200, Microchip LAN9252/LAN9255, TI AM6442/AM2434 ICSS). A single frame circulates through a daisy-chain of slaves; each slave reads its slice and writes its response without buffering the whole frame. Cycle times 100 µs – 1 ms typical; sub-100 ns jitter via Distributed Clocks. Tooling: Beckhoff TwinCAT, open-source SOEM (Simple Open EtherCAT Master) and IgH EtherCAT Master on Linux + PREEMPT_RT.
- PROFINET RT / IRT (IEC 61158 Type 10, IEC 61784-2). Siemens-led industrial Ethernet. Three conformance classes: CC-A (RT, 10 ms cycles, no special HW), CC-B (RT, 5 ms), CC-C (IRT, 1 ms with PROFINET ASIC and topology-aware scheduling). PROFIsafe (IEC 61784-3) layered on top for SIL 3 safety.
- POWERLINK (Ethernet Powerlink, EPSG). Token-passing on 100 Mbps Ethernet: one Managing Node (MN) polls Controlled Nodes (CN) in time slots. Sub-100 µs cycles in optimal topologies. Open source (openPOWERLINK).
- EtherNet/IP (ODVA). CIP over standard TCP/UDP Ethernet. Dominant in Rockwell-installed-base PLCs. Not deterministic without TSN underlay; CIP Motion adds Class 1 cyclic with PTP synchronisation.
- SERCOS III (IEC 61491, IEC 61800-7-304). Motion-bus on 100 Mbps Ethernet with cyclic master-slave timing. Strong in machine tools.
- TTEthernet / TSN (IEEE 802.1 TSN amendments: 802.1Qbv time-aware shaper, 802.1AS gPTP, 802.1CB frame replication and elimination, 802.1Qbu pre-emption, 802.1Qci ingress policing). The standards layer that retrofits hard real-time onto plain Ethernet. Industrial Ethernet variants are migrating onto TSN; automotive uses it natively (100BASE-T1 + TSN as the in-vehicle backbone replacing legacy MOST and FlexRay).
- FlexRay (ISO 17458). Automotive, time-triggered, 10 Mbps, dual-channel for redundancy. Used in BMW iDrive, Audi quattro adaptive damping, and various OEM chassis networks. Slowly being replaced by 100BASE-T1 + TSN.
- LIN (Local Interconnect Network, ISO 17987). Single-wire, master-slave automotive sub-bus at 1–20 kbps. Window lifters, mirror controls, seat motors, ambient lighting.
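Since RS-485 links mostly carry Modbus RTU, it may help to see how small a request frame is. The sketch below builds a Read Holding Registers (function 0x03) request with the standard CRC-16/MODBUS (initial value 0xFFFF, reflected polynomial 0xA001, low byte sent first). The function names are illustrative, and serial-port handling and inter-frame timing are left out.

```python
import struct


def crc16_modbus(data: bytes) -> int:
    """CRC-16/MODBUS: init 0xFFFF, reflected polynomial 0xA001."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xA001 if crc & 1 else crc >> 1
    return crc


def read_holding_registers(unit: int, start: int, count: int) -> bytes:
    """Build a Modbus RTU request: address, function 0x03, start, count, CRC."""
    pdu = struct.pack(">BBHH", unit, 0x03, start, count)   # big-endian fields
    return pdu + struct.pack("<H", crc16_modbus(pdu))      # CRC low byte first


frame = read_holding_registers(unit=1, start=0, count=10)
print(frame.hex(" "))  # 01 03 00 00 00 0a c5 cd
```

Eight bytes on the wire for a ten-register read is a large part of why the protocol survives on slow multidrop links.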
3.3 Supervisory + IT integration
- Ethernet TCP/IP — the universal substrate. Non-deterministic at the protocol level; deterministic only when underlaid by TSN or industrial Ethernet variants.
- OPC-UA (IEC 62541). The de-facto industrial IoT protocol. Information modelling, security (X.509 certificates + TLS), client-server and pub-sub (over UDP or MQTT) profiles. Companion specs from VDMA define semantic models for robotics (OPC 40010), CNC, injection moulding, energy, etc.
- MQTT (ISO/IEC 20922, OASIS). Pub-sub over TCP, brokered. Three QoS levels (at-most-once, at-least-once, exactly-once). Brokers: Mosquitto, HiveMQ, EMQX, AWS IoT Core, Azure IoT Hub. Used for fleet telemetry, cloud-side coordination.
- DDS (Data Distribution Service, OMG v1.4 2015). Pub-sub with rich QoS (deadline, latency budget, ownership, partition, durability). The transport layer beneath ROS 2. Implementations: Eclipse Cyclone DDS, eProsima Fast DDS, RTI Connext DDS, Twin Oaks CoreDX, Gurum DDS.
- ROS 1 (TCPROS, UDPROS). Roscore-based, TCP transport, not real-time, not safety-rated. End-of-life (Noetic Ninjemys, May 2025).
- ROS 2 RMW (ROS Middleware abstraction). Plugs into Fast DDS (default in Humble–Lyrical), Cyclone DDS, Connext, or Iceoryx for true zero-copy shared memory inside one host. QoS profiles for reliability + latency + history.
- HTTP / REST / gRPC. Cloud-side telemetry, configuration UI, remote teleoperation control planes (Boston Dynamics Spot, Anymal Cerberus).
3.4 Wireless
- Wi-Fi 6 / 6E / 7 (IEEE 802.11ax / be). Up to 9.6 Gbps (Wi-Fi 6) / 46 Gbps (Wi-Fi 7) aggregate; latency single-digit milliseconds best case, variable under load. Used for teleop, fleet manager, software update.
- 5G + URLLC (3GPP Rel-16/17/18). Ultra-Reliable Low-Latency Communications: target 1 ms one-way, 99.999 % reliability. Private 5G networks for factory floors. Real-world ms-class deterministic only inside well-engineered private deployments.
- Bluetooth Low Energy 5.4 (Bluetooth SIG 2023). Sensor tags, key fobs, teach-pendant pairing. Latency 10s of ms; not for control.
- Zigbee / Thread / Matter (IEEE 802.15.4 PHY). Mesh for sensor networks, smart-home integration. Robotics use mostly in smart-home companion robots.
- LoRa / LoRaWAN (Semtech / LoRa Alliance). Long-range (km), low-data (~10 kbps), sub-GHz ISM band. Fleet asset tracking on outdoor AGVs.
- UWB (Ultra-Wideband, IEEE 802.15.4z, Qorvo DW3000 / NXP Trimension / Apple U1/U2). Centimetre-class ranging, used for indoor localisation, robot-to-tag, anti-collision, secure access.
3.5 Application-layer protocols on top of the field bus
The wire is half the story — the application protocol determines what data structures the bus carries:
- CANopen object dictionary (CiA 301): every device exposes a flat dictionary indexed by 16-bit index + 8-bit sub-index. Index 0x6040 is the “Controlword” on a CiA 402 drive; 0x6041 is the “Statusword”. Cyclic PDOs map dictionary entries onto fixed CAN identifiers for fast access; SDOs use one CAN ID per direction for confirmed reads/writes.
- CiA 402 finite-state machine (drives): every CiA 402-compliant drive runs the same eight-state FSM (Not ready to switch on → Switch on disabled → Ready to switch on → Switched on → Operation enabled, plus Quick stop active, Fault reaction active, and Fault). The master walks the drive through these states using bit patterns in the Controlword and confirms each step in the Statusword. EtherCAT CoE (CANopen-over-EtherCAT) and POWERLINK reuse the same FSM: write a controller once, drive any compliant servo.
- EtherCAT mailbox protocols — CoE (CANopen-over-EtherCAT, by far the most common), FoE (File-over-EtherCAT, used for firmware update), EoE (Ethernet-over-EtherCAT, for tunnelling TCP/IP to slaves), SoE (Servo-drive Profile over EtherCAT, the IEC 61491 SERCOS interface), AoE (ADS-over-EtherCAT, Beckhoff’s debug/parameter access protocol).
- CIP (Common Industrial Protocol, ODVA): the object model shared by DeviceNet, ControlNet, EtherNet/IP, CompoNet, and CIP Motion. Class 1 (cyclic implicit) and Class 3 (acyclic explicit) messaging.
- J1939 PGN/SPN dictionary: 18-bit parameter group numbers identify the message; suspect parameter numbers identify the data field within. PGN 61444 is “Electronic Engine Controller 1” (EEC1); SPN 190 inside it is the engine speed.
- Modbus function codes: 01 Read Coils, 02 Read Discrete Inputs, 03 Read Holding Registers, 04 Read Input Registers, 05 Write Single Coil, 06 Write Single Register, 15 Write Multiple Coils, 16 Write Multiple Registers, 23 Read/Write Multiple Registers. Trivial protocol — which is exactly why it’s still everywhere 45 years later.
- OPC-UA information model: nodes connected by typed references (HasComponent, HasProperty, HasTypeDefinition). VDMA 40010 “OPC UA for Robotics” defines the robot’s nodes (Axes, MotionDevices, SafetyStates) — letting a SCADA system or MES read robot state in a vendor-independent way.
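Two of the dictionary conventions above reduce to a few masks and shifts. The sketch below decodes a CiA 402 Statusword (object 0x6041) into its state name and splits a 29-bit J1939 identifier into priority, PGN, and source address. The mask/value pairs follow the commonly published CiA 402 state coding, and the engine-speed helper assumes the usual EEC1 layout (SPN 190 in data bytes 4 and 5, little-endian, 0.125 rpm per bit); verify both against the device documentation before relying on them. All function names are illustrative.

```python
CIA402_STATES = [
    # (mask, value, state name) applied to the Statusword
    (0x4F, 0x00, "Not ready to switch on"),
    (0x4F, 0x40, "Switch on disabled"),
    (0x6F, 0x21, "Ready to switch on"),
    (0x6F, 0x23, "Switched on"),
    (0x6F, 0x27, "Operation enabled"),
    (0x6F, 0x07, "Quick stop active"),
    (0x4F, 0x0F, "Fault reaction active"),
    (0x4F, 0x08, "Fault"),
]


def cia402_state(statusword: int) -> str:
    """Map a Statusword to its CiA 402 state name."""
    for mask, value, name in CIA402_STATES:
        if (statusword & mask) == value:
            return name
    return "Unknown"


def j1939_decode_id(can_id: int):
    """Split a 29-bit J1939 identifier into (priority, PGN, source address)."""
    priority = (can_id >> 26) & 0x7
    page_bits = (can_id >> 24) & 0x3        # extended data page + data page
    pf = (can_id >> 16) & 0xFF              # PDU format
    ps = (can_id >> 8) & 0xFF               # PDU specific
    # PDU1 (PF < 240) is destination-specific: PS is an address, not PGN bits
    pgn = (page_bits << 16) | (pf << 8) | (ps if pf >= 240 else 0)
    return priority, pgn, can_id & 0xFF


def eec1_engine_speed_rpm(data: bytes) -> float:
    """SPN 190 from an EEC1 payload (assumed layout, see lead-in)."""
    return int.from_bytes(data[3:5], "little") * 0.125


print(cia402_state(0x0237))           # Operation enabled
print(j1939_decode_id(0x0CF00400))    # (3, 61444, 0)
```

Commanding the drive is the mirror image: the master writes Controlword values such as 0x06 (shutdown), 0x07 (switch on), and 0x0F (enable operation), checking the decoded state after each step.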
3.6 Safety overlays
Real machines need functional safety on top of the field bus — IEC 61508 / ISO 13849 / ISO 10218 for industrial robots, ISO 26262 for automotive. The deterministic field-bus families each have a safety overlay riding on top:
- FSoE — Failsafe over EtherCAT (IEC 61784-3-12, ETG.5100). Black-channel principle: safety frames travel as opaque payload through the regular EtherCAT data stream; the FSoE master and slave PLCs handle CRC, watchdog, sequence-number verification themselves. Certified to SIL 3 / PL e. Round-trip safety reaction times typically 4–16 ms depending on watchdog configuration.
- PROFIsafe (IEC 61784-3-3). Same black-channel concept on PROFINET (and PROFIBUS). PROFIsafe F-Host (typically a Siemens S7-1500F) talks to F-Devices through a regular PROFINET network; safety reaction time configurable per F-Device.
- CIP Safety (ODVA, IEC 61784-3-2). The same idea on EtherNet/IP / DeviceNet / CompoNet.
- openSAFETY (EPSG, IEC 61784-3-13). Bus-independent safety protocol; primarily used over POWERLINK but transportable.
Common to all four: the safety layer is the only part that needs to be safety-certified; the underlying field bus does not need certification because it is treated as an unreliable transport. This drastically reduces the qualification cost of new machines — only the safety PLC + safety I/O modules carry SIL paperwork.
3.7 Migration trends 2020–2026
- Automotive Ethernet replacing CAN/FlexRay — 100BASE-T1 / 1000BASE-T1 with TSN displaces FlexRay in chassis networks (BMW iX, Mercedes EQS, Volvo EX90). CAN-FD persists for low-bandwidth body and powertrain ECUs.
- EtherCAT G — 1 Gbps EtherCAT (ETG.1510, 2018) — sees adoption in high-axis-count semiconductor and packaging machines from 2022 onwards.
- TSN adoption inside industrial Ethernet — PROFINET TSN, CC-Link IE TSN, OPC UA FX (Field eXchange) over TSN are converging field-bus traffic onto a shared switched-Ethernet backbone with TSN scheduling.
- Single-Pair Ethernet (SPE) / 10BASE-T1L (IEEE 802.3cg) — 10 Mbps over a single twisted pair, 1000 m reach, with power (PoDL, IEEE 802.3cg). Replaces 4–20 mA loops and Modbus RTU in process automation; growing in mobile robotics as a long-reach sensor bus.
- ROS 2 Zenoh integration: rmw_zenoh_cpp, a tech preview in Jazzy (May 2024) and carried into Kilted Kaiju (May 2025), addresses DDS discovery scaling pain at fleet scale.
- CAN XL (CiA 610-1, 2022): 10+ Mbps with up to 2048-byte payload; designed to coexist on CAN-FD physical media. Adoption pending broad PHY availability (Bosch, NXP, Infineon shipping 2024–2025).
4. Practical math: four worked examples
Example A — CAN bus loading on a six-axis cobot
A research cobot: 6 joint controllers, 3 wrist-mounted sensors (force/torque, IMU, tool), 1 system controller = 10 nodes on a single CAN-FD trunk. Each joint sends a PDO at 1 kHz containing: 4 byte position, 2 byte velocity, 2 byte current, 1 byte status = 9 bytes, padded to one 16-byte FD frame; receives a setpoint of 4 byte target position + 2 byte feed-forward torque = 6 byte payload, padded to one 8-byte FD frame.
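Frame overhead. With 29-bit IDs, a CAN-FD frame spends roughly 80 bit times at the 1 Mbps arbitration rate (SOF + 29-bit ID + control bits before the bit-rate switch, then ACK + EOF + interframe space after it, with a generous allowance for stuff bits), plus 8 × n payload bits and ~30 bits of DLC/CRC at the 5 Mbps data rate for an n-byte payload. For a 16-byte payload: ~80 µs + (128 + 30) / 5 ≈ 32 µs = ~112 µs per frame. For an 8-byte payload: ~80 µs + (64 + 30) / 5 ≈ 19 µs = ~99 µs.
Per-cycle bus time. 6 setpoint frames (8 byte each, master-to-joint) + 6 joint feedback PDOs (16 byte) + 3 sensor PDOs (~16 byte) = 15 frames × ~100 µs ≈ 1.5 ms. At a 1 kHz master cycle the bus is roughly 150 % loaded, so it won’t work.
Three fixes that actually work:
- Lift to two CAN-FD trunks with the joints split between them (for example four joints on one, two joints plus the three sensors on the other); per-trunk load drops to ~700–800 µs / cycle. Putting all six drives on one trunk and only the sensors on the other leaves the drive trunk at ~1.2 ms, which still does not fit. Practical and common.
- Shrink the frames: 11-bit IDs cut the arbitration-rate overhead to roughly 30 bit times, and the full 8 Mbps data phase (TI TCAN4550-Q1, NXP TJA146x) shortens the payload portion. Each frame drops to ~45–50 µs; total ~700 µs / cycle = 70 %, feasible but tight.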
- Move to EtherCAT — see Example B. The right answer for cycle times below 1 ms with 10+ nodes.
This worked case is the standard reason cobot vendors moved off CAN onto EtherCAT for inter-joint communication in the 2010s — the math doesn’t close at 1 kHz with double-digit nodes.
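The bus-load arithmetic in Example A is easy to parameterise. The sketch below is an estimate only: the overhead bit counts (`arb_bits`, `data_overhead_bits`) are rough assumptions in the spirit of the figures above, not exact frame-format values, and stuff bits are lumped into them. Function and parameter names are illustrative.

```python
def fd_frame_time_us(payload_bytes, arb_bits=80, data_overhead_bits=30,
                     nominal_mbps=1.0, data_mbps=5.0):
    """Approximate CAN-FD frame duration in microseconds."""
    arb_us = arb_bits / nominal_mbps                          # slow phase
    data_us = (8 * payload_bytes + data_overhead_bits) / data_mbps
    return arb_us + data_us


def bus_load(frames, cycle_us=1000.0, **timing):
    """frames: list of (count, payload_bytes) sent once per cycle."""
    busy_us = sum(n * fd_frame_time_us(size, **timing) for n, size in frames)
    return busy_us / cycle_us


# 6 setpoints (8 byte), 6 joint feedback PDOs and 3 sensor PDOs (16 byte)
traffic = [(6, 8), (6, 16), (3, 16)]
print(f"single trunk, 5 Mbps data phase: {bus_load(traffic):.0%}")
print(f"short IDs, 8 Mbps data phase:    "
      f"{bus_load(traffic, arb_bits=30, data_mbps=8.0):.0%}")
```

The first case comes out well above 100 % and the second near 70 %, which shows that the arbitration-rate overhead, not the payload, dominates the budget at these frame sizes.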
Example B — EtherCAT cycle budget on a 12-slave drive chain
12 servo drives daisy-chained on one EtherCAT segment, each exchanging 32 byte input + 32 byte output of process data per cycle. PHY is 100BASE-TX (100 Mbps).
Frame size. EtherCAT can pack up to 1486 bytes of process data into a single datagram (one frame). Per-slave datagram has 10-byte header + 64-byte data + 2-byte working counter = 76 bytes; 12 slaves × 76 = 912 bytes; plus Ethernet header (14) + EtherCAT header (2) + FCS (4) = 932 bytes. Adding the preamble/SFD (8) and inter-frame gap (12) gives 952 byte times on the wire, so transmission time at 100 Mbps = 952 × 8 / 100e6 ≈ 76 µs.
Slave propagation. Each ESC ASIC adds ~300–600 ns processing delay; Cat 5 cable contributes ~5 ns/m. With 1 m cables between drives: 12 × (500 ns ASIC + 5 ns cable) ≈ 6 µs one-way, 12 µs round-trip.
Total cycle. Frame TX ~76 µs + propagation 12 µs + master scheduling jitter on PREEMPT_RT Linux ~10 µs = ~100 µs round-trip. EtherCAT comfortably runs at 1 kHz (1000 µs budget, 10× margin) and routinely runs at 4–10 kHz (Beckhoff XFC, Synapticon SOMANET). For comparison, the same 768 bytes of process data on Classic CAN at 1 Mbps would need 96 eight-byte frames and more than 10 ms of bus time, over ten 1 kHz cycles.
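The cycle budget in Example B can be recomputed for other chain lengths with a small helper. This sketch assumes one datagram per slave with a 10-byte datagram header and 2-byte working counter, counts preamble and inter-frame gap as wire time, and takes the ESC delay and master jitter as the round figures used above; all names and defaults are illustrative.

```python
def ethercat_cycle_us(slaves, data_bytes_per_slave, link_mbps=100.0,
                      esc_delay_ns=500.0, cable_m=1.0, cable_ns_per_m=5.0,
                      master_jitter_us=10.0):
    """Estimate the frame round-trip time for a daisy-chained segment."""
    datagram = 10 + data_bytes_per_slave + 2        # header + data + working counter
    frame = 14 + 2 + slaves * datagram + 4          # Ethernet hdr + EtherCAT hdr + FCS
    if frame > 1518:
        raise ValueError("process image does not fit in one Ethernet frame")
    wire_bytes = frame + 8 + 12                     # preamble/SFD + inter-frame gap
    tx_us = wire_bytes * 8 / link_mbps
    one_way_us = slaves * (esc_delay_ns + cable_m * cable_ns_per_m) / 1000.0
    return tx_us + 2 * one_way_us + master_jitter_us


total = ethercat_cycle_us(slaves=12, data_bytes_per_slave=64)
print(f"{total:.1f} us per cycle, {1000.0 / total:.1f}x margin at 1 kHz")
```

Transmission time dominates, so halving the process image per drive buys more headroom than shortening cables.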
Example C — I²C sensor multiplexing on a swarm robot
A small mobile platform reads four BME280 environmental sensors at four corners (chassis humidity, temp, pressure). All BME280s share the same I²C address 0x76 — they need a multiplexer (TI TCA9548A, 8 channels) or address-strapping pin (BME280 supports 0x76 / 0x77 only — two corners conflict). Use the TCA9548A.
Per-sensor read. BME280 forced-mode measurement: ~10 ms conversion. Register burst read 8 bytes at 400 kHz I²C fast mode: each byte takes 9 clocks (1 byte + 1 ACK) = 22.5 µs; 8 bytes + 1 reg address + start/stop ≈ 250 µs. Mux channel switch (1-byte write to 0x70): ~75 µs.
Per-cycle total. Four sensors: 4 × (10 ms convert + 250 µs read + 75 µs mux switch) = ~41.3 ms if measured sequentially with blocking. Achievable rate ≈ 24 Hz.
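If 100 Hz needed: start all four conversions in parallel (write the 0xF4 mode register through each mux channel), wait 10 ms, then read all four: total ~11 ms / cycle = 90 Hz. Or run separate I²C buses on four GPIO pairs of an STM32 (many STM32 parts have 3–4 I²C peripherals) and read in parallel, limited only by conversion time at ~95 Hz. Neither option quite reaches 100 Hz while a conversion takes ~10 ms; closing the last gap means shortening the conversion itself, for example with lower oversampling settings.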
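The two scheduling strategies in Example C differ only in whether the conversion time is paid once or once per sensor. The sketch below uses the timing figures from the example as defaults and assumes, for the overlapped case, that the trigger writes are hidden inside the conversion window; the function names are illustrative.

```python
def sequential_rate_hz(sensors, convert_ms=10.0, read_us=250.0, mux_us=75.0):
    """Trigger, wait, and read each sensor in turn."""
    cycle_ms = sensors * (convert_ms + (read_us + mux_us) / 1000.0)
    return 1000.0 / cycle_ms


def overlapped_rate_hz(sensors, convert_ms=10.0, read_us=250.0, mux_us=75.0):
    """Trigger all sensors, wait one conversion time, then read each."""
    cycle_ms = convert_ms + sensors * (read_us + mux_us) / 1000.0
    return 1000.0 / cycle_ms


print(f"sequential: {sequential_rate_hz(4):.1f} Hz")
print(f"overlapped: {overlapped_rate_hz(4):.1f} Hz")
```

Lowering `convert_ms` is the only parameter that moves the overlapped figure past 100 Hz, since the bus transfers cost little more than a millisecond in total.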
Example D — DDS QoS budget for a ROS 2 LIDAR pipeline
A Velodyne VLP-16 LIDAR pumps 300 000 points/s at 10 Hz scan rate; serialised as ROS 2 sensor_msgs/PointCloud2 each scan is ~480 kB. Three subscribers: SLAM, obstacle detection, recording bag.
Frame rate vs bandwidth. 480 kB × 10 Hz × 3 subscribers = 14.4 MB/s on the loopback or local network. At gigabit Ethernet (125 MB/s) that’s 12 % — comfortable. At 100 Mbps it would be 115 % — fails. On loopback / shared memory (Iceoryx RMW) the cost is a pointer-copy.
QoS choice. History KEEP_LAST depth 1 (only the latest scan matters; never queue stale point clouds). Reliability RELIABLE if the SLAM node must not drop frames; BEST_EFFORT for the obstacle detector (a dropped frame is harmless at 10 Hz). Durability VOLATILE (no late-joiner needs old clouds). Deadline 150 ms (alert if a scan arrives more than half a cycle late).
Network MTU. Default Linux MTU is 1500 bytes; a 480 kB cloud fragments into ~320 UDP datagrams. Each lost datagram triggers RTPS retransmission. On a noisy Wi-Fi link the retransmission storm can overwhelm the bus. Two mitigations: use jumbo frames (9000-byte MTU) if the switch supports them — fragmentation drops to ~55 datagrams; or move pipeline onto loopback / Iceoryx and avoid the network entirely.
Iceoryx zero-copy. Switching from Cyclone DDS to rmw_iceoryx_cpp on a single host turns the point-cloud copy into a pointer hand-off through shared memory. Measured publish-to-subscribe latency drops from ~6 ms to ~50 µs on an Arm-based Jetson AGX Orin. Useful for high-bandwidth on-host pipelines (LIDAR fusion, image processing) but does not cross host boundaries; bridge with rmw_cyclonedds if multi-host needed.
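The bandwidth and fragmentation figures in Example D follow from a handful of multiplications. The sketch below assumes 16 bytes per point (for instance x, y, z, intensity as 32-bit floats), which is what makes a 30 000-point scan come to ~480 kB, and it ignores IP/UDP/RTPS header overhead when counting fragments. The function name and dictionary keys are illustrative.

```python
import math


def pointcloud_budget(points_per_scan=30_000, bytes_per_point=16, scan_hz=10,
                      subscribers=3, link_mbps=1000, mtu=1500):
    """Estimate per-scan size, aggregate rate, link utilisation and fragments."""
    scan_bytes = points_per_scan * bytes_per_point
    rate_bytes = scan_bytes * scan_hz * subscribers     # one copy per subscriber
    return {
        "scan_kB": scan_bytes / 1e3,
        "rate_MBps": rate_bytes / 1e6,
        "link_util": rate_bytes * 8 / (link_mbps * 1e6),
        "fragments": math.ceil(scan_bytes / mtu),       # headers ignored
    }


for label, kwargs in [("GbE, MTU 1500", {}),
                      ("GbE, MTU 9000", {"mtu": 9000}),
                      ("100 Mbps", {"link_mbps": 100})]:
    print(label, pointcloud_budget(**kwargs))
```

The per-subscriber multiplication assumes unicast delivery; with multicast or shared memory the network cost does not scale with subscriber count in the same way.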
5. Design heuristics
Choose by latency requirement.
| Loop cycle time (typical loops) | Bus tier |
|---|---|
| <100 µs (10+ kHz current loops) | Stay on-chip: SPI to gate driver, no field bus in the loop |
| 100 µs – 1 ms (1–10 kHz velocity / position loops) | EtherCAT, POWERLINK, PROFINET IRT, SERCOS III |
| 1–10 ms (PLC-class motion, coordinated multi-axis) | CAN-FD with CANopen / J1939, PROFINET RT, EtherNet/IP |
| 10–100 ms (supervisory, ROS 2 typical) | DDS over plain Ethernet, MQTT, OPC-UA |
| >100 ms (UI, logging, cloud) | HTTP/REST, gRPC, MQTT |
Cabling distance ceilings.
| Bus | Max useful length |
|---|---|
| SPI, I²C unshielded | 0.3 m |
| USB 2.0 | 5 m (passive), 30 m (active hub) |
| USB 3.x | 3 m (passive), 10 m (active optical) |
| 100BASE-TX / EtherCAT Cat 5e | 100 m per segment |
| 100BASE-T1 (single-pair Ethernet) | 15 m (auto), 40 m (industrial) |
| CAN at 1 Mbps | 40 m |
| CAN at 125 kbps | 500 m |
| RS-485 at 100 kbps | 1200 m |
| LoRa 868/915 MHz | 2–15 km LOS |
Design rules of thumb.
- Favour daisy-chain over star for distributed actuators. EtherCAT, POWERLINK, and ring-redundant PROFINET cut cable cost by 5–10× vs star Ethernet on a 6-axis arm.
- Two-wire field buses (CAN, RS-485) need termination — 120 Ω at each physical end of the bus. Star wiring breaks termination and causes reflections.
- Galvanic isolation at every entry/exit of a control cabinet. Isolated CAN PHY (TI ISO1042, ADI ADM3055) and isolated RS-485 (ADI ADM2682E) eliminate ground-loop currents that destroy transceivers.
- Common-mode chokes on CAN/RS-485 cable entry suppress conducted common-mode EMI; pair them with TVS diodes for ESD protection.
- Shielded twisted pair (STP) for any industrial bus run outside a single PCB. Foil shield + drain wire grounded at one end only.
- Don’t mix logic levels without buffering. 5 V SPI talking to a 3.3 V sensor: use TI TXS0108E or NXP NTBHF level translator. Forgotten level mismatch is a top-3 cause of mysterious sensor dropouts.
- CAN priority — assign IDs by criticality: emergency stop on lowest ID (highest priority), drive feedback next, slow status messages on high IDs. Don’t accidentally swap.
- Bus utilisation ceiling 70–80 % for CAN — above that, jitter on low-priority frames explodes.
- EtherCAT topology is a DAG — branch points use junction slaves (Beckhoff EK1122). Avoid more than ~3 levels of nested branching for clarity.
- Time sync requirement — if two axes must move in coordinated trajectory, they must share a clock. PTP on plain Ethernet gives sub-µs; EtherCAT DC gives sub-100 ns; nothing gives sub-µs without dedicated time-sync hardware.
- Security on field buses — CAN, RS-485, classic EtherCAT have no authentication. Treat the field bus as a trusted internal segment; firewall it from corporate IT; use OPC-UA with certificate auth at the boundary.
- Redundancy where lives are at stake — automotive uses dual CAN for chassis safety; industrial uses ring-redundant Ethernet (MRP, HSR, PRP); aerospace uses TTEthernet with three replicas.
- Connector grade matters more than cable grade in mobile robots — M12 D-coded (Ethernet) / A-coded (signal) / B-coded (Profibus) / X-coded (gigabit Ethernet) are the industrial standard. Hobby JST / Dupont fail within months on a moving joint.
- Cable flex rating — for any wire that crosses a robot joint, specify continuous-flex rated cable (igus Chainflex, LAPP ÖLFLEX FD, SAB Bröckskes drag-chain). Standard PVC insulation cracks within 10^5 flex cycles; FEP / PUR jacketed cables survive 10^7+.
- Wire colour discipline. Pick one documented colour convention per bus and apply it across the whole machine, for example the DeviceNet scheme for CAN (CAN_H white / CAN_L blue) and the four-wire industrial Ethernet scheme used on PROFINET and EtherCAT cable (TD+ yellow / TD− orange / RD+ white / RD− blue). Inconsistent colouring inside one machine is a top cause of mis-wiring at commissioning.
- Diagnostic LEDs at every node. A bicolour LED tied to bus activity + error state pays for itself the first time a slave drops out — a 30-second visual check replaces a 30-minute Wireshark hunt.
- Build the bus to fail gracefully. If a sensor cable can be hot-unplugged without crashing the master, the system has a chance during field service. Hot-pluggable EtherCAT requires Hot Connect groups configured in the ENI file.
- Versioning matters on serialised protocols. ROS 2 message types are hashed; a mismatched .msg between two nodes silently drops messages on Fast DDS. Pin RMW + message-package versions across all robots in a fleet.
- Don’t run buses near VFDs. Variable-frequency motor drives radiate broadband EMI from their switching at 4–20 kHz. Keep CAN / EtherCAT cables 300 mm clear of drive output cables, or cross them at 90°.
- Bus power. PoE (IEEE 802.3af/at/bt) on EtherCAT — only on dedicated PoE-capable slaves; standard EtherCAT slaves expect separate 24 V. PoDL (IEEE 802.3bu) on Single-Pair Ethernet up to 50 W. CAN bus power on the same connector via Deutsch DT04-4P is convention but not standard.
- Topology audit on commissioning. Walk the cable run end-to-end with a known-good cable tester (Fluke MicroScanner Cable Verifier for Ethernet; PEAK PCAN-Diag 2 for CAN). Document the as-built layout in the machine manual. Mismatch between as-designed and as-built drawings is a top cause of multi-day field troubleshooting.
- Cycle-time budget = inner loop × 5. A field-bus cycle should run at least five times faster than the closed-loop bandwidth it carries. For a 100 Hz position loop that means 500 Hz (a 2 ms cycle) as the bare minimum; 1 ms (1 kHz) is comfortable; 100 µs is overkill but harmless.
- Plan for firmware update. Every smart device on the bus needs a way to receive new firmware without dismantling the machine. EtherCAT slaves use FoE (File-over-EtherCAT); CANopen drives use a bootloader exposed via SDO (object 0x1F50); CAN-FD drives often use UDS (ISO 14229) over CAN. Skipping this in the design phase costs days per field update.
- Document the address space. Maintain an authoritative spreadsheet / YAML of node ID → device → CAN IDs / EtherCAT positions / Modbus register ranges. Devices added later by field engineers without updating this document are a guaranteed conflict in 6–24 months.
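The address-space bullet can be backed by an automated check. The following is a minimal sketch, assuming the map is kept as a list of records (for example loaded from the YAML file); the device names, node IDs, CAN IDs, and the `find_conflicts` helper are all hypothetical and only illustrate the idea of flagging duplicates before they reach the machine.

```python
from collections import defaultdict

# Hypothetical address map: one record per device on a CAN segment.
ADDRESS_MAP = [
    {"device": "joint1_drive", "node_id": 1, "can_ids": [0x181, 0x201]},
    {"device": "joint2_drive", "node_id": 2, "can_ids": [0x182, 0x202]},
    # A device added later in the field, reusing node ID 2 and CAN ID 0x202.
    {"device": "gripper", "node_id": 2, "can_ids": [0x183, 0x202]},
]


def find_conflicts(address_map, key):
    """Return {value: [devices]} for every value of `key` claimed by more than one device."""
    owners = defaultdict(list)
    for entry in address_map:
        values = entry[key] if isinstance(entry[key], list) else [entry[key]]
        for value in values:
            owners[value].append(entry["device"])
    return {value: devices for value, devices in owners.items() if len(devices) > 1}


if __name__ == "__main__":
    for key in ("node_id", "can_ids"):
        for value, devices in find_conflicts(ADDRESS_MAP, key).items():
            print(f"conflict on {key} {value:#x}: {', '.join(devices)}")
```

Running a check like this in CI or as a pre-commissioning step keeps the document authoritative rather than aspirational.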
6. Components & sourcing
CAN transceivers (Classic + FD). TI TCAN1051, TCAN1462, TCAN4550-Q1 (integrated controller + transceiver via SPI); NXP TJA1051, TJA1463, TJA1153 (secure CAN); Microchip MCP2562FD, MCP2517FD (external controller); Infineon TLE9252V, TLE9255W (auto-grade, partial-networking).
Isolated CAN. TI ISO1042, ISO1044; ADI ADM3055, ADM3057; NVE IL35051 — provide 5 kV+ isolation between bus and MCU side.
RS-485 transceivers. TI SN65HVD485, SN65HVD75; Maxim MAX13487, MAX14778; ADI ADM2682E (isolated). Standard 1/8-unit-load drivers give 256-node fan-out.
EtherCAT slave ASICs / SoCs. Beckhoff ET1100, ET1200 (fixed-function ESC); Microchip LAN9252 (2/3-port slave controller), LAN9255 (slave + Cortex-M4F); TI AM6442 / AM243x with the PRU-ICSSG Industrial Communications Subsystem (slave or master, firmware-defined MAC) and the matching TI PRU-ICSS reference designs; Renesas RZ/T2L.
EtherCAT masters. Beckhoff CX-series (CX5140, CX7000) industrial PCs running TwinCAT 3; Synapticon SOMANET; open-source SOEM (Simple Open EtherCAT Master) and IgH EtherCAT Master running on Linux PREEMPT_RT, used by many academic and research robot platforms.
PROFINET stacks. Siemens TIA Portal + SIMATIC S7-1500 PLCs; Hilscher netX 90 / 100 / 4000 communication processors; Phoenix Contact Axioline; Softing PROFINET Driver.
OPC-UA stacks. Unified Automation UaModeler + UaExpert (commercial); open62541 (open-source C); node-opcua (Node.js); Eclipse Milo (Java); FreeOpcUa (Python).
DDS implementations. eProsima Fast DDS (default RMW in most ROS 2 releases, including Humble, Iron and Jazzy), Eclipse Cyclone DDS (default in Galactic, widely used as an alternative elsewhere), RTI Connext DDS (commercial, used by NASA, DoD), Eclipse Iceoryx (shared-memory RMW for ROS 2, true zero-copy on the same host).
MQTT brokers. Eclipse Mosquitto (lightweight), HiveMQ (commercial enterprise), EMQX, VerneMQ, AWS IoT Core, Azure IoT Hub, Google Cloud IoT Core (retired in August 2023; relevant only to legacy deployments).
Ethernet PHYs. TI DP83822 (industrial 100 Mbps), DP83867 (gigabit), DP83TC811 (100BASE-T1); NXP TJA1100 / TJA1102 (auto 100BASE-T1); Microchip KSZ8081 (industrial); Realtek RTL8211 (commercial gigabit).
MCUs with on-chip CAN-FD. STM32G4 series (3× CAN-FD), STM32H7 (2× CAN-FD), NXP S32K344 (6× CAN-FD, ASIL-D), Infineon AURIX TC3xx (auto, up to 6× CAN-FD), Microchip dsPIC33CH (2× CAN-FD), Renesas RA6T2.
MCUs / SoCs with EtherCAT slave. TI AM243x / AM6442 (slave or master via ICSS), Microchip LAN9255 (slave + Cortex-M4F), Renesas RZ/T2L (slave + dual Cortex-R52), Infineon XMC4800.
Connectors. M12 — D-coded (4-pin, 100 Mbps Ethernet, PROFINET, EtherCAT), X-coded (8-pin, gigabit Ethernet), A-coded (4/5/8/12-pin signal), B-coded (5-pin Profibus). Phoenix Contact, Harting, Murr, Lumberg. Deutsch DTM / DT — automotive / off-highway CAN (Tyco/TE). Lemo — medical and high-vibration.
Cables. SAB Bröckskes (German industrial), Belden 3084A (CAN), Belden 3105A (RS-485), LAPP ÖLFLEX (drag-chain rated), igus Chainflex (continuous-flex drag chains for moving joints), Helukabel.
Bus analysers / debugging. Vector CANalyzer + CANoe (commercial industry standard), PEAK PCAN-USB FD, Kvaser Memorator / Leaf / Hybrid 2xCAN, Beckhoff TwinCAT Scope, Wireshark with CAN/EtherCAT/PROFINET dissectors, Saleae Logic Pro 16, Total Phase Beagle (USB and I²C/SPI).
7. Reference data tables
7.1 Bus comparison
| Bus | Phys | Max bandwidth | Max length | Topology | Typical latency | Typical cost / node |
|---|---|---|---|---|---|---|
| SPI | TTL/CMOS | 50 Mbps+ | 0.3 m | Star (CS-per-slave) | <1 µs | $ |
| I²C | Open-drain | 3.4 Mbps | 0.3 m | Bus | 10–100 µs | $ |
| UART | TTL | 12 Mbps | 0.3 m | P2P | <1 µs | $ |
| RS-485 | Differential | 10 Mbps | 1200 m | Bus | 100 µs+ | $ |
| Classic CAN | Differential | 1 Mbps | 40 m @ 1M | Bus | 100 µs–10 ms | $ |
| CAN-FD | Differential | 5–8 Mbps data | 40 m | Bus | 50 µs–1 ms | $$ |
| EtherCAT | 100BASE-TX | 100 Mbps | 100 m / hop | Daisy-chain | 50 µs–1 ms | $$$ |
| PROFINET IRT | 100BASE-TX | 100 Mbps | 100 m / hop | Star / ring | 250 µs–1 ms | $$$ |
| EtherNet/IP | Ethernet | 1 Gbps | 100 m / hop | Star | 1–10 ms | $$$ |
| POWERLINK | 100BASE-TX | 100 Mbps | 100 m / hop | Daisy / star | 100 µs–1 ms | $$$ |
| TSN Ethernet | 1 Gbps Eth | 1–10 Gbps | 100 m / hop | Switched | 100 µs–1 ms | $$$$ |
| Plain Ethernet | 100BASE / 1000BASE | 1–100 Gbps | 100 m | Switched | 0.1–10 ms (best-effort) | $$ |
| Wi-Fi 6 | 802.11ax | 9.6 Gbps | ~30 m | Star (AP) | 2–20 ms | $$ |
| 5G URLLC | NR | 10 Gbps | km | Cellular | 1–10 ms | $$$$ |
7.2 CANopen vs EtherCAT vs PROFINET decision matrix
| Need | Pick |
|---|---|
| <10 axes, 1 kHz cycle OK, lowest cost | CANopen on CAN-FD |
| 10+ axes, 1 kHz+ cycle, daisy-chain wiring | EtherCAT |
| Siemens PLC house | PROFINET RT / IRT |
| Rockwell PLC house | EtherNet/IP + CIP Motion |
| Machine tool (CNC) lineage | SERCOS III or EtherCAT |
| Mixed brand, modern Linux master | EtherCAT (open SOEM master) |
| Automotive (in-vehicle) | CAN-FD + LIN + 100BASE-T1 TSN |
| Off-highway, agricultural | J1939 |
| Safety-rated (SIL 2/3) overlay needed | EtherCAT FSoE, PROFIsafe, CIP Safety |
7.3 ROS 2 RMW comparison (Humble / Iron / Jazzy / Lyrical, 2024–2026)
| RMW | Default in | Strengths | Weaknesses |
|---|---|---|---|
| Fast DDS (eProsima) | Foxy, Humble, Iron, Jazzy | Mature, good discovery, OMG-compliant | Discovery overhead with 100+ nodes |
| Cyclone DDS (Eclipse) | Galactic; selectable alternative in other releases | Lightweight, lower latency | Smaller ecosystem |
| Connext DDS (RTI) | Commercial / DoD / NASA | Full QoS, safety certifications (DO-178C), tooling | Closed-source, license cost |
| Iceoryx (Eclipse) | Same-host shared memory | True zero-copy, microsecond inter-process | One host only — needs bridge for multi-host |
| Zenoh (rmw_zenoh) | Tech preview Jazzy+ | Pub-sub-query-storage, network agnostic, low overhead | Newer; ecosystem catching up |
7.4 MCU families with on-chip CAN-FD
| Family | Vendor | CAN-FD ports | Notes |
|---|---|---|---|
| STM32G4 | ST | 3 | Cortex-M4, 170 MHz, motor-control-oriented |
| STM32H7 | ST | 2 | Cortex-M7 + M4, 480 MHz, plus optional EtherCAT slave via an external ESC such as LAN9252 |
| S32K344 / S32K358 | NXP | 6 | ASIL-D, automotive |
| AURIX TC3xx | Infineon | up to 6 | ASIL-D, automotive |
| dsPIC33CH | Microchip | 2 | Dual-core, motor control |
| RA6T2 / RA8 | Renesas | 2–4 | Cortex-M33/M85 |
| AM243x / AM6442 | TI | 2 + ICSS (EtherCAT slave) | Cortex-R5F (AM6442 adds Cortex-A53), industrial |
7.5 TSN amendment timeline
| Amendment | Year | Adds |
|---|---|---|
| 802.1AS | 2011 / 2020 (rev) | gPTP time synchronisation |
| 802.1Qav | 2009 | Credit-based shaper |
| 802.1Qbv | 2015 | Time-aware shaper (scheduled gates) |
| 802.1Qbu / 802.3br | 2016 | Frame pre-emption |
| 802.1CB | 2017 | Frame replication and elimination for reliability |
| 802.1Qci | 2017 | Per-stream filtering and policing |
| 802.1Qch | 2017 | Cyclic queuing and forwarding |
| 802.1Qcc | 2018 | Stream reservation enhancements |
7.6 I²C / SPI sensor and peripheral cheat-sheet
| Device class | Typical bus | Typical address / CS | Notes |
|---|---|---|---|
| 6-axis IMU (BMI088, ICM-42688-P) | SPI 8–10 MHz | dedicated /CS | high-rate gyro + accel, 8 kHz internal |
| 9-axis fusion IMU (BNO055) | I²C 400 kHz / UART | 0x28/0x29 | sensor fusion on-chip |
| ToF (VL53L5CX, VL53L7CX) | I²C 1 MHz | 0x29 (re-programmable) | 8×8 zone ranging |
| Magnetic encoder (AS5048A/B) | SPI 10 MHz / I²C 1 MHz | /CS or 0x40 | 14-bit absolute angle |
| Optical encoder interface (LS7366R) | SPI 8 MHz | /CS | quadrature decoder, 32-bit counter |
| ADC for current sense (INA226, INA228) | I²C 2.94 MHz | 0x40–0x4F | shunt + bus monitoring |
| ADC for high-speed (AD7768, ADS131M08) | SPI 8–25 MHz | /CS | 24-bit Σ-Δ, 32 kSPS |
| Pressure / altitude (BMP388, MS5611) | I²C / SPI | 0x76/0x77 | barometric for drones |
| Temperature (TMP117, MAX31875) | I²C 1 MHz | 0x48–0x4B | ±0.1 °C |
| RTC (DS3231, RV-8803) | I²C 400 kHz | 0x68 | timekeeping during power loss |
| EEPROM (24LC256, 24AA02) | I²C 400 kHz | 0x50–0x57 | small config storage |
| SD card | SPI 25 MHz / SDIO | /CS | logging, firmware update |
7.7 Bandwidth and overhead ready-reckoner
| Bus | Raw rate | Useful payload rate | Min frame latency |
|---|---|---|---|
| Classic CAN @ 1 Mbps | 1 Mbps | ~470 kbps (8 B payload) | 135 µs |
| CAN-FD @ 5 Mbps data, 1 Mbps arb | 5 Mbps (data phase) | ~3 Mbps (64 B payload) | 100 µs |
| RS-485 @ 10 Mbps Modbus RTU | 10 Mbps | ~7 Mbps | 1 ms (typical poll) |
| EtherCAT 100 Mbps | 100 Mbps | ~80 Mbps app | 30 µs (single slave) |
| PROFINET IRT 100 Mbps | 100 Mbps | ~70 Mbps | 250 µs |
| Gigabit Ethernet TSN | 1 Gbps | ~900 Mbps | 100 µs |
| USB 2.0 HS | 480 Mbps | ~320 Mbps | 125 µs (microframe) |
| USB 3.2 Gen 1 | 5 Gbps | ~3.2 Gbps | 125 µs |
| MIPI CSI-2 D-PHY (4 lanes × 2.5 Gbps) | 10 Gbps | ~8 Gbps | line-rate |
| PCIe Gen3 ×4 | 32 GT/s | ~3.94 GB/s | <1 µs |
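
The Classic CAN row can be reproduced from the frame layout. The sketch below assumes a standard 11-bit identifier, counts the 3-bit interframe space as part of the frame, and uses the usual worst-case stuff-bit bound; the function names are illustrative, not taken from any library.

```python
def classic_can_frame_bits(payload_bytes: int) -> int:
    """Worst-case bit count of a classic CAN data frame with an 11-bit ID,
    including stuff bits and the 3-bit interframe space."""
    if not 0 <= payload_bytes <= 8:
        raise ValueError("classic CAN carries 0 to 8 payload bytes")
    data_bits = 8 * payload_bytes
    # SOF, arbitration, control, CRC, delimiters, ACK, EOF, interframe space.
    fixed_bits = 47
    # Only SOF through the CRC sequence (34 bits plus data) is bit-stuffed.
    stuffable_bits = 34 + data_bits
    worst_case_stuff_bits = (stuffable_bits - 1) // 4
    return fixed_bits + data_bits + worst_case_stuff_bits


def frame_time_us(payload_bytes: int, bitrate_bps: int) -> float:
    """Time on the wire for one worst-case frame, in microseconds."""
    return classic_can_frame_bits(payload_bytes) * 1e6 / bitrate_bps


def payload_rate_bps(payload_bytes: int, bitrate_bps: int) -> float:
    """Payload throughput with back-to-back worst-case frames."""
    return 8 * payload_bytes * bitrate_bps / classic_can_frame_bits(payload_bytes)


if __name__ == "__main__":
    print(classic_can_frame_bits(8))               # 135 bits
    print(frame_time_us(8, 1_000_000))             # 135.0 µs
    print(round(payload_rate_bps(8, 1_000_000)))   # 474074 bps
```

With an 8-byte payload at 1 Mbps this gives 135 bits, 135 µs, and roughly 474 kbps of payload, which matches the table. Typical frames carry fewer stuff bits, so real traffic sits somewhat above this floor.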
8. Failure modes & debugging
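CAN bus-off. Each CAN controller (not the transceiver) keeps a transmit error counter (TEC) that rises by 8 on every transmit error and falls by 1 on every successful transmission. When the TEC exceeds 255 the node goes bus-off and stops transmitting until it is reset or completes the bus-off recovery sequence. Caused by: dominant-bit short, missing termination, mismatched bitrate, electrical noise, or a single faulty transmitter swamping the bus. Diagnose with a CAN analyser (PCAN-View error frame counter).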
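The counter arithmetic explains why bus-off arrives so quickly on a broken bus. This is a deliberately simplified sketch of the transmit error counter: +8 per failed transmission, -1 per success, bus-off above 255. Real controllers apply a few exceptions to these rules, and the function name is hypothetical.

```python
def transmissions_until_bus_off(outcomes):
    """Walk a sequence of transmit outcomes (True = success, False = error).

    Returns the 1-based index of the transmission that pushes the node
    into bus-off, or None if the sequence ends first.
    """
    tec = 0  # transmit error counter
    for index, succeeded in enumerate(outcomes, start=1):
        if succeeded:
            tec = max(tec - 1, 0)
        else:
            tec += 8
        if tec > 255:
            return index
    return None


if __name__ == "__main__":
    # 32 consecutive failures are enough: 32 * 8 = 256 > 255.
    print(transmissions_until_bus_off([False] * 32))
    # One failure per eight successes never accumulates.
    print(transmissions_until_bus_off(([False] + [True] * 8) * 1000))
```

The asymmetry (+8 / -1) means a node only stays healthy if fewer than roughly one in nine of its transmissions fails.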
Missing or wrong termination. CAN/RS-485 need 120 Ω at each physical end of the bus — not one in the middle. Common error: a star wiring with 120 Ω at the hub plus stubs left dangling. Use a multimeter across CAN_H – CAN_L with bus powered down: should read 60 Ω (two 120 Ω in parallel).
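The multimeter reading maps directly onto the number of terminators fitted. A small sketch, assuming ideal 120 Ω resistors and ignoring cable resistance and transceiver input impedance (both shift real readings slightly); the helper names and the 15 % tolerance band are assumptions.

```python
def parallel(*resistances_ohms):
    """Equivalent resistance of resistors wired in parallel."""
    return 1.0 / sum(1.0 / r for r in resistances_ohms)


def diagnose_termination(measured_ohms, terminator_ohms=120.0, tolerance=0.15):
    """Interpret a powered-down CAN_H to CAN_L resistance reading."""
    verdicts = {
        1: "one terminator missing",
        2: "ok: two terminators",
        3: "one terminator too many",
    }
    for count, verdict in verdicts.items():
        expected = parallel(*[terminator_ohms] * count)
        if abs(measured_ohms - expected) <= tolerance * expected:
            return verdict
    return "unexpected reading: check for opens, shorts, or wrong resistor values"


if __name__ == "__main__":
    for reading in (60.0, 120.0, 40.0, 5.0):
        print(reading, "->", diagnose_termination(reading))
```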
Ground bounce / common-mode shift between two ends of a long bus. Manifests as random errors on long cable runs even when termination is correct. Solution: galvanically isolated PHYs at both ends or a third (ground) wire alongside the differential pair.
CAN priority inversion. A low-priority frame, once started, finishes; a higher-priority frame queued after the start of TX must wait. Worst-case latency for a frame of priority p = T_longest-lower-priority-frame (blocking) + ΣT_higher-priority-frames-in-window + its own transmission time. Diagnose by time-stamped CAN trace showing E-stop messages with 2+ ms tail latency.
Bus loading >80 %. Arbitration delay grows sharply with load: even at 70 % the jitter on low-priority frames becomes severe. Diagnose with a utilisation calculator (for example canbusload from can-utils) or Vector / PEAK’s bus-load tool.
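The latency expression can be evaluated with a fixed-point iteration. The sketch below is a simplified form of the classical CAN response-time analysis: it assumes strictly periodic frames, no release jitter, no error retransmissions, and deadlines equal to periods, so treat the result as an estimate rather than a certified bound. The frame set and function name are hypothetical.

```python
import math

# Hypothetical frame set at 1 Mbps, highest priority first:
# (period in µs, worst-case transmission time in µs).
FRAMES = [
    (10_000, 135),  # E-stop status
    (1_000, 135),   # joint setpoints A
    (1_000, 135),   # joint setpoints B
    (5_000, 135),   # diagnostics
]


def worst_case_response_us(frames, index, bit_time_us):
    """Estimate the worst-case queuing plus transmission time of frames[index].

    Returns None if the frame cannot complete within its own period.
    """
    period, own_tx = frames[index]
    # Blocking: the longest lower-priority frame that may already be on the wire.
    blocking = max((tx for _, tx in frames[index + 1:]), default=0)
    higher = frames[:index]
    wait = blocking
    while True:
        # Higher-priority frames released while we wait win arbitration first.
        interference = sum(
            math.ceil((wait + bit_time_us) / t) * tx for t, tx in higher
        )
        new_wait = blocking + interference
        if new_wait + own_tx > period:
            return None
        if new_wait == wait:
            return wait + own_tx
        wait = new_wait


if __name__ == "__main__":
    for i in range(len(FRAMES)):
        print(i, worst_case_response_us(FRAMES, i, bit_time_us=1.0))
```

Even the highest-priority frame pays one full frame of blocking (270 µs here instead of 135 µs), which is the effect the paragraph above describes.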
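Bus load is easy to estimate before any hardware exists. A minimal sketch, assuming every frame is a worst-case 8-byte classic CAN frame of about 135 bits (the figure behind the 135 µs entry in section 7.7); the traffic mix and the `bus_load` helper are hypothetical.

```python
def bus_load(frames, bitrate_bps):
    """Fraction of bus time consumed.

    frames: iterable of (bits_per_frame, frames_per_second) pairs.
    """
    return sum(bits * rate for bits, rate in frames) / bitrate_bps


if __name__ == "__main__":
    worst_case_bits = 135  # assumed worst-case 8-byte classic CAN frame
    # Four 1 kHz setpoint frames plus four 100 Hz status frames at 1 Mbps.
    traffic = [(worst_case_bits, 1000)] * 4 + [(worst_case_bits, 100)] * 4
    load = bus_load(traffic, 1_000_000)
    print(f"{load:.1%}")
    if load > 0.7:
        print("warning: expect heavy jitter on low-priority frames")
```

A design that already sits near 60 % on paper leaves little room for retransmissions or for frames added later in the field.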
CANopen address conflict. Two nodes with the same Node-ID emit conflicting heartbeats. Fix: LSS (Layer Setting Services) commissioning to assign IDs at production; or unique-ID DIP switches on each node.
EtherCAT slave dropout / “Lost frame”. Causes: cable break, marginal connector, DC sync lost, ESC overheated. TwinCAT shows working-counter mismatch. Walk the chain with an Ethernet cable tester or replace one cable at a time.
EtherCAT Distributed Clock drift. Caused by the reference-clock slave’s local oscillator drifting; the effect is worse if the master’s own clock is drifting as well. Fix: pick the most stable slave as the reference clock (for example a drive with an internal TCXO).
PROFINET / PROFIsafe checksum mismatch on watchdog F-frames. SIL 3 chain halts. Usually a wiring or address-conflict issue; re-run engineering with TIA Portal HW Config.
DDS discovery latency in ROS 2 with 50+ nodes. Fast DDS discovers via UDP multicast; on a large network with multicast filtering this stalls. Fix: switch to Discovery Server mode (fastdds discovery -i 0) or move to Cyclone DDS, which has a leaner discovery; or use static discovery files.
Network multicast storm with industrial Ethernet on a managed switch — RT frames get lost in IGMP storm. Enable IGMP snooping; isolate real-time VLAN; if using TSN, configure time-aware shaper.
Wireshark “Malformed packet” on EtherCAT / PROFINET. Capture taken downstream of a managed switch that strips VLAN tags or fragments frames. Use a passive TAP (Profitap, NETSCOUT) instead of a port mirror.
Wrong baud rate. A node configured for 250 kbps on a 500 kbps bus cannot decode any frame: in listen-only mode it simply receives nothing, and in normal mode it raises error frames that disturb every other node and can push nodes towards error-passive or bus-off. Logic analyser shows the bit time mismatch. CANopen has no mandatory autodetection; many drives default to 500 kbps.
Cable break. Fault-tolerant low-speed CAN PHYs (ISO 11898-3) can fall back to single-wire operation; high-speed CAN cannot. EtherCAT slaves downstream of the break go silent, but ring-redundant configurations (cable redundancy in TwinCAT) tolerate one break. Always design ring redundancy if a break is operationally intolerable.
I²C clock-stretch hang. A slave holds SCL low waiting for an internal operation; if it hangs there (firmware bug, latch-up), the whole bus hangs, and only resetting or power-cycling that slave frees it. The closely related hang is a slave left holding SDA low mid-byte after an interrupted transfer. Master must implement timeout + bus recovery (clock SCL up to 9 times until SDA is released, then issue STOP). STM32 HAL has bug-prone I²C drivers; LL drivers or DMA mode are more reliable.
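The recovery sequence can be sketched against a simulated bus. The code below covers the case where a slave is holding SDA low and needs clock pulses to finish shifting out its byte; `SimulatedStuckBus` is a hypothetical stand-in for two open-drain GPIO lines, and on real hardware the same steps would be bit-banged with the I²C peripheral temporarily disabled.

```python
class SimulatedStuckBus:
    """Hypothetical stand-in for SCL/SDA GPIO access.

    The simulated slave releases SDA after `clocks_needed` SCL pulses.
    """

    def __init__(self, clocks_needed):
        self.clocks_needed = clocks_needed
        self.scl_pulses = 0
        self.stop_sent = False

    def sda_is_high(self):
        return self.scl_pulses >= self.clocks_needed

    def pulse_scl(self):
        self.scl_pulses += 1

    def send_stop(self):
        self.stop_sent = True


def recover_i2c_bus(bus, max_pulses=9):
    """Clock SCL until SDA is released (at most max_pulses), then issue STOP.

    Returns True if the bus was freed, False if the slave never let go.
    """
    for _ in range(max_pulses):
        if bus.sda_is_high():
            break
        bus.pulse_scl()
    released = bus.sda_is_high()
    if released:
        bus.send_stop()
    return released


if __name__ == "__main__":
    bus = SimulatedStuckBus(clocks_needed=4)
    print(recover_i2c_bus(bus), bus.scl_pulses)
```

If the function returns False, the remaining options are a hardware reset line or a power switch on the sensor rail, which is worth planning for at schematic time.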
SPI bit-shift during interrupt. /CS deasserted mid-transaction by an unrelated ISR leaves the slave half-clocked. Always protect CS-bracketed SPI transactions from preemption, or use DMA with hardware-controlled CS.
5G URLLC packet loss during cell handover. Predictive handover with multi-connectivity (5G NR Dual Connectivity) reduces but does not eliminate gap. Hard real-time control over public 5G is not viable; private 5G in a single cell is.
Address conflict on a CANopen segment. Two devices respond to the same SDO, so the master sees inconsistent object dictionaries. Use LSS or unique-ID strapping; on J1939 networks the address-claim procedure solves the same problem.
EMC radiated emissions failure during CE / FCC certification. Common offenders: un-twisted CAN/RS-485 pair, ungrounded cable shield, missing common-mode chokes, switching power supply harmonics radiated through a long bus run. Fix at design time: route field-bus cables along grounded metal trays, ground shields at both ends through a bonded clamp, fit ferrites near cable entry.
ESD damage on field-bus PHYs after plugging a charged cable into a powered slave. Symptom: transceiver works initially then fails permanently after disconnect/reconnect. Use ESD-rated transceivers (TI ISO1042 → 5 kV HBM ESD-rated; NXP TJA1463 → 8 kV) and add TVS diodes (Bourns CDSOT23-T05C or Littelfuse SP3010 series) at every external connector.
Mismatched termination state during commissioning. A factory-built robot moved between buildings has cable lengths re-routed and the engineer forgets to re-enable termination jumpers on the new end node. The bus appears to work at low speeds but errors climb above 250 kbps. Always verify termination with a powered-off DMM reading across the differential pair.
ROS 2 multicast disabled by IT. Many corporate Wi-Fi networks block UDP multicast — Fast DDS / Cyclone DDS discovery fails silently across subnets. Fix: enable multicast for the robot VLAN, switch to Fast DDS Discovery Server, or move to rmw_zenoh.
EtherCAT slave stuck in PRE-OP. Configuration mismatch between master’s ENI file and slave’s actual SII (slave information interface) EEPROM. Usually because firmware was updated without re-exporting the ENI. Symptom: ec_config_init() succeeds but ec_statecheck() reports PRE-OP instead of OP. Re-export ENI from the slave’s XML ESI file, or rescan with TwinCAT.
Modbus RTU register-mapping mismatch. Vendors disagree on whether their documentation lists the 0-based address sent on the wire or the 1-based register number with its 4xxxx prefix. A “register 40001” in one document is “register 0” in another. Read the vendor’s specific manual; cross-check by reading a register with a known value (firmware version, device ID).
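The off-by-one is mechanical once the notation is identified. A minimal sketch, assuming the document uses the traditional 5-digit or 6-digit holding-register reference with a leading 4; the function name is hypothetical, and whether a given vendor manual follows this notation still has to be confirmed against a known register.

```python
def holding_register_to_protocol_address(reference: int) -> int:
    """Convert a 1-based '4xxxx' or '4xxxxx' holding-register reference
    to the 0-based address carried in the Modbus request."""
    text = str(reference)
    if len(text) not in (5, 6) or text[0] != "4":
        raise ValueError(f"{reference} is not a 4xxxx holding-register reference")
    number = int(text[1:])
    if number < 1:
        raise ValueError("register numbers in this notation start at 1")
    return number - 1


if __name__ == "__main__":
    for ref in (40001, 40100, 400001):
        print(ref, "->", holding_register_to_protocol_address(ref))
```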
9. Case studies
Universal Robots e-series joint architecture. Each joint is a self-contained module with motor, encoder, brake, two MCUs (control + safety) and an EtherCAT slave ASIC. Joints are daisy-chained on a single EtherCAT segment from the controller box at the base. Internal cycle 2 ms (500 Hz, the e-Series control rate); SafeBoundary safety functions implemented in EtherCAT FSoE (Safety over EtherCAT) at 4 ms watchdog. The tool flange exposes an RS-485 bus (usable for Modbus RTU) for grippers and tools; Modbus TCP and EtherNet/IP are offered at the controller. Source: UR Service Manual e-Series, Section 5 (Communication interfaces).
Boston Dynamics Spot. External interface is gRPC over HTTPS on the robot’s Wi-Fi / Ethernet — the Spot SDK is a Python / Java / C++ client that calls authenticated gRPC services (RobotCommand, RobotState, Image, etc.). Internal locomotion communication is proprietary; observably each leg has a dedicated control board talking to the body computer over a private high-speed bus. Documented in the Spot SDK Quickstart and bosdyn.api protobuf schema.
F1Tenth autonomous racing platform. 1/10-scale car with NVIDIA Jetson Xavier NX, Hokuyo LIDAR (Ethernet), VESC 6 MkVI motor driver (UART-over-USB CDC for setpoints, separate CAN for status), Realsense D435i depth camera (USB 3.0), ROS 2 Humble running Cyclone DDS over the local Linux loopback. The control loop runs at 50 Hz on the Jetson; motor current loop runs at 20 kHz on the VESC’s STM32F4 with FOC, fully internal — the USB link only carries 50 Hz setpoints. Documented in the F1TENTH Build Guide and the VESC firmware repository (Vedder/Benjamin).
ABB IRB 1100 / OmniCore C30. Industrial 4-kg-payload arm with an EtherCAT internal motion bus from the OmniCore controller to the six joint drives, plus PROFINET / EtherNet/IP / Modbus TCP options for cell integration. The teach pendant (FlexPendant) talks RobotWare RWS over WebSockets / HTTPS, the safety I/O runs CIP Safety or PROFIsafe depending on cell configuration. ABB documents the bus stack and timing in the OmniCore Application Manual: typical motion cycle is 4 ms inner-loop, 24 ms path generator, both running on the controller’s dedicated Intel x86 core under VxWorks RT.
Tesla Model S/3/Y vehicle network. Five physical CAN buses (chassis, powertrain, body, party, debug) on Model S; consolidated to FlexRay + 100BASE-T1 + CAN-FD on Model 3/Y. The central vehicle computer (Autopilot HW3/HW4) talks to camera and radar ECUs over GMSL2 (Maxim/ADI Gigabit Multimedia Serial Link, point-to-point 6 Gbps differential) and to body controllers over CAN-FD. Documented incrementally via Tesla service manuals + scan-tool community reverse engineering — illustrates the migration of automotive networks from many CAN buses to a switched Ethernet backbone with CAN-FD subnets.
MIT Mini Cheetah / MJBots quadruped. Each leg uses 3× MJBots qdd100 actuators with an on-board STM32G4 running the open-source moteus FOC firmware and a CAN-FD uplink (1 Mbps arbitration, 5 Mbps data phase) to a Raspberry Pi 4 / Jetson Nano running the main control loop. Each actuator runs its own 40 kHz current loop and 1 kHz torque loop internally; the central controller publishes joint setpoints over CAN-FD at 500 Hz – 1 kHz. The CAN-FD bus is the practical bottleneck: 12 actuators × 8 bytes × 8 bits × 1 kHz = 768 kbps of setpoint payload, which together with arbitration and framing overhead fits at 5 Mbps with margin. Described in Katz et al. “Mini Cheetah: A Platform for Pushing the Limits of Dynamic Quadruped Control” (ICRA 2019) and the MJBots qdd100 documentation.
Anymal C / ANYmal X (ANYbotics). Twelve series-elastic actuators (12-DoF quadruped) communicating with a central x86 onboard computer over EtherCAT at 400 Hz. The locomotion controller (typically based on ANYmal SDK / open-source anymal_research) runs the whole-body MPC at 400 Hz on a single thread pinned to a CPU under Xenomai or PREEMPT_RT. Higher-level perception (depth cameras over USB 3 / Ethernet, IMU over UART/SPI, GNSS) feeds in at 50–200 Hz over plain Ethernet + ROS 1/2. Described in Hutter et al., “ANYmal — a highly mobile and dynamic quadrupedal robot” (IROS 2016) and subsequent papers.
KUKA LBR iiwa cobot. 7-DoF arm; internal joint bus is Sercos III between the KUKA Sunrise cabinet and each joint controller. External interfaces include a Fast Robot Interface (FRI) over UDP/Ethernet at up to 1 kHz for low-latency external control, Robot Sensor Interface (RSI) over TCP, and SmartPad-to-cabinet over a proprietary KLI bus on Ethernet. Documented in KUKA System Software Sunrise.OS manuals; FRI is widely used in research (KUKA Sunrise FRI C++ client library).
10. Cross-references
- [[Robotics/motors-electric]]: what the field bus is actually controlling
- [[Robotics/sensors-pose-motion]]: IMU, encoder, ToF sensor connectivity (SPI, I²C, CAN, EtherCAT)
- [[Robotics/pid-control]]: the control loops whose bandwidth the bus must support
- [[Robotics/ros2-architecture]]: DDS, RMW, QoS at the application layer
- [[Robotics/safety-standards]]: FSoE, PROFIsafe, CIP Safety, ISO 13849 networked safety
- [[Robotics/power-systems]]: CAN current-sense reporting, smart-battery BMS over CAN/RS-485
- [[Engineering/digital-logic]]: physical-layer signalling, CMOS levels, differential lines
- [[Engineering/microcontrollers]]: on-chip peripheral availability (CAN-FD, EtherCAT slave)
- [[Engineering/realtime-embedded]]: PREEMPT_RT Linux, FreeRTOS, deterministic OS context
- [[Languages/Tier3/robotics-control]]: config dialects (URDF, SDF) referencing bus topology
- [[Languages/Tier3/ros2-robotics-config]]: XML / YAML for DDS QoS, RMW selection
11. Citations
- Pfeiffer, Ayre & Keydel, “Embedded Networking with CAN and CANopen”, Copperhill 2008.
- Etschberger, “Controller Area Network — Basics, Protocols, Chips and Applications”, IXXAT 2001.
- Voss, “A Comprehensible Guide to Controller Area Network”, 2nd ed, Copperhill 2008.
- ISO 11898-1:2024 Road vehicles — Controller Area Network (CAN) — Part 1: Data link layer and physical signalling.
- ISO 11898-2:2016 CAN — Part 2: High-speed medium access unit.
- CiA 301 v4.2.0:2011 CANopen application layer and communication profile.
- CiA 402 v4.0.0:2017 CANopen device profile for drives and motion control.
- ISO 11783-1:2017 Agricultural and forestry vehicles — Serial control and communications network (J1939 derivation).
- IEC 61158-3-12 / -4-12 / -5-12 / -6-12 (EtherCAT data link, application, FAL).
- IEC 61784-2:2019 Industrial communication networks — Profiles — Part 2: Additional fieldbus profiles for real-time networks.
- IEC 61784-3:2021 Functional safety on fieldbuses (FSoE, PROFIsafe, CIP Safety).
- IEEE 1588-2019 Precision Clock Synchronisation Protocol for Networked Measurement and Control Systems.
- IEEE 802.1AS-2020 Timing and Synchronisation for Time-Sensitive Applications (gPTP).
- IEEE 802.1Qbv-2015 Enhancements for Scheduled Traffic.
- IEEE 802.1CB-2017 Frame Replication and Elimination for Reliability.
- OMG DDS v1.4 (2015) Data Distribution Service for Real-Time Systems.
- ISO/IEC 20922:2016 MQTT v3.1.1 (and OASIS MQTT 5.0:2019).
- IEC 62541 series (2015–2023) OPC Unified Architecture.
- Macenski, Foote, Gerkey, Lalancette, Woodall, “Robot Operating System 2: Design, architecture, and uses in the wild”, Science Robotics 7(66), 2022, DOI 10.1126/scirobotics.abm6074.
- Open EtherCAT Society, SOEM (Simple Open EtherCAT Master) Documentation, github.com/OpenEtherCATsociety/SOEM.
- IgH EtherCAT Master for Linux, etherlab.org.
- Beckhoff Automation, TwinCAT 3 Documentation, infosys.beckhoff.com.
- eProsima, Fast DDS Documentation, fast-dds.docs.eprosima.com.
- Eclipse Cyclone DDS documentation, cyclonedds.io.
- PEAK-System, PCAN Family — User Manuals, peak-system.com.
- Kvaser, CAN Bus Protocol Tutorial, kvaser.com/can-protocol-tutorial.
- TI TCAN4550-Q1 datasheet (SLLSF91, 2021).
- NXP TJA1463 datasheet (rev 1, 2023).
- ST AN5348 Introduction to STM32 microcontrollers CAN-FD protocol, ST 2021.
- Microchip LAN9255 datasheet (DS00003822, 2022) — EtherCAT slave + Cortex-M4F.
- Texas Instruments AM6442 datasheet (SPRSP86, 2023).
- Bluetooth SIG, Bluetooth Core Specification v5.4, 2023.
- 3GPP TS 22.261 Rel-17, Service requirements for the 5G system; URLLC requirements.
- IEEE 802.15.4z-2020 (UWB ranging amendment).