Swarm Robotics — Self-Organization, Algorithms, Platforms, Field Deployments
Large numbers of relatively simple robots that achieve collective tasks through local interactions rather than centralized command — the engineering descendant of biological swarms (ants, bees, fish schools, bird flocks) and Reynolds-style emergent flocking. The decentralized formulation buys scalability (linear in agent count), flexibility (no single point of failure), robustness (degraded performance, not catastrophic), and parallelism, but pays in design difficulty (emergent behavior is hard to verify), bandwidth limits (gossip + neighbor-only messaging), and global-property guarantees (consensus, coverage, formation). The 2014-2026 wave moved swarm robotics from Kilobot demos (1024 robots, Rubenstein-Cornejo-Nagpal Science 2014) into commercial deployment — Amazon Robotics 500k+ AMRs in fulfillment centers, agricultural fleets at XAG + Carbon Robotics, drone light shows at Disney + Verity + Intel Skyport, and military loitering munitions (Switchblade, Lancet, Shahed-136). The algorithmic stack now spans classical (boids, Voronoi coverage, market-based task allocation), multi-agent RL (MADDPG, QMIX, MAPPO), and conflict-based multi-agent path finding (CBS, ICTS, M*).
See also
- swarm-robotics
- multirotor-design
- aerial-manipulation
- mobile-base-wheeled
- path-planning
- agricultural-robotics
- underwater-robotics
- rl-for-control
1. At a glance
A robotic swarm is a collection of autonomous agents whose collective behavior emerges from local sensing, local computation, and local communication. The defining contrasts:
- Decentralized vs. centralized. A centralized fleet (Amazon Kiva pre-2022, traditional multi-robot systems, MRS) has a global planner that schedules and coordinates. A swarm has no global planner: each agent’s policy reads only neighbors’ states. Real systems live on the continuum; Amazon Robotics 2022+ Proteus AMRs run mostly decentralized navigation with a centralized task-dispatch layer above.
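- Homogeneous vs. heterogeneous. Pure homogeneous swarms (Kilobot, RoboBee) all run the same firmware on identical hardware. TERMES (Werfel/Petersen/Nagpal, termite-inspired) also belongs here: its robots are identical and differ only in the local situation they sense. Heterogeneous swarms (e.g., Swarmanoid from the Dorigo lab, with foot-bots, hand-bots, and eye-bots) mix sensors / capabilities.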
- Scale. “Multi-robot” is 2-20 agents. “Swarm” is 100-10⁶. Above ~1000 agents, individual identity ceases to matter; you reason statistically — like a continuum field.
The trade-off framing introduced by Şahin (2005) and Brambilla-Ferrante-Birattari-Dorigo (2013) has four desired properties: scalability (performance does not degrade with agent count N), flexibility (works across task variations), robustness (graceful failure under agent loss), parallelism (collective speed exceeds single-agent). Decentralized algorithms naturally produce these; centralized ones do not as N grows. The cost of decentralization is harder global guarantees: proving that a flock will collectively reach consensus or cover an area requires algorithm-specific proofs (graph Laplacian connectivity, persistence-of-excitation, Lyapunov).
Where this sits. Multirotors are the most-deployed swarm platform; path-planning generalizes to multi-agent path finding (MAPF); agricultural sees the largest commercial swarms (XAG fleet ops); underwater swarms face exotic communication constraints (acoustic, not radio); RL for multi-agent (MARL) is the dominant 2024+ algorithmic frontier.
First ask.
- What’s the communication topology? Fully connected (small N, expensive RF) → centralized. Local broadcast (Kilobot IR, Crazyflie radio) → gossip + flocking. Global mesh (LTE / 5G) → hybrid.
- Static or dynamic environment? Static → roadmap + MAPF. Dynamic → reactive (boids, potential fields) or MARL.
- Is the task discrete or continuous? Pick-and-deliver → task allocation. Coverage, patrolling → Voronoi or anti-flocking. Formation flight → consensus + leader-follower.
- Are agents identical? If yes → single shared policy. If not → role-conditioned policies.
2. First principles
2.1 Reynolds boids (1987)
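Craig Reynolds’ SIGGRAPH paper “Flocks, herds, and schools: A distributed behavioral model” is the founding mathematical model of swarm behavior. Each agent $i$, with position $\mathbf{x}_i$ and velocity $\mathbf{v}_i$, applies three local rules over the set $\mathcal{N}_i$ of neighbors within sensing radius $r$ (written here in one common vector form):
Separation (steer to avoid crowding):
$$\mathbf{a}_i^{\mathrm{sep}} = -\sum_{j \in \mathcal{N}_i} \frac{\mathbf{x}_j - \mathbf{x}_i}{\|\mathbf{x}_j - \mathbf{x}_i\|^2}$$
Alignment (steer to average heading):
$$\mathbf{a}_i^{\mathrm{ali}} = \frac{1}{|\mathcal{N}_i|} \sum_{j \in \mathcal{N}_i} \mathbf{v}_j - \mathbf{v}_i$$
Cohesion (steer toward neighbor centroid):
$$\mathbf{a}_i^{\mathrm{coh}} = \frac{1}{|\mathcal{N}_i|} \sum_{j \in \mathcal{N}_i} \mathbf{x}_j - \mathbf{x}_i$$
The total command $\mathbf{u}_i = w_s \mathbf{a}_i^{\mathrm{sep}} + w_a \mathbf{a}_i^{\mathrm{ali}} + w_c \mathbf{a}_i^{\mathrm{coh}}$ produces emergent flocking with no global controller. Boids is still the substrate of most aerial light-show swarms (Verity, Intel Skyport).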
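The three rules can be sketched as a single update step. This is a minimal illustration, assuming 2D point agents, Euler integration, and a speed clamp; the function name `boids_step`, the weights, and the default radius are illustrative choices, not values from Reynolds' paper.

```python
import math

def boids_step(pos, vel, radius=5.0, w_sep=1.5, w_ali=1.0, w_coh=1.0,
               dt=0.1, v_max=2.0):
    """Advance every agent one step using separation, alignment, cohesion.

    pos, vel: lists of (x, y) tuples. All parameters are illustrative.
    """
    new_pos, new_vel = [], []
    for i, (pi, vi) in enumerate(zip(pos, vel)):
        sep = [0.0, 0.0]
        sum_v = [0.0, 0.0]
        sum_p = [0.0, 0.0]
        count = 0
        for j, (pj, vj) in enumerate(zip(pos, vel)):
            if i == j:
                continue
            dx, dy = pj[0] - pi[0], pj[1] - pi[1]
            d2 = dx * dx + dy * dy
            if d2 == 0.0 or d2 > radius * radius:
                continue  # coincident, or outside the sensing radius
            count += 1
            sep[0] -= dx / d2  # push away, stronger when closer
            sep[1] -= dy / d2
            sum_v[0] += vj[0]
            sum_v[1] += vj[1]
            sum_p[0] += pj[0]
            sum_p[1] += pj[1]
        ax = ay = 0.0
        if count:
            ali = (sum_v[0] / count - vi[0], sum_v[1] / count - vi[1])
            coh = (sum_p[0] / count - pi[0], sum_p[1] / count - pi[1])
            ax = w_sep * sep[0] + w_ali * ali[0] + w_coh * coh[0]
            ay = w_sep * sep[1] + w_ali * ali[1] + w_coh * coh[1]
        vx, vy = vi[0] + dt * ax, vi[1] + dt * ay
        speed = math.hypot(vx, vy)
        if speed > v_max:  # clamp to the maximum speed
            vx, vy = vx * v_max / speed, vy * v_max / speed
        new_vel.append((vx, vy))
        new_pos.append((pi[0] + dt * vx, pi[1] + dt * vy))
    return new_pos, new_vel
```

The neighbor search here is O(N²) per step, which is fine for a sketch; a real swarm evaluates it per agent from local sensing, and a simulator would use a spatial grid or k-d tree.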
2.2 Consensus and graph Laplacian
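A swarm reaching agreement on a quantity (heading, position, time) is a consensus problem. With state $x_i$ for agent $i$ and communication graph $G$ with Laplacian $L$:
$$\dot{x}_i = -\sum_{j \in \mathcal{N}_i} (x_i - x_j) \quad\Longleftrightarrow\quad \dot{\mathbf{x}} = -L\,\mathbf{x}$$
For an undirected graph, the consensus value $\bar{x} = \frac{1}{N}\sum_i x_i(0)$ is reached iff $G$ is connected; the convergence rate is set by the second-smallest eigenvalue $\lambda_2(L)$, the algebraic connectivity. The Olfati-Saber + Murray 2004 paper “Consensus problems in networks of agents with switching topology and time-delays” is the canonical reference.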
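A discrete-time version of the consensus update makes the connectivity condition concrete. The sketch below assumes an undirected graph given as an adjacency dict and a step size `eps` smaller than 1 / (maximum degree); the names `consensus_step` and `run_consensus` are illustrative.

```python
def consensus_step(x, neighbors, eps):
    """One synchronous round: x_i <- x_i + eps * sum_j (x_j - x_i)."""
    return [xi + eps * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)]

def run_consensus(x0, neighbors, eps=0.2, steps=200):
    """Iterate the update; eps must stay below 1 / max_degree for stability."""
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, neighbors, eps)
    return x

# Connected path graph 0-1-2-3: every state approaches the initial average.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
agreed = run_consensus([0.0, 4.0, 8.0, 20.0], path_graph)

# Two disconnected pairs: each component agrees only on its own average.
split_graph = {0: [1], 1: [0], 2: [3], 3: [2]}
split = run_consensus([0.0, 4.0, 8.0, 20.0], split_graph)
```

On the connected graph all four states approach 8.0, the mean of the initial values. On the split graph the components settle at 2.0 and 14.0, which is the failure mode a swarm hits when the communication graph partitions.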
2.3 Aggregation — BEECLUST (Kornienko-Schmickl-Hamann 2005)
Inspired by honeybee thermal aggregation: agents move randomly until they collide; on collision, wait a duration proportional to local environmental sensing (light, temperature). Agents aggregate at the optimum without communication. Used to demonstrate stigmergy (indirect coordination through environment) at the algorithmic level.
2.4 Dispersion / coverage — Voronoi-based (Cortés-Martínez-Karatas-Bullo 2004)
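For each agent $i$ at position $p_i$ in a region $Q$, define its Voronoi cell $V_i = \{ q \in Q : \|q - p_i\| \le \|q - p_j\| \ \forall j \ne i \}$. Each agent moves toward the centroid of its cell:
$$\dot{p}_i = -k\,(p_i - C_{V_i}), \qquad C_{V_i} = \frac{\int_{V_i} q\,\phi(q)\,dq}{\int_{V_i} \phi(q)\,dq}$$
where $\phi(q)$ is an importance density and $k > 0$ is a control gain. The swarm converges to a centroidal Voronoi tessellation, a locally optimal coverage of the region. Frazzoli + Pavone + Bullo extensions handle motion constraints and non-convex regions.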
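The move-to-centroid rule is Lloyd's algorithm, and it can be approximated on a set of sample points. This sketch assumes the region is represented by discrete samples and that each Voronoi cell is the set of samples nearest to an agent; `lloyd_step` and `coverage_cost` are illustrative names, not functions from any library.

```python
def _dist2(a, q):
    return (a[0] - q[0]) ** 2 + (a[1] - q[1]) ** 2

def lloyd_step(agents, samples, density=lambda q: 1.0, gain=1.0):
    """Move each agent toward the density-weighted centroid of its cell.

    The cell is approximated by assigning every sample to its nearest agent.
    """
    sums = [[0.0, 0.0, 0.0] for _ in agents]  # [sum w*x, sum w*y, sum w]
    for q in samples:
        i = min(range(len(agents)), key=lambda k: _dist2(agents[k], q))
        w = density(q)
        sums[i][0] += w * q[0]
        sums[i][1] += w * q[1]
        sums[i][2] += w
    moved = []
    for (px, py), (sx, sy, sw) in zip(agents, sums):
        if sw == 0.0:
            moved.append((px, py))  # empty cell: stay put
        else:
            cx, cy = sx / sw, sy / sw
            moved.append((px + gain * (cx - px), py + gain * (cy - py)))
    return moved

def coverage_cost(agents, samples, density=lambda q: 1.0):
    """Weighted sum of squared distances from each sample to its nearest agent."""
    return sum(density(q) * min(_dist2(a, q) for a in agents) for q in samples)

# Unit square sampled on a 20 x 20 grid; four agents start clustered in a corner.
samples = [((i + 0.5) / 20, (j + 0.5) / 20) for i in range(20) for j in range(20)]
agents = [(0.1, 0.1), (0.2, 0.1), (0.1, 0.2), (0.2, 0.2)]
for _ in range(15):
    agents = lloyd_step(agents, samples)
```

With `gain=1.0` each step never increases the coverage cost, which is the discrete analogue of the Lyapunov argument. A real agent would compute its cell from neighbor positions only, so the nearest-agent loop here stands in for that local computation.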
2.5 Foraging and central-place foraging (Saridis 1980s onward)
Agents leave a nest, search for resources, return with payload. Variants:
- Random walk + gradient descent. Lévy flights for search efficiency.
- Extremum seeking (Krstic + Wang 2000s). Inject a sinusoidal perturbation; correlate with measured payload to estimate the gradient direction.
- Pheromone trails. Ant Colony Optimization (Dorigo 1992) abstracted to robot pheromone via deposited markers or virtual fields.
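The Lévy-flight search mentioned above can be sketched with inverse-transform sampling. This assumes a pure power-law step length p(l) ∝ l^(-mu) for l ≥ l_min with a uniformly random heading; the names `levy_step` and `levy_walk` and the default exponent are illustrative.

```python
import math
import random

def levy_step(rng, l_min=1.0, mu=2.0):
    """Sample one 2D step whose length follows p(l) ~ l^(-mu), l >= l_min.

    Requires mu > 1. Uses inverse-transform sampling of the power law.
    """
    u = 1.0 - rng.random()  # u in (0, 1], avoids division by zero
    length = l_min * u ** (-1.0 / (mu - 1.0))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return length * math.cos(theta), length * math.sin(theta)

def levy_walk(n_steps, seed=0, l_min=1.0, mu=2.0):
    """Return the list of visited positions, starting at the nest (0, 0)."""
    rng = random.Random(seed)
    x = y = 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        dx, dy = levy_step(rng, l_min, mu)
        x, y = x + dx, y + dy
        path.append((x, y))
    return path
```

Most steps stay close to `l_min`, with occasional long relocations; that mix is what makes the walk efficient for sparse targets compared with a fixed-step random walk.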
2.6 Collective transport — Kube-Bonabeau, Rubenstein
Push-only multi-robot box pushing was the canonical task in the 1990s (Kube + Bonabeau 2000). Modern: Rubenstein and colleagues extended collective transport to large groups of simple robots that move a target object through pure local interactions, with no central coordinator and no global frame. (The 1024-Kilobot Science 2014 result by Rubenstein-Cornejo-Nagpal is the self-assembly demonstration covered in 2.7, not a transport experiment.) Each agent’s behavior is selected from a small finite-state machine triggered by neighbor messages.
2.7 Pattern formation — Rubenstein-Cornejo-Nagpal (Harvard 2014)
The cited Science paper “Programmable self-assembly in a thousand-robot swarm” demonstrated 1024 Kilobots self-assembling into user-specified 2D shapes (letter K, wrench, starfish). Each Kilobot ran the same algorithm:
- Four seed robots establish a global coordinate frame via gradient-propagation (analogous to morphogen gradients in embryology).
- Each Kilobot reads neighbor positions, computes its position in the global frame.
- If its position is inside the target shape, stop. Otherwise, walk along the boundary clockwise until reaching an unfilled inside cell.
The algorithm is fully decentralized, scales to N robots in O(N²) total movement, and is the canonical proof that distributed shape formation works at large N.
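The gradient-propagation step in the list above can be sketched in isolation. This covers only the hop-count gradient, not the localization or edge-following parts of the published algorithm; the adjacency dict and the name `propagate_gradient` are assumptions for illustration.

```python
def propagate_gradient(neighbors, seeds):
    """Hop-count gradient: seeds hold 0, every other robot repeatedly sets
    its value to 1 + the minimum value heard from its neighbors.

    neighbors: dict robot -> list of robots within communication range.
    Returns (values, rounds), where rounds counts synchronous message rounds.
    """
    inf = float("inf")
    value = {r: (0 if r in seeds else inf) for r in neighbors}
    rounds = 0
    changed = True
    while changed:
        changed = False
        heard = dict(value)  # messages broadcast in the previous round
        for r in neighbors:
            if r in seeds:
                continue
            best = min((heard[n] for n in neighbors[r]), default=inf) + 1
            if best < value[r]:
                value[r] = best
                changed = True
        rounds += 1
    return value, rounds

# Five robots in a line, with robot 0 acting as the seed.
line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
values, rounds = propagate_gradient(line, seeds={0})
```

Each robot uses only its neighbors' last broadcast, so the same rule runs unchanged at any swarm size; the number of rounds grows with the graph diameter rather than with N directly.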
2.8 Task allocation — market-based and threshold-based
Response thresholds (Bonabeau-Theraulaz-Deneubourg 1996, from social insects). Each agent $i$ has a threshold $\theta_{ij}$ for each task type $j$ and performs the task if the stimulus $s_j$ exceeds $\theta_{ij}$. Thresholds adapt: frequent execution lowers the threshold, neglect raises it.
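A deterministic version of the threshold rule, with the adaptation described above, might look like the following. The learning and forgetting rates, the clamping range, and the name `threshold_step` are illustrative assumptions; the literature often uses a probabilistic response function instead of a hard comparison.

```python
def threshold_step(thresholds, stimulus, learn=0.1, forget=0.05, lo=0.0, hi=1.0):
    """One decision round for a single agent.

    thresholds, stimulus: dicts mapping task name -> value.
    Returns (chosen task or None, updated thresholds).
    """
    # A task is eligible when its stimulus exceeds the agent's threshold.
    eligible = [t for t in stimulus if stimulus[t] > thresholds[t]]
    chosen = (max(eligible, key=lambda t: stimulus[t] - thresholds[t])
              if eligible else None)
    updated = {}
    for task, theta in thresholds.items():
        # Performing a task lowers its threshold; neglecting it raises it.
        theta = theta - learn if task == chosen else theta + forget
        updated[task] = min(hi, max(lo, theta))
    return chosen, updated

task, new_thresholds = threshold_step(
    {"forage": 0.5, "guard": 0.5},
    {"forage": 0.8, "guard": 0.6},
)
```

Repeated over many rounds, agents that happen to forage early become specialists in foraging, which is how division of labor emerges without any explicit role assignment.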
Auction / market-based (Gerkey-Matarić 2002 MURDOCH; Dias-Stentz TraderBots; Kalra-Ferguson-Stentz Hoplites). Agents bid on tasks; the lowest-cost bidder wins. For independent single-task assignments with truthful bids, this recovers the optimal allocation. Variant: CBBA (Choi-Brunet-How 2009), the Consensus-Based Bundle Algorithm, handles bundle bids efficiently.
Gini coefficient of task balance — a fairness metric for swarms; values 0 (perfect balance) to 1 (one agent does all work). Used to evaluate allocation algorithms.
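The Gini coefficient can be computed directly from per-agent workloads. This is a sketch using the standard sorted-rank formula; the function name `gini` is illustrative.

```python
def gini(workloads):
    """Gini coefficient of non-negative per-agent workloads.

    Uses G = 2 * sum_i(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted ascending and ranks i = 1..n.
    """
    xs = sorted(workloads)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no agents or no work: treat as perfectly balanced
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

balanced = gini([5, 5, 5, 5])     # every agent does the same work
one_agent = gini([0, 0, 0, 20])   # a single agent does everything
```

For a finite swarm the maximum is (n - 1) / n rather than exactly 1, so four agents with one doing all the work score 0.75; the value approaches 1 only as the swarm grows.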
2.9 Multi-Agent Path Finding (MAPF)
The combinatorial problem: N agents on a graph, each with start + goal, find collision-free paths minimizing total time or total cost. NP-hard to solve optimally in general, but often tractable in practice with strong heuristics and bounded-suboptimal solvers.
- CBS — Conflict-Based Search (Sharon-Stern-Felner-Sturtevant 2015). High-level search over a conflict tree; low-level A* for each agent. Optimal; scales to ~100 agents on typical warehouse maps.
- ICTS: Increasing Cost Tree Search (Sharon-Stern-Goldenberg-Felner 2013). Searches over per-agent cost vectors. Optimal, alternative to CBS.
- M* (Wagner-Choset 2015). Subdimensional expansion: only enumerate joint states where agents are in conflict. Cost grows with the size of the interacting groups rather than with N, so it scales well on sparse maps.
- EECBS (Li 2021). Bounded-suboptimal CBS variant; scales to ~1000 agents.
- PIBT (Okumura-Machida-Défago-Tamura 2019). Priority Inheritance with Backtracking — anytime decentralized MAPF. Used at Amazon Robotics for live re-planning.
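To make the space-time formulation concrete, here is a sketch of prioritized planning, a simpler technique than any solver in the list above: agents plan one at a time with space-time A* and later agents avoid the reservations of earlier ones. It is neither optimal nor complete and is not CBS or PIBT. The grid encoding (0 = free, 1 = obstacle), the `horizon` cap, and all function names are assumptions.

```python
import heapq

def plan_one(grid, start, goal, reserved_v, reserved_e, horizon=64):
    """Space-time A* for one agent on a 4-connected grid with a wait action.

    reserved_v holds (cell, t) entries; reserved_e holds (from, to, t) moves.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):  # Manhattan distance heuristic
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start, (start,))]
    seen = set()
    while open_heap:
        _, t, cell, path = heapq.heappop(open_heap)
        # Accept the goal only if the agent can park there until the horizon.
        if cell == goal and all((goal, k) not in reserved_v
                                for k in range(t, horizon + 1)):
            return list(path)
        if (cell, t) in seen or t >= horizon:
            continue
        seen.add((cell, t))
        for dr, dc in ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]]:
                continue  # obstacle
            if (nxt, t + 1) in reserved_v or (nxt, cell, t) in reserved_e:
                continue  # vertex conflict or head-on swap
            heapq.heappush(open_heap,
                           (t + 1 + h(nxt), t + 1, nxt, path + (nxt,)))
    return None

def prioritized_plan(grid, starts, goals, horizon=64):
    """Plan agents in list order; return a list of paths, or None on failure."""
    reserved_v, reserved_e, paths = set(), set(), []
    for start, goal in zip(starts, goals):
        path = plan_one(grid, start, goal, reserved_v, reserved_e, horizon)
        if path is None:
            return None
        for t, cell in enumerate(path):
            reserved_v.add((cell, t))
            if t + 1 < len(path):
                reserved_e.add((cell, path[t + 1], t))
        for t in range(len(path), horizon + 1):  # agent parks at its goal
            reserved_v.add((path[-1], t))
        paths.append(path)
    return paths

# Two agents whose shortest paths cross at the center of a 3 x 3 grid.
grid = [[0] * 3 for _ in range(3)]
paths = prioritized_plan(grid, [(1, 0), (0, 1)], [(1, 2), (2, 1)])
```

In this example the first agent takes its shortest path and the second waits one step before crossing. CBS differs by resolving such conflicts through a search over constraints on both agents, which is what gives it optimality.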
2.10 Multi-Agent Reinforcement Learning (MARL)
- MADDPG (Lowe-Wu-Tamar-Harb-Abbeel-Mordatch 2017) — Multi-Agent Deep Deterministic Policy Gradient. Centralized critic, decentralized actor — known as the CTDE (Centralized Training, Decentralized Execution) paradigm.
- QMIX (Rashid-Samvelyan-de Witt-Farquhar-Foerster-Whiteson 2018). Monotonic mixing network combines per-agent Q-values into joint Q-value. Standard for cooperative discrete-action swarms (StarCraft II micromanagement).
- MAPPO (Yu-Velu-Vinitsky-Gao-Wang-Bayen-Wu 2022). PPO with shared policy + centralized value. Surprisingly competitive baseline; standard 2022+.
- VDN — Value Decomposition Networks (Sunehag-Lever-Gruslys 2018). Sum-of-Q decomposition; predecessor to QMIX.
- HASAC (Heterogeneous-Agent Soft Actor-Critic, Liu et al. 2024). Maximum-entropy MARL designed for heterogeneous agents that do not share a policy.
- AlphaStar league learning (Vinyals-Babuschkin-Czarnecki 2019). Multi-policy population training. Adapted to swarm settings for asymmetric games.
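The value-decomposition idea behind VDN and QMIX can be shown with plain tables. The sketch below covers only the additive VDN case with hand-written Q-values, assuming small discrete action sets; the function names and numbers are illustrative, and no learning is involved.

```python
from itertools import product

def vdn_joint_q(per_agent_q, joint_action):
    """VDN decomposition: Q_tot(a_1, ..., a_n) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))

def decentralized_greedy(per_agent_q):
    """Each agent picks its own argmax using only its local Q-values."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)

def centralized_greedy(per_agent_q):
    """Brute-force argmax of Q_tot over the full joint action space."""
    joint_actions = product(*(range(len(q)) for q in per_agent_q))
    return max(joint_actions, key=lambda a: vdn_joint_q(per_agent_q, a))

# Three agents with 3, 2, and 4 discrete actions (values are made up).
q_tables = [
    [0.1, 0.7, 0.2],
    [1.0, -0.5],
    [0.3, 0.4, 0.9, 0.0],
]
local_choice = decentralized_greedy(q_tables)
joint_choice = centralized_greedy(q_tables)
```

Both selections agree, which is why execution can be decentralized even though training uses the joint value. QMIX keeps this property while replacing the sum with a learned mixing network that is monotonic in each agent's Q-value.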
3. Practical math — sizing and performance
3.1 Communication scaling
In a fully-connected radio network, message complexity is O(N²) per round, which becomes infeasible above ~50 agents at 2.4 GHz. Practical swarms use:
- Local broadcast. Range $r$; messages reach $k \approx \rho \pi r^2$ neighbors (density $\rho$). Round complexity: N broadcasts, O(Nk) receptions.
- Gossip. Each agent forwards a message to $k$ neighbors per round. Information spreads in O(log N) rounds.
- Pheromone / stigmergy. Communication via the environment. Bandwidth proportional to environment area, not agent count.
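The contrast between all-to-all messaging and gossip can be checked with a small simulation. This sketch assumes push gossip in which each informed agent contacts a uniformly random peer, which presumes any-to-any reachability rather than a strict neighbor graph; the names `full_mesh_messages` and `gossip_rounds` are illustrative.

```python
import random

def full_mesh_messages(n):
    """Messages per round if every agent sends to every other agent."""
    return n * (n - 1)

def gossip_rounds(n, fanout=1, seed=0):
    """Rounds until a rumor started at agent 0 reaches all n agents.

    Push gossip: each informed agent contacts `fanout` uniformly random
    agents per round (an assumption; real swarms contact neighbors in range).
    """
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n:
        newly = set()
        for _ in informed:
            for _ in range(fanout):
                newly.add(rng.randrange(n))
        informed |= newly
        rounds += 1
    return rounds

mesh_cost_50 = full_mesh_messages(50)   # 2450 messages per round
rounds_1000 = gossip_rounds(1000)       # grows roughly like log N
```

With a fanout of 1 the informed set can at most double each round, so at least log2(N) rounds are needed; the simulation typically finishes within a small multiple of that.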
3.2 Voronoi coverage convergence
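Cortés et al. proved exponential convergence to centroidal Voronoi tessellation:
$$\|p(t) - p^*\| \le c\, e^{-\lambda t}$$
where $p^*$ is the limiting configuration and the rate $\lambda$ depends on the density $\phi$. Typical convergence takes 10-100 control cycles.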
3.3 Drone light-show staging
A 1000-drone light show requires:
- Crazyflie- or Verity-class drones.
- 5 GHz RF mesh or proprietary downlink at ~1 Mbps per agent.
- Centralized choreographer assigns waypoints; onboard decentralized collision-avoidance.
- Battery life ~12 min for 250g class.
- Setup + recovery: 30 min per show with semi-automated landing pads.
- Cost per show: $50k-200k (Verity AB, Intel Skyport, EHang, SkyMagic).
3.4 Warehouse robot density
Amazon fulfillment center scale:
- ~500-1000 AMRs per building (Hercules + Pegasus + Proteus mix).
- ~100 m × 100 m floor.
- Density: ~0.1 robot/m² average; peaks ~0.5 robot/m² at induct stations.
- Network: 802.11n at 5 GHz; centralized fleet manager + on-board PIBT-style replanning.
- Throughput: ~1000-2000 picks/hr per pod station.
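The density figures above follow from simple arithmetic, sketched here with the numbers from the list. The helper names `robot_density` and `mean_spacing` are illustrative, and the spacing estimate assumes robots spread uniformly over the floor.

```python
import math

def robot_density(n_robots, length_m, width_m):
    """Average robots per square meter on a rectangular floor."""
    return n_robots / (length_m * width_m)

def mean_spacing(density):
    """Approximate center-to-center spacing in meters for a uniform layout."""
    return 1.0 / math.sqrt(density)

low = robot_density(500, 100, 100)     # 0.05 robot/m^2
high = robot_density(1000, 100, 100)   # 0.1 robot/m^2
avg_gap = mean_spacing(high)           # about 3.2 m between robots
peak_gap = mean_spacing(0.5)           # about 1.4 m at induct stations
```

At the peak density near induct stations the spacing falls to roughly one robot footprint plus clearance, which is where replanning load and gridlock risk concentrate.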
3.5 MAPF computational scaling
For CBS on a 100×100 grid with N agents:
| N | Solve time (CBS) | Solve time (EECBS, 5% suboptimal) |
|---|---|---|
| 10 | 0.1 s | 0.05 s |
| 50 | 2 s | 0.5 s |
| 100 | 30 s | 3 s |
| 500 | timeout | 30 s |
| 1000 | timeout | 5 min |
Live replanning at Amazon scale needs ~1 Hz response → bounded-suboptimal anytime methods (EECBS, PIBT).
3.6 MARL training scaling
QMIX on a StarCraft II 5v5 micromanagement task:
- 50M steps in 8 hours on single A100.
- 5 agents × shared parameters = 1 policy network.
- 50% win rate after 5M steps; 90% after 50M.
MAPPO same task: 30M steps to 90% win rate, faster wall-clock due to better PPO sample efficiency.
4. Design heuristics
- Start with the simplest algorithm. Boids works for many flocking applications. Reach for MARL only when scripted behavior measurably fails.
- Local sensing beats global. A swarm reliant on GPS / SLAM bottlenecks at the localization layer. Use neighbor-relative observations (vision tags, UWB ranging, IR proximity).
- Bandwidth is the budget. Most swarm algorithms send O(1) messages per agent per round. If your message size scales with N, redesign.
- Test at 1, 10, 100, 1000. Behaviors that work for 10 agents often fail at 100 (collisions) or 1000 (rounding errors, GPS noise). Scale up in 10× steps.
- Failure is normal. Design for 10% agent failure per mission. Either heal (gradient repair, role reassignment) or accept degraded performance.
- Beware emergent oscillation. Coupling between sensing + action across many agents produces unexpected limit cycles. Add damping; verify in simulation across density ranges.
- Sim-to-real for swarms. Physical issues that don’t matter for 1 robot (radio collisions, charge stations, ambient lighting confusion) dominate at 1000. Sim must include these.
- Centralized for safety, decentralized for performance. A swarm light show needs centralized safety override (geofence, e-stop) even with decentralized choreography. Hybrid architectures are normal.
- Identify the bottleneck operator early. Above ~100 agents, monitoring all agents exceeds operator bandwidth. Aggregate metrics (mission progress, failure count) + exception alerts.
- Build in graceful degradation. Loss of GPS → fall back to relative localization. Loss of comms → fall back to last-commanded behavior. Loss of fleet manager → continue autonomous mission.
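The graceful-degradation rule in the last bullet can be written as a small selection function. The health flags, mode strings, and names `Health` and `degraded_plan` below are hypothetical; a real stack would add timeouts and hysteresis around each flag.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Health:
    gps_ok: bool
    comms_ok: bool
    fleet_manager_ok: bool

def degraded_plan(health):
    """Pick a localization source and a behavior source from subsystem health.

    Returns a (localization, behavior) pair of mode names.
    """
    # Loss of GPS: fall back to neighbor-relative localization.
    localization = "gps" if health.gps_ok else "relative"
    if not health.comms_ok:
        # Loss of comms: keep executing the last commanded behavior.
        behavior = "last_commanded"
    elif not health.fleet_manager_ok:
        # Comms are up but the fleet manager is down: continue autonomously.
        behavior = "autonomous_mission"
    else:
        behavior = "fleet_dispatch"
    return localization, behavior

nominal = degraded_plan(Health(True, True, True))
no_gps = degraded_plan(Health(False, True, True))
```

Keeping the fallback logic as a pure function of health flags makes it easy to test every combination exhaustively before a mission.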
5. Components & sourcing — platforms
5.1 Ground-based swarm platforms
| Platform | Vendor / Lab | Specs | Cost | Use |
|---|---|---|---|---|
| Kilobot | Rubenstein + Nagpal, Harvard; K-Team SA produces | 33 mm, vibrating legs, IR comms | $14 / unit | Massive-scale algorithm research |
| Pheeno | Sci-Bot / ASU | 75 mm, differential drive, IR + radio | ~$300 | Affordable classroom swarm |
| Khepera IV | K-Team SA | 140 mm, full sensor suite + camera | ~$3000 | Research-grade |
| e-puck2 | EPFL spin-off (GCtronic) | 70 mm, camera + IR + WiFi | ~$800 | Standard EU research |
| Pi-puck | Sheffield ext. of e-puck2 | + Raspberry Pi compute | ~$1000 | ROS-capable e-puck |
| Jasmine | Schmickl-Hamann | 23 mm, IR comms | open-source | Smallest swarm robot |
| Marxbot / Foot-Bot | IRIDIA ULB (Dorigo lab) | 17 cm wheeled | research | Standard for ARGoS work |
| Mona | Manchester | 60 mm wheeled + IR | ~$100 | Cheap research platform |
| TurtleBot 3 | Open Robotics + ROBOTIS | 14 cm, ROS native | ~$700 | Small-N collaborative tasks |
5.2 Aerial swarm platforms
| Platform | Vendor | Specs | Use |
|---|---|---|---|
| Crazyflie 2.1 / 2.2 | Bitcraze (SE) | 27 g nano-quad, open-source | Research swarms (1-100) |
| Crazyflie Loco | Bitcraze | + UWB positioning | Indoor swarm localization |
| Verity Studios drones | Verity AB (CH) | Show drones, integrated platform | Cirque du Soleil, sports events |
| Intel Shooting Star | Intel Skyport | 330 g, light-show grade | 2018 Olympics 1218-drone show |
| EHang | EHang (CN) | Light-show + passenger eVTOL | Largest commercial light shows |
| SkyMagic | SkyMagic (UK) | Show drones | European light-show ops |
| TigerDrone Skyetx | Various | Light-show | Multiple operators |
| Lockheed Indago / Stalker | Lockheed Martin | Military micro-UAS | Reconnaissance swarms |
| Switchblade 300 / 600 | AeroVironment | Loitering munitions | Military deployment |
| Lancet | ZALA Aero (RU) | Loitering munition | Ukraine conflict |
| Shahed-136 | HESA (IR) | Loitering / cruise drone | Iranian-design, mass-produced |
5.3 Underwater swarm platforms
| Platform | Vendor / Lab | Specs | Use |
|---|---|---|---|
| CoCoRo | Schmickl + Hamann (EU) | 41 cm AUV | Cognitive collective AUVs |
| SoFi | Rus / MIT (2018) | Robotic fish, hydraulic tail | Coral-reef monitoring |
| Bluefin SandShark / Bluefin-9 | Bluefin Robotics (HII) | Man-portable AUVs | Military / oceanographic |
| Bluefin SQX-700 | Bluefin / HII | Mid-range AUV | Coordinated survey |
| ECA Group A18 | ECA Group | Survey AUV | Multi-vehicle survey |
| Liquid Robotics Wave Glider | Liquid Robotics / Boeing | Wave-driven persistent | Long-duration swarms |
5.4 Software frameworks
| Framework | Authors / Org | Use |
|---|---|---|
| ARGoS | Pinciroli-Birattari-Dorigo (ULB) | Multi-physics-engine swarm sim; scales to 10k+ |
| Buzz | Pinciroli | Swarm programming DSL with primitives (swarm, neighbors, virtualstigmergy) |
| Gazebo / Ignition multi-robot | Open Robotics | ROS-integrated multi-robot sim |
| Stage | Player Project legacy | Lightweight 2D swarm sim |
| Webots multi-robot | Cyberbotics | ODE-based education + research |
| Coppelia / V-REP | Coppelia Robotics | Multi-robot with multiple physics |
| ROS 2 multimaster + DDS | Open Robotics | Decentralized ROS for swarms |
| ROS 2 + Zenoh | Eclipse | Improved discovery + bandwidth |
| Crazyswarm / Crazyswarm2 | USC + IMRC | ROS bindings for Crazyflie swarm |
| PettingZoo + RLlib MARL | Farama / Anyscale | Standard MARL training |
| EPyMARL | Papoudakis et al. | Multi-agent RL benchmarks |
| MAVE | DeepMind | MARL research framework |
5.5 Localization for swarms
| Tech | Range | Accuracy | Cost |
|---|---|---|---|
| OptiTrack / Vicon | Indoor | mm | high |
| UWB (Loco / Pozyx / Decawave DWM1000) | 50 m | 10 cm | low-mid |
| Ultrasound TDoA | 5 m | cm | low |
| Bluetooth AoA | 20 m | dm | low |
| Visual fiducial (AprilTag, ArUco) | 5 m | cm | trivial |
| GPS + RTK | unlimited | cm (RTK) | mid |
| Visual-inertial SLAM | unlimited | dm | mid (compute) |
6. Reference data
6.1 Notable swarm-robotics milestones
| Year | Event | Significance |
|---|---|---|
| 1987 | Reynolds boids | Mathematical foundation of flocking |
| 1992 | Dorigo Ant Colony Optimization | Bio-inspired pheromone metaheuristic |
| 1995 | Mataric Nerd Herd 1 | First multi-robot system at scale (12 robots) |
| 2000 | Kube + Bonabeau collective transport | Multi-robot box-pushing baseline |
| 2002 | Gerkey + Matarić MURDOCH | Market-based task allocation reference |
| 2004 | Cortés + Bullo Voronoi coverage | Optimal-coverage theory |
| 2005 | BEECLUST | Stigmergy-based aggregation |
| 2008 | iSwarm, Symbrion EU projects | EU funded large-scale swarm research |
| 2010 | Penn GRASP Lab quadrotor teams (Lindsey, Mellinger, Kumar) | Penn quadrotor coordination demos |
| 2014 | Rubenstein 1024 Kilobots Science | Programmable self-assembly at 1k |
| 2014 | TERMES (Werfel-Petersen-Nagpal Science) | Termite-inspired 3D construction |
| 2015 | Sharon CBS | Optimal MAPF state of art |
| 2016 | Intel Drone 100 / Disney drone shows | First commercial drone light shows |
| 2017 | Lowe MADDPG | CTDE MARL paradigm |
| 2018 | Olympics Intel 1218-drone show | Largest synchronized drone display |
| 2019 | OpenAI Hide-and-Seek emergent strategies | Emergent multi-agent tool use |
| 2020 | Crazyswarm 2 standardizes Crazyflie swarms | Reproducible academic swarms |
| 2022 | Amazon Proteus AMR launch | Fully autonomous mobile robots in warehouses |
| 2022 | Ukraine conflict — Switchblade + Lancet | Loitering munitions widely deployed |
| 2022 | MAPPO benchmarks | Strong MARL baseline |
| 2023 | Amazon Sequoia + Sparrow | Hybrid swarm + manipulator fulfillment |
| 2024 | EHang 10k-drone shows (China) | Largest synchronized light shows |
| 2024 | Anduril Bolt + Roadrunner | Production-scale autonomous swarm munitions |
| 2025 | XAG + DJI Agras T50 + Burro + Carbon Robotics LaserWeeder fleet ops | Coordinated agricultural swarms |
6.2 Algorithm decision matrix
| Task | Centralized | Decentralized |
|---|---|---|
| Flocking | n/a | Boids (Reynolds) |
| Coverage | Optimal MIP | Voronoi (Cortés) |
| Foraging | Auction | Threshold / pheromone |
| Pattern formation | Off-line plan | Gradient (Rubenstein) |
| Transport | Centralized plan | Push-pull stigmergy |
| Task allocation | Hungarian / MIP | CBBA / MURDOCH / thresholds |
| Path planning | CBS / EECBS | PIBT / ORCA |
| Formation flight | Leader-follower | Consensus (Olfati-Saber) |
| Search + rescue | Decentralized | Frontier / boustrophedon |
| Construction | Off-line plan | TERMES rules |
| Cooperative game | Centralized critic | QMIX / MAPPO |
6.3 Communication-topology choices
| Topology | Bandwidth | Latency | Robust | Typical use |
|---|---|---|---|---|
| Fully connected | O(N^2) | low | weak | < 10 agents |
| Star (master) | O(N) | medium | weak | warehouse fleet |
| Tree | O(N) | medium | medium | military C2 |
| Mesh (gossip) | O(N) total | high | strong | drone swarm |
| Local broadcast | O(deg) | low | strong | Kilobot |
| Stigmergy (env) | O(1) | very high | very strong | pheromones |
6.4 Field deployment scale
| System | Scale | Year |
|---|---|---|
| Amazon Robotics AMRs | ~750,000 globally (Kiva + Hercules + Pegasus + Proteus) | 2024 |
| Intel/EHang light shows | 10,000+ drones single show | 2024 |
| DJI Agras T50 fleets | 1000s per major operator | 2024 |
| Switchblade deployments | thousands (Ukraine) | 2022-2025 |
| Shahed-136 | ~5000+ produced | 2022-2025 |
| Bluefin AUV fleets | 10-50 per operator | 2020+ |
| Anduril Roadrunner | classified | 2024 |
| Symbotic AMRs | 10,000+ across customers | 2024 |
| Locus Robotics | 25,000+ deployed | 2024 |
| Geek+ AMRs | 50,000+ globally | 2024 |
| HAI Robotics | 30,000+ globally | 2024 |
7. Failure modes & debugging
- Comms collisions. Two agents broadcasting on same channel at same time → dropped packets. Fix: CSMA/CA, slotted MAC, frequency hopping.
- Sensor confusion in dense swarms. Too many neighbors → IR or vision saturates. Fix: limit max-neighbors-considered; reduce sensing range.
- Emergent gridlock. Agents stuck in opposing flows. Fix: random tie-breaking, priority inheritance (PIBT), back-off protocols.
- Localization drift. Cumulative odometry errors → swarm drifts away from intended frame. Fix: anchor agents with GPS / fixed beacons; periodic re-localization.
- Battery synchronization. All agents need charging simultaneously. Fix: stagger missions; have spare agents; design rapid swap.
- Single-point-of-failure in “decentralized” system. Centralized fleet manager fails → swarm stops. Fix: distributed consensus on fleet state; degraded autonomous behavior.
- Software heterogeneity in supposedly homogeneous swarm. Some agents on version n, others on n+1 → bugs. Fix: enforced version check at startup.
- Behavioral oscillation. Predator-prey-like limit cycles in MARL. Fix: longer training, opponent modeling, league learning.
- GPS denial / jamming. Outdoor swarms lose absolute reference. Fix: vision-based or UWB relative localization fallback.
- Light-show wind gust. All drones drift simultaneously. Fix: monitor wind; abort threshold; over-design control authority.
- Adversarial perturbation in MARL. Trained policy fails against unseen opponent. Fix: population-based training; adversarial curriculum.
- Reward hacking. MARL agents discover unintended cooperation. Fix: reward shaping, behavioral constraints, safety filters.
8. Case studies
8.1 Rubenstein Kilobot self-assembly Science (Harvard 2014)
Michael Rubenstein, Alejandro Cornejo, Radhika Nagpal at Harvard’s Self-Organizing Systems Research Group. 1024 Kilobots — each a $14 unit with vibrating legs, IR communication, LED feedback. Self-assemble into target 2D shapes in 6-12 hours via gradient-based positioning. Algorithm: four seed-robots in a corner establish a coordinate system, agents query neighbors for distance, walk along boundary into the shape. Failures (10-15% misplacement at large N) hand-corrected by intervention. Significance: first proof at scale that decentralized self-assembly works. Spawned an industry of large-N swarm papers, and Rubenstein’s later work at Northwestern (Coachbot for human-robot swarm interaction).
8.2 TERMES — Werfel-Petersen-Nagpal Science 2014
Justin Werfel, Kirstin Petersen, Radhika Nagpal at Harvard. Three small wheeled robots build user-specified 3D structures from foam bricks, inspired by termite mound construction. Each agent runs the same rule-set; no global plan. Roles emerge from local sensing (am I at edge? above? carrying?). Generated structures including towers, castles. Cited as the canonical proof that decentralized swarm construction works for non-trivial 3D structures.
8.3 Intel Skyport Olympics light show (Pyeongchang 2018)
1218 Intel Shooting Star drones, single operator, choreographed Olympic rings. Guinness record at the time. Architecture: a centralized choreographer pre-computes trajectories and uploads them to each drone before launch; each drone then follows its preloaded position-time waypoints under GPS guidance, with no in-flight replanning; a safety geofence and redundant comms back this up. Validated commercial-scale aerial swarms. Subsequent EHang shows in China (2020-2024) pushed past 10,000 drones.
8.4 Amazon Robotics evolution (2012-2026)
Amazon acquired Kiva Systems in 2012 for $775M. Kiva drives — orange floor-traversing AMRs that carry pod-shelves to pickers — became the industry-defining warehouse swarm. Architecture (Kiva era): grid-based deterministic A* with centralized assignment; 200-500 robots per facility. 2017+ Hercules (heavier capacity), 2018+ Pegasus (sorting), 2022+ Proteus (first fully autonomous, navigates around humans), 2023+ Sparrow + Sequoia (manipulator-equipped picking systems). 2024 Amazon Robotics operates roughly 750,000 robots — the largest robotic deployment in history. Algorithmic stack: PIBT for live replanning; CBS for offline schedules; centralized fleet manager with degraded-mode autonomous fallback.
8.5 OpenAI Hide-and-Seek emergent tool use (Baker-Kanitscheider-Markov-Wu 2019)
OpenAI trained multi-agent RL agents in a hide-and-seek game. Over 500M episodes, agents discovered six emergent strategies in sequence: chasing, hiders building shelters with boxes, seekers using ramps to jump over shelters, hiders blocking ramps, seekers “box surfing” (riding a moving box over walls), hiders locking boxes in place. Each new strategy required orders-of-magnitude more compute than the prior. Significance: first compelling demonstration that emergent multi-agent strategy can be qualitatively novel — and that compute scale matters.
8.6 Anduril Roadrunner + Bolt + Dive-LD (2023-2026)
Anduril Industries (founded 2017 by Palmer Luckey) developed autonomous swarm-weapon platforms: Bolt (loitering munition), Roadrunner (jet-powered interceptor swarms), Dive-LD (underwater swarm). Distinguishing feature: each platform runs Lattice OS, a software-defined autonomy stack that lets multiple Anduril platforms coordinate. Production deployment with US DoD as of 2024. Symbolizes the militarization of swarm robotics post-Ukraine.
8.7 Loitering-munition swarms in Ukraine (2022-2025)
The Ukraine-Russia conflict became the first sustained military deployment of swarm-class drones. Switchblade 300 + 600 (AeroVironment), Lancet (ZALA), Shahed-136 (HESA / domestically produced) — produced in tens of thousands. Engagement architecture: human-on-the-loop for initial target selection, autonomous terminal guidance. Counter-swarm RF jamming, EW defenses emerged in parallel. Validated swarm-class munitions as a permanent capability.
8.8 XAG + DJI agricultural-fleet ops (China, 2018-2026)
XAG (Guangzhou) spraying drones and DJI (Shenzhen) Agras T20/T40/T50 drones operate as fleets: 5-10 drones per operator, hot-swappable batteries, refill stations, and a single tablet per operator. China deployments: hundreds of thousands of spraying drones, ~1B hectares treated annually. Algorithmic stack: centralized geofenced field-coverage planning; per-drone autonomous flight with collision avoidance; battery + tank management. Validated swarms in commercial agriculture; Burro + Carbon Robotics LaserWeeder are US-side parallels for ground vehicles.
8.9 ARGoS multi-physics simulator (Pinciroli-Birattari, ULB 2012-2026)
Carlo Pinciroli + Mauro Birattari at IRIDIA, ULB developed ARGoS — the dominant academic swarm simulator. Distinguishing feature: per-robot choice of physics engine. Some robots simulate with full dynamics; most simulate with 2D kinematic. Scales to 10,000+ agents on a single workstation. Used in ~200+ swarm-robotics papers including Brambilla’s review, Pinciroli’s Buzz DSL, and most ULB IRIDIA group publications. Released alongside Buzz programming language.
8.10 Crazyflie swarm research (Bitcraze, ETH Zürich, Caltech, USC)
The Crazyflie 2.x platform (Bitcraze AB, Sweden), a 27 g nano-quadrotor with open-source firmware, became the standard research swarm platform 2016+. The USC ACT Lab developed Crazyswarm and the IMRC Lab continued it as Crazyswarm2, providing ROS bindings for synchronized N-agent flight. Maximum demonstrated: ~50 Crazyflies in a single motion-capture volume. Research output: thousands of papers on formation flight, MARL, geometric control, fault tolerance. Cost ~$200 per drone makes it one of the few viable academic-budget aerial swarm platforms.
8.11 RoboCup Rescue + DARPA Robotics Challenge legacy
RoboCup Rescue (since 2000) is an annual competition for autonomous search-and-rescue robotics, with simulation (Rescue Simulation League) and physical (Rescue Robot League) tracks. The DARPA Robotics Challenge (2012-2015) targeted humanoid disaster response; it was not swarm work per se, but the technology trickled down. The subsequent DARPA Subterranean (SubT) Challenge (2018-2021) extended this to underground multi-robot exploration: teams such as CERBERUS demonstrated coordinated SLAM, communication relay, and gas-leak localization.
8.12 MARSEM + ETH RSL multi-robot SLAM swarm (2023-2025)
Marco Hutter’s Robotic Systems Lab at ETH Zürich demonstrated coordinated SLAM with 4-8 ANYmal quadrupeds and aerial drones in subterranean environments. Each robot runs local SLAM; centralized backend fuses pose graphs with relative measurements. Deployed for European underground inspection contracts (subway, mines). Demonstrates that heterogeneous swarms work in industrial inspection at scale.
8.13 NVIDIA Isaac multi-agent simulation (2024+)
NVIDIA Isaac Lab added native multi-agent support in 2024. Simulation of 1000+ agents on a single H100 GPU at 1000 Hz physics. Enables sim-to-real for production swarm policies. Used by Amazon Robotics research arm + several agricultural robotics startups. Significance: lowers compute barrier for large-N MARL training.
8.14 Buzz programming language (Pinciroli 2015)
Carlo Pinciroli at ULB designed Buzz as a domain-specific language for swarm programming. Primitives:
- swarm.create(id) { ... }: define a swarm
- swarm.select(condition) { ... }: sub-swarm selection
- neighbors.foreach(f): neighbor iteration
- virtualstigmergy.put(key, value): environment-mediated communication
- swarmexec(swarm_id, function): distributed execution
Compiles to bytecode running on each robot’s MCU. Reduced the gap between algorithm spec + deployment for ARGoS-based research.
8.15 Locus Robotics + RaaS warehouse model (2014-2026)
Locus Robotics (Wilmington MA) deployed > 25,000 AMRs in 200+ warehouses by 2024. Differentiator: Robotics-as-a-Service pricing rather than capex. Each robot is a small wheeled AMR carrying a stack of bins; centralized fleet manager runs pick-routes; human pickers walk to designated robot and load. Hybrid human-swarm model. Validated the RaaS pricing model — opex per pick instead of upfront capex, lowering customer adoption friction.
8.16 Geek+ + HAI Robotics + Quicktron (China AMR ecosystem, 2018-2026)
China-based AMR vendors each deployed tens of thousands of robots globally: Geek+ (Beijing), HAI Robotics (Shenzhen, ACR-class case-handling), Quicktron (Shanghai). Architectures parallel Kiva’s but with local-Chinese supply chain. Combined Chinese AMR exports reached ~150k units globally by 2024, rivaling Amazon’s internal deployment.
8.17 OpenAI Hide-and-Seek tool emergence revisited
Bowen Baker et al. OpenAI 2019. Worth a deeper look as an emergence study. The training environment was a physics-simulated 3D arena with movable boxes, ramps, and walls. Agents trained with self-play multi-agent PPO. Over roughly 500M episodes (months of training compute), six distinct emergent strategies appeared in sequence, each unlocking after the previous reached saturation. Each transition required an order of magnitude more compute. Key insight: emergent strategy capability scales with compute, not just architectural sophistication. Implications for swarm RL: large-N MARL may discover non-trivial cooperation only at compute scales currently inaccessible to most labs.
8.18 Disney Imagineering drone shows (2016-2025)
Disney’s drone shows (“Starbright Holidays” 2016 onward) initially used 300 Intel drones; by 2024 they fielded proprietary 800-drone shows at Disneyland Paris and Walt Disney World. Architecture: centralized choreography + per-drone GPS + safety geofence + redundant comms. Disney’s distinguishing constraint: ultra-low failure rate (children watching, brand risk) drove sub-1-in-10⁴ failure-per-show targets, achieved by 2024.
8.19 Skydio swarm research (2021-2025)
Skydio (Redwood City CA) — known for consumer + enterprise autonomous drones — published research on coordinated multi-drone inspection. Skydio Dock + multi-Dock deployments coordinate 2-8 drones for industrial-inspection tasks (powerline, pipeline). DoD contracts 2023+ for Blue UAS swarms. Architecture: cloud-based mission planner + onboard autonomy. Demonstrates the soft-power swarm — coordinated but not synchronized at swarm-light-show scale.
8.20 NVIDIA Isaac multi-agent + Carbon Robotics field demos (2024)
NVIDIA’s Isaac Lab 2024 release added native multi-agent training. Demonstrated coordinated agricultural fleets in simulation (10 LaserWeeders + 5 Burros in a single field) trained with MAPPO. Carbon Robotics co-developed simulation environments to test fleet behaviors at scale before deployment. Cited as the bridge between sim-trained swarm policies and customer-grade agricultural fleet management.
9. Bio-inspired swarm algorithms (deeper)
9.1 Ant Colony Optimization (Dorigo 1992)
Marco Dorigo’s PhD work introduced ACO as a metaheuristic for combinatorial optimization, modeled after pheromone-trail foraging in Lasius niger ants. Each “ant” constructs a solution probabilistically biased by pheromone strength; edges used by good solutions are reinforced, while pheromone on every edge evaporates, so unreinforced trails fade. Update rule:
$$\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k} \Delta\tau_{ij}^{k}$$
where $\rho$ is the evaporation rate and $\Delta\tau_{ij}^{k}$ is the deposit of ant $k$ on edge $(i, j)$. ACO is a generic optimizer (TSP, routing, scheduling) but the robotic version uses physical or virtual pheromones: Sugawara-Sano 1997 robots dropping chemical trails; Garnier-Tâche-Combe-Grimal-Theraulaz 2007 light-pheromone Alice robots; Arvin-Krajník-Turgut 2015 virtual-pheromone IR projections.
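The ACO update (evaporate every edge by a factor of 1 − ρ, then add each ant's deposit) can be sketched as follows. This is a minimal illustration, not Dorigo's reference implementation: the function name `update_pheromone`, the deposit size `q / tour_length` (the classic Ant System convention), and the undirected edge keys are all assumptions made for the example.

```python
from collections import defaultdict

def update_pheromone(tau, ant_tours, tour_lengths, rho=0.1, q=1.0):
    """One ACO update: evaporate every edge, then add each ant's deposit."""
    new_tau = defaultdict(float)
    # Evaporation: every known edge keeps a fraction (1 - rho) of its pheromone.
    for edge, level in tau.items():
        new_tau[edge] = (1.0 - rho) * level
    # Deposit: ant k adds q / L_k to each edge of its closed tour
    # (assumed Ant System convention; shorter tours deposit more).
    for tour, length in zip(ant_tours, tour_lengths):
        deposit = q / length
        for a, b in zip(tour, tour[1:] + tour[:1]):
            edge = (min(a, b), max(a, b))  # undirected edge key
            new_tau[edge] += deposit
    return dict(new_tau)

# Four edges start equal; one ant walks the tour 0 -> 1 -> 2 -> 0.
tau = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0, (0, 3): 1.0}
tau = update_pheromone(tau, ant_tours=[[0, 1, 2]], tour_lengths=[3.0], rho=0.5, q=3.0)
print(tau)
```

Edges on the tour end up stronger than before, while the unused edge (0, 3) only evaporates, which is the positive-feedback loop the paragraph describes.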
9.2 Particle Swarm Optimization (Kennedy + Eberhart 1995)
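Inspired by bird-flock motion. Each particle $i$ has position $x_i$ and velocity $v_i$; updates:
$$v_i \leftarrow w\,v_i + c_1 r_1 (p_i - x_i) + c_2 r_2 (g - x_i), \qquad x_i \leftarrow x_i + v_i$$
where $p_i$ is the particle’s personal best, $g$ is the global (or local-neighborhood) best, and $r_1, r_2 \sim U(0, 1)$ are fresh random draws. Weights $w, c_1, c_2$ control inertia, personal exploitation, social exploitation. Widely used as a metaheuristic; less common in physical-robot swarms because real robots have inertia and constraints PSO doesn’t model.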
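A minimal sketch of the PSO loop, using the standard global-best velocity and position update with inertia weight w and acceleration weights c1, c2. The function name `pso_minimize`, the parameter values, the search box [-5, 5], and the sphere test function are illustrative assumptions; the example minimizes, whereas the text does not fix a direction.

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO on an unconstrained objective f (minimization)."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_val = [f(x) for x in xs]
    g_idx = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g_idx][:], pbest_val[g_idx]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # inertia + pull toward personal best + pull toward global best
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            val = f(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_x, best_val = pso_minimize(sphere, dim=2)
print(best_x, best_val)
```

Note that positions jump by an arbitrary velocity each step, which is exactly the unmodeled-dynamics problem mentioned above when particles are physical robots.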
9.3 Glowworm Swarm Optimization (Krishnanand-Ghose 2005)
Models glowworm bioluminescence. Each agent emits a “luciferin” signal that tracks its objective value; agents move toward brighter neighbors within an adaptive local sensing range. Naturally handles multimodal objectives: the swarm splits into sub-swarms, one at each local maximum.
9.4 Cuckoo Search (Yang + Deb 2009)
Lévy-flight-based metaheuristic mimicking brood-parasite egg-laying. Effective for high-dimensional continuous optimization. Used in some UAV-coverage swarm planning.
9.5 Bee Algorithm (Pham et al. 2005)
Two-population search modeled on honeybee foraging: “scouts” do random search; “workers” exploit promising sites. Used in factory-scheduling and warehouse-routing swarms.
10. Communication protocols and middleware
10.1 Physical layer
| Tech | Range | Bandwidth | Latency | Power |
|---|---|---|---|---|
| IR (Kilobot) | 7 cm | 30 kbps | 5 ms | very low |
| Bluetooth 5 LE | 50 m | 2 Mbps | 10 ms | low |
| WiFi 6 (802.11ax) | 50 m | 1 Gbps | 1-10 ms | medium |
| 5G NR | km | 1 Gbps | 1 ms | high |
| LoRa | 5-15 km | 50 kbps | 1 s | very low |
| Zigbee | 100 m | 250 kbps | 10 ms | low |
| UWB (Decawave DW1000) | 50 m | 6.8 Mbps | < 1 ms | low |
| Acoustic underwater | 5 km | < 10 kbps | seconds | medium |
| Visible light (VLC) | 5 m | 100 Mbps | < 1 ms | medium |
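As a rough sizing aid for the table above, the sketch below estimates how much of a link's nominal bit rate a neighborhood of broadcasting agents would consume. The bit rates are copied from the table; the function name `channel_utilization` and the workload (20 agents, 32-byte messages, 10 Hz) are hypothetical, and the estimate deliberately ignores headers, MAC contention, and retransmissions.

```python
# Nominal bit rates copied from the table above (bits per second).
LINK_BPS = {
    "IR (Kilobot)": 30e3,
    "LoRa": 50e3,
    "Zigbee": 250e3,
    "Bluetooth 5 LE": 2e6,
    "UWB (DW1000)": 6.8e6,
}

def channel_utilization(n_agents, msg_bytes, rate_hz, link_bps):
    """Fraction of raw link capacity used if n_agents share one channel.

    Ignores headers, MAC contention, and retransmissions, so real
    utilization will be higher than this estimate.
    """
    offered_bps = n_agents * msg_bytes * 8 * rate_hz
    return offered_bps / link_bps

# Hypothetical workload: 20 agents in range, 32-byte state messages at 10 Hz.
for name, bps in LINK_BPS.items():
    u = channel_utilization(20, 32, 10, bps)
    verdict = "saturated" if u >= 1.0 else "ok"
    print(f"{name:16s} {u:8.2%} {verdict}")
```

Under these assumptions the offered load is 51.2 kbps, which already exceeds the IR and LoRa rows before any protocol overhead; this is one reason low-rate links push designs toward smaller messages or lower update rates.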
10.2 Network layer protocols
- AODV (Ad hoc On-Demand Distance Vector). Classic MANET reactive routing.
- OLSR (Optimized Link State Routing). Proactive MANET; better for dense networks.
- B.A.T.M.A.N. Advanced. Production-grade mesh routing in Linux kernel.
- RPL (Routing Protocol for Low-Power and Lossy Networks). IETF standard for IoT-class.
- DDS (Data Distribution Service). OMG standard; underpins ROS 2. Pub-sub with QoS.
- Zenoh (Eclipse). Newer pub-sub-query unifier; reduces ROS 2 discovery overhead at swarm scale.
- MQTT. Light-weight pub-sub over TCP; common in IoT swarms.
10.3 Coordination primitives in code
    # Distributed consensus (e.g., heading agreement)
    on_message(msg):
        neighbor_headings[msg.id] = msg.heading
    on_timer():
        avg = mean(neighbor_headings.values())
        self.heading += k * (avg - self.heading)
        broadcast({"heading": self.heading})

    # Voronoi coverage
    on_neighbor_position_update():
        cell = compute_voronoi_cell(self.pos, neighbors)
        centroid = mass_centroid(cell, density_phi)
        self.target = self.pos + k * (centroid - self.pos)

    # Pheromone deposit
    on_food_found():
        drop_pheromone(self.pos, intensity=1.0)
    on_random_move():
        grad = sample_pheromone_gradient(self.pos)
        self.heading = bias * grad + (1 - bias) * random_dir()
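The heading-agreement primitive can be exercised offline with a small synchronous simulation. The sketch below is one possible runnable reading of it, with two assumptions that go beyond the pseudocode: neighbor headings are averaged with a circular mean (a plain arithmetic mean breaks when headings straddle ±π), and the communication graph is a fixed six-robot ring. All function names are illustrative.

```python
import math

def circular_mean(angles):
    """Mean direction of a list of angles, in radians."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

def wrap(angle):
    """Wrap an angle to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def consensus_step(headings, neighbors, k=0.5):
    """One synchronous round: each agent moves a fraction k toward its neighbors' mean."""
    new = []
    for i, h in enumerate(headings):
        nbr = [headings[j] for j in neighbors[i]]
        if not nbr:
            new.append(h)  # isolated agent keeps its heading
            continue
        avg = circular_mean(nbr)
        new.append(wrap(h + k * wrap(avg - h)))
    return new

n = 6
neighbors = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}  # ring topology
headings = [0.1 * i for i in range(n)]  # initial headings 0.0 .. 0.5 rad
for _ in range(200):
    headings = consensus_step(headings, neighbors)
spread = max(headings) - min(headings)
print(headings, spread)
```

On this symmetric ring the update preserves the average, so the headings settle on the initial mean of 0.25 rad; with asymmetric or time-varying links the agreed value generally differs from the initial mean.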
11. Field operations and logistics
11.1 Drone-light-show production stack
A typical Verity / Skyport / SkyMagic operation:
- Choreography software (Verity Skymotion, Drone Show Software) — keyframe-based 4D animation; export per-drone trajectories.
- Pre-flight upload — full trajectory time-coded to GPS time on each drone.
- Pre-arm verification — battery voltage, GPS lock count (≥ 10 sats), gyro / mag health, RTK fix.
- Synchronized takeoff — choreographer triggers launch sequence.
- In-flight monitoring — live telemetry from each drone over LTE / dedicated RF; auto-RTL on telemetry loss.
- Synchronized landing — landing-pad markers (AprilTag-style); auto-park sequence.
- Battery hot-swap — for multi-show evenings; 30-60 s per drone.
- Failure handling — aircraft that drifts out of geofence auto-lands; operator reviews logs.
Failure rate target: < 1 drone failure per 1000 shipped. Achieved by EHang and Verity by 2024.
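The pre-arm verification step in the list above amounts to a per-drone gate before synchronized takeoff. A minimal sketch follows; the 10-satellite minimum comes from the checklist, while the battery threshold, the `PreArmStatus` fields, and the function names are hypothetical and do not correspond to any vendor's software.

```python
from dataclasses import dataclass

MIN_SATS = 10         # from the pre-arm checklist above
MIN_BATTERY_V = 15.6  # assumption: threshold for a hypothetical 4S pack

@dataclass
class PreArmStatus:
    drone_id: int
    battery_v: float
    gps_sats: int
    rtk_fix: bool
    gyro_ok: bool
    mag_ok: bool

def prearm_failures(s):
    """Return human-readable reasons this drone must not launch (empty if clear)."""
    reasons = []
    if s.battery_v < MIN_BATTERY_V:
        reasons.append(f"battery {s.battery_v:.1f} V below {MIN_BATTERY_V} V")
    if s.gps_sats < MIN_SATS:
        reasons.append(f"only {s.gps_sats} satellites, need {MIN_SATS}")
    if not s.rtk_fix:
        reasons.append("no RTK fix")
    if not s.gyro_ok:
        reasons.append("gyro unhealthy")
    if not s.mag_ok:
        reasons.append("magnetometer unhealthy")
    return reasons

def launch_roster(fleet):
    """Split the fleet into drones cleared to fly and drones held on the ground."""
    cleared, held = [], {}
    for s in fleet:
        reasons = prearm_failures(s)
        if reasons:
            held[s.drone_id] = reasons
        else:
            cleared.append(s.drone_id)
    return cleared, held

fleet = [
    PreArmStatus(1, 16.4, 14, True, True, True),
    PreArmStatus(2, 16.5, 8, True, True, True),
    PreArmStatus(3, 15.1, 12, False, True, True),
]
cleared, held = launch_roster(fleet)
print(cleared, held)
```

Holding a drone on the ground is cheap compared with an in-show failure, so a gate like this is typically tuned to reject on any single failed check.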
11.2 Amazon Robotics warehouse architecture
- Fleet manager (proprietary, centralized) assigns tasks at 1 Hz.
- Map server holds floor topology + occupancy.
- Path planner runs centralized PIBT-class algorithm; broadcasts schedules.
- Local controller (per AMR) executes schedule with collision-avoidance fallback.
- Charging assignment — agents return to charge stations when battery < 30%.
- Maintenance schedule — preventive every 100 hours; automated diagnostic via sensor logs.
Floor density limits: ~0.3 AMRs per m² in transit zones, peaks ~0.5 at inducts. Above this, planning latency dominates.
11.3 Military swarm logistics (Switchblade class)
A typical Switchblade 300 launch:
- Mission planning — operator defines target area + ROE in 5-10 min.
- Pre-flight — connect launcher, run BIT, verify.
- Launch — pneumatic kick-out from tube; wings + propeller unfold.
- Cruise — 30+ min loiter at target area; operator monitors video.
- Target ID — operator confirms target (human-on-the-loop).
- Terminal guidance — autonomous final 3-5 second dive.
Swarms of 5-20 Switchblades are coordinated via Anduril Lattice or AeroVironment Crysalis at one operator per swarm. Counter-swarm options: RF detection (DroneShield, D-Fend), HPM (high-power microwave, Epirus Leonidas), kinetic (Coyote Block 2).
12. Verification and validation
12.1 Formal verification challenges
Proving global properties of decentralized swarms is hard. Approaches:
- Compositional model checking. PRISM, UPPAAL for small-N stochastic verification.
- Mean-field analysis. Treat the swarm as a continuous density; analyze the PDE limit. Standard for very large N.
- Statistical model checking. Monte Carlo simulation with confidence bounds.
- Lyapunov-style proofs. Show convergence to a desired state. Works for consensus, formation, coverage.
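As a worked instance of the Lyapunov-style approach, consider the standard continuous-time consensus protocol on a connected undirected graph with Laplacian $L$, edge weights $a_{ij} > 0$, and algebraic connectivity $\lambda_2 > 0$. This particular model is an assumption chosen for illustration (it is the setting of the Olfati-Saber and Murray paper listed under Further reading); the disagreement energy $V$ serves as the Lyapunov function.

```latex
\begin{aligned}
\dot{x} &= -L x, \qquad \bar{x} = \tfrac{1}{N}\,\mathbf{1}^{\top} x
  \quad (\text{constant in time, since } \mathbf{1}^{\top} L = 0) \\
V(x) &= \tfrac{1}{2}\,\lVert x - \bar{x}\mathbf{1} \rVert^{2} \\
\dot{V} &= (x - \bar{x}\mathbf{1})^{\top} \dot{x}
  = -x^{\top} L x
  = -\sum_{(i,j) \in E} a_{ij}\,(x_i - x_j)^{2} \;\le\; 0 \\
x^{\top} L x &\ge \lambda_2\,\lVert x - \bar{x}\mathbf{1} \rVert^{2}
  \;\Longrightarrow\; \dot{V} \le -2\lambda_2 V
  \;\Longrightarrow\; V(t) \le V(0)\,e^{-2\lambda_2 t}
\end{aligned}
```

The disagreement therefore decays exponentially at a rate set by $\lambda_2$, so better-connected swarms agree faster. The argument covers only this linear model; saturation, delays, and packet loss each need additional terms.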
12.2 Simulation-based validation
Standard protocol: Monte Carlo runs over (algorithm, swarm size, initial conditions, failure rate, environment). Output statistics: mean time to completion, success rate, fairness (Gini coefficient), worst-case agent overhead. Tools: ARGoS for academic, NVIDIA Isaac Lab + Omniverse for industrial.
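Two of the output statistics above are easy to get subtly wrong, so a short sketch may help: the Gini coefficient over per-agent workload, and a confidence interval on the success rate. The Wilson score interval is used here as one reasonable choice for binomial success counts; the text does not prescribe a specific interval, and the function names and sample numbers are illustrative.

```python
import math

def gini(values):
    """Gini coefficient of non-negative per-agent loads (0 = perfectly even)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a success probability (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1.0 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

# Hypothetical results: tasks completed per agent, and 95 successes in 100 runs.
tasks_per_agent = [12, 9, 11, 40]
fairness = gini(tasks_per_agent)
low, high = wilson_interval(95, 100)
print(f"Gini = {fairness:.3f}, success rate 95% CI = ({low:.3f}, {high:.3f})")
```

Reporting the interval rather than the bare success rate makes it obvious when a comparison between two algorithms rests on too few Monte Carlo runs.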
12.3 Sim-to-real transfer for swarms
The reality gap multiplies for swarms — every agent has its own sensor noise, motor variation, communication dropout. Standard mitigation:
- Train policies with per-agent domain randomization across the population.
- Add adversarial agents during training to test fault tolerance.
- Use heterogeneous training (mixture of competent and faulty agents).
- Test with realistic radio propagation (ray-traced or measured channel models).
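The first and third mitigations above (per-agent domain randomization and a mixture of competent and faulty agents) can be combined in the episode-reset code. The sketch below draws an independent parameter set for every agent; the `AgentParams` fields and all numeric ranges are placeholder assumptions that would normally come from measurements of the real fleet.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentParams:
    motor_gain: float        # multiplier on commanded wheel or rotor speed
    sensor_noise_std: float  # std-dev of additive range noise, metres
    dropout_prob: float      # probability that a given message is lost
    faulty: bool             # True for deliberately degraded agents

def sample_population(n_agents, fault_fraction=0.1, seed=None):
    """Draw independent parameters per agent, with a fixed share of faulty ones."""
    rng = random.Random(seed)
    n_faulty = round(fault_fraction * n_agents)
    faulty_ids = set(rng.sample(range(n_agents), n_faulty))
    population = []
    for i in range(n_agents):
        faulty = i in faulty_ids
        population.append(AgentParams(
            # Placeholder ranges: faulty agents are weaker and lossier.
            motor_gain=rng.uniform(0.5, 0.8) if faulty else rng.uniform(0.9, 1.1),
            sensor_noise_std=rng.uniform(0.01, 0.05),
            dropout_prob=rng.uniform(0.3, 0.6) if faulty else rng.uniform(0.0, 0.1),
            faulty=faulty,
        ))
    return population

# Draw a fresh population at the start of every training episode.
population = sample_population(n_agents=100, fault_fraction=0.1, seed=42)
print(sum(p.faulty for p in population), "faulty agents out of", len(population))
```

Sampling per agent rather than once per episode matters: a policy trained on homogeneous populations never sees the case where a healthy agent must compensate for a degraded neighbor.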
13. Open challenges (2026)
- Verifiable safety at scale. No reliable method exists to certify a 1000-agent swarm meets safety SLAs.
- Cross-vendor interoperability. Swarms today are vendor-locked (Anduril Lattice, AeroVironment Crysalis, AWS Cloud Robotics). Standards efforts (ROS 2 + DDS-Security) are early.
- Energy-efficient operation. Battery swap dominates duty cycle. Wireless charging (Wibotic, energy-harvesting) and solar (Wave Glider model) are partial answers.
- Heterogeneous swarms. Mixed ground + air + underwater coordination still bespoke per deployment.
- Counter-swarm. Defending against adversarial swarms is the dual problem; an active research + procurement frontier.
- Communication-denied operation. Swarms that work without RF (jammed, EMI, underwater) need richer onboard sensing + stigmergy.
- Human-swarm interaction. Operator-to-1000-agent UX remains a research area; aggregated metrics + exception-based alerting is current best practice but not solved.
13b. Reference workflow — designing a swarm
13b.1 Specification phase
- Define the collective task: coverage, formation, transport, search, construction.
- Define the swarm size envelope: minimum useful N, target N, max physically deployable.
- Define the environment: indoor structured, outdoor open, GPS-denied, communications-denied.
- Define safety + failure tolerance: % of agents that can fail without mission failure.
- Define operator interface: single-operator-N agents, supervisory targets, exception alerts.
13b.2 Algorithm phase
- Pick the bio-inspired / classical / RL family based on the task decision matrix.
- Implement in simulator (ARGoS, Isaac Lab, Gazebo).
- Run Monte Carlo over swarm sizes (10, 100, 1000) and failure rates (0%, 10%, 30%).
- Verify metrics: success rate, time to completion, agent overhead, Gini fairness.
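The Monte Carlo step above is essentially a grid sweep over swarm size and failure rate. A minimal harness is sketched below; `run_trial` is a toy stand-in for a real simulator call (ARGoS, Isaac Lab, or Gazebo would replace it), and the coverage task, cell count, and 90% success target are assumptions made only so the example runs on its own.

```python
import itertools
import random

def run_trial(n_agents, failure_rate, rng, n_cells=50):
    """Toy stand-in for a simulator run: each surviving agent covers one random cell."""
    covered = set()
    for _ in range(n_agents):
        if rng.random() < failure_rate:
            continue  # this agent failed before contributing
        covered.add(rng.randrange(n_cells))
    return len(covered) / n_cells

def sweep(sizes=(10, 100, 1000), failure_rates=(0.0, 0.1, 0.3),
          trials=200, target=0.9, seed=0):
    """Run `trials` Monte Carlo episodes for every (size, failure rate) pair."""
    rng = random.Random(seed)
    results = {}
    for n, p in itertools.product(sizes, failure_rates):
        coverages = [run_trial(n, p, rng) for _ in range(trials)]
        results[(n, p)] = {
            "mean_coverage": sum(coverages) / trials,
            "success_rate": sum(c >= target for c in coverages) / trials,
        }
    return results

results = sweep()
for (n, p), stats in sorted(results.items()):
    print(f"N={n:5d} fail={p:.0%} "
          f"coverage={stats['mean_coverage']:.2f} success={stats['success_rate']:.2f}")
```

Keeping the sweep as a plain dictionary keyed by (size, failure rate) makes it straightforward to add further axes such as environment or algorithm later.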
13b.3 Platform phase
- Pick platform per cost / sensing / compute envelope (Kilobot for academic 1k+ scale; Crazyflie for aerial research; commercial AMR for warehouse).
- Profile per-agent power, comms, compute headroom.
- Implement firmware including state machine + neighbor discovery + protocol layer.
13b.4 Sim-to-real transfer
- Identify the per-agent variations: motor variance, sensor noise, comms dropout rates.
- Train / tune policy with these as DR axes in sim.
- Validate on 1 → 10 → 100 → N agents in stages.
- Live re-tuning during field deployment is normal for the first 100 hours.
13b.5 Operations
- Pre-deployment health checks; reserve spare agents.
- Live monitoring dashboard: aggregate metrics + exception alerts.
- Failure response procedure: identify, isolate, replace, log.
- Post-deployment data capture; iterate algorithm.
14. Glossary
- AMR — Autonomous Mobile Robot — typically wheeled, ground-based, with on-board autonomy.
- AOA — Angle-of-Arrival — localization method using receive-antenna phase differences.
- ARGoS — multi-physics-engine swarm simulator (Pinciroli, Birattari, Dorigo).
- Boids — Reynolds’ 1987 flocking model (separation, alignment, cohesion).
- Buzz — DSL for swarm programming (Pinciroli).
- CBBA — Consensus-Based Bundle Algorithm — distributed market-based task allocation (Choi-Brunet-How 2009).
- CBS — Conflict-Based Search — optimal MAPF algorithm (Sharon-Stern 2015).
- CTDE — Centralized Training, Decentralized Execution — MARL paradigm (MADDPG-class).
- DDS — Data Distribution Service — OMG pub-sub standard; ROS 2 substrate.
- Decentralized control — agents act on local observations only, no global controller.
- Emergence — collective behavior arising from local interactions, not explicit programming.
- Gossip protocol — message propagation via random neighbor forwarding.
- Kilobot — Rubenstein-Nagpal’s $14 swarm research platform.
- MAPF — Multi-Agent Path Finding — combinatorial planning problem.
- MARL — Multi-Agent Reinforcement Learning — joint policy learning across agents.
- MAPPO — Multi-Agent PPO — shared-parameter PPO baseline (Yu et al. 2022).
- MURDOCH — auction-based task allocation (Gerkey-Matarić 2002).
- PIBT — Priority Inheritance with Backtracking — anytime decentralized MAPF.
- Pheromone (digital) — environmental marker for stigmergic communication.
- Reynolds rules — separation, alignment, cohesion.
- Self-organization — emergence of structure from local rules.
- Stigmergy — coordination via the environment.
- Swarm intelligence — collective problem-solving by a decentralized swarm.
- UWB — Ultra-WideBand — low-power short-range high-bandwidth radio; cm-accuracy ranging.
- VDN / QMIX — value-decomposition MARL methods.
- Voronoi coverage — distributed coverage via Voronoi-cell centroidal motion.
Further reading
- Reynolds C.W. (1987) “Flocks, herds and schools: A distributed behavioral model.” SIGGRAPH ‘87.
- Şahin E. (2005) “Swarm Robotics: From Sources of Inspiration to Domains of Application.” Swarm Robotics SAB Workshop.
- Brambilla M., Ferrante E., Birattari M., Dorigo M. (2013) “Swarm robotics: A review from the swarm engineering perspective.” Swarm Intelligence 7.
- Bonabeau E., Dorigo M., Theraulaz G. (1999) Swarm Intelligence: From Natural to Artificial Systems. Oxford University Press.
- Rubenstein M., Cornejo A., Nagpal R. (2014) “Programmable self-assembly in a thousand-robot swarm.” Science 345.
- Werfel J., Petersen K., Nagpal R. (2014) “Designing collective behavior in a termite-inspired robot construction team.” Science 343.
- Cortés J., Martínez S., Karatas T., Bullo F. (2004) “Coverage control for mobile sensing networks.” IEEE Transactions on Robotics and Automation 20.
- Olfati-Saber R., Murray R.M. (2004) “Consensus problems in networks of agents with switching topology and time-delays.” IEEE Transactions on Automatic Control 49.
- Gerkey B.P., Matarić M.J. (2002) “Sold!: Auction methods for multirobot coordination.” IEEE Transactions on Robotics and Automation 18.
- Sharon G., Stern R., Felner A., Sturtevant N.R. (2015) “Conflict-Based Search for Optimal Multi-Agent Pathfinding.” Artificial Intelligence 219.
- Wagner G., Choset H. (2015) “Subdimensional expansion for multirobot path planning.” Artificial Intelligence 219.
- Lowe R., Wu Y., Tamar A., Harb J., Abbeel P., Mordatch I. (2017) “Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments” (MADDPG). NeurIPS.
- Rashid T., Samvelyan M., de Witt C.S., Farquhar G., Foerster J., Whiteson S. (2018) “QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning.” ICML.
- Yu C., Velu A., Vinitsky E., Gao J., Wang Y., Bayen A., Wu Y. (2022) “The Surprising Effectiveness of PPO in Cooperative Multi-Agent Games” (MAPPO). NeurIPS.
- Pinciroli C., Trianni V., O’Grady R., et al. (2012) “ARGoS: a modular, parallel, multi-engine simulator for multi-robot systems.” Swarm Intelligence 6.
- Pinciroli C., Beltrame G. (2016) “Buzz: An extensible programming language for heterogeneous swarm robotics.” IROS.
- Bayindir L. (2016) “A review of swarm robotics tasks.” Neurocomputing 172.
- Dorigo M., Theraulaz G., Trianni V. (2020) “Reflections on the future of swarm robotics.” Science Robotics 5.
- Baker B., Kanitscheider I., Markov T., Wu Y., Powell G., McGrew B., Mordatch I. (2019) “Emergent Tool Use From Multi-Agent Autocurricula.” ICLR 2020.