SLAM Algorithms — Family Index

Simultaneous Localization And Mapping is the joint inference problem of estimating both robot trajectory and a map of the environment from sensor data, when neither is known a priori. After three decades the field has fragmented into a zoo of algorithms differentiated along three axes — sensor modality, map representation, and back-end estimator. This index catalogs the canonical members and notes which combinations are deployed in practice.

1. At a glance — taxonomy axes

SLAM systems decompose along three largely orthogonal axes:

Sensor modality: monocular camera, stereo, RGB-D (Kinect, RealSense, Tango), 2D LiDAR, 3D LiDAR (Velodyne, Ouster, Livox), event camera (DVS / DAVIS), radar (FMCW), sonar (underwater), IMU (always auxiliary, never alone). Visual-inertial (VIO) and LiDAR-inertial (LIO) are the dominant tightly-coupled multi-sensor combinations.
Map representation: sparse landmark (3D points + descriptors), semi-dense (high-gradient pixels), dense (every pixel / every voxel), volumetric (TSDF grid, occupancy grid, ESDF), surfel (oriented disks), mesh, neural-implicit (MLP weights), 3D Gaussian (Gaussian Splatting).
Estimator back-end: Extended Kalman Filter (EKF), Unscented KF (UKF), particle filter (Rao-Blackwellized), pose-graph (poses as nodes, relative-motion measurements as edges), factor-graph (poses + landmarks + IMU pre-integration as factors), batch bundle adjustment, incremental smoothing (iSAM2 Bayes tree), learned end-to-end.

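The 1990s-2007 filtering era used EKF, UKF, and particle filters. EKF/UKF updates cost O(n²) in landmark count because the dense joint covariance must be maintained; Rao-Blackwellized particle filters avoid the joint covariance but pay for it with particle depletion. The “graph SLAM” insight (Lu + Milios 1997, Dellaert 2005 square-root SAM) re-cast the problem as sparse nonlinear least-squares on a graph, enabling incremental smoothing via sparse Cholesky / Bayes-tree factorization that typically re-solves only the recently affected part of the problem, and unlocking modern scalable SLAM.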

A practical SLAM system also splits responsibility along a front-end / back-end axis: the front-end is the per-frame sensor-driven step (feature extraction, data association, scan matching, photometric tracking, IMU pre-integration) that emits constraints, and the back-end is the global optimizer (filter, smoother, graph optimizer) that fuses them. Loop closure spans both: the front-end proposes candidates (appearance-based with DBoW2/NetVLAD, or geometry-based with scan-context / branch-and-bound) and the back-end accepts, rejects, or down-weights them (RANSAC, PCM, GNC, switchable constraints). Most failure modes in real-world SLAM trace to data-association errors at the front-end propagating into the back-end as false-positive loop closures, motivating the heavy robust-back-end research of the last decade.

2. Filtering era (Bayesian recursive estimation)

Before pose-graph SLAM matured, the dominant paradigm was a single Gaussian (EKF) or particle distribution over the joint robot+map state, updated recursively per sensor frame.

EKF-SLAM — Smith, Self, Cheeseman 1990 (“Estimating Uncertain Spatial Relationships in Robotics”, the “stochastic map”). The joint covariance matrix is updated by linearizing both motion and observation models. Computational complexity is O(n²) per update in number of landmarks, which in practice limits real-time operation to a few hundred landmarks. MonoSLAM (Davison 2007 PAMI) was the first real-time monocular EKF-SLAM at 30 Hz on a desktop CPU, ~100 features tracked, a landmark milestone for visual SLAM. Inverse-depth parameterization (Civera, Davison, Montiel 2008) extended EKF-SLAM to handle low-parallax points.
UKF-SLAM — Unscented transform avoids Jacobian computation and handles stronger nonlinearity than EKF, but rarely used in practice because the pose-graph approach dominated by the time UKF-SLAM matured.
FastSLAM — Montemerlo, Thrun, Koller, Wegbreit 2002 (“FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem”). Rao-Blackwellized particle filter: each particle is a trajectory hypothesis, and conditional on a trajectory, landmark posteriors decouple into independent low-dimensional EKFs. Scales to thousands of landmarks. FastSLAM 2.0 (Montemerlo + Thrun 2003) added improved proposal distribution using current measurement.
GMapping — Grisetti, Stachniss, Burgard 2007 (“Improved Techniques for Grid Mapping with Rao-Blackwellized Particle Filters”, IEEE T-RO). Rao-Blackwellized particle filter for 2D occupancy-grid LiDAR. The standard ROS-1 mobile-robot SLAM baseline 2008-2016. Limitations: particle depletion in long corridors, and no explicit loop-closure detection (loops close only implicitly, when a surviving particle happens to carry a consistent map).
Hector SLAM — Kohlbrecher et al. 2011 — scan-matching-only 2D LiDAR SLAM, no odometry required, common on quadrotors.

Filtering-era systems are still preferred where computational budget is fixed and small (microcontroller-grade) or where the state stays small (camera-mounted AR with bounded workspace).
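
The joint-covariance update at the heart of EKF-SLAM can be sketched in one dimension. This is a minimal illustration, not any cited system's implementation: the state holds one robot position and one landmark position, the measurement model z = landmark - robot is an assumed linear model (so the EKF reduces to a plain Kalman filter), and all numbers are made up.

```python
# Minimal 1D joint-state Kalman update (the linear special case of EKF-SLAM).
# State: [robot position, landmark position]; all values are illustrative.

def kalman_update(mean, cov, z, meas_var):
    """Fuse a relative measurement z = landmark - robot + noise."""
    H = [-1.0, 1.0]  # measurement Jacobian with respect to [robot, landmark]
    innovation = z - (H[0] * mean[0] + H[1] * mean[1])
    # P H^T and the innovation variance S = H P H^T + R
    PHt = [cov[0][0] * H[0] + cov[0][1] * H[1],
           cov[1][0] * H[0] + cov[1][1] * H[1]]
    S = H[0] * PHt[0] + H[1] * PHt[1] + meas_var
    K = [PHt[0] / S, PHt[1] / S]  # Kalman gain
    new_mean = [mean[0] + K[0] * innovation, mean[1] + K[1] * innovation]
    # P <- P - K S K^T touches every covariance entry: O(n^2) for n state entries
    new_cov = [[cov[i][j] - K[i] * S * K[j] for j in range(2)] for i in range(2)]
    return new_mean, new_cov

mean = [0.0, 0.0]
cov = [[1.0, 0.0], [0.0, 100.0]]  # robot fairly certain, landmark almost unknown
mean, cov = kalman_update(mean, cov, z=5.0, meas_var=1.0)
print(mean)
print(cov)
```

After the update the off-diagonal covariance entry is non-zero: the landmark estimate is now correlated with the robot estimate. With n landmarks every pair becomes correlated in the same way, which is where the quadratic cost comes from.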

3. Pose-graph and factor-graph back-ends (modern)

The graph-SLAM paradigm formulates the MAP estimate as a sparse nonlinear least-squares problem on a graph. Pose-graphs (Lu + Milios 1997) keep only poses with relative-pose edges; factor-graphs (Dellaert + Kaess 2006) admit arbitrary factor types (IMU pre-integration, GPS, range, landmark observations). Minimization is of the negative log-likelihood — equivalent to weighted least-squares under Gaussian noise.
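
As a rough sketch of the least-squares view, the toy below builds and solves the normal equations for a one-dimensional pose graph. Poses are scalars, each edge measures x_j - x_i, and the first pose is anchored with a strong prior. The function names, the edge tuple layout, and the prior weight are all assumptions made for this illustration; real back-ends work on SE(2)/SE(3) and use sparse solvers.

```python
# Tiny 1D pose graph solved as weighted linear least-squares (pure Python).

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def optimize_pose_graph(num_poses, edges, prior_weight=1e6):
    """edges: list of (i, j, z, weight), where z measures x_j - x_i."""
    H = [[0.0] * num_poses for _ in range(num_poses)]
    b = [0.0] * num_poses
    H[0][0] += prior_weight  # anchor x_0 = 0 to fix the gauge freedom
    for i, j, z, w in edges:
        # residual r = (x_j - x_i) - z; Jacobian is -1 at i and +1 at j
        H[i][i] += w
        H[j][j] += w
        H[i][j] -= w
        H[j][i] -= w
        b[i] -= w * z
        b[j] += w * z
    return solve_linear(H, b)

edges = [
    (0, 1, 1.1, 1.0),  # odometry
    (1, 2, 1.1, 1.0),  # odometry
    (2, 3, 1.1, 1.0),  # odometry
    (0, 3, 3.0, 1.0),  # loop closure that disagrees with summed odometry (3.3)
]
poses = optimize_pose_graph(4, edges)
print(poses)
```

With equal weights the 0.3 disagreement between odometry and the loop closure is spread evenly over the four edges. Raising the weight of an edge (its inverse variance) pulls the solution toward that measurement.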

TORO — Grisetti, Stachniss, Grzonka, Burgard 2007 — Tree-based netwORk Optimizer; stochastic gradient descent on a spanning-tree-parameterized graph.
HOG-Man — Grisetti, Kümmerle, Stachniss, Frese, Hertzberg 2010 — Hierarchical Optimization for pose Graphs on Manifolds.
g2o — Kümmerle, Grisetti, Strasdat, Konolige, Burgard 2011 ICRA (“g2o: A General Framework for Graph Optimization”). C++ library offering Gauss-Newton and Levenberg-Marquardt on top of sparse Cholesky or PCG linear solvers. Used in ORB-SLAM and RGB-D-SLAM.
GTSAM — Dellaert (Georgia Tech, 2012-present) — factor-graph library with the iSAM2 incremental smoother backed by a Bayes tree data structure (Kaess et al. 2012 IJRR). iSAM (2008) was the original incremental smoothing-and-mapping algorithm. GTSAM is the back-end of choice for modern factor-graph SLAM: Kimera, LIO-SAM, multi-robot extensions, DOOR-SLAM.
Ceres Solver — Google 2010+. General-purpose nonlinear least-squares with auto-differentiation. Powers VINS-Mono / VINS-Fusion, many academic VO pipelines, and Google’s own Tango.
SE-Sync — Rosen, Carlone, Bandeira, Leonard 2017 — certifiably-correct synchronization on SE(d), addresses local-minima problem of pose-graph SLAM.

4. Feature-based visual SLAM

Track sparse keypoints across frames, triangulate them as 3D landmarks, optimize via bundle adjustment.

MonoSLAM — Davison 2007 PAMI; EKF-based.
PTAM — Klein + Murray 2007 ISMAR (“Parallel Tracking And Mapping for Small AR Workspaces”). Split tracking (per-frame) from mapping (background bundle adjustment) into separate threads — became the architectural template for nearly every modern visual SLAM system. Designed for AR.
ORB-SLAM — Mur-Artal, Montiel, Tardós 2015 T-RO (“ORB-SLAM: A Versatile and Accurate Monocular SLAM System”). ORB feature (Rublee 2011), three threads (tracking / local mapping / loop closing), DBoW2 vocabulary tree for place recognition.
ORB-SLAM2 — Mur-Artal + Tardós 2017 T-RO — mono / stereo / RGB-D unified.
ORB-SLAM3 — Campos, Elvira, Gómez Rodríguez, Montiel, Tardós 2021 T-RO — adds tight visual-inertial fusion, multi-map system (“Atlas”), pinhole + fisheye support. The current reference open-source visual-(inertial-)SLAM.
VINS-Mono — Qin, Li, Shen (HKUST 2018, IEEE T-RO) — tightly-coupled monocular VIO with sliding-window optimization in Ceres, IMU pre-integration. Robust on aerial vehicles.
VINS-Fusion — Qin et al. 2019 — VINS-Mono extended with stereo and GPS.
OKVIS — Leutenegger, Lynen, Bosse, Siegwart, Furgale 2015 IJRR — tightly-coupled keyframe visual-inertial, foundational; spawned OKVIS2.
Maplab — Schneider, Dymczyk, Fehr, Egger, Lynen, Gilitschenski, Siegwart (ETH ASL, 2018) — open multi-session mapping research platform.
RTABMap — Labbé + Michaud 2013 — appearance-based loop closure on top of RGB-D / stereo VO, the default ROS mid-sized RGB-D SLAM.

5. Direct visual SLAM (photometric error)

Instead of matching features, minimize the per-pixel photometric error directly. Avoids feature-extraction failure modes in low-texture or blur and uses more image information per frame.

DTAM — Newcombe, Lovegrove, Davison 2011 ICCV (“Dense Tracking And Mapping in Real-Time”). First real-time dense monocular SLAM on GPU; per-pixel inverse depth optimization.
LSD-SLAM — Engel, Schöps, Cremers 2014 ECCV (“LSD-SLAM: Large-Scale Direct Monocular SLAM”). Semi-dense — operates only on high-gradient pixels. CPU-only.
DSO — Engel, Koltun, Cremers 2017 PAMI (“Direct Sparse Odometry”). Photometric bundle adjustment on a small set of carefully chosen sparse points with full photometric calibration (vignette, response, exposure). Strong odometry, no loop closure in canonical form.
VI-DSO / DSO-VI — Stumberg, Usenko, Cremers 2018 — visual-inertial extension.
REMODE — Pizzoli, Forster, Scaramuzza 2014 ICRA (“REgularized MOnocular Depth Estimation”) — per-pixel Bayesian depth fusion for dense monocular reconstruction; coupled with SVO for tracking.
CNN-SLAM — Tateno et al. 2017 — early hybrid using CNN depth predictions inside an LSD-SLAM-style framework.

6. Dense / volumetric SLAM

Dense map representations (TSDF, surfel, mesh) — typically RGB-D-sensor-driven, GPU-bound.

KinectFusion — Newcombe, Izadi, Hilliges, Molyneaux, Kim, Davison, Kohli, Shotton, Hodges, Fitzgibbon 2011 ISMAR. TSDF volumetric integration with frame-to-model ICP tracking. Foundational dense RGB-D SLAM. Confined to a fixed cubic volume.
Kintinuous — Whelan, Kaess, Fallon, Johannsson, Leonard, McDonald 2012 — KinectFusion with a shifting volume (cyclic buffer) to map larger environments.
ElasticFusion — Whelan, Salas-Moreno, Glocker, Davison, Leutenegger 2015 (“ElasticFusion: Dense SLAM Without A Pose Graph”, RSS / IJRR). Surfel-based deformable model with non-rigid deformation graph for loop closure; no explicit pose-graph.
InfiniTAM — Kähler, Prisacariu, Ren, Sun, Torr, Murray 2015 (and Nießner et al. 2013 introduced voxel hashing) — sparse-voxel TSDF on GPU, much higher memory-efficiency than dense grid.
BundleFusion — Dai, Nießner, Zollhöfer, Izadi, Theobalt 2017 SIGGRAPH. Global per-frame pose optimization at scale with on-the-fly surface re-integration; largely removes the accumulated drift of KinectFusion-style frame-to-model tracking on indoor RGB-D sequences.
Voxblox — Oleynikova, Taylor, Fehr, Siegwart, Nieto (ETH ASL) 2017 IROS — TSDF + Euclidean Signed Distance Field (ESDF) on CPU; primary mapping back-end for planning on UAVs.
Voxgraph — Reijgwart, Millane, Oleynikova, Siegwart, Cadena, Nieto 2020 (ETH ASL) — submap-based TSDF SLAM with pose-graph back-end.
SLAM++ (dense) / Stereo Bundle Mapping pipelines feed dense surfel or volumetric reconstructions off-line.
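
The TSDF integration step shared by the KinectFusion family is a per-voxel weighted running average. The sketch below shows it along a single camera ray, under simplifying assumptions: unit weight per observation, a truncation distance of 0.1 m, and voxels indexed by their depth along the ray. The function name and parameters are hypothetical, and real systems run this projectively over a 3D grid on the GPU.

```python
def integrate_depth(tsdf, weights, voxel_depths, measured_depth,
                    trunc=0.1, max_weight=100.0):
    """Fuse one depth measurement into the voxels along a single ray.

    tsdf, weights: per-voxel running state (lists, modified in place).
    voxel_depths: distance of each voxel centre from the camera along the ray.
    """
    for k, d in enumerate(voxel_depths):
        sdf = measured_depth - d  # positive in front of the surface
        if sdf < -trunc:
            continue  # voxel is far behind the surface: occluded, leave untouched
        # Normalise by the truncation distance; the skip above bounds it below by -1
        sample = min(1.0, sdf / trunc)
        w_old = weights[k]
        tsdf[k] = (w_old * tsdf[k] + sample) / (w_old + 1.0)
        weights[k] = min(w_old + 1.0, max_weight)

voxel_depths = [0.80, 0.90, 1.00, 1.10, 1.20, 1.40]
tsdf = [0.0] * len(voxel_depths)
weights = [0.0] * len(voxel_depths)
for depth in (1.02, 0.98, 1.00):  # three noisy observations of a wall near 1 m
    integrate_depth(tsdf, weights, voxel_depths, depth)
print(tsdf)
print(weights)
```

Averaging pulls the zero crossing toward the true surface as observations accumulate. Capping the weight keeps the map responsive to later changes.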

7. LiDAR SLAM

3D LiDAR (Velodyne HDL/VLP, Ouster OS, Livox MID/Avia) drove a separate algorithm lineage rooted in LOAM.

LOAM — Zhang + Singh 2014 RSS (“LOAM: Lidar Odometry and Mapping in Real-time”). Decompose into high-rate odometry (edge + planar features) and low-rate mapping. Long the #1 on KITTI odometry. Closed-source canonical implementation; many forks.
A-LOAM — Tong Qin, simplified Ceres-based reimplementation of LOAM, widely used as a starting point.
LeGO-LOAM — Shan + Englot 2018 IROS (“Lightweight and Ground-Optimized LiDAR Odometry and Mapping on Variable Terrain”). Ground-plane segmentation, two-step LM optimization, loop closure via ICP — for ground vehicles.
LIO-SAM — Shan, Englot, Meyers, Wang, Ratti, Rus 2020 IROS (“LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping”). LOAM features + IMU pre-integration in GTSAM factor-graph + GPS factor + loop closure. One of the most widely-deployed open-source LIO systems.
LIO-Mapping — Ye, Chen, Liu 2019; LINS — Qin, Ye, Pranata, Han, Zhang, Liu 2020 — earlier tightly-coupled LIO.
FAST-LIO — Xu + Zhang (HKU MARS Lab) 2021 — iterated error-state Kalman filter LIO; very low latency.
FAST-LIO2 — Xu, Cai, He, Lin, Zhang 2022 T-RO — ikd-Tree incremental k-d tree for fast nearest-neighbor; significant speedup. Open-source de facto baseline for solid-state LiDAR (Livox).
Point-LIO — He, Xu, Chen, Kong, Yuan, Zhang 2023 (Advanced Intelligent Systems) — point-by-point LIO without frame batching; handles very aggressive motion.
FAST-LIO-MULTI — multi-LiDAR extension.
NDT-Mapping — Magnusson 2009 — Normal Distributions Transform scan matching; basis of Autoware’s mapping/localization stack.
KISS-ICP — Vizzo, Guadagnino, Mersch, Wiesmann, Behley, Stachniss 2023 RA-L (“KISS-ICP: In Defense of Point-to-Point ICP — Simple, Accurate, and Robust Registration If Done the Right Way”). Pure point-to-point ICP LiDAR odometry, no IMU, no features — surprisingly competitive.
SuMa / SuMa++ — Behley + Stachniss 2018-2019 — surfel-based LiDAR SLAM with semantics.
MULLS, HDL-SLAM, BLAM — other LiDAR pipelines.
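
Point-to-point ICP, the registration core that KISS-ICP argues is sufficient when done carefully, alternates nearest-neighbour association with a closed-form rigid alignment. The sketch below is a 2D toy under stated assumptions: brute-force matching, no outlier rejection, no motion compensation, and no adaptive threshold. It does not reproduce KISS-ICP itself, and all function names are hypothetical.

```python
import math

def best_rigid_2d(src, dst):
    """Closed-form least-squares rotation + translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def transform(theta, tx, ty, pts):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in pts]

def icp_2d(source, target, iterations=20):
    current = list(source)
    for _ in range(iterations):
        # Data association: brute-force nearest neighbour in the target cloud
        matched = [min(target, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = transform(theta, tx, ty, current)
    return current

target = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
source = transform(0.05, 0.1, -0.05, target)  # same scan seen from a slightly moved pose
aligned = icp_2d(source, target)
max_err = max(math.hypot(a[0] - t[0], a[1] - t[1]) for a, t in zip(aligned, target))
print(max_err)
```

The example converges because the initial misalignment is small relative to the point spacing, so every nearest neighbour is the correct match. With larger motion the association step fails first, which is why practical systems add a motion prior and robust thresholds.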

8. Cartographer (Google)

Cartographer — Hess, Kohler, Rapp, Andor 2016 ICRA (“Real-Time Loop Closure in 2D LIDAR SLAM”). 2D + 3D LiDAR with submap-based scan matching and branch-and-bound loop-closure search; pose-graph optimization in Ceres. Native ROS 2 support. Heavily deployed in warehouse AGVs and indoor robotics.

9. Visual-Inertial Odometry (loosely or tightly coupled)

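Fusing an IMU is the standard remedy wherever camera-only methods fail (low texture, motion blur, rolling shutter, fast motion). Filter-based systems (MSCKF, ROVIO, OpenVINS) propagate the state directly with IMU samples, while optimization-based systems (OKVIS, VINS-Mono, ORB-SLAM3) summarize the samples between keyframes by IMU pre-integration.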

MSCKF — Mourikis + Roumeliotis 2007 ICRA (“A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation”). Marginalize landmarks in nullspace, keep only a sliding window of past poses in state. Reportedly the basis of Project Tango and Magic Leap’s tracking.
ROVIO — Bloesch, Burri, Omari, Hutter, Siegwart 2015 IROS (“Robust Visual Inertial Odometry Using a Direct EKF-Based Approach”). Robocentric (state in body frame) iterated EKF with direct photometric updates.
OpenVINS — Geneva, Eckenhoff, Lee, Yang, Huang 2020 ICRA — open-source MSCKF-style filter VIO from RPNG, well-maintained.
OKVIS / OKVIS2 — see §4.
VINS-Mono / VINS-Fusion — see §4.
SVO — Forster, Pizzoli, Scaramuzza 2014 ICRA (“SVO: Fast Semi-Direct Monocular Visual Odometry”). Semi-direct: feature-based + photometric refinement, very fast on UAVs.
SVO 2.0 — Forster, Zhang, Gassner, Werlberger, Scaramuzza 2017 T-RO — multi-camera, edgelet support.
Basalt — Usenko, Demmel, Schubert, Stückler, Cremers 2020 (TUM) — visual-inertial mapping with non-linear factor recovery.

10. Event-camera SLAM

Dynamic Vision Sensors (DVS, DAVIS) output per-pixel asynchronous brightness-change events at microsecond latency. Required for very-high-speed motion (drones, FPV racing) where conventional cameras suffer motion blur.

EVO — Rebecq, Horstschaefer, Gallego, Scaramuzza 2017 RA-L (“EVO: A Geometric Approach to Event-based 6-DOF Parallel Tracking and Mapping in Real-Time”). PTAM-style for events.
DEVO — depth + event; ESVO — Zhou, Gallego, Shen 2021 (event stereo).
Ultimate SLAM — Rosinol Vidal, Rebecq, Horstschaefer, Scaramuzza 2018 RA-L — events + frames + IMU on UAV (paper title “Ultimate SLAM? Combining Events, Images, and IMU”). Strong performance in HDR/blur regimes.

11. Semantic and object-level SLAM

Integrate semantic segmentation or object-level reconstruction into the SLAM estimate.

SLAM++ — Salas-Moreno, Newcombe, Strasdat, Kelly, Davison 2013 CVPR. Pre-scanned 3D object instances are detected, posed and integrated as objects (not points) in an RGB-D SLAM graph.
MaskFusion — Rünz, Buffier, Agapito 2018 ISMAR — per-object surfel reconstruction with Mask-RCNN; MID-Fusion, EM-Fusion are contemporaneous.
Kimera — Rosinol, Abate, Chang, Carlone (MIT 2020-2022) — Kimera-VIO + Kimera-Mesher (3D mesh) + Kimera-Semantics (semantically-labeled mesh) + Kimera-RPGO (Robust Pose Graph Optimization with Pairwise Consistency Maximization, PCM). Kimera-Multi is the multi-robot extension.
Hydra — Hughes, Chang, Carlone 2022 RSS — real-time 3D Scene Graph construction (buildings → rooms → places → objects) on top of Kimera. Foundational for embodied-AI tasks.
Voxblox-plus-plus and PanopticFusion continue the dense-semantic line.

12. Learned and neural-implicit SLAM

End-to-end differentiable bundle adjustment and neural-implicit map representations.

DROID-SLAM — Teed + Deng 2021 NeurIPS (“DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras”). RAFT-style flow + differentiable dense bundle adjustment layer (“DBA”); jointly optimizes poses + per-pixel inverse-depth. State-of-the-art accuracy at publication on TUM-RGBD, EuRoC, TartanAir.
iMAP — Sucar, Liu, Ortiz, Davison 2021 ICCV (“iMAP: Implicit Mapping and Positioning in Real-Time”). Single multilayer perceptron stores the scene as a continuous occupancy/color field; tracking by gradient descent against the MLP.
NICE-SLAM — Zhu, Peng, Larsson, Xu, Bao, Cui, Oswald, Pollefeys 2022 CVPR (“NICE-SLAM: Neural Implicit Scalable Encoding for SLAM”). Hierarchical voxel-grid features + small decoder MLPs; scalable to room-sized scenes.
Nicer-SLAM — Zhu et al. 2023 — monocular variant.
NeRF-SLAM — Rosinol, Leonard, Carlone 2022 — DROID-SLAM-based dense monocular tracking + Instant-NGP NeRF for dense reconstruction.
OrbeezSLAM — Chung, Tseng, Hsieh et al. 2022 — Instant-NGP-based real-time NeRF-SLAM with ORB front-end.
Co-SLAM, ESLAM, Point-SLAM, GO-SLAM — recent academic NeRF-SLAM variants.
MonoGS / Gaussian Splatting SLAM — Matsuki, Murai, Kelly, Davison 2024 CVPR (“Gaussian Splatting SLAM”). 3D Gaussian primitives optimized photometrically; very high rendering quality and competitive accuracy. The 2024-25 frontier.
Photo-SLAM, SplaTAM, Gaussian-SLAM — further members of the 3DGS SLAM family proliferating since 2024.
Pearl, NeuralRecon, VolSDF-SLAM — neural surface SLAMs.

13. Multi-robot collaborative SLAM

CCM-SLAM — Schmuck + Chli 2017-2019 ETH V4RL — Centralized Collaborative Monocular SLAM with a server-and-agent architecture.
DOOR-SLAM — Lajoie, Ramtoula, Chang, Carlone, Beltrame 2019 RA-L — Distributed Outlier-Robust SLAM; combines distributed pose-graph optimization with PCM (Pairwise Consistency Maximization) for outlier-robust inter-robot loop closures.
Kimera-Multi — Chang, Tian, How, Carlone 2021 — multi-robot Kimera with Kimera-Distributed.
Swarm-SLAM — Lajoie + Beltrame 2024 RA-L — open-source decentralized C-SLAM framework for fleets.
Maplab Multi — ETH ASL.
DCL-SLAM, D-Lio-SAM — distributed LiDAR variants.

14. Comparison table

Algorithm	Sensor	Back-end	Map	Year	Typical use	Lib
EKF-SLAM (canonical)	range-bearing	EKF	sparse landmarks	1990	small indoor	custom
MonoSLAM	mono cam	EKF	sparse	2007	AR research	open
GMapping	2D LiDAR	RBPF	2D occupancy	2007	mobile-base ROS-1	open
FastSLAM 2.0	range/cam	RBPF	landmarks	2003	offroad	open
PTAM	mono cam	bundle adj	sparse	2007	AR	open
ORB-SLAM3	mono / stereo / RGB-D + IMU	g2o BA	sparse	2021	drones, AR	open
VINS-Mono	mono cam + IMU	Ceres slid-win	sparse	2018	UAV	open
VINS-Fusion	stereo + IMU + GPS	Ceres	sparse	2019	UGV/UAV	open
OKVIS	stereo + IMU	keyframe BA	sparse	2015	research	open
ROVIO	mono + IMU	iEKF	sparse direct	2015	UAV	open
OpenVINS	mono / stereo + IMU	MSCKF	sparse	2020	research baseline	open
MSCKF	mono + IMU	EKF (null-space)	sparse	2007	Tango, MagicLeap	proprietary
SVO 2.0	mono / multi-cam + IMU	semi-direct BA	semi-dense	2017	UAV	open
LSD-SLAM	mono cam	pose-graph	semi-dense	2014	research	open
DSO	mono cam	photo BA	sparse direct	2017	research	open
DTAM	mono cam	dense opt	dense	2011	research/AR	research
KinectFusion	RGB-D	ICP frame-to-model	TSDF	2011	indoor scan	open
ElasticFusion	RGB-D	non-rigid deform	surfel	2015	indoor	open
BundleFusion	RGB-D	global BA	TSDF	2017	indoor scan	open
Voxblox	depth	TSDF + ESDF	volumetric	2017	UAV planning map	open
RTABMap	stereo / RGB-D	pose-graph + bag-of-words	dense optional	2013	ROS RGB-D	open
LOAM	3D LiDAR	feature scan-match	point cloud	2014	KITTI / AV	closed
LeGO-LOAM	3D LiDAR	feature + ICP	point cloud	2018	UGV	open
LIO-SAM	3D LiDAR + IMU	GTSAM factor graph	point cloud	2020	AV / UGV	open
FAST-LIO2	LiDAR (Livox) + IMU	iEKF + ikd-tree	point cloud	2022	drone / fast UGV	open
Cartographer	2D / 3D LiDAR	Ceres branch-bound	submaps + grid	2016	warehouse AGV	open
KISS-ICP	3D LiDAR	point-to-point ICP	none	2023	LiDAR odometry baseline	open
Kimera	stereo + IMU	GTSAM + RPGO	semantic mesh	2020	semantic mapping	open
Hydra	stereo + IMU	Kimera + scene graph	3D scene graph	2022	embodied AI	open
DROID-SLAM	mono / stereo / RGB-D	dense diff BA	per-pixel depth	2021	learned VO	open
iMAP	RGB-D	MLP gradient descent	neural-implicit	2021	research	research
NICE-SLAM	RGB-D	hierarchical NeRF	neural-implicit	2022	research	open
Gaussian Splatting SLAM (MonoGS)	mono / RGB-D	3DGS photo BA	3D Gaussians	2024	research frontier	open

15. Front-end techniques (shared building blocks)

Feature extractors — Harris (1988), Shi-Tomasi (1994); SIFT (Lowe 2004); SURF (Bay 2008); FAST (Rosten 2006); ORB (Rublee, Rabaud, Konolige, Bradski 2011); BRISK (Leutenegger 2011). Learned: SuperPoint (DeTone, Malisiewicz, Rabinovich 2018), R2D2 (Revaud 2019), KP2D, DISK (Tyszkiewicz 2020), ALIKE / ALIKED (2022/23).
Descriptor matching — BRIEF (Calonder 2010), FREAK (Alahi 2012). Learned: SuperGlue (Sarlin, DeTone, Malisiewicz, Rabinovich 2020 CVPR), LightGlue (Lindenberger, Sarlin, Pollefeys 2023 ICCV) — much faster than SuperGlue.
Dense feature matching — LoFTR (Sun, Shen, Wang, Zhou, Bao 2021 CVPR) — detector-free transformer correspondences.
Photometric / direct tracking — KLT (Lucas-Kanade 1981), inverse compositional Lucas-Kanade (Baker + Matthews 2001), direct image alignment (Engel et al.).
Loop closure — DBoW2 vocabulary tree (Gálvez-López + Tardós 2012); NetVLAD (Arandjelović, Gronat, Torii, Pajdla, Sivic 2016) — learned place recognition; CosPlace / MixVPR more recent. Geometric verification by PnP + RANSAC or essential matrix decomposition.
Outlier rejection — RANSAC (Fischler + Bolles 1981); MAGSAC++ (Barath 2020); GNC (Graduated Non-Convexity, Yang + Carlone 2020) — applied at the graph optimization level.
IMU pre-integration — Forster, Carlone, Dellaert, Scaramuzza 2017 T-RO (“On-Manifold Preintegration for Real-Time Visual-Inertial Odometry”) — the canonical formulation used by most optimization-based tightly-coupled VIO and LIO systems. It summarizes all IMU samples between two keyframes as relative rotation, velocity, and position increments on SO(3)×R^3×R^3, expressed in the body frame of the first keyframe and therefore independent of the global state. Bias changes are absorbed through first-order Jacobians of these increments with respect to the gyro and accel biases, so the back-end can re-linearize without re-integrating raw IMU samples.
Robust kernels — Huber, Cauchy, Tukey, DCS (Dynamic Covariance Scaling, Agarwal 2013), switchable constraints (Sünderhauf + Protzel 2012). All of these reduce the influence of false-positive loop closures inside the standard NLS optimization.
Marginalization — Schur complement to remove old states while preserving information; foundational for sliding-window VIO (VINS-Mono, OKVIS, OpenVINS) where memory must be bounded.
Sparse Cholesky / QR — SuiteSparse CHOLMOD and SPQR (Davis); the numerical workhorses underneath Ceres / g2o / GTSAM linear-solver back-ends.
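
The robust kernels listed above are often applied through iteratively re-weighted least squares (IRLS): each residual gets a weight that shrinks as the residual grows, and the weighted problem is re-solved. The sketch below gives the standard weight functions for Huber, Cauchy, and Tukey and uses IRLS to estimate a single scalar. The tuning constants are the conventional values for unit-variance residuals, and `irls_mean` is a hypothetical helper standing in for a full graph optimizer.

```python
def huber_weight(r, k=1.345):
    """Weight 1 inside the threshold, decaying as k/|r| outside."""
    a = abs(r)
    return 1.0 if a <= k else k / a

def cauchy_weight(r, c=2.385):
    """Smoothly decaying weight; never exactly zero."""
    return 1.0 / (1.0 + (r / c) ** 2)

def tukey_weight(r, c=4.685):
    """Redescending weight: residuals beyond c are ignored entirely."""
    a = abs(r)
    return (1.0 - (a / c) ** 2) ** 2 if a <= c else 0.0

def irls_mean(measurements, weight_fn, iterations=20):
    """Robust estimate of one scalar by re-weighting residuals each iteration."""
    x = sum(measurements) / len(measurements)  # start from the plain mean
    for _ in range(iterations):
        w = [weight_fn(z - x) for z in measurements]
        x = sum(wi * zi for wi, zi in zip(w, measurements)) / sum(w)
    return x

measurements = [1.0, 1.1, 0.9, 10.0]  # last value plays the false loop closure
plain = sum(measurements) / len(measurements)
robust = irls_mean(measurements, huber_weight)
print(plain, robust)
```

Huber only bounds the pull of the outlier, so the estimate still shifts somewhat. Redescending kernels such as Tukey can remove the outlier's influence completely, but they make the cost non-convex and more dependent on initialization.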

16. Selection heuristics

Indoor drone, monocular, AR target → ORB-SLAM3 (mono-inertial) or VINS-Mono.
Outdoor autonomous vehicle mapping → LIO-SAM or FAST-LIO2 + Cartographer 3D, fused with cameras for semantics; post-process with g2o or GTSAM for HD-map deliverable.
Warehouse AGV with 2D LiDAR → Cartographer 2D (ROS 2) — the default deployment.
ROS 2 mobile base with RGB-D → RTABMap (most mature ROS RGB-D pipeline) or Cartographer (3D).
Visual-inertial AR / handheld → proprietary stacks (ARKit, ARCore — reportedly MSCKF-derived) or open-source ROVIO / OpenVINS / VINS-Fusion.
High-speed UAV / FPV racing → FAST-LIO2 (if LiDAR) or SVO 2.0 + event camera (Ultimate SLAM) for blur regimes.
HD-map for self-driving → LIO-SAM + Cartographer + offline g2o/GTSAM batch optimization; manual loop-closure curation.
UAV photogrammetry, no real-time → COLMAP offline SfM (Schönberger 2016) — not SLAM but related.
Bin-picking / dense reconstruction → KinectFusion / ElasticFusion / BundleFusion (RGB-D), or for research-grade NICE-SLAM / Gaussian Splatting SLAM.
Humanoid robot perception subsystem → Kimera VIO + Hydra scene graph for high-level semantics.
Quadruped (Spot, ANYmal) → typically proprietary onboard with ICP + visual; open-source equivalent is LIO-SAM + Kimera.
Race-car high-speed autonomous → FAST-LIO2 + Cartographer with custom high-frequency loop-closure logic.
Surgical endoscope / minimally-invasive → DROID-SLAM or specialized learned monocular SLAM (textureless tissue).
Underwater (sonar/visual) → Pose-graph SLAM with sonar registration (DIDSON, Sonar-SLAM); often Kalman-filter-based with DVL/INS aiding.
Multi-robot fleet → Kimera-Multi or Swarm-SLAM or DOOR-SLAM; require communication-aware design.
Resource-constrained microcontroller / embedded → EKF-SLAM with hand-tuned landmark count, or scan-matching-only (Hector SLAM) with no global optimization.
GPS-denied long-range (subterranean / DARPA SubT-style) → LIO-SAM or FAST-LIO2 fused with thermal / radar; require resilience to dust, smoke, dynamic obstacles, and degraded LiDAR geometry (long featureless tunnels). Teams at the 2021 DARPA SubT Final typically ran LiDAR-centric multi-sensor pipelines with custom degeneracy detection.
Dynamic / non-rigid scene → DynaSLAM (Bescos 2018), DS-SLAM, VDO-SLAM — all extend ORB-SLAM with semantic masking or motion segmentation to ignore moving objects. Pure NeRF / 3DGS SLAM still struggles with dynamics.
Low-cost ground robot (Roomba-class) → wheel-odometry + 2D LiDAR with GMapping or Cartographer 2D; or pure visual with monocular ORB-SLAM3.

17. Datasets and benchmarks

Algorithm comparisons depend on a small set of standard datasets that have become the de facto evaluation harnesses:

KITTI (Geiger, Lenz, Stiller, Urtasun 2013 IJRR) — automotive stereo + Velodyne HDL-64 + GPS/INS ground truth; the canonical autonomous-driving SLAM benchmark. KITTI-360 (2022) extends with 360° panoramas.
EuRoC MAV (Burri et al. 2016 IJRR) — drone stereo + IMU + Vicon ground truth; the canonical visual-inertial benchmark for UAV-scale motion.
TUM RGB-D (Sturm, Engelhard, Endres, Burgard, Cremers 2012 IROS) — handheld RGB-D + mocap; the canonical indoor RGB-D evaluation.
TUM VI (Schubert, Goll, Demmel, Usenko, Stueckler, Cremers 2018) — visual-inertial handheld, fisheye stereo.
TartanAir (Wang et al. 2020) — large-scale photorealistic simulation, diverse environments and motion; used as the DROID-SLAM training corpus.
ScanNet (Dai 2017) and Replica (Straub 2019) — indoor RGB-D scene datasets used by neural-implicit SLAMs (iMAP, NICE-SLAM).
Newer College (Ramezani 2020) — Oxford handheld 3D LiDAR + IMU + cameras + survey-grade ground truth; LIO benchmark.
NCLT (Carlevaris-Bianco 2016, U-Michigan) — Segway long-term outdoor (15-month) LiDAR + camera; long-term SLAM benchmark.
Hilti SLAM Challenge (2021-2023) — industrial / construction environment, multi-sensor; informs LIO research.

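Metrics: Absolute Trajectory Error (ATE), reported as RMSE in meters after aligning the estimate to ground truth, is the dominant single-number metric (Sturm 2012). The alignment is a rigid SE(3) fit for metric-scale systems and a Sim(3) fit for monocular systems, whose scale is unobservable. Relative Pose Error (RPE) measures drift over fixed-distance windows. For 3D map quality: Chamfer distance to ground-truth mesh, F-score at a threshold.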
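
A minimal sketch of the ATE computation, simplified to 2D positions with a rotation-plus-translation alignment. Evaluation tools normally do this in 3D with a Horn/Umeyama fit and optional scale; the function names and the sample trajectory here are assumptions for illustration only.

```python
import math

def align_rigid_2d(est, gt):
    """Least-squares rigid alignment of paired 2D positions (est onto gt)."""
    n = len(est)
    ex = sum(p[0] for p in est) / n
    ey = sum(p[1] for p in est) / n
    gx = sum(p[0] for p in gt) / n
    gy = sum(p[1] for p in gt) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(est, gt):
        ax, ay, bx, by = ax - ex, ay - ey, bx - gx, by - gy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return [(c * (x - ex) - s * (y - ey) + gx,
             s * (x - ex) + c * (y - ey) + gy) for x, y in est]

def ate_rmse(est, gt):
    """Root-mean-square position error after alignment."""
    aligned = align_rigid_2d(est, gt)
    sq = [(ax - bx) ** 2 + (ay - by) ** 2
          for (ax, ay), (bx, by) in zip(aligned, gt)]
    return math.sqrt(sum(sq) / len(sq))

gt = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (3.0, 1.5), (4.0, 3.0)]
c, s = math.cos(0.3), math.sin(0.3)
# Same trajectory expressed in a different, arbitrary world frame
est = [(c * x - s * y + 5.0, s * x + c * y - 2.0) for x, y in gt]
# Same again, but with the final pose drifted by 0.5 m
drifted = est[:-1] + [(est[-1][0] + 0.5, est[-1][1])]
ate_clean = ate_rmse(est, gt)
ate_drift = ate_rmse(drifted, gt)
print(ate_clean, ate_drift)
```

The first value is zero up to rounding because a pure change of reference frame is removed by the alignment. Only the drift in the second trajectory contributes to the error.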

18. Theoretical underpinnings (brief)

Observability — Visual-inertial systems have four unobservable directions (global position xyz, yaw); pitch and roll are observable thanks to gravity. Visual-only mono SLAM additionally has scale unobservable (7 unobservable degrees: SE(3) gauge + scale). LiDAR-inertial: same four unobservable as VIO. Consistency-aware filtering (First-Estimates Jacobian, OC-EKF — Huang, Mourikis, Roumeliotis 2010) enforces these.
Information form vs covariance form — equivalent dual representations of the same Gaussian; information form is sparse when factor graphs are sparse, hence the dominance of information-form smoothers.
Gauge freedom — SLAM is defined up to a rigid transform of the global frame (and scale for monocular); the back-end fixes this by anchoring the first pose or by computing on the manifold modulo the gauge group.
Maximum-A-Posteriori (MAP) — the unifying view of modern SLAM: minimize the negative-log-posterior over poses+landmarks, with Gaussian (or robust-kernel) models for measurements and motion. Under Gaussian noise this reduces to weighted nonlinear least-squares, and to linear least-squares when the models are also linear. The factor graph is the graphical form of that posterior’s factorization: one factor per measurement or prior, connected to the variables it involves.
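
Written out, the MAP formulation takes the following standard form. The symbols are introduced here for illustration and are not defined elsewhere in this index: X is the set of all poses and landmarks, z_k the k-th measurement, X_k the subset of variables it involves, h_k its measurement model, and Σ_k its noise covariance. Priors are treated as additional factors.

```latex
\begin{aligned}
\hat{X} &= \arg\max_{X} \; p(X \mid Z)
         = \arg\max_{X} \; p(X) \prod_{k} p(z_k \mid X_k) \\
        &= \arg\min_{X} \; \sum_{k} \bigl\| h_k(X_k) - z_k \bigr\|^{2}_{\Sigma_k},
\qquad
\| e \|^{2}_{\Sigma} = e^{\top} \Sigma^{-1} e .
\end{aligned}
```

The second line follows from taking the negative logarithm of Gaussian likelihoods p(z_k | X_k) ∝ exp(-½ ‖h_k(X_k) - z_k‖²_Σk). Because each term involves only the few variables in X_k, the resulting Jacobian and information matrix are sparse.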

19. Cross-references

slam — overview note in Tier 1.
computer-vision-robotics — visual front-end fundamentals shared with non-SLAM CV.
perception-sensors — sensor characteristics that constrain algorithm choice.
bayesian-estimation — EKF / UKF / particle filter / factor-graph theory underpinning every back-end.
sensors-pose-motion — IMU pre-integration, GNSS fusion, sensor models.
path-planning-algorithms — consumer of SLAM maps; ESDF / occupancy / mesh feed into planners.

20. Citations (primary)

Cadena, Carlone, Carrillo, Latif, Scaramuzza, Neira, Reid, Leonard. “Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age.” IEEE T-RO 32(6), 2016.
Davison, Reid, Molton, Stasse. “MonoSLAM: Real-Time Single Camera SLAM.” IEEE T-PAMI 29(6), 2007.
Klein, Murray. “Parallel Tracking and Mapping for Small AR Workspaces.” ISMAR 2007.
Engel, Koltun, Cremers. “Direct Sparse Odometry.” IEEE T-PAMI 40(3), 2018.
Mur-Artal, Tardós. “ORB-SLAM2: an Open-Source SLAM System for Monocular, Stereo and RGB-D Cameras.” IEEE T-RO 33(5), 2017.
Campos, Elvira, Gómez Rodríguez, Montiel, Tardós. “ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM.” IEEE T-RO 37(6), 2021.
Hess, Kohler, Rapp, Andor. “Real-Time Loop Closure in 2D LIDAR SLAM.” ICRA 2016.
Shan, Englot, Meyers, Wang, Ratti, Rus. “LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping.” IROS 2020.
Xu, Cai, He, Lin, Zhang. “FAST-LIO2: Fast Direct LiDAR-Inertial Odometry.” IEEE T-RO 38(4), 2022.
Qin, Li, Shen. “VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator.” IEEE T-RO 34(4), 2018.
Newcombe, Izadi, Hilliges, Molyneaux, Kim, Davison, Kohli, Shotton, Hodges, Fitzgibbon. “KinectFusion: Real-time Dense Surface Mapping and Tracking.” ISMAR 2011.
Whelan, Salas-Moreno, Glocker, Davison, Leutenegger. “ElasticFusion: Real-time Dense SLAM and Light Source Estimation.” IJRR 35(14), 2016.
Teed, Deng. “DROID-SLAM: Deep Visual SLAM for Monocular, Stereo, and RGB-D Cameras.” NeurIPS 2021.
Matsuki, Murai, Kelly, Davison. “Gaussian Splatting SLAM.” CVPR 2024.
Rosinol, Abate, Chang, Carlone. “Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping.” ICRA 2020.
Forster, Carlone, Dellaert, Scaramuzza. “On-Manifold Preintegration for Real-Time Visual-Inertial Odometry.” IEEE T-RO 33(1), 2017.
Grisetti, Stachniss, Burgard. “Improved Techniques for Grid Mapping with Rao-Blackwellized Particle Filters.” IEEE T-RO 23(1), 2007.
Smith, Self, Cheeseman. “Estimating Uncertain Spatial Relationships in Robotics.” In Autonomous Robot Vehicles, Springer, 1990.
Montemerlo, Thrun, Koller, Wegbreit. “FastSLAM: A Factored Solution to the Simultaneous Localization and Mapping Problem.” AAAI 2002.
Mourikis, Roumeliotis. “A Multi-State Constraint Kalman Filter for Vision-aided Inertial Navigation.” ICRA 2007.
Kümmerle, Grisetti, Strasdat, Konolige, Burgard. “g2o: A General Framework for Graph Optimization.” ICRA 2011.
Kaess, Johannsson, Roberts, Ila, Leonard, Dellaert. “iSAM2: Incremental Smoothing and Mapping Using the Bayes Tree.” IJRR 31(2), 2012.
Dellaert, Kaess. “Square Root SAM: Simultaneous Localization and Mapping via Square Root Information Smoothing.” IJRR 25(12), 2006.
Lu, Milios. “Globally Consistent Range Scan Alignment for Environment Mapping.” Autonomous Robots 4(4), 1997.
Bloesch, Burri, Omari, Hutter, Siegwart. “Iterated Extended Kalman Filter Based Visual-Inertial Odometry Using Direct Photometric Feedback.” IJRR 36(10), 2017.
Sucar, Liu, Ortiz, Davison. “iMAP: Implicit Mapping and Positioning in Real-Time.” ICCV 2021.
Zhu, Peng, Larsson, Xu, Bao, Cui, Oswald, Pollefeys. “NICE-SLAM: Neural Implicit Scalable Encoding for SLAM.” CVPR 2022.
Vizzo, Guadagnino, Mersch, Wiesmann, Behley, Stachniss. “KISS-ICP: In Defense of Point-to-Point ICP — Simple, Accurate, and Robust Registration If Done the Right Way.” IEEE RA-L 8(2), 2023.
Hughes, Chang, Carlone. “Hydra: A Real-Time Spatial Perception System for 3D Scene Graph Construction and Optimization.” RSS 2022.
Sarlin, DeTone, Malisiewicz, Rabinovich. “SuperGlue: Learning Feature Matching with Graph Neural Networks.” CVPR 2020.
Lindenberger, Sarlin, Pollefeys. “LightGlue: Local Feature Matching at Light Speed.” ICCV 2023.
Gálvez-López, Tardós. “Bags of Binary Words for Fast Place Recognition in Image Sequences.” IEEE T-RO 28(5), 2012.
