SQL — Reference (ANSI/ISO + dialects)
Source: https://www.iso.org/standard/76583.html
SQL
- Created: 1974 (SEQUEL/SQL at IBM, Donald Chamberlin & Raymond Boyce); first ANSI standard 1986 (SQL-86), ISO standard 1987
- Latest stable (standard): ISO/IEC 9075:2023 (“SQL:2023”) — added Part 16 (Property Graph Queries / SQL/PGQ) and a
JSONdata type in SQL/Foundation. (Wikipedia SQL article citing ISO/IEC 9075:2023) - Paradigms: Declarative, set-based, relational; with procedural extensions per dialect (PL/pgSQL, T-SQL, PL/SQL, MySQL stored programs)
- Typing: Static, strongly typed columns; expression coercion rules vary by dialect
- Memory: Engine-managed; buffer pools, page cache, sort spillage to disk
- Compilation: Statements parsed → bound → planned/optimized → executed; plans typically cached/reused
- Primary domains: Online transactional processing (OLTP), analytical processing (OLAP), data warehousing, embedded application storage, time-series, geospatial, document and graph workloads
- Implementations: PostgreSQL (18.x, 2026), MySQL (8.4 LTS, 2026 release line) / MariaDB, SQLite (3.x, embedded), Microsoft SQL Server (2022 / vNext on Linux & Windows), Oracle Database (23ai/26ai), plus DB2, CockroachDB, YugabyteDB, Snowflake, BigQuery, Redshift, DuckDB, ClickHouse, Spanner, Aurora, Singlestore — all SQL-flavored, none fully ISO-compliant
- Official docs:
- Standard: https://www.iso.org/standard/76583.html
- PostgreSQL: https://www.postgresql.org/docs/current/
- MySQL 8.4: https://dev.mysql.com/doc/refman/8.4/en/
- SQLite: https://www.sqlite.org/docs.html
- SQL Server: https://learn.microsoft.com/en-us/sql/
- Oracle: https://docs.oracle.com/en/database/oracle/oracle-database/
At a glance
SQL (“Structured Query Language”) is a declarative language for relational data — you describe what set you want; the optimizer chooses how to get it. The ISO/IEC 9075 standard runs to thousands of pages across a dozen parts (Foundation, CLI, PSM, MED, OLB, Schemata, JRT, XML, etc., with SQL/PGQ added in 2023), and “virtually no implementation adheres to it fully.” Every database vendor implements a dialect — a SQL core plus extensions, plus a procedural language. The portable subset is small enough that getting cross-dialect SQL right is a real engineering problem; ORMs and DSLs (jOOQ, SQLAlchemy Core, Diesel) exist largely to hide it. (Wikipedia)
Quick dialect cheat sheet:
| Concern | PostgreSQL | MySQL/MariaDB | SQLite | SQL Server (T-SQL) | Oracle (PL/SQL) |
|---|---|---|---|---|---|
| Architecture | Server, multi-process | Server, multi-thread (per connection) | In-process library, single file | Server, multi-thread | Server, multi-process |
| Default isolation | Read Committed | Repeatable Read | Serializable | Read Committed | Read Committed |
| Concurrency model | MVCC (snapshot) | InnoDB MVCC | Reader-writer locks (WAL mode → MVCC-ish) | MVCC w/ Read Committed Snapshot opt-in | MVCC |
| Auto-increment | GENERATED ALWAYS AS IDENTITY (or serial) | AUTO_INCREMENT | INTEGER PRIMARY KEY (rowid alias) | IDENTITY(1,1) | GENERATED BY DEFAULT AS IDENTITY (12c+) |
| Limit/offset | LIMIT n OFFSET m | LIMIT m, n or LIMIT n OFFSET m | LIMIT n OFFSET m | OFFSET m ROWS FETCH NEXT n ROWS ONLY | OFFSET m ROWS FETCH NEXT n ROWS ONLY (12c+) |
| String concat | ` | (andconcat()`) | CONCAT() (` | ||
| Quote string | 'literal'; double '' to escape | 'literal' or "literal" | 'literal' | 'literal' (N'literal' for unicode) | 'literal' |
| Quote identifier | "name" | `name` (backticks) | "name" or ` | [name] or "name" | "NAME" (uppercase canonical) |
| JSON type | JSON, JSONB (binary, indexable) | JSON (binary internally) | TEXT + json1 extension | nvarchar(max) + JSON_* functions, JSON type (2025+) | JSON (21c+) + JSON Relational Duality |
| UPSERT | INSERT … ON CONFLICT … DO UPDATE | INSERT … ON DUPLICATE KEY UPDATE | INSERT … ON CONFLICT … DO UPDATE | MERGE (with caveats) | MERGE |
| Boolean | Native BOOLEAN | TINYINT(1) (no real bool) | None (use INTEGER) | BIT | None pre-23ai (use NUMBER(1)) |
| Procedural | PL/pgSQL (also PL/Python, PL/Perl, PL/Tcl) | SQL/PSM dialect | None (call from host) | T-SQL | PL/SQL |
Getting started
Install:
- PostgreSQL:
brew install postgresql@17,apt install postgresql, or Dockerpostgres:17. Connect withpsql. - MySQL:
brew install mysql,apt install mysql-server. CLI:mysql. - SQLite:
brew install sqlite, almost always preinstalled. CLI:sqlite3 mydb.sqlite. No server. - SQL Server: Docker
mcr.microsoft.com/mssql/server:2022-latest(cross-platform now), or Express MSI on Windows. CLI:sqlcmd. - Oracle: Free edition Oracle Database Free 23ai (Linux container easiest). CLI:
sqlplusor modernsqlcl.
Hello world (any dialect):
SELECT 'Hello, world!';Project layout: No standard. Migration tooling owns layout: Flyway (V1__init.sql, V2__add_users.sql), Liquibase (XML/YAML/SQL changesets), Alembic (SQLAlchemy), Prisma migrate, Knex, sqitch.
Build/package tool: None for SQL itself. Migrations are the build pipeline. Schema diff tools: pg_dump --schema-only, mysqldump --no-data, sqldef, Atlas, dbmate, Skeema (MySQL).
REPL: psql, mysql, sqlite3, sqlcmd, sqlplus/sqlcl. DBeaver (cross-dialect, free), DataGrip (JetBrains), TablePlus, pgAdmin, MySQL Workbench, SQL Server Management Studio (SSMS), Azure Data Studio as GUIs.
Basics
Types & literals. Standard scalar types: BOOLEAN, SMALLINT/INTEGER/BIGINT, NUMERIC(p,s)/DECIMAL, REAL/DOUBLE PRECISION, CHAR(n)/VARCHAR(n)/TEXT, DATE/TIME/TIMESTAMP [WITH TIME ZONE], INTERVAL, BLOB/BYTEA, UUID (Postgres), JSON/JSONB. String literals single-quoted with doubled single-quote escape: 'O''Brien'. Date: DATE '2026-05-06', TIMESTAMP '2026-05-06 12:34:56+00'. NULL is its own three-valued logic — x = NULL is never true; use x IS NULL / IS DISTINCT FROM.
DDL — schema.
CREATE TABLE users (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, -- standard since SQL:2003; supported by Postgres, Oracle 12c+, SQL Server 2022 (T-SQL syntax differs)
email VARCHAR(255) NOT NULL UNIQUE,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT CURRENT_TIMESTAMP,
metadata JSONB
);
CREATE INDEX users_email_lower_idx ON users (LOWER(email));
ALTER TABLE users ADD COLUMN deleted_at TIMESTAMP WITH TIME ZONE;
DROP TABLE users CASCADE;DML — data.
INSERT INTO users (email) VALUES ('a@x'), ('b@x') RETURNING id; -- RETURNING is Postgres/Oracle/SQL Server (OUTPUT in T-SQL); not SQLite/MySQL
UPDATE users SET deleted_at = now() WHERE email = 'a@x';
DELETE FROM users WHERE deleted_at < now() - INTERVAL '90 days';SELECT — the query.
SELECT u.email, COUNT(*) AS post_count
FROM users u
LEFT JOIN posts p ON p.author_id = u.id
WHERE u.created_at >= DATE '2026-01-01'
GROUP BY u.email
HAVING COUNT(*) > 10
ORDER BY post_count DESC
LIMIT 100;Logical processing order (mental model — physical order is the optimizer’s choice): FROM → JOIN → WHERE → GROUP BY → HAVING → window functions → SELECT → DISTINCT → ORDER BY → LIMIT. This is why you can’t reference a SELECT alias in WHERE (alias not yet bound) but you can in ORDER BY.
Joins. INNER JOIN, LEFT/RIGHT/FULL OUTER JOIN, CROSS JOIN, LATERAL JOIN / CROSS APPLY (per-row subquery, see Advanced).
Set operations. UNION / UNION ALL, INTERSECT, EXCEPT (Postgres/SQL Server) / MINUS (Oracle). ALL keeps duplicates; without it dedupes (= one extra sort).
Constraints. PRIMARY KEY, UNIQUE, FOREIGN KEY … REFERENCES … ON DELETE CASCADE | SET NULL | RESTRICT | NO ACTION, CHECK (...), NOT NULL. Defer FKs in transactions: SET CONSTRAINTS ALL DEFERRED.
Intermediate
Window functions (SQL:2003, universal except very old SQLite). Compute aggregations over a partition without collapsing rows:
SELECT
user_id,
order_total,
SUM(order_total) OVER (PARTITION BY user_id ORDER BY ordered_at
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY ordered_at) AS order_seq,
LAG(order_total) OVER (PARTITION BY user_id ORDER BY ordered_at) AS prev_total
FROM orders;Supported window fns: ROW_NUMBER, RANK, DENSE_RANK, PERCENT_RANK, NTILE, LAG/LEAD, FIRST_VALUE/LAST_VALUE/NTH_VALUE, plus any aggregate as a window function. Frame clauses: ROWS BETWEEN n PRECEDING AND m FOLLOWING, RANGE, GROUPS (SQL:2011, Postgres+Oracle).
Common Table Expressions (CTEs).
WITH active AS (
SELECT * FROM users WHERE deleted_at IS NULL
)
SELECT * FROM active WHERE created_at > now() - INTERVAL '7 days';Recursive CTE for graph/tree traversal:
WITH RECURSIVE descendants AS (
SELECT id, parent_id, name FROM categories WHERE id = 1
UNION ALL
SELECT c.id, c.parent_id, c.name
FROM categories c
JOIN descendants d ON c.parent_id = d.id
)
SELECT * FROM descendants;Termination is by the UNION ALL halting when the right side returns no rows.
Subqueries. Scalar (SELECT (SELECT max(x) FROM …)), table (in FROM), correlated (references outer query — runs per row, often optimizer-rewritten to a join). EXISTS (subquery) is usually the right form for “any matching row” — short-circuits.
Indexes (more in Advanced). CREATE INDEX … (col) for B-tree (default everywhere). Multi-column index (a, b) accelerates WHERE a = ?, WHERE a = ? AND b = ?, but not WHERE b = ? alone. Functional/expression indexes: CREATE INDEX … ON t (LOWER(email)) (Postgres/Oracle/SQL Server, not native MySQL — workaround via generated columns).
Transactions.
BEGIN; -- or START TRANSACTION
… statements …
COMMIT; -- or ROLLBACKSavepoints: SAVEPOINT s1; … ROLLBACK TO SAVEPOINT s1;. Transaction isolation set per-tx: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE | REPEATABLE READ | READ COMMITTED | READ UNCOMMITTED.
Procedural / stored programs.
- PostgreSQL — PL/pgSQL (also pluggable PL/Python, PL/Perl, PL/Tcl, PL/v8).
- SQL Server — T-SQL (
CREATE PROCEDURE … AS BEGIN … END). - Oracle — PL/SQL (
CREATE OR REPLACE PROCEDURE … IS BEGIN … END;). - MySQL — SQL/PSM dialect (
DELIMITER //; CREATE PROCEDURE …). - SQLite — none built-in; use the host language.
json_each,json_treetable-valued functions cover much of what triggers and functions provide elsewhere.
Advanced
Memory model & on-disk format. Each engine maintains a buffer pool / page cache (PostgreSQL: shared_buffers; MySQL/InnoDB: innodb_buffer_pool_size; SQL Server: max server memory; Oracle: SGA). Pages are 8 KiB (Postgres, SQL Server default), 16 KiB (InnoDB default), or configurable (SQLite default 4 KiB; Oracle 2/4/8/16/32 KiB by tablespace). Sort/hash spills to tempdb (SQL Server), temp tablespace (Oracle), temp_tablespaces (Postgres), sort_buffer_size overflow files (MySQL).
MVCC vs lock-based. Postgres, MySQL/InnoDB, Oracle, SQL Server (with Read Committed Snapshot) all use MVCC — readers see a snapshot, writers create new row versions, vacuum/garbage collection reclaims old versions later. Pre-RCSI SQL Server and pre-WAL SQLite use shared/exclusive locks — readers block writers and vice versa. MVCC implications: long-running transactions delay vacuum (Postgres “bloat”), repeatable-read works without read locks, write conflicts are detected at commit time under SERIALIZABLE.
Transaction isolation — the actual semantics, not the names. The standard defines four levels by what anomalies they prevent:
| Level | Dirty read | Non-repeatable read | Phantom read | Write skew | Serialization anomaly |
|---|---|---|---|---|---|
| READ UNCOMMITTED | possible | possible | possible | possible | possible |
| READ COMMITTED | prevented | possible | possible | possible | possible |
| REPEATABLE READ | prevented | prevented | possible (per std) | possible | possible |
| SERIALIZABLE | prevented | prevented | prevented | prevented | prevented |
But every engine implements these levels differently:
- PostgreSQL “REPEATABLE READ” is actually snapshot isolation — also prevents phantom reads (better than the standard). Doesn’t prevent write skew. Its SERIALIZABLE uses Serializable Snapshot Isolation (SSI) — true serializability with rollback on detected anomaly. Default: READ COMMITTED.
- MySQL/InnoDB “REPEATABLE READ” (the default) uses snapshot reads + next-key locks for inserts in range-locked queries. Mostly prevents phantoms — not always. Its SERIALIZABLE adds shared locks on every read.
- SQL Server default READ COMMITTED uses locks unless RCSI (Read Committed Snapshot Isolation) is on. SNAPSHOT isolation is opt-in per-database. SERIALIZABLE uses range locks.
- Oracle has READ COMMITTED (default) and SERIALIZABLE; both snapshot-based. No support for READ UNCOMMITTED or REPEATABLE READ.
- SQLite is SERIALIZABLE always (single-writer, journal/WAL provides snapshot reads in WAL mode).
MERGE / UPSERT. Standard MERGE (SQL:2003) exists in SQL Server, Oracle, and Postgres 15+ — but its semantics under concurrency are notoriously underspecified. SQL Server’s MERGE has known race-condition bugs (Aaron Bertrand’s Use Caution with SQL Server’s MERGE Statement); many guides recommend INSERT … ON CONFLICT (Postgres/SQLite) or INSERT … ON DUPLICATE KEY UPDATE (MySQL) instead.
-- Postgres / SQLite (UPSERT)
INSERT INTO counters (key, value) VALUES ('x', 1)
ON CONFLICT (key) DO UPDATE SET value = counters.value + EXCLUDED.value;
-- MySQL
INSERT INTO counters (`key`, value) VALUES ('x', 1)
ON DUPLICATE KEY UPDATE value = value + VALUES(value);
-- Standard MERGE
MERGE INTO counters c USING (SELECT 'x' AS k, 1 AS v) s ON (c.key = s.k)
WHEN MATCHED THEN UPDATE SET value = c.value + s.v
WHEN NOT MATCHED THEN INSERT (key, value) VALUES (s.k, s.v);Lateral joins. LATERAL lets the right side of a join reference columns from the left. Standardized SQL:1999. Supported in Postgres, Oracle 12c+, SQL Server (CROSS APPLY/OUTER APPLY predates lateral), MySQL 8.0.14+.
-- "Top 3 orders per user"
SELECT u.id, o.*
FROM users u
CROSS JOIN LATERAL (
SELECT * FROM orders WHERE user_id = u.id ORDER BY total DESC LIMIT 3
) o;JSON / XML. Per dialect:
- Postgres:
JSONB(binary, indexable, GIN); operators->,->>,#>,@>,?,jsonb_path_exists,jsonb_path_query. SQL:2016 path syntax injsonb_path_*family. - MySQL:
JSON(binary),->,->>,JSON_EXTRACT,JSON_TABLEfor shredding to relational form. - SQL Server: historically
nvarchar(max)+JSON_VALUE/JSON_QUERY/OPENJSON; SQL Server 2022+ added nativeJSONtype andISJSON,JSON_OBJECT,JSON_ARRAY,JSON_PATH_EXISTS. (What’s new in SQL Server 2022) - Oracle:
JSONtype (21c+) and JSON Relational Duality (23ai+) — query the same data as table or document. - SQLite:
json1extension built in (json_extract,json_each,json_tree). - XML: SQL/XML standard. Postgres
xpath, SQL Server.value('xpath', 'sqltype'), Oracle XQuery.
Query planner inspection.
- Postgres:
EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) <query>— runs query, returns chosen plan + actual timings + I/O. - MySQL:
EXPLAIN ANALYZE(8.0.18+) orEXPLAIN FORMAT=JSON. - SQL Server:
SET STATISTICS XML ON; SET STATISTICS IO ON;or “Display Actual Execution Plan” in SSMS;SHOWPLAN_XML. - Oracle:
EXPLAIN PLAN FOR <query>; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);Real-time SQL Monitoring for live exec. - SQLite:
EXPLAIN QUERY PLAN <query>.
Index types beyond B-tree.
| Type | Engine(s) | Use case |
|---|---|---|
| B-tree | All | Equality, range, prefix on ordered columns |
| Hash | Postgres (memory only pre-10), MySQL Memory engine | Equality only; no range. Rare; B-tree usually better |
| GIN (Generalized Inverted) | Postgres | Inverted index for JSONB, arrays, full-text (tsvector); fast @> containment |
| GiST (Generalized Search Tree) | Postgres | Geometric, full-text, range types, kNN; foundation of PostGIS |
| BRIN (Block Range) | Postgres | Append-only / time-series tables — small index, scans block ranges |
| SP-GiST | Postgres | Space-partitioned trees (quad-trees, k-d trees) |
| Bitmap | Oracle | Low-cardinality columns, OLAP joins |
| Columnstore | SQL Server, Postgres (citus, hydra), MariaDB | Analytical scans across columns; massive compression |
| Full-text | All major (tsvector/MATCH AGAINST/CONTAINS/Oracle Text/SQLite FTS5) | Token search, ranking |
Procedural extensions deep dive.
- PL/pgSQL (Postgres):
CREATE FUNCTION … LANGUAGE plpgsql AS $$ DECLARE … BEGIN … END $$;. Cursors, exceptions (EXCEPTION WHEN unique_violation THEN …), composite types. - T-SQL (SQL Server):
BEGIN TRY/CATCH,THROW, table-valued parameters, OUTPUT clause, dynamic SQL viasp_executesql @sql, @paramdef, @p1=…(parameterized — SQL injection-safe vsEXEC()), schema-bound functions for indexed views. - PL/SQL (Oracle): packages (header + body),
%TYPE/%ROWTYPEfor column-typed vars, bulk operations (FORALL/BULK COLLECT— order-of-magnitude faster than row-by-row), pipelined table functions, autonomous transactions (PRAGMA AUTONOMOUS_TRANSACTION). - MySQL stored programs: procedures, functions, triggers, events. Far less mature than Oracle/Postgres; many shops avoid in favor of application-side logic.
Foreign Data Wrappers / federation.
- Postgres:
postgres_fdw,oracle_fdw,mysql_fdw,file_fdw,multicorn(Python wrappers) — query external sources as tables. - SQL Server:
OPENROWSET, linked servers, PolyBase (S3, Hadoop, parquet — extended in SQL Server 2022). - Oracle: Database Links,
DBMS_HS_*(Heterogeneous Services). - MySQL: FEDERATED storage engine (limited).
Materialized views.
- Postgres:
CREATE MATERIALIZED VIEW … WITH DATA; REFRESH MATERIALIZED VIEW [CONCURRENTLY] mv;(CONCURRENTLY requires unique index, doesn’t lock readers). - Oracle: rich support —
REFRESH FAST(incremental via materialized view logs),REFRESH COMPLETE,REFRESH FORCE, ON COMMIT vs ON DEMAND. - SQL Server: indexed views — strictly more constrained but maintained synchronously.
- MySQL: none built-in. Triggers + summary tables, or external tools (Flexviews).
- SQLite: none.
Partitioning.
- Postgres (declarative since 10):
PARTITION BY RANGE | LIST | HASH (col); subpartitioning. Partition pruning at planning time. - MySQL: RANGE, LIST, HASH, KEY, COLUMNS. Partition pruning. Limited (no FK across partitions).
- SQL Server: partition functions + partition schemes; one filegroup per partition.
SWITCH PARTITIONfor fast bulk load. - Oracle: richest — RANGE, LIST, HASH, INTERVAL (auto-create on insert), REFERENCE (FK-based), composite. Partition-wise joins.
- SQLite: none (single file).
Sharding is application-side except for purpose-built distributed SQL (CockroachDB, YugabyteDB, Spanner, Vitess for MySQL, Citus for Postgres). Add consistent hashing and you have a horizontal-scale story.
Prepared statements & plan caching. Every engine caches plans for parameterized statements. Pitfalls: parameter sniffing — the first execution’s parameters shape the plan, subsequent calls with skewed params reuse a bad plan. Mitigations: SQL Server OPTION (RECOMPILE), OPTIMIZE FOR UNKNOWN, parameter-sensitive plan optimization (2022+); Oracle OPT_PARAM hints, adaptive plans (12c+); Postgres prepared-statement plan cache hits the “generic vs custom plan” decision at execution 6.
God mode
- Recursive CTEs for graph traversal: above pattern applies to org charts, dependency graphs, transitive closures. With
cycle detectionclause (SQL:2016, Postgres 14+):CYCLE id SET is_cycle USING path_array. - Window-function recipes:
ROW_NUMBER() OVER (PARTITION BY user ORDER BY ts DESC) = 1for “latest per group”.SUM(x) OVER ()for grand total alongside detail rows. Gaps-and-islands:ROW_NUMBER() OVER (ORDER BY ts) - ROW_NUMBER() OVER (PARTITION BY status ORDER BY ts)groups consecutive same-status rows. FILTERclause (SQL:2003):COUNT(*) FILTER (WHERE active)instead ofCOUNT(CASE WHEN active THEN 1 END). Postgres + SQLite + Oracle 23ai. Cleaner for conditional aggregates.GROUPING SETS,CUBE,ROLLUP— multi-level aggregations in one query (SQL:1999).TABLESAMPLE—SELECT * FROM t TABLESAMPLE SYSTEM (1)reads ~1% of pages. Postgres, SQL Server, Oracle.MERGEwith caveats: prefer ON CONFLICT (Postgres/SQLite) where available; if you must MERGE (Oracle/SQL Server), wrap in serializable txn or use HOLDLOCK in T-SQL to avoid concurrency races.- JSONB indexing (Postgres):
CREATE INDEX ON t USING gin (data jsonb_path_ops)— operator-class-tuned GIN index, smaller and faster for@>queries than the default. - Generated columns (
GENERATED ALWAYS AS … STORED | VIRTUAL): synthesize values from other columns;STOREDlets you index them (the workaround for MySQL’s lack of expression indexes). LATERALfor top-N-per-group (above).- Arrays (Postgres):
text[],int[][],unnest(),array_agg(),array @> array, GIN-indexable. Often a JSON-replacement for typed array data. - Range types (Postgres):
int4range,tstzrange. Exclusion constraints prevent overlapping ranges (e.g., room booking conflicts). - PL/pgSQL
RAISE EXCEPTION USING ERRCODE = '23505'— emit specific SQLSTATE so apps can distinguish constraint violations. SELECT … FOR UPDATE [SKIP LOCKED | NOWAIT]— row-level pessimistic locking;SKIP LOCKEDis the canonical work-queue pattern (each consumer claims rows the others didn’t lock). Standard since SQL:2003 (SKIP LOCKEDis SQL:2016).SELECT … FOR SHARE— share-lock for read-modify-write to prevent concurrent updates without blocking other readers.EXPLAIN ANALYZEdiscipline: always includeBUFFERS(Postgres) — knowing whether a plan hit cache vs disk is more important than its time. Compare estimated vs actual rows; orders-of-magnitude divergence = stale stats (ANALYZE/UPDATE STATISTICS).- Transaction-scoped temp tables (Postgres
CREATE TEMP TABLE … ON COMMIT DROP, SQL Server#temp, OracleCREATE GLOBAL TEMPORARY TABLE … ON COMMIT DELETE ROWS) for pipeline-style intermediate results. COPY(Postgres) /BULK INSERT(SQL Server) /LOAD DATA INFILE(MySQL) /SQL*Loader(Oracle) — bypass row-by-row INSERT for massive throughput on bulk loads.- Database-side full-text: Postgres
to_tsvector/to_tsquery+ GIN, MySQLMATCH … AGAINST, SQL ServerCONTAINS/FREETEXT, Oracle Text, SQLite FTS5. Often good enough to skip Elasticsearch for small corpora. - PostgreSQL’s
pg_stat_*views — runtime telemetry:pg_stat_statements(per-query stats; needs extension),pg_stat_user_indexes(find unused indexes byidx_scan = 0),pg_stat_activity(running queries).
Idioms & style
- Naming:
snake_casefor tables and columns is overwhelmingly the convention (Postgres folds unquoted identifiers to lowercase; mixed case requires quoting forever). Singular vs plural table names is a holy war — pick one and stick. - Always alias tables in joins (
FROM users u JOIN orders o ON u.id = o.user_id). SELECT *is for ad-hoc only. In code, name columns — schema changes silently break consumers otherwise.COUNT(*)overCOUNT(1)overCOUNT(col).*and1are equivalent (the optimizer treats them identically on every modern engine);COUNT(col)skips NULLs.- Use
IS [NOT] DISTINCT FROMfor null-safe equality (SQL:1999; Postgres, SQL Server 2022+, Oracle 23ai). MySQL has<=>. - Prefer
EXISTSoverIN (subquery)for “does any row match” — short-circuits, NULL-safe. - Always parameterize — never string-concatenate values into SQL. SQL injection is solved with
?/$1placeholders + driver binding. - Migrations are append-only. Edit-and-rerun breaks production. Use Flyway/Liquibase/Atlas/Alembic and treat each migration as immutable history.
- One transaction per logical unit of work. Long-running transactions hurt MVCC bloat (Postgres) and lock duration (SQL Server pre-RCSI).
- Index the foreign keys. Most engines don’t auto-index FK columns;
DELETEof a parent row scans children otherwise. (Oracle is famous for the “unindexed FK = full-table-lock during delete” trap.) - Linter / formatter:
sqlfluff(multi-dialect, configurable rules),pgFormatter(Postgres-leaning). Editor support via DataGrip / DBeaver / VS Code SQLTools. - Style guides: Mozilla’s SQL style, GitLab’s data team SQL style guide, Simon Holywell’s “SQL Style Guide” (https://www.sqlstyle.guide/).
Ecosystem
- Migration tools: Flyway, Liquibase, Alembic (SQLAlchemy), Atlas, dbmate, sqitch, Skeema (MySQL), DACPAC/sqlpackage (SQL Server), Liquibase Pro.
- Query builders / ORMs: SQLAlchemy (Python), jOOQ (JVM — generates type-safe DSL from schema), Diesel (Rust), Prisma (Node.js), Hibernate / EF Core, Doctrine (PHP), Ecto (Elixir).
- Connection pooling: PgBouncer (Postgres), ProxySQL (MySQL), HikariCP (JVM-side, all dialects).
- Replication / HA: Postgres logical & streaming replication, Patroni; MySQL GTID replication, InnoDB Cluster, Group Replication; SQL Server Always On Availability Groups; Oracle Data Guard / GoldenGate; Litestream / rqlite for SQLite.
- Backup:
pg_basebackup+ WAL archiving (Postgres),mysqldump/xtrabackup(MySQL), VSS-based (SQL Server), RMAN (Oracle),.backup(SQLite). - Observability:
pg_stat_statements, performance schema (MySQL), Query Store (SQL Server 2016+, default-on in 2022), AWR / Statspack (Oracle). - Distributed SQL: CockroachDB, YugabyteDB, TiDB, Spanner, Citus (Postgres extension for sharding), Vitess (MySQL).
- Analytical / OLAP: Snowflake, BigQuery, Redshift, ClickHouse, DuckDB (in-process columnar SQL — increasingly important), Databricks SQL.
- Notable users: every business with persistent state. Postgres at Apple, Instagram, Reddit, Datadog. MySQL at Facebook (huge fork: MySQL+RocksDB), Booking.com, Wikipedia. SQLite in every iOS/Android app, every browser, Skype, Adobe products, most embedded systems with structured storage. SQL Server across Microsoft, Stack Overflow (famously). Oracle in essentially every Fortune 500 ERP.
Gotchas
- NULL is not equal to NULL.
WHERE x = NULLmatches nothing. UseIS NULL/IS DISTINCT FROM. - Three-valued logic.
NOT (x IN (1, NULL))is NULL (not true) when x ∉ {1, NULL}.NOT EXISTSis safer. COUNT(col)skips NULLs;COUNT(*)doesn’t.- String comparison collation differences.
'a' = 'A'depends on collation. SQL Server is case-insensitive by default; Postgres is case-sensitive by default. - Ordering without
ORDER BYis undefined. No engine guarantees insertion order; what worked yesterday breaks afterVACUUM FULLor a plan change. LIMITwithoutORDER BYreturns “any” N rows. Same trap.- Implicit type coercion footguns. MySQL:
WHERE id = '1abc'matchesid = 1because of implicit cast. Postgres errors. Performance:WHERE indexed_col = 'literal'may not use the index if types differ (e.g.,intcolumn vs string literal). SELECT *and column reordering. Adding a column to a table changesSELECT *order — breaksINSERT INTO … SELECT *and positional clients.- Auto-commit by default in most CLIs — wrap in explicit
BEGIN; … COMMIT;or set\set AUTOCOMMIT off(psql). - Postgres folds unquoted identifiers to lowercase, the standard says they should be uppercase. Quoted identifiers preserve case forever. Mixing quoted/unquoted is misery.
- MySQL’s
||is OR, not concat unlesssql_modeincludesPIPES_AS_CONCAT. UseCONCAT(). - MySQL’s “REPEATABLE READ” + write skew — a lost-update can occur if you read-then-write without
FOR UPDATE. - SQL Server’s
MERGErace conditions under concurrent writes — known footgun, see Bertrand’s article. - Oracle’s empty string is NULL.
'' IS NULLis true. Different from every other dialect. - Oracle’s
DATEincludes time of day. No separateTIMEtype.TIMESTAMPis the modern choice. - SQLite’s “type affinity” isn’t real type checking — you can insert a string into an
INTEGERcolumn unless you declareSTRICT(SQLite 3.37+). Quietly stores it. - MERGE is not always atomic in concurrency — even where it works, multiple concurrent MERGEs can both decide “not matched” then both INSERT. Use unique constraint + ON CONFLICT or wrap in SERIALIZABLE.
INSERT … RETURNINGisn’t in MySQL or SQLite (SQLite added in 3.35). Standardize viaLAST_INSERT_ID()(MySQL),last_insert_rowid()(SQLite),OUTPUT(T-SQL),RETURNING INTO :var(PL/SQL).- Foreign keys not enforced in SQLite by default.
PRAGMA foreign_keys = ON;per connection — and the WAL/journal default doesn’t preserve this across reopens. - Time zones.
TIMESTAMPwithout time zone is wall-clock;TIMESTAMP WITH TIME ZONE(Postgres) actually stores UTC and converts on read per session TZ. SQL Server hasdatetimeoffset(with TZ) anddatetime2(without). Never storeDATETIMEand “remember it’s UTC” — you’ll forget. - Floating point money. Use
NUMERIC(p,s)/DECIMAL. NeverFLOAT/DOUBLE. OFFSETfor pagination scales poorly (DB scans + discards offset rows). Use keyset pagination (WHERE id > $last_id ORDER BY id LIMIT n) for large data sets.- N+1 query problem is an ORM/app issue but shows up in DB load. EXPLAIN per query and look at the access pattern.
Citations
- ISO/IEC 9075-1:2023 (SQL standard, Framework) — https://www.iso.org/standard/76583.html
- Wikipedia SQL article (standard parts, version history) — https://en.wikipedia.org/wiki/SQL
- PostgreSQL 18.3 documentation — https://www.postgresql.org/docs/current/
- MySQL 8.4 reference manual — https://dev.mysql.com/doc/refman/8.4/en/
- SQLite documentation — https://www.sqlite.org/docs.html
- SQLite “About” — https://www.sqlite.org/about.html
- SQL Server documentation — https://learn.microsoft.com/en-us/sql/
- What’s new in SQL Server 2022 — https://learn.microsoft.com/en-us/sql/sql-server/what-s-new-in-sql-server-2022
- Oracle Database documentation — https://docs.oracle.com/en/database/oracle/oracle-database/
- Oracle Database Concepts — https://docs.oracle.com/en/database/oracle/oracle-database/23/cncpt/
- Postgres MVCC — https://www.postgresql.org/docs/current/mvcc-intro.html
- MySQL InnoDB transaction isolation — https://dev.mysql.com/doc/refman/8.4/en/innodb-transaction-isolation-levels.html
- SQL Server isolation levels — https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
- Aaron Bertrand on MERGE — https://www.mssqltips.com/sqlservertip/3074/use-caution-with-sql-servers-merge-statement/
- SQL Style Guide (Simon Holywell) — https://www.sqlstyle.guide/
- sqlfluff — https://docs.sqlfluff.com/en/stable/