C — Reference

Source: https://en.cppreference.com/w/c

C

  • Created: 1972 by Dennis Ritchie at Bell Labs (alongside the rewrite of Unix in C)
  • Latest stable: C23 = ISO/IEC 9899:2024 (published 2024-10-31). Predecessors: C17 (2018), C11 (2011), C99, C89/C90 (ANSI). C2y in early drafting.
  • Paradigms: procedural, imperative, structured; weakly typed by modern standards; supports limited generic programming via _Generic
  • Typing: static, weak (implicit conversions, casts subvert), nominal (struct types are nominal even if structurally identical)
  • Memory: manual via malloc/calloc/realloc/free, plus stack frames and static storage. No GC. Aliasing rules + lifetimes are programmer-managed.
  • Compilation: compiled to native code via GCC, Clang/LLVM, MSVC, ICC/ICX, TCC, Cproc, chibicc, Cosmopolitan; some interpreters (Cling, Ch). Translation units → object files → linker → executable / library.
  • Primary domains: operating systems (Linux, BSDs, Windows kernel), embedded firmware, microcontrollers, drivers, language runtimes (CPython, Lua, Ruby, V8 supporting code), databases (SQLite, PostgreSQL, Redis), networking (nginx, curl), cryptography, scientific kernels (BLAS, FFTW), DSP
  • Official docs: https://en.cppreference.com/w/c (de facto reference; ISO standard is paywalled)

At a glance

  • Standard body: ISO/IEC JTC1/SC22/WG14. Drafts public; the final standard is paywalled but late drafts (e.g., N3220 for C23) are essentially identical and freely available.
  • Compilers: GCC (https://gcc.gnu.org), Clang/LLVM (https://clang.llvm.org), MSVC (Microsoft Visual C++), Intel ICX (LLVM-based), TCC (Tiny C Compiler), chibicc (educational), Cproc, Cosmopolitan libc / Actually Portable Executable.
  • Standard libraries: glibc (Linux), musl (Alpine, embedded), Bionic (Android), Newlib (embedded), uClibc-ng, Microsoft UCRT, Apple Libc.
  • GCC 15+ defaults to -std=gnu23 (C23 plus GNU extensions); earlier releases default to gnu17. Strict ISO C23 is -std=c23 (spelled -std=c2x on older compilers).

Getting started

Install:

  • Linux: sudo apt install build-essential (GCC) or sudo apt install clang.
  • macOS: xcode-select --install (Apple Clang) or brew install llvm gcc.
  • Windows: MSVC via Visual Studio Build Tools, or MSYS2 + MinGW-w64 GCC, or LLVM Clang for Windows, or WSL.
  • Cross-compile / embedded: arm-none-eabi-gcc, riscv64-unknown-elf-gcc, vendor toolchains (XC32, AVR-GCC, IAR, Keil/ARMCC).

Hello world:

#include <stdio.h>
 
int main(void) {
    printf("Hello, world!\n");
    return 0;
}

Compile + run: cc -std=c23 -Wall -Wextra -O2 hello.c -o hello && ./hello.

Project layout (typical):

myproject/
  Makefile             # or CMakeLists.txt / meson.build
  include/
    mylib.h
  src/
    main.c
    mylib.c
  tests/
    test_mylib.c
  build/               # out-of-source

Build tools:

  • Make / GNU Make — the lingua franca, brittle but everywhere.
  • CMake — de facto for cross-platform; generator for Make/Ninja/MSBuild/Xcode.
  • Meson + Ninja — modern, fast, friendlier syntax. Used by GTK, GStreamer, systemd.
  • Bazel, Buck2 — at scale.
  • autotools (./configure && make && make install) — legacy GNU stack.
  • Single-header bundlers (e.g., for SQLite-style amalgamations).

Package management:

  • C lacks an official package manager. vcpkg (Microsoft, cross-platform), Conan (multi-language), pkg-config for finding installed libs, distro package managers (apt, yum, pacman, brew).
  • Header-only and single-file libs (stb_*, raylib, etc.) sidestep the issue.

REPL / interpreter: cling (LLVM-based), tcc -run hello.c, online: https://godbolt.org (Compiler Explorer — essential).

Basics

Primitive types & literals:

  • Integers: char (signedness implementation-defined), signed char, unsigned char, short, int, long, long long, plus unsigned variants.
  • Fixed-width (since C99, <stdint.h>): int8_t, uint16_t, int32_t, uint64_t; pointer-sized intptr_t / uintptr_t. size_t and ptrdiff_t come from <stddef.h> and are not fixed-width.
  • Floating: float, double, long double. C23 adds _Float16, _Float32, _Float64, _Float128, _Decimal*.
  • C23: bool, true, false, nullptr, nullptr_t, _BitInt(N) (bit-precise integers).
  • Literals: 0x1A, 0b1010 (C23 binary), 0777 (octal), 1'000'000 (C23 digit separator), 1.5e3, 1.5f, 123ULL, 'a', u8'a', U'é'.

Variables / scope: block-scoped via { }. static gives internal linkage at file scope and static storage duration (the value persists across calls) for locals. extern for cross-TU references. const (read-only), volatile (don’t optimize accesses), restrict (no aliasing, C99). C23 allows constexpr for true compile-time constants.

Control flow: if/else, for, while, do/while, switch/case/default (with fallthrough by default — use [[fallthrough]]; attribute in C23 to silence warnings), goto, break, continue. C23 allows labels before declarations.

Functions:

int add(int a, int b) { return a + b; }
int sum(int n, ...);                       // varargs via <stdarg.h>
typedef int (*comparator)(const void*, const void*);
  • No default arguments. No overloading (use _Generic macros). Functions are first-class via pointers but no closures.
  • C23: auto for type inference of locals; unnamed parameters allowed; [[nodiscard]], [[deprecated]], [[maybe_unused]] attributes.
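
The variadic prototype and the function-pointer typedef above can be sketched as follows. The names sum_ints, binary_op, mul, and fold are hypothetical and chosen only for this illustration; the variadic function trusts the caller to pass exactly `count` ints, since C cannot check that.

```c
#include <stdarg.h>

/* Sum `count` int arguments; the caller must pass exactly `count` ints. */
int sum_ints(int count, ...) {
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Function-pointer type for binary integer operations (hypothetical name). */
typedef int (*binary_op)(int, int);

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

/* Fold `op` over the array, starting from `init`. */
int fold(const int *values, int n, int init, binary_op op) {
    int acc = init;
    for (int i = 0; i < n; i++) {
        acc = op(acc, values[i]);
    }
    return acc;
}
```

Calling fold(v, 4, 0, add) sums a four-element array, while fold(v, 4, 1, mul) multiplies it; the function pointer stands in for what a closure or overload would do in other languages.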

Strings: NUL-terminated char* (or char[]). No first-class string type. Encoding prefixes: "text" (char), u8"text" (UTF-8, char8_t in C23), u"text" (char16_t), U"text" (char32_t), L"text" (wchar_t). Use <string.h>: strlen, strcpy, memcpy, strncpy, memcmp; snprintf lives in <stdio.h>. strdup and memset_explicit are now standard in C23.
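
Because strings carry no length, bounded copies and comparisons have to track sizes by hand. A minimal sketch, assuming hypothetical helper names copy_string and ends_with: snprintf always NUL-terminates (for a non-zero size) and returns the length it wanted to write, which makes truncation detectable.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Copy src into dst (capacity dst_size bytes), always NUL-terminating.
 * Returns true if the whole string fit, false if it was truncated. */
bool copy_string(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return false;
    }
    int needed = snprintf(dst, dst_size, "%s", src);
    return needed >= 0 && (size_t)needed < dst_size;
}

/* True if s ends with suffix; lengths are computed explicitly with strlen. */
bool ends_with(const char *s, const char *suffix) {
    size_t s_len = strlen(s);
    size_t suffix_len = strlen(suffix);
    if (suffix_len > s_len) {
        return false;
    }
    return memcmp(s + (s_len - suffix_len), suffix, suffix_len) == 0;
}
```

With a 4-byte destination, copy_string stores "hel" plus the terminator for the input "hello" and reports the truncation instead of overflowing.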

Built-in collections: none. Arrays (with size known at compile time, or VLA — variable-length array — optional in C11+). No native list / map / set. Use libraries (klib, stb, GLib, uthash) or hand-rolled.

Intermediate

Type system depth:

  • Pointers: T* for any T. void* is the universal pointer (implicit conversions to/from any object pointer).

  • Arrays decay to pointers in expressions — T arr[N] becomes T* when passed to functions.

  • Function pointers: int (*fp)(int, int) = &add;

  • Structs / unions / enums. Anonymous structs/unions allowed. Bit fields: unsigned x : 4;.

  • typedef for type aliases; C23 adds typeof / typeof_unqual operators.

  • Generic selection (_Generic) — compile-time type dispatch:

    #define type_name(x) _Generic((x), \
        int: "int", float: "float", double: "double", default: "other")

Modules / packages: none. The unit of compilation is the translation unit (one .c file plus everything it #includes). Headers (#include) are textual substitution. The linker resolves symbols across object files. C23 still has no module system (C++20 modules are not in C). Use header guards (#ifndef MOD_H) or #pragma once (compiler extension, near-universal).
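
In place of modules, the usual split is a guarded header that declares an interface and a .c file that defines it. The sketch below shows both halves in one listing for brevity; counter.h, counter.c, and every counter_* name are hypothetical. In a real project the second half would live in its own file and start with #include "counter.h".

```c
/* ---- counter.h (hypothetical header) ---- */
#ifndef COUNTER_H
#define COUNTER_H

/* Incomplete type: callers can hold a pointer but cannot see the fields. */
typedef struct counter counter_t;

counter_t *counter_create(int start);
int counter_next(counter_t *c);
void counter_destroy(counter_t *c);

#endif /* COUNTER_H */

/* ---- counter.c (hypothetical source file) ---- */
#include <stdlib.h>

struct counter {
    int value;
};

/* Returns NULL if allocation fails. */
counter_t *counter_create(int start) {
    counter_t *c = malloc(sizeof *c);
    if (c != NULL) {
        c->value = start;
    }
    return c;
}

/* Returns the current value, then advances it. */
int counter_next(counter_t *c) {
    return c->value++;
}

void counter_destroy(counter_t *c) {
    free(c);
}
```

Because the struct body is visible only in the source file, its layout can change without recompiling callers, which is about as close as C gets to an encapsulated module.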

Errors:

  • Return codes: 0 = success, nonzero = error. Convention: int return, errno set on POSIX I/O.
  • <errno.h> global errno, strerror, perror.
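  • setjmp/longjmp (<setjmp.h>) for non-local control flow; used sparingly (e.g., Lua’s error handling, libpng’s error path).
  • assert (<assert.h>) for debug-time checks (compiled out when NDEBUG is defined). C11 added _Static_assert, with static_assert as a macro in <assert.h>; C23 makes static_assert a keyword.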
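
The return-code-plus-errno convention can be illustrated with strtol, which reports range errors through errno. This is a sketch with a hypothetical wrapper name, parse_long, and a 0 / -1 return convention chosen for the example.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse a base-10 long from text.
 * Returns 0 on success and stores the result in *out;
 * returns -1 on empty input, trailing junk, or out-of-range values. */
int parse_long(const char *text, long *out) {
    char *end = NULL;

    errno = 0;                       /* strtol does not clear errno itself */
    long value = strtol(text, &end, 10);

    if (end == text) {
        return -1;                   /* no digits consumed */
    }
    if (*end != '\0') {
        return -1;                   /* trailing characters */
    }
    if (errno == ERANGE) {
        return -1;                   /* overflow or underflow */
    }
    *out = value;
    return 0;
}
```

Resetting errno before the call matters: library functions only set it on failure, so a stale value from an earlier call could otherwise be misread as a new error.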

Concurrency primitives (C11+):

  • <threads.h>: thrd_create, thrd_join, mtx_t, cnd_t, tss_t. MSVC has historically lacked these; many projects still use POSIX pthreads or Win32 directly.
  • <stdatomic.h>: _Atomic int x; atomic_fetch_add(&x, 1); atomic_thread_fence(memory_order_acquire); — full C11 memory model.
  • POSIX: pthread_create, pthread_mutex_t, pthread_cond_t, sem_t.
  • Higher level: OpenMP (#pragma omp parallel for), Intel TBB (C++ only; needs a C wrapper to be callable from C), libuv, libdispatch (GCD).

File I/O / networking:

  • Buffered (<stdio.h>): fopen, fread, fwrite, fprintf.
  • Unbuffered POSIX: open (<fcntl.h>); read, write, close, lseek (<unistd.h>).
  • Memory mapping: mmap / munmap (POSIX), CreateFileMapping (Win32).
  • Networking: BSD sockets (<sys/socket.h>, socket, bind, listen, accept, connect, send, recv). Cross-platform: libuv, libevent.
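
A short sketch of the buffered <stdio.h> path: read a stream in fixed-size chunks and distinguish end-of-file from a read error with ferror. The function names count_bytes and file_size and the 4096-byte chunk size are assumptions made for this example.

```c
#include <stdio.h>

/* Count the bytes remaining in a stream by reading fixed-size chunks.
 * Returns the byte count, or -1 if a read error occurred. */
long count_bytes(FILE *stream) {
    unsigned char chunk[4096];
    long total = 0;
    size_t got;

    while ((got = fread(chunk, 1, sizeof chunk, stream)) > 0) {
        total += (long)got;
    }
    return ferror(stream) ? -1 : total;
}

/* Open a file by path, count its bytes, and close it again.
 * Returns -1 if the file cannot be opened or read. */
long file_size(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    long n = count_bytes(f);
    fclose(f);
    return n;
}
```

fread returning fewer items than requested is not an error by itself; only ferror (or feof) tells the two cases apart.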

Stdlib highlights: <stdio.h>, <stdlib.h> (malloc/free/qsort/bsearch), <string.h>, <math.h>, <time.h>, <ctype.h>, <stdint.h>, <inttypes.h> (PRIu64 macros), <limits.h>, <float.h>, <errno.h>, <assert.h>, <stdarg.h>, <stdatomic.h>, <threads.h>, <setjmp.h>. C23 adds <stdbit.h> (bit utilities), <stdckdint.h> (checked integer arithmetic).
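
Two of the headers listed above in use, as a hedged sketch: qsort and bsearch from <stdlib.h> share one comparator, and the PRIu64 macro from <inttypes.h> supplies the correct printf length modifier for uint64_t. The function names are hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Comparator for qsort/bsearch: avoids `a - b`, which can overflow. */
static int compare_ints(const void *lhs, const void *rhs) {
    int a = *(const int *)lhs;
    int b = *(const int *)rhs;
    return (a > b) - (a < b);
}

/* Sort the array in place, then report whether `key` is present. */
int sorted_contains(int *values, size_t n, int key) {
    qsort(values, n, sizeof values[0], compare_ints);
    return bsearch(&key, values, n, sizeof values[0], compare_ints) != NULL;
}

/* Format a uint64_t portably; PRIu64 expands to the right length modifier.
 * Returns the number of characters the full value needs, as snprintf does. */
int format_u64(char *buf, size_t size, uint64_t value) {
    return snprintf(buf, size, "%" PRIu64, value);
}
```

bsearch requires the array to be sorted with the same ordering the comparator defines, which is why the sketch sorts first.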

Advanced

Memory model:

  • C11 introduced a formal memory model with memory_order_* (relaxed, consume, acquire, release, acq_rel, seq_cst).
  • Object lifetime is governed by storage duration: automatic, static, allocated, thread-local (thread_local in C23 / _Thread_local in C11).
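  • Strict aliasing (TBAA): you may not access an object through an lvalue of an incompatible type; the exception is the character types (char, signed char, unsigned char), which may alias anything. Violating this is UB, and optimizers rely on the rule, so offending code can be miscompiled at -O2 and above.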
  • restrict asserts pointers don’t alias the same object — enables better codegen; lying = UB.
  • Provenance (TS 6010, post-C23 work) — formalizing pointer provenance for optimizers.

Concurrency / parallelism deep dive:

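  • C11 atomics map to LDAR/STLR (load-acquire / store-release) on AArch64 and to LOCK-prefixed read-modify-write instructions on x86, where ordinary loads and stores already provide acquire/release ordering.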
  • Memory fences via atomic_thread_fence. Lock-free queues via CAS loops.
  • <stdatomic.h> atomic_flag is the only guaranteed-lock-free type; use ATOMIC_*_LOCK_FREE macros to check.
  • Cache-line awareness (alignas(64)), false-sharing avoidance, NUMA-aware numactl / mbind.
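
The CAS-loop and atomic_flag points above can be sketched with <stdatomic.h> alone. atomic_store_max and the spinlock type are hypothetical names for this example; the memory orders are one reasonable choice, not the only valid one.

```c
#include <stdatomic.h>

/* Atomically raise *target to at least `candidate` using a CAS loop.
 * Returns the value observed before any update. */
int atomic_store_max(_Atomic int *target, int candidate) {
    int observed = atomic_load_explicit(target, memory_order_relaxed);

    while (observed < candidate &&
           !atomic_compare_exchange_weak_explicit(
               target, &observed, candidate,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* On failure, `observed` now holds the current value; retry. */
    }
    return observed;
}

/* Minimal spinlock built on atomic_flag, the one type guaranteed lock-free. */
typedef struct {
    atomic_flag held;
} spinlock;

void spin_lock(spinlock *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait until the holder clears the flag */
    }
}

void spin_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The weak compare-exchange may fail spuriously, which is acceptable here because the loop retries anyway; a spinlock like this is only sensible for very short critical sections.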

FFI / interop:

  • C is the FFI lingua franca — every other language can call C ABI functions. The “C ABI” varies by platform (System V AMD64, Microsoft x64, AAPCS64).
  • Calling other languages: dlopen/dlsym (POSIX), LoadLibrary/GetProcAddress (Win32). LuaJIT, Python ctypes, Rust extern "C", Go cgo all link against C.

Reflection / introspection: none at runtime. Source-level: __func__, __FILE__, __LINE__, __DATE__, __TIME__. C23 adds __VA_OPT__ for cleaner variadic macros.
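
The predefined identifiers are mostly used to build logging and diagnostic macros. A small sketch, assuming a hypothetical LOG_AT macro and helper functions invented for the example:

```c
#include <limits.h>
#include <stdio.h>

/* Log macro that records where it was invoked. The do/while(0) wrapper makes
 * it behave like a single statement. LOG_AT is a hypothetical name. */
#define LOG_AT(msg) \
    do { \
        fprintf(stderr, "%s:%d (%s): %s\n", \
                __FILE__, __LINE__, __func__, (msg)); \
    } while (0)

/* __func__ behaves like a static const char array naming the function. */
const char *current_function_name(void) {
    return __func__;
}

/* Returns 0 and stores a / b in *out, or -1 if the division is not safe. */
int checked_divide(int a, int b, int *out) {
    if (b == 0 || (a == INT_MIN && b == -1)) {
        LOG_AT("division rejected");
        return -1;
    }
    *out = a / b;
    return 0;
}
```

Since the macro expands at the call site, __LINE__ and __func__ describe the caller, which a helper function could not report on its own.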

Performance tuning:

  • Profilers: perf (Linux), vtune (Intel), Instruments (macOS), callgrind / cachegrind (Valgrind), gprof, samply, flamegraph.pl.
  • Sanitizers (Clang/GCC): -fsanitize=address (ASan), -fsanitize=undefined (UBSan), -fsanitize=thread (TSan), -fsanitize=memory (MSan, Clang-only). Run all your tests under them.
  • Static analysis: clang-tidy, clang --analyze, Coverity, PVS-Studio, Cppcheck, Frama-C, Infer (Meta).
  • Hot-path tools: perf record/report, objdump -d, llvm-mca (machine-code analyzer), Compiler Explorer (https://godbolt.org).
  • PGO: GCC -fprofile-generate / -fprofile-use. LTO: -flto.
  • Branch hints: __builtin_expect (GCC/Clang), [[likely]]/[[unlikely]] not yet in C as of C23 (planned).

God mode

Undefined behavior catalog: signed integer overflow, dereferencing null, oob array access, use-after-free, data races, strict-aliasing violations, modifying string literals, reading uninitialized objects, sequence-point violations (i = i++), shifting by ≥ width, divide by zero. UB is not “implementation-defined” — the compiler can assume it never happens and delete code based on that assumption. https://blog.regehr.org/archives/213
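
Since the compiler may assume UB never happens, checks have to run before the risky operation, not after it. A sketch of two such pre-checks, with hypothetical names safe_add and safe_shl; C23's <stdckdint.h> covers similar ground, but this version sticks to older C.

```c
#include <limits.h>
#include <stdbool.h>

/* Add two ints without ever executing an overflowing addition.
 * Returns false (leaving *result untouched) if a + b would overflow. */
bool safe_add(int a, int b, int *result) {
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *result = a + b;
    return true;
}

/* Shift left only when the count is in range for the operand's width. */
bool safe_shl(unsigned value, unsigned count, unsigned *result) {
    if (count >= sizeof value * CHAR_BIT) {
        return false;                 /* shifting by >= width is UB */
    }
    *result = value << count;
    return true;
}
```

A check written as `if (a + b < a)` would itself overflow, and an optimizer is entitled to delete it; comparing against INT_MAX - b avoids performing the overflowing operation at all.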

Strict aliasing:

float f = 1.0f;
unsigned u = *(unsigned*)&f;  // UB — strict aliasing violation
unsigned u2; memcpy(&u2, &f, sizeof u2);  // OK

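Use memcpy or a union (C, unlike C++, permits reading a union member other than the one last written, so union type-punning is allowed). C23 also allows storage-class specifiers such as static and constexpr on compound literals.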
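
Both sanctioned approaches side by side, as a sketch that assumes float is a 32-bit IEEE 754 type (the size half of that assumption is checked at compile time). The function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Reinterpret a float's bytes as a 32-bit integer via memcpy. */
uint32_t float_bits_memcpy(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Same result through a union: write one member, read the other. */
uint32_t float_bits_union(float f) {
    union {
        float f;
        uint32_t u;
    } pun;
    pun.f = f;
    return pun.u;
}
```

Compilers typically turn the fixed-size memcpy into a single register move, so the well-defined version costs nothing over the pointer cast.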

restrict:

void axpy(size_t n, float a, const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; i++) y[i] += a * x[i];
}

Tells the compiler x and y don’t alias — vectorizes properly.

Inline assembly:

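#include <stdint.h>

uint64_t rdtsc(void) {
    uint32_t lo, hi;   /* rdtsc returns the counter split across EDX:EAX */
    __asm__ __volatile__ ("rdtsc" : "=a"(lo), "=d"(hi));
    /* Widen before shifting: unsigned long is only 32 bits on some ABIs. */
    return ((uint64_t)hi << 32) | lo;
}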

GCC/Clang extended asm uses constraint letters (r, m, =a, +r). MSVC has __asm (x86 only) and __readmsr/__cpuid intrinsics.

Sanitizers in detail:

  • ASan: red-zones around allocations, shadow memory; ~2x slowdown.
  • UBSan: instruments operations to trap on signed overflow, oob, null deref, etc.
  • TSan: happens-before tracking for data races; ~5-15x slowdown.
  • MSan: uninitialized-read detection; needs MSan-instrumented libc — usually run on Clang + libc++ on Linux.

Section attributes & linker control:

  • __attribute__((section(".init_array"))): place a function pointer in .init_array so that the function runs before main.
  • __attribute__((constructor)) / ((destructor)) — same idea, friendlier syntax.
  • __attribute__((visibility("hidden"))) — control symbol visibility for shared libs (-fvisibility=hidden).
  • __attribute__((weak)) — weak symbol; overridable by strong definitions.
  • Linker scripts (.lds) for embedded — placement of sections, vector tables.
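
A sketch of the constructor and weak attributes from the list above. These are GCC/Clang extensions, and the behavior described assumes an ELF-style toolchain; on_startup, startup_has_run, and plugin_version are hypothetical names.

```c
/* GCC/Clang extension: a function marked `constructor` runs before main. */
static int startup_ran = 0;

__attribute__((constructor))
static void on_startup(void) {
    startup_ran = 1;
}

/* Weak default: another object file may supply a strong `plugin_version`
 * that replaces this definition at link time. */
__attribute__((weak))
int plugin_version(void) {
    return 0;
}

/* Reports whether the constructor has already executed. */
int startup_has_run(void) {
    return startup_ran;
}
```

By the time main starts, startup_has_run() returns 1. The order in which constructors from different translation units run is not something to rely on without explicit priorities.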

_Generic for type-overloaded macros:

#define abs(x) _Generic((x), \
    int: abs, long: labs, long long: llabs, \
    float: fabsf, double: fabs, long double: fabsl)(x)

Powers <tgmath.h> type-generic math.

Computed gotos (GCC extension): dispatch tables for interpreters and bytecode VMs.

static void *targets[] = { &&op_add, &&op_sub, ... };
goto *targets[opcode];
op_add: ...; goto *targets[*ip++];

Often cited as roughly 10-30% faster than switch for VM main loops; CPython uses computed gotos in its bytecode loop when the compiler supports them.
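
A complete, if tiny, version of the dispatch pattern sketched above. It relies on the GCC/Clang "labels as values" extension, and the opcode set and the opcode-then-operand encoding are invented for this example. The interpreter trusts its input: a real VM would validate opcodes before indexing the table.

```c
/* Tiny accumulator VM using labels as values (&&label, goto *ptr).
 * Opcodes and encoding are hypothetical, chosen only for this sketch. */
enum { OP_ADD, OP_SUB, OP_HALT };

int run(const int *code) {
    /* Table order must match the enum above. */
    static void *targets[] = { &&op_add, &&op_sub, &&op_halt };
    const int *ip = code;
    int acc = 0;

    goto *targets[*ip++];            /* dispatch the first instruction */

op_add:
    acc += *ip++;                    /* operand follows the opcode */
    goto *targets[*ip++];
op_sub:
    acc -= *ip++;
    goto *targets[*ip++];
op_halt:
    return acc;
}
```

Each handler ends with its own indirect jump instead of returning to a shared switch, which gives the branch predictor one prediction site per opcode; that is the usual explanation for the speedup.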

alloca — stack allocation; cheap but UB on overflow. C99 VLAs (int arr[n]) are similar; optional in C11+.

Compiler-specific intrinsics:

  • __builtin_expect, __builtin_unreachable, __builtin_clz/ctz/popcount (now standardized as <stdbit.h> in C23).
  • SIMD intrinsics: <immintrin.h> (x86 SSE/AVX), <arm_neon.h> (ARM NEON).
  • Atomic builtins: __atomic_*, __sync_* (legacy).

C23 highlights to know:

  • [[attribute]] syntax (was __attribute__ GCC-only).
  • auto for type inference, typeof operator standardized.
  • nullptr and nullptr_t.
  • _BitInt(N) for arbitrary-width integers.
  • constexpr objects.
  • Improved variadic macros: __VA_OPT__.
  • Removed: trigraphs, K&R function declarations, non-two’s-complement integers (signed integer representation is now defined).
  • Added: <stdbit.h>, <stdckdint.h>.

Idioms & style

  • Naming: snake_case is dominant in K&R-influenced code (Linux kernel, glibc). PascalCase and camelCase appear in Win32 / Apple APIs. Macros are SCREAMING_SNAKE_CASE.
  • Formatters: clang-format (the standard), indent (GNU). Common style files: Linux kernel’s .clang-format, LLVM, Google, WebKit, Mozilla.
  • Linters / static analyzers: clang-tidy, cppcheck, Coverity, PVS-Studio, Infer, CodeQL, Frama-C (deductive verification), CBMC (bounded model checker), CompCert (formally verified compiler).
  • Style guides: Linux Kernel coding style (https://www.kernel.org/doc/html/latest/process/coding-style.html), GNU Coding Standards, Google C++ Style (C++-focused, but often borrowed for C), MISRA C (safety-critical, automotive), CERT C (security).
  • Idiomatic patterns:
    • Header has extern declarations and types; .c has definitions.

    • Opaque types: forward-declare a struct in the header, define in the .c. The user only sees mylib_handle_t*.

    • “Object” pattern: struct foo + foo_init/foo_free + functions taking struct foo*.

    • Goto-cleanup for resource unwinding:

      int do_thing(void) {
          int rc = 0;
          FILE *f = fopen("x", "r");
          if (!f) { rc = -1; goto out; }
          char *buf = malloc(N);
          if (!buf) { rc = -2; goto close_file; }
          // ... work ...
          free(buf);
      close_file:
          fclose(f);
      out:
          return rc;
      }
    • Validate / clamp inputs at API boundaries; trust nothing.

    • Always check malloc’s return; always free exactly once.

    • Keep functions short; prefer composition over deep nesting.

  • What reviewers look for: missing bounds checks, signed-vs-unsigned comparisons, format-string vulns (printf(user_input) can leak memory or lead to code execution; use printf("%s", user_input)), strcpy/sprintf/gets (use snprintf/fgets; strncpy only with explicit NUL-termination), uninitialized reads, double-free, leak on error paths, missing const-correctness, integer overflow before allocation, using int where size_t or long is the right type (int is only guaranteed to hold 16 bits).
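
Several of the points above (the init/free "object" pattern, checking the allocator's return value, and guarding against integer overflow before allocation) fit into one small sketch. int_vec and its functions are hypothetical names, and the growth policy (start at 8, then double) is an arbitrary choice.

```c
#include <stdint.h>
#include <stdlib.h>

/* Growable int array following the init / push / free pattern. */
struct int_vec {
    int *data;
    size_t len;
    size_t cap;
};

void int_vec_init(struct int_vec *v) {
    v->data = NULL;
    v->len = 0;
    v->cap = 0;
}

/* Returns 0 on success, -1 on overflow or allocation failure
 * (the vector is left unchanged in that case). */
int int_vec_push(struct int_vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 8;
        /* Reject capacities whose byte count would wrap around size_t. */
        if (new_cap < v->cap || new_cap > SIZE_MAX / sizeof *v->data) {
            return -1;
        }
        int *grown = realloc(v->data, new_cap * sizeof *v->data);
        if (grown == NULL) {
            return -1;               /* old block is still valid */
        }
        v->data = grown;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 0;
}

void int_vec_free(struct int_vec *v) {
    free(v->data);
    int_vec_init(v);                 /* reset so a second free is harmless */
}
```

Assigning realloc's result to a temporary matters: writing it straight into v->data would leak the old block whenever realloc fails.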

Ecosystem

Domain                  Tools / projects
Operating systems       Linux kernel, BSD kernels, Windows (kernel + much of Win32), seL4 (formally verified), Zephyr, FreeRTOS
Embedded                ESP-IDF, STM32 HAL, Arduino core (C++ but C-callable), TinyUSB, lvgl, FreeRTOS
Databases               SQLite, PostgreSQL, MySQL/MariaDB, Redis, valkey, LMDB
Web servers             nginx, lighttpd, Apache httpd
Graphics / Multimedia   FFmpeg, GStreamer, libpng, libjpeg-turbo, mesa, Vulkan loader, SDL2/SDL3, raylib
Crypto                  OpenSSL, BoringSSL, libsodium, mbedTLS
Networking              curl, libcurl, libuv, libevent
Scientific              BLAS/LAPACK, FFTW, GSL, HDF5, NetCDF
Language runtimes       CPython, Lua, Ruby (MRI), Perl, PHP (Zend)
Testing                 Unity, Check, CMocka, CUnit, Criterion, Greatest, μnit
Build                   Make, CMake, Meson, Ninja, autotools, Bazel
Docs                    Doxygen (with C support), Sphinx + Breathe
Notable users           every major OS, every cloud provider’s hot path, every embedded device, every other language’s runtime

Gotchas

  • Integer promotion / implicit conversion: unsigned char a = 200, b = 200; then a + b is promoted to int (200 + 200 = 400, no overflow). But unsigned int a, b overflow is well-defined (modular); signed overflow is UB.
  • Signed vs unsigned mix: for (int i = 0; i < strlen(s); i++) compares int to size_t; the int gets converted to unsigned, so negative i becomes huge.
  • Operator precedence: *p++ is *(p++) not (*p)++. a & b == c is a & (b == c). Parenthesize freely.
  • Array != pointer in sizeof: inside the function it was passed to, sizeof(arr) is sizeof(T*), not the original size.
  • String literals have type char[N] (not const-qualified) in C, yet modifying them is UB. Point to them with const char*.
  • gets removed in C11 — use fgets.
  • scanf("%s", buf) has no length limit, so long input overflows buf. Give a field width one less than the buffer size (%19s for char buf[20]) or use fgets + parse.
  • printf format mismatch: %d with a long is UB; use %ld for long, %zu for size_t, PRIu64 for uint64_t.
  • Off-by-one in NUL-terminator handling — strncpy does NOT guarantee NUL-termination.
  • malloc return — always check for NULL.
  • Double-free and use-after-free — leading source of CVEs. Set pointers to NULL after free.
  • VLAs on stack: int arr[n] for user-controlled n is a stack-overflow vector.
  • Uninitialized stack: int x; if (cond) x = 1; use(x); reads garbage on the else path; UB. Initialize at declaration.
  • memcpy overlap is UB — use memmove.
  • Sequence-point traps: i = i++ + ++i; — UB.
  • Macro hygiene: #define SQ(x) x*x then SQ(1+2) = 1+2*1+2 = 5. Use ((x)*(x)). And never SQ(i++) — double evaluation.
  • Headers without guards included twice → redefinition errors.
  • Endianness: assuming little-endian (x86, ARM-LE) breaks on big-endian. Use htonl/ntohl for network, explicit shift-and-mask otherwise.
  • time_t is signed and may overflow in 2038 on 32-bit; use 64-bit time on modern systems.
  • Newcomers from Java/C#: no GC; you own all memory. No exceptions; check return codes. No method overloading; use _Generic macros. No bounds checks; arrays decay to raw pointers and carry no length.
  • Newcomers from Python: strings need explicit length. Memory must be freed. Compilation step required. Behavior is undefined far more often than you expect — turn on -Wall -Wextra -Werror and run with sanitizers.
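
A few of the gotchas above, condensed into code that can be compiled and poked at. SQ_BAD and the function names are hypothetical and exist only for this sketch.

```c
#include <stddef.h>
#include <string.h>

#define SQ_BAD(x) x*x              /* unparenthesized: precedence leaks in */
#define SQ(x) ((x)*(x))            /* safe precedence, still evaluates x twice */

/* unsigned char operands are promoted to int before the addition,
 * so 200 + 200 yields 400 instead of wrapping at 256. */
int promoted_sum(unsigned char a, unsigned char b) {
    return a + b;
}

/* Comparing int with size_t converts the int to unsigned. The cast spells
 * out what the compiler does implicitly: -1 becomes SIZE_MAX. */
int minus_one_less_than(size_t n) {
    int i = -1;
    return (size_t)i < n;
}

/* strncpy leaves dst unterminated when src fills it; terminate by hand. */
void bounded_copy(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return;
    }
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
}
```

Building this with -Wall -Wextra and running it under -fsanitize=address,undefined is a quick way to see which of these mistakes the tooling catches and which it cannot.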

Citations