Haskell — Reference

Source: https://www.haskell.org/documentation/

Haskell

  • Created: 1990 by an academic committee (Hudak, Hughes, Peyton Jones, Wadler, others)
  • Latest stable: Language standard Haskell 2010; primary compiler GHC 9.12.4 (released March 27, 2026)
  • Paradigms: Purely functional, lazy by default, statically typed
  • Typing: Strong static, Hindley-Milner with extensions (rank-N, GADTs, type families, kind polymorphism)
  • Memory: Garbage collected (GHC: copying generational, with parallel and concurrent variants); immutable data, lazy evaluation by default
  • Compilation: Native code via GHC (LLVM or NCG backend); also bytecode via GHCi, JS via GHC JS backend, WASM target
  • Primary domains: Compilers/DSLs, finance (banks, trading), formal verification, blockchain, web (Servant/Yesod), data engineering, research
  • Official docs: https://www.haskell.org/documentation/ and https://www.haskell.org/onlinereport/haskell2010/

At a glance

Haskell is a pure, lazy, statically typed functional language with one of the most expressive type systems in industrial use. Effects are tracked through monads (IO, STM, State); evaluation is non-strict by default, enabling infinite data structures and decoupling control flow from data flow. The de facto compiler is GHC (Glasgow Haskell Compiler), which implements Haskell 2010 plus a large catalog of extensions enabled per-module via LANGUAGE pragmas.

Getting started

Install — Use GHCup (https://www.haskell.org/ghcup/), the official cross-platform toolchain installer for GHC, Cabal, Stack, and HLS (Haskell Language Server). On Unix: curl --proto '=https' --tlsv1.2 -sSf https://get-ghcup.haskell.org | sh. Verify: ghc --version, cabal --version.

Hello world (Main.hs):

module Main where
main :: IO ()
main = putStrLn "Hello, world!"

Compile: ghc Main.hs && ./Main. Or run in REPL: runghc Main.hs.

Project (Cabal)cabal init --interactive scaffolds:

my-app/
  my-app.cabal     # package metadata + targets
  src/Lib.hs
  app/Main.hs
  test/Spec.hs
  CHANGELOG.md

cabal build, cabal run, cabal test. Stack (alternate) pins compiler+deps via Stackage snapshots.

REPLghci is the interactive interpreter. :t expr shows type, :k Type shows kind, :i Name shows info, :l Module loads, :r reloads, :set -XLanguageExtension toggles extensions live.

Basics

Types/literalsInt (machine word), Integer (arbitrary precision), Double, Float, Bool, Char, String (= [Char]), Text (preferred for real strings, from text package), ByteString. Tuples: (1, "x"). Lists: [1,2,3]. Maybe a for optional, Either e a for errors.

Variables/scoping — All values are immutable bindings. let ... in ... for local bindings; where clauses attach locals to a definition. Indentation-sensitive (the layout rule). Top-level type signatures are optional but customary.

Control flowif e then a else b (an expression, both branches required), guards (| cond = expr), case ... of with pattern matching. No statements — everything is an expression.

Functions — Curried by default: add :: Int -> Int -> Int; add x y = x + y is sugar for Int -> (Int -> Int). Lambda: \x -> x + 1. Composition: (f . g) x = f (g x). Application: f $ g x = f (g x) (low-precedence $ saves parens). Sections: (+1), (10-).

StringsString is [Char] (a linked list of chars — slow). Real code uses Data.Text (Text) for Unicode text, Data.ByteString for bytes. OverloadedStrings extension lets literals adapt.

Collections — Lists ([a], lazy, singly linked), tuples (fixed arity), Data.Map.Map k v (balanced tree), Data.Set.Set a, Data.IntMap, Data.HashMap.Strict.HashMap, Data.Vector.Vector (boxed/unboxed arrays), Data.Sequence.Seq (finger tree).

Intermediate

Type system depth — Algebraic data types (data), newtype (zero-cost wrapper), parametric polymorphism, type classes (Haskell innovation: ad-hoc polymorphism with principled dispatch). Common classes: Eq, Ord, Show, Read, Functor, Applicative, Monad, Foldable, Traversable, Semigroup, Monoid, Num. Inferred via Hindley-Milner; explicit annotations needed for ambiguous polymorphism (MonomorphismRestriction).

Modulesmodule Foo.Bar (export1, export2) where. import Data.Map (Map), import qualified Data.Map as M, import Data.Map hiding (filter). Cabal/Stack handle package boundaries.

Error handling — Three layers: Maybe a for absence, Either e a for typed errors, exceptions in IO via Control.Exception (throw, catch, bracket). Pure code can use error :: String -> a (avoid in libs). The MonadError and ExceptT patterns thread errors through monad transformers.

Concurrency primitives — Lightweight threads via forkIO :: IO () -> IO ThreadId (M:N scheduled by GHC RTS, microsecond start cost). MVar a (synchronizing variable), IORef a (mutable cell, no sync), TVar a + STM (Software Transactional Memory: composable, lock-free), Chan a (unbounded queue), async package for Future-style. Control.Concurrent.STM is the killer feature.

I/O — All effects in the IO monad. do notation desugars >>=. readFile, writeFile, getLine, putStrLn. System.IO for handles. Data.Text.IO for Unicode-correct text I/O.

Stdlib highlightsbase (Prelude, Data.List, Data.Maybe, Data.Either, Control.Monad), containers (Map/Set/IntMap/Seq), text, bytestring, vector, transformers (monad transformers), mtl (typeclasses for transformers), async, stm, time.

Advanced

Memory/GC — GHC heap is a copying generational GC (default 2 generations). Allocation is a pointer bump (cheap). Concurrent and parallel GC variants since GHC 8.10. Pinned vs unpinned memory for FFI. Heap profile: +RTS -h -p then hp2ps. Strictness analysis runs at compile time; force evaluation manually with seq, deepseq, bang patterns (!x), or strict data fields (data T = T !Int).

Concurrency deep dive — RTS schedules green threads onto OS-level capabilities (+RTS -N4). STM (atomically $ do ...) provides composable transactions: any TVar read is logged, retry on conflict. retry blocks until any read TVar changes; orElse composes alternatives. Async exceptions propagate across threads via throwTo. bracket for resource safety.

FFIforeign import ccall "math.h sin" c_sin :: CDouble -> CDouble calls C. foreign export ccall exposes Haskell to C. Foreign.Ptr, Foreign.Marshal.Alloc, Foreign.Storable for memory. inline-c and c2hs generate bindings.

Reflection — Limited intentionally: Data.Typeable for runtime type reps, Data.Data for generic traversals, Generic (from GHC.Generics) for compile-time generic programming.

Performance tools+RTS -p time profiling, +RTS -h heap profile, +RTS -s GC stats, criterion / tasty-bench for microbenchmarks, ThreadScope for concurrency visualization, ghc-prof-flamegraph. -ddump-simpl to inspect Core IR. Use INLINE/INLINABLE/SPECIALIZE pragmas to control optimization.

God mode

GADTs{-# LANGUAGE GADTs #-} lets constructors refine the result type:

data Expr a where
  IntE  :: Int  -> Expr Int
  BoolE :: Bool -> Expr Bool
  If    :: Expr Bool -> Expr a -> Expr a -> Expr a

Pattern matching on a GADT introduces type equality constraints into scope.

Type families{-# LANGUAGE TypeFamilies #-} adds type-level functions: open type families (extensible), closed type families (pattern-matched), associated types in classes. Cornerstone of vector, singletons, servant.

Kind polymorphism + DataKinds{-# LANGUAGE DataKinds, KindSignatures, PolyKinds #-} promotes data constructors to types and types to kinds: data Nat = Z | S Nat becomes a kind. Combined with TypeFamilies you get type-level computation; combined with singletons you get a workable dependent-types simulation.

Template Haskell{-# LANGUAGE TemplateHaskell #-} is compile-time Haskell macros: splice expressions $(genCode), quote with [| expr |]. Used by lens, aeson (deriveJSON), singletons, persistent. Quasiquoters: [sql| SELECT * FROM users |].

unsafePerformIO — Escapes the IO monad. Legitimate uses: top-level caches, FFI bridge for pure C functions, System.IO.Unsafe. Misuse breaks referential transparency.

RULES pragma{-# RULES "map/map" forall f g xs. map f (map g xs) = map (f . g) xs #-} lets libraries declare semantic-preserving rewrites; GHC applies during simplification. Foundation of stream fusion (vector, text).

Core inspectionghc -O2 -ddump-simpl -dsuppress-all Foo.hs dumps the Core IR (typed lambda calculus + ADTs). Reading Core is the standard way to understand whether your code fuses, allocates, or specializes. STG and Cmm dumps go even lower.

STM internals — Each atomically block builds a transaction log of TVar reads/writes; on commit the log is validated under a global lock (very brief); conflict triggers retry. retry parks the thread on the read set until any TVar changes. See https://hackage.haskell.org/package/stm.

Runtime profilingcabal build --enable-profiling, run with +RTS -p -h -hc -L80 -RTS. eventlog2html and ThreadScope consume the eventlog.

GHC plugins{-# OPTIONS_GHC -fplugin=My.Plugin #-} injects compile-time Core-to-Core or typechecker plugins; foundations include ghc-typelits-natnormalise.

Idioms & style

  • Naming: lowerCamelCase for values, UpperCamelCase for types/constructors/modules, ' suffix for strict variants (foldl').
  • Formatter: fourmolu (configurable) or ormolu (opinionated). stylish-haskell for import sorting.
  • Linter: hlint (suggests refactors, e.g. “use map” instead of explicit recursion).
  • Idiomatic: prefer total functions; encode invariants in types (NonEmpty, Refined, smart constructors); use newtype for unit safety; pure functions on the boundary, IO only at the edges; Text over String; fold instead of recurse.
  • Expert review focus: laziness leaks (build up giant unevaluated thunks — use bang patterns, foldl' not foldl, strict data fields); partial functions (head, !!, read, fromJust); orphan instances; missing type signatures on top-level definitions; over-use of do for non-monadic flow.

Ecosystem

  • Web: Servant (type-safe REST APIs), Yesod (full-stack), Scotty (Sinatra-like), IHP (rapid dev), WAI/Warp (HTTP server).
  • Data/DB: persistent + esqueleto, hasql, postgresql-simple, beam, opaleye.
  • Concurrency/streams: async, stm, conduit, pipes, streamly.
  • Parsing: megaparsec, parsec, attoparsec, alex/happy.
  • JSON: aeson; YAML: yaml; binary: cereal/binary/store.
  • Testing: hspec, tasty, QuickCheck (property-based — invented here), HUnit, hedgehog.
  • Lenses/optics: lens (Edward Kmett), optics (cleaner API).
  • Build: Cabal (official), Stack (Stackage curated snapshots), Nix.
  • Docs: Haddock generates HTML from doc comments; published on Hackage and Stackage.
  • Notable users: Standard Chartered, Barclays, Facebook (Sigma anti-spam, Haxl), GitHub (Semantic), Cardano blockchain (IOHK), Mercury Bank, Awake Security, Galois, Tweag, Channable.

Gotchas

  • Lazy by defaultfoldl (+) 0 [1..1e8] builds an O(n) thunk and stack-overflows. Use foldl' from Data.List.
  • String == [Char] — slow; switch to Text or ByteString for any real workload.
  • Partial functions in Preludehead, tail, init, last, (!!), read, fromJust all crash on bad input. Use safer alternatives (Data.Maybe.listToMaybe, readMaybe).
  • Number defaultingprint (1 + 2) requires defaulting; ambiguous monomorphic uses can fail to compile or silently default to Integer.
  • MonomorphismRestriction — top-level let f = ... without signature gets monomorphized, surprising newcomers.
  • Orphan instances — defining a typeclass instance outside both the class module and the type module breaks consistency; GHC warns.
  • Record fields are global functionsdata Person = P { name :: String } puts name in the module namespace; conflicts are common. DuplicateRecordFields, OverloadedRecordDot (GHC 9.2+) help.
  • Async exceptions can interrupt anywhere; use bracket, mask, safe-exceptions package.
  • Build times — heavy use of TH, type families, and large generics can multiply compile time substantially.
  • Cabal hell vs Stackage — unsnapshotted Cabal solves can produce non-reproducible builds; freeze with cabal freeze or use Stack.
  • Floating-point Eq0.1 + 0.2 /= 0.3; standard IEEE caveats.

Citations