Sound Design for Film & Games — Foley, Effects, Implementation, Middleware
Sound design is the craft of constructing the non-musical, non-dialog audio of a film, television series, game, or interactive experience, and increasingly of XR (virtual and augmented reality) titles. Where music is composed and dialog is captured, sound design is invented: a lightsaber is the hum of an old film-projector motor blended with the buzz a television set induced in a microphone cable; a Tyrannosaurus rex is a baby elephant, a tiger, and an alligator; a particle weapon in a video game is the slowed-down hiss of compressed air over a metal pipe. The discipline sits between acoustics, psychoacoustics, music technology, software engineering (for games), and the visual narrative: every sound serves the picture, not the other way round.
Definition and scope
The traditional film-audio breakdown is dialog, music (score and source), and sound — and within “sound,” the subdivisions are production sound (everything captured on set during the shoot), ADR (Automated Dialog Replacement, lines re-recorded in a studio when the production track is unusable), sound effects (SFX) (hard effects like gunshots, designed effects like spaceships, ambiences like crowd walla and forest beds), and Foley (per-character footsteps, cloth movement, and prop sounds performed in sync to picture on a Foley stage). The combined “sound design” credit (separate from “sound mix”) covers everything except dialog and score — though increasingly the design role spills into rhythmic music-adjacent territory (drones, designed swells, hybrid score-as-sound).
In games the breakdown shifts: dialog is still dialog (with vastly more lines than a film — God of War Ragnarök 2022 shipped over 36,000 lines), sound effects become an interactive, parameter-driven layer that responds to game state, music typically uses interactive (adaptive) techniques rather than linear cues, and Foley is largely replaced by procedural footstep and cloth systems triggered by character animations and surface tags.
Film workflow — production sound
Production sound is captured on set by the production sound mixer (also called the sound recordist, a department head who answers to the director and producers), the boom operator (suspending a shotgun mic on a fishpole or boom over the actors), and the utility sound technician (managing radio mics, slates, and timecode sync). The recordist’s primary tools:
- Sound Devices Scorpio (32 channels / 36 tracks, the 2026 high-end film standard), Sound Devices 888 (16 channels / 20 tracks), and Sound Devices 833 (8 channels / 12 tracks) — the dominant location recorders since the late 2010s.
- Zaxcom Nova / Deva 24 — Zaxcom’s competing platform with internal recording on the wireless transmitter (Zaxcom ZMT4 records 32-bit float internally, so the bodypack is also a backup recorder).
- Aaton Cantar X3 — French boutique, used on Roma (2018) and many cinematographer-led indie productions.
- Zoom F8n Pro and Sound Devices MixPre-10 II — the lower end of professional and the upper end of prosumer.
Boom microphones are dominated by the Sennheiser MKH 416 (in production since 1976, the global film boom standard: a short shotgun with an RF-condenser capsule and an interference tube), the Sennheiser MKH 8060 (newer, smaller, slightly less off-axis coloration), the Schoeps CMIT 5U (German, prized for low self-noise and natural off-axis response, the high-end boom on most prestige features), and the DPA 4017B (Danish supercardioid shotgun). Suspensions and windshields (Cinela COSI and Pianissimo, Rycote modular blimps and Cyclone) and dead-cat fur covers are essential exterior gear.
Lavalier microphones (DPA 6060 / 4060 / 4071, Sanken COS-11D, the longstanding lavalier of choice for cinema, Sennheiser MKE 1 / MKE 2) are wired to wireless bodypacks: Lectrosonics SSM (super-slot subminiature transmitter, one of the smallest and lightest bodypacks in use, the film standard), Wisycom MTP60, Zaxcom ZMT4-X, Audio Ltd A10. Antenna diversity at the receiver (typically two antennas with a fast switching diversity scheme) suppresses dropouts as actors move.
Post — picture lock, spotting, ADR, Foley
After picture editorial reaches a stable cut (picture lock, though revisions continue), the sound editorial team begins. The supervising sound editor runs the spotting session with the director, walking through the cut and tagging every required sound: every line needing ADR, every Foley pass, every effects-build moment, every ambience cue. Dialog editors clean the production track and assemble ADR; sound effects editors build the FX track with library and recorded material; Foley artists perform the Foley.
ADR is recorded in a treated studio with the actor watching their original performance and replacing the line on a cue of three beeps, speaking on the silent fourth beat. Modern productions increasingly use Synchro Arts Revoice Pro to conform the new take’s timing to the original, removing most of the “ADR feel” that betrayed a Hollywood film in the 1980s.
Foley
The Foley discipline is named for Jack Foley (Universal Studios, 1891-1967), who developed live sound-replacement techniques starting in 1927 for early talkies. Modern Foley stages have multiple pits of different surfaces (gravel, sand, cobblestones, wet/dry concrete, tile, hardwood, carpet, snow, water tank), an arsenal of props (suitcases, doors, leather coats, hammers, kitchenware, glass-shards trays), and a recordist with multiple microphones (typically a Schoeps CMIT or Sennheiser MKH 8050 close-up plus a wider room mic).
Notable Foley artists: Marko Costanzo (C5 Inc., New York, hundreds of features including The Departed), John Roesch (Apocalypse Now, Star Wars Episode V, the Indiana Jones trilogy), Sanaa Joi Kelley (a leading 2010s-2020s Foley artist), Gary Hecker (over 350 features), Heikki Kossi (Finnish, Hilma and many European prestige features).
Premier Foley stages globally:
- Skywalker Sound (Marin County CA) — five Foley stages, the gold standard for Pixar, Lucasfilm, and many prestige features.
- Warner Bros. De Lane Lea (London) — UK feature standard.
- Twickenham Film Studios (London) — historically tied to British features.
- Pinewood Studios (UK) — multiple Foley stages on the lot.
- Sony Pictures Post Production (Los Angeles).
- Technicolor Hollywood (and the surviving Technicolor PostWorks NY).
- Formosa Group (Los Angeles, founded 2010 by sound veterans).
Sound effects libraries
Commercial libraries provide the bulk of the sound palette under tight budgets. Major libraries:
- Boom Library (Mainz, Germany, since 2010) — modern designed material, weapons, cinematic transitions, vehicle libraries.
- Sound Ideas (Canada, since 1978) — the Hollywood Edge, Series 6000, sci-fi material.
- Pro Sound Effects (NY-based) — modern recorded libraries.
- Soundsnap (subscription, accessible mid-tier).
- The Recordist (Frank Bry, weather and outdoor specialist).
- Cinesound / Coll Anderson — film-targeted.
- A Sound Effect marketplace — aggregates dozens of indie publishers.
Field recording is where original sound design begins. Portable recorders: Sound Devices MixPre-3 II / MixPre-6 II / MixPre-10 II, Zoom F3 (32-bit float, dual-A/D, near-impossible to clip, USD 350; it became a field-recordist favorite immediately upon release in 2022), Zoom F8n Pro, Zoom H8, Tascam DR-680. Microphones for field work: Sennheiser MKH 8060 plus Cinela COSI suspension, DPA 4017B, Sanken CO-100K (ultrasonic-capable to 100 kHz, used for pitch-shift design where the recorded source is taken far down in pitch to reveal ultrasonic detail), Schoeps CCM series. Specialist transducers:
- Hydrophones — Aquarian H2a (USD 100, underwater workhorse), DPA 8011 (USD 1,500 professional), JrF C-series (cheaper artisan).
- Contact microphones — LOM Geofón (resonance-suppressed contact mic, popular with the Eric La Casa / Chris Watson school), Cold Gold C-Tac, Barcus-Berry.
- Parabolic dishes — Telinga Pro-X, Wildtronics, Dodotronic.
- Stereo arrays — Sennheiser MKH 8040 ORTF pairs, Schoeps MS sets, DPA 5100 Mobile Surround for 5.1 ambiences.
DAW
Digital audio workstations for film and games:
- Avid Pro Tools Ultimate (formerly Pro Tools HD) — the industry standard for film post since the early 2000s. The session format is the lingua franca for studio-to-studio transfer. Dolby Atmos workflow tightly integrated since version 2019.12; HEAT (an analog-modeling DSP option) for harmonic warmth.
- Steinberg Nuendo — the game-audio leader, with native Wwise integration, Game Audio Connect, and superior cue-list features for episodic and serial content.
- Reaper (Cockos) — a low-cost, fast, infinitely scriptable DAW that has gained ground in game audio, indie film, and broadcast since the late 2010s.
- Logic Pro (Apple, macOS only) — strong for music-tied design, used by composers crossing into design.
- Cubase (Steinberg) — Nuendo’s music-focused sibling.
Dolby Atmos certification is required for theatrical Atmos work — the studio needs a calibrated Atmos monitoring environment (typically 7.1.4 with overheads), the Dolby Atmos Renderer (Mac app provided by Dolby for licensed studios), and a sign-off from a Dolby engineer. The Renderer’s bus structure is a 7.1.2 bed plus up to 118 simultaneous objects, each with X/Y/Z position metadata.
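The bed-plus-objects budget above is simple arithmetic, sketched here in Python. The 128-input total is an assumption inferred from the figures in the text (a 10-channel 7.1.2 bed plus 118 objects), and the function names are illustrative rather than part of any Dolby tool.

```python
# Assumption: a fixed 128-input budget, inferred from "7.1.2 bed + 118 objects".
RENDERER_INPUTS = 128


def bed_channel_count(layout: str) -> int:
    """Count channels in a layout string such as "7.1.2".

    The three fields are ear-level speakers, LFE channels and overheads,
    so the total is just the sum of the fields.
    """
    return sum(int(part) for part in layout.split("."))


def objects_available(bed_layout: str) -> int:
    """Inputs left over for objects once the bed has been allocated."""
    return RENDERER_INPUTS - bed_channel_count(bed_layout)


if __name__ == "__main__":
    print(bed_channel_count("7.1.2"))   # 10
    print(objects_available("7.1.2"))   # 118
```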
Edit and design tools
- iZotope RX 11 — the dialog editorial standard. Spectral De-noise (broadband noise reduction), Dialogue De-reverb (machine-learning de-reverberation, 2018 onward), Mouth De-click, Spectral Repair (paint-out airplanes, sirens, sneezes), Dialogue Contour (reshaping the pitch contour of a line), Loudness Control. iZotope shipped Music Rebalance (ML-based stem separation) in RX 7 and improved it through later releases up to RX 11.
- Cedar Audio — UK boutique audio restoration, DNS One de-noiser used in film and broadcast since the 1990s.
- Synchro Arts Revoice Pro and VocAlign Pro — time and pitch align an ADR take to the original production line, removing the need for the actor to match cadence.
- Soundminer and BaseHead — sound effects database front-ends with metadata search, waveform preview, drag-to-session workflow. Both index hundreds of thousands of files across Pro Tools, Nuendo, and Reaper.
- Audio Design Desk (ADD) — modern AI-tagged sound effects search.
Synthesis and processing for sound design
Sound design synthesis is where the craft becomes inventive — generating sounds that no microphone could capture.
- Native Instruments Reaktor 6 — a modular synthesis and effects environment with thousands of user-built ensembles (Razor, Monark, The Mouth, Skanner XT). Used by Tom Holkenborg (Junkie XL) for scoring and design.
- Symbolic Sound Kyma (USD ~16,000 system with Pacarana DSP hardware) — the high-end sound-design workstation of choice for Ben Burtt style design. Used on the Avatar films, Stranger Things, and many AAA games for textures that defy other tools.
- Output — Portal (granular plugin), Substance (hybrid synth), Arcade (sample-loop platform).
- Ableton Live + Max for Live — the open-creative-coding pipeline; Max for Live patches by Mark Fell, Robert Henke (Monolake), and many other artist-developers.
- GRM Tools (INA-GRM, France) — descended from the musique concrète tradition, classic Doppler / Shuffling / Comb / Freezing plugins.
- SoundToys — EchoBoy (multi-character delay), Crystallizer (granular pitch-shifted ostinato echo), Decapitator (saturation), Devil-Loc (extreme limiter), MicroShift (subtle stereo widening); ubiquitous in film and TV.
- Eventide — the H3000 hardware harmoniser (1986, the sound of the late-1980s film score), the H9000 modern flagship, the Blackhole reverb, the H910 — used for “alien voice” pitch manipulation since Aliens (1986).
- Krotos Reformer Pro — real-time replacement of a microphone input with a library sound, mapped by amplitude and spectral envelope. Performable Foley.
- TwistedWave and Audacity — quick stereo editors for prep and library work.
- MetaSynth (U&I Software) — image-to-sound synthesis, famously used in Dane A. Davis’s sound design for The Matrix.
- Csound, SuperCollider, Pure Data, VCV Rack — open-source and academic synthesis platforms used at the inventive edge.
Mixing for film — stems and the dub stage
The film mix culminates in the dub (or final mix): the re-recording mixers (often two, one for dialog and music and one for effects and Foley) sit at a film-rated mixing console (Avid S6, AMS Neve DFC, Harrison MPC4-D) in a calibrated mix theatre and mix the entire film. The deliverable is a set of stems:
- DX (Dialog) stem — dialog only, dry, used for foreign dub replacement.
- MX (Music) stem — score and source music.
- FX (Effects) stem — sound effects and Foley.
- M&E (Music and Effects) — the DX-less mix for foreign localization.
- Atmos / 7.1.4 / 5.1 / Lt/Rt / stereo / mono downmixes.
Lt/Rt is the matrix-encoded stereo version of a surround mix: a stereo pair with the centre and surround information folded into L and R via level and phase relationships, decodable by a Dolby Surround / Pro Logic matrix decoder. The LFE (Low Frequency Effects) channel is the “.1”: a band-limited (typically < 120 Hz) channel reproduced with 10 dB of extra in-band gain relative to the main channels at calibrated playback, which gives low-frequency effects additional headroom without consuming it in the main channels. It carries cinematic impact: explosions, low rumbles, dragon roars.
Surround and immersive formats
- Stereo — two channels, ITU 60° aperture.
- 5.1 — three front (L / C / R), two surround (Ls / Rs), one LFE. ITU-R BS.775 places surrounds at ±110° from centre. The Dolby Digital theatrical format since 1992, and later the home format.
- 7.1 — adds back-surround pair behind the listener (Bs L / Bs R at ±150°). Dolby Surround 7.1 in cinema (since Toy Story 3, 2010).
- Dolby Atmos — object-based audio. A channel-based “bed” (up to 7.1.2, called 9.1 in cinema terminology) carries traditional channel-based mixing, while up to 118 simultaneous “objects” carry mono streams with three-dimensional positioning metadata (X/Y/Z plus size, snap behaviour, divergence). The renderer in the playback environment maps objects to whatever speaker layout is available: a flagship Dolby cinema (60+ speakers including overheads), an Atmos home theatre (typically 5.1.4 or 7.1.4 with four ceiling speakers), a Sonos Arc soundbar with virtualised height, or a pair of Apple AirPods Pro with binaural HRTF rendering.
- Dolby Atmos Music — Apple Music launched its Dolby Atmos catalog in June 2021; Tidal and Amazon Music had offered Atmos music since late 2019. Atmos mixes are delivered as ADM BWF (Audio Definition Model) masters.
- DTS:X — DTS’s object-based competitor, used in some cinemas and consumer hardware.
- Auro 3D — Belgian channel-based 11.1 / 13.1 format with elevation, used in select cinemas and home AV products.
- Sony 360 Reality Audio — object-based with HRTF rendering, distributed via Amazon Music HD, Tidal, and Deezer, and supported on Sony Bravia products and select Sony headphones.
- Ambisonics (B-format, AmbiX, FuMa) — spherical-harmonic-based 3D audio. First-order ambisonics uses 4 channels (W omni, X front-back, Y left-right, Z up-down); third-order uses 16 channels and is the production format for high-quality VR. YouTube 360 video supports first-order; Meta Quest content typically uses third-order.
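The channel counts quoted for ambisonics follow from the number of spherical harmonics up to a given order, (N + 1)^2. A minimal Python sketch of that count and of the ACN channel index used by AmbiX follows; it covers channel ordering only (the gain differences between FuMa and AmbiX normalisation are ignored), and the helper names are illustrative.

```python
def ambisonic_channels(order: int) -> int:
    """Number of channels for a full-sphere ambisonic signal of a given order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    return (order + 1) ** 2


def acn_index(degree: int, index: int) -> int:
    """Ambisonic Channel Number for harmonic degree l and index m (-l <= m <= l)."""
    if abs(index) > degree:
        raise ValueError("index must satisfy -degree <= index <= degree")
    return degree * (degree + 1) + index


# First-order components as (degree, index) pairs.
FIRST_ORDER = {"W": (0, 0), "Y": (1, -1), "Z": (1, 0), "X": (1, 1)}

# AmbiX (ACN) channel order versus traditional FuMa order.
acn_order = sorted(FIRST_ORDER, key=lambda name: acn_index(*FIRST_ORDER[name]))
FUMA_ORDER = ["W", "X", "Y", "Z"]

# For each ACN slot, which FuMa channel to read from (reordering only, no gain fix).
fuma_to_acn = [FUMA_ORDER.index(name) for name in acn_order]

if __name__ == "__main__":
    print([ambisonic_channels(n) for n in (1, 2, 3)])  # [4, 9, 16]
    print(acn_order)                                   # ['W', 'Y', 'Z', 'X']
    print(fuma_to_acn)                                 # [0, 2, 3, 1]
```

Mixing up the two channel orders is a common cause of soundfields that rotate the wrong way, so it is worth checking the convention when exchanging first-order files between tools.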
Dolby Atmos workflow
The Atmos session in Pro Tools / Nuendo routes channels to bed buses and objects. The Dolby Atmos Renderer (a separate Mac application licensed per studio, current version 5.2) receives audio plus metadata over Dante or via Send/Receive plugins, monitors via Atmos-certified speakers, and authors the ADM BWF master file. The Atmos Production Suite is the in-DAW renderer (lower cost, single workstation). For home Atmos delivery the renderer creates a .mp4 with Dolby Digital Plus JOC (Joint Object Coding) or Dolby TrueHD. For cinema Atmos the deliverable is a DCP (Digital Cinema Package) with Atmos metadata.
Cinema Atmos differs from home Atmos: cinema systems drive far more loudspeakers (dozens of individually addressed surround and overhead speakers), render objects directly to those loudspeakers with no head-related (HRTF) processing, and use a different LFE and bass-management topology. Home Atmos is rendered to whatever layout is present; on phones and headphones it is binauralised to give a virtualised height-and-surround impression from two transducers.
Theatre monitoring follows ITU-R BS.775 for surround geometry and the X-curve / SMPTE ST 202 for cinema-mix-room frequency response — a roll-off of approximately 3 dB per octave above 2 kHz, intended to match the perceived response of the larger theatres for which the mix is destined. There is ongoing controversy about whether the X-curve is still appropriate or whether flat-response monitoring is preferable; recent revisions (SMPTE TC-25CSS) propose a less aggressive roll-off.
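The roll-off described above can be written as a one-line function of frequency. The Python sketch below implements only the simplified slope stated in the text (flat up to 2 kHz, then about 3 dB per octave); the published curve has additional detail at the frequency extremes that is deliberately left out, and the function name is illustrative.

```python
import math


def x_curve_db(freq_hz: float,
               knee_hz: float = 2000.0,
               slope_db_per_octave: float = 3.0) -> float:
    """Approximate X-curve target level in dB relative to the flat region.

    Simplification: flat below the knee, constant slope above it.
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    if freq_hz <= knee_hz:
        return 0.0
    octaves_above_knee = math.log2(freq_hz / knee_hz)
    return -slope_db_per_octave * octaves_above_knee


if __name__ == "__main__":
    for f in (1000, 2000, 4000, 8000):
        print(f, round(x_curve_db(f), 1))
    # 1000 0.0
    # 2000 0.0
    # 4000 -3.0
    # 8000 -6.0
```

Passing a smaller `slope_db_per_octave` is one way to compare a gentler roll-off against the traditional curve.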
Game audio middleware
Where film delivers a fixed linear mix, games construct the soundtrack at runtime from authored building blocks. Audio middleware sits between the audio designer’s tools and the game engine:
- Audiokinetic Wwise — the dominant AAA middleware since the late 2000s. Audiokinetic was acquired by Sony Interactive Entertainment in 2019. The Authoring application is a Windows / Mac standalone where designers organize containers, define RTPCs (Real-Time Parameter Controls, game variables that drive audio properties), states, switches, events, and mixing structure. The Wwise SDK integrates into Unity, Unreal Engine, and proprietary engines; the runtime is a high-performance C++ audio engine with sub-millisecond response. Used in Assassin’s Creed, Cyberpunk 2077, Helldivers 2.
- FMOD Studio (Firelight Technologies) — Wwise’s main competitor, popular in indie and mid-tier AAA. Strong in adaptive music. Used in Hades and Celeste.
- Fabric (Tazman Audio) — Unity-native, less common.
- Unity native audio — built-in audio engine with mixer groups, snapshots, and AudioSource components; sufficient for simpler titles.
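- Unreal Engine MetaSounds — Epic’s modular procedural audio system, introduced with Unreal Engine 5 (5.0 shipped in 2022) as the successor to Sound Cues (USoundCue), which remain supported. Node-graph-based DSP authored in-engine.
- WAAPI (Wwise Authoring API) — a JSON-based API over WAMP or HTTP for automating Wwise authoring, commonly scripted from Python, JavaScript, or C#.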
Interactive audio concepts
Game audio thinks in terms of state rather than time:
- RTPC (Real-Time Parameter Control) — a continuous game-side variable (player health, vehicle RPM, distance to enemy) that drives an audio property (volume, pitch, low-pass cutoff, send level). The engine smoothly interpolates as the value changes.
- Switches — discrete game-side states (surface type, weapon type, character class) that select between alternative samples. The footstep system is canonical: a switch on the surface tag (“wood” / “metal” / “grass” / “snow”) selects from a random container of footstep variations.
- States — global states (combat / explore / cinematic / paused) that recall mix snapshots.
- Containers — random, sequence, switch, and blend containers organize the underlying samples and apply variation. A weapon report uses a random container of 6-12 variations to avoid the “machine-gun repetition” artifact.
- Event triggers — code-side calls (typically a one-line API call like AkSoundEngine.PostEvent("Player_Footstep", gameObject)) that fire authored events.
- 3D positional audio — sources have a world position, attenuation curves (distance-vs-gain), cone shapes (forward-facing emitters), Doppler shift (frequency modulation as relative velocity changes), and reverb-send levels. The listener has a position and orientation; the engine computes panning, distance gain, and Doppler at the audio rate.
- Occlusion / obstruction — geometry-based culling: occlusion (a wall between source and listener) applies low-pass filter plus level reduction; obstruction (a partial barrier) applies less aggressive filtering. Wwise’s Spatial Audio module computes these at runtime from geometry.
- Reverb zones / convolution reverb — different game regions have different reverb characteristics; modern engines use convolution with measured or designed impulse responses.
- Music transitions — vertical layering plays simultaneous stems (drums, bass, strings, brass) and crossfades them based on tension state; horizontal re-sequencing transitions between musical sections at musically-meaningful sync points (next bar, next beat, next downbeat); stingers are short one-shot musical interjections triggered by events.
- Weapon foley layering — a single gunshot is typically 5-15 layered samples: shell drop, slide cycle, casing eject, magazine release, mechanism click, hammer fall, muzzle blast, tail reverb. Each is randomised independently for variation.
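The RTPC, switch, and random-container ideas above can be sketched in a few lines of Python. This is a toy model for illustration, not the Wwise or FMOD API: every class, function, and sample name here is hypothetical, and real middleware adds voice management, streaming, and far richer curve types.

```python
import bisect
import random


def rtpc_map(value, curve):
    """Piecewise-linear RTPC lookup.

    `curve` is a list of (game_value, audio_value) points sorted by game_value.
    Values outside the curve clamp to the end points.
    """
    xs = [x for x, _ in curve]
    if value <= xs[0]:
        return curve[0][1]
    if value >= xs[-1]:
        return curve[-1][1]
    i = bisect.bisect_right(xs, value)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (value - x0) / (x1 - x0)


class RandomContainer:
    """Picks a random variation, never the same one twice in a row."""

    def __init__(self, samples, rng=None):
        if len(samples) < 2:
            raise ValueError("need at least two variations to avoid repeats")
        self.samples = list(samples)
        self.rng = rng or random.Random()
        self.last = None

    def next(self):
        choices = [s for s in self.samples if s != self.last]
        self.last = self.rng.choice(choices)
        return self.last


class SwitchContainer:
    """Selects a child container from a discrete game-side switch value."""

    def __init__(self, children, default):
        self.children = children
        self.default = default

    def post(self, switch_value):
        child = self.children.get(switch_value, self.children[self.default])
        return child.next()


if __name__ == "__main__":
    # Hypothetical engine curve: RPM drives pitch offset in cents.
    rpm_to_pitch = [(800, -1200.0), (3000, 0.0), (7000, 1200.0)]
    print(rtpc_map(1900, rpm_to_pitch))  # -600.0

    footsteps = SwitchContainer(
        {
            "wood": RandomContainer(["wood_01", "wood_02", "wood_03"]),
            "snow": RandomContainer(["snow_01", "snow_02", "snow_03"]),
        },
        default="wood",
    )
    print([footsteps.post("snow") for _ in range(4)])
```

Excluding the previously played variation is the simplest guard against the machine-gun effect; shuffle-without-replacement is a common alternative when the pool is small.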
Spatial audio for games and XR
- HRTF (head-related transfer function) — the frequency response from a point in space to each ear, encoding the spectral cues (head shadowing, pinna filtering, inter-aural time and level differences) that the brain uses for localisation. Personalised HRTFs (measured per individual) work better than generic ones, but generic-HRTF rendering is good enough for most consumer applications.
- Steam Audio (Valve, free) — open-source spatial audio SDK with HRTF binaural rendering, geometry-based reverb, and occlusion. Unity, Unreal, and FMOD integrations.
- Oculus Audio SDK / Meta XR Audio — Meta’s spatializer for Quest, with HRTF and acoustic simulation.
- Microsoft Project Acoustics — geometry-based pre-computed wave-acoustic simulation, used in Gears 5 and Sea of Thieves. Bakes diffraction and occlusion effects that are costly to match with runtime ray-tracing.
- Apple Spatial Audio — head-tracked binaural rendering on AirPods Pro / Max / Vision Pro, available to game developers via PHASE (Physical Audio Spatialization Engine) on Apple Silicon.
- Sony Tempest 3D — PS5 hardware-accelerated spatial-audio engine; used heavily in Returnal, Astro’s Playroom.
XR titles (VR and AR) aim for sub-20 ms motion-to-sound latency to maintain immersion. Perceptual research (Brungart, Simpson, et al.) suggests that head-tracking latency in spatial audio becomes detectable somewhere in the tens of milliseconds, with the most sensitive listeners and conditions approaching the 20-30 ms range, so a 20 ms target leaves some margin. Audio is updated at the engine’s frame rate or higher (often 120-240 Hz) with extrapolation between frames.
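A latency budget like the one above is easy to sanity-check with arithmetic. The Python sketch below adds up a few typical contributors; the breakdown itself and every number in the example (tracker delay, update rate, buffer size, queue depth) are illustrative assumptions, not measurements of any particular headset or engine.

```python
def buffer_latency_ms(frames: int, sample_rate_hz: float) -> float:
    """Duration of one audio buffer in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


def motion_to_sound_ms(tracker_ms: float,
                       update_rate_hz: float,
                       buffer_frames: int,
                       sample_rate_hz: float,
                       buffers_queued: int = 2,
                       output_ms: float = 0.0) -> float:
    """Worst-case motion-to-sound estimate.

    Sums the tracker delay, the wait for the next spatialiser update
    (one full update period in the worst case), the queued audio buffers,
    and any fixed output/DAC delay.
    """
    update_wait_ms = 1000.0 / update_rate_hz
    queued_ms = buffers_queued * buffer_latency_ms(buffer_frames, sample_rate_hz)
    return tracker_ms + update_wait_ms + queued_ms + output_ms


if __name__ == "__main__":
    budget_ms = 20.0
    for frames in (256, 128):
        total = motion_to_sound_ms(tracker_ms=2.0, update_rate_hz=120.0,
                                   buffer_frames=frames, sample_rate_hz=48000.0)
        verdict = "within" if total <= budget_ms else "over"
        print(f"{frames} frames: {total:.1f} ms ({verdict} budget)")
    # 256 frames: 21.0 ms (over budget)
    # 128 frames: 15.7 ms (within budget)
```

Under these assumed numbers, halving the buffer size is what brings the total back under 20 ms, which is why XR audio pipelines tend to favour small buffers despite the extra CPU pressure.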
Famous sound designers — film
- Ben Burtt (b. 1948) — created the lightsaber sound (the hum of old film-projector motors blended with the buzz a television set induced in a microphone cable, with the swoosh made by waving a microphone past a speaker playing the hum), R2-D2’s voice (synthesizer plus Burtt’s own vocal squeaks), Chewbacca (bear and walrus blend), and Wall-E’s voice. Two competitive Academy Awards plus two Special Achievement Oscars.
- Walter Murch (b. 1943) — won three Academy Awards for sound and editing (Apocalypse Now 1979 sound, The English Patient 1996 picture + sound). Credited with “sound montage” on his early films and widely credited with coining the title “sound designer”; author of In the Blink of an Eye (1992); The English Patient was the first digitally edited (Avid) film to win the editing Oscar.
- Randy Thom (b. 1951) — Skywalker Sound, sound designer on Cast Away, The Right Stuff (Oscar 1983), The Incredibles, How to Train Your Dragon.
- Gary Rydstrom — Skywalker, seven Academy Awards (Jurassic Park, Titanic, Saving Private Ryan, Terminator 2). The Tyrannosaurus rex roar was a baby elephant’s call slowed down, combined with a tiger and an alligator.
- Skip Lievsay — Coen Brothers’ regular sound designer (Fargo, No Country for Old Men, A Serious Man); won the Academy Award for sound mixing for Gravity (2013).
- Erik Aadahl — Transformers, A Quiet Place (2018, where the entire premise hinged on sound design).
- Mark Mangini — Mad Max: Fury Road (Oscar 2015, with David White), Dune (2021, Oscar).
- Tom Myers and Ren Klyce — Pixar’s principal sound designers; Klyce works with David Fincher on the Fincher catalog.
- Hildegard Westerkamp, Chris Watson (UK, formerly of Cabaret Voltaire, now a wildlife recordist for the BBC Natural History Unit), Jean-Claude Risset — academic and field-recording lineages feeding modern design.
Famous sound designers — games
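- Tim Larkin — Riven (1997) and later Myst-series titles at Cyan, then sound design and music at Valve.
- Brian Schmidt — Madden NFL, Xbox audio architecture at Microsoft, founder of GameSoundCon and later president of the Game Audio Network Guild (G.A.N.G.).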
- Jack Wall — Mass Effect, Call of Duty: Black Ops II music.
- Marty O’Donnell — Halo: Combat Evolved through Halo: Reach (Bungie 1999-2010), Destiny (2014, Bungie). The Halo theme’s monk-chant opening became one of the most-recognised game audio motifs.
- Inon Zur — Fallout 3 / 4 / 76, Dragon Age.
- Akira Yamaoka — Silent Hill series; pioneered the industrial-ambient horror-game audio palette.
- Jesper Kyd — Hitman series, Assassin’s Creed II, Borderlands.
- Mick Gordon — DOOM 2016 and DOOM Eternal (2020); the meticulous metal-and-design hybrid score.
- Audio direction at studios: Rockstar’s Vespa Audio team (GTA V’s full-city audio simulation); Sony Santa Monica (God of War 2018 + Ragnarök, audio direction by Mike Niederquell); Naughty Dog (The Last of Us series, Phillip Kovats later Christian Bohm); Insomniac (Spider-Man, Bjorn Arvidsson); CD Projekt Red (Cyberpunk 2077, Marcin Przybyłowicz music, Krzysztof Zięba SFX).
Music for picture
The composer (typically credited separately from sound design) writes the score. Workflow: temp tracks (existing music used by the editor during cutting), spotting (with the director, where music starts and stops), mockups (MIDI-and-sample-library demos), recording (live orchestra at Abbey Road Studio One, AIR Studios Lyndhurst, Synchron Stage Vienna, Sony Scoring Stage Hollywood, Newman Scoring Stage at 20th Century), mixing, and final delivery as stems.
Notable contemporary composers:
- John Williams (b. 1932) — five Academy Awards, 54 nominations; Star Wars (1977-2019), Indiana Jones, E.T., Schindler’s List, the Harry Potter scores 1-3.
- Hans Zimmer (b. 1957) — Lion King (Oscar 1994), Gladiator, the Dark Knight trilogy, Inception, Interstellar, Dune (Oscar 2021), Dune: Part Two (2024). Founded Remote Control Productions, the team-based scoring studio.
- Hildur Guðnadóttir (Icelandic, b. 1982) — Joker (Oscar 2020), Chernobyl (Emmy 2019), TÁR (2022). The Joker score used a single overdriven cello as the primary motif.
- Trent Reznor & Atticus Ross — Nine Inch Nails members; The Social Network (Oscar 2010), Soul (Oscar 2021, shared with Jon Batiste), Mank (nominated 2021).
- Ludwig Göransson — Tenet, Black Panther (Oscar 2018), Oppenheimer (Oscar 2024).
- Jonny Greenwood — Radiohead member; There Will Be Blood, Phantom Thread, The Power of the Dog.
Cue sheets are submitted to performance-rights organisations (PROs — ASCAP / BMI / SESAC in the US, PRS in the UK, GEMA in Germany, SACEM in France, JASRAC in Japan) to trigger performance royalties when the film airs.
Loudness standards
- Cinema reference — DCI calibration is 85 dB SPL per main channel at the listener position with -20 dBFS pink noise. Theatrical mixes have no loudness-normalisation target; dialog mixed at reference level tends to land somewhere around -27 LUFS, and the broad dynamic range is preserved.
- Broadcast EU — EBU R128 (first published 2010 and revised since; supplement s1 adds short-term loudness limits for short-form content such as adverts) targets -23 LUFS integrated over the programme, with a maximum true peak of -1 dBTP.
- Broadcast US — ATSC A/85 (CALM Act 2010) targets -24 LKFS (equivalent to LUFS, BS.1770-3 measurement).
- Streaming video — Netflix specifies dialog-gated -27 LKFS (±2 LU) with a -2 dBTP true-peak ceiling; Atmos mixes are held to the same target, measured on a 5.1 re-render. The -18 LKFS figure sometimes quoted for Atmos is the Dolby Atmos Music target, not a film or series target.
- Streaming music — Spotify -14 LUFS integrated; Apple Music -16 LUFS Sound Check default (Atmos Music masters are delivered at no louder than -18 LKFS integrated with a -1 dBTP ceiling); Tidal -14; Amazon Music HD -14; YouTube -14.
- Games — variable; the AAA convention since the late 2010s has been a target of -23 LUFS integrated for the average gameplay loudness, with peaks managed by the mix bus limiter. Microsoft’s Xbox Series X / S support automatic loudness normalisation via the system mixer.
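Moving a finished mix between these targets is a matter of a static gain offset plus a true-peak check, since 1 LU equals 1 dB. The Python sketch below assumes the integrated loudness and true peak have already been measured with a BS.1770 meter; it does not measure anything itself, and the function names and example readings are illustrative.

```python
def normalisation_gain_db(measured_lufs: float, target_lufs: float) -> float:
    """Static gain (dB) that moves a programme from its measured loudness to the target."""
    return target_lufs - measured_lufs


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


def peak_after_gain(true_peak_dbtp: float, gain_db: float) -> float:
    """True peak (dBTP) after a static gain has been applied."""
    return true_peak_dbtp + gain_db


def needs_limiting(true_peak_dbtp: float, gain_db: float, ceiling_dbtp: float) -> bool:
    """True if the gained programme would exceed the delivery ceiling."""
    return peak_after_gain(true_peak_dbtp, gain_db) > ceiling_dbtp


if __name__ == "__main__":
    measured_lufs, measured_peak = -19.5, -0.8   # hypothetical meter readings

    # Broadcast delivery at -23 LUFS with a -1 dBTP ceiling: turn down, peaks are safe.
    g = normalisation_gain_db(measured_lufs, -23.0)
    print(g, needs_limiting(measured_peak, g, -1.0))    # -3.5 False

    # Music-streaming level of -14 LUFS: turning up pushes peaks over the ceiling.
    g = normalisation_gain_db(measured_lufs, -14.0)
    print(g, needs_limiting(measured_peak, g, -1.0))    # 5.5 True
```

When the check returns True, a plain gain change is not enough and the mix needs limiting or a remix before it can meet both the loudness and the peak requirement.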
Audio for VR and AR
XR introduces several constraints absent in film and traditional games:
- Motion-to-sound latency must be < 20 ms to avoid breaking presence. For Apple Vision Pro (launched February 2024), Apple quotes roughly 12 ms from sensor input to display for the dedicated R1 chip, which handles sensor fusion at sub-frame rates; head-tracked audio has to sit inside a comparably tight budget.
- Head tracking — six-degree-of-freedom (6DoF) audio must rotate the soundfield with the user’s head and translate sound sources accordingly. Generic-HRTF rendering plus head-tracked rotation is standard.
- Acoustic simulation — Microsoft Project Acoustics, Steam Audio, and Meta XR Audio compute occlusion, diffraction, and reverb based on the virtual geometry, often with bakes for static geometry and runtime ray-tracing for dynamic elements.
- Hardware: Apple Vision Pro (spatial audio with dynamic head tracking through its built-in speakers or paired H2-equipped AirPods Pro); Meta Quest 3 / 3S / Pro; Sony PSVR2 with Tempest 3D Audio; PICO 4 Ultra (ByteDance).
AI in sound
AI tools entered the design pipeline aggressively from 2020 onward:
- NVIDIA RTX Voice / NVIDIA Broadcast — real-time noise suppression and echo cancellation using deep learning, GPU-accelerated.
- iZotope RX 10/11 — Music Rebalance (stem separation), Dialogue Isolate, Spectral Recovery (ML-based bandwidth extension), Repair Assistant.
- Sonible Smart Limit / Smart EQ / Smart Comp — adaptive ML-driven mastering and mixing tools.
- Krisp — real-time noise suppression for live communication, popular in remote ADR sessions during the 2020-2021 pandemic and beyond.
- Replica Studios — AI voice synthesis and cloning aimed at game dialog, particularly background NPC lines.
- ElevenLabs — voice synthesis and cloning, the dominant AI-voice provider since 2022.
- Resemble AI and Adobe Project Sound Lift — voice synthesis and de-mixing.
- Adobe Podcast (Enhance Speech) — single-pass dialog cleanup.
- Stable Audio (Stability AI) and AudioCraft (Meta) — generative audio for music and sound effects; quality is improving but not yet at AAA-game / feature-film bar.
Concerns over digital replicas and synthetic voices were a central issue in the 2023 SAG-AFTRA strike. The agreement reached in November 2023 included specific AI clauses requiring informed consent and compensation for digital replicas of performers, and notice and bargaining obligations when synthetic performers are used in covered productions.
DAW automation and orchestral libraries
DAW automation tracks any plugin or mixer parameter: volume, pan, sends, plugin params, articulation maps (lanes that select between violin “spiccato” / “legato” / “tremolo” via MIDI keyswitches or program changes). VCA grouping lets a group of channels be ridden by a single fader while preserving each channel’s individual automation underneath.
Orchestral sample libraries form the backbone of mockup and demo scoring:
- Spitfire Audio (UK) — Albion series, BBC Symphony Orchestra Pro, Hans Zimmer Strings, Spitfire Symphony Orchestra. The BBC SO Pro was recorded at Maida Vale studios with the actual BBC Symphony Orchestra.
- Vienna Symphonic Library (VSL) — Synchron Series, recorded at the Synchron Stage Vienna with all-articulation deep sampling.
- Heavyocity — Damage, Gravity, Master Sessions; trailer and modern-cinematic palette.
- Cinematic Studio Series (Cinematic Studio Strings / Brass / Woodwinds / Solo Strings) — Alex Wallbank’s libraries, the default ensemble for many composer mockups.
- Orchestral Tools — Berlin Series (recorded at the Teldex Scoring Stage Berlin), Junkie XL Brass, the Sine player.
- EastWest Hollywood Series — Hollywood Strings, Hollywood Orchestra Opus Edition; recorded at the EastWest Studios LA.
- Native Instruments Symphony Series — strings, brass, woodwinds, percussion.
MIDI keyswitches let a composer trigger an articulation change (legato to spiccato) from a low-note keyswitch while continuing to play the melody normally on the rest of the keyboard.
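At the byte level a keyswitch is just an ordinary MIDI note sent outside the instrument's playable range. The Python sketch below builds the raw messages for a short phrase; the keyswitch note numbers are hypothetical (every library defines its own layout), and the helper names are illustrative.

```python
NOTE_ON = 0x90    # status byte for note-on, channel added in the low nibble
NOTE_OFF = 0x80   # status byte for note-off

# Hypothetical keyswitch layout; check the library's manual for the real one.
KEYSWITCHES = {"legato": 24, "spiccato": 25, "tremolo": 26}


def _check(value: int, upper: int, name: str) -> int:
    if not 0 <= value <= upper:
        raise ValueError(f"{name} must be in 0..{upper}")
    return value


def note_on(note: int, velocity: int = 100, channel: int = 0) -> bytes:
    """Three-byte MIDI note-on message."""
    return bytes([NOTE_ON | _check(channel, 15, "channel"),
                  _check(note, 127, "note"),
                  _check(velocity, 127, "velocity")])


def note_off(note: int, channel: int = 0) -> bytes:
    """Three-byte MIDI note-off message."""
    return bytes([NOTE_OFF | _check(channel, 15, "channel"),
                  _check(note, 127, "note"), 0])


def phrase(articulation: str, melody_notes: list[int], channel: int = 0) -> list[bytes]:
    """Keyswitch first, then the melody notes played normally."""
    ks = KEYSWITCHES[articulation]
    messages = [note_on(ks, channel=channel), note_off(ks, channel=channel)]
    for note in melody_notes:
        messages.append(note_on(note, channel=channel))
        messages.append(note_off(note, channel=channel))
    return messages


if __name__ == "__main__":
    for msg in phrase("spiccato", [60, 62, 64]):
        print(msg.hex(" "))
```

Timing is omitted here; in a real sequence the keyswitch is placed slightly ahead of the first melody note so the sampler has switched articulation before the note sounds.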
Audio careers — roles
- Dialog editor — cleans and assembles production dialog, prepares ADR cues. Pro Tools and iZotope RX expertise required.
- Sound effects editor — builds the FX track from library and recorded material; coordinates with the supervising sound editor.
- Foley artist — performs Foley on the Foley stage; props collector and performer simultaneously.
- Foley mixer / recordist — records Foley with microphones and console; coordinates with the artist.
- Re-recording mixer — finalises the mix on the dub stage. Larry Blake (longtime Steven Soderbergh collaborator) and the Skywalker Sound mixing roster define the prestige tier.
- Supervising sound editor — runs the post-sound team, attends spotting sessions, hires the editorial and Foley crew, manages the schedule.
- Audio director (games) — non-coding senior role; defines the audio vision for a title, hires designers and composers, manages budget and pipeline. Examples: Mike Niederquell (Sony Santa Monica), Bjorn Arvidsson (Insomniac), Audrey Hodges (Riot Games).
- Technical sound designer / audio programmer (games) — coding role; implements audio systems, optimizes runtime performance, integrates middleware with the engine, builds tools. C++ and Wwise/FMOD scripting required.
- Composer — writes the music. Represented by separate professional bodies (the Society of Composers & Lyricists for film and television composers in the US; G.A.N.G. for game audio).
- Music editor — cuts the score against picture, manages temp tracks, handles delivery to the dub.
Adjacent
- audio-production-mixing-mastering — Shares DAW vocabulary and the underlying mix and mastering practice; sound design borrows the entire studio toolset.
- live-sound-and-acoustics — Crossover talent for tour sound, broadcast mixing, and immersive concert design.
- sound-synthesis-and-electronic-music — Synthesis techniques are the inventive engine for designed sounds.
- music-theory-essentials — Score-and-design hybrid territory requires harmonic literacy.
- acoustics-noise-control — Foley-stage and dub-theatre acoustic design.
- digital-signal-processing-foundations — The DSP underlying every middleware engine, plugin, and HRTF renderer.