Music Theory Essentials
1. At a glance
Music theory describes how pitches, rhythms, and harmonies relate to each other and to a listener. It is descriptive (analyzing what composers have written) and prescriptive (offering working rules to write idiomatic music in a style). It spans many traditions, but the global lingua franca is Western 12-tone equal temperament (12-TET) — the default tuning of pianos, MIDI, DAWs, and most popular and concert music since roughly the early 20th century. Outside this default lies a vast world: just intonation, Pythagorean tuning, meantone, well temperaments, microtonal divisions (19-TET, 24-TET, 31-TET, 41-TET, 53-TET), and non-Western systems (Arabic maqam with quarter-tone intervals, Indian raga with śruti microtones, Indonesian gamelan with slendro and pelog scales, Japanese in scale, Chinese pentatonic).
The discipline ranges in formality and intent. At one end, pop musicians use chord charts and Nashville Number System shorthand to communicate songs informally. In the middle, jazz musicians study ii-V-I substitutions, modal interchange, and altered-dominant voicings codified by Mark Levine (“The Jazz Theory Book”, 1995) and the Aebersold play-along method. At the other end, academic theorists apply species counterpoint (Fux, 1725), Schenkerian hierarchical reduction (Heinrich Schenker, early 20th century), pitch-class set theory (Allen Forte, 1973), or neo-Riemannian geometric transformations (David Lewin, 1987) to dissect tonal and post-tonal music. Composers writing spectral music (Gérard Grisey, Tristan Murail, mid-1970s onward) derive harmony from FFT analyses of acoustic spectra rather than from triadic stacks, blurring the line between theory and DSP.
2. Pitch and frequency
A pitched musical note corresponds to a periodic (or near-periodic) acoustic waveform with a fundamental frequency. The reference is A4 = 440 Hz (ISO 16:1975, “Acoustics — Standard tuning frequency”). Some symphony orchestras tune slightly sharp — the Berlin Philharmonic and Vienna Philharmonic often use A = 442 or 443, occasionally 444 — for perceived brilliance.
The octave is a 2:1 frequency ratio: A5 = 880 Hz, A3 = 220 Hz, A2 = 110 Hz. In 12-tone equal temperament, the octave is divided into 12 equal semitones, so each semitone is a frequency ratio of 2^(1/12) ≈ 1.0594631. The frequency of a MIDI note number n is:
f_n = 440 × 2^((n − 69) / 12)
where MIDI 69 = A4. So MIDI 60 (middle C, C4) = 440 × 2^(-9/12) ≈ 261.63 Hz, and MIDI 72 (C5) ≈ 523.25 Hz.
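The formula translates directly into code. The following is a minimal sketch; the function name `midi_to_hz` and the `a4` parameter are illustrative choices, not part of any standard API.

```python
def midi_to_hz(n: int, a4: float = 440.0) -> float:
    """Frequency in Hz of MIDI note n in 12-TET, taking MIDI 69 as A4."""
    return a4 * 2 ** ((n - 69) / 12)


# A3, middle C, A4, and C5
for note in (57, 60, 69, 72):
    print(note, round(midi_to_hz(note), 2))
```

Passing `a4=442.0` models an orchestra that tunes slightly sharp of the ISO reference.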
Cents are a logarithmic unit for fine pitch differences: 100 cents per semitone, 1200 cents per octave. The just-noticeable-difference (JND) for pitch is roughly 5-15 cents for sustained tones in the middle register, narrower for trained musicians and wider at extremes of register and for short tones. Pitch detection algorithms in DAW tuners (Antares Auto-Tune, Celemony Melodyne, Waves Tune) report deviations in cents.
3. Intervals
An interval is the distance between two pitches. In 12-TET, intervals up to the octave (measured in semitones, st) are:
| Name | Semitones | Just-intonation ratio |
|---|---|---|
| Unison (P1) | 0 | 1:1 |
| Minor 2nd (m2) | 1 | 16:15 |
| Major 2nd (M2) | 2 | 9:8 |
| Minor 3rd (m3) | 3 | 6:5 |
| Major 3rd (M3) | 4 | 5:4 |
| Perfect 4th (P4) | 5 | 4:3 |
| Tritone (TT, A4/d5) | 6 | 45:32 or 7:5 |
| Perfect 5th (P5) | 7 | 3:2 |
| Minor 6th (m6) | 8 | 8:5 |
| Major 6th (M6) | 9 | 5:3 |
| Minor 7th (m7) | 10 | 16:9 or 9:5 |
| Major 7th (M7) | 11 | 15:8 |
| Octave (P8) | 12 | 2:1 |
The just-intonation ratios are the small-whole-number frequency relationships that produce maximally consonant intervals (locked phase relationships, beatless when in tune). 12-TET approximates these, with the thirds and sixths faring worst: the 12-TET M3 is 400 cents while the pure JI M3 (5:4) is 386.31 cents, so 12-TET is 13.69 cents sharp, and the 12-TET m3 is 15.64 cents flat of 6:5. This is why barbershop quartets and a cappella groups tuning by ear sound noticeably “different” from a piano: they lock onto JI.
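The size of each deviation follows from the definition of the cent (1200 times the base-2 logarithm of the frequency ratio). A short sketch, with the helper name `cents` and the small interval table chosen for illustration:

```python
import math


def cents(ratio: float) -> float:
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)


# (name, JI numerator, JI denominator, 12-TET semitones)
INTERVALS = [("m3", 6, 5, 3), ("M3", 5, 4, 4), ("P4", 4, 3, 5), ("P5", 3, 2, 7)]

for name, num, den, semitones in INTERVALS:
    just = cents(num / den)
    deviation = 100 * semitones - just  # positive means 12-TET is sharp
    print(f"{name}: JI {just:.2f} cents, 12-TET off by {deviation:+.2f}")
```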
Consonance and dissonance are partly cultural and partly psychoacoustic: Helmholtz (“On the Sensations of Tone”, 1863) traced consonance to coincidence of harmonic partials, an explanation refined by Plomp and Levelt (1965) into a sensory-dissonance curve depending on critical-band width.
4. Notation
Western staff notation places pitches on a five-line staff. Clefs anchor the staff to a specific pitch: treble clef circles G4 (second line from bottom), bass clef brackets F3 (fourth line), alto clef centers C4 (middle line, used by viola), tenor clef centers C4 on the fourth line (used by cello, bassoon, trombone in upper register). Ledger lines extend the staff above or below.
Sharps (♯) raise a pitch by one semitone, flats (♭) lower by one, naturals (♮) cancel a prior accidental, double-sharps (𝄪) and double-flats (𝄫) shift by two semitones. The key signature at the start of each staff sets the sharps or flats that apply throughout the piece (or until the key signature changes). The circle of fifths arranges the 15 major key signatures, three pairs of which are enharmonically equivalent (B/C♭, F♯/G♭, C♯/D♭): C major (no sharps/flats), G major (1♯), D (2♯), A (3♯), E (4♯), B (5♯), F♯ (6♯), C♯ (7♯), and on the flat side F (1♭), B♭ (2♭), E♭ (3♭), A♭ (4♭), D♭ (5♭), G♭ (6♭), C♭ (7♭).
Time signatures are written as a fraction at the start of the piece: the top number indicates beats per measure, the bottom number the note value that gets one beat. 4/4 (common time) is four quarter-note beats per measure; 3/4 is the waltz; 6/8 is compound duple (two dotted-quarter beats, each subdividing into three eighths). Note values: whole (semibreve, 4 beats in 4/4), half (minim, 2 beats), quarter (crotchet, 1 beat), eighth (quaver, 1/2 beat), 16th (semiquaver, 1/4), 32nd, 64th. A dotted note adds half its value; a tuplet (triplet, quintuplet, septuplet) divides a beat into an irregular number of equal parts.
Tempo markings range from traditional Italian (largo ~50 BPM, andante ~80, moderato ~110, allegro ~130, presto ~180) to specific metronome marks (♩ = 120). Dynamics run from ppp through pp, p, mp, mf, f, and ff to fff (Italian abbreviations), with crescendo and decrescendo hairpins for gradual changes. Articulation marks include staccato (dot, detached), tenuto (dash, held full value), accent (>), marcato (^), legato (slur), and bowings for strings.
5. Scales and modes
A scale is an ordered set of pitches within an octave, conventionally written ascending.
Major (Ionian) — the canonical Western scale with the step pattern W-W-H-W-W-W-H (whole, half = 2 or 1 semitones). C major: C D E F G A B C.
Minor has three common forms:
- Natural minor (Aeolian) — W-H-W-W-H-W-W. A minor: A B C D E F G A.
- Harmonic minor — natural minor with raised 7th. A harmonic: A B C D E F G♯ A. The augmented-2nd step (F to G♯, 3 semitones) gives it an “exotic” flavor.
- Melodic minor — raised 6th and 7th ascending, then natural minor descending in traditional usage. A melodic: A B C D E F♯ G♯ A ascending; A G F E D C B A descending. Jazz uses the ascending form in both directions, calling it “jazz minor”.
Modes of the major scale — start the major scale on each successive degree:
- I — Ionian (major itself).
- II — Dorian (D Dorian = D E F G A B C D, like C major from D; characteristic raised 6th over a minor framework — “So What” by Miles Davis).
- III — Phrygian (E Phrygian; characteristic lowered 2nd — Spanish, Flamenco, metal).
- IV — Lydian (F Lydian; raised 4th — “dreamy”, film scoring; “The Simpsons” theme).
- V — Mixolydian (G Mixolydian; lowered 7th — rock, Celtic, blues; “Sweet Child o’ Mine” verse).
- VI — Aeolian (natural minor).
- VII — Locrian (B Locrian; diminished 5th — rare, mostly theoretical or in metal contexts).
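Rotating the major-scale step pattern produces each mode in turn. The sketch below assumes pitch classes numbered 0-11 from C and uses sharp-only spellings for simplicity; `mode_steps` and `build_scale` are illustrative names.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]  # W-W-H-W-W-W-H in semitones
MODE_NAMES = ["Ionian", "Dorian", "Phrygian", "Lydian",
              "Mixolydian", "Aeolian", "Locrian"]


def mode_steps(degree: int) -> list[int]:
    """Step pattern of the mode starting on the given 0-based major-scale degree."""
    return MAJOR_STEPS[degree:] + MAJOR_STEPS[:degree]


def build_scale(tonic_pc: int, steps: list[int]) -> list[int]:
    """Pitch classes of a seven-note scale built upward from tonic_pc."""
    pcs = [tonic_pc]
    for step in steps[:-1]:  # the last step only returns to the octave
        pcs.append((pcs[-1] + step) % 12)
    return pcs


# The seven modes that share the notes of C major
tonic = 0
for degree, name in enumerate(MODE_NAMES):
    scale = build_scale(tonic, mode_steps(degree))
    print(f"{NOTE_NAMES[tonic]} {name}:", " ".join(NOTE_NAMES[pc] for pc in scale))
    tonic = (tonic + MAJOR_STEPS[degree]) % 12
```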
Pentatonic scales — five-note scales pervasive globally:
- Major pentatonic: 1 2 3 5 6 (C major pent: C D E G A).
- Minor pentatonic: 1 ♭3 4 5 ♭7 (A minor pent: A C D E G).
Blues scale — minor pentatonic plus a “blue note” ♭5: A C D E♭ E G. The ambiguous third (between minor and major) is the signature of blues, R&B, and rock vocal phrasing.
Symmetric scales — built from a repeating interval pattern:
- Whole-tone — six whole steps: C D E F♯ G♯ A♯ C. Only two distinct whole-tone scales exist (transposing yields one of two). Used by Debussy (“Voiles”) for static, hazy harmony.
- Diminished (octatonic) — alternating whole-half (W-H-W-H-W-H-W-H) or half-whole. Used by Stravinsky, Messiaen, and in jazz over dominant 7♭9 chords.
- Chromatic — all 12 pitches.
Other scales of jazz and modal interest — harmonic major (major scale with ♭6), Lydian dominant (Mixolydian ♯4, the fourth mode of melodic minor), altered scale (super Locrian, seventh mode of melodic minor — the “altered” sound over V7alt chords), Phrygian dominant (Spanish/Klezmer; fifth mode of harmonic minor).
Messiaen modes of limited transposition — Olivier Messiaen (“Technique de mon langage musical”, 1944) catalogued seven scales whose transpositions yield duplicates after fewer than 12 shifts. Mode 1 = whole-tone, Mode 2 = octatonic, Mode 3 = three groups of (W-H-H), and so on. They underpin much of his organ and orchestral output (“Quatuor pour la fin du temps”, “Turangalîla-Symphonie”).
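The "limited transposition" property can be checked by counting how many distinct pitch-class sets the twelve transpositions of a scale produce. A minimal sketch, with `distinct_transpositions` as an illustrative name:

```python
def distinct_transpositions(pcs: list[int]) -> int:
    """Number of different pitch-class sets among the 12 transpositions of pcs."""
    base = frozenset(p % 12 for p in pcs)
    shifted = {frozenset((p + t) % 12 for p in base) for t in range(12)}
    return len(shifted)


WHOLE_TONE = [0, 2, 4, 6, 8, 10]            # Mode 1
OCTATONIC = [0, 2, 3, 5, 6, 8, 9, 11]       # Mode 2 (whole-half form)
MODE_3 = [0, 2, 3, 4, 6, 7, 8, 10, 11]      # three groups of W-H-H
MAJOR = [0, 2, 4, 5, 7, 9, 11]              # not a mode of limited transposition

for label, scale in [("whole-tone", WHOLE_TONE), ("octatonic", OCTATONIC),
                     ("mode 3", MODE_3), ("major", MAJOR)]:
    print(label, distinct_transpositions(scale))
```

A result below 12 marks a mode of limited transposition; the major scale yields all 12.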
Microtonal scales — beyond 12 tones per octave:
- 19-TET (19 equal divisions) — excellent meantone approximation; explored by Joseph Yasser (1932) and Easley Blackwood.
- 24-TET (quarter tones) — common in Arabic music notation; used by Alois Hába, Charles Ives, Ivan Wyschnegradsky.
- 31-TET — extended meantone, advocated by Christiaan Huygens (1691) and Adriaan Fokker mid-20th-century.
- 41-TET and 53-TET — close approximations of just intonation; a single step of 53-TET (≈ 22.64 cents) falls close to both the Pythagorean comma (≈ 23.46 cents) and the syntonic comma (≈ 21.51 cents).
- Just intonation revival — Harry Partch built custom 43-tone JI instruments (Chromelodeon, Quadrangularis Reversum, Cloud-Chamber Bowls) and wrote “Genesis of a Music” (1949). La Monte Young (Well-Tuned Piano), Ben Johnston (string quartets nos. 4-9 in extended JI), Wendy Carlos (alpha, beta, gamma scales — non-octave-based tunings on “Beauty in the Beast”, 1986), and contemporary composer Sevish all work in this space.
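How well an equal division approximates a just interval can be measured by rounding the interval to the nearest step. The sketch below compares a few of the divisions listed above; `edo_error_cents` is an illustrative helper name.

```python
import math


def edo_error_cents(divisions: int, ratio: float) -> float:
    """Signed error (in cents) of the nearest step of an n-TET to a just ratio."""
    target = 1200 * math.log2(ratio)
    step = 1200 / divisions
    nearest = round(target / step) * step
    return nearest - target  # positive means the tempered interval is sharp


for n in (12, 19, 24, 31, 41, 53):
    fifth = edo_error_cents(n, 3 / 2)
    third = edo_error_cents(n, 5 / 4)
    print(f"{n}-TET: P5 {fifth:+.2f} cents, M3 {third:+.2f} cents")
```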
6. Chords and harmony
A chord is three or more pitches sounding simultaneously. Western harmony is built primarily by stacking thirds (tertian harmony), though jazz and 20th-century music also use quartal (stacked fourths, e.g. McCoy Tyner) and secundal (clusters) harmony.
Triads — three notes, root + 3rd + 5th:
- Major — M3 + m3 above root. C major = C E G.
- Minor — m3 + M3. C minor = C E♭ G.
- Augmented — M3 + M3. C aug = C E G♯.
- Diminished — m3 + m3. C dim = C E♭ G♭.
7th chords add a 7th above the root:
- Major 7th (maj7, Δ7) — major triad + M7. Cmaj7 = C E G B.
- Dominant 7th (7) — major triad + m7. C7 = C E G B♭. Built on the 5th degree of a major scale.
- Minor 7th (m7) — minor triad + m7. Cm7 = C E♭ G B♭.
- Minor-major 7th (mMaj7, mΔ7) — minor triad + M7. CmMaj7 = C E♭ G B. Used in James Bond theme.
- Half-diminished (m7♭5, ø7) — diminished triad + m7. Cm7♭5 = C E♭ G♭ B♭. The ii of a minor ii-V-i.
- Fully diminished 7th (°7) — diminished triad + diminished 7th (enharmonic M6). C°7 = C E♭ G♭ B♭♭ (= A).
Extensions above the 7th — 9th, 11th, 13th, treated as colors on top of a 7th-chord foundation:
- 9: a M9 above the root (C9 = C E G B♭ D); ♭9 (C7♭9 = C E G B♭ D♭) is darker and common over V going to minor.
- 11: typically ♯11 on major or dominant chords (Cmaj7♯11 = C E G B F♯), or natural 11 on minor chords (Cm11 = C E♭ G B♭ D F).
- 13: a M13 (an octave plus a M6) above the root (C13 = C E G B♭ D F A, with the 11th usually omitted in practice); ♭13 on altered dominants.
Sus and add chords:
- Sus2 — 2nd in place of the 3rd. Csus2 = C D G.
- Sus4 — 4th in place of the 3rd. Csus4 = C F G.
- Add9 / add2 — adds the 9th/2nd without the 7th. Cadd9 = C E G D.
- 6 chords — Cmaj6 = C E G A (Tin-Pan-Alley favorite, replaces or accompanies maj7).
Inversions — when a non-root chord tone is in the bass:
- Root position — root in bass.
- 1st inversion — 3rd in bass (figured-bass: 6 or 6/3).
- 2nd inversion — 5th in bass (6/4).
- 3rd inversion — 7th in bass (4/2 for a 7th chord).
Slash notation (C/E, Dm7/G) makes inversions and bass substitutions explicit in modern lead sheets.
Roman numeral analysis — labels each chord by the scale degree of its root, written as a Roman numeral (uppercase = major, lowercase = minor, ° = diminished, + = augmented). In C major: I (C) — ii (Dm) — iii (Em) — IV (F) — V (G) — vi (Am) — vii° (B°). In A minor: i (Am) — ii° (B°) — III (C) — iv (Dm) — v (Em, or V = E in harmonic minor) — VI (F) — VII (G, or vii° = G♯° in harmonic minor).
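The numeral qualities fall out of stacking thirds within the scale. A small sketch, assuming scales given as seven pitch-class offsets from the tonic; `diatonic_numerals` is an illustrative name.

```python
NUMERALS = ["I", "II", "III", "IV", "V", "VI", "VII"]
MAJOR = [0, 2, 4, 5, 7, 9, 11]
NATURAL_MINOR = [0, 2, 3, 5, 7, 8, 10]


def diatonic_numerals(scale: list[int]) -> list[str]:
    """Roman numerals for the triads built on each degree of a 7-note scale."""
    labels = []
    for i in range(7):
        # Stack two diatonic thirds: degrees i, i+2, i+4
        root, third, fifth = (scale[(i + k) % 7] for k in (0, 2, 4))
        quality = ((third - root) % 12, (fifth - root) % 12)
        if quality == (4, 7):
            labels.append(NUMERALS[i])                # major
        elif quality == (3, 7):
            labels.append(NUMERALS[i].lower())        # minor
        elif quality == (3, 6):
            labels.append(NUMERALS[i].lower() + "°")  # diminished
        elif quality == (4, 8):
            labels.append(NUMERALS[i] + "+")          # augmented
        else:
            labels.append("?")                        # not a tertian triad
    return labels


print(diatonic_numerals(MAJOR))
print(diatonic_numerals(NATURAL_MINOR))
```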
7. Functional harmony
Tonal music — roughly 1600 through the early 20th century in the Western art tradition, and most popular music today — organizes chords by function relative to a tonic key:
- Tonic (T) — I/i, and substitutes vi/VI (and sometimes iii/III). The home, point of rest.
- Subdominant / pre-dominant (S/PD) — IV/iv, ii/ii° (often as ii7 or ii7♭5). Departure from tonic, preparation for dominant.
- Dominant (D) — V or V7, vii°/vii°7 as a rootless V7 substitute. Tension that wants to resolve back to tonic.
The archetypal progression is T → PD → D → T: I — IV — V — I, or in jazz ii — V — I.
Cadences punctuate phrases:
- Authentic (perfect) cadence — V → I (or V7 → I), strongest closure. Authentic cadence with both chords in root position and tonic on top is “perfect authentic”; otherwise “imperfect”.
- Plagal cadence — IV → I, the “amen” cadence.
- Half cadence — any chord → V, leaves the phrase unresolved.
- Deceptive cadence — V → vi (or VI), thwarting the expected resolution. Used for surprise and to prolong.
Voice leading — moving multiple parts smoothly:
- Prefer stepwise motion in inner voices; leaps in the bass and (occasionally) soprano.
- Contrary motion between outer voices is preferred.
- Avoid parallel perfect 5ths and octaves (a Renaissance and common-practice rule — they erode voice independence).
- Resolve the leading tone (7th scale degree) up to the tonic, especially in outer voices.
- Resolve chordal 7ths down by step (the seventh of a V7 resolves down to the 3rd of I).
These are not absolute — popular music and rock routinely use parallel power chords; Debussy and Ravel embraced parallel 5ths and 9ths as a coloristic choice.
8. Modulation
Modulation is a change of key. Techniques:
- Pivot chord — a chord shared by both keys serves as the bridge. Modulating from C major to G major: Am is vi in C and ii in G; use Am as the pivot, then introduce F♯ (the new leading tone) to confirm G.
- Common-tone modulation — a single shared pitch links chords across distant keys (used by Schubert, Brahms).
- Chromatic modulation — a chord is altered chromatically to redirect the harmony; secondary dominants are a common path (V/V resolves to V, which is then treated as the new tonic).
- Direct (phrase) modulation — abrupt change at a phrase boundary, common in pop (“truck-driver modulation” up a half- or whole-step in the final chorus).
- Enharmonic modulation — reinterpreting a chord using enharmonic equivalence; the diminished 7th chord is symmetric and resolves to four different keys.
Closely related keys differ by at most one accidental in the key signature (relative minor/major; dominant; subdominant; relative of dominant; relative of subdominant). Distantly related keys (e.g. C major to F♯ major) require more elaborate or coloristic modulation.
9. Counterpoint
Counterpoint is the art of combining independent melodic lines. The canonical pedagogical text is Johann Joseph Fux’s “Gradus ad Parnassum” (1725), which systematizes Palestrina’s 16th-century style into five species:
- First species — note against note. One whole note in the counterpoint per whole note in the cantus firmus. Only consonances (P1, P5, P8, M3, m3, M6, m6) on every beat. P4 is dissonant in two-voice species counterpoint above the bass.
- Second species — two notes against one. Strong beats consonant; weak beats may be dissonant if approached and left by step (passing tone).
- Third species — four notes against one. More elaborate use of passing and neighbor tones.
- Fourth species — suspensions. The counterpoint is delayed across the bar, creating a dissonance on the strong beat that resolves down by step (suspension chain).
- Fifth species (florid) — all of the above combined freely, plus melismas and rhythmic variety.
Bach perfected counterpoint in the Baroque era: the “Well-Tempered Clavier” (1722, 1742) and “The Art of Fugue” (1740s) remain canonical study. Mozart studied Fux; Beethoven studied Fux and Bach; Brahms wrote rigorous counterpoint exercises; Schoenberg’s “Preliminary Exercises in Counterpoint” (posthumous, 1963) is a modern continuation.
Voice independence is the goal: each voice should have a memorable melodic shape and not duplicate the rhythm or contour of another. Dissonance is treated carefully — every dissonance must be prepared (entered from a consonance) and resolved (continued by step to a consonance), unless it is an allowed unaccented passing tone or neighbor tone.
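Two of the first-species rules (consonances only, no consecutive perfect 5ths or octaves) are mechanical enough to check in code. The sketch below is a deliberately partial checker under stated assumptions: both lines are MIDI note numbers of equal length, the counterpoint lies above the cantus firmus, and intervals are reduced modulo the octave. The function name and message strings are illustrative.

```python
# Interval classes (mod 12) allowed in two-voice first species:
# P1/P8, m3, M3, P5, m6, M6. The P4 (5 semitones) is excluded.
CONSONANT = {0, 3, 4, 7, 8, 9}
PERFECT = {0, 7}


def check_first_species(cantus: list[int], counterpoint: list[int]) -> list[tuple[int, str]]:
    """Return (index, problem) pairs for dissonances and consecutive perfect intervals."""
    intervals = [(upper - lower) % 12 for lower, upper in zip(cantus, counterpoint)]
    problems = []
    for i, interval in enumerate(intervals):
        if interval not in CONSONANT:
            problems.append((i, "dissonance"))
    for i in range(1, len(intervals)):
        same_perfect = intervals[i] in PERFECT and intervals[i] == intervals[i - 1]
        if same_perfect and cantus[i] != cantus[i - 1]:
            problems.append((i, "consecutive perfect interval"))
    return problems


cantus = [60, 62, 64, 62, 60]        # C D E D C
counterpoint = [67, 69, 72, 71, 72]  # G A C B C
print(check_first_species(cantus, counterpoint))
```

The example flags the move from G over C to A over D, which is a pair of parallel 5ths. Melodic shape, voice crossing, and cadence formulas are outside what this sketch checks.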
10. Jazz harmony
Jazz extends common-practice harmony with extended chords, chromatic substitutions, and modal soloing. Mark Levine’s “The Jazz Theory Book” (1995) and Jamey Aebersold’s play-along series codified modern pedagogy.
ii-V-I — the core cadential progression. In C major: Dm7 — G7 — Cmaj7. The ii7 prepares; the V7 (with the tritone between 3 and ♭7) creates tension that resolves to I. Players practice ii-V-I in all 12 keys, then apply it to standards.
Tritone substitution — substitute the V7 with a dominant 7th a tritone away (V7 → ♭II7). G7 and D♭7 share the same tritone (B and F in G7 are enharmonically C♭ and F in D♭7), so D♭7 → Cmaj7 is a valid substitution. In a ii-V-I the bass then moves chromatically D — D♭ — C, a smoother voice leading than the leap of a fifth.
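The shared tritone is easy to confirm with pitch-class arithmetic. In this sketch pitch classes are numbered from C = 0, and `dominant_seventh` and `guide_tones` are illustrative helper names.

```python
def dominant_seventh(root: int) -> list[int]:
    """Pitch classes of a dominant 7th chord: root, M3, P5, m7."""
    return [(root + i) % 12 for i in (0, 4, 7, 10)]


def guide_tones(root: int) -> set[int]:
    """The 3rd and 7th of a dominant 7th chord, which lie a tritone apart."""
    return {(root + 4) % 12, (root + 10) % 12}


G, D_FLAT = 7, 1
print(dominant_seventh(G), guide_tones(G))
print(dominant_seventh(D_FLAT), guide_tones(D_FLAT))
print("roots a tritone apart:", (G - D_FLAT) % 12 == 6)
print("same guide tones:", guide_tones(G) == guide_tones(D_FLAT))
```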
Modal interchange (borrowed chords) — borrow from the parallel minor (or another mode) into a major key. In C major: iv (Fm), ♭VI (A♭), ♭VII (B♭), iiø (Dm7♭5) all come from C minor. Iconic in Beatles songs (“Let It Be” uses ♭VII; “In My Life” leans on iv).
Secondary dominants — V of a chord other than I. V/V is D7 in C major, resolving to G7 (the V). V/ii is A7 → Dm7. Secondary dominants tonicize their target.
Altered + extended dominants — V7 chords with chromatically altered extensions: ♭9, ♯9, ♯11 (= ♭5 enharmonic), ♭13 (= ♯5). G7alt = G B D♭ F A♭ B♭ — uses the altered scale (super Locrian, seventh mode of melodic minor). Heard on Coltrane and post-bop solos.
Modes for soloing — match a mode to each chord:
- Dorian over m7 chords (Dm7 → D Dorian).
- Mixolydian over unaltered dom7 (G7 → G Mixolydian).
- Lydian over maj7♯11 (Cmaj7♯11 → C Lydian).
- Locrian over m7♭5 (Bm7♭5 → B Locrian).
- Altered scale over V7alt (G7alt → A♭ melodic minor, called G altered or G super-Locrian).
- Lydian dominant over V7♯11 (the fourth mode of melodic minor).
Coltrane changes — John Coltrane’s “Giant Steps” (1959) cycles tonal centers by major 3rds (B-G-E♭) through ii-V-I groupings in each, producing rapid modulation through three equally-spaced keys. The “Countdown” reharm of “Tune Up” applies the same device to a standard. Trane’s “Naima” (1959) and “Crescent” (1964) explore modal stasis as the complementary opposite.
Bebop scales — David Baker named and codified 8-note scales for bop lines, and Barry Harris taught closely related added-half-step rules: the bebop dominant (Mixolydian + ♮7 chromatic passing), bebop major (major + ♭6), bebop Dorian (Dorian + ♮3 chromatic passing). The added chromatic note places chord tones on strong beats when running eighth notes.
Real Book + iReal Pro — the “Real Book” (illegal photocopied compilations from Berklee in the 1970s, now licensed editions from Hal Leonard) is the standard jazz fake-book of lead sheets. iReal Pro (Massimo Biolcati, 2008+) is a mobile + desktop app for chord-chart playback with looping and transposition — used universally in practice. Aebersold’s play-along volumes (1967+) pair printed lead sheets with rhythm-section recordings.
11. Rhythm and meter
Rhythm is the temporal organization of music. The beat is the perceived pulse; tempo is the beat rate in beats per minute (BPM).
Simple meters subdivide the beat into 2: 2/4, 3/4, 4/4, with the quarter note as the beat. Compound meters subdivide into 3: 6/8 (two dotted-quarter beats, each = 3 eighths), 9/8, 12/8. Odd meters mix groupings: 5/4 (3+2 or 2+3; “Take Five”, composed by Paul Desmond for the Dave Brubeck Quartet, 1959, groups it 3+2), 7/8 (often 2+2+3 or 3+2+2 in Balkan dance traditions and Frank Zappa), 11/8, 13/8. Progressive rock (Yes, Genesis, King Crimson) and contemporary jazz (Hiromi, Snarky Puppy, Tigran Hamasyan) use odd meters extensively.
Polyrhythm — two or more rhythms with different subdivisions playing simultaneously (3 against 2, 4 against 3, the African 3:2 polyrhythm fundamental to Cuban son and Latin music). Polymeter — different meters running concurrently with displaced bar lines (Stravinsky’s “Rite of Spring” passages, Steve Reich’s phasing pieces).
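A polyrhythm can be laid out on a common grid whose length is the least common multiple of the two pulse counts. A minimal sketch, with `polyrhythm_grid` as an illustrative name and `x` marking an onset:

```python
from math import gcd


def polyrhythm_grid(a: int, b: int) -> tuple[str, str]:
    """Two rows showing a evenly spaced and b evenly spaced onsets over one cycle."""
    length = a * b // gcd(a, b)  # least common multiple
    row_a = "".join("x" if i % (length // a) == 0 else "." for i in range(length))
    row_b = "".join("x" if i % (length // b) == 0 else "." for i in range(length))
    return row_a, row_b


for pair in [(3, 2), (4, 3)]:
    top, bottom = polyrhythm_grid(*pair)
    print(f"{pair[0]}:{pair[1]}")
    print(" ", top)
    print(" ", bottom)
```

The two rows coincide only on the first grid slot of each cycle, which is what gives the pattern its interlocking feel.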
Syncopation — accents on weak beats or off-beats. Pervasive in jazz, funk, hip-hop, reggae. Swing — long-short eighth-note feel (rough triplet ratio ~2:1, slightly less swung at faster tempos). Shuffle — heavy swung 12/8 feel in blues. Clave — the Cuban son clave (2-3 or 3-2 forms) is a five-stroke rhythmic key that organizes Afro-Cuban music: every other instrument aligns to it.
Subdivisions — quarters, 8ths, 16ths, triplets, 16th triplets, quintuplets, septuplets. Drummers practice rudiments (paradiddles, flams, drags, ratamacues) for limb independence.
Backbeat — accents on beats 2 and 4 in 4/4, played by snare drum, fundamental to rock, soul, R&B, and most popular music since the 1950s. Compare to the on-beat march or polka feel.
12. Form and structure
Larger-scale organization. Common forms:
- Strophic — A A A, multiple verses to the same music. Folk songs, hymns.
- Binary — A B, two contrasting sections; common in Baroque dance suites (gavotte, sarabande, gigue).
- Ternary — A B A, da capo arias and minuets-with-trios.
- AABA (32-bar) — eight-bar A section, repeat, contrasting B (bridge), return to A. The dominant form of Tin Pan Alley and early jazz standards: “Body and Soul”, “Take the A Train”, “I Got Rhythm”.
- Verse-chorus — modern pop default; often expanded to verse-pre-chorus-chorus-verse-pre-chorus-chorus-bridge-chorus (sometimes with a final modulating chorus). Used by Max Martin, Dr. Luke, and most contemporary hit production.
- Sonata form — exposition (two themes, second in dominant or relative major), development (modulation, motivic transformation), recapitulation (both themes in tonic). Anchors of Classical-era symphonies and sonatas (Haydn, Mozart, Beethoven).
- Rondo — A B A C A (D A), recurring refrain alternating with episodes. Mozart and Beethoven finales.
- Theme and variations — a theme followed by transformations preserving its structure but altering melody, harmony, rhythm, or texture. Bach’s Goldberg Variations, Beethoven’s Diabelli Variations, Brahms’s Variations on a Theme by Haydn.
- Through-composed — no large-scale repetition; each section new (much of Schubert’s Lieder, Wagner’s continuous music dramas).
- 12-bar blues — I (4 bars) — IV (2) — I (2) — V (1) — IV (1) — I (1) — V or turnaround (1). With dominant-7 quality on every chord. Foundation of blues, rock-and-roll, and much jazz; extended variants are 8-bar (slow blues), 16-bar (rhumba), and 24-bar.
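The 12-bar layout above can be spelled out in any key. The sketch below uses flat-based note names, takes the last bar as V (the plain turnaround option), and puts a dominant 7th on every chord; `twelve_bar_blues` is an illustrative name.

```python
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def twelve_bar_blues(key_pc: int) -> list[str]:
    """One chord symbol per bar for a basic 12-bar blues in the given key."""
    one, four, five = (NOTE_NAMES[(key_pc + k) % 12] + "7" for k in (0, 5, 7))
    return [one] * 4 + [four] * 2 + [one] * 2 + [five, four, one, five]


bars = twelve_bar_blues(9)  # blues in A
for start in range(0, 12, 4):
    print(" | ".join(bars[start:start + 4]))
```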
13. Texture
The number and relationship of simultaneous musical lines:
- Monophonic — single melodic line, no accompaniment (Gregorian chant, solo flute).
- Homophonic — one melody with accompaniment harmony (typical hymn, pop song with chords).
- Polyphonic / contrapuntal — multiple independent melodic lines of comparable importance (Bach fugue, Renaissance madrigal).
- Heterophonic — multiple voices simultaneously playing variants of the same melody (Javanese gamelan, traditional Greek and Middle Eastern ensembles).
Arrangement and orchestration apply textural choices to specific instruments (voicings, doublings, register choices), shaping the timbral and dynamic profile of the piece. Rimsky-Korsakov’s “Principles of Orchestration” (written 1873-1908, published posthumously in 1913), Walter Piston’s “Orchestration” (1955), Samuel Adler’s “The Study of Orchestration” (4th ed., 2016), and Alexander Publishing’s “Professional Orchestration” series are canonical references.
14. Tuning systems
Choices about how to divide an octave have shaped harmony as much as scales themselves.
- Pythagorean tuning — derived purely from the 3:2 perfect 5th. Stacking 12 fifths overshoots seven octaves by the Pythagorean comma (≈ 23.46 cents). Pure 5ths, but the major 3rd is the harsh 81:64 (≈ 408 cents). The standard tuning of medieval European theory, giving way to meantone during the Renaissance.
- Just intonation (JI) — small-whole-number ratios for each interval. Beautiful for static harmony in a single key but modulation introduces commas (shifts between two slightly different versions of the same pitch). Used by barbershop quartets, a cappella ensembles, and contemporary microtonal composers.
- Meantone temperament — Renaissance solution: tempered 5ths (narrower than pure by ~5.4 cents in quarter-comma meantone) to produce pure major 3rds. Excellent in nearby keys but unusable in distant keys (“wolf 5th”).
- Well temperament — late-17th to 18th century: irregular tunings (Werckmeister III, 1691; Kirnberger III, 1779; Vallotti, 1779; Young, 1799) where each key has a slightly different color but all are playable. Bach’s “Well-Tempered Clavier” exploits this key-coloring.
- 12-tone equal temperament (12-TET) — every semitone equal at 2^(1/12). All keys identical in character; modulation and chromaticism are unrestricted. Mathematically described by Chinese theorist Zhu Zaiyu (1584) and Simon Stevin (c. 1585); accepted gradually over the 18th and 19th centuries, dominant by the 20th. Distributes the Pythagorean comma equally across all 12 fifths.
- Microtonal tunings — 19-TET (good meantone), 24-TET (quarter tones), 31-TET (extended meantone, Huygens-Fokker), 41-TET, 53-TET (near-JI). Wendy Carlos developed alpha, beta, gamma scales — non-octave divisions optimized for harmonic consonance (“Beauty in the Beast”, 1986).
- Stretched tuning — piano tuners stretch the octaves slightly (a few cents in the extreme registers) to compensate for the inharmonicity of stiff piano strings, where partials are slightly sharp of integer multiples of the fundamental. Theoretical work by Fletcher and Railsback in the 1930s-40s.
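The comma figures quoted in this list can be reproduced with exact fractions. This is a small worked calculation, not a tuning library; the variable names are illustrative.

```python
import math
from fractions import Fraction


def cents(ratio) -> float:
    """Size of a frequency ratio in cents."""
    return 1200 * math.log2(float(ratio))


# Twelve pure fifths versus seven octaves
pythagorean_comma = Fraction(3, 2) ** 12 / Fraction(2) ** 7
# Four pure fifths (81:64 after octave reduction) versus a pure major 3rd (5:4)
syntonic_comma = Fraction(81, 64) / Fraction(5, 4)

print("Pythagorean comma:", pythagorean_comma, f"= {cents(pythagorean_comma):.2f} cents")
print("Syntonic comma:", syntonic_comma, f"= {cents(syntonic_comma):.2f} cents")
print(f"Pythagorean M3: {cents(Fraction(81, 64)):.2f} cents")
print(f"12-TET narrows each fifth by {cents(pythagorean_comma) / 12:.2f} cents")
print(f"Quarter-comma meantone narrows each fifth by {cents(syntonic_comma) / 4:.2f} cents")
```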
15. Non-Western music systems
A small sample of the global music universe:
- Indian classical (Hindustani and Carnatic) — built on raga (melodic mode with characteristic phrases and ornaments, far more than just a scale; over 200 in active use) and tala (rhythmic cycle, e.g. tintal = 16 beats in 4-4-4-4, jhaptal = 10 beats in 2-3-2-3, rupak = 7 beats in 3-2-2). Microtonal śruti divides the octave conceptually into 22 unequal steps. Gharana is a lineage or school of performance practice. Instruments include sitar (Ravi Shankar), sarod (Ali Akbar Khan, Amjad Ali Khan), tabla (Zakir Hussain, Anindo Chatterjee), bansuri flute (Hariprasad Chaurasia), violin in Carnatic style (L. Subramaniam), mridangam.
- Arabic maqam — modal system with quarter-tone intervals (notated with half-flat and half-sharp accidentals or relegated to performance practice). Common maqamat: Rast, Bayati, Saba, Hijaz, Nahawand, Kurd. Improvised taqsim explores a maqam before the composed piece begins. Found across the Arab world, Turkey, Iran (where it is called dastgah with related but distinct theory).
- Indonesian gamelan — bronze percussion orchestra from Java and Bali. Two main tuning systems: slendro (roughly five equidistant notes per octave, but inexactly equal) and pelog (seven-note scale from which five-note subsets are drawn, organized in Javanese practice into pathet nem, pathet lima, and pathet barang; pathet manyura and pathet sanga belong to slendro). Gamelan tunings are not standardized: each ensemble has its own slightly different intonation, a feature rather than a defect.
- Chinese pentatonic — five-note scales (gong = 1 2 3 5 6, shang = 1 2 4 5 ♭7, jue, zhi, yu) corresponding to modal rotations of the pentatonic.
- Japanese in scale — 1 ♭2 4 5 ♭6, a hemitonic pentatonic. Yo and hirajōshi scales are also characteristic. Foundation of koto, shakuhachi, and shamisen repertoire.
- Sub-Saharan African polyrhythms — overlapping pulses (often referenced as 6-against-4 or 12-against-8 in cross-rhythm) and call-and-response forms underpin many traditions: Ewe drumming (Ghana), djembe ensembles (Mali, Guinea), mbira (Zimbabwe), Pygmy polyphony (Central Africa).
- Tuvan + Mongolian throat singing (khoomei) — production of two or more pitches simultaneously by isolating overtones of a sung fundamental. Genres include sygyt, kargyraa, and borbangnadyr.
These traditions resist mapping onto 12-TET notation. Software like Scala (Manuel Op de Coul) comes with an archive of several thousand tunings, and instruments that load Scala’s SCL/KBM files (Surge XT, Pianoteq), championed by microtonal composers such as Sevish (Sean Archibald), allow DAW work in arbitrary tunings.
16. Computational and AI music
Music information retrieval (MIR) and AI generation have transformed both analysis and production.
MIR (Music Information Retrieval) — algorithms for analyzing recorded audio:
- Chroma features — 12-bin pitch-class profile, robust to timbre, used for chord and key recognition.
- Beat tracking — Ellis’s dynamic-programming beat tracker (2007), which works from an onset-strength envelope; madmom RNN-based trackers; standard in DJ software (Serato, Rekordbox, Traktor) for sync.
- Key detection — Krumhansl-Schmuckler key-finding profile (1990); refined by Temperley.
- Melody extraction — predominant pitch tracking (Salamon and Gomez, 2012; CREPE, Kim et al. 2018).
- Audio fingerprinting — Shazam’s algorithm (Wang, 2003) uses spectrogram-peak constellation hashing for sub-second recognition; open-source alternatives include Echoprint (Echo Nest, 2011) and AcoustID/Chromaprint.
- Libraries — Librosa (McFee et al., Python, widely used in academia), Essentia (UPF Music Technology Group), madmom (RNN/CNN tooling).
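Key detection from a chroma vector can be sketched in plain Python by correlating the 12-bin profile against rotated key profiles. The profile numbers below are the Krumhansl-Kessler probe-tone values as commonly quoted; treat them, and the helper names, as assumptions to verify against the original publication. Production code would usually take the chroma vector from a library such as Librosa rather than a hand-written list.

```python
from math import sqrt

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
# Commonly quoted Krumhansl-Kessler profiles (index 0 = tonic); an assumption here
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def estimate_key(chroma: list[float]) -> tuple[str, str]:
    """Best-correlating (tonic, mode) for a 12-bin pitch-class profile (bin 0 = C)."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            # Rotate the profile so that its tonic weight lands on this tonic
            rotated = [profile[(pc - tonic) % 12] for pc in range(12)]
            score = pearson(chroma, rotated)
            if best is None or score > best[0]:
                best = (score, NOTE_NAMES[tonic], mode)
    return best[1], best[2]


c_major_scale = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(estimate_key(c_major_scale))
```

Real recordings give noisier chroma vectors, so the top two or three candidates are often worth inspecting rather than only the winner.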
MIDI 2.0 — finalized by MMA + AMEI (Association of Musical Electronics Industry) in 2020. Backwards-compatible with MIDI 1.0 (1983). Features: bidirectional, 32-bit resolution per parameter (vs MIDI 1.0’s 7-bit), per-note articulation and pitch (replacing MPE-only workflows), property exchange and capability inquiry (MIDI-CI) for plug-and-play hardware identification, jitter-reduced timestamps, profile configuration.
MusicXML — open exchange format for music notation, originally Recordare/MakeMusic (Michael Good, 2004), now W3C Music Notation Community Group, version 4.0 (2021). Supported by Finale, Sibelius, Dorico, MuseScore, virtually all notation software.
Generative AI music (audio domain, late 2022 onward):
- MusicLM (Google, Agostinelli et al., January 2023) — hierarchical text-to-music using SoundStream + MuLan + AudioLM components. Demos public, not productized.
- MusicGen (Meta AI, Copet et al., June 2023) — single-stage transformer over EnCodec tokens; open-weights, runs locally; small/medium/large/melody-conditioned variants.
- Stable Audio (Stability AI, September 2023; Stable Audio 2.0 April 2024) — latent diffusion for variable-length music up to 3 minutes; commercial Stable Audio plan plus open-weights Stable Audio Open (June 2024).
- AudioCraft (Meta, August 2023) — umbrella project containing MusicGen + AudioGen + EnCodec.
- Suno (Suno AI, mid-2023 onward; v3 in 2024, v3.5 + v4 in late 2024) — consumer text-to-song with vocals; widely used commercially. RIAA filed copyright suits in 2024.
- Udio (Uncharted Labs, April 2024) — similar consumer text-to-song service founded by ex-Google DeepMind researchers; co-defendant in the 2024 RIAA suit.
- Riffusion (Forsgren and Martiros, December 2022) — fine-tuned Stable Diffusion on spectrogram images, generating audio via the inverse short-time Fourier transform.
- MIDI-level generation — MuseNet (OpenAI, 2019, deprecated 2022); Anticipatory Music Transformer (Stanford, Thickstun et al., 2023), which supports controllable infilling; MMM (Multi-Track Music Machine, Ens and Pasquier, 2020); Magenta’s Music Transformer (Huang et al., 2018) and Performance RNN. YuE (open-weights lyrics-to-song model, early 2025) works in the audio domain rather than on MIDI.
Consumer-facing AI music tools:
- AIVA (Pierre Barreau, 2016+) — symbolic and audio composition for soundtrack work; SACEM-registered.
- Soundraw (Tokyo, 2020+) — generates customizable royalty-free background music.
- Boomy (Boomy Corporation, 2018+) — consumer track creation, distribution to streaming services.
- Mubert (Alexey Kochetkov et al., 2016+) — generative streaming music for content creators.
- Endel (Berlin, 2018+) — personalized adaptive soundscapes (focus, sleep, relaxation) responsive to time-of-day, heart rate, weather.
Optical Music Recognition (OMR) — automated scanning of printed/handwritten scores:
- Audiveris — open-source OMR (Hervé Bitteur, since 2004), Java-based, exports MusicXML.
- PhotoScore + NotateMe (Neuratron) — commercial OMR with mobile camera support.
- ScanScore (Lugert Verlag) — consumer OMR with DAW integration.
- Deep-learning OMR research (Calvo-Zaragoza et al., 2018-2024) is steadily improving accuracy on handwritten music.
17. Modern theory frontiers
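Neo-Riemannian transformations — David Lewin’s “Generalized Musical Intervals and Transformations” (1987) and subsequent work by Brian Hyer and Richard Cohn (“Audacious Euphony”, 2012) treat the 24 major and minor triads as nodes in a network, connected by three operations that each move a single voice by a semitone or whole tone: P (parallel, C major ↔ C minor), R (relative, C major ↔ A minor), and L (leading-tone exchange, C major ↔ E minor). Together the operations generate a group acting on the triads. Alternating P and L yields hexatonic cycles of six triads; alternating P and R yields octatonic cycles of eight. These cycles illuminate late-Romantic chromatic harmony (Wagner, Liszt, late Beethoven). The Tonnetz diagram visualizes the resulting harmonic space.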
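The P, L, and R operations can be sketched in a few lines. The example below is an illustration under stated assumptions, not a reference implementation: triads are encoded as `(root pitch class, quality)` pairs, note names use flats throughout, and the helper names `cycle` and `name` are hypothetical.

```python
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]


def P(triad):
    """Parallel: same root, opposite quality (C major <-> C minor)."""
    root, quality = triad
    return (root, "min" if quality == "maj" else "maj")


def R(triad):
    """Relative: C major <-> A minor."""
    root, quality = triad
    if quality == "maj":
        return ((root + 9) % 12, "min")
    return ((root + 3) % 12, "maj")


def L(triad):
    """Leading-tone exchange: C major <-> E minor."""
    root, quality = triad
    if quality == "maj":
        return ((root + 4) % 12, "min")
    return ((root + 8) % 12, "maj")


def cycle(start, ops, limit=48):
    """Apply ops in rotation until the start triad recurs; return the triads visited."""
    visited = [start]
    current = start
    for i in range(limit):
        current = ops[i % len(ops)](current)
        if current == start:
            return visited
        visited.append(current)
    raise ValueError("no cycle found within limit")


def name(triad):
    """Render a triad as a chord symbol, e.g. (0, 'min') -> 'Cm'."""
    root, quality = triad
    return NOTE_NAMES[root] + ("" if quality == "maj" else "m")


c_major = (0, "maj")
hexatonic = [name(t) for t in cycle(c_major, [P, L])]
octatonic = [name(t) for t in cycle(c_major, [P, R])]
print(hexatonic)
print(octatonic)
```

Each operation is its own inverse, which is why every edge of the Tonnetz can be traversed in both directions.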
Pitch-class set theory — Allen Forte’s “The Structure of Atonal Music” (1973) defined pitch-class sets (collections of pitches reduced modulo 12, so that octave and enharmonic spelling are ignored), with normal-form representations, interval-class vectors, and set-class equivalence under transposition and inversion. It is foundational for analyzing Schoenberg, Webern, Berg, and the broader atonal-to-serial repertoire.
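Two of these ideas, the interval-class vector and equivalence under transposition and inversion, are easy to compute directly. The following is a minimal sketch: the function names are hypothetical, and equivalence is tested by brute force over all 24 transposed and inverted forms rather than by deriving Forte's prime form, whose tie-breaking rules are not reproduced here.

```python
from itertools import combinations


def normalize(pitches):
    """Reduce arbitrary pitch numbers to a set of pitch classes (mod 12)."""
    return frozenset(p % 12 for p in pitches)


def transpose(pcs, n):
    """Transpose every pitch class up by n semitones."""
    return frozenset((p + n) % 12 for p in pcs)


def invert(pcs):
    """Invert every pitch class around 0."""
    return frozenset((-p) % 12 for p in pcs)


def interval_class_vector(pitches):
    """Count interval classes 1..6 over all unordered pairs of pitch classes."""
    vector = [0] * 6
    for a, b in combinations(sorted(normalize(pitches)), 2):
        distance = (b - a) % 12
        interval_class = min(distance, 12 - distance)
        vector[interval_class - 1] += 1
    return vector


def set_class(pitches):
    """All forms of the set under the 12 transpositions and 12 inversions."""
    base = normalize(pitches)
    inverted = invert(base)
    forms = set()
    for n in range(12):
        forms.add(transpose(base, n))
        forms.add(transpose(inverted, n))
    return forms


def equivalent(a, b):
    """True if b is a transposition or inversion of a."""
    return normalize(b) in set_class(a)


print(interval_class_vector([0, 4, 7]))   # major triad
print(equivalent([0, 4, 7], [0, 3, 7]))   # major vs minor triad
```

The two all-interval tetrachords {0, 1, 4, 6} and {0, 1, 3, 7} share an interval-class vector without being equivalent under transposition or inversion, which is the situation Forte called the Z-relation.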
Schenkerian analysis — Heinrich Schenker (1868-1935, Austrian theorist) developed a hierarchical reduction technique that derives a tonal piece from a fundamental structure (Ursatz): a stepwise descent to the tonic in the upper voice over a I-V-I bass arpeggiation. The middleground and foreground levels elaborate this background. Influential but controversial: its methods have been criticized as ideologically prescriptive. The tradition was continued by Felix Salzer, Carl Schachter, and Allen Cadwallader.
Spectral music — Gérard Grisey, Tristan Murail, Hugues Dufourt, and other composers around the Ensemble L’Itinéraire (Paris, mid-1970s onward) derive harmonic content from the analysis of acoustic spectra. Grisey’s “Partiels” (1975) builds its opening from the spectrum of a low E played on the trombone. Spectral analysis (FFTs, sinusoidal modeling) and resynthesis blur the line between composition, DSP, and electroacoustic practice. Kaija Saariaho, Magnus Lindberg, and Fausto Romitelli are major successors.
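A rough sense of why spectral harmony departs from 12-TET comes from listing the harmonic partials of a low E and measuring how far each falls from the nearest equal-tempered note. This sketch makes several assumptions: the low E is taken to be E2 (MIDI note 40), tuning is A4 = 440 Hz, octave numbering puts middle C at C4, and the spectrum is idealized as exact integer multiples of the fundamental, which a real trombone only approximates.

```python
import math

A4_HZ = 440.0
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_hz(midi_note):
    """Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return A4_HZ * 2 ** ((midi_note - 69) / 12)


def nearest_note(freq_hz):
    """Return (note name, deviation in cents) for the closest 12-TET pitch."""
    midi_float = 69 + 12 * math.log2(freq_hz / A4_HZ)
    nearest = round(midi_float)
    cents = 100 * (midi_float - nearest)
    octave = nearest // 12 - 1  # assumed convention: MIDI 60 is C4
    return NOTE_NAMES[nearest % 12] + str(octave), cents


def harmonic_partials(fundamental_hz, count):
    """Idealized harmonic series: integer multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]


e2 = midi_to_hz(40)  # assumed pitch of the low E
for number, freq in enumerate(harmonic_partials(e2, 12), start=1):
    note, cents = nearest_note(freq)
    print(f"partial {number:2d}: {freq:8.2f} Hz  {note:4s} {cents:+6.1f} cents")
```

Partials such as the 7th and 11th land far enough from any 12-TET pitch that composers working this way typically notate them with quarter-tone or finer accidentals.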
Algorithmic + generative composition — Lejaren Hiller and Leonard Isaacson’s “Illiac Suite” (1957) is generally regarded as the first computer-composed score. Iannis Xenakis (1922-2001) wrote stochastic music using probability distributions (“Metastaseis”, “Pithoprakta”), then formalized algorithmic processes in “Formalized Music” (1971, revised 1992). David Cope’s Experiments in Musical Intelligence (EMI, 1981+) used pattern recognition and recombination to compose in the style of Bach, Mozart, and Chopin. Modern descendants run from L-systems (Prusinkiewicz) to grammar-based composition (Bod, Steedman) to the deep-learning generative models in section 16.
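As one small illustration of rule-based generation, the sketch below expands Lindenmayer's classic two-symbol rewriting system and turns the result into a pitch sequence. The mapping from symbols to melodic steps (`STEPS`) and the starting pitch are arbitrary assumptions chosen for the example; they are not taken from Prusinkiewicz or any published piece.

```python
# Rewriting rules of Lindenmayer's original "algae" system.
RULES = {"A": "AB", "B": "A"}

# Hypothetical mapping from symbols to melodic steps in semitones.
STEPS = {"A": 2, "B": -1}


def expand(axiom, rules, generations):
    """Rewrite every symbol in parallel, once per generation."""
    current = axiom
    for _ in range(generations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def to_melody(symbols, start_pitch=60):
    """Walk from start_pitch (a MIDI note number), one step per symbol."""
    pitches = [start_pitch]
    for symbol in symbols:
        pitches.append(pitches[-1] + STEPS[symbol])
    return pitches


word = expand("A", RULES, 3)
melody = to_melody(word)
print(word)
print(melody)
```

Because the string length grows as the Fibonacci sequence, a handful of generations already yields self-similar material long enough for a phrase; how to map symbols to pitch, rhythm, or timbre is a compositional decision.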
18. Tools
Notation:
- MuseScore (Werner Schweer + community, since 2002; MuseScore 4.0 December 2022, 4.4 2024) — free + open-source, mature, MuseScore.com community sharing, MuseHub for soundbanks.
- Dorico (Steinberg, since 2016; developed by a team of former Sibelius staff including Daniel Spreadbury; Dorico 5 2023) — premium, modern engraving engine, strong condensing and divisi.
- Sibelius (originally Sibelius Software 1993, now Avid; Sibelius Ultimate) — long-standing pro notation.
- Finale (originally Coda Music Software, 1988; later MakeMusic) — long-time industry standard; MakeMusic announced end-of-life for Finale in August 2024, offering Dorico crossgrades.
DAWs:
- Logic Pro (Apple, originally Emagic 1992, acquired by Apple 2002; Logic Pro 11 May 2024) — Mac-only.
- Ableton Live (Ableton, Berlin, since 2001; Live 12 March 2024) — clip-based + linear; standard for electronic music production and live performance.
- FL Studio (Image-Line, since 1997; lifetime free updates; FL Studio 21 December 2022) — pattern-based, popular in hip-hop and pop production.
- Pro Tools (Avid; Pro Tools 2024) — long-standing pro audio + film post standard.
- Cubase (Steinberg, since 1989; Cubase 13 2023) — MIDI heritage, mature workflow.
- Studio One (PreSonus, since 2009; v6 2022) — drag-and-drop ergonomics.
- Reaper (Cockos, Justin Frankel — Winamp creator; since 2005) — extremely affordable, scriptable in Lua + Python, devoted following.
- GarageBand (Apple, since 2004) — entry-level Logic.
- Bitwig Studio (Bitwig, Berlin, since 2014; founded by ex-Ableton developers) — modular, strong device design.
Theory pedagogy:
- Teoria (José Rodríguez Alvira, since 1998) — free interval, chord, scale ear-training exercises.
- MusicTheory.net (Ricci Adams) — free lessons + trainers.
- Hooktheory (Hooktheory.com) — theory + songwriting via the TheoryTab database of pop-song chord progressions.
- Soundslice (Adrian Holovaty + PJ Macklin) — interactive notation + tab tied to video/audio playback.
- Tonebase — premium video lessons (piano, guitar, voice, composition).
- Pianote / Drumeo / Guitareo (Musora) — subscription instrument lessons.
- JustinGuitar (Justin Sandercoe, since 2003) — free comprehensive guitar method.
- EarMaster (Denmark, since 1996) — desktop + iPad ear-training software.
19. Cross-references
- [[Music/_index]] — Music library overview.
- [[Engineering/Tier3/acoustics-noise-control]] — room acoustics, RT60, psychoacoustics (Fletcher-Munson, equal-loudness contours), Sabine/Eyring equations.
- [[Engineering/signal-processing-dsp]] — FFT, filters, sample-rate conversion, oversampling, dither.
- [[Math/fft-spectral]] — spectral analysis for pitch detection, formant tracking, spectral synthesis.
- [[Math/information-theory]] — Shannon entropy + audio compression (MP3, AAC, Opus, FLAC).
- [[Compute/transformer-architecture]] — generative music models (MusicGen, MusicLM, Suno, Udio, Stable Audio, YuE).
- [[Compute/rag-embeddings-vector-search]] — audio embeddings (CLAP, Wav2Vec2, MERT) for similarity search and recommendation.
20. Citations
- Aldwell, Schachter, Cadwallader. “Harmony and Voice Leading”, 5th edition. Cengage, 2018.
- Kostka, Payne, Almén. “Tonal Harmony with an Introduction to Twentieth-Century Music”, 8th edition. McGraw-Hill, 2017.
- Levine, Mark. “The Jazz Theory Book”. Sher Music, 1995.
- Persichetti, Vincent. “Twentieth-Century Harmony: Creative Aspects and Practice”. W. W. Norton, 1961.
- Fux, Johann Joseph. “Gradus ad Parnassum”. Vienna, 1725 (translated as “The Study of Counterpoint”, Alfred Mann ed., Norton 1971).
- Forte, Allen. “The Structure of Atonal Music”. Yale University Press, 1973.
- Lewin, David. “Generalized Musical Intervals and Transformations”. Yale University Press, 1987.
- Schenker, Heinrich. “Der freie Satz” (Free Composition). Vienna, 1935 (translated by Ernst Oster, Longman 1979).
- Cohn, Richard. “Audacious Euphony: Chromaticism and the Triad’s Second Nature”. Oxford University Press, 2012.
- Sethares, William A. “Tuning, Timbre, Spectrum, Scale”, 2nd edition. Springer, 2005.
- Helmholtz, Hermann von. “On the Sensations of Tone as a Physiological Basis for the Theory of Music” (Die Lehre von den Tonempfindungen), 1863 (translated by Alexander J. Ellis, Longmans 1875).
- Partch, Harry. “Genesis of a Music”, 2nd edition. Da Capo, 1974 (original 1949).
- Rimsky-Korsakov, Nikolai. “Principles of Orchestration”, 1873-1908 (English ed. Edward Agate, 1922).
- Adler, Samuel. “The Study of Orchestration”, 4th edition. W. W. Norton, 2016.
- Messiaen, Olivier. “Technique de mon langage musical”. Leduc, 1944.
- MIDI Manufacturers Association + AMEI. “MIDI 2.0 Specifications”, 2020 (with subsequent revisions through 2024).
- W3C Music Notation Community Group. “MusicXML 4.0”, 2021.
- ISO 16:1975. “Acoustics — Standard tuning frequency (Standard musical pitch)”. International Organization for Standardization.
- Wang, Avery. “An Industrial-Strength Audio Search Algorithm”. ISMIR 2003 (Shazam).
- McFee et al. “librosa: Audio and Music Signal Analysis in Python”. SciPy 2015.
- Agostinelli, Andrea et al. “MusicLM: Generating Music From Text”. arXiv:2301.11325, January 2023.
- Copet, Jade et al. “Simple and Controllable Music Generation” (MusicGen). NeurIPS 2023, arXiv:2306.05284.
- Stability AI. “Stable Audio Open”. June 2024.
- Thickstun, John et al. “Anticipatory Music Transformer”. TMLR, 2023.