Semantics and Pragmatics
Semantics studies linguistic meaning — what expressions denote, the truth conditions of sentences, the entailment relations between them, and the systematic way in which complex meanings are composed from simpler ones. Pragmatics studies meaning in use — what speakers do with language, how context fills in what is left unsaid, how implicature and presupposition operate. The boundary between the two is contested; many phenomena (definiteness, focus, modality) straddle it.
Lexical Semantics
Lexical semantics asks what individual words mean and how word meanings relate. Core relations:
- Synonymy — same or near-same meaning (couch / sofa; begin / commence). True synonymy is rare; near-synonyms typically differ in register, dialect, connotation, or collocation.
- Antonymy — opposite meaning, subdivided into gradable antonyms (hot / cold — on a scale), complementary antonyms (alive / dead — categorical), converse antonyms (buy / sell, parent / child), and reversive antonyms (open / close, appear / disappear).
- Hyponymy / hyperonymy — IS-A relation (dog is a hyponym of animal; animal is the hyperonym).
- Meronymy — PART-OF relation (wheel is a meronym of car).
- Polysemy — multiple related senses of one word (the verb run for legs / engines / programs / colors is polysemous; by contrast, bank of a river and financial bank are unrelated in meaning and count as homonyms).
- Homonymy — distinct words sharing form (bat animal vs bat implement).
WordNet (George Miller, Princeton University 1990–present) is the canonical lexical database for English: nouns, verbs, adjectives, and adverbs organized into synsets (sets of cognitive synonyms) connected by hyponymy, meronymy, antonymy, and other relations. WordNet 3.1 contains ~117,000 synsets and ~155,000 word forms. International WordNets exist for ~60 languages; EuroWordNet and Global WordNet coordinate cross-lingual alignment. FrameNet (Charles Fillmore, Berkeley 1997–present) provides an alternative organization in terms of semantic frames — schemas for situations with characteristic participants (the COMMERCIAL_TRANSACTION frame has Buyer, Seller, Goods, Money; verbs buy, sell, pay, charge each foreground different participants).
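The relations above can be pictured as labeled links in a graph. The following is a minimal sketch with an invented toy lexicon; the dictionaries and function names are illustrative assumptions and are not WordNet's data or API.

```python
# Toy lexical network; the entries are illustrative, not taken from WordNet.
HYPERNYM = {            # word -> its immediate hyperonym (IS-A)
    "poodle": "dog",
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
}
MERONYM_OF = {          # part -> whole (PART-OF)
    "wheel": "car",
    "engine": "car",
}

def hypernym_chain(word):
    """Return all hyperonyms of `word`, nearest first."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain

def is_hyponym_of(word, candidate):
    """True if `word` IS-A `candidate`, following hyperonym links transitively."""
    return candidate in hypernym_chain(word)

def parts_of(whole):
    """Return the sorted list of meronyms recorded for `whole`."""
    return sorted(part for part, w in MERONYM_OF.items() if w == whole)

print(hypernym_chain("poodle"))         # ['dog', 'mammal', 'animal']
print(is_hyponym_of("dog", "animal"))   # True
print(parts_of("car"))                  # ['engine', 'wheel']
```

Hyponymy is treated as transitive here, which is why poodle counts as a hyponym of animal even though no direct link is stored.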
Compositional Semantics — Montague Grammar
Richard Montague (1930–1971) launched formal semantics with three papers — English as a Formal Language (1970), Universal Grammar (1970), and The Proper Treatment of Quantification in Ordinary English (PTQ, 1973). Montague proposed that there is no theoretically important distinction between formal and natural languages: both could be given a model-theoretic semantics in intensional logic with lambda calculus.
The architecture:
- Syntactic rules build expressions.
- A semantic rule pairs with each syntactic rule, mapping the structure to an expression of intensional logic.
- The intensional logic is interpreted model-theoretically against a domain of individuals, possible worlds, times, and functions.
Each lexical item denotes a function of the appropriate type. Type theory assigns the basic types: e for entities, t for truth values, and s for world-time indices (used to build intensional types). Functions have types ⟨a,b⟩: John is type e; walks is type ⟨e,t⟩ (a function from entities to truth values); every is type ⟨⟨e,t⟩,⟨⟨e,t⟩,t⟩⟩ (a determiner: a function from sets to generalized quantifiers, which are themselves functions from sets to truth values).
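To make the type bookkeeping concrete, here is a small sketch of type-driven application. Encoding a type as a string or a pair is an assumption made for illustration; the names apply_type, ET, GQ, and DET are hypothetical.

```python
# Types are encoded as "e", "t", or a pair (input_type, output_type).
E, T = "e", "t"
ET = (E, T)        # <e,t>: one-place predicate
GQ = (ET, T)       # <<e,t>,t>: generalized quantifier
DET = (ET, GQ)     # <<e,t>,<<e,t>,t>>: determiner

def apply_type(function_type, argument_type):
    """Return the result type of applying a function type to an argument type."""
    if isinstance(function_type, tuple) and function_type[0] == argument_type:
        return function_type[1]
    raise TypeError(f"cannot apply {function_type} to {argument_type}")

print(apply_type(ET, E))                      # t  (walks applied to John)
print(apply_type(apply_type(DET, ET), ET))    # t  (every applied to man, then to walks)
```

Applying an entity to a predicate, as in apply_type(E, ET), raises a TypeError, which mirrors the idea that only type-compatible sisters can combine.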
Lambda calculus notation expresses these functions: ⟦walks⟧ = λx. walks(x); ⟦every man⟧ = λP. ∀x. man(x) → P(x). Function application composes meanings: ⟦every man walks⟧ = ⟦every man⟧(⟦walks⟧) = ∀x. man(x) → walks(x).
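The same composition can be run directly with Python lambdas over a tiny extensional model. This is a sketch only: the individuals and facts are invented, and intensionality (worlds and times) is left out.

```python
# A tiny extensional model; the individuals and facts are invented for illustration.
DOMAIN = {"john", "bill", "mary"}
MAN = {"john", "bill"}
WALKERS = {"john", "bill", "mary"}
TALKERS = {"mary"}

# Type e: an entity.
john = "john"

# Type <e,t>: characteristic functions of sets.
man = lambda x: x in MAN
walks = lambda x: x in WALKERS
talks = lambda x: x in TALKERS

# Type <<e,t>,<<e,t>,t>>: a determiner takes a restrictor, then a scope.
every = lambda P: lambda Q: all(Q(x) for x in DOMAIN if P(x))
some = lambda P: lambda Q: any(Q(x) for x in DOMAIN if P(x))

# Function application composes the meanings.
every_man = every(man)        # type <<e,t>,t>
print(walks(john))            # True
print(every_man(walks))       # True
print(every_man(talks))       # False
```

The curried form of every mirrors the λ-term: it consumes the noun meaning first and returns a generalized quantifier that then consumes the verb meaning.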
The Heim and Kratzer Textbook Tradition
Irene Heim and Angelika Kratzer’s Semantics in Generative Grammar (Blackwell, 1998) became the standard graduate text for compositional semantics. The book develops a type-driven combinatorial system built around a small set of rules (chiefly Functional Application, Predicate Modification, and Predicate Abstraction), treats quantifiers in object position via type-shifting and Quantifier Raising, handles binding and reflexives via assignment functions, and gives a Fregean, presuppositional treatment of definite descriptions.
Truth-Conditional Semantics
Compositional semantics in the post-Montague tradition is fundamentally truth-conditional: to know the meaning of a sentence is to know the conditions under which it would be true. Donald Davidson Truth and Meaning (1967) argued that a Tarski-style truth definition for a natural language fragment, yielding theorems such as “Snow is white” is true iff snow is white, provides a theory of meaning.
Possible Worlds and Modality
Saul Kripke’s possible-world semantics (developed 1959–1963) interpreted modal logic in a frame ⟨W, R⟩ where W is a set of possible worlds and R an accessibility relation. □φ is true at w iff φ is true at every world w′ such that wRw′; ◇φ is true at w iff φ is true at some w′ such that wRw′. David Lewis Counterfactuals (1973) and On the Plurality of Worlds (1986) defended modal realism (possible worlds are concrete entities), while most semanticists adopt an ersatzist (worlds-as-abstract-objects) stance.
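The truth conditions for □ and ◇ translate almost word for word into code. The frame and valuation below are invented for illustration, and the sketch handles only atomic propositions.

```python
# A hypothetical frame <W, R> and a valuation V listing where each atom is true.
W = {"w0", "w1", "w2"}
R = {("w0", "w1"), ("w0", "w2"), ("w1", "w1")}
V = {"p": {"w1", "w2"}, "q": {"w1"}}

def accessible(w):
    """Worlds reachable from w in one step of R."""
    return {v for (u, v) in R if u == w}

def box(atom, w):
    """Necessity: the atom holds at every world accessible from w."""
    return all(v in V[atom] for v in accessible(w))

def diamond(atom, w):
    """Possibility: the atom holds at some world accessible from w."""
    return any(v in V[atom] for v in accessible(w))

print(box("p", "w0"))       # True: p holds at w1 and w2
print(box("q", "w0"))       # False: q fails at w2
print(diamond("q", "w0"))   # True: q holds at w1
print(box("p", "w2"))       # True: w2 accesses no worlds, so necessity holds vacuously
```

The last line shows a familiar consequence of the definition: at a world with no accessible worlds, every □φ is true and every ◇φ is false.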
Modality in natural language is richly varied. Angelika Kratzer’s framework (What “Must” and “Can” Must and Can Mean 1977; 1981, 1991) introduces a modal base (the set of accessible worlds: epistemic, deontic, circumstantial, teleological, bouletic) and an ordering source (a ranking of those worlds by stereotypicality, ideal laws, or desires). John must be home on an epistemic reading: in all worlds consistent with what is known, John is home; on a deontic reading: in all worlds consistent with the rules, John is home.
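A simplified sketch of how a modal base and an ordering source interact: must quantifies over the best worlds of the modal base, where a world is best if no other base world satisfies a strict superset of the ordering-source propositions. The worlds and propositions are invented, and the finite setting sidesteps the complications of infinite orderings.

```python
def best_worlds(modal_base, ordering_source):
    """Worlds in the modal base that no other base world strictly outranks."""
    def satisfied(w):
        # Indices of the ordering-source propositions that are true at w.
        return {i for i, ideal in enumerate(ordering_source) if w in ideal}
    return {w for w in modal_base
            if not any(satisfied(w) < satisfied(v) for v in modal_base)}

def must(prop, modal_base, ordering_source=()):
    """The proposition holds at all best worlds."""
    return best_worlds(modal_base, ordering_source) <= prop

def can(prop, modal_base, ordering_source=()):
    """The proposition holds at some best world."""
    return bool(best_worlds(modal_base, ordering_source) & prop)

# Hypothetical epistemic scenario: propositions are sets of worlds.
EPISTEMIC_BASE = {"w1", "w2", "w3"}     # worlds compatible with what is known
NORMAL_COURSE = [{"w1", "w2"}]          # stereotypical ordering source
JOHN_HOME = {"w1", "w2"}

print(must(JOHN_HOME, EPISTEMIC_BASE, NORMAL_COURSE))   # True
print(must(JOHN_HOME, EPISTEMIC_BASE))                  # False: w3 is not ruled out
print(can(JOHN_HOME, EPISTEMIC_BASE))                   # True
```

With an empty ordering source, must reduces to plain universal quantification over the modal base, as in the paraphrases above.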
Tense and Aspect
Tense locates the event in time relative to the utterance; aspect characterizes its internal temporal structure. Hans Reichenbach Elements of Symbolic Logic (1947) introduced three points, S (speech time), R (reference time), and E (event time), whose relative order derives the main English tenses. Writing a comma for simultaneity and an underscore for precedence: simple past is E,R_S; present perfect is E_S,R; past perfect is E_R_S; future perfect is S_E_R; and so on.
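The Reichenbachian configurations can be checked mechanically by treating S, R, and E as points on a number line. This is a sketch under that assumption; the function name tense is hypothetical, and only the orderings listed above plus the simple present and simple future are covered.

```python
def tense(S, R, E):
    """Classify a configuration of speech (S), reference (R) and event (E) times."""
    if E == R == S:
        return "simple present"
    if E == R < S:          # E,R_S
        return "simple past"
    if E < R == S:          # E_S,R
        return "present perfect"
    if E < R < S:           # E_R_S
        return "past perfect"
    if S < E < R:           # S_E_R
        return "future perfect"
    if S < R == E:          # S_R,E
        return "simple future"
    return "unclassified"

print(tense(S=2, R=1, E=1))   # simple past
print(tense(S=2, R=2, E=1))   # present perfect
print(tense(S=3, R=2, E=1))   # past perfect
print(tense(S=1, R=3, E=2))   # future perfect
```

The contrast between simple past and present perfect comes out clearly: the event precedes speech time in both, and only the position of R differs.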
Aspect divides into:
- Perfective — event presented as a whole, bounded (Russian and other Slavic perfective verbs; Romance preterite tenses)
- Imperfective — internal structure visible, unbounded (Romance imperfect; English progressive was running)
- Habitual — characterizing pattern (used to + V; Russian imperfective in past)
- Perfect — current relevance of a prior event (English have V-ed)
Aktionsart (lexical aspect, verbal action type) — Zeno Vendler Linguistics in Philosophy (1967) — distinguishes four classes:
- States — know, love, be tall (no internal change, no endpoint)
- Activities — run, swim, read (durative, no inherent endpoint)
- Accomplishments — build a house, walk to the store (durative + endpoint)
- Achievements — arrive, notice, die (punctual + endpoint)
Diagnostics include compatibility with in an hour (telic predicates) versus for an hour (atelic predicates), and compatibility with the progressive (activities and accomplishments take it readily; states and achievements resist).
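The four classes and their diagnostics can be summarized with three binary features. The feature decomposition (dynamic, durative, telic) is a common textbook convention assumed here rather than something stated above, and the function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Aktionsart:
    dynamic: bool     # involves change
    durative: bool    # extends over time
    telic: bool       # has an inherent endpoint

VENDLER = {
    "state":          Aktionsart(dynamic=False, durative=True,  telic=False),
    "activity":       Aktionsart(dynamic=True,  durative=True,  telic=False),
    "accomplishment": Aktionsart(dynamic=True,  durative=True,  telic=True),
    "achievement":    Aktionsart(dynamic=True,  durative=False, telic=True),
}

def accepts_in_an_hour(cls):
    """'in an hour' selects telic predicates."""
    return VENDLER[cls].telic

def accepts_for_an_hour(cls):
    """'for an hour' selects atelic predicates."""
    return not VENDLER[cls].telic

def accepts_progressive(cls):
    """The progressive is readily available for dynamic, durative predicates."""
    features = VENDLER[cls]
    return features.dynamic and features.durative

for name in VENDLER:
    print(name, accepts_in_an_hour(name), accepts_for_an_hour(name), accepts_progressive(name))
```

The diagnostics are idealized: real predicates often shift class under coercion, so the table describes default readings only.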
Quantification
Generalized quantifier theory (Jon Barwise and Robin Cooper Generalized Quantifiers and Natural Language, Linguistics and Philosophy 1981) treats determiners as relations between sets. Every(A)(B) holds iff A ⊆ B; some(A)(B) iff A ∩ B ≠ ∅; no(A)(B) iff A ∩ B = ∅; most(A)(B) iff |A ∩ B| > |A − B|. Quantifiers are characterized by properties (conservativity, monotonicity, extension) that explain restrictions on natural language determiners (only is not conservative and is not a true determiner; existential there constructions allow only weak quantifiers, Milsark 1977).
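The set-relation definitions and the conservativity property, D(A)(B) iff D(A)(A ∩ B), can be verified by brute force over a small universe. This is a sketch: the three-element universe is arbitrary, and only is modeled on the assumption that "only A are B" means B ⊆ A.

```python
from itertools import combinations

def every(A, B): return A <= B
def some(A, B):  return bool(A & B)
def no(A, B):    return not (A & B)
def most(A, B):  return len(A & B) > len(A - B)
def only(A, B):  return B <= A     # "only A are B": every B is an A

def subsets(universe):
    """All subsets of a finite universe."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def is_conservative(det, universe):
    """Check D(A)(B) == D(A)(A & B) for every pair of subsets."""
    sets = subsets(universe)
    return all(det(A, B) == det(A, A & B) for A in sets for B in sets)

UNIVERSE = {1, 2, 3}
for det in (every, some, no, most, only):
    print(det.__name__, is_conservative(det, UNIVERSE))
# every, some, no and most print True; only prints False
```

Conservativity says that only the part of B inside A matters, which is why only, whose truth depends on members of B outside A, fails the test.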
Scope ambiguity arises when a sentence contains more than one scope-bearing operator. Every student didn’t pass is ambiguous between wide-scope negation (Not every student passed: ¬∀x.[student(x) → passed(x)]) and narrow-scope negation (No student passed: ∀x.[student(x) → ¬passed(x)], equivalently ¬∃x.[student(x) ∧ passed(x)]). Every man loves a woman allows both ∀x∃y and ∃y∀x readings. Quantifier Raising (Robert May Logical Form 1985) covertly moves the quantifier to the periphery at LF; alternative in-situ analyses use type-shifting or choice functions.
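Evaluating the competing readings in one small model shows that they are truth-conditionally distinct. The individuals and facts below are invented; each reading is labeled by the relative order of its operators.

```python
# Hypothetical model for "Every student didn't pass".
STUDENTS = {"ann", "ben"}
PASSED = {"ann"}

# Negation over the universal: it is not the case that every student passed.
not_over_every = not all(s in PASSED for s in STUDENTS)
# Universal over negation: every student is such that they did not pass.
every_over_not = all(s not in PASSED for s in STUDENTS)

print(not_over_every)   # True: ben did not pass
print(every_over_not)   # False: ann passed

# Hypothetical model for "Every man loves a woman".
MEN = {"m1", "m2"}
WOMEN = {"w1", "w2"}
LOVES = {("m1", "w1"), ("m2", "w2")}

# Universal over existential: each man loves some woman or other.
forall_exists = all(any((m, w) in LOVES for w in WOMEN) for m in MEN)
# Existential over universal: one particular woman is loved by every man.
exists_forall = any(all((m, w) in LOVES for m in MEN) for w in WOMEN)

print(forall_exists)    # True
print(exists_forall)    # False
```

In both pairs the model makes one reading true and the other false, which is the standard argument that the sentences are genuinely ambiguous rather than vague.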
Discourse Representation Theory (DRT)
Hans Kamp A Theory of Truth and Semantic Representation (1981) and Irene Heim The Semantics of Definite and Indefinite Noun Phrases (1982 UMass PhD) independently developed dynamic accounts. The classic puzzle: in A farmer owns a donkey. He beats it, the pronoun it depends on the indefinite a donkey across sentences, beyond the c-command domain. DRT builds a Discourse Representation Structure (DRS) — a box of discourse referents and conditions — that grows as discourse proceeds. Indefinites introduce new referents; pronouns anaphorically pick up existing referents; definites presuppose their referent is already in the DRS.
Dynamic Semantics — File Change Semantics
Heim’s file change semantics (1982) models meaning as context change potential: the meaning of a sentence is the update it makes to a context. Contexts are sets of (world, assignment) pairs. Indefinites add new file cards; definites require an existing card; updates are partial functions on contexts. Heim’s familiarity theory captures presupposition: a definite’s presupposition is satisfied iff its referent’s file card already exists.
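The file metaphor can be sketched as follows. This is a deliberate simplification: a context is modeled as a dictionary of file cards instead of a set of (world, assignment) pairs, and the function and exception names are hypothetical.

```python
import copy

class PresuppositionFailure(Exception):
    """Raised when a definite finds no file card to update."""

def indefinite(file, referent, *conditions):
    """Novelty: an indefinite opens a new card for a fresh referent."""
    if referent in file:
        raise ValueError(f"{referent} is already in use; an indefinite needs a new card")
    updated = copy.deepcopy(file)
    updated[referent] = set(conditions)
    return updated

def definite(file, referent, *conditions):
    """Familiarity: a definite or pronoun updates a card that must already exist."""
    if referent not in file:
        raise PresuppositionFailure(f"no card for {referent}")
    updated = copy.deepcopy(file)
    updated[referent] |= set(conditions)
    return updated

# "A farmer owns a donkey. He beats it."
c0 = {}
c1 = indefinite(c0, "x", "farmer")
c2 = indefinite(c1, "y", "donkey", "owned_by(x)")
c3 = definite(c2, "y", "beaten_by(x)")      # "it" picks up card y
print(sorted(c3["y"]))   # ['beaten_by(x)', 'donkey', 'owned_by(x)']
```

Because definite is undefined (here, raises) when the card is missing, the update is a partial function on contexts, matching the description above.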
Event Semantics
Donald Davidson The Logical Form of Action Sentences (1967) proposed that verbs of action take an event argument. In the Neo-Davidsonian notation that separates out thematic roles, John walked slowly becomes ∃e[walking(e) ∧ Agent(e, John) ∧ Slow(e)]. The treatment elegantly explains adverbial modification (each adverb is a predicate of the event), entailment (John walked slowly entails John walked by conjunction elimination), and anaphoric reference to events (It happened at noon). Davidsonian and Neo-Davidsonian event semantics (Terence Parsons Events in the Semantics of English 1990) underpins much modern semantic theory.
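The entailment pattern can be illustrated by representing a logical form as a set of conjuncts over one event variable. The subset test below is a sketch that is sound only for this restricted fragment, where both forms existentially bind the same single event variable; the encoding is an assumption made for illustration.

```python
# Each logical form is the set of conjuncts under a single existential over e.
walked_slowly = frozenset({("walking", "e"), ("Agent", "e", "John"), ("Slow", "e")})
walked = frozenset({("walking", "e"), ("Agent", "e", "John")})

def entails(premise, conclusion):
    """Dropping conjuncts preserves truth, so a subset of conjuncts is entailed."""
    return conclusion <= premise

print(entails(walked_slowly, walked))   # True
print(entails(walked, walked_slowly))   # False
```

Adding further modifiers (in the park, at noon) just adds conjuncts, so every shorter variant follows automatically without a separate meaning postulate per adverb.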
Thematic Roles
The semantic relations between a predicate and its arguments — thematic roles or theta-roles — include:
- Agent — volitional initiator (John in John kicked the ball)
- Patient — undergoer of change (the ball in John kicked the ball)
- Experiencer — locus of a mental state (Mary in Mary loves Tom)
- Theme — entity moved or located (book in the book is on the table)
- Instrument — means used (hammer in John broke it with a hammer)
- Goal — endpoint of motion (London in flew to London)
- Source — origin of motion (Paris in flew from Paris)
- Recipient — receiver in transfer (Mary in gave the book to Mary)
- Benefactive — beneficiary (for Mary)
Theta-role inventories vary across theories from a handful (Dowty 1991 Thematic Proto-Roles and Argument Selection argues for just Proto-Agent and Proto-Patient with cluster properties) to dozens.
Presupposition
Presuppositions are background assumptions of an utterance. The king of France is bald presupposes that there is a king of France. John regrets cheating on the exam presupposes that John cheated on the exam. Key diagnostics test whether the inference survives embedding: under negation (The king of France is not bald still presupposes a king of France), in a question (Is the king of France bald?), and in the antecedent of a conditional.
Projection is the central theoretical problem: which presuppositions survive embedding under operators? Robert Stalnaker Pragmatic Presuppositions (1974) and Irene Heim (1983 On the Projection Problem for Presuppositions) developed dynamic and pragmatic theories. Accommodation (David Lewis Scorekeeping in a Language Game 1979) explains how hearers silently update the common ground when an utterance presupposes something not already shared. Stalnaker’s framework treats assertion as a proposal to add a proposition to the common ground — the set of propositions mutually accepted by interlocutors.
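Assertion and accommodation can be sketched with a context set, the set of worlds compatible with the common ground. The worlds and propositions are invented for illustration, and the function name update is hypothetical.

```python
def update(context_set, asserted, presupposed=None):
    """Return (new context set, whether accommodation was needed)."""
    accommodated = False
    if presupposed is not None and not context_set <= presupposed:
        # The presupposition is not yet common ground: quietly add it first.
        context_set = context_set & presupposed
        accommodated = True
    # Assertion removes the worlds where the asserted proposition is false.
    return context_set & asserted, accommodated

# Hypothetical worlds for "John regrets cheating on the exam".
WORLDS = {"w1", "w2", "w3"}
CHEATED = {"w1", "w2"}      # presupposition
REGRETS = {"w1"}            # asserted content

print(update(WORLDS, REGRETS, CHEATED))           # ({'w1'}, True)
print(update({"w1", "w2"}, REGRETS, CHEATED))     # ({'w1'}, False)
```

In the second call the context already entails that John cheated, so the presupposition is satisfied and no accommodation takes place.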
Pragmatics — Grice and the Cooperative Principle
H. P. Grice’s William James Lectures Logic and Conversation (delivered Harvard 1967, published 1975) launched modern pragmatics. The Cooperative Principle: “Make your conversational contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.” The four maxims:
- Quantity — be as informative as required; not more, not less.
- Quality — say only what you believe true and have evidence for.
- Relation — be relevant.
- Manner — be perspicuous (clear, brief, orderly, unambiguous).
Conversational implicatures arise when speakers flout, exploit, or balance maxims. Did you eat the cookies? — I ate some generates the scalar implicature that the speaker did not eat all (else the stronger all would have been more informative). Where’s John? — There’s a yellow VW outside Mary’s place exploits Relation: hearer infers John is at Mary’s. Implicatures are cancelable (some, in fact all) and non-detachable (paraphrasing preserves them), distinguishing them from entailments and presuppositions.
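Scalar implicature and its cancelability can be sketched with a Horn scale ordered from weakest to strongest. The scale and the function name are illustrative assumptions; real implicature calculation also depends on relevance and on what the speaker is taken to know.

```python
SCALE = ["some", "most", "all"]     # weakest to strongest

def scalar_implicatures(term, follow_up=None):
    """Implicated negations of stronger alternatives, unless a follow-up asserts one."""
    strongest_asserted = SCALE.index(term)
    if follow_up is not None:
        # "some, in fact all" explicitly asserts a stronger term, cancelling the inference.
        strongest_asserted = max(strongest_asserted, SCALE.index(follow_up))
    return [f"not {alternative}" for alternative in SCALE[strongest_asserted + 1:]]

print(scalar_implicatures("some"))                    # ['not most', 'not all']
print(scalar_implicatures("some", follow_up="all"))   # []
print(scalar_implicatures("all"))                     # []
```

The second call models cancelation: no contradiction arises, the implicature simply disappears, which is what separates it from an entailment.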
Speech Act Theory
J. L. Austin How to Do Things with Words (William James Lectures, Harvard 1955; book 1962) observed that utterances do things, not just describe — I promise creates a promise, I name this ship names it, I bet creates a wager. Austin distinguished three acts in any utterance:
- Locutionary act — the saying itself, with sense and reference
- Illocutionary act — what is done in saying (asserting, promising, requesting, warning)
- Perlocutionary act — effect on hearer (convincing, frightening, persuading)
John Searle, building on Speech Acts: An Essay in the Philosophy of Language (1969), systematized illocutionary force into five classes in A Taxonomy of Illocutionary Acts (1975):
- Representatives / Assertives — commit speaker to truth of proposition (assert, claim, report)
- Directives — attempt to get hearer to do something (request, command, suggest)
- Commissives — commit speaker to future action (promise, offer, threaten)
- Expressives — express psychological state (thank, apologize, congratulate)
- Declarations — bring about a state of affairs by being said (baptize, fire, adjourn)
Relevance Theory
Dan Sperber and Deirdre Wilson Relevance: Communication and Cognition (1986; 2nd ed. 1995) proposed a cognitively grounded alternative to Grice’s many maxims, replacing them with a single principle of relevance: every utterance creates a presumption of optimal relevance — maximum cognitive effects for minimum processing effort. Hearers infer the speaker’s intended interpretation as the first interpretation that achieves optimal relevance. Implicatures, explicatures, ad hoc concepts, and metaphor receive unified treatments.
Politeness
Penelope Brown and Stephen Levinson Politeness: Some Universals in Language Usage (1987, expanding 1978) developed a theory of face (drawing on Goffman): positive face (desire to be approved) and negative face (desire for autonomy). Face-threatening acts (FTAs) — requests, criticism, disagreements — require face-work strategies: bald on-record (Close the window!), positive politeness (in-group markers, Close the window, mate), negative politeness (hedges, indirectness, Could you possibly close the window?), off-record (hints, It’s cold in here), or avoidance. The model has been criticized (cross-cultural critiques by Matsumoto 1988 on Japanese; Wierzbicka on Anglo bias) but remains foundational.
Deixis
Deixis is reference relative to the speech context. Five types (Charles Fillmore Lectures on Deixis 1971/1997):
- Person — I, you, we
- Place — here, there, this, that
- Time — now, then, yesterday, next week, tense
- Discourse — the former, the latter, this (as proform)
- Social — honorifics, T/V pronouns (French tu/vous, German du/Sie, Japanese complex honorific system)
Theories of Reference
How does a name or description pick out its referent?
- Descriptivism (Gottlob Frege Über Sinn und Bedeutung 1892, Bertrand Russell On Denoting 1905): names have descriptive content; a name refers to whatever uniquely satisfies the associated description. The view opposes John Stuart Mill (A System of Logic 1843), for whom proper names denote without connoting.
- Russell’s theory of definite descriptions (1905): the king of France is bald logically decomposes as ∃x[king-of-France(x) ∧ ∀y[king-of-France(y) → y=x] ∧ bald(x)] — three conjuncts: existence, uniqueness, predication. The king of France is bald is therefore simply false (not, as Strawson 1950 argued, neither true nor false).
- Causal-historical / direct reference (Saul Kripke Naming and Necessity lectures 1970, published 1980): names are rigid designators; they refer to the same object in every possible world. Reference is fixed by an initial baptism and transmitted through a causal chain of use. Names lack descriptive content.
- Hilary Putnam The Meaning of “Meaning” (1975): natural kind terms (water, gold, tiger) work similarly; “‘meanings’ just ain’t in the head”: meaning depends on the external world and expert deference (Twin Earth thought experiment).
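Russell's three-conjunct analysis of definite descriptions, listed above, can be evaluated directly in a finite model. The domain and predicates below are invented, and the function name the is a hypothetical label for the Russellian truth conditions.

```python
def the(restrictor, scope, domain):
    """Russellian 'the F is G': existence, uniqueness, and predication must all hold."""
    satisfiers = [x for x in domain if restrictor(x)]
    exists = len(satisfiers) >= 1
    unique = len(satisfiers) <= 1
    return exists and unique and scope(satisfiers[0])

# Hypothetical model.
DOMAIN = {"a", "b", "c"}
KING_OF_FRANCE = set()      # nobody satisfies the description
CHAIR = {"b"}               # exactly one individual
MEMBER = {"a", "c"}         # two individuals
BALD = {"a", "b"}

bald = lambda x: x in BALD
print(the(lambda x: x in KING_OF_FRANCE, bald, DOMAIN))   # False: existence fails
print(the(lambda x: x in CHAIR, bald, DOMAIN))            # True
print(the(lambda x: x in MEMBER, bald, DOMAIN))           # False: uniqueness fails
```

On this analysis a failed existence condition yields plain falsity; a Strawsonian treatment would instead leave the first case without a truth value.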
Experimental Semantics and Pragmatics
A major late-20th and 21st-century development is experimental investigation of semantic and pragmatic phenomena. Methods include eye-tracking (visual world paradigm — Tanenhaus et al. 1995), self-paced reading, ERP (N400 for semantic anomaly, P600 for syntactic violations), acceptability judgments, and crowdsourced surveys. Examples: scalar implicature processing (Bott & Noveck 2004 — some implicature is delayed), presupposition projection (Schwarz 2016), gradience in quantifier interpretation.
Distributional Semantics and Embeddings
A parallel computational tradition derives word meaning from co-occurrence patterns. J. R. Firth 1957: “You shall know a word by the company it keeps.” Early count-based models such as HAL and COALS built high-dimensional co-occurrence vectors from corpora; Latent Semantic Analysis (Deerwester, Dumais, Furnas et al. 1990) reduced a term-document matrix to a dense low-rank space via singular value decomposition; and Latent Dirichlet Allocation (Blei, Ng, Jordan 2003) modeled documents as mixtures of latent topics.
word2vec (Tomas Mikolov, Kai Chen, Greg Corrado, Jeff Dean, Google 2013) and GloVe (Jeffrey Pennington, Richard Socher, Christopher Manning, Stanford 2014) produced dense low-dimensional embeddings via shallow neural networks or matrix factorization. The famous analogy king − man + woman ≈ queen captures structural regularities in embedding space.
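The analogy arithmetic can be sketched with hand-made vectors. The three-dimensional values below are invented so that the regularity holds by construction; trained embeddings have hundreds of dimensions and only approximate it.

```python
import math

# Hypothetical toy embeddings: (royalty, maleness, femaleness).
VECTORS = {
    "king":   [0.9, 0.9, 0.1],
    "queen":  [0.9, 0.1, 0.9],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "prince": [0.8, 0.8, 0.1],
    "apple":  [0.0, 0.2, 0.2],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))

print(analogy("king", "man", "woman"))   # queen
```

Excluding the three input words from the candidates follows common practice in analogy evaluation; without it the nearest neighbor is often one of the inputs.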
ELMo (Peters, Neumann, Iyyer et al. NAACL 2018) introduced contextualized embeddings from bidirectional LSTMs. BERT (Devlin, Chang, Lee, Toutanova 2018), GPT-1/2/3 (Radford et al. 2018, 2019; Brown et al. 2020), GPT-4 (OpenAI 2023), and successors made transformer-based contextual representations the standard. Embeddings encode substantial lexical-semantic structure (synonymy, analogy, hyperonymy), syntactic information, and world knowledge.
The Meaning Debate — Do Large Language Models Understand?
Emily Bender and Alexander Koller Climbing Towards NLU: On Meaning, Form, and Understanding in the Age of Data (ACL 2020) argued that systems trained on text alone — without grounding in the world or in communicative intent — cannot acquire genuine meaning, only statistical patterns. Their octopus test (an intelligent octopus eavesdropping on telegraph conversations between two islanders) illustrates the worry: surface fluency without grounding cannot capture meaning.
Anna Ivanova and Kyle Mahowald with colleagues (Dissociating Language and Thought in Large Language Models, Trends in Cognitive Sciences 2024) distinguish formal linguistic competence (knowing the rules of a language) from functional competence (using language to reason, plan, ground in the world). LLMs achieve striking formal competence; functional competence is more variable. The debate parallels classical philosophical disputes over meaning (Searle’s Chinese Room, Putnam’s Twin Earth, Davidson on radical interpretation).
Adjacent
- syntax-and-grammar — syntax determines argument structure and scope possibilities
- phonetics-and-phonology — intonation signals focus and given/new
- philosophy-of-mind-and-language — reference, meaning, intentionality
- logic-and-philosophy-of-mathematics — modal logic, formal semantics
- transformer-architecture — neural language models and embeddings
- sociolinguistics-and-applied — pragmatics in discourse and across cultures