Document / Typesetting Languages Family Index


type: language-family-index
family: document-typesetting
languages_catalogued: 25
tags: [language-reference, family-index, typesetting, documents, markup, publishing, literate-programming]


Family overview

The document-language family begins with Donald Knuth’s TeX (1978), born from his frustration with the typesetting of The Art of Computer Programming. TeX’s box-and-glue model and Knuth-Plass line-breaking algorithm remain the reference point for paragraph composition almost fifty years later. LaTeX (Leslie Lamport, 1985) wrapped TeX in a logical-markup macro layer that hid the typesetting plumbing behind commands such as \section, \cite, and \label, and it became the lingua franca of mathematics, physics, computer science, and academic publishing. ConTeXt (Hans Hagen, mid-1990s) took a different route: a single, integrated macro package oriented toward document design, built on the same engine family (today LuaTeX for MkIV and LuaMetaTeX for LMTX) and favored in print houses and for complex multilingual publications. Modern descendants include SILE (Lua-driven, Unicode-first, ~2014), Patoline (OCaml-based, glyph-precise), and the rising Typst (Rust-based, 2023). Typst in particular has captured significant academic mindshare as a “LaTeX without the pain,” with Markdown-like syntax, readable error messages, and incremental compilation.
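
To make the line-breaking idea concrete, here is a minimal Python sketch of whole-paragraph line breaking in the spirit of that algorithm. It is a simplification under stated assumptions, not TeX’s actual implementation: every character is assumed to have width 1, the cost of a line is the square of its unused space, the last line is free, and there is no stretchable glue, penalty, or hyphenation. The names break_lines and width are illustrative.

```python
def break_lines(words, width):
    """Pick line breaks that minimize total squared slack over the whole paragraph.

    Assumptions (not TeX's real model): monospaced text, single spaces between
    words, and no penalty for a short final line.
    """
    n = len(words)
    inf = float("inf")
    best = [inf] * (n + 1)   # best[i]: minimal cost of setting words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index where the line starting at word i ends
    best[n] = 0.0

    for i in range(n - 1, -1, -1):
        length = -1  # length of words[i:j] joined by single spaces
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width:
                break
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i] = cost + best[j]
                nxt[i] = j

    if best[0] == inf:
        raise ValueError("a word is longer than the line width")

    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines


print(break_lines("aaa bb cc ddddd".split(), 6))
# ['aaa', 'bb cc', 'ddddd']
```

A greedy first-fit breaker would fill the first line with “aaa bb” and strand “cc” on a line of its own; the dynamic program accepts a looser first line to even out the paragraph as a whole, which is the essential difference between paragraph-level and line-by-line breaking.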

The 2000s and 2010s brought a lightweight-markup explosion. John Gruber and Aaron Swartz’s Markdown (2004) was deliberately “wrong by design” (lossy, ambiguous) but won by being readable as plain text. CommonMark (2014) finally resolved the parsing ambiguities with a precise specification; GFM layered tables, task lists, strikethrough, and autolinks on top for GitHub. Parallel tracks split the niche along expressiveness/portability axes: AsciiDoc (Stuart Rackham, 2002, semantically richer than Markdown), reStructuredText (David Goodger, 2002, Python-docs ecosystem), MultiMarkdown, kramdown, and Pandoc Markdown (John MacFarlane’s union-of-features dialect). Pandoc itself became the universal interchange: any markup in, any format out.

A three-way split runs through the family: layout languages (TeX, Typst, troff) describe glyphs on pages; semantic markup (DocBook, DITA, Texinfo) describes document structure for downstream rendering; lightweight markup (Markdown family, AsciiDoc, RST) splits the difference for human-authored prose. Cross-cutting these is the literate-programming branch: Knuth’s original WEB, Ross Williams’s FunnelWeb, Emacs Org-mode (Carsten Dominik, 2003), Matthew Butterick’s Pollen (Racket-based), and Posit’s Quarto (2022, RMarkdown’s successor), where executable code blocks are woven with prose to produce both program and document. The Unix man-page lineage (troff/nroff/groff/Heirloom troff/mdoc) remains in production after 50+ years, making it one of the most stable document formats ever shipped.
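
The “tangle” half of literate programming (extracting the program from the prose) can be sketched in a few lines of Python. The example assumes a noweb-style chunk syntax that is not taken from this page: a chunk opens with <<name>>=, closes with a line containing only @, may reference other chunks as <<name>>, and the root chunk is named *. The helpers parse_chunks and tangle are hypothetical names, and the sketch does not guard against cyclic references.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")        # line that opens a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")    # line that references a chunk


def parse_chunks(source):
    """Collect named code chunks; everything outside a chunk is treated as prose."""
    chunks, current = {}, None
    for line in source.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif line == "@" or line.startswith("@ "):
            current = None  # back to prose
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, root="*"):
    """Expand chunk references recursively, carrying the reference's indentation."""
    out = []

    def expand(name, indent):
        for line in chunks[name]:
            match = CHUNK_REF.match(line)
            if match:
                expand(match.group(2), indent + match.group(1))
            else:
                out.append(indent + line)

    expand(root, "")
    return "\n".join(out)


DOC = """\
The program is assembled from two chunks.
<<*>>=
def main():
    <<body>>
main()
@
The body just prints a greeting.
<<body>>=
print("hello")
@
"""

print(tangle(parse_chunks(DOC)))
```

The “weave” half would walk the same source and emit the prose plus typeset chunks instead; tools such as Org-mode Babel or Quarto additionally execute the chunks and splice the results back in.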

In our deep library

None catalogued in the deep library (these are markup/typesetting, not general-purpose programming). Cross-reference music-audio for LilyPond and ABC notation, and scientific for RMarkdown/Quarto literate workflows.

Tier 3 family table

| Language | First appeared | Origin | Output | Status (2026) | URL |
|---|---|---|---|---|---|
| TeX | 1978 | Donald Knuth (Stanford) | DVI / PDF (via pdfTeX, XeTeX, LuaTeX) | Frozen by design (Knuth’s bug-fix-only policy); engine layer thrives via XeTeX/LuaTeX | https://www.tug.org/ |
| LaTeX | 1985 | Leslie Lamport (SRI International) | DVI / PDF; HTML via LaTeXML | Active; LaTeX3 / expl3 modernization; dominant in STEM publishing | https://www.latex-project.org/ |
| ConTeXt | ~1996 | Hans Hagen, Ton Otten (Pragma ADE) | PDF via LuaTeX (MkIV) or LuaMetaTeX (LMTX) | Active; favored for complex layout, multilingual, print production | https://wiki.contextgarden.net/ |
| Typst | 2023 | Martin Haug, Laurenz Mädje (TU Berlin) | PDF / PNG / SVG; web app + open-source CLI | Very active; rapid academic adoption; v0.13 series in 2026 | https://typst.app/ |
| Pollen | 2014 | Matthew Butterick | HTML / PDF (via Racket Scribble); X-expressions | Active but niche; powers Butterick’s Practical Typography and Beautiful Racket | https://docs.racket-lang.org/pollen/ |
| LilyPond | 1996 | Han-Wen Nienhuys, Jan Nieuwenhuizen | PDF / SVG / MIDI; engraved music notation | Active; cross-ref music-audio | https://lilypond.org/ |
| AsciiDoc / Asciidoctor | 2002 (AsciiDoc); 2013 (Asciidoctor Ruby impl) | Stuart Rackham; Dan Allen / OpenDevise | HTML / PDF / DocBook / EPUB / man pages | Active; Eclipse Foundation AsciiDoc Working Group standardizing the language (2024-2026) | https://asciidoctor.org/ |
| reStructuredText (RST) | 2002 | David Goodger (Docutils) | HTML / LaTeX / man / XML; Sphinx pipeline | Active; canonical for Python docs ecosystem (Sphinx, Read the Docs) | https://docutils.sourceforge.io/rst.html |
| Pandoc Markdown | 2006 | John MacFarlane (UC Berkeley) | 40+ formats via Pandoc AST | Very active; effectively the universal-markup superset | https://pandoc.org/MANUAL.html |
| groff / troff / nroff | 1973 (troff, Joe Ossanna); 1990 (groff, James Clark) | Bell Labs; later GNU | PostScript / PDF / terminal text; Unix man pages | Maintained; groff is the usual man-page renderer on Linux (and on macOS before Ventura) | https://www.gnu.org/software/groff/ |
| mdoc | 1990 | Cynthia Livingston (UC Berkeley CSRG) | Semantic macro set for BSD man pages; rendered via groff/mandoc | Active; default man-page format on FreeBSD, OpenBSD, NetBSD, macOS | https://mandoc.bsd.lv/ |
| CommonMark | 2014 | John MacFarlane, Jeff Atwood, David Greenspan, et al. | HTML; reference C/JS implementations | Active; spec 0.31.2 (2024); foundation for GFM, MDX, many engines | https://commonmark.org/ |
| GitHub Flavored Markdown (GFM) | 2017 (formal spec) | GitHub | HTML; CommonMark + tables, task lists, strikethrough, autolinks | Active; arguably the most-read markup dialect on the planet | https://github.github.com/gfm/ |
| MultiMarkdown (MMD) | 2005 | Fletcher Penney | HTML / LaTeX / OpenDocument / RTF; metadata, tables, footnotes, math | Maintained (MMD 6); influenced Pandoc and CommonMark feature sets | https://fletcher.github.io/MultiMarkdown-6/ |
| kramdown | 2007 | Thomas Leitner | HTML / LaTeX / PDF; default Jekyll engine | Active; powers Jekyll / GitHub Pages alongside CommonMark-GFM | https://kramdown.gettalong.org/ |
| DocBook | 1991 | HaL Computer Systems + O’Reilly; OASIS standard since 1998 | XSL-FO → PDF; HTML; EPUB; semantic XML schema | Maintained; persistent in tech-publishing toolchains (O’Reilly historically, Linux Documentation Project, FreeBSD Handbook) | https://docbook.org/ |
| DITA | 2001 | IBM (originally); OASIS standard since 2005 | HTML / PDF / EPUB via DITA Open Toolkit; topic-based authoring | Active in regulated industries (medical devices, aerospace, software docs); DITA 2.0 (2024) | https://www.dita-ot.org/ |
| Texinfo | 1986 | Richard Stallman, Bob Chassell (FSF) | GNU Info / HTML / PDF / EPUB; one source, many outputs | Active; canonical format for GNU project documentation (Emacs, GCC, Bash) | https://www.gnu.org/software/texinfo/ |
| Quarto | 2022 | Posit (formerly RStudio); JJ Allaire, Carlos Scheidegger, et al. | HTML / PDF / Word / EPUB / slides; executes R, Python, Julia, Observable | Very active; the de-facto successor to RMarkdown for reproducible publishing | https://quarto.org/ |
| SILE | 2014 | Simon Cozens | PDF; Lua-scripted typesetter using TeX line-breaking + HarfBuzz | Active; v0.15 (2024); Unicode/OpenType-first alternative to TeX | https://sile-typesetter.org/ |
| Patoline | ~2012 | Pierre-Étienne Meunier, Christophe Raffalli (Univ. Savoie) | PDF / SVG; OCaml-based; programmable typesetter | Mostly dormant; small academic following | http://patoline.org/ |
| Lout | 1991 | Jeffrey Kingston (Univ. Sydney) | PostScript / PDF; functional language, no TeX dependency | Maintained but niche; small user base | https://github.com/william8000/lout |
| Heirloom troff (Heirloom Doctools) | 2005 | Gunnar Ritter; revived from Solaris/Plan 9 sources | PostScript / PDF; high-fidelity classical troff lineage | Maintained; preferred for traditional troff workflows wanting OpenType + classical algorithms | https://n-t-roff.github.io/heirloom/doctools.html |
| FunnelWeb | 1986 | Ross Williams (Univ. Adelaide) | Tangled source + woven typeset doc; language-agnostic literate programming | Historical; v3.2 (1999) is the last release; cited in literate-programming surveys | http://www.ross.net/funnelweb/ |
| Org-mode | 2003 | Carsten Dominik (Emacs) | HTML / LaTeX / PDF / ODT / Beamer / Reveal.js; Babel for code execution | Very active; document + literate-programming + agenda + outliner | https://orgmode.org/ |

Notable threads

  • Knuth’s TeX vs Typst’s modern challenge. TeX’s box-and-glue typesetting and the Knuth-Plass optimal-fit line breaker remain the benchmark for fine paragraph composition (justified text, mathematical layout). Knuth famously declared TeX “frozen”: only bug fixes are accepted, and the version number asymptotically approaches π. The macro layer (LaTeX) and the engine layer (pdfTeX, XeTeX, LuaTeX) absorb innovation while the core stays stable. Typst (Berlin, 2023) bet that academic users would trade some typographic perfection for sub-second incremental compiles, helpful error messages, a familiar Markdown-like syntax, and a real module/package system. By 2026 it has been adopted in undergraduate STEM courses and is appearing in journal templates, though LaTeX still dominates wherever publishers mandate their own style files.

  • Why Markdown won despite being “wrong by design.” Gruber and Swartz’s Markdown was a pragmatic, lossy syntax with no formal grammar, so every implementation parsed it slightly differently. CommonMark (MacFarlane, Atwood, Greenspan, et al., 2014) finally produced a precise spec, which became the base for extended dialects: GFM (tables, task lists), MDX (JSX inside Markdown), and MyST (RST/Sphinx interop), alongside the older kitchen-sink Pandoc Markdown. The win condition was that Markdown reads as plain text: a README.md is legible as raw source and renders cleanly as HTML, which made it the default for git-hosted documentation, chat platforms (Slack, Discord, IRC variants), and static-site generators. GitHub alone hosts roughly 100 million README.md files.

  • The literate-programming revival. Knuth’s WEB and Ross Williams’s FunnelWeb had mostly faded by the mid-1990s, but the idea returned via three vectors: Emacs Org-mode (Babel blocks executing 80+ languages with results inline), Pollen (Butterick’s Racket-based system for book-length programmable prose), and RMarkdown → Quarto (Posit’s reproducible-research pipeline executing R, Python, Julia, and Observable in one document), with the Jupyter notebook ecosystem adjacent to all three. Quarto in particular offers a Pandoc-based publishing path in which one source produces journal PDFs, websites, slides, and books with executed analyses: Knuth’s vision realized via Pandoc’s AST.

  • DocBook / DITA enterprise persistence. While developer docs migrated to Markdown and AsciiDoc, regulated-industry documentation (medical device IFUs, aerospace technical orders, automotive service manuals, large-scale software user guides) stayed on DocBook (semantic, schema-validated, single-source-multi-output) and DITA (topic-oriented, with content reuse via conrefs and conditional processing). DITA 2.0 (2024) modernized the architecture; the DITA Open Toolkit remains the canonical processor. Component content management systems and XML authoring tools (IXIASOFT, Heretto, Paligo, oXygen XML Editor) build entire commercial product lines around these XML formats.

  • The man-page lineage: troff → groff → mdoc. Ossanna’s troff (1973), which drove phototypesetters, and its terminal-output sibling nroff are among the oldest document formatters still in use. James Clark’s groff (GNU, 1990) is the usual rendering engine on Linux and shipped with macOS until Ventura. The mdoc macro set (Cynthia Livingston, UC Berkeley CSRG, 1990) replaced the older man macros with semantic markup (.Nm, .Fl, .Ar) and is the BSD/macOS standard. mandoc (Kristaps Dzonsons, OpenBSD) is the modern fast renderer. Heirloom Doctools preserves the classical AT&T/Plan 9 troff implementation with OpenType support. Few formats have survived 50+ years of continuous production use.

  • Pandoc as the universal document interchange. John MacFarlane’s Pandoc (Haskell, 2006) is the documentation world’s ffmpeg: it reads 30+ input formats (Markdown variants, RST, AsciiDoc, DocBook, LaTeX, HTML, .docx, .odt, .epub, .ipynb, MediaWiki, Org, Textile…) into a unified AST, then writes 40+ output formats. It powers Quarto, R bookdown, large parts of academic publishing toolchains, and a great many “convert this Word doc to Markdown” workflows. Lua filters let you transform the AST in flight. The 3.x series (2023 onward) added native Typst output, bringing Typst, LaTeX, and Pandoc into a single-toolchain reproducible-publishing story.
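
As an illustration of the “unified AST” idea in the Pandoc thread, the sketch below walks a small document in Pandoc’s JSON serialization, where each element is an object with a type tag "t" and optional contents "c", and upper-cases every Str node. Pandoc’s Lua filters operate on the same tree natively; this version uses the JSON form only so that it can stay in plain Python with the standard library. The sample document, its API version number, and the helper names walk and shout are illustrative assumptions.

```python
import json


def walk(node, fn):
    """Rebuild the tree depth first, applying fn to every element that has a "t" tag."""
    if isinstance(node, list):
        return [walk(item, fn) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, fn) for key, value in node.items()}
        return fn(rebuilt) if "t" in rebuilt else rebuilt
    return node  # strings, numbers, and None pass through unchanged


def shout(elem):
    """Example transformation: upper-case the text of Str inlines only."""
    if elem["t"] == "Str":
        return {"t": "Str", "c": elem["c"].upper()}
    return elem


# Hand-written sample in the shape of Pandoc's JSON output (illustrative values).
doc = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "Header", "c": [1, ["intro", [], []], [{"t": "Str", "c": "Intro"}]]},
        {"t": "Para", "c": [
            {"t": "Str", "c": "any"},
            {"t": "Space"},
            {"t": "Str", "c": "markup"},
        ]},
    ],
}

print(json.dumps(walk(doc, shout), indent=2))
```

Only Str elements change; the header identifier "intro" is a bare string inside the attribute triple rather than a tagged element, so it is left alone. Used as a real JSON filter, the same walk would read the document from standard input with json.load and write the result to standard output, which is how Pandoc exchanges the tree with external filter programs.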

Citations