Document / Typesetting Languages Family Index
type: language-family-index
family: document-typesetting
languages_catalogued: 25
tags: [language-reference, family-index, typesetting, documents, markup, publishing, literate-programming]
Family overview
The document-language family begins with Donald Knuth’s TeX (1978), born from his frustration with the typesetting of The Art of Computer Programming. TeX’s box-and-glue model and the Knuth-Plass line-breaking algorithm remain the benchmark for paragraph composition almost fifty years later. LaTeX (Leslie Lamport, 1985) wrapped TeX in a logical-markup macro layer that hid the typesetting plumbing behind \section, \cite, and \label, and became the lingua franca of mathematics, physics, computer science, and academic publishing. ConTeXt (Hans Hagen, mid-1990s) took a different route: a more uniform, document-design-oriented macro package built on the same engine family (now LuaTeX and LuaMetaTeX), favored in print houses and for complex multilingual publications. Modern descendants include SILE (Lua-driven, Unicode-first, ~2014), Patoline (OCaml-based, glyph-precise), and the rising Typst (Rust-based, 2023). Typst in particular has captured significant academic mindshare as a “LaTeX without the pain,” offering Markdown-like syntax, readable error messages, and incremental compilation.
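To make the idea of optimal-fit line breaking concrete, here is a minimal sketch that contrasts a greedy first-fit breaker with a dynamic program that minimizes total squared slack over the whole paragraph. It is a deliberately simplified stand-in, not TeX's algorithm: every character is assumed to be one unit wide, inter-word space is fixed at one unit (no stretchable or shrinkable glue), and there are no penalties or hyphenation. The function names `greedy_lines` and `optimal_lines` are illustrative.

```python
from functools import lru_cache


def greedy_lines(words, width):
    """First-fit: put each word on the current line if it still fits."""
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def optimal_lines(words, width):
    """Choose breaks that minimize the sum of squared trailing slack.

    Assumptions (not TeX's real model): monospaced glyphs, rigid one-unit
    spaces, and a last line that costs nothing however short it is.
    """
    n = len(words)

    @lru_cache(maxsize=None)
    def best(i):
        # Returns (cost, break positions) for laying out words[i:].
        if i == n:
            return 0, ()
        best_cost, best_breaks = float("inf"), ()
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += len(words[j]) + 1
            if length > width:
                break
            line_cost = 0 if j == n - 1 else (width - length) ** 2
            rest_cost, rest_breaks = best(j + 1)
            if line_cost + rest_cost < best_cost:
                best_cost = line_cost + rest_cost
                best_breaks = (j + 1,) + rest_breaks
        return best_cost, best_breaks

    cost, breaks = best(0)
    if cost == float("inf"):
        raise ValueError("a word is wider than the line")
    lines, start = [], 0
    for end in breaks:
        lines.append(" ".join(words[start:end]))
        start = end
    return lines


sample = "aaa bb cc ddddd".split()
greedy_result = greedy_lines(sample, 6)
optimal_result = optimal_lines(sample, 6)
```

On this sample the greedy breaker fills the first line completely and leaves a very short second line, while the dynamic program accepts a looser first line to even out the paragraph. That whole-paragraph trade-off is the core idea; TeX additionally weighs glue stretch and shrink, penalties, and hyphenation.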
The 2000s and 2010s brought a lightweight-markup explosion. John Gruber and Aaron Swartz’s Markdown (2004) was deliberately “wrong by design” (lossy, ambiguous) but won by being readable as plain text. CommonMark (2014) finally resolved the parser ambiguities in a formal spec; GFM layered tables, task lists, and strikethrough on top for GitHub. Parallel tracks split the niche along expressiveness/portability axes: AsciiDoc (Stuart Rackham, 2002, semantically richer than Markdown), reStructuredText (David Goodger, 2002, Python-docs ecosystem), MultiMarkdown, kramdown, and Pandoc Markdown (John MacFarlane’s union-of-features dialect). Pandoc itself became the universal interchange: almost any markup in, almost any format out.
A three-way split runs through the family: layout languages (TeX, Typst, troff) describe glyphs on pages; semantic markup (DocBook, DITA, Texinfo) describes document structure for downstream rendering; lightweight markup (Markdown family, AsciiDoc, RST) splits the difference for human-authored prose. Cross-cutting these is the literate-programming branch, where code blocks are interwoven with prose so that one source yields both program and document: Knuth’s original WEB, Ross Williams’s FunnelWeb, Emacs Org-mode (Carsten Dominik, 2003), Matthew Butterick’s Pollen (Racket-based), and Posit’s Quarto (2022, RMarkdown’s successor). The Unix man-page lineage (troff/nroff/groff/Heirloom troff/mdoc) remains in production after 50+ years, making it one of the most stable document formats ever shipped.
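The "one source, two outputs" idea behind literate programming can be sketched in a few lines. The example below separates prose from named code chunks and then "tangles" the chunks into a program by expanding chunk references. The `<<name>>=` and `@` markers follow the noweb convention and are chosen here purely for illustration; the tools named above each use their own syntax. The sketch assumes chunk references sit alone on a line, that a line containing only `@` ends a chunk, and that references are not cyclic.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+?)>>=\s*$")       # starts a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+?)>>\s*$")   # reference inside a chunk


def parse(source):
    """Split literate source into named code chunks and prose lines."""
    chunks, prose, current = {}, [], None
    for line in source.splitlines():
        match = CHUNK_DEF.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif current is not None and line.strip() == "@":
            current = None  # back to prose
        elif current is not None:
            chunks[current].append(line)
        else:
            prose.append(line)
    return chunks, prose


def tangle(chunks, root="*"):
    """Expand chunk references recursively, keeping their indentation."""
    out = []
    for line in chunks[root]:
        match = CHUNK_REF.match(line)
        if match:
            indent, name = match.groups()
            expanded = tangle(chunks, name).splitlines()
            out.extend(indent + inner for inner in expanded)
        else:
            out.append(line)
    return "\n".join(out)


SOURCE = """\
The program prints a greeting.
<<*>>=
def main():
    <<body>>
main()
@
The body is defined separately, after the prose that motivates it.
<<body>>=
print("hello")
@
"""

chunks, prose = parse(SOURCE)
program = tangle(chunks)
```

Here `program` holds the runnable code in execution order, while `prose` together with the chunks in their written order is what a weave step would typeset. The point of the design is that the author orders material for the reader and the tool reorders it for the compiler.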
In our deep library
None catalogued in the deep library (these are markup/typesetting, not general-purpose programming). Cross-reference music-audio for LilyPond and ABC notation, and scientific for RMarkdown/Quarto literate workflows.
Tier 3 family table
| Language | First appeared | Origin | Output | Status (2026) | URL |
|---|---|---|---|---|---|
| TeX | 1978 | Donald Knuth (Stanford) | DVI / PDF (via pdfTeX, XeTeX, LuaTeX) | Frozen by design (Knuth’s bug-fix-only policy); engine layer thrives via XeTeX/LuaTeX | https://www.tug.org/ |
| LaTeX | 1985 | Leslie Lamport (SRI International) | DVI / PDF; HTML via LaTeXML | Active; LaTeX3 / expl3 modernization; dominant in STEM publishing | https://www.latex-project.org/ |
| ConTeXt | ~1996 | Hans Hagen, Ton Otten (Pragma ADE) | PDF via LuaTeX (MkIV) or LuaMetaTeX (LMTX) | Active; favored for complex layout, multilingual, print production | https://wiki.contextgarden.net/ |
| Typst | 2023 | Martin Haug, Laurenz Mädje (TU Berlin) | PDF / PNG / SVG; web app + open-source CLI | Very active; rapid academic adoption; v0.13 series in 2026 | https://typst.app/ |
| Pollen | 2014 | Matthew Butterick | HTML / PDF and other text-based targets; built on Racket (Scribble-derived reader, X-expressions) | Active but niche; powers Butterick’s Practical Typography and Beautiful Racket | https://docs.racket-lang.org/pollen/ |
| LilyPond | 1996 | Han-Wen Nienhuys, Jan Nieuwenhuizen | PDF / SVG / MIDI; engraved music notation | Active; cross-ref music-audio | https://lilypond.org/ |
| AsciiDoc / Asciidoctor | 2002 (AsciiDoc); 2013 (Asciidoctor Ruby impl) | Stuart Rackham; Dan Allen / OpenDevise | HTML / PDF / DocBook / EPUB / man pages | Active; Eclipse Foundation AsciiDoc Working Group standardizing the language (2024-2026) | https://asciidoctor.org/ |
| reStructuredText (RST) | 2002 | David Goodger (Docutils) | HTML / LaTeX / man / XML; Sphinx pipeline | Active; canonical for Python docs ecosystem (Sphinx, Read the Docs) | https://docutils.sourceforge.io/rst.html |
| Pandoc Markdown | 2006 | John MacFarlane (UC Berkeley) | 40+ formats via Pandoc AST | Very active; effectively the universal-markup superset | https://pandoc.org/MANUAL.html |
| groff / troff / nroff | 1973 (troff, Joe Ossanna); 1990 (groff, James Clark) | Bell Labs; later GNU | PostScript / PDF / terminal text; Unix man pages | Maintained; groff is the standard man-page renderer on most Linux distributions | https://www.gnu.org/software/groff/ |
| mdoc | 1990 | Cynthia Livingston (UC Berkeley CSRG) | Semantic macro set for BSD man pages; rendered via groff/mandoc | Active; default man-page format on FreeBSD, OpenBSD, NetBSD, macOS | https://mandoc.bsd.lv/ |
| CommonMark | 2014 | John MacFarlane, Jeff Atwood, David Greenspan, et al. | HTML; reference C/JS implementations | Active; spec 0.31.2 (2024); foundation for GFM, MDX, many engines | https://commonmark.org/ |
| GitHub Flavored Markdown (GFM) | 2017 (formal spec) | GitHub | HTML; CommonMark + tables, task lists, strikethrough, autolinks | Active; arguably the most-read markup dialect on the planet | https://github.github.com/gfm/ |
| MultiMarkdown (MMD) | 2005 | Fletcher Penney | HTML / LaTeX / OpenDocument / RTF; metadata, tables, footnotes, math | Maintained (MMD 6); influenced Pandoc and CommonMark feature sets | https://fletcher.github.io/MultiMarkdown-6/ |
| kramdown | 2007 | Thomas Leitner | HTML / LaTeX / PDF; default Jekyll engine | Active; powers Jekyll / GitHub Pages alongside CommonMark-GFM | https://kramdown.gettalong.org/ |
| DocBook | 1991 | HaL Computer Systems + O’Reilly; OASIS standard since 1998 | XSL-FO → PDF; HTML; EPUB; semantic XML schema | Maintained; persistent in tech-publishing toolchains (O’Reilly historically, Linux Documentation Project, FreeBSD Handbook) | https://docbook.org/ |
| DITA | 2001 | IBM (originally); OASIS standard since 2005 | HTML / PDF / EPUB via DITA Open Toolkit; topic-based authoring | Active in regulated industries (medical devices, aerospace, software docs); DITA 2.0 (2024) | https://www.dita-ot.org/ |
| Texinfo | 1986 | Richard Stallman, Bob Chassell (FSF) | GNU Info / HTML / PDF / EPUB; one source, many outputs | Active; canonical format for GNU project documentation (Emacs, GCC, Bash) | https://www.gnu.org/software/texinfo/ |
| Quarto | 2022 | Posit (formerly RStudio); JJ Allaire, Carlos Scheidegger, et al. | HTML / PDF / Word / EPUB / slides; executes R, Python, Julia, Observable | Very active; the de-facto successor to RMarkdown for reproducible publishing | https://quarto.org/ |
| SILE | 2014 | Simon Cozens | PDF; Lua-scripted typesetter using TeX line-breaking + HarfBuzz | Active; v0.15 (2024); Unicode/OpenType-first alternative to TeX | https://sile-typesetter.org/ |
| Patoline | ~2012 | Pierre-Étienne Meunier, Christophe Raffalli (Univ. Savoie) | PDF / SVG; OCaml-based; programmable typesetter | Mostly dormant; small academic following | http://patoline.org/ |
| Lout | 1991 | Jeffrey Kingston (Univ. Sydney) | PostScript / PDF; functional language, no TeX dependency | Maintained but niche; small user base | https://github.com/william8000/lout |
| Heirloom troff (Heirloom Doctools) | 2005 | Gunnar Ritter; revived from Solaris/Plan 9 sources; later maintained by Carsten Kunze | PostScript / PDF; high-fidelity classical troff lineage | Maintained; preferred for traditional troff workflows wanting OpenType + classical algorithms | https://n-t-roff.github.io/heirloom/doctools.html |
| FunnelWeb | 1986 | Ross Williams (Univ. Adelaide) | Tangled source + woven typeset doc; language-agnostic literate programming | Historical; v3.2 (1999) is the last release; cited in literate-programming surveys | http://www.ross.net/funnelweb/ |
| Org-mode | 2003 | Carsten Dominik (Emacs) | HTML / LaTeX / PDF / ODT / Beamer / Reveal.js; Babel for code execution | Very active; document + literate-programming + agenda + outliner | https://orgmode.org/ |
Notable threads
- Knuth’s TeX vs Typst’s modern challenge. TeX’s box-and-glue typesetting and the Knuth-Plass optimal-fit line breaker remain the benchmark for fine paragraph composition (justified text, mathematical layout). Knuth famously declared TeX “frozen”: only bug fixes, with the version number asymptotically approaching π. The macro layer (LaTeX) and engine layer (pdfTeX → XeTeX → LuaTeX) absorb innovation while the core stays stable. Typst (Berlin, 2023) bet that academic users would trade some typographic perfection for sub-second incremental compiles, helpful error messages, a familiar Markdown-like syntax, and a real module/package system. By 2026 it has been adopted in undergraduate STEM courses and is appearing in journal templates, though LaTeX still dominates wherever publishers require their own style files.
- Why Markdown won despite being “wrong by design.” Gruber and Swartz’s Markdown was a pragmatic, lossy syntax with no formal grammar, so every implementation parsed it slightly differently. CommonMark (MacFarlane, Atwood, Greenspan, 2014) finally produced a precise spec, and dialects built on it followed: GFM (tables, task lists), MDX (JSX inside Markdown), and MyST (RST/Sphinx interop). Pandoc Markdown, the kitchen-sink dialect, predates CommonMark and evolved alongside it. The win condition was that Markdown reads as plain text: a README.md is legible both as source and as rendered HTML, which made it the default for git-hosted documentation, chat platforms with Markdown-like formatting (Slack, Discord, IRC variants), and static-site generators. By rough estimate, GitHub alone hosts on the order of 100 million README.md files.
- The literate-programming revival. Knuth’s WEB and Ross Williams’s FunnelWeb mostly faded by the mid-1990s, but the idea returned via several vectors: Emacs Org-mode (Babel blocks executing 80+ languages with results inline), Pollen (Butterick’s Racket-based system for book-length programmable prose), RMarkdown → Quarto (Posit’s reproducible-research pipeline executing R, Python, Julia, and Observable in one document), and the adjacent Jupyter notebook ecosystem. Quarto in particular gives a Pandoc-based publishing path where one source produces journal PDFs, websites, slides, and books with executed analyses, a version of Knuth’s vision realized via Pandoc’s AST.
- DocBook / DITA enterprise persistence. While developer docs migrated to Markdown and AsciiDoc, regulated-industry documentation (medical device IFUs, aerospace technical orders, automotive service manuals, large-scale software user guides) stayed on DocBook (semantic, schema-validated, single-source-multi-output) and DITA (topic-oriented, with content reuse via conrefs and conditional processing). DITA 2.0 (2024) modernized the architecture; the DITA Open Toolkit remains the canonical processor. XML editors and component content management systems (oXygen, IXIASOFT, Heretto, Paligo) build entire commercial product lines around these XML formats.
- The man-page lineage: troff → groff → mdoc. Ossanna’s troff (1973), which drove a phototypesetter, and its terminal-output cousin nroff are among the oldest document formats still in use. James Clark’s groff (GNU, 1990) is the rendering engine on most Linux systems. The mdoc macro set (Cynthia Livingston, UC Berkeley CSRG, 1990) replaced the older man macros with semantic markup (.Nm, .Fl, .Ar) and is the BSD/macOS standard. mandoc (Kristaps Dzonsons, OpenBSD) is the modern fast renderer. Heirloom Doctools preserves the classical AT&T troff lineage (via Solaris and Plan 9 sources) with OpenType support. Few formats have survived 50+ years of continuous production use.
- Pandoc as the universal document interchange. John MacFarlane’s Pandoc (Haskell, 2006) is the documentation world’s ffmpeg: it reads 30+ input formats (Markdown variants, RST, AsciiDoc, DocBook, LaTeX, HTML, .docx, .odt, .epub, .ipynb, MediaWiki, Org, Textile…) into a unified AST, then writes 40+ output formats. It powers Quarto, R bookdown, large parts of academic publishing toolchains, and a large share of “convert this Word doc to Markdown” workflows. Lua filters let you transform the AST in flight. The 3.x series (2023 onward) added native Typst output, so Typst, LaTeX, and Pandoc can sit together in a single reproducible-publishing toolchain.
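As a rough illustration of transforming the AST in flight, the sketch below walks a Pandoc document in its JSON form and rewrites every emphasis node as strong emphasis. It uses Pandoc's JSON filter route with only the Python standard library instead of a Lua filter, since JSON filters can be written in any language. The helper names (`walk`, `emph_to_strong`, `run_filter`) and the sample document, including its API version number, are illustrative assumptions; only the `{"t": ..., "c": ...}` element shape is taken from Pandoc's JSON serialization.

```python
import json
import sys


def walk(node, action):
    """Rebuild the tree bottom-up, applying `action` to each tagged element."""
    if isinstance(node, list):
        return [walk(item, action) for item in node]
    if isinstance(node, dict):
        rebuilt = {key: walk(value, action) for key, value in node.items()}
        # Pandoc elements carry a "t" (type) key; other dicts pass through.
        return action(rebuilt) if "t" in rebuilt else rebuilt
    return node


def emph_to_strong(element):
    """Example action: turn emphasis into strong emphasis."""
    if element["t"] == "Emph":
        return {"t": "Strong", "c": element["c"]}
    return element


def run_filter(document, action):
    """Return a new document whose blocks have been transformed."""
    return {**document, "blocks": walk(document["blocks"], action)}


def main():
    # As a JSON filter: read the AST on stdin, write the new AST to stdout.
    json.dump(run_filter(json.load(sys.stdin), emph_to_strong), sys.stdout)


# Hand-written sample AST for "Hello *world*" (version number illustrative).
SAMPLE = {
    "pandoc-api-version": [1, 23, 1],
    "meta": {},
    "blocks": [
        {"t": "Para", "c": [
            {"t": "Str", "c": "Hello"},
            {"t": "Space"},
            {"t": "Emph", "c": [{"t": "Str", "c": "world"}]},
        ]}
    ],
}

transformed = run_filter(SAMPLE, emph_to_strong)
```

To use this as a real filter, one would call `main()` at the bottom of the script, drop the sample, and pass the script to Pandoc's `--filter` option. Because the transformation sees only the AST, the same filter applies regardless of which input or output format is in play, which is what makes the reader → AST → writer design so reusable.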