i18n / Localization / Locale-Data DSLs Family Index


type: language-family-index
family: i18n-locale
languages_catalogued: 24
tags: [language-reference, family-index, i18n-locale, cldr, icu, messageformat, gettext, fluent, xliff, tmx, tbx, plurals, bidi]

i18n / Localization / Locale-Data — Family Index

Family overview

Internationalization-and-localization DSLs are the textual languages that describe locale data (how a culture writes dates, sorts strings, pluralises nouns, formats numbers, lays out bidi text), message templates (how a translatable string with placeholders is authored, plural-selected, gender-selected, and rendered), and interchange formats (how translations move between developers, translators, and translation-management systems). The family sits in three concentric rings: a foundational locale-data layer (Unicode + CLDR), a message-formatting layer (gettext, ICU MessageFormat 1.x → MessageFormat 2.0, Mozilla Fluent, Rails I18n YAML, FormatJS), and a platform-specific resource-file zoo (Android strings.xml, iOS Localizable.xcstrings, .NET .resx, Java .properties, Qt .ts).

The foundational layer is dominated by CLDR. The Unicode Common Locale Data Repository, expressed in LDML (UTS #35), is the canonical source of locale data (calendars, plural rules, collation, transforms, currency formatting, and more) for every modern OS, browser, and runtime. CLDR ships twice a year, in spring (March or April) and autumn (October); the current line is CLDR 48 (October 2025, paired with Unicode 17 and ICU 78), with CLDR 49 in General Submission since April 2026. Because every JS Intl, Java ICU, .NET globalisation feature, Windows National Language Support, glibc locale, and macOS/iOS/Android locale traces back to CLDR, version-pinning matters for reproducibility.

The message-formatting layer is in the middle of a generational shift. ICU MessageFormat 1.x, which grew up inside ICU in the early 2000s as a richer sibling of Java’s java.text.MessageFormat and reached JavaScript through FormatJS / intl-messageformat, had real ergonomic problems: nested plurals were near-unreadable, the function vocabulary (number, plural, select, selectordinal) was non-extensible, and translator UX suffered. MessageFormat 2.0 advanced from Final Candidate to Stable in CLDR 47 (March 2025) and is now a stable part of LDML and Unicode’s recommended successor; CLDR 47/48 ship its data, and the Unicode message-format-wg drove formal approval. In parallel, XLIFF 2.2 became an OASIS Specification on March 13, 2025, adding a Plural/Gender/Select Module aligned conceptually with MF2. Mozilla Fluent (FTL) remains the heterodox alternative, used heavily across Firefox and sister projects (60% more Fluent strings landed in Firefox during 2025), with deliberately translator-first ergonomics and asymmetric branching.

The platform zoo is messier and stickier. Apple introduced Localizable.xcstrings (a JSON-based “String Catalog”) in Xcode 15 (2023), unifying the legacy .strings + .stringsdict + XLIFF flow into a single source of truth; Xcode 16/17 (2025–2026) default to xcstrings for new projects. Android remains on res/values/strings.xml for backward-compatibility and tooling lock-in reasons. Java’s .properties is still alive in 2026 despite its long-standing Unicode-escape pathology (non-ASCII text historically had to be written as \uXXXX escapes). .NET’s .resx/.resw continue as ResourceManager’s source format. The interchange layer (XLIFF, TMX, TBX) is the silent plumbing under translation-management systems: XLIFF for in-flight bilingual files, TMX for translation-memory exchange (still on the 2005 1.4b spec, never updated after LISA’s 2011 dissolution), and TBX for terminology (now ISO 30042:2019 / TBX 3.0).
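
The Unicode-escape issue in .properties files is easiest to see with a concrete conversion. The following is a minimal Python sketch of native2ascii-style escaping, assuming the goal is a file that is safe to read as ISO-8859-1; the function name to_properties_escapes is hypothetical, and the sketch deliberately ignores the other .properties escaping rules (for =, :, #, and leading whitespace).

```python
def to_properties_escapes(text: str) -> str:
    """Escape every non-ASCII character as Java-style \\uXXXX sequences.

    Java escapes are UTF-16 code units, so a character outside the BMP
    becomes a surrogate pair (two \\uXXXX escapes), not one long escape.
    """
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)  # plain ASCII passes through unchanged
            continue
        utf16 = ch.encode("utf-16-be")  # 2 bytes, or 4 for a surrogate pair
        for offset in range(0, len(utf16), 2):
            unit = int.from_bytes(utf16[offset:offset + 2], "big")
            out.append(f"\\u{unit:04X}")
    return "".join(out)


if __name__ == "__main__":
    print(to_properties_escapes("greeting=café"))
```

Escaping through UTF-16 code units rather than code points is the design choice that matters here: it is what makes emoji and other supplementary-plane characters survive a round trip through a legacy ISO-8859-1 loader.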

In our deep library

None catalogued as standalone Tier 1/2 notes — i18n DSLs are auxiliary to the host platforms they live on.

Cross-reference:

  • api-description: XLIFF/TMX/TBX are XML schemas with formal XSDs, like SOAP/WSDL/OpenAPI.
  • notation-spec: CLDR plural-rules (“i = 1 and v = 0”) and Unicode bidi controls are small formal languages.
  • document-typesetting: TeX babel/polyglossia packages are the typeset-document parallel of these DSLs.
  • citation-formats: MARC 880 fields handle alternate-graphic-representation for bibliographic localization.
  • javascript: Intl, FormatJS / intl-messageformat, i18next, react-intl, LinguiJS all live here.
  • java: ResourceBundle, java.text.MessageFormat, ICU4J.
  • csharp: .resx/.resw + System.Resources.ResourceManager + System.Globalization.
  • python: gettext stdlib, Babel (Python), flask-babel, django.utils.translation.
  • ruby: Rails I18n YAML.
  • swift: Localizable.xcstrings + String.LocalizationValue.
  • kotlin: Android strings.xml.
  • cpp: ICU4C, gettext, Qt Linguist.

Tier 3 family table — Master locale data

| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| CLDR LDML (Locale Data Markup Language, UTS #35) | 2003 (CLDR 1.0) | Unicode Consortium (Mark Davis et al.) | XML schema describing every locale’s dates, numbers, calendars, collation, plurals, transforms | Very active; current is CLDR 48 (Oct 2025, paired w/ Unicode 17), CLDR 49 in Survey-Tool General Submission since 2026-04-29; twice-yearly cadence | https://cldr.unicode.org/ |
| CLDR Plural Rules DSL | 2007 (CLDR 1.6) | Unicode CLDR-TC | Tiny embedded rule language (i = 1 and v = 0, n % 10 = 0..2); operands n, i, v, w, f, t, c, e | Stable, normative inside LDML Part 3 (Numbers); spec stable across CLDR 47/48 | https://www.unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules |
| CLDR Transforms / ICU rule-based transliteration | 2005 (CLDR 1.3) | Unicode CLDR + ICU | Rule DSL for transliteration (e.g. Cyrl-Latn, Han-Latn, Any-NFC); left-to-right rewrite rules with context | Active; CLDR 48 added Han→Latin and Gujarati→Latin updates for Unicode 17 | https://www.unicode.org/reports/tr35/tr35-general.html#Transforms |
| Unicode Bidi Algorithm controls | 2001 (UAX #9) | Unicode Consortium | Tiny “language” of bidi-control codepoints: LRM, RLM, ALM, LRE/RLE/PDF (deprecated), LRI/RLI/FSI/PDI | Active; UAX #9 updates with each Unicode release (Unicode 17 in Sept 2025) | https://www.unicode.org/reports/tr9/ |

Tier 3 family table — Message format / placeholder DSLs

| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| GNU gettext PO/POT | 1995 | Sun → Free Software Foundation (Ulrich Drepper et al.) | msgid/msgstr text format with Plural-Forms header expression; .po sources, .mo binary, .pot template | Very active; still the de facto UNIX/Linux i18n format; underpins Python, PHP, Perl, Ruby, GNOME, KDE | https://www.gnu.org/software/gettext/manual/gettext.html |
| ICU MessageFormat 1.x | ~2000 (ICU MessageFormat), Java port 1.4 (2002) | IBM ICU project | Brace-delimited DSL with {var, plural, ...}, {var, select, ...}, {var, selectordinal, ...} | Active but superseded; still ubiquitous via Java MessageFormat, ICU4C, FormatJS, intl-messageformat | https://unicode-org.github.io/icu/userguide/format_parse/messages/ |
| MessageFormat 2.0 (LDML Part 9) | Draft 2020+, Stable in CLDR 47 (Mar 2025) | Unicode message-format-wg (Mihai Niță, Addison Phillips, Eemeli Aro et al.) | New syntax with declarations (.input, .local), matchers (.match), extensible function registry; designed for nested plural+gender | Recommended; CLDR-blessed successor to MF1; stable as of LDML 47, refined in LDML 48; implementations in ICU4J 76+, ICU4C 76+, JS reference impl | https://www.unicode.org/reports/tr35/tr35-messageFormat.html |
| Mozilla Fluent (FTL) | 2017 (Project Fluent at Mozilla) | Mozilla (Stas Małolepszy, Zibi Braniecki) | Declarative .ftl files; asymmetric branching via { $var -> ... } selectors; no printf-style; first-class translator workflow | Active; heavy use in Firefox + sister projects; +60% Fluent strings in Firefox during 2025; spec stable, tooling under continued investment via moz-l10n | https://projectfluent.org/ |
| Java MessageFormat (java.text) | JDK 1.1 (1997) | Sun → Oracle / OpenJDK | {0, choice, ...}, {0, number, ...}, {0, date, ...}; limited dialect of ICU MF | Active, in JDK; ICU4J’s com.ibm.icu.text.MessageFormat is the richer alternative | https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/text/MessageFormat.html |
| Rails I18n / Ruby YAML locale | 2008 (Rails 2.2) | David Heinemeier Hansson + Sven Fuchs et al. | YAML files keyed by locale → key tree; interpolation %{name}, :count plural keys via CLDR rules | Very active, default Rails localization layer | https://guides.rubyonrails.org/i18n.html |
| i18next JSON | 2011 | i18next project (Jan Mühlemann) | JSON tree with _plural / _zero / _one / _other keys; or i18next-icu plugin for ICU MF | Very active; dominant in JS/React/Vue/Angular ecosystems alongside FormatJS | https://www.i18next.com/ |
| FormatJS / intl-messageformat | 2014 (Yahoo) | Yahoo → OpenJS Foundation | ICU MessageFormat 1.x parser/runtime + React react-intl bindings; pursuing MF2 alignment | Very active, the canonical strict-ICU JS option | https://formatjs.github.io/ |

Tier 3 family table — Platform-specific resource files

| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| Java .properties ResourceBundle | JDK 1.1 (1997) | Sun → Oracle | key=value ISO-8859-1 (with \uXXXX escapes) or UTF-8 (since JDK 9 default) | Active, ubiquitous on JVM; native2ascii legacy still bites pre-JDK 9 codebases | https://docs.oracle.com/en/java/javase/21/docs/api/java.base/java/util/ResourceBundle.html |
| .NET .resx / .resw | .NET FX 1.0 (2002) | Microsoft | XML key/value with strongly-typed designer codegen (Resources.Designer.cs); .resw is the WinRT/UWP variant | Active, the canonical .NET 8/9 resource format; consumed by System.Resources.ResourceManager | https://learn.microsoft.com/en-us/dotnet/core/extensions/work-with-resx-files-programmatically |
| Android res/values/strings.xml | Android 1.0 (2008) | Google | XML with <string>, <plurals>, <string-array>; per-locale dirs (values-fr/, values-ar/); CDATA + Java/Kotlin String.format placeholders | Very active; backwards-compatibility lock-in keeps it the canonical Android format despite calls for replacement | https://developer.android.com/guide/topics/resources/string-resource |
| iOS legacy .strings + .stringsdict | NeXTSTEP era → Mac OS X 10.0 (2001) | Apple | Plain-text key/value "hello" = "Hola"; plus XML plist .stringsdict for plurals | Legacy; still widely used but Apple is steering everyone to xcstrings; Xcode auto-converts on demand | https://developer.apple.com/documentation/foundation/optimizing-your-app-s-localization-for-language-direction |
| Apple Localizable.xcstrings (String Catalog) | Xcode 15 (2023) | Apple | JSON-based unified format combining .strings + .stringsdict + XLIFF metadata; per-language translation-state tracking; device variations; substitutions | Active and default; Xcode 15+ default for new strings; Xcode 16/17 (2025–2026) tooling matures; “Convert to String Catalog” migrates legacy projects | https://developer.apple.com/documentation/xcode/localizing-and-varying-text-with-a-string-catalog |
| Qt Linguist .ts (XML) + .qm (binary) | Qt 1.x era (~1996), modernised in Qt 4 (2005) | Trolltech → Qt Company | XML translation source (.ts) compiled by lrelease to binary .qm; lupdate extracts from C++/QML | Active, current in Qt 6.7+ (2025–2026); also accepts XLIFF as alternative | https://doc.qt.io/qt-6/linguist-ts-file-format.html |
| Mozilla DTD / .properties / .lang (legacy) | 2000s (XUL Firefox) | Mozilla | Legacy XUL-era DTD entity files plus Java-style .properties; superseded by Fluent inside Firefox | Legacy; most strings migrated to Fluent by 2024; residual files remain in older add-on ecosystems | https://firefox-source-docs.mozilla.org/l10n/fluent/migrations.html |

Tier 3 family table — Translation interchange / TMS

| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| XLIFF 2.2 | XLIFF 1.0 (2002), 2.0 (2014), 2.1 (2018, ISO 21720:2024 in July 2024), 2.2 OASIS Specification 2025-03-13 | OASIS XLIFF TC | XML interchange of bilingual translatable content; modules for glossary, change-tracking, ITS, Plural/Gender/Select (new in 2.2) | Current is 2.2 (March 2025); 2.1 also widely deployed; 1.2 still in legacy tooling | https://docs.oasis-open.org/xliff/xliff-core/v2.2/xliff-core-v2.2-part1.html |
| TMX (Translation Memory eXchange) 1.4b | 1997 (TMX 1.0), 1.4b in 2005 | LISA OSCAR (now defunct) | XML for translation-memory interchange across CAT tools | Frozen at 1.4b; LISA dissolved 2011; TMX 2.0 was drafted but never finalised; still ubiquitous | https://www.gala-global.org/lisa-oscar-standards |
| TBX (TermBase eXchange) / ISO 30042:2019 | TBX 1.0 (2002), ISO 30042:2008, ISO 30042:2019 (TBX 3.0) | LISA → ISO TC 37 | XML for terminology databases with concept-oriented structure | Active; TBX 3.0 / ISO 30042:2019 is current; managed under ISO since LISA’s 2011 dissolution | https://www.iso.org/standard/62510.html |
| iOS XLIFF export (Apple flavour) | Xcode 6 (2014) | Apple | XLIFF 1.2 dialect used by xcodebuild -exportLocalizations | Legacy / declining; being supplanted by xcstrings as the source of truth; Xcode still exports XLIFF for vendor handoff | https://developer.apple.com/documentation/xcode/exporting-localizations |
| MARC 880 alternate-graphic-representation | MARC 21 (1999, building on MARC 1968) | Library of Congress / OCLC | Bibliographic-record subfield 880 holding parallel script versions of fields (e.g. CJK + Latin transliteration) | Active in library cataloguing (OCLC, LC, national libraries); cross-link citation-formats | https://www.loc.gov/marc/bibliographic/bd880.html |

Notable threads

  • CLDR is the silent foundation under everything. Every modern OS (Windows NLS, macOS/iOS, Android, ChromeOS, glibc) bundles CLDR-derived locale data; every browser Intl API, Java/ICU4J, .NET globalisation feature, Python Babel, and PHP Intl extension traces back to CLDR XML. The twice-yearly cadence (spring and autumn) and version-pinning matter for reproducibility: a test suite that asserts a specific date format may break across CLDR versions because, say, French formal-form spacing rules changed. CLDR 48 (Oct 2025, paired with Unicode 17 + ICU 78) is the current stable line, with CLDR 49 in Survey-Tool General Submission since 2026-04-29.

  • MessageFormat 2.0 is the long-awaited fix to MF1’s ergonomic problems. ICU MessageFormat 1.x suffered from three real defects: nested {count, plural, ...} inside {gender, select, ...} was painful to author and untranslatable; the function vocabulary was hard-coded (no third-party {var, currency, ...}); and the translator UX assumed the translator understood ICU syntax. MF2 (LDML Part 9, stable in CLDR 47 March 2025, refined in CLDR 48) introduces declarations (.input, .local), an explicit .match matcher with multi-key keys for combined plural+gender, and an extensible function registry — translators can work with structurally tagged messages rather than nested braces. ICU4J 76+ and ICU4C 76+ ship reference implementations; FormatJS, i18next-icu, and Fluent-adjacent tooling are all moving toward MF2 alignment.

  • Fluent’s translator-first bet on grammatical-gender + asymmetry. Mozilla designed Fluent specifically around the observation that languages do not have isomorphic grammar — a single English source string may need 8 distinct Polish forms (gender × number × case) and zero distinct Japanese forms. Fluent makes asymmetric branching ({ $userGender -> [feminine] ... [masculine] ... *[other] ... }) idiomatic, lets translators add or remove branches without round-tripping with developers, and refuses to be printf-shaped. Adoption is heavy inside Mozilla (Firefox, Thunderbird, MDN, Pontoon) and growing slowly elsewhere; outside the Mozilla orbit, MF2 is the consensus path forward.

  • Apple’s xcstrings (Xcode 15, 2023) is a category move. Before xcstrings, an Apple-platform localisation pipeline meant juggling .strings (regular keys), .stringsdict (plurals only, in plist XML), per-target Info.plist localisation, and XLIFF round-trips with vendors — four format dialects per app. Localizable.xcstrings is a single JSON file with first-class translation-state tracking (new, translated, needs_review, stale), device variations (iphone, ipad, mac, watch), and substitution placeholders. Xcode 15 made it the default for new strings; Xcode 16/17 (2025–2026) matured the tooling. The trade-off: xcstrings is Apple-only with no formal interchange spec, so vendor TMS integrations route through XLIFF export.

  • The XLIFF / TMX / TBX standards are unevenly maintained. XLIFF is the live one: 2.0 in 2014, 2.1 in 2018 (which became ISO 21720:2024 in July 2024), and 2.2, which became an OASIS Specification on 2025-03-13 with a new Plural/Gender/Select Module conceptually aligned with MF2. TMX is frozen at 1.4b (2005): LISA, the original maintainer, dissolved in 2011, TMX 2.0 was drafted but never finalised, and the format is now hosted by GALA under a Creative Commons licence; despite that, every CAT tool still imports/exports TMX 1.4b. TBX transitioned cleanly to ISO TC 37 stewardship and is now ISO 30042:2019 (TBX 3.0).

  • The plural-rules problem nobody escapes. English has 2 plural categories (one, other); Russian has 4 (one, few, many, other); Arabic has 6 (zero, one, two, few, many, other); Welsh has its own quirky set; some languages make no plural distinction at all and use only other (Japanese, Chinese, Korean). CLDR’s plural-rules DSL, small expressions over the operands n, i, v, w, f, t, c, e (absolute value, integer part, count of visible fraction digits, and so on), is a tiny but normative formal language embedded inside LDML Part 3 (Numbers). Most i18n stacks (ICU plural selector, MF2 :number matcher, Fluent’s plural functions, Rails’s :count rule) ultimately defer to CLDR plural categories; gettext’s Plural-Forms header encodes comparable per-language rules as a C-style expression over the integer n. Getting plurals wrong is among the most common visible localization defects.

  • The bidi/RTL story is harder than it looks. Unicode’s bidi algorithm (UAX #9) is itself a small declarative “language”: embedding levels, paragraph direction inference, isolate-vs-embed semantics (the isolates LRI/RLI/FSI/PDI, introduced in Unicode 6.3, are now preferred over the older LRE/RLE/PDF embeddings), and the bidi-control codepoints (LRM, RLM, ALM). Mishandled bidi shows up as garbled phone numbers, broken filenames, mis-aligned Arabic punctuation, and the famous Trojan-Source attack-class (CVE-2021-42574). When user-supplied content mixes scripts, the safe default is to wrap it with FSI…PDI (“first-strong isolate … pop directional isolate”).

  • Why Android stuck with strings.xml in 2026. Backward compatibility and tooling lock-in. Android Studio’s resource manager, Lint rules (MissingTranslation, Typos, ImpliedQuantity), AAPT2 string interning, and the entire Resources.getString() runtime are calibrated for strings.xml + <plurals> + values-<locale>/ directories. Google has discussed a more structured replacement multiple times but has never shipped one — the migration cost across the Play Store is too high. The format limps along with <xliff:g> placeholder annotations as its only real concession to modern i18n thinking.
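
For the MessageFormat 2.0 thread above, the multi-key .match idea can be modelled in a few lines of Python. This is a simplified sketch of variant selection, not the normative LDML algorithm and not any library’s API: select_variant, VARIANTS, and WILDCARD are hypothetical names, the message text is invented, and the plural category is assumed to have been resolved already (to one, other, and so on) by a CLDR-aware library.

```python
WILDCARD = "*"

# Each variant pairs a tuple of keys (plural category, gender) with a pattern.
# Hypothetical example data, loosely mirroring a two-selector .match block.
VARIANTS = [
    (("one", "feminine"), "{name} added a photo to her album."),
    (("one", WILDCARD), "{name} added a photo to their album."),
    ((WILDCARD, "feminine"), "{name} added {count} photos to her album."),
    ((WILDCARD, WILDCARD), "{name} added {count} photos to their album."),
]


def select_variant(variants, selector_values):
    """Return the pattern of the best-matching variant.

    A key matches when it equals the selector value or is the wildcard.
    Among matching variants, a specific key beats the wildcard, and
    earlier selectors take priority over later ones.
    """
    candidates = [
        (keys, pattern)
        for keys, pattern in variants
        if all(k == WILDCARD or k == v for k, v in zip(keys, selector_values))
    ]
    if not candidates:
        raise LookupError("no variant matched; a catch-all (*, *) row is required")
    # False sorts before True, so tuples with fewer leading wildcards win.
    _, pattern = min(candidates, key=lambda kp: tuple(k == WILDCARD for k in kp[0]))
    return pattern


if __name__ == "__main__":
    pattern = select_variant(VARIANTS, ("other", "feminine"))
    print(pattern.format(name="Ana", count=3))
```

The point of the flat key table is that a translator can add or drop rows for their language without restructuring nested braces, which is the ergonomic gain the thread describes.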
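
For the xcstrings thread above, the translation-state tracking lends itself to a simple audit script. The sketch below is in Python and rests on an assumption: the key names used here (sourceLanguage, strings, localizations, stringUnit, state, value) reflect what Xcode-generated catalogs commonly contain, but since no formal interchange spec exists, treat the layout as illustrative. The catalog content and the function name states_by_language are hypothetical.

```python
import json
from collections import Counter

# Hypothetical, hand-written catalog in the assumed String Catalog layout.
CATALOG_JSON = """
{
  "sourceLanguage": "en",
  "version": "1.0",
  "strings": {
    "greeting": {
      "localizations": {
        "fr": {"stringUnit": {"state": "translated", "value": "Bonjour"}},
        "de": {"stringUnit": {"state": "needs_review", "value": "Hallo"}}
      }
    },
    "farewell": {
      "localizations": {
        "fr": {"stringUnit": {"state": "new", "value": ""}}
      }
    }
  }
}
"""


def states_by_language(catalog: dict) -> dict:
    """Count translation states per language for simple string units.

    Entries without a direct stringUnit (for example plural or device
    variations) are skipped in this sketch.
    """
    counts: dict = {}
    for entry in catalog.get("strings", {}).values():
        for language, localization in entry.get("localizations", {}).items():
            unit = localization.get("stringUnit")
            if unit is None:
                continue
            counts.setdefault(language, Counter())[unit.get("state", "new")] += 1
    return counts


if __name__ == "__main__":
    for language, states in states_by_language(json.loads(CATALOG_JSON)).items():
        print(language, dict(states))
```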
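
For the plural-rules thread above, the operand vocabulary becomes clearer with a worked example. The following Python sketch hand-transcribes the CLDR cardinal rules for English and Russian; it is an illustration under stated assumptions, not a replacement for ICU or Babel. The function names are hypothetical, input is assumed to be a plain decimal string, and the compact-exponent operands c and e are ignored.

```python
def plural_operands(number: str) -> dict:
    """Derive the CLDR operands i, v, w, f, t from a decimal string like "1.50"."""
    number = number.lstrip("-")
    integer_part, _, fraction = number.partition(".")
    trimmed = fraction.rstrip("0")
    return {
        "i": int(integer_part or "0"),  # integer digits
        "v": len(fraction),             # visible fraction digits, with trailing zeros
        "w": len(trimmed),              # visible fraction digits, without trailing zeros
        "f": int(fraction or "0"),      # fraction digits as an integer, with trailing zeros
        "t": int(trimmed or "0"),       # fraction digits as an integer, without trailing zeros
    }


def plural_category_en(number: str) -> str:
    """English cardinal rule: one is 'i = 1 and v = 0'; everything else is other."""
    op = plural_operands(number)
    return "one" if op["i"] == 1 and op["v"] == 0 else "other"


def plural_category_ru(number: str) -> str:
    """Russian cardinal rules: one, few, many for integers; other for fractions."""
    op = plural_operands(number)
    i, v = op["i"], op["v"]
    if v != 0:
        return "other"
    if i % 10 == 1 and i % 100 != 11:
        return "one"
    if 2 <= i % 10 <= 4 and not 12 <= i % 100 <= 14:
        return "few"
    return "many"  # i % 10 is 0 or 5..9, or i % 100 is 11..14


if __name__ == "__main__":
    for sample in ["1", "1.0", "3", "11", "21", "1.5"]:
        print(sample, plural_category_en(sample), plural_category_ru(sample))
```

Taking the number as a string rather than a float is deliberate: the v operand depends on how the number is displayed, so "1" and "1.0" select different English categories.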
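
For the bidi thread above, the FSI…PDI wrapping pattern is short enough to show in full. This is a minimal Python sketch with hypothetical helper names (isolate, strip_bidi_controls, greet). Stripping every bidi control from user input before isolating it is a conservative assumption made here to avoid unbalanced controls; it can remove marks such as LRM or RLM that a user typed on purpose, so a real application may want a gentler policy.

```python
import unicodedata

FSI = "\u2068"  # FIRST STRONG ISOLATE
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE

# ALM, LRM, RLM, LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI
BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069"
)


def strip_bidi_controls(text: str) -> str:
    """Remove bidi control codepoints so user text cannot leave controls unbalanced."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


def isolate(text: str) -> str:
    """Wrap text in FSI...PDI so its direction cannot reorder the surrounding text."""
    return f"{FSI}{strip_bidi_controls(text)}{PDI}"


def greet(user_name: str) -> str:
    """Interpolate an untrusted name into a left-to-right template."""
    return f"Hello, {isolate(user_name)}!"


if __name__ == "__main__":
    message = greet("\u05d3\u05e0\u05d9\u05d0\u05dc")  # a Hebrew name
    print([unicodedata.name(ch) for ch in message if ch in BIDI_CONTROLS])
```

FSI is used rather than LRI or RLI because the direction of user-supplied text is not known in advance; the isolate takes its direction from the first strong character it contains.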

Citations