Genealogy / Family-History / Heritage-Data DSLs Family Index


type: language-family-index
family: genealogy
languages_catalogued: 20
tags: [language-reference, family-index, genealogy, gedcom, gedcom-x, fhiso, gramps, dna, ancestry, pedigree-numbering]

Genealogy / Family-History — Family Index

Family overview

Genealogy is one of the few computing domains where a single 1984 file format still dominates 42 years later. GEDCOM (GEnealogical Data COMmunication) was published by The Church of Jesus Christ of Latter-day Saints / FamilySearch in 1984 as a line-oriented, level-prefixed, tag-based text format — 0 @I1@ INDI declares an individual at level 0, 1 NAME John /Smith/ adds their name at level 1, 2 GIVN John refines at level 2. The 5.5 lineage (1996), then 5.5.1 (published 1999, finalised 2019 after two decades of de-facto-standard limbo) remained the workhorse import/export format across every genealogy tool — FTM, RootsMagic, Legacy, Gramps, MyHeritage, Ancestry, webtrees, TNG — well into the 2020s. The format’s longevity is half blessing (universal interchange) and half curse (ad-hoc extension mechanism via custom _TAGS, no formal schema, ambiguous date and place semantics, ASCII / ANSEL / UTF-8 encoding chaos).
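
The level-prefixed layout described above can be read with a few lines of code. The following is a minimal sketch in Python, not a conforming parser: the regular expression, the GedcomLine type, and the parse_line / parse_records helpers are illustrative names that do not come from any GEDCOM library, and the pattern ignores continuation lines, character encodings, and the per-version grammar details.

```python
import re
from typing import NamedTuple, Optional

# Simplified pattern for one line: level, optional @XREF@, tag, optional value.
# Assumption: this is an illustrative subset, not the full 5.5.1 or 7.0 grammar.
LINE_RE = re.compile(r"^(\d+) (?:(@[^@ ]+@) )?(\w+)(?: (.*))?$")


class GedcomLine(NamedTuple):
    level: int
    xref: Optional[str]
    tag: str
    value: Optional[str]


def parse_line(line: str) -> GedcomLine:
    """Split one GEDCOM line into its four parts."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value)


def parse_records(text: str) -> list:
    """Nest lines into dicts, using the level number as the tree depth."""
    roots = []
    stack = []  # stack[i] is the currently open node at level i
    for raw in text.splitlines():
        if not raw.strip():
            continue
        line = parse_line(raw)
        if line.level > len(stack):
            raise ValueError(f"level skipped at: {raw!r}")
        node = {"tag": line.tag, "xref": line.xref,
                "value": line.value, "children": []}
        del stack[line.level:]  # close nodes at this level or deeper
        if stack:
            stack[-1]["children"].append(node)
        else:
            roots.append(node)
        stack.append(node)
    return roots
```

Applied to the three example lines above, parse_records returns one INDI record whose NAME child carries a nested GIVN child.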

The modernisation push is FamilySearch GEDCOM 7.0, released 19 May 2021 and now at 7.0.18 (the 7.0.x line saw 7.0.16 in March 2025 and 7.0.17 in February 2026, with 7.1 in working-draft on GitHub). GEDCOM 7 mandates UTF-8 (no more ANSEL), gives every tag a URI for unambiguous extensibility, formalises structured types (DATE, PLACE, AGE), introduces the .gdz zip-package format bundling media alongside the .ged, and is maintained on GitHub with a public changelog. Adoption is real but slow: most consumer software still exports GEDCOM 5.5.1 by default for compatibility, while webtrees, TNG, Family Historian, Heredis, and a growing slice of the ecosystem read both flavours and a much smaller slice fully writes 7.0.
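
As a rough illustration of the .gdz package idea, the sketch below opens a zip archive held in memory and separates the GEDCOM text from the bundled media. It assumes the dataset is stored in an entry named gedcom.ged at the archive root; that entry name should be checked against the GEDCOM 7 specification before relying on it. read_gdz is a hypothetical helper name.

```python
import io
import zipfile

# Assumption: the GEDCOM dataset sits in an entry with this name at the root.
GEDCOM_MEMBER = "gedcom.ged"


def read_gdz(data: bytes) -> tuple:
    """Return (gedcom_text, media_entry_names) from the bytes of a .gdz file."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        # GEDCOM 7 mandates UTF-8; "utf-8-sig" also tolerates a leading BOM.
        text = package.read(GEDCOM_MEMBER).decode("utf-8-sig")
        media = [name for name in package.namelist() if name != GEDCOM_MEMBER]
    return text, media
```

Because the media travel in the same archive as the .ged text, a consumer can resolve file references without a separate transfer step.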

A second modernisation track, GEDCOM-X, was launched by FamilySearch in 2012 as a JSON/XML alternative built around RESTful interface definitions — conceptually cleaner (proper conclusion-vs-evidence model, formal name parts, source references as first-class entities). It never displaced the text-tag GEDCOM in practice. By 2021 FamilySearch had positioned GEDCOM-X for bulk transfer and app-to-app communication, while pushing GEDCOM 7 as the single-researcher interchange standard. The community standards-body alternative, FHISO (Family History Information Standards Organisation, incorporated 2013), pursues the same goal from outside the FamilySearch monopoly: Citation Elements (a modular evidence-citation DSL), and ELF (Extended Legacy Format, a GEDCOM-compatible serialisation with a real extensibility mechanism). FHISO’s pace is slow and committee-driven but it is the only neutral standards venue.

Surrounding the interchange layer is a proprietary file-format zoo: Family Tree Maker .ftm (Ancestry → Software MacKiev, FTM 2024 released May 2025), RootsMagic .rmgc/.rmtree/.rmbackup (SQLite-backed but proprietary schema), Legacy Family Tree, Reunion (Mac), Heredis (French), and the server-side MyHeritage and Ancestry.com TreeShare formats. Gramps XML (.gramps, .gpkg) is the open-source outlier: a gzip-compressed XML superset of GEDCOM that Gramps 6.0.x (2025) recommends as the lossless format. The DNA-testing layer has added a parallel pseudo-DSL since ~2007: 23andMe, AncestryDNA, MyHeritage DNA, and FamilyTreeDNA all export delimited raw-data files (tab-separated or CSV, depending on vendor) containing ~600,000–1,000,000 SNP genotype calls (rsID, chromosome, position, allele1, allele2). GEDmatch is the cross-vendor comparison hub. (23andMe filed Chapter 11 in March 2025; raw-data download remained available through mid-2026, but vendor-lock risk is now actively discussed.) Finally, pedigree numbering systems (Ahnentafel since 1590, Henry 1935, d’Aboville 1940) are not file formats but compact textual notations for encoding ancestor / descendant position, still widely used in printed pedigrees and report generators.

In our deep library

None catalogued. Genealogy DSLs do not have standalone deep-library notes — they are domain formats rather than general-purpose languages.

Cross-reference:

  • citation-formats — FHISO Citation Elements is a sibling of MARC 800 personal-name authority records and the broader bibliographic-citation DSL family; sources/evidence in genealogy overlap heavily with library-science citation models.
  • api-description — GEDCOM-X JSON/XML schemas are described via the same JSON-Schema / XSD machinery used elsewhere; the GEDCOM-X RS specification is REST/HTTP interface-definition territory.
  • notation-spec — pedigree numbering systems (Ahnentafel, Henry, d’Aboville) are compact formal notations, conceptually adjacent to the systematic-naming family.
  • bio-fileformats — consumer-DNA raw data files (23andMe, AncestryDNA) are a low-density sibling of the VCF / PLINK / FASTQ formats used in scientific genomics; the same SNP positions are reported but with vendor-specific manifests.
  • document-typesetting — adjacent for pedigree-report generation pipelines.

Tier 3 family table — Open / interchange standards

Format | First appeared | Origin | Type | Status (2026) | URL
GEDCOM 5.5.1 | 1999 (published) / 2019 (finalised) | FamilySearch / LDS Church | Line-oriented, level-prefixed tag text; ANSEL/UTF-8 | Dominant de-facto interchange; still the default export of most software (webtrees stated target) | https://gedcom.io/specifications/ged551.pdf
FamilySearch GEDCOM 7.0 | 19 May 2021 | FamilySearch | Modernised GEDCOM; UTF-8 mandatory, URIs for tags, structured types, .gdz zip package | Active: current 7.0.18; 7.0.17 (Feb 2026), 7.0.16 (Mar 2025); 7.1 in working draft on GitHub | https://gedcom.io/specifications/FamilySearchGEDCOMv7.html
GEDCOM-X (JSON/XML) | 2012 | FamilySearch | JSON + XML serialisations of an OO-conclusion data model; RESTful interface defs (gedcomx-rs) | Maintained but stalled: never displaced GEDCOM tag format; FamilySearch repositioned it as a bulk-transfer / API model after GEDCOM 7 launched | https://www.familysearch.org/innovate/gedcom-x
FHISO ELF (Extended Legacy Format) | Draft, ongoing | FHISO | GEDCOM-compatible serialisation with structured extensibility | Draft: community standards-body alternative; pre-1.0 | https://github.com/fhiso/legacy-format
FHISO Citation Elements | Draft 2017+, ongoing | FHISO | Modular evidence-citation DSL with concepts + serialisation bindings | Draft: Concepts published Sep 2017; GEDCOM-X RDFa bindings published; GEDCOM 7 binding not yet drafted | https://fhiso.org/TR/cev-concepts-20170911
Gramps XML (.gramps, .gpkg) | 2001 | Gramps open-source project | gzip-compressed XML superset of GEDCOM; .gpkg bundles media | Active: Gramps 6.0.4 released 10 Aug 2025; recommended lossless format | https://www.gramps-project.org/wiki/index.php/Gramps_XML
Wikidata genealogical statements | 2012 | Wikimedia Foundation | RDF property graph; P22 father, P25 mother, P26 spouse, P40 child, P39/P22 chains | Very active: large public-figure pedigree dataset; queryable via SPARQL endpoint | https://www.wikidata.org/wiki/Wikidata:WikiProject_Genealogy
W3C / Schema.org Person | 2011 | W3C / Schema.org consortium | JSON-LD / RDFa vocabulary with parent, spouse, children, relatedTo; used by search-engine schema markup | Active but minimal genealogical depth; oriented at web markup, not full pedigrees | https://schema.org/Person
PROV-O genealogical adaptations | 2013+ | W3C PROV Working Group + academic adapters | OWL ontology for provenance; research papers map family-history evidence chains to PROV-O activities/agents | Niche / research: not a deployed exchange format | https://www.w3.org/TR/prov-o/
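
To make the Schema.org Person row concrete, here is a small Python sketch that emits JSON-LD using only the properties named in the table (parent, spouse, children). The person_jsonld helper and the sample names are hypothetical; the output is meant as web markup, not as a full pedigree exchange format.

```python
import json


def person_jsonld(name, parents=(), spouse=None, children=()):
    """Build a Schema.org Person node as a plain dict (illustrative helper)."""
    def ref(other_name):
        # Related people are embedded as minimal Person nodes.
        return {"@type": "Person", "name": other_name}

    doc = {"@context": "https://schema.org", "@type": "Person", "name": name}
    if parents:
        doc["parent"] = [ref(p) for p in parents]
    if spouse:
        doc["spouse"] = ref(spouse)
    if children:
        doc["children"] = [ref(c) for c in children]
    return doc


# Sample data is invented for the example.
markup = json.dumps(
    person_jsonld("John Smith",
                  parents=["Thomas Smith", "Mary Jones"],
                  children=["Ann Smith"]),
    indent=2,
)
```

The flat parent / children links show why the table calls the genealogical depth minimal: there is no place for evidence, dates of events, or competing conclusions.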

Tier 3 family table — Proprietary / vendor formats

Format | First appeared | Origin | Type | Status (2026) | URL
Family Tree Maker .ftm | 1989 (FTM origin) / 2017 (MacKiev) | Banner Blue → Broderbund → Ancestry.com → Software MacKiev | Proprietary database file; SQLite-based since FTM 2014; round-trips via GEDCOM | Active: FTM 2024 launched 10 May 2025; Ancestry tree sync + FamilySearch hints | https://www.mackiev.com/ftm/
RootsMagic .rmgc / .rmtree / .rmbackup | 2003 (RM origin), .rmtree from RM8 (2021) | RootsMagic Inc. | SQLite-3 backed; proprietary schema; .rmgc legacy, .rmtree current, .rmbackup archives | Active: RM10 reads RM9 .rmtree files; .rmgc from RM4–7 requires explicit import | https://support.rootsmagic.com/hc/en-us/articles/224924947
Legacy Family Tree | 1997 | Millennia Corporation | Proprietary FoxPro/DBF-derived schema; GEDCOM import/export | Active but reduced development velocity; popular in LDS community | https://legacyfamilytree.com/
MyHeritage native (cloud) | 2003 | MyHeritage Ltd. | Server-side proprietary; GEDCOM import/export at user boundary; “Smart Matches” engine | Very active: large commercial subscription service | https://www.myheritage.com/
Ancestry.com TreeShare | ~2014 | Ancestry.com LLC | Proprietary sync protocol between Ancestry online trees and FTM desktop; not a public file format | Active: required for FTM ↔ Ancestry tree sync | https://www.ancestry.com/c/family-tree-help/ancestry-treeshare-for-family-tree-maker
Reunion for Mac | 1990 | Leister Productions | Proprietary Mac-native database; GEDCOM import/export | Active: Reunion 14 current; long-standing Mac-only option | https://www.leisterpro.com/
Heredis | 1994 | BSD Concept (France) | Proprietary French-origin; multi-platform; GEDCOM import/export | Active: strongest in French / European market | https://www.heredis.com/
webtrees | 2010 (fork of PhpGedView) | Greg Roach + community (open source) | PHP / MySQL web app; GEDCOM-first storage; reads 5.5.1 + part of 7.0, exports 5.5.1 | Active: 2.x line current; partial GEDCOM 7 read support via Jefferson49 ExtendedImportExport module | https://www.webtrees.net/
TNG (The Next Generation of Genealogy Sitebuilding) | 2003 | Darrin Lythgoe (commercial) | PHP / MySQL self-hosted web app; GEDCOM 5.5.1 + 7.0 import; one-time licence | Active: recent updates include GEDCOM 7.0 media cropping, mobile responsive layouts | https://tngsitebuilding.com/
GenealogyJ | 2002 | Open-source community (Java) | Java desktop app; GEDCOM 5.5/5.5.1 read/write | Largely dormant: minimal modern activity but binaries still distributed | http://genealogyj.sourceforge.net/
PAF (Personal Ancestral File) | 1984 | FamilySearch / LDS Church | Proprietary DOS / Windows app; GEDCOM 5.5 export | Discontinued 15 July 2013; last update 2002; ~3.2 million copies distributed in its lifetime | https://www.familysearch.org/en/newsroom/personal-ancestral-file-paf-is-discontinued

Tier 3 family table — DNA / genetic ancestry

Format | First appeared | Origin | Type | Status (2026) | URL
23andMe raw data (.txt / .zip) | 2007 | 23andMe Inc. | Tab-separated: rsID, chromosome, position, genotype (allele1+allele2); ~600k–1M SNPs per file | At-risk: 23andMe filed Chapter 11 March 2025; raw download still available mid-2026 but vendor continuity uncertain | https://customercare.23andme.com/hc/en-us/articles/212196868
AncestryDNA raw data | 2012 | Ancestry.com | Tab-separated SNP genotype file; ~700k autosomal positions; download from MyDNA tab | Active: largest consumer database (>20M tested) | https://support.ancestry.com/s/article/Downloading-AncestryDNA-Raw-Data
MyHeritage DNA raw data | 2016 | MyHeritage Ltd. | CSV genotype file; uses Illumina OmniExpress-derived chip | Active | https://www.myheritage.com/
FamilyTreeDNA raw data | 2000 (autosomal from 2010) | Gene by Gene / FamilyTreeDNA | Multiple flavours: Family Finder autosomal CSV, Y-DNA STR/SNP, mtDNA HVR | Active: only mainstream Y-DNA / mtDNA vendor with surname projects | https://www.familytreedna.com/
GEDmatch upload format | 2010 | GEDmatch LLC (acquired by Verogen 2019, then Qiagen) | Accepts normalised raw-data uploads from all five major vendors; cross-vendor SNP matching | Active: central comparison hub; gained notoriety via Golden State Killer (2018) law-enforcement use | https://www.gedmatch.com/
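
The raw-data layouts in this table are simple enough to normalise with a short reader. The Python sketch below accepts either a combined genotype column (four fields) or separate allele columns (five fields). Several details are assumptions rather than vendor documentation: the '#' comment prefix, the "rsid" header detection, the quote stripping for CSV, and the SnpCall / parse_raw_data names. The sample rows used with it should be treated as invented.

```python
from typing import Iterator, NamedTuple


class SnpCall(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    allele1: str
    allele2: str  # empty string when the file reports a single allele


def parse_raw_data(text: str, delimiter: str = "\t") -> Iterator[SnpCall]:
    """Yield one SnpCall per data row of a consumer raw-data export."""
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blanks and comment lines (assumed '#' prefix)
        fields = [f.strip().strip('"') for f in line.split(delimiter)]
        if fields[0].lower() == "rsid":
            continue  # skip a column-header row
        if len(fields) == 4:
            # Combined genotype column such as "AG".
            rsid, chrom, pos, genotype = fields
            allele1, allele2 = genotype[:1], genotype[1:2]
        elif len(fields) == 5:
            # Separate allele1 / allele2 columns.
            rsid, chrom, pos, allele1, allele2 = fields
        else:
            raise ValueError(f"unexpected column count: {line!r}")
        yield SnpCall(rsid, chrom, int(pos), allele1, allele2)
```

Chromosome stays a string because values such as X, Y, and MT are not numeric. Passing delimiter="," covers the CSV flavours.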

Tier 3 family table — Pedigree numbering / notation

Format | First appeared | Origin | Type | Status (2026) | URL
Ahnentafel (Sosa-Stradonitz) | 1590 | Michaël Eytzinger (Cologne), formalised by Sosa (1676), popularised by Stradonitz (1898) | Ascending: subject = 1, father = 2n, mother = 2n+1; binary positional encoding | Universal in genealogy software; default ancestor-numbering scheme | https://en.wikipedia.org/wiki/Ahnentafel
Henry System | 1935 | Reginald Buchanan Henry | Descending: progenitor = 1, oldest child = 11, next = 12, oldest grandchild = 111 | Common in published descendancy reports | https://en.wikipedia.org/wiki/Genealogical_numbering_systems#Henry_System
d’Aboville System | 1940 | Jacques d’Aboville (France) | Descending; dotted-decimal variant of Henry (11 → 1.1, 112 → 1.1.2); disambiguates >9 children | Common in French-language genealogy and many modern software exports | https://en.wikipedia.org/wiki/Genealogical_numbering_systems#d.27Aboville_System
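
The Ahnentafel rule in the table (father = 2n, mother = 2n+1) means the binary digits of a number spell out the path from the subject. A small Python sketch of the conversion in both directions follows; the function names are illustrative.

```python
from typing import List


def ahnentafel_path(n: int) -> List[str]:
    """Return the steps from the subject (number 1) to ancestor n."""
    if n < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    # bin(n) looks like '0b1...'; the leading 1 stands for the subject,
    # then each 0 is a step to a father and each 1 a step to a mother.
    bits = bin(n)[3:]
    return ["father" if bit == "0" else "mother" for bit in bits]


def ahnentafel_number(path: List[str]) -> int:
    """Inverse of ahnentafel_path: apply 2n or 2n+1 for each step."""
    n = 1
    for step in path:
        n = 2 * n + (1 if step == "mother" else 0)
    return n


def generation(n: int) -> int:
    """Generations above the subject (parents are 1, grandparents 2)."""
    return n.bit_length() - 1
```

For example, ahnentafel_path(12) gives mother, father, father (1 → 3 → 6 → 12), so number 12 is the subject's mother's father's father, three generations up.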

Notable threads

  • GEDCOM 5.5.1 persistence vs GEDCOM 7 adoption. Five years after the 7.0 release (May 2021 → mid-2026), GEDCOM 5.5.1 is still the dominant interchange format. Most desktop software (RootsMagic, Legacy, FTM 2024) defaults to 5.5.1 export for compatibility with the long tail of older tools; webtrees explicitly states 5.5.1 as its target and supports only a “very small list” of G7 tags on read. The slow uptake is structural: GEDCOM was always a lowest-common-denominator format and the network effect rewards being the format everyone else can read. GEDCOM 7’s stronger guarantees (UTF-8, URIs, structured DATE/PLACE) only matter when both ends of a transfer support them — until then, 5.5.1 wins by default.

  • GEDCOM-X’s failure to displace the tag format. When FamilySearch launched GEDCOM-X in 2012 the assumption was that a clean JSON/XML model with proper conclusion-vs-evidence separation, RESTful endpoints, and modern serialisation would obviously beat a 1984 line-oriented tag format. It didn’t. Reasons: (a) the installed base of GEDCOM-reading software was overwhelming, (b) GEDCOM-X’s data model was too normalised for casual hobbyist use, (c) the FamilySearch API became the de-facto deployment of GEDCOM-X rather than a community-wide interchange standard, and (d) the GEDCOM 7 modernisation effort siphoned community energy back to the tag format starting in 2019. As of 2026 GEDCOM-X is “maintained” but practically scoped to FamilySearch’s own APIs and bulk-transfer use cases.

  • FHISO as the community standards-body alternative. FHISO was incorporated in 2013 precisely because the genealogy world’s standards process had been owned by a single entity (FamilySearch / LDS Church) for 30 years. FHISO’s deliverables — Citation Elements, ELF (Extended Legacy Format) — explicitly target the gaps GEDCOM left: a structured citation/evidence model, and a real extensibility mechanism (vs GEDCOM’s ad-hoc _TAGS). The pace is slow and committee-driven; Citation Elements Concepts has been at draft since 2017 and the GEDCOM 7 binding is still unwritten. Whether FHISO ever delivers a widely-adopted standard remains genuinely open — but it is the only neutral venue.

  • Consumer-DNA database wars and their raw-data formats. The 2007–2020 boom of consumer DNA testing (23andMe 2007, FamilyTreeDNA autosomal 2010, AncestryDNA 2012, MyHeritage DNA 2016) created a parallel pseudo-DSL: delimited SNP genotype files in vendor-specific layouts (tab-separated for 23andMe and AncestryDNA, CSV for MyHeritage and FamilyTreeDNA). Cross-vendor comparison is handled by GEDmatch (founded 2010, acquired by Verogen in 2019; Verogen was in turn acquired by Qiagen in 2023), which normalises uploads from the major vendors. The 2018 Golden State Killer case used GEDmatch to identify a suspect by genetic genealogy, triggering policy reforms and a contraction of public-database opt-ins. The 23andMe Chapter 11 filing in March 2025 crystallised a long-standing risk: vendor lock-in of irreplaceable genetic data. Raw-data download remained available through mid-2026, but the long-term custody of those genotype files is now an active concern.

  • Proprietary file-format lock-in in genealogy software. FTM (.ftm), RootsMagic (.rmtree / .rmgc / .rmbackup), Legacy, Reunion, Heredis, MyHeritage cloud — each stores the user’s tree in a proprietary schema with GEDCOM as the only export path. The GEDCOM round-trip is lossy: custom fields, source-citation templates, multimedia metadata, place-name research notes, and the tool’s own data structures (Living Privatization rules, hint-source caches) don’t survive. Gramps XML is the open-source alternative — a gzip-compressed XML superset of GEDCOM that Gramps 6.0.x (2025) recommends as the lossless internal format — and the only major non-proprietary native store. The pattern is identical to the early-1990s word-processor zoo: every product had its own .doc/.wpd/.sam/.lwp, and a single lossy interchange format.

  • Pedigree numbering systems as a compact textual notation. Ahnentafel (Eytzinger 1590, popularised by Stradonitz 1898), Henry (1935), and d’Aboville (1940) are not file formats but micro-DSLs for encoding pedigree position in a single number. Ahnentafel is binary-positional (the father of person n is 2n, the mother is 2n+1), so a number like “53” (binary 110101) uniquely identifies “mother’s father’s mother’s father’s mother” relative to the subject: after the leading 1, which stands for the subject, each 0 bit is a step to a father and each 1 bit a step to a mother. This is the same positional-value trick decimal numbers use, applied in base 2. The descending systems (Henry, d’Aboville) encode birth order down the generations: “1.1.2” = “second child of the first child of the progenitor.” All three are still in active use in printed reports, GEDCOM _AHNENTAFEL extensions, and software export options; d’Aboville is preferred when any generation has more than 9 children because the dots disambiguate.
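
The disambiguation point for the descending systems can be shown with a short Python sketch. It models a descendant as a list of birth orders below the progenitor. plain_henry is a deliberately naive digit concatenation used only to expose the collision (published Henry variants have their own conventions for large families, which are not modelled here), and all three function names are illustrative.

```python
from typing import List


def daboville(birth_orders: List[int]) -> str:
    """Dotted form: progenitor is '1', then one field per generation."""
    return ".".join(["1"] + [str(k) for k in birth_orders])


def plain_henry(birth_orders: List[int]) -> str:
    """Naive undotted form: digits are simply concatenated."""
    return "1" + "".join(str(k) for k in birth_orders)


def parse_daboville(label: str) -> List[int]:
    """Recover the birth orders from a dotted label such as '1.3.12'."""
    parts = label.split(".")
    if parts[0] != "1":
        raise ValueError("label must start at the progenitor, '1'")
    return [int(p) for p in parts[1:]]


# Twelfth child of the first child, versus second child of the
# first child of the first child: distinct people, same undotted digits.
a, b = [1, 12], [1, 1, 2]
collision = plain_henry(a) == plain_henry(b)  # both give '1112'
distinct = daboville(a) != daboville(b)       # '1.1.12' vs '1.1.1.2'
```

The dotted form round-trips through parse_daboville for any family size, which the concatenated form cannot do once a birth order reaches two digits.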

Citations