Genealogy / Family-History / Heritage-Data DSLs Family Index
type: language-family-index family: genealogy languages_catalogued: 20 tags: [language-reference, family-index, genealogy, gedcom, gedcom-x, fhiso, gramps, dna, ancestry, pedigree-numbering]
Genealogy / Family-History — Family Index
Family overview
Genealogy is one of the few computing domains where a single 1984 file format still dominates 42 years later. GEDCOM (GEnealogical Data COMmunication) was published by The Church of Jesus Christ of Latter-day Saints / FamilySearch in 1984 as a line-oriented, level-prefixed, tag-based text format — 0 @I1@ INDI declares an individual at level 0, 1 NAME John /Smith/ adds their name at level 1, 2 GIVN John refines at level 2. The 5.5 lineage (1996), then 5.5.1 (published 1999, finalised 2019 after two decades of de-facto-standard limbo) remained the workhorse import/export format across every genealogy tool — FTM, RootsMagic, Legacy, Gramps, MyHeritage, Ancestry, webtrees, TNG — well into the 2020s. The format’s longevity is half blessing (universal interchange) and half curse (ad-hoc extension mechanism via custom _TAGS, no formal schema, ambiguous date and place semantics, ASCII / ANSEL / UTF-8 encoding chaos).
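The level / cross-reference / tag / value layout described above can be tokenised with a single regular expression. The following is a minimal sketch rather than a conforming GEDCOM reader: the names `GedcomLine` and `parse_line` are hypothetical, and the code assumes well-formed lines, with no encoding detection and no CONT/CONC continuation handling.

```python
import re
from typing import NamedTuple, Optional


class GedcomLine(NamedTuple):
    level: int             # nesting depth: 0 for a record, 1+ for substructures
    xref: Optional[str]    # cross-reference id such as "@I1@", if present
    tag: str               # e.g. INDI, NAME, GIVN, or a custom _TAG
    value: Optional[str]   # the rest of the line, if any


# level, optional @XREF@, tag, optional value, separated by single spaces
_LINE_RE = re.compile(r"^(\d+) (?:(@[^@ ]+@) )?([A-Za-z0-9_]+)(?: (.*))?$")


def parse_line(line: str) -> GedcomLine:
    """Split one GEDCOM line into its four parts."""
    match = _LINE_RE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"not a GEDCOM line: {line!r}")
    level, xref, tag, value = match.groups()
    return GedcomLine(int(level), xref, tag, value)


sample = ["0 @I1@ INDI", "1 NAME John /Smith/", "2 GIVN John"]
for parsed in map(parse_line, sample):
    print(parsed)
```

Rebuilding the record tree from these tuples is then a matter of tracking the current level: a line at level n attaches to the most recent line at level n-1.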
The modernisation push is FamilySearch GEDCOM 7.0, released 19 May 2021 and now at 7.0.18 (the 7.0.x line saw 7.0.16 in March 2025 and 7.0.17 in February 2026, with 7.1 in working-draft on GitHub). GEDCOM 7 mandates UTF-8 (no more ANSEL), gives every tag a URI for unambiguous extensibility, formalises structured types (DATE, PLACE, AGE), introduces the .gdz zip-package format bundling media alongside the .ged, and is maintained on GitHub with a public changelog. Adoption is real but slow: most consumer software still exports GEDCOM 5.5.1 by default for compatibility, while webtrees, TNG, Family Historian, Heredis, and a growing slice of the ecosystem read both flavours and a much smaller slice fully writes 7.0.
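Because a .gdz package is an ordinary zip archive, it can be opened with standard zip tooling. The sketch below builds a tiny package in memory and reads the dataset back. It assumes the dataset is stored under the entry name `gedcom.ged`; that name, the `media/portrait.jpg` entry, and the helper names `build_demo_gdz` and `read_gdz_dataset` are illustrative assumptions that should be checked against the GEDCOM 7 specification before use.

```python
import io
import zipfile

# Assumption: the GEDCOM dataset lives at this entry name inside the package.
DATASET_NAME = "gedcom.ged"


def build_demo_gdz() -> bytes:
    """Create a minimal in-memory package: one dataset plus one media file."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(DATASET_NAME, "0 HEAD\n1 GEDC\n2 VERS 7.0\n0 TRLR\n")
        archive.writestr("media/portrait.jpg", b"placeholder bytes")
    return buffer.getvalue()


def read_gdz_dataset(gdz_bytes: bytes) -> str:
    """Return the GEDCOM text from a package, decoded as UTF-8."""
    with zipfile.ZipFile(io.BytesIO(gdz_bytes)) as archive:
        # utf-8-sig tolerates an optional byte-order mark at the start
        return archive.read(DATASET_NAME).decode("utf-8-sig")


text = read_gdz_dataset(build_demo_gdz())
print(text.splitlines()[0])
```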
A second modernisation track, GEDCOM-X, was launched by FamilySearch in 2012 as a JSON/XML alternative built around RESTful interface definitions — conceptually cleaner (proper conclusion-vs-evidence model, formal name parts, source references as first-class entities). It never displaced the text-tag GEDCOM in practice. By 2021 FamilySearch had positioned GEDCOM-X for bulk transfer and app-to-app communication, while pushing GEDCOM 7 as the single-researcher interchange standard. The community standards-body alternative, FHISO (Family History Information Standards Organisation, incorporated 2013), pursues the same goal from outside the FamilySearch monopoly: Citation Elements (a modular evidence-citation DSL), and ELF (Extended Legacy Format, a GEDCOM-compatible serialisation with a real extensibility mechanism). FHISO’s pace is slow and committee-driven but it is the only neutral standards venue.
Surrounding the interchange layer is a proprietary file-format zoo: Family Tree Maker .ftm (Ancestry → Software MacKiev, FTM 2024 released May 2025), RootsMagic .rmgc/.rmtree/.rmbackup (SQLite-backed but proprietary schema), Legacy Family Tree, Reunion (Mac), Heredis (French), and MyHeritage/Ancestry.com TreeShare server-side formats. Gramps XML (.gramps, .gpkg) is the open-source outlier — a gzip-compressed XML superset of GEDCOM that Gramps 6.0.x (2025) recommends as the lossless format. The DNA-testing layer added a parallel pseudo-DSL since ~2007: 23andMe, AncestryDNA, MyHeritage DNA, FamilyTreeDNA all export tab-separated raw-data files containing ~600,000–1,000,000 SNP genotype calls (rsID, chromosome, position, allele1, allele2). GEDmatch is the cross-vendor comparison hub. (23andMe filed Chapter 11 in March 2025; raw-data download remained available through mid-2026 but vendor-lock risk is now actively discussed.) Finally, pedigree numbering systems (Ahnentafel since 1590, Henry 1935, d’Aboville 1940) are not file formats but compact textual notations for encoding ancestor / descendant position — still widely used in printed pedigrees and report generators.
In our deep library
None catalogued. Genealogy DSLs do not have standalone deep-library notes — they are domain formats rather than general-purpose languages.
Cross-reference:
- citation-formats — FHISO Citation Elements is a sibling of MARC 800 personal-name authority records and the broader bibliographic-citation DSL family; sources/evidence in genealogy overlap heavily with library-science citation models.
- api-description — GEDCOM-X JSON/XML schemas are described via the same JSON-Schema / XSD machinery used elsewhere; the GEDCOM-X RS specification is REST/HTTP interface-definition territory.
- notation-spec — pedigree numbering systems (Ahnentafel, Henry, d’Aboville) are compact formal notations, conceptually adjacent to the systematic-naming family.
- bio-fileformats — consumer-DNA raw data files (23andMe, AncestryDNA) are a low-density sibling of the VCF / PLINK / FASTQ formats used in scientific genomics; the same SNP positions are reported but with vendor-specific manifests.
- document-typesetting — adjacent for pedigree-report generation pipelines.
Tier 3 family table — Open / interchange standards
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| GEDCOM 5.5.1 | 1999 (published) / 2019 (finalised) | FamilySearch / LDS Church | Line-oriented, level-prefixed tag text; ANSEL/UTF-8 | Dominant de-facto interchange; still the default export of most software (webtrees stated target) | https://gedcom.io/specifications/ged551.pdf |
| FamilySearch GEDCOM 7.0 | 19 May 2021 | FamilySearch | Modernised GEDCOM; UTF-8 mandatory, URIs for tags, structured types, .gdz zip package | Active — current 7.0.18; 7.0.17 (Feb 2026), 7.0.16 (Mar 2025); 7.1 in working draft on GitHub | https://gedcom.io/specifications/FamilySearchGEDCOMv7.html |
| GEDCOM-X (JSON/XML) | 2012 | FamilySearch | JSON + XML serialisations of an OO-conclusion data model; RESTful interface defs (gedcomx-rs) | Maintained but stalled — never displaced GEDCOM tag format; FamilySearch repositioned it as a bulk-transfer / API model after GEDCOM 7 launched | https://www.familysearch.org/innovate/gedcom-x |
| FHISO ELF (Extended Legacy Format) | Draft, ongoing | FHISO | GEDCOM-compatible serialisation with structured extensibility | Draft — community standards-body alternative; pre-1.0 | https://github.com/fhiso/legacy-format |
| FHISO Citation Elements | Draft 2017+, ongoing | FHISO | Modular evidence-citation DSL with concepts + serialisation bindings | Draft — Concepts published Sep 2017; GEDCOM-X RDFa bindings published; GEDCOM 7 binding not yet drafted | https://fhiso.org/TR/cev-concepts-20170911 |
| Gramps XML (.gramps, .gpkg) | 2001 | Gramps open-source project | gzip-compressed XML superset of GEDCOM; .gpkg bundles media | Active — Gramps 6.0.4 released 10 Aug 2025; recommended lossless format | https://www.gramps-project.org/wiki/index.php/Gramps_XML |
| Wikidata genealogical statements | 2012 | Wikimedia Foundation | RDF property graph; P22 father, P25 mother, P26 spouse, P40 child, P40/P22 chains | Very active — large public-figure pedigree dataset; queryable via SPARQL endpoint | https://www.wikidata.org/wiki/Wikidata:WikiProject_Genealogy |
| W3C / Schema.org Person | 2011 | W3C / Schema.org consortium | JSON-LD / RDFa vocabulary with parent, spouse, children, relatedTo; used by search-engine schema markup | Active but minimal genealogical depth; oriented at web markup, not full pedigrees | https://schema.org/Person |
| PROV-O genealogical adaptations | 2013+ | W3C PROV Working Group + academic adapters | OWL ontology for provenance; research papers map family-history evidence chains to PROV-O activities/agents | Niche / research — not a deployed exchange format | https://www.w3.org/TR/prov-o/ |
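The father (P22) and mother (P25) properties listed for Wikidata can be chained with a SPARQL property path to walk a pedigree upward. The sketch below only builds the query text and a request URL; it does not contact the network. The helper names `ancestor_query` and `request_url` are hypothetical, `Q42` is just an example item id, and the public query service address is assumed to be `https://query.wikidata.org/sparql`.

```python
from urllib.parse import urlencode

# Assumption: address of the public Wikidata SPARQL query service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def ancestor_query(qid: str, limit: int = 50) -> str:
    """Build a SPARQL query listing ancestors reachable via father/mother links."""
    if not (qid.startswith("Q") and qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item id: {qid!r}")
    return (
        "SELECT DISTINCT ?ancestor ?ancestorLabel WHERE {\n"
        # one or more father (P22) or mother (P25) steps from the subject
        f"  wd:{qid} (wdt:P22|wdt:P25)+ ?ancestor .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def request_url(qid: str) -> str:
    """URL that would return the query results as JSON if fetched."""
    params = urlencode({"query": ancestor_query(qid), "format": "json"})
    return f"{WDQS_ENDPOINT}?{params}"


print(ancestor_query("Q42"))
```

Swapping the path for `wdt:P40+` would walk downward through children instead.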
Tier 3 family table — Proprietary / vendor formats
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| Family Tree Maker .ftm | 1989 (FTM origin) / 2017 (MacKiev) | Banner Blue → Broderbund → Ancestry.com → Software MacKiev | Proprietary database file; SQLite-based since FTM 2014; round-trips via GEDCOM | Active — FTM 2024 launched 10 May 2025; Ancestry tree sync + FamilySearch hints | https://www.mackiev.com/ftm/ |
| RootsMagic .rmgc / .rmtree / .rmbackup | 2003 (RM origin), .rmtree from RM9 (2022) | RootsMagic Inc. | SQLite-3 backed; proprietary schema; .rmgc legacy, .rmtree current, .rmbackup archives | Active — RM10 reads RM9 .rmtree files; .rmgc from RM4–7 requires explicit import | https://support.rootsmagic.com/hc/en-us/articles/224924947 |
| Legacy Family Tree | 1997 | Millennia Corporation | Proprietary FoxPro/DBF-derived schema; GEDCOM import/export | Active but reduced development velocity; popular in LDS community | https://legacyfamilytree.com/ |
| MyHeritage native (cloud) | 2003 | MyHeritage Ltd. | Server-side proprietary; GEDCOM import/export at user boundary; “Smart Matches” engine | Very active — large commercial subscription service | https://www.myheritage.com/ |
| Ancestry.com TreeShare | ~2014 | Ancestry.com LLC | Proprietary sync protocol between Ancestry online trees and FTM desktop; not a public file format | Active — required for FTM ↔ Ancestry tree sync | https://www.ancestry.com/c/family-tree-help/ancestry-treeshare-for-family-tree-maker |
| Reunion for Mac | 1990 | Leister Productions | Proprietary Mac-native database; GEDCOM import/export | Active — Reunion 14 current; long-standing Mac-only option | https://www.leisterpro.com/ |
| Heredis | 1994 | BSD Concept (France) | Proprietary French-origin; multi-platform; GEDCOM import/export | Active — strongest in French / European market | https://www.heredis.com/ |
| webtrees | 2010 (fork of PhpGedView) | Greg Roach + community (open source) | PHP / MySQL web app; GEDCOM-first storage; reads 5.5.1 + much of 7.0, exports 5.5.1 | Active — 2.x line current; partial GEDCOM 7 read support via Jefferson49 ExtendedImportExport module | https://www.webtrees.net/ |
| TNG (The Next Generation of Genealogy Sitebuilding) | 2003 | Darrin Lythgoe (commercial) | PHP / MySQL self-hosted web app; GEDCOM 5.5.1 + 7.0 import; one-time licence | Active — recent updates include GEDCOM 7.0 media cropping, mobile responsive layouts | https://tngsitebuilding.com/ |
| GenealogyJ | 2002 | Open-source community (Java) | Java desktop app; GEDCOM 5.5/5.5.1 read/write | Largely dormant — minimal modern activity but binaries still distributed | http://genealogyj.sourceforge.net/ |
| PAF (Personal Ancestral File) | 1984 | FamilySearch / LDS Church | Proprietary DOS / Windows app; GEDCOM 5.5 export | Discontinued 15 July 2013; last update 2002; ~3.2 million copies distributed in its lifetime | https://www.familysearch.org/en/newsroom/personal-ancestral-file-paf-is-discontinued |
Tier 3 family table — DNA / genetic ancestry
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| 23andMe raw data (.txt / .zip) | 2007 | 23andMe Inc. | Tab-separated: rsID, chromosome, position, genotype (allele1+allele2); ~600k–1M SNPs per file | At-risk — 23andMe filed Chapter 11 March 2025; raw download still available mid-2026 but vendor continuity uncertain | https://customercare.23andme.com/hc/en-us/articles/212196868 |
| AncestryDNA raw data | 2012 | Ancestry.com | Tab-separated SNP genotype file; ~700k autosomal positions; download from MyDNA tab | Active — largest consumer database (>20M tested) | https://support.ancestry.com/s/article/Downloading-AncestryDNA-Raw-Data |
| MyHeritage DNA raw data | 2016 | MyHeritage Ltd. | CSV genotype file; uses Illumina OmniExpress-derived chip | Active | https://www.myheritage.com/ |
| FamilyTreeDNA raw data | 2000 (autosomal from 2010) | Gene by Gene / FamilyTreeDNA | Multiple flavours: Family Finder autosomal CSV, Y-DNA STR/SNP, mtDNA HVR | Active — only mainstream Y-DNA / mtDNA vendor with surname projects | https://www.familytreedna.com/ |
| GEDmatch upload format | 2010 | GEDmatch LLC (acquired by Verogen 2019, then Qiagen) | Accepts normalised raw-data uploads from all five major vendors; cross-vendor SNP matching | Active — central comparison hub; gained notoriety via Golden State Killer (2018) law-enforcement use | https://www.gedmatch.com/ |
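The tab-separated raw-data layout in the table above (rsID, chromosome, position, genotype) is easy to read with a standard CSV reader. The sketch below assumes a four-column file with `#` comment lines and a combined two-letter genotype; real vendor files differ in headers and column counts, the names `SnpCall` and `read_raw_calls` are hypothetical, and every value in `SAMPLE` is made up for illustration.

```python
import csv
from typing import Iterable, Iterator, NamedTuple


class SnpCall(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    allele1: str
    allele2: str   # empty for single-allele (haploid) calls


# Made-up illustrative data; not taken from any real genotype file.
SAMPLE = "\n".join([
    "# rsid\tchromosome\tposition\tgenotype",
    "rs0000001\t1\t1000\tAG",
    "rs0000002\tX\t2000\tC",
    "rs0000003\tMT\t3000\t--",
])


def read_raw_calls(lines: Iterable[str]) -> Iterator[SnpCall]:
    """Yield one SnpCall per data row, skipping comments and blank lines."""
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    for rsid, chromosome, position, genotype in csv.reader(data_lines, delimiter="\t"):
        # split the combined genotype into its two alleles
        yield SnpCall(rsid, chromosome, int(position), genotype[:1], genotype[1:])


calls = list(read_raw_calls(SAMPLE.splitlines()))
print(calls[0])
```

A cross-vendor comparison would first map each vendor's layout onto a common record like `SnpCall` and then join on rsID or on chromosome and position.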
Tier 3 family table — Pedigree numbering / notation
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| Ahnentafel (Sosa-Stradonitz) | 1590 | Michaël Eytzinger (Cologne), formalised by Sosa (1676), popularised by Stradonitz (1898) | Ascending: subject = 1, father = 2n, mother = 2n+1; binary positional encoding | Universal in genealogy software; default ancestor-numbering scheme | https://en.wikipedia.org/wiki/Ahnentafel |
| Henry System | 1935 | Reginald Buchanan Henry | Descending: progenitor = 1, oldest child = 11, next = 12, oldest grandchild = 111 | Common in published descendancy reports | https://en.wikipedia.org/wiki/Genealogical_numbering_systems#Henry_System |
| d’Aboville System | 1940 | Jacques d’Aboville (France) | Descending; dotted-decimal variant of Henry (11 → 1.1, 112 → 1.1.2) — disambiguates >9 children | Common in French-language genealogy and many modern software exports | https://en.wikipedia.org/wiki/Genealogical_numbering_systems#d.27Aboville_System |
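The Henry and d'Aboville rows above describe the same descending numbering with different separators, so converting between them is mechanical. The sketch below covers only families with at most nine children per generation, where each Henry digit is one birth order; Henry notation needs extra symbols beyond that, which this code deliberately rejects. The function names are hypothetical.

```python
from typing import List


def henry_to_daboville(henry: str) -> str:
    """'112' -> '1.1.2' (each digit is one generation's birth order)."""
    if not henry.isdigit() or "0" in henry:
        raise ValueError(f"expected digits 1-9 only: {henry!r}")
    return ".".join(henry)


def daboville_to_henry(daboville: str) -> str:
    """'1.1.2' -> '112'; refuses birth orders above 9."""
    parts = daboville.split(".")
    if any(not p.isdigit() or int(p) == 0 for p in parts):
        raise ValueError(f"malformed d'Aboville number: {daboville!r}")
    if any(int(p) > 9 for p in parts):
        raise ValueError("plain Henry digits cannot express a 10th or later child")
    return "".join(parts)


def birth_orders(daboville: str) -> List[int]:
    """Birth order at each generation below the progenitor."""
    return [int(p) for p in daboville.split(".")[1:]]


print(henry_to_daboville("112"))   # 1.1.2
print(birth_orders("1.1.2"))       # [1, 2]: second child of the first child
```

The rejected case shows why the dotted form scales better: `1.12.3` is unambiguous in d'Aboville, while the digit string `1123` could be read several ways.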
Notable threads
- GEDCOM 5.5.1 persistence vs GEDCOM 7 adoption. Five years after the 7.0 release (May 2021 → mid-2026), GEDCOM 5.5.1 is still the dominant interchange format. Most desktop software (RootsMagic, Legacy, FTM 2024) defaults to 5.5.1 export for compatibility with the long tail of older tools; webtrees explicitly states 5.5.1 as its target and supports only a “very small list” of GEDCOM 7 tags on read. The slow uptake is structural: GEDCOM was always a lowest-common-denominator format and the network effect rewards being the format everyone else can read. GEDCOM 7’s stronger guarantees (UTF-8, URIs, structured DATE/PLACE) only matter when both ends of a transfer support them; until then, 5.5.1 wins by default.
- GEDCOM-X’s failure to displace the tag format. When FamilySearch launched GEDCOM-X in 2012 the assumption was that a clean JSON/XML model with proper conclusion-vs-evidence separation, RESTful endpoints, and modern serialisation would obviously beat a 1984 line-oriented tag format. It didn’t. Reasons: (a) the installed base of GEDCOM-reading software was overwhelming, (b) GEDCOM-X’s data model was too normalised for casual hobbyist use, (c) the FamilySearch API became the de-facto deployment of GEDCOM-X rather than a community-wide interchange standard, and (d) the GEDCOM 7 modernisation effort siphoned community energy back to the tag format starting in 2019. As of 2026 GEDCOM-X is “maintained” but practically scoped to FamilySearch’s own APIs and bulk-transfer use cases.
- FHISO as the community standards-body alternative. FHISO was incorporated in 2013 precisely because the genealogy world’s standards process had been owned by a single entity (FamilySearch / LDS Church) for 30 years. FHISO’s deliverables, Citation Elements and ELF (Extended Legacy Format), explicitly target the gaps GEDCOM left: a structured citation/evidence model, and a real extensibility mechanism (vs GEDCOM’s ad-hoc _TAGS). The pace is slow and committee-driven; Citation Elements Concepts has been at draft since 2017 and the GEDCOM 7 binding is still unwritten. Whether FHISO ever delivers a widely-adopted standard remains genuinely open, but it is the only neutral venue.
- Consumer-DNA database wars and their raw-data formats. The 2007–2020 boom of consumer DNA testing (23andMe 2007, FamilyTreeDNA autosomal 2010, AncestryDNA 2012, MyHeritage DNA 2016) created a parallel pseudo-DSL: tab-separated SNP genotype files in vendor-specific layouts. Cross-vendor comparison is handled by GEDmatch (founded 2010; acquired by Verogen in 2019 and later part of Qiagen), which normalises uploads from all five major vendors. The infamous 2018 Golden State Killer case used GEDmatch to identify a suspect by genetic genealogy, triggering policy reforms and a contraction of public-database opt-ins. The 23andMe Chapter 11 filing in March 2025 crystallised a long-standing risk: vendor lock-in of irreplaceable genetic data. Raw-data download remained available through mid-2026, but the long-term custody of those genotype files is now an active concern.
- Proprietary file-format lock-in in genealogy software. FTM (.ftm), RootsMagic (.rmtree/.rmgc/.rmbackup), Legacy, Reunion, Heredis, MyHeritage cloud: each stores the user’s tree in a proprietary schema with GEDCOM as the only export path. The GEDCOM round-trip is lossy: custom fields, source-citation templates, multimedia metadata, place-name research notes, and the tool’s own data structures (Living Privatization rules, hint-source caches) don’t survive. Gramps XML is the open-source alternative, a gzip-compressed XML superset of GEDCOM that Gramps 6.0.x (2025) recommends as the lossless format, and the only major non-proprietary native store. The pattern is identical to the early-1990s word-processor zoo: every product had its own .doc/.wpd/.sam/.lwp, and a single lossy interchange format.
- Pedigree numbering systems as a compact textual notation. Ahnentafel (Eytzinger 1590, popularised by Stradonitz 1898), Henry (1935), and d’Aboville (1940) are not file formats but micro-DSLs for encoding pedigree position in a single number. Ahnentafel is binary-positional (father of n = 2n, mother of n = 2n+1), so a number like “53” (binary 110101) uniquely identifies “mother’s father’s mother’s father’s mother” relative to the subject: after the leading 1, each binary digit selects a father (0) or a mother (1), the same positional trick decimal numbers use for place value. The descending systems (Henry, d’Aboville) encode birth order down the generations: “1.1.2” = “second child of the first child of the progenitor.” All three are still in active use in printed reports, GEDCOM _AHNENTAFEL extensions, and software export options; d’Aboville is preferred when any generation has >9 children because the dots disambiguate.
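The Ahnentafel rule (father of n is 2n, mother of n is 2n+1) means the binary digits of a number spell out the path from the subject to that ancestor. A minimal sketch of the arithmetic, with hypothetical function names:

```python
from typing import List


def father(n: int) -> int:
    """Ahnentafel number of person n's father."""
    return 2 * n


def mother(n: int) -> int:
    """Ahnentafel number of person n's mother."""
    return 2 * n + 1


def ahnentafel_path(n: int) -> List[str]:
    """Steps from the subject (1) up to ancestor n, nearest generation first."""
    if n < 1:
        raise ValueError("Ahnentafel numbers start at 1")
    # bin(53) == '0b110101'; drop '0b' and the leading 1 (the subject)
    bits = bin(n)[3:]
    return ["mother" if bit == "1" else "father" for bit in bits]


def generation(n: int) -> int:
    """Generations above the subject: 0 for the subject, 1 for parents."""
    return n.bit_length() - 1


print(ahnentafel_path(53))  # ['mother', 'father', 'mother', 'father', 'mother']
print(generation(53))       # 5
```

Reading the list left to right gives "the subject's mother's father's mother's father's mother"; the same ancestor is reached by composing the functions, `mother(father(mother(father(mother(1)))))`.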
Citations
- FamilySearch GEDCOM 7 specification: https://gedcom.io/specifications/FamilySearchGEDCOMv7.html
- FamilySearch GEDCOM changelog (7.0.x release history): https://gedcom.io/changelog/
- FamilySearch GEDCOM GitHub releases: https://github.com/FamilySearch/GEDCOM/releases
- GEDCOM 5.5.1 specification PDF: https://gedcom.io/specifications/ged551.pdf
- GEDCOM-X main page: https://www.familysearch.org/innovate/gedcom-x
- GEDCOM-X RS (RESTful interface defs) on GitHub: https://github.com/FamilySearch/gedcomx-rs
- FHISO home: https://fhiso.org/
- FHISO technical work strategy: https://tech.fhiso.org/strategy
- FHISO Citation Elements Concepts: https://fhiso.org/TR/cev-concepts-20170911
- FHISO ELF (Extended Legacy Format) on GitHub: https://github.com/fhiso/legacy-format
- Gramps 6.0.4 release (Aug 2025): https://gramps-project.org/blog/2025/08/gramps-6-0-4-released/
- Gramps XML format reference: https://www.gramps-project.org/wiki/index.php/Gramps_XML
- webtrees GEDCOM data format notes: https://webtrees.net/gedcom/
- TNG Sitebuilding home: https://tngsitebuilding.com/
- Software MacKiev FTM 2024: https://www.mackiev.com/ftm/
- RootsMagic file extensions: https://support.rootsmagic.com/hc/en-us/articles/224924947
- Personal Ancestral File discontinuation announcement: https://www.familysearch.org/en/newsroom/personal-ancestral-file-paf-is-discontinued
- 23andMe Chapter 11 bankruptcy reporting (Mar 2025): https://dna-explained.com/2025/03/25/23andme-files-for-bankruptcy-what-you-need-to-know/
- Genealogical numbering systems (Ahnentafel / Henry / d’Aboville): https://en.wikipedia.org/wiki/Genealogical_numbering_systems
- Wikidata WikiProject Genealogy: https://www.wikidata.org/wiki/Wikidata:WikiProject_Genealogy
- Schema.org Person: https://schema.org/Person
- W3C PROV-O: https://www.w3.org/TR/prov-o/