Patent / IP / Standards-Document DSLs Family Index
type: language-family-index
family: patent-ip-standards
languages_catalogued: 26
tags: [language-reference, family-index, patent-ip-standards, wipo-st96, st26, uspto-xml, epo-xml, niso-sts, rfc-xml-v3, ipc, cpc]
Patent / IP / Standards-Document — Family Index
Family overview
Patent / IP / standards-document DSLs are the set of XML vocabularies, classification grammars, and authoring markup languages used to encode the legal-technical artefacts of the intellectual-property and standards-development worlds. The dominant gravity well is WIPO Standard ST.96 — an XML schema family for industrial-property data covering patents, trademarks, designs, geographical indications, and copyright — currently at version 8.0, released October 2024, and explicitly not backwards-compatible with v7.0/v7.1. ST.96 is the slow-converging global standard intended to supersede the older WIPO ST.36 (patents), ST.66 (trademarks), and ST.86 (designs) format families. ST.96 v8.0 organises its components into Common, Patent, Trademark, Design, Geographical Indication, and Copyright sub-schemas under Annex III.
Running on a parallel track since 2022 is WIPO Standard ST.26, the XML format for biological sequence listings in patent applications. ST.26 went live with a coordinated international “big-bang” cutover on 2022-07-01, replacing the legacy plain-text ST.25 standard. Any patent application (including continuations and divisionals) with a filing date on or after that cutover must submit nucleotide and amino-acid sequence disclosures as a single ST.26 XML file — a hard procedural rule with large practical impact on biotech filings.
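The cutover rule can be sketched as a date comparison. This is a minimal illustration, not office guidance: the function name is hypothetical, and `filing_date` is assumed to mean the date the application itself was filed, not a priority date or the filing date of a parent.

```python
from datetime import date

# Cutover date as stated in the text above.
ST26_CUTOVER = date(2022, 7, 1)


def required_sequence_format(filing_date: date) -> str:
    """Return the sequence-listing standard implied by an application's own filing date."""
    return "ST.26" if filing_date >= ST26_CUTOVER else "ST.25"


# A divisional filed after the cutover needs ST.26 even when its parent used ST.25,
# because only the divisional's own filing date is consulted.
parent_format = required_sequence_format(date(2021, 3, 15))
divisional_format = required_sequence_format(date(2023, 1, 10))
print(parent_format, divisional_format)
```

Real prosecution practice has jurisdiction-specific edge cases that a one-line comparison does not capture.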
Each major patent office overlays ST.96 with local extensions. The USPTO has, since the 2024-01-17 fee surcharge, effectively mandated DOCX-based filing whose downstream XML conversion conforms to ST.96 (with the legacy PatFT/AppFT bulk-text formats and PAUS-XML ICE-DTD-based grants persisting as adjacent data products on data.uspto.gov). The EPO maintains its own EBD (EPO Bibliographic Data) XML stream plus the worldwide DOCDB bibliographic database and the INPADOC family-extended view; the JPO, CNIPA, and KIPO each publish ST.96-aligned XML for their national caseload. International routing systems — the PCT for patents, Madrid for trademarks, Hague for industrial designs — are administered by WIPO on top of the same ST.96 schema family.
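One way a consumer might separate the shared ST.96 core from an office overlay is to partition elements by XML namespace. The sketch below assumes that extensions live in their own namespace. The ST.96 namespace URIs follow the pattern recalled from published schemas and should be verified against Annex III; the `urn:example:office-extension` namespace and all element names in the sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# Assumption: ST.96 component namespaces share this URI prefix (verify against Annex III).
ST96_PREFIX = "http://www.wipo.int/standards/XMLSchema/ST96/"

# Illustrative fragment; the "off" namespace is purely hypothetical.
SAMPLE = """
<pat:BibliographicData
    xmlns:pat="http://www.wipo.int/standards/XMLSchema/ST96/Patent"
    xmlns:com="http://www.wipo.int/standards/XMLSchema/ST96/Common"
    xmlns:off="urn:example:office-extension">
  <com:PublicationDate>2024-10-01</com:PublicationDate>
  <pat:InventionTitle>Example widget</pat:InventionTitle>
  <off:LocalProcedureCode>X1</off:LocalProcedureCode>
</pat:BibliographicData>
"""


def split_by_namespace(xml_text):
    """Return (core_names, extension_names) as lists of local element names."""
    core, extension = [], []
    for elem in ET.fromstring(xml_text).iter():
        tag = elem.tag
        # ElementTree expands qualified names to "{namespace}local".
        namespace = tag[1:].split("}", 1)[0] if tag.startswith("{") else ""
        local = tag.rsplit("}", 1)[-1]
        (core if namespace.startswith(ST96_PREFIX) else extension).append(local)
    return core, extension


core_names, extension_names = split_by_namespace(SAMPLE)
print("core:", core_names)
print("extension:", extension_names)
```

A normalising pipeline would typically map the core elements into its common model and keep the extension elements in a per-office side table.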
The IP world also runs on a small constellation of classification languages — controlled vocabularies expressed as hierarchical codes: IPC (International Patent Classification, WIPO), CPC (Cooperative Patent Classification, joint EPO/USPTO, currently CPC 2026.05 in force as of 2026-04-07), NICE (trademark goods/services, currently 13th edition NCL 13-2026 since 2026-01-01), Vienna (trademark figurative elements), and Locarno (industrial-design products, currently LOC15 v2026 since 2026-01-01). Wholly separate from the IP universe but procedurally adjacent is the standards-document authoring world: the IETF’s RFC XML v3 vocabulary (RFC 7991, published 2016, frozen at xml2rfc 3.0.0 with the bis document slowly catching up to real-world usage), NISO STS 1.2 (ANSI/NISO Z39.102-2022, used by ISO/IEC and national SDOs as the canonical XML for standards), and the W3C-ecosystem markdown-ish authoring layers ReSpec and Bikeshed.
In our deep library
No standalone deep-library notes — this entire family is composed of XML vocabularies, controlled classification codes, and lightweight markdown wrappers rather than general-purpose languages.
Cross-reference:
- document-typesetting — overlapping territory: OASIS DocBook for standards documents, XML-based authoring pipelines, the broader DITA / DocBook / TEI lineage.
- citation-formats — IP filings cite prior art with their own conventions (USPTO 102/103 references, EPO X/Y/A category codes); ISO standards use ISO 690 citations.
- api-description — ST.96, ST.26, NISO STS, and RFC XML v3 are all XML-Schema-described vocabularies; the schema-as-contract pattern is the same shape as OpenAPI/JSON-Schema.
- notation-spec — formal-grammar overlap for the schema definitions themselves; CPC/IPC/NICE/Locarno/Vienna are essentially controlled-vocabulary mini-languages with hierarchical code grammars.
- government-civictech — patent and trademark filings live at the legal/regulatory boundary; the procedural-XML pattern is shared with court e-filing (LegalXML / OASIS LegalDocML).
- bio-fileformats — direct overlap: ST.26 biosequence listings encode the same nucleotide/amino-acid alphabets as FASTA/GenBank but inside a constrained XML envelope with mandatory feature qualifiers and INSDC-aligned vocabularies.
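To make the FASTA overlap concrete, here is a minimal sketch that flattens a heavily trimmed ST.26-style fragment into FASTA records. The element and attribute names are recalled from the INSDC-style naming that ST.26 uses and should be checked against the official DTD; a real listing also carries general-information blocks, feature tables, and qualifiers that this sample omits, and the FASTA header layout is an arbitrary choice.

```python
import xml.etree.ElementTree as ET

# Trimmed, illustrative fragment; not a valid complete ST.26 listing.
SAMPLE = """
<ST26SequenceListing>
  <SequenceData sequenceIDNumber="1">
    <INSDSeq>
      <INSDSeq_length>12</INSDSeq_length>
      <INSDSeq_moltype>DNA</INSDSeq_moltype>
      <INSDSeq_sequence>atgcatgcatgc</INSDSeq_sequence>
    </INSDSeq>
  </SequenceData>
  <SequenceData sequenceIDNumber="2">
    <INSDSeq>
      <INSDSeq_length>5</INSDSeq_length>
      <INSDSeq_moltype>AA</INSDSeq_moltype>
      <INSDSeq_sequence>MKTAY</INSDSeq_sequence>
    </INSDSeq>
  </SequenceData>
</ST26SequenceListing>
"""


def to_fasta(xml_text):
    """Convert each SequenceData entry into a two-line FASTA record."""
    records = []
    for data in ET.fromstring(xml_text).iter("SequenceData"):
        seq_id = data.get("sequenceIDNumber")
        moltype = data.findtext("INSDSeq/INSDSeq_moltype")
        residues = data.findtext("INSDSeq/INSDSeq_sequence")
        declared = int(data.findtext("INSDSeq/INSDSeq_length"))
        # The envelope states the length explicitly, so it can be cross-checked.
        if declared != len(residues):
            raise ValueError(f"sequence {seq_id}: declared {declared}, found {len(residues)}")
        records.append(f">SEQ_ID_{seq_id} moltype={moltype}\n{residues}")
    return "\n".join(records)


print(to_fasta(SAMPLE))
```

The conversion is one-way: FASTA has no slot for the feature qualifiers that the XML envelope makes mandatory.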
Tier 3 family table — WIPO standards
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| WIPO ST.96 | 2009 (v1.0), v8.0 in 2024-10 | WIPO Committee on WIPO Standards (CWS), XML4IP Task Force | XML Schema family for all industrial property (patents, trademarks, designs, GIs, copyright) | Current — v8.0 released Oct 2024; explicitly not backward-compatible with v7.x | https://www.wipo.int/standards/en/st96/v8-0/ |
| WIPO ST.36 | 2003 | WIPO SDWG → CWS | Older patent-only XML vocabulary | Legacy — succeeded by ST.96; still in some bulk-data products and historical archives | https://www.wipo.int/en/web/standards/standards |
| WIPO ST.66 | 2008 | WIPO | Older trademark-only XML vocabulary | Legacy — succeeded by ST.96 trademark components | https://www.wipo.int/en/web/standards/standards |
| WIPO ST.86 | 2008 | WIPO | Older industrial-design XML vocabulary | Legacy — succeeded by ST.96 design components | https://www.wipo.int/en/web/standards/standards |
| WIPO ST.26 | 2018 (recommendation), 2022-07-01 mandate | WIPO SEQL Task Force | XML for biosequence listings in patent applications (nucleotide + amino-acid disclosures) | Mandatory — replaced ST.25 on 2022-07-01 “big-bang” date for all new filings worldwide | https://www.wipo.int/en/web/standards/sequence |
| WIPO ST.25 | 1998 | WIPO | Plain-text sequence-listing standard | Retired — replaced by ST.26 on 2022-07-01 for new filings | https://www.wipo.int/en/web/standards/sequence/faq |
| WIPO ST.97 | 2022 | WIPO CWS | Recommendations for processing IP data in JSON (a JSON counterpart to the ST.96 XML components) | Active — narrower scope than ST.96 | https://www.wipo.int/en/web/standards/standards |
| WIPO ST.8 / ST.9 / ST.50 | 1970s–present | WIPO | Bibliographic-data recommendations (ST.8 IPC symbol recording, ST.9 bibliographic data, ST.50 patent corrections) | Active — referenced by EPO EBD and most patent-office XML | https://www.wipo.int/en/web/standards/standards |
Tier 3 family table — Patent office XML
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| USPTO Patent XML (ICE DTD) / PAUS-XML | ~2002 | USPTO | DTD-based XML for grants and applications via the ICE (International Common Element) DTD | Active — bulk data on data.uspto.gov; ST.96-aligned conversion now flows from DOCX filing | https://developer.uspto.gov/product/patent-grant-full-text-dataxml |
| USPTO DOCX filing | 2021 voluntary, 2024-01-17 surcharge for non-DOCX | USPTO | OOXML (.docx) specification + claims + abstract; converted to ST.96 XML downstream | Mandatory in practice for new utility nonprovisional applications since 2024-01-17 | https://www.uspto.gov/patents/apply/patent-center |
| USPTO PatFT / AppFT bulk text | 1970s archives, web 2001 | USPTO | Legacy plain-text-with-tags bulk patent / application data | Legacy — Patent Public Search replaced the PatFT/AppFT web UIs in 2022; bulk text archives still distributed | https://www.uspto.gov/sites/default/files/documents/Search-field-conversion-PatFT-AppFT-QRG-Patent-Public-Search.pdf |
| EPO EBD (Bibliographic Data XML) | early 2000s | European Patent Office | Weekly XML feed of new EP applications/specifications and amendments | Active — ST.8/ST.9-conformant XML; pre-2005 backfile may still be SGML | http://docs.epoline.org/ebd/xmlinfo.htm |
| EPO DOCDB | 1968 (paper), XML form post-2000 | EPO | Worldwide patent bibliographic master database, distributed in XML | Active — backbone of EPO/Espacenet and many third-party patent-search products | https://www.epo.org/en/searching-for-patents/data |
| INPADOC | 1972 | EPO (originally International Patent Documentation Centre, Vienna) | International patent legal-status + family data, distributed as XML | Active — INPADOC families extend DOCDB families by shared-priority linkage | https://www.epo.org/en/searching-for-patents/data |
| EPO Patent Information Export Format | 2010s | EPO | Bulk-export wrapper formats around EBD/DOCDB | Active — for licensed bulk subscribers | https://www.epo.org/en/searching-for-patents/data |
| JPO XML | 2000s | Japan Patent Office | National patent/utility-model/design/trademark XML; progressively migrating to ST.96 | Active | https://www.jpo.go.jp/e/system/laws/sesaku/data/datapolicy.html |
| CNIPA XML | 2010s | China National Intellectual Property Administration | Chinese patent and trademark XML; ST.96 alignment in progress | Active — largest filing volume in the world | https://english.cnipa.gov.cn/ |
| KIPO XML | 2000s | Korean Intellectual Property Office | Korean patent XML; ST.96 alignment in progress | Active | https://www.kipo.go.kr/en/ |
| PCT XML (Patent Cooperation Treaty) | 2000s | WIPO (PCT Operations) | XML for international PCT applications under WIPO administration | Active — built on ST.96 patent components | https://www.wipo.int/pct/en/ |
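The DOCDB versus INPADOC family distinction noted in the table can be sketched as two grouping rules. This is a simplification under stated assumptions, not the EPO's implementation: a "simple" family is taken to be the set of applications with identical priority sets, and an "extended" family is taken to be the set linked directly or indirectly by at least one shared priority. The identifiers and function names are hypothetical.

```python
from collections import defaultdict


def simple_families(priorities_by_app):
    """Group applications whose priority sets are exactly identical."""
    groups = defaultdict(set)
    for app, priorities in priorities_by_app.items():
        groups[frozenset(priorities)].add(app)
    return sorted(sorted(group) for group in groups.values())


def extended_families(priorities_by_app):
    """Group applications connected, directly or transitively, by a shared priority."""
    parent = {app: app for app in priorities_by_app}

    def find(app):
        while parent[app] != app:
            parent[app] = parent[parent[app]]  # path halving keeps trees shallow
            app = parent[app]
        return app

    first_claimant = {}  # priority -> first application seen claiming it
    for app, priorities in priorities_by_app.items():
        for priority in priorities:
            if priority in first_claimant:
                parent[find(app)] = find(first_claimant[priority])
            else:
                first_claimant[priority] = app

    groups = defaultdict(set)
    for app in priorities_by_app:
        groups[find(app)].add(app)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical applications and the priorities they claim.
apps = {"EP1": {"P1"}, "US1": {"P1", "P2"}, "JP1": {"P2"}, "CN1": {"P3"}}
print(simple_families(apps))
print(extended_families(apps))
```

In the sample, EP1 and JP1 share no priority with each other, yet both land in one extended family because US1 bridges them.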
Tier 3 family table — IP classification systems
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| IPC (International Patent Classification) | 1971 (Strasbourg Agreement) | WIPO | Hierarchical patent classification (sections A–H → ~80,000 entries) | Active — IPC 2026.01 in force from 2026-01-01 | https://www.wipo.int/classifications/ipc/en/ |
| CPC (Cooperative Patent Classification) | 2013 | EPO + USPTO (jointly maintained) | Extension of IPC with much finer granularity (~260,000 entries) | Active — CPC 2026.05 entered force 2026-04-07; rolling quarterly-ish releases | https://www.cooperativepatentclassification.org/ |
| NICE Classification | 1957 (Nice Agreement) | WIPO | International classification of goods and services for trademark registration (45 classes) | Active — 13th edition NCL 13-2026 in force from 2026-01-01, replacing NCL 12-2025 | https://www.wipo.int/en/web/classification-nice |
| Vienna Classification | 1973 (Vienna Agreement) | WIPO | Classification of the figurative elements of marks | Active — used alongside NICE for figurative-mark searches | https://www.wipo.int/classifications/vienna/en/ |
| Locarno Classification | 1968 (Locarno Agreement) | WIPO | International classification for industrial designs (32 classes) | Active — LOC15 version 2026 product list since 2026-01-01 | https://www.wipo.int/en/web/classification-locarno |
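Because these codes follow a positional grammar, a symbol such as `H04L 9/00` can be split into its hierarchy levels with a single pattern. The sketch below is an approximation: the regular expression, the allowance for a `Y` section, and the digit-count limits are assumptions for illustration, and it does not check that a symbol actually exists in any published scheme.

```python
import re
from typing import NamedTuple

# Assumed shape: section letter, two-digit class, subclass letter,
# main group, slash, subgroup. Not validated against the official scheme.
CPC_PATTERN = re.compile(r"^([A-HY])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


class CpcSymbol(NamedTuple):
    section: str
    cpc_class: str
    subclass: str
    main_group: str
    subgroup: str


def parse_cpc(symbol: str) -> CpcSymbol:
    """Split a CPC-style symbol into its hierarchy levels."""
    match = CPC_PATTERN.match(symbol.strip().upper())
    if match is None:
        raise ValueError(f"not a CPC-style symbol: {symbol!r}")
    section, digits, letter, group, subgroup = match.groups()
    # Each level is reported with its full prefix, e.g. class "H04", subclass "H04L".
    return CpcSymbol(section, section + digits, section + digits + letter, group, subgroup)


print(parse_cpc("H04L 9/00"))
```

Concordance tooling would layer lookups on top of a parser like this; the parser itself only checks shape.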
Tier 3 family table — International registration systems
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| Madrid System XML | 1891 treaty, modern XML 2000s | WIPO (Madrid Registry) | XML for international trademark filings under the Madrid Protocol; eMadrid + Madrid e-Filing pipelines | Active — 115 members covering 131 countries as of May 2025 | https://www.wipo.int/en/web/madrid-system |
| Hague System XML | 1925 treaty, modern XML 2000s | WIPO (Hague Registry) | XML for international industrial-design filings under the Hague Agreement | Active | https://www.wipo.int/en/web/hague-system |
| TRIPS Agreement (adjacent) | 1995 | WTO | Treaty framework (not a data format) — minimum substantive IP standards binding on WTO members | Active — legal substrate for all the above | https://www.wto.org/english/tratop_e/trips_e/trips_e.htm |
| EPC Article 7 / Implementing Regulations (adjacent) | 1973 (EPC), 2000 revision | EPO contracting states | Treaty + implementing regulations governing European patents (not a data format) | Active — legal substrate for EPO procedure | https://www.epo.org/en/legal/epc |
Tier 3 family table — Standards-document authoring
| Format | First appeared | Origin | Type | Status (2026) | URL |
|---|---|---|---|---|---|
| NISO STS 1.2 | 2017 (STS 1.0), 2022 (STS 1.2 as ANSI/NISO Z39.102-2022) | NISO Standards Tag Suite WG | XML for full text + metadata of standards documents; derived from JATS (Z39.96) | Current — STS 1.2 fully backward-compatible with 1.0; used by ISO/IEC and national SDOs | https://www.niso.org/standards-committees/sts |
| IETF RFC XML v3 / xml2rfc v3 | 2016 (RFC 7991) | IETF (Paul Hoffman et al.) | XML vocabulary for authoring RFCs; replaced the long-running v2 vocabulary | Active but in maintenance — xml2rfc 3.0.0 froze the grammar; RFC 7991bis catching up to real-world dialect | https://datatracker.ietf.org/doc/html/rfc7991 |
| xml2rfc (tool) | early 2000s (v1/v2), 2018+ (v3) | IETF Tools | Reference processor for RFC XML; converts to PDF/HTML/text RFC outputs | Active — Python package, used by RFC Production Center | https://github.com/ietf-tools/xml2rfc |
| Kramdown-RFC | 2010s | Carsten Bormann (IETF community) | Markdown + extensions that compile to RFC XML v3 | Active — the de facto path for non-XML-native IETF authors | https://github.com/cabo/kramdown-rfc |
| W3C ReSpec | 2009+ | Robin Berjon → W3C community | HTML+JS in-browser preprocessor that renders into a W3C/WHATWG-style spec | Very active — widely used for W3C TR documents and many community specs | https://respec.org/docs/ |
| W3C Bikeshed | 2014+ | Tab Atkins (CSS WG) | Source-to-spec preprocessor (Python); markdown-ish input → HTML spec with boilerplate, bibliography, indexes | Very active — used by CSS specs, many other W3C WGs, WHATWG, and the C++ standards committee | https://github.com/speced/bikeshed |
| OASIS DocBook (for standards) | 1991 | HaL Computer Systems + O’Reilly → OASIS | General technical-documentation XML; sometimes used for standards | Active — overlaps with document-typesetting | https://docbook.org/ |
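As a small illustration of what an XML authoring vocabulary looks like to a processor, the sketch below walks a pared-down RFC XML v3 style document and prints a numbered outline. The sample uses only a handful of elements (`rfc`, `front`, `title`, `middle`, `section`, `name`, `t`) and would not pass xml2rfc as-is, since a real document needs authors, an abstract, and more. The `outline` function is hypothetical and unrelated to xml2rfc's own code.

```python
import xml.etree.ElementTree as ET

# Pared-down illustrative document; incomplete by xml2rfc's standards.
SAMPLE = """
<rfc version="3">
  <front>
    <title>An Example Document</title>
  </front>
  <middle>
    <section>
      <name>Introduction</name>
      <t>Some text.</t>
      <section>
        <name>Terminology</name>
        <t>More text.</t>
      </section>
    </section>
    <section>
      <name>Security Considerations</name>
      <t>None.</t>
    </section>
  </middle>
</rfc>
"""


def outline(xml_text):
    """Return the title followed by numbered section names, depth-first."""
    root = ET.fromstring(xml_text)
    if root.get("version") != "3":
        raise ValueError("expected a version 3 document")
    lines = [root.findtext("front/title")]

    def walk(parent, prefix):
        # Only direct child sections are numbered at each level.
        for index, section in enumerate(parent.findall("section"), start=1):
            number = f"{prefix}{index}"
            lines.append(f"{number}. {section.findtext('name')}")
            walk(section, f"{number}.")

    walk(root.find("middle"), "")
    return lines


print("\n".join(outline(SAMPLE)))
```

Front ends such as Kramdown-RFC exist largely so that authors never have to hand-write this nesting.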
Notable threads
- WIPO ST.96 is the slowly converging global standard. ST.96 v8.0 (Oct 2024) is the eighth major revision in fifteen years, and that cadence reflects the fundamental tension of any inter-office XML standard: every national office runs ST.96 with local extensions (PAUS at USPTO, EPO bibliographic profiles, JPO/CNIPA/KIPO national wrappers), and every breaking schema change forces coordinated office-side migration. The explicit “not backward-compatible with v7” warning on v8.0 is telling: schema simplifications and namespace cleanups still ship as breaking changes a decade and a half into the standard’s life. ST.96 effectively eats ST.36 (patents), ST.66 (trademarks), and ST.86 (designs), but on a slow trajectory.
- The ST.26 biosequence cutover (2022-07-01) was the largest mandatory IP-format migration in decades. ST.25 had been the plain-text status quo since 1998 and was unable to represent branched sequences, D-amino acids, nucleotide analogues, or many of the modifications routine in modern molecular biology. The “big-bang” date was coordinated across WIPO, USPTO, EPO, JPO, CNIPA, KIPO, and PCT-receiving offices simultaneously, a rare instance of a hard global cutover in IP procedure. A subtle rule: any continuation or divisional filed on or after 2022-07-01 must use ST.26 even if the parent was an ST.25 filing, which has caused real procedural pain in the biotech prosecution world. ST.26 is also a rare ST.96-adjacent standard with its own dedicated authoring tool ecosystem (WIPO Sequence, WIPO Sequence Validator).
- The patent-office XML federation overlays ST.96 with local extensions. The reference architecture is “ST.96 common components + office-specific extension schemas in a separate namespace.” USPTO’s PAUS-XML layer carries 35 U.S.C. statutory metadata; EPO EBD encodes EPC procedural events; JPO/CNIPA/KIPO carry national bibliographic identifiers. Bulk-data consumers (EPO PATSTAT, Google Patents, lens.org, IPlytics, etc.) generally normalise these into a common internal model rather than processing each office’s dialect raw. Pre-2005 EPO data may still arrive in SGML rather than XML, a reminder that “the international patent corpus” is decades-deep and format-heterogeneous.
- IP classification systems are controlled-vocabulary mini-languages with hierarchical code grammars. IPC (~80k entries), CPC (~260k entries, EPO+USPTO joint), NICE (45 trademark classes), Vienna (figurative elements), Locarno (32 design classes). They are not “programming languages” but they are formal languages: each code has a strict positional grammar (e.g., CPC `H04L 9/00` decomposes into section/class/subclass/group/subgroup), and revisions ship on calendar schedules: IPC 2026.01, CPC 2026.05, NICE 13-2026, Locarno LOC15-2026 are all in force as of mid-2026. Tooling around these classifications (concordance tables, IPC↔CPC mappings, classification-prediction ML models) is its own active ecosystem.
- IETF RFC XML v3 (RFC 7991, 2016) replaced the long-running v2 vocabulary, but the migration has been bumpy. The xml2rfc reference processor diverged from the published RFC 7991 grammar during deployment, the grammar was frozen at xml2rfc 3.0.0 to stop the bleeding, and RFC 7991bis is slowly catching the spec up to the dialect that real RFCs actually use. That work paused during the RSWG/RFC 9280 governance overhaul and is resuming in 2025–2026 under the new RFC change-management team. In practice most authors now write Kramdown-RFC (Carsten Bormann’s Markdown-with-extensions front end) and let it emit RFC XML v3 for them.
- W3C ReSpec and Bikeshed displaced raw HTML for spec authoring. Both took over from the older XMLSpec / pure-HTML-with-stylesheets approach in the early 2010s. ReSpec is HTML+JS that runs in the browser (the spec file is the spec, transformed on load); Bikeshed is a Python preprocessor with a markdown-ish source format. Bikeshed dominates in CSS, WHATWG, and the C++ committee; ReSpec dominates in the broader W3C TR pipeline. The W3C `spec-prod` GitHub Action standardises CI/CD for both: build, validate, publish to GH Pages and/or w3.org via Echidna.
- NISO STS is the JATS-for-standards story. STS 1.2 (ANSI/NISO Z39.102-2022) is derived from JATS (Z39.96, the dominant scholarly-article XML), extended with normative-content structures and adoption/translation patterns. ISO and IEC use STS as the basis of their XML production pipelines; national SDOs (ANSI, BSI, DIN, JISC) follow. STS sits in the same conceptual niche as ST.96 but for normative standards rather than IP rights, and the two don’t overlap in scope, despite both being XML-Schema-described technical-document vocabularies maintained by international bodies.
- The slow XML→JSON drift is real but partial. Some IP systems (search APIs, bulk-data overlays, modern e-filing UIs) increasingly expose JSON alongside XML, but the canonical-document format remains XML in essentially every patent office and every standards body. The reason is regulatory: XML Schema is the validation language that procedural law has been written to assume, and the migration cost of switching every implementing regulation to reference a JSON Schema is greater than the developer-ergonomic upside. Expect XML primacy in this domain through at least the late 2020s.
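The "JSON alongside XML" pattern from the last thread can be sketched as a lossy projection of a canonical XML record into a JSON view. The record below is entirely hypothetical (element names and the `urn:example:bib` namespace are invented for illustration), and the converter deliberately drops namespaces and attributes, which is one reason such a view sits beside the XML rather than replacing it.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical bibliographic record; not taken from any office's schema.
SAMPLE = """
<Record xmlns="urn:example:bib">
  <Title>Example widget</Title>
  <Classification>H04L 9/00</Classification>
  <Classification>G06F 21/60</Classification>
</Record>
"""


def element_to_dict(elem):
    """Project an element into dicts, lists, and strings (attributes are dropped)."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        key = child.tag.rsplit("}", 1)[-1]  # strip the "{namespace}" prefix
        value = element_to_dict(child)
        if key in result:
            # Repeated child names collapse into a JSON array.
            if not isinstance(result[key], list):
                result[key] = [result[key]]
            result[key].append(value)
        else:
            result[key] = value
    return result


record = element_to_dict(ET.fromstring(SAMPLE))
print(json.dumps(record, indent=2))
```

A search API could serve a projection like this while validation and archival continue to run against the schema-described XML.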
Citations
- WIPO Standards portal: https://www.wipo.int/en/web/standards/standards
- WIPO ST.96 v8.0 release notes (Oct 2024): https://www.wipo.int/standards/en/st96/v8-0/release_notes.html
- WIPO ST.96 v8.0 Annex III schemas: https://www.wipo.int/standards/en/st96/v8-0/annex-iii/index.html
- WIPO ST.26 implementation FAQ: https://www.wipo.int/en/web/standards/sequence/faq
- WIPO ST.26 “goes live” announcement (2022-07-01): https://www.wipo.int/en/web/pct-system/w/news/2022/news_0039
- USPTO ST.26 news: https://www.uspto.gov/patents/apply/sequence-listing-resource-center/wipo-standard-st26-news
- USPTO XML Resources: https://www.uspto.gov/learning-and-resources/xml-resources
- USPTO Patent Grant Full Text XML: https://developer.uspto.gov/product/patent-grant-full-text-dataxml
- USPTO Patent Application XML: https://developer.uspto.gov/product/patent-application-dataxml
- USPTO DOCX filing surcharge legal framework (Sept 2025): https://www.uspto.gov/sites/default/files/documents/2025LegalFrameworkPES.pdf
- EPO EBD XML documentation: http://docs.epoline.org/ebd/xmlinfo.htm
- EPO bulk data (DOCDB, INPADOC, EBD): https://www.epo.org/en/searching-for-patents/data
- CPC home (joint EPO/USPTO): https://www.cooperativepatentclassification.org/
- CPC 2026.05 announcement: https://www.cooperativepatentclassification.org/home
- IPC (WIPO): https://www.wipo.int/classifications/ipc/en/
- NICE Classification 13-2026 advance publication: https://www.wipo.int/en/web/classification-nice/w/news/2025/nice-classification-ncl-13-2026-advance-publication-now-available
- Locarno Classification (WIPO): https://www.wipo.int/en/web/classification-locarno
- Madrid System (WIPO): https://www.wipo.int/en/web/madrid-system
- Hague System (WIPO): https://www.wipo.int/en/web/hague-system
- NISO STS home: https://www.niso.org/standards-committees/sts
- ANSI/NISO Z39.102-2022 STS 1.2: https://www.niso.org/publications/z39102-2022-sts
- IETF RFC 7991 (xml2rfc v3 vocabulary, 2016): https://datatracker.ietf.org/doc/html/rfc7991
- IETF RFCXML issue tracker (vocabulary work, 2025+): https://github.com/ietf-tools/RFCXML
- W3C ReSpec docs: https://respec.org/docs/
- W3C Bikeshed: https://github.com/speced/bikeshed
- W3C spec-prod GitHub Action: https://github.com/w3c/spec-prod
- TRIPS Agreement (WTO): https://www.wto.org/english/tratop_e/trips_e/trips_e.htm
- EPC (EPO legal texts): https://www.epo.org/en/legal/epc