COBOL — Reference
Source: https://gcc.gnu.org/onlinedocs/gcobol/
COBOL
- Created: 1959 by CODASYL (Conference on Data Systems Languages), with influence from Grace Hopper’s FLOW-MATIC. Designed under US DoD contract for portable business data processing.
- Latest stable: COBOL 2023 = ISO/IEC 1989:2023 (published 2023). Predecessors: COBOL-60, -68, -74, -85, 2002, 2014.
- Paradigms: Imperative, procedural, with object-oriented features added in COBOL 2002 (classes, interfaces, inheritance).
- Typing: Static, with PICTURE clauses describing data layout at the byte/digit level rather than abstract types. Strong layout discipline; weak abstract type checking by modern standards.
- Memory: No GC. Mostly static allocation declared in the DATA DIVISION; modern COBOL adds dynamic allocation (ALLOCATE/FREE). Working-storage, local-storage, linkage sections segregate lifetimes.
- Compilation: Ahead-of-time. Major implementations: IBM Enterprise COBOL for z/OS (mainframe, dominant), IBM COBOL for AIX/Linux on Power, Micro Focus Visual COBOL (Windows/Linux/.NET/JVM), GnuCOBOL (free, transpiles to C), Fujitsu NetCOBOL, and the GCC COBOL frontend (gcobol), merged into GCC 15 (2025) as the first FOSS COBOL compiler in the GCC tree.
- Primary domains: Mainframe batch + transaction processing, banking core systems, insurance, government, payroll, ERP. Widely repeated industry estimates put 200+ billion lines in production and claim that about 95% of card-swipe transactions touch COBOL.
- Official docs: https://gcc.gnu.org/onlinedocs/gcobol/ (gcobol), https://gnucobol.sourceforge.io/ (GnuCOBOL), https://www.iso.org/standard/74527.html (ISO/IEC 1989:2023), https://www.ibm.com/docs/en/cobol-zos (Enterprise COBOL).
At a glance
COBOL is the language of records and reports: fixed-layout data, decimal arithmetic without floating-point error, hierarchical record definitions, and English-like statements. Its longevity comes from two things — (1) trillions of dollars of compiled production code that nobody dares touch, and (2) it is genuinely good at exactly what it does (mainframe batch, sequential file processing, packed-decimal financial math). Modern dialects (Enterprise COBOL 6+, Visual COBOL, gcobol) support OO, embedded XML/JSON, JVM/.NET interop, and Unicode, but the workhorse style is still COBOL-85 + structured programming + CICS/DB2.
Getting started
Install (free options):
- GnuCOBOL (3.2, July 2023): sudo apt install gnucobol (Debian/Ubuntu), brew install gnu-cobol (macOS). Compiler is cobc.
- gcobol (GCC’s COBOL frontend, merged into GCC 15, Apr 2025): build GCC with --enable-languages=cobol or use a distro that ships GCC 15+.
- IBM COBOL for Linux on x86 (commercial, free trial): https://www.ibm.com/products/cobol-compiler-linux.
- Micro Focus Visual COBOL Personal Edition (free for non-commercial).
Hello world (hello.cob, free-form):
IDENTIFICATION DIVISION.
PROGRAM-ID. HELLO.
PROCEDURE DIVISION.
DISPLAY "Hello, world!".
STOP RUN.
Compile + run with GnuCOBOL: cobc -x -free hello.cob && ./hello. With gcobol: gcobol -ffree-form -o hello hello.cob && ./hello. (-x = build an executable, not a module; -free for cobc and -ffree-form for gcobol = free-form source. Without -free, cobc assumes fixed format, where columns 1-6 are reserved.)
Project layout. No standardized layout. Convention is one program per .cob file, with shared record definitions in .cpy (COPY) files included via COPY MEMBER-NAME.. Modules are linked statically or dynamically called via CALL "PROGRAM-NAME".
Build tools. No package manager. GNU make is standard for FOSS COBOL. On z/OS, JCL (Job Control Language) drives compile+link+execute steps; DBB (Dependency Based Build) + Zowe are the modern Git-based replacements for legacy SCM (Endevor, ChangeMan).
REPL. None; the workflow is an edit-compile-run cycle. Some dialects (Micro Focus) ship an animator/debugger that feels interactive.
Basics
Source format. Fixed format (the default for legacy code): columns 1-6 sequence numbers, column 7 indicator (* = comment, - = continuation, D = debug), columns 8-11 “Area A” (division/section/paragraph headers), columns 12-72 “Area B” (statements), 73-80 ignored. Free format (modern, COBOL 2002+): no column rules, use cobc -free or >>SOURCE FORMAT FREE directive.
Four divisions, in order:
- IDENTIFICATION DIVISION: PROGRAM-ID., author, date metadata.
- ENVIRONMENT DIVISION: CONFIGURATION SECTION (source/object computer), INPUT-OUTPUT SECTION with FILE-CONTROL (assigns logical files to physical paths/datasets).
- DATA DIVISION: FILE SECTION (record layouts for files), WORKING-STORAGE SECTION (program-static), LOCAL-STORAGE SECTION (auto-storage per call), LINKAGE SECTION (parameters from caller).
- PROCEDURE DIVISION: the actual code, organized into sections and paragraphs.
Types via PICTURE clauses. A PIC clause describes character layout, not abstract type. Examples:
01 CUST-ID PIC 9(8). *> 8-digit unsigned integer (display)
01 PRICE PIC S9(7)V99 COMP-3. *> signed 7.2 packed decimal (5 bytes: 9 digits + sign nibble)
01 COUNTER PIC S9(9) COMP-5. *> native 32-bit signed binary
01 NAME PIC X(30). *> 30 chars
01 RATE PIC 9V9(4). *> 1 digit before, 4 after implicit decimal
- 9 = digit, X = any char, A = letter, S = sign, V = implicit decimal point, (n) = repeat.
- USAGE clauses: DISPLAY (default; one digit/char per byte, EBCDIC or ASCII), COMP/COMP-4/BINARY (native binary), COMP-3/PACKED-DECIMAL (BCD, 2 digits/byte + sign nibble; the workhorse for money), COMP-5 (native binary, no truncation to PIC width), COMP-1 (single-precision float), COMP-2 (double-precision float; rarely used, and you want COMP-3 for money).
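The byte counts implied by PICTURE and USAGE can be checked with the LENGTH intrinsic. The following is a minimal free-form sketch, assuming a compiler such as GnuCOBOL (cobc -x -free); the program name PICSIZES and the CUST-NAME field name are illustrative choices, not taken from any real system.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. PICSIZES.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Same pictures as in the examples above; only the names are illustrative.
01 CUST-ID    PIC 9(8).               *> DISPLAY: one byte per digit
01 PRICE      PIC S9(7)V99 COMP-3.    *> packed: two digits per byte + sign nibble
01 COUNTER    PIC S9(9) COMP-5.       *> native binary
01 CUST-NAME  PIC X(30).              *> one byte per character
PROCEDURE DIVISION.
    *> FUNCTION LENGTH reports the storage size of each item.
    DISPLAY "CUST-ID   bytes: " FUNCTION LENGTH(CUST-ID)
    DISPLAY "PRICE     bytes: " FUNCTION LENGTH(PRICE)
    DISPLAY "COUNTER   bytes: " FUNCTION LENGTH(COUNTER)
    DISPLAY "CUST-NAME bytes: " FUNCTION LENGTH(CUST-NAME)
    GOBACK.
```

On a typical platform this should report 8, 5, 4, and 30 bytes respectively; the exact formatting of the numbers in the DISPLAY output depends on the compiler.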
Hierarchical records via level numbers. 01 is a record root; 02-49 are subordinate fields; 66 for renaming, 77 for standalone, 88 for condition names:
01 CUSTOMER.
05 CUST-ID PIC 9(8).
05 CUST-NAME.
10 FIRST-NAME PIC X(15).
10 LAST-NAME PIC X(20).
05 STATUS-CODE PIC X.
88 ACTIVE VALUE "A".
88 INACTIVE VALUE "I" "X".
Then test with IF ACTIVE ... (88-level condition names).
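A small runnable sketch of condition names in use, again in free-form source. The CUST- prefixes on the 88-level names and the program name CONDNAME are my own choices for illustration; the record mirrors the CUSTOMER layout above in reduced form.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CONDNAME.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 CUSTOMER.
   05 CUST-ID          PIC 9(8) VALUE 42.
   05 STATUS-CODE      PIC X    VALUE "X".
      88 CUST-ACTIVE            VALUE "A".
      88 CUST-INACTIVE          VALUE "I" "X".
PROCEDURE DIVISION.
    *> True because STATUS-CODE holds "X", one of the listed values.
    IF CUST-INACTIVE
        DISPLAY "inactive"
    END-IF
    *> SET ... TO TRUE stores the condition's first value ("A") in STATUS-CODE.
    SET CUST-ACTIVE TO TRUE
    IF CUST-ACTIVE
        DISPLAY "active, STATUS-CODE=" STATUS-CODE
    END-IF
    GOBACK.
```

This prints "inactive" and then "active, STATUS-CODE=A". The literal "A" appears only once, in the data definition.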
Variables/scoping. Lexical scope is the program. Sections/paragraphs share all DATA DIVISION storage (it’s effectively all global within the program). LOCAL-STORAGE is allocated fresh on each call (which is what re-entrancy needs). Nested programs (COBOL-85+) provide scoping, but flat structure is overwhelmingly common.
Control flow. IF / ELSE / END-IF, EVALUATE / WHEN (case), PERFORM paragraph-name (call), PERFORM ... THRU ... TIMES, PERFORM VARYING I FROM 1 BY 1 UNTIL ... (the for-loop), PERFORM ... WITH TEST AFTER UNTIL ... (do-while). Modern: structured END-IF/END-PERFORM/END-EVALUATE terminators close blocks unambiguously — always use them; legacy “period scoping” (statements terminated by .) is a bug minefield.
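The constructs above can be combined as in the following sketch (free-form; the program, paragraph, and WS- names are hypothetical). It uses an inline PERFORM VARYING as the loop, EVALUATE TRUE as the case statement, and an out-of-line PERFORM of a paragraph.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FLOWDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-I      PIC 9(2).
01 WS-SUM    PIC 9(4) VALUE 0.
PROCEDURE DIVISION.
MAIN-PARA.
    *> Inline loop: WS-I runs 1..5; the UNTIL test is made before each pass.
    PERFORM VARYING WS-I FROM 1 BY 1 UNTIL WS-I > 5
        EVALUATE TRUE
            WHEN WS-I = 1
                DISPLAY WS-I " first"
            WHEN FUNCTION MOD(WS-I, 2) = 0
                DISPLAY WS-I " even"
            WHEN OTHER
                DISPLAY WS-I " odd"
        END-EVALUATE
        ADD WS-I TO WS-SUM
    END-PERFORM
    *> Out-of-line PERFORM: runs the paragraph, then returns here.
    PERFORM SHOW-SUM
    GOBACK.
SHOW-SUM.
    DISPLAY "sum=" WS-SUM.
```

The last line printed is "sum=0015". Every block is closed by an explicit terminator, so the only periods are at paragraph ends.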
Procedures. Two flavors: (1) paragraphs invoked with PERFORM — share storage, no formal parameters. (2) CALL to a separate program: CALL "SUBPROG" USING BY REFERENCE A, BY VALUE B, BY CONTENT C. The called program declares matching LINKAGE SECTION items and PROCEDURE DIVISION USING .... BY REFERENCE (default) passes pointer; BY VALUE (COBOL-2002+) passes copy; BY CONTENT passes copy but receiver sees it as reference.
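The CALL flavor can be tried in a single source file by nesting the subprogram inside the caller. This is a sketch under that assumption; the names CALLDEMO, TENFOLD, and the WS-/LK- items are illustrative. It passes one argument BY REFERENCE and one BY CONTENT to show the difference.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. CALLDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-QTY    PIC 9(4) VALUE 3.
01 WS-TOTAL  PIC 9(6) VALUE 0.
PROCEDURE DIVISION.
    *> WS-TOTAL is shared with the callee; WS-QTY is passed as a copy.
    CALL "TENFOLD" USING BY REFERENCE WS-TOTAL
                         BY CONTENT   WS-QTY
    DISPLAY "TOTAL=" WS-TOTAL " QTY=" WS-QTY
    GOBACK.

*> Nested subprogram: LINKAGE items describe the caller's arguments.
IDENTIFICATION DIVISION.
PROGRAM-ID. TENFOLD.
DATA DIVISION.
LINKAGE SECTION.
01 LK-TOTAL  PIC 9(6).
01 LK-QTY    PIC 9(4).
PROCEDURE DIVISION USING LK-TOTAL LK-QTY.
    COMPUTE LK-TOTAL = LK-QTY * 10
    *> This changes only the callee's copy, because of BY CONTENT.
    MOVE 0 TO LK-QTY
    GOBACK.
END PROGRAM TENFOLD.
END PROGRAM CALLDEMO.
```

This prints "TOTAL=000030 QTY=0003": the update through LK-TOTAL is visible to the caller, while the change to LK-QTY is not.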
Strings. PIC X(n) is fixed-length, blank-padded, ASCII or EBCDIC depending on platform. STRING ... DELIMITED BY ... INTO ... for concatenation, UNSTRING for tokenization, INSPECT ... TALLYING / REPLACING for find/replace, MOVE FUNCTION UPPER-CASE(X) TO Y. National (Unicode UTF-16) via PIC N(n) and USAGE NATIONAL.
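A sketch of the three string verbs plus one intrinsic, in free-form source. All names and sample values here are made up for illustration.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. STRDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-FIRST    PIC X(15) VALUE "Grace".
01 WS-LAST     PIC X(20) VALUE "Hopper".
01 WS-FULL     PIC X(40) VALUE SPACES.
01 WS-CSV      PIC X(20) VALUE "A1,B22,C333".
01 WS-F1       PIC X(5).
01 WS-F2       PIC X(5).
01 WS-F3       PIC X(5).
01 WS-COMMAS   PIC 9(2) VALUE 0.
PROCEDURE DIVISION.
    *> Concatenate: DELIMITED BY SPACE stops at the first blank of each
    *> padded field, DELIMITED BY SIZE copies the whole literal.
    STRING WS-LAST  DELIMITED BY SPACE
           ", "     DELIMITED BY SIZE
           WS-FIRST DELIMITED BY SPACE
        INTO WS-FULL
    END-STRING
    DISPLAY FUNCTION UPPER-CASE(WS-FULL)
    *> Tokenize on commas into three fixed-length receivers.
    UNSTRING WS-CSV DELIMITED BY "," INTO WS-F1 WS-F2 WS-F3
    END-UNSTRING
    *> Count the commas in the source field.
    INSPECT WS-CSV TALLYING WS-COMMAS FOR ALL ","
    DISPLAY WS-F1 "|" WS-F2 "|" WS-F3 "| commas=" WS-COMMAS
    GOBACK.
```

Because every field is fixed-length and blank-padded, the displayed tokens carry trailing spaces up to their declared width.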
Collections. Arrays = “tables” via OCCURS:
01 MONTHLY-SALES.
05 MONTH-AMT OCCURS 12 TIMES PIC S9(7)V99 COMP-3.
Index with MONTH-AMT (3): the subscript goes on the item that carries OCCURS (or one subordinate to it), not on the enclosing group. Multi-dimensional: nested OCCURS. Searchable tables: OCCURS 100 TIMES ASCENDING KEY IS K INDEXED BY I, then SEARCH ALL does binary search.
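A minimal sketch of a keyed table searched with SEARCH ALL. The table contents ("AA10" and so on) are placeholder data, and RATE-TABLE, RATE-ENTRY, RATE-CODE, RATE-PCT, and RATE-IDX are hypothetical names. The entries must already be in ascending key order for the binary search to be valid.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. TABDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 RATE-TABLE.
   05 RATE-ENTRY OCCURS 3 TIMES
                 ASCENDING KEY IS RATE-CODE
                 INDEXED BY RATE-IDX.
      10 RATE-CODE  PIC X(2).
      10 RATE-PCT   PIC 9(2).
PROCEDURE DIVISION.
    *> Group moves: the first two bytes land in RATE-CODE, the next two in RATE-PCT.
    MOVE "AA10" TO RATE-ENTRY (1)
    MOVE "BB20" TO RATE-ENTRY (2)
    MOVE "CC30" TO RATE-ENTRY (3)
    *> SEARCH ALL sets RATE-IDX itself; no manual index initialisation is needed.
    SEARCH ALL RATE-ENTRY
        AT END
            DISPLAY "not found"
        WHEN RATE-CODE (RATE-IDX) = "BB"
            DISPLAY "BB rate: " RATE-PCT (RATE-IDX)
    END-SEARCH
    *> Plain subscripting uses the elementary item name plus a position.
    DISPLAY "third code: " RATE-CODE (3)
    GOBACK.
```

This prints "BB rate: 20" followed by "third code: CC".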
Intermediate
Modules. Programs callable via dynamic CALL. COPY books (.cpy) are textual includes for shared record layouts — every shop has hundreds. REPLACING clause on COPY does textual substitution: COPY CUSTREC REPLACING ==:PFX:== BY ==CUST==. produces a copy with :PFX: replaced by CUST everywhere. Modern (COBOL-85+) nested programs allow lexical nesting with IS COMMON and IS GLOBAL visibility.
Error handling. Per-statement: ON SIZE ERROR, ON OVERFLOW, INVALID KEY, AT END, plus their NOT variants (NOT ON SIZE ERROR, NOT AT END). File I/O sets FILE STATUS codes (two-character strings such as "00" for success and "10" for end of file). No exceptions in classic COBOL; COBOL 2002 added exception conditions with RAISE and exception handlers in DECLARATIVES, but adoption is uneven. Mainframe shops use the Language Environment (LE) condition handler for cross-language error propagation.
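The per-statement style looks like this in practice. The sketch below (hypothetical names, free-form source) contrasts an arithmetic statement guarded by ON SIZE ERROR with a plain MOVE, which has no such phrase.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SIZEERR.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-SMALL   PIC 9(3) VALUE 0.
01 WS-BIG     PIC 9(5) VALUE 12345.
PROCEDURE DIVISION.
    *> 12346 does not fit in PIC 9(3): the SIZE ERROR branch runs and
    *> WS-SMALL keeps its previous value.
    COMPUTE WS-SMALL = WS-BIG + 1
        ON SIZE ERROR
            DISPLAY "size error, WS-SMALL left as " WS-SMALL
        NOT ON SIZE ERROR
            DISPLAY "stored " WS-SMALL
    END-COMPUTE
    *> MOVE raises nothing: the high-order digits are simply dropped.
    MOVE WS-BIG TO WS-SMALL
    DISPLAY "after MOVE: " WS-SMALL
    GOBACK.
```

This prints "size error, WS-SMALL left as 000" and then "after MOVE: 345".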
Concurrency. None in the standard. Mainframe concurrency comes from outside: CICS transactions are inherently per-instance, batch jobs are scheduled by JES2/JES3, parallel sysplex provides cluster-level concurrency. COBOL programs themselves are single-threaded. Recent dialects (Visual COBOL on .NET/JVM) inherit host concurrency.
I/O. First-class: the language was designed around it.
- Sequential files: OPEN INPUT INFILE / READ INFILE INTO REC AT END SET EOF TO TRUE END-READ / CLOSE INFILE.
- Indexed files (VSAM/KSDS-style): ORGANIZATION IS INDEXED RECORD KEY IS CUST-ID ALTERNATE RECORD KEY IS NAME WITH DUPLICATES. READ ... KEY IS ..., START, REWRITE.
- Relative files: access by record number.
- Line-sequential (text files on Unix): GnuCOBOL/Visual COBOL extension.
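A self-contained sketch of the sequential read loop, assuming a compiler that supports the LINE SEQUENTIAL extension (GnuCOBOL does). It first writes two records to a scratch file so that there is something to read. The file name demo.txt and all data names are hypothetical.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. SEQIO.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT DEMO-FILE ASSIGN TO "demo.txt"
        ORGANIZATION IS LINE SEQUENTIAL
        FILE STATUS IS WS-FS.
DATA DIVISION.
FILE SECTION.
FD DEMO-FILE.
01 DEMO-REC          PIC X(20).
WORKING-STORAGE SECTION.
01 WS-FS             PIC XX.
01 WS-EOF-FLAG       PIC X VALUE "N".
   88 WS-EOF               VALUE "Y".
01 WS-COUNT          PIC 9(4) VALUE 0.
PROCEDURE DIVISION.
MAIN-PARA.
    *> Create the scratch file, checking FILE STATUS after each verb.
    OPEN OUTPUT DEMO-FILE
    PERFORM CHECK-STATUS
    MOVE "first record" TO DEMO-REC
    WRITE DEMO-REC
    PERFORM CHECK-STATUS
    MOVE "second record" TO DEMO-REC
    WRITE DEMO-REC
    PERFORM CHECK-STATUS
    CLOSE DEMO-FILE
    *> Read it back until AT END sets the 88-level flag.
    OPEN INPUT DEMO-FILE
    PERFORM CHECK-STATUS
    PERFORM UNTIL WS-EOF
        READ DEMO-FILE
            AT END
                SET WS-EOF TO TRUE
            NOT AT END
                ADD 1 TO WS-COUNT
                DISPLAY WS-COUNT ": " DEMO-REC
        END-READ
    END-PERFORM
    CLOSE DEMO-FILE
    GOBACK.
CHECK-STATUS.
    IF WS-FS NOT = "00"
        DISPLAY "I/O failed, FILE STATUS " WS-FS
        STOP RUN
    END-IF.
```

On z/OS the ASSIGN clause would name a DD rather than a path, and the organization would normally be plain SEQUENTIAL.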
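Stdlib highlights. Intrinsic functions (FUNCTION …): math (SQRT, LOG, MEAN), strings (UPPER-CASE, LOWER-CASE, REVERSE, TRIM), date/time (CURRENT-DATE, INTEGER-OF-DATE, DATE-OF-INTEGER; the day of the week comes from ACCEPT ... FROM DAY-OF-WEEK rather than from a function), type conversion (NUMVAL, NUMVAL-C for currency-formatted strings). IBM Enterprise COBOL adds XML GENERATE / XML PARSE and JSON GENERATE / JSON PARSE for in-language serialization (one of the big wins for modernization); these are vendor extensions rather than part of the ISO standard, so check what your target compiler supports.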
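A few of the intrinsic functions in one runnable sketch. The input string, the dates, and all names are illustrative; NUMVAL-C is assumed to use the default currency sign ($).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. FUNCDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-TEXT     PIC X(12) VALUE "$1,234.50".
01 WS-AMOUNT   PIC S9(7)V99 COMP-3.
01 WS-EDITED   PIC Z,ZZZ,ZZ9.99.
01 WS-DAYS     PIC 9(5).
PROCEDURE DIVISION.
    *> NUMVAL-C strips the currency sign and commas and yields a number.
    COMPUTE WS-AMOUNT = FUNCTION NUMVAL-C(WS-TEXT)
    MOVE WS-AMOUNT TO WS-EDITED
    DISPLAY "amount: " WS-EDITED
    *> Date arithmetic: convert YYYYMMDD to day numbers and subtract.
    COMPUTE WS-DAYS = FUNCTION INTEGER-OF-DATE(20240301)
                    - FUNCTION INTEGER-OF-DATE(20240201)
    DISPLAY "days between: " WS-DAYS
    *> CURRENT-DATE returns 21 characters; the first 8 are YYYYMMDD.
    DISPLAY FUNCTION UPPER-CASE("today: ") FUNCTION CURRENT-DATE(1:8)
    GOBACK.
```

The day difference printed is 00029, since February 2024 has 29 days.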
Advanced
Memory model. WORKING-STORAGE is allocated once, persists across calls (effectively static). LOCAL-STORAGE is allocated per invocation (needed for re-entrant programs called concurrently from CICS). LINKAGE SECTION items are not allocated by the called program — they reference caller’s storage. Pointers (USAGE POINTER, USAGE PROCEDURE-POINTER) and ADDRESS OF enable dynamic structures and FFI; ALLOCATE/FREE (COBOL-2002) heap-allocate.
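The lifetime difference between WORKING-STORAGE and LOCAL-STORAGE can be observed by calling the same subprogram several times. This sketch assumes a compiler that accepts two separate programs in one source file and resolves the CALL between them (GnuCOBOL with cobc -x is expected to); STORDEMO, CALLCNT, and the counters are hypothetical names.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. STORDEMO.
PROCEDURE DIVISION.
    CALL "CALLCNT"
    CALL "CALLCNT"
    CALL "CALLCNT"
    GOBACK.
END PROGRAM STORDEMO.

IDENTIFICATION DIVISION.
PROGRAM-ID. CALLCNT.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Allocated once; keeps its value between calls.
01 WS-CALLS  PIC 9(3) VALUE 0.
LOCAL-STORAGE SECTION.
*> Allocated and re-initialised on every call.
01 LS-CALLS  PIC 9(3) VALUE 0.
PROCEDURE DIVISION.
    ADD 1 TO WS-CALLS LS-CALLS
    DISPLAY "WS=" WS-CALLS " LS=" LS-CALLS
    GOBACK.
END PROGRAM CALLCNT.
```

The expected pattern is that WS-CALLS counts 001, 002, 003 across the three calls while LS-CALLS shows 001 each time.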
Concurrency deep dive. In Enterprise COBOL on z/OS, the THREAD compiler option enables thread-safe code via LOCAL-STORAGE; CICS transactions and IMS message regions provide the actual concurrency. On distributed platforms, Visual COBOL on .NET/JVM exposes host threads. The COBOL 2002 standard has no concurrency primitives.
FFI. Three patterns: (1) CALL "C-FUNCTION" with USING arguments that match the C parameter list; this calls straight into C when the calling conventions match. GnuCOBOL transpiles to C, so any C library is callable. (2) EXEC SQL preprocessor (DB2, Oracle, PostgreSQL via OpenESQL) embeds SQL with EXEC SQL ... END-EXEC, expanded by the precompiler into CALLs to the runtime. (3) EXEC CICS for CICS transaction services (EXEC CICS READ FILE('CUSTFILE') RIDFLD(CUST-ID) INTO(CUSTREC) END-EXEC). On z/OS, the Language Environment (LE) lets COBOL, PL/I, C, and assembler share a runtime, condition handling, and storage manager.
Reflection. Almost none. LENGTH OF returns the byte length of an item. ADDRESS OF returns a pointer. No introspection of record layouts at runtime; all layout knowledge is compile-time via copybooks.
Performance tools. On z/OS: Strobe, APA (Application Performance Analyzer), IBM Fault Analyzer for post-mortem dumps, CICS Performance Analyzer. Compile flags matter — OPT(2) (ARCH(11+) on modern z) for hot paths, LIST/MAP for storage-map listings. On distributed: standard profilers if compiled to native; .NET/JVM dialects use host profilers (PerfView, JFR).
God mode
REDEFINES lets two declarations alias the same storage — used to interpret a record different ways, or implement variant records:
01 TXN-RECORD.
05 TXN-TYPE PIC X.
05 TXN-DATA PIC X(99).
05 PURCHASE REDEFINES TXN-DATA.
10 AMOUNT PIC S9(7)V99 COMP-3.
10 SKU PIC X(15).
10 FILLER PIC X(79).
05 REFUND REDEFINES TXN-DATA.
10 ORIG-TXN PIC 9(12).
10 AMOUNT PIC S9(7)V99 COMP-3.
10 FILLER PIC X(82).
Combined with COMP-3 (packed decimal: 2 digits/byte plus sign nibble), this is how a great many mainframe transaction records, credit card traffic included, are laid out.
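A reduced, runnable version of the variant-record idea. This sketch shrinks the payload to 20 bytes and adds 88-level names on the type byte; the condition names, WS-SHOW, and the sample values are illustrative. Each redefinition adds up to exactly the size of TXN-DATA (5 + 15 and 12 + 5 + 3 bytes).

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. REDEFDEMO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 TXN-RECORD.
   05 TXN-TYPE           PIC X.
      88 TXN-IS-PURCHASE       VALUE "P".
      88 TXN-IS-REFUND         VALUE "R".
   05 TXN-DATA           PIC X(20).
   05 PURCHASE REDEFINES TXN-DATA.
      10 AMOUNT          PIC S9(7)V99 COMP-3.
      10 SKU             PIC X(15).
   05 REFUND REDEFINES TXN-DATA.
      10 ORIG-TXN        PIC 9(12).
      10 AMOUNT          PIC S9(7)V99 COMP-3.
      10 FILLER          PIC X(3).
01 WS-SHOW               PIC -(7)9.99.
PROCEDURE DIVISION.
MAIN-PARA.
    *> Fill the payload through the PURCHASE view.
    SET TXN-IS-PURCHASE TO TRUE
    MOVE 19.99 TO AMOUNT OF PURCHASE
    MOVE "SKU-0001" TO SKU
    PERFORM SHOW-TXN
    *> Reuse the same 20 bytes through the REFUND view.
    SET TXN-IS-REFUND TO TRUE
    MOVE 123456789012 TO ORIG-TXN
    MOVE -19.99 TO AMOUNT OF REFUND
    PERFORM SHOW-TXN
    GOBACK.
SHOW-TXN.
    *> The type byte decides which view of the storage is meaningful.
    EVALUATE TRUE
        WHEN TXN-IS-PURCHASE
            MOVE AMOUNT OF PURCHASE TO WS-SHOW
            DISPLAY "purchase " SKU WS-SHOW
        WHEN TXN-IS-REFUND
            MOVE AMOUNT OF REFUND TO WS-SHOW
            DISPLAY "refund of " ORIG-TXN WS-SHOW
    END-EVALUATE.
```

AMOUNT exists in both views, so every reference has to be qualified with OF PURCHASE or OF REFUND. Reading a view that does not match TXN-TYPE is not checked by the language; it simply reinterprets the bytes.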
COPY … REPLACING as a poor man’s macro system. Use ==:tag:== markers and substitute. Combined with conditional >>IF / >>ELSE / >>END-IF directives (COBOL 2002+), copybooks become a real preprocessor.
Embedded SQL precompilers. DB2 precompiler (DSNHPC) and Oracle Pro*COBOL translate EXEC SQL into calls + cursors + descriptor structs. SQLCODE/SQLSTATE returned in the SQLCA (SQL Communications Area) record. Modern: OpenESQL (Micro Focus) targets PostgreSQL/MS SQL/ODBC. CICS preprocessor (DFHECP1$) translates EXEC CICS similarly.
REPORT WRITER. A largely-forgotten gem of the standard: declarative reports with RD (Report Description) clauses defining headers, footers, control breaks, page format. INITIATE / GENERATE record / TERMINATE. Eliminates hand-coded pagination and totaling logic. Optional in COBOL-85, mandatory in 2002. Most shops moved to dedicated report tools (SAS, Crystal) but it’s still in the standard.
OO COBOL (2002+). CLASS-ID. WIDGET INHERITS FROM BASE. METHOD-ID. PAINT. INVOKE WIDGET "PAINT" RETURNING .... Visual COBOL leans into this; mainframe shops mostly ignore it.
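GnuCOBOL → C. cobc -C hello.cob stops after translation and leaves the generated C (hello.c plus companion headers) on disk; cobc -E, by contrast, only runs the COBOL preprocessor. Useful for understanding semantics, debugging, or hand-editing for unusual deployment. The C source is verbose but readable.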
JCL integration. On z/OS, the COBOL program is just one step in a JCL job:
//STEP1 EXEC PGM=PAYROLL
//INFILE DD DSN=PROD.CUSTOMER.MASTER,DISP=SHR
//OUTFILE DD DSN=PROD.PAYROLL.OUT,DISP=(NEW,CATLG),
// SPACE=(CYL,(100,50)),DCB=(RECFM=FB,LRECL=200)
The DD statements bind logical file names (in SELECT INFILE ASSIGN TO INFILE) to physical datasets. Understanding JCL is essential for any z/OS COBOL work.
Language Environment callable services. On z/OS, LE provides callable services: CEELOCT (local time), CEEDATE (date format), CEE3ABD (abend), CEEDCOD (decode condition tokens). These form the language-neutral layer beneath COBOL/PL/I/C.
Idioms & style
- Always use scope terminators (END-IF, END-PERFORM, END-EVALUATE, END-READ). Period-terminated statements are a bug magnet: one stray period closes a block early.
- Use EVALUATE over chained IF/ELSE IF. It’s COBOL’s switch and supports WHEN OTHER, WHEN ... ALSO ... (multi-dimensional), and ranges.
- Use 88-level condition names instead of magic literals: 88 ACTIVE VALUE "A" then IF ACTIVE reads cleanly and centralizes the constant.
- Naming: KEBAB-CASE-WITH-DASHES. Prefix indicators are common: WS- for working-storage, LK- for linkage, FD- for file description, WS-CUST-ID-N (N = numeric edited).
- Use COMP-3 for currency. Decimal arithmetic, no float error, ~half the bytes of DISPLAY.
- Avoid GO TO and ALTER. ALTER was removed in COBOL 2002 for a reason. Use PERFORM for structured control flow.
- Keep paragraphs small and PERFORM them. Resist falling through paragraph boundaries: it’s allowed and almost always a bug.
- Formatter/linter: cobol-check (unit tests + assertions), cobolcritic, OpenSource COBOL Lint (rare). Most shops rely on code reviews + compiler warnings (cobc -W flags). IBM provides Code Coverage + Debug Tool for Enterprise COBOL.
- Expert review focus: (1) Period placement: do scope terminators close every nested block? (2) MOVE between mismatched PICs: silent truncation or padding? (3) ON SIZE ERROR handled on every COMPUTE? (4) File status checked after every I/O verb? (5) INITIALIZE called on records before reuse? (6) Re-entrant programs using LOCAL-STORAGE rather than WORKING-STORAGE? (7) COMP vs COMP-3 vs DISPLAY: wrong choice for money or for binary keys?
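The EVALUATE advice, including ALSO and ranges, can be sketched as follows. The age bands, status codes, and names are invented for the example.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. EVALALSO.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-AGE       PIC 9(3) VALUE 30.
01 WS-STATUS    PIC X    VALUE "A".
PROCEDURE DIVISION.
    *> Two subjects are tested together; each WHEN supplies one object per subject.
    EVALUATE WS-AGE ALSO WS-STATUS
        WHEN 0 THRU 17 ALSO ANY
            DISPLAY "minor"
        WHEN 18 THRU 64 ALSO "A"
            DISPLAY "active adult"
        WHEN 18 THRU 64 ALSO "I"
            DISPLAY "inactive adult"
        WHEN OTHER
            DISPLAY "other"
    END-EVALUATE
    GOBACK.
```

With the initial values shown this prints "active adult". The first matching WHEN wins and there is no fall-through between branches.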
Ecosystem
- Mainframe runtimes: z/OS (CICS, IMS, DB2, JES, RACF, MQ), z/VSE, IBM i (formerly AS/400, now Power) with ILE COBOL.
- Distributed compilers: GnuCOBOL (FOSS, transpiles to C), GCC gcobol (FOSS, native, GCC 15+), Micro Focus Visual COBOL (Win/Linux/.NET/JVM), Fujitsu NetCOBOL, IBM COBOL for Linux on x86/Power.
- Modernization: Heirloom Computing (mainframe → JVM), AWS Mainframe Modernization (Blu Age, Micro Focus), Astadia, Asysco (mainframe → .NET).
- DevOps for mainframe: IBM Z Open Editor (VS Code), Zowe (open-source mainframe CLI/SDK), DBB (Dependency Based Build, Git-based replacement for Endevor), Bridge for Git.
- Testing: cobol-check (xUnit-style), MFUnit (Micro Focus), IBM zUnit, CoboMojo.
- Notable users: Every major US bank, IRS, Social Security Administration, Fortune 500 insurance, FAA, every airline reservation system, most state DMVs and unemployment systems (the headline-grabbing pandemic crash-stories were all COBOL backends overwhelmed, not COBOL itself failing).
Gotchas
- Period scoping. In IF X = 1 MOVE A TO B. MOVE C TO D. the second MOVE is outside the IF because of the period. Use END-IF. This single rule has caused more COBOL bugs than any other.
- Implicit MOVE truncation. Moving PIC X(10) to PIC X(5) silently truncates the right side; moving PIC 9(5) to PIC 9(3) silently drops high digits without raising ON SIZE ERROR (that fires only on COMPUTE/arithmetic). MOVE has no size-error phrase or scope terminator to help; check lengths manually or use INSPECT.
- EBCDIC vs ASCII. Mainframe COBOL data is EBCDIC; distributed COBOL is ASCII. Files moved between platforms need conversion (iconv, FTP ASCII mode, special FD clauses). PIC X(1) VALUE "A" is a different byte on z/OS vs Linux.
- COMP vs COMP-5. COMP (a.k.a. COMP-4, BINARY) respects PIC width: PIC S9(4) COMP truncates beyond 9999 under standard truncation rules (for example IBM's TRUNC(STD)). COMP-5 is “true binary” and uses the full storage width regardless of PIC. Mixing them when interfacing with C will silently corrupt data.
- COMP-3 sign nibble. Packed-decimal stores sign in the low nibble of the last byte: 0xC = positive, 0xD = negative, 0xF = unsigned. Hand-edited or imported data with the wrong sign nibble silently parses but fails arithmetic comparisons.
- STOP RUN in a called program. On z/OS LE, STOP RUN from a sub-program may terminate the entire LE enclave, not just the sub-program. Use EXIT PROGRAM or GOBACK (the modern, always-correct verb) instead.
- CALL overhead. Static CALL is cheap; dynamic CALL identifier does a load-on-first-call and pays a lookup cost. For inner loops, prefer static. Cancel dynamically loaded programs with CANCEL to free storage; otherwise they leak across CICS transactions.
- File status not checked. READ/WRITE/OPEN set a 2-byte status code; if you don’t check it (AT END, INVALID KEY, or explicit FILE STATUS), failures silently propagate. Mandatory practice: every I/O verb has a status check.
- INITIALIZE and FILLER. INITIALIZE skips FILLER items by default and only sets numeric to zero, alphanumeric to spaces. Use INITIALIZE ... REPLACING ALPHANUMERIC DATA BY ... (one data category per phrase) or INITIALIZE ... WITH FILLER for full clearing.
- Standards drift. “COBOL” without qualification could be any of -85, 2002, 2014, 2023, IBM Enterprise (which adds extensions), Micro Focus (adds different extensions), GnuCOBOL (subset). Always compile-check on the target dialect.
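The period-scoping gotcha is easy to reproduce. In this sketch (hypothetical WS- names) the condition is false in both cases, yet the period-terminated version still executes the second MOVE.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. PERIODS.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 WS-X  PIC 9 VALUE 2.
01 WS-B  PIC X VALUE "-".
01 WS-D  PIC X VALUE "-".
PROCEDURE DIVISION.
MAIN-PARA.
    *> The first period ends the IF, so the second MOVE always runs.
    IF WS-X = 1 MOVE "A" TO WS-B. MOVE "C" TO WS-D.
    DISPLAY "period-scoped: B=" WS-B " D=" WS-D
    MOVE "-" TO WS-B WS-D
    *> With END-IF both MOVEs are inside the block and both are skipped.
    IF WS-X = 1
        MOVE "A" TO WS-B
        MOVE "C" TO WS-D
    END-IF
    DISPLAY "END-IF-scoped: B=" WS-B " D=" WS-D
    GOBACK.
```

This prints "period-scoped: B=- D=C" followed by "END-IF-scoped: B=- D=-".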
Citations
- ISO/IEC 1989:2023 (COBOL 2023): https://www.iso.org/standard/74527.html.
- gcobol (GCC COBOL frontend) docs: https://gcc.gnu.org/onlinedocs/gcobol/.
- GnuCOBOL home (3.2, July 2023): https://gnucobol.sourceforge.io/.
- IBM Enterprise COBOL for z/OS docs: https://www.ibm.com/docs/en/cobol-zos.
- Micro Focus Visual COBOL: https://www.microfocus.com/en-us/products/visual-cobol/overview.
- Wikipedia, COBOL (history, divisions, OO, dialects, usage stats): https://en.wikipedia.org/wiki/COBOL.
- IBM Language Environment (LE) Programming Guide: https://www.ibm.com/docs/en/zos/3.1.0?topic=environment-programming-guide.
- Zowe (open-source mainframe DevOps): https://www.zowe.org/.
- DBB (Dependency Based Build): https://www.ibm.com/docs/en/dbb.