Walkthrough: Design a SOC + Incident Response Platform (Mid-Enterprise, 5-10K Endpoints)
This walkthrough scopes a Security Operations Center (SOC) plus incident-response platform for a mid-enterprise organization: 5,000-10,000 endpoints, 500-2,000 servers + cloud workloads, hybrid on-prem + multi-cloud, ~$300M-2B annual revenue, with regulated data (PCI-DSS in scope, SOX, regional privacy regimes including GDPR + CCPA). The design covers SIEM tier, SOAR automation, EDR/XDR, NDR, identity threat detection, threat intel, vulnerability management, exposure management, deception, IR playbooks, staffing model (build vs MSSP vs hybrid), MITRE ATT&CK mapping, and the budget envelope ($3-8M annual). The reference period is 2025-2026: post-NIST CSF 2.0 (Feb 2024), CIRCIA reporting rules nearing effectivity, EU NIS2 + DORA in force.
Reference enterprises and recent incidents shaping the design: MGM Resorts (Sep 2023, $100M+ reported impact) and Caesars (Sep 2023, ~$15M ransom reportedly paid); Change Healthcare (Feb 2024, BlackCat); the CrowdStrike Falcon outage (Jul 2024, ~$5.4B in losses per Parametrix, airlines + healthcare hit hard; not an attack but instructive); SolarWinds SUNBURST (2020, supply-chain); Microsoft Storm-0558 (Jul 2023, stolen Microsoft signing key used to forge tokens for Exchange Online mailboxes, including the State Dept); Snowflake customer breaches (2024, valid credentials abused: Ticketmaster, AT&T, Santander).
1. Program spec
| Parameter | Target | Notes |
|---|---|---|
| Endpoints | 7,500 (laptops + workstations + servers + VMs) | Mix Windows / Mac / Linux |
| Cloud workloads | 2,000 (AWS + Azure primarily, some GCP) | EKS/AKS pods counted at deployment, not pod-instance |
| Mobile | 8,000 (MDM-enrolled) | iOS + Android |
| Network sites | 1 HQ + 8 satellite offices + 3 data centers + 2 cloud regions | 1 Gbps WAN typical |
| Identity surface | 12,000 directory accounts (employees + contractors + service) | Entra ID primary, Okta secondary |
| Coverage | 24×7×365 | Follow-the-sun or 3-shift |
| MTTD target | <15 min for critical | Median across confirmed incidents |
| MTTR target | <4 hr containment for critical | From initial detection |
| Major-incident MTTR | <72 hr full eradication + recovery | Per CISA + NIST IR-4 |
| Tier 1 alert volume | 200-800/day (after tuning) | Pre-tuning often 5,000+/day |
| Annual budget | $3-8M | See section 12 |
| Headcount | 12-22 FTE direct security | + leveraged MSSP / IR retainer |
2. SOC tier model and staffing
2.1 Function tiers
| Tier | Role | Daily activities |
|---|---|---|
| Tier 1 — Alert triage | Initial alert review, false-positive filter, escalate | Watch dashboards, validate IOC + telemetry, dispatch tickets |
| Tier 2 — Incident response | Investigate confirmed alerts, scope, contain | Pivot across SIEM + EDR + identity, write IR notes |
| Tier 3 — Threat hunting + senior IR | Proactive hunts, complex cases, malware analysis | Hypothesis-driven hunts, reverse engineering, advanced forensics |
| Tier 4 — Engineering | Detection content, SOAR playbooks, platform tuning | Sigma + KQL + Splunk SPL detection writing, parsers |
| SOC manager / lead | Operational management, metrics, vendor liaison | Scheduling, vendor reviews, exec reporting |
2.2 Staffing model — build vs MSSP vs hybrid
Three patterns for 24×7 coverage:
| Model | Annual cost (US blended) | Pros | Cons |
|---|---|---|---|
| Full in-house 24×7 | $2.5-5M just for Tier 1+2 (15-20 FTE) | Full control, deep institutional knowledge | Hiring + retention nightmare, particularly nights/weekends |
| Pure MSSP | $0.8-2.5M | Cheap, fast to stand up | Generic playbooks, alert-shopping behavior, weak custom detection |
| Hybrid — MSSP Tier 1 + in-house Tier 2-4 | $1.5-3.5M total | Best of both; outsource the worst shift work | Hand-off friction; needs disciplined runbook ownership |
For a 7,500-endpoint enterprise, hybrid dominates 2024-2026. Pattern: MSSP handles 24×7 Tier 1 (alert validation + immediate containment per pre-approved playbook); in-house Tier 2 takes over on escalation; in-house Tier 3 hunts; in-house Tier 4 engineers detection content + SOAR.
MSSP options for Tier 1:
- Arctic Wolf (~$60-120/endpoint/yr) — Concierge model + Arctic Wolf MDR
- Expel (~$45-85/endpoint/yr) — BYO-SIEM model + Expel Workbench
- Red Canary (~$50-95/endpoint/yr) — strong EDR partnership (CrowdStrike + Microsoft)
- Mandiant Managed Defense (Google Cloud) — enterprise pricing; ex-FireEye DNA, premium positioning
- Secureworks Taegis MDR: part of Sophos since Feb 2025 (previously Dell-majority-owned), ~$40-90/endpoint
- Rapid7 MDR — ~$50-95/endpoint
- CrowdStrike Falcon Complete (~$75-180/endpoint) — vendor-lock-in caveat
- Sophos MDR — strong SMB-mid penetration
- Trustwave (ex-Singtel; sold to the Chertoff Group's MC² Security Fund, Jan 2024): legacy enterprise
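To turn the per-endpoint prices above into an annual figure for this design, a minimal sketch follows. The price ranges are the estimates from the list; vendors without a listed per-endpoint price are omitted, and the function name is illustrative.

```python
# Per-endpoint annual price ranges (USD), copied from the estimates in the list above.
MSSP_PRICE_PER_ENDPOINT = {
    "Arctic Wolf": (60, 120),
    "Expel": (45, 85),
    "Red Canary": (50, 95),
    "Secureworks Taegis MDR": (40, 90),
    "Rapid7 MDR": (50, 95),
    "CrowdStrike Falcon Complete": (75, 180),
}


def mssp_annual_cost(endpoints: int) -> dict[str, tuple[int, int]]:
    """Return {vendor: (low, high)} annual cost in USD for the given endpoint count."""
    if endpoints < 0:
        raise ValueError("endpoints must be non-negative")
    return {
        vendor: (low * endpoints, high * endpoints)
        for vendor, (low, high) in MSSP_PRICE_PER_ENDPOINT.items()
    }


if __name__ == "__main__":
    # Print vendors sorted by their low-end annual cost for 7,500 endpoints.
    costs = mssp_annual_cost(7_500)
    for vendor, (low, high) in sorted(costs.items(), key=lambda kv: kv[1][0]):
        print(f"{vendor:<30} ${low:>9,} - ${high:>9,}")
```

At 7,500 endpoints the Arctic Wolf range works out to $450K-900K and Expel to roughly $338K-638K, which is where the $480K MSSP line in section 12.2 sits.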
In-house headcount (Tier 2-4) for 7,500 endpoint enterprise:
- 4-6 Tier 2 incident responders
- 1-2 Tier 3 threat hunters
- 2-3 Tier 4 detection engineers
- 1 SOC manager / lead
- 1 IR / forensics lead
- 1-2 security architects (cross-functional)
- 1 governance/compliance liaison
- = 11-16 FTE direct (SOC manager included)
2.3 Coverage patterns
- Single-time-zone 24×7 — 3-shift rotation in one location. Hard on staff; tier 1 burnout common. US blended fully-burdened ~$110-170K per FTE.
- Follow-the-sun — 8-hr-day operations in 3 time zones (Americas + EMEA + APAC). Vendor partnerships or own offices. Healthier for staff but requires multi-region governance.
- Outsourced nights + weekends, in-house business hours: common for organizations under ~1,000 employees.
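The headcount pressure behind these patterns comes from simple seat arithmetic. The sketch below assumes a 40-hour paid week and that about 80% of paid hours land on shift after PTO, training, and handover overlap; both numbers are assumptions, not figures from this walkthrough.

```python
import math

HOURS_PER_WEEK = 24 * 7  # one seat staffed around the clock


def fte_per_seat(weekly_hours: float = 40.0, availability: float = 0.8) -> float:
    """FTEs needed to keep one seat filled 24x7.

    availability is the assumed share of paid hours actually spent on shift.
    """
    if weekly_hours <= 0 or not 0 < availability <= 1:
        raise ValueError("weekly_hours must be > 0 and availability in (0, 1]")
    return HOURS_PER_WEEK / (weekly_hours * availability)


def fte_for_coverage(seats: int, weekly_hours: float = 40.0,
                     availability: float = 0.8) -> int:
    """Whole FTEs needed to staff the given number of concurrent 24x7 seats."""
    return math.ceil(seats * fte_per_seat(weekly_hours, availability))
```

Under those assumptions one seat needs about 5.25 FTE, so two Tier 1 seats plus one Tier 2 seat come to 16 FTE, in line with the 15-20 FTE quoted for full in-house 24×7 coverage in section 2.2.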
3. SIEM tier — the decision
3.1 Vendor selection
Three dominant SIEM platforms for mid-enterprise 2025-2026, plus emerging alternatives:
| Platform | Vendor | Pricing model | Strengths | Weaknesses |
|---|---|---|---|---|
| Splunk Enterprise Security | Splunk (Cisco, $28B acquisition Mar 2024) | Ingest-based ($1,500-3,500/GB/day annualized) | Mature, vast app ecosystem, deepest detection content | Cost at scale; ingest pricing punishes verbose log sources |
| Microsoft Sentinel | Microsoft | $1.5-3/GB ingested + commitment tiers | Native E5 integration, cheap if already heavy MS; KQL is excellent | Skews heavily toward MS workloads; Log Analytics complexity |
| Google SecOps (Chronicle) | Google Cloud | Flat per-employee ($2-8/user/mo) — no ingest pricing | Massive scale, 12-month retention default, no ingest tax | Smaller content marketplace than Splunk; UDM learning curve |
| CrowdStrike Falcon LogScale / NG-SIEM | CrowdStrike | Index-free; per-GB-stored + retention | Hot+cold data, very fast queries, great UX | Younger detection content library; lock-in with Falcon stack |
| Panther | Panther Labs | $0.60-1.20/GB stored (cloud-native, Snowflake-based) | Code-first detections (Python), modern UX | Newer, smaller content library |
| Sumo Logic Cloud SIEM | Sumo Logic | $2-4/GB ingested | Strong cloud-native | Less common in enterprise security |
| Elastic Security | Elastic | Per-resource, open source core | Cheap at very high volume + flexible | Ops-heavy if self-hosted; managed cloud pricing fairer |
Selection for this design: Microsoft Sentinel primary, given Entra ID + M365 + Azure heavy footprint. Splunk secondary for legacy on-prem + network sources. Or Google Chronicle if predictable flat pricing matters more than ecosystem.
The 2-SIEM pattern is unorthodox but increasingly common in 2025: Microsoft Sentinel for identity + cloud + endpoint, Splunk for network + OT + DLP + legacy app logs. Costs ~25-40% more but avoids forced data-bloat into a single vendor’s ingest meter.
See observability-stack for parallel observability tooling.
3.2 Log sources
| Source | Daily volume (estimate) | Critical for |
|---|---|---|
| Windows event logs (Sysmon, Security, Application) | 30-150 GB | Endpoint behavior, lateral movement |
| Azure AD / Entra sign-ins + audit | 5-30 GB | Identity threats |
| M365 Defender / Defender for Endpoint | 15-80 GB | Email + endpoint |
| AWS CloudTrail | 5-40 GB | Cloud control plane |
| AWS VPC Flow / GuardDuty | 20-200 GB | Network |
| Firewall (Palo Alto, Cisco, Fortinet) | 30-300 GB | Perimeter + lateral |
| Web proxy / SWG (Zscaler, Netskope, Palo Alto Prisma) | 20-200 GB | Egress + URL filtering |
| EDR telemetry (CrowdStrike RTR, SentinelOne Deep Visibility) | 20-100 GB | Endpoint deep telemetry |
| DNS (internal recursive + DoH/DoT egress) | 10-50 GB | C2 + DGA detection |
| MFA + IDP (Okta, Duo, Auth0) | 1-10 GB | Identity threats |
| Cloud SaaS (Box, Slack, GitHub, Salesforce admin) | 5-50 GB | Insider + supply-chain |
| Kubernetes audit + container runtime | 10-80 GB | K8s threats |
| OT / SCADA (where present) | 1-15 GB | OT-specific detections |
Total ~170-1,300 GB/day → at roughly $2/GB ingested, $120K-950K/yr just for ingest. Strong tiering required: keep noisy/low-value logs in cheap storage (Azure Storage Lifecycle, S3 Glacier) with on-demand rehydration; or push them through a log-shaping pipeline (Cribl Stream, Datadog Observability Pipelines, Fluent Bit + filtering) before SIEM ingest.
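The volume and cost range can be reproduced from the table. This sketch sums the per-source estimates and applies a flat price per GB; the $2/GB default is an assumed blended rate that happens to reproduce the range quoted above, not a vendor quote.

```python
# (low, high) daily volume in GB per source, copied from the log-source table.
LOG_SOURCES_GB_PER_DAY = {
    "Windows event logs": (30, 150),
    "Entra sign-ins + audit": (5, 30),
    "M365 Defender / Defender for Endpoint": (15, 80),
    "AWS CloudTrail": (5, 40),
    "AWS VPC Flow / GuardDuty": (20, 200),
    "Firewall": (30, 300),
    "Web proxy / SWG": (20, 200),
    "EDR telemetry": (20, 100),
    "DNS": (10, 50),
    "MFA + IDP": (1, 10),
    "Cloud SaaS": (5, 50),
    "Kubernetes audit + runtime": (10, 80),
    "OT / SCADA": (1, 15),
}


def daily_volume_gb(sources=LOG_SOURCES_GB_PER_DAY) -> tuple[int, int]:
    """Sum the low and high daily estimates across all sources."""
    low = sum(lo for lo, _ in sources.values())
    high = sum(hi for _, hi in sources.values())
    return low, high


def annual_ingest_cost(gb_per_day: float, usd_per_gb: float = 2.0) -> float:
    """Annual ingest cost at a flat per-GB price (assumed, no commitment discounts)."""
    return gb_per_day * 365 * usd_per_gb


if __name__ == "__main__":
    low, high = daily_volume_gb()
    print(f"{low}-{high} GB/day")
    print(f"${annual_ingest_cost(low):,.0f} - ${annual_ingest_cost(high):,.0f} per year")
```

Changing `usd_per_gb` shows how sensitive the budget is to the ingest meter, which is the argument for tiering and pre-ingest shaping.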
3.3 Log pipeline
Cribl Stream (or Fluent Bit + custom rules) sits between sources and SIEM:
- Drops high-volume / low-value events (DNS heartbeats, health-check noise)
- Routes copies to multiple destinations (SIEM + cheap storage + lake)
- Normalizes formats (OCSF — Open Cybersecurity Schema Framework, AWS+Splunk-driven; or ECS — Elastic Common Schema)
- Adds enrichment (asset metadata, geo, threat intel)
Cribl licensing $0.30-0.80/GB processed; payback typically <12 mo through SIEM-ingest reduction of 40-70%.
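The drop, route, and enrich steps can be illustrated in a few lines. This is a toy sketch rather than Cribl or Fluent Bit configuration: the event fields (`type`, `host`), the low-value type names, and the owner lookup are all hypothetical.

```python
# Hypothetical event types treated as low-value for SIEM purposes.
LOW_VALUE_TYPES = {"dns_heartbeat", "health_check"}


def route(events: list[dict], asset_owner: dict[str, str]) -> tuple[list[dict], list[dict]]:
    """Split events into (siem, archive) streams.

    Every event is archived at full fidelity; only events that are not
    low-value reach the SIEM, enriched with an asset-owner field.
    """
    siem: list[dict] = []
    archive: list[dict] = []
    for event in events:
        archive.append(event)  # cheap-storage copy, kept for rehydration
        if event.get("type") in LOW_VALUE_TYPES:
            continue  # dropped before the SIEM ingest meter
        owner = asset_owner.get(event.get("host", ""), "unknown")
        siem.append({**event, "owner": owner})
    return siem, archive


def siem_reduction(events: list[dict], asset_owner: dict[str, str]) -> float:
    """Fraction of events kept out of the SIEM (by count, not bytes)."""
    if not events:
        return 0.0
    siem, _ = route(events, asset_owner)
    return 1 - len(siem) / len(events)
```

A real pipeline would measure reduction in bytes rather than event count and would normalize to OCSF or ECS before forwarding.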
3.4 Detection content
- Sigma rules — vendor-neutral detection format; share + author rules cross-SIEM.
- Splunk ES Risk-Based Alerting + Splunk SOAR Use Cases.
- Sentinel Analytic Rules + Microsoft Defender Threat Intelligence (MDTI) integration; Sentinel Content Hub.
- Chronicle YARA-L rules.
- MITRE ATT&CK mapping — every detection rule should map to one or more ATT&CK techniques.
- DeTT&CT + Atomic Red Team + Sigma HQ repositories — open content libraries.
- VECTR (SRA) for purple team / detection coverage tracking.
Annual detection-engineering target: 1-2 new validated detections per Tier 4 engineer per week (50-100 per engineer/yr at quality bar).
4. EDR / XDR
4.1 Vendor selection
Three dominant EDR platforms 2024-2026:
| Platform | Vendor | Strengths | Pricing per endpoint/yr |
|---|---|---|---|
| CrowdStrike Falcon (Insight + Prevent + OverWatch + Identity Protection) | CrowdStrike | Best-in-class detection; mature OverWatch hunting service | $70-200 (modular) |
| Microsoft Defender for Endpoint (P1 + P2) | Microsoft | Native E5; free if E5; Office + identity integration | Included in E5 ($57/user/mo) |
| SentinelOne Singularity | SentinelOne | Strong autonomous response; competitive pricing; Linux + IoT support | $50-150 |
Other contenders: Palo Alto Cortex XDR, Sophos Intercept X EDR, Trellix (FireEye + McAfee), Cybereason, VMware Carbon Black (now Broadcom — declining), Trend Vision One, ESET PROTECT MDR.
The CrowdStrike July 19 2024 outage (Falcon Channel File 291 caused Windows BSOD on ~8.5M endpoints worldwide, ~$5.4B aggregate impact reported by Parametrix) ended the “one EDR everywhere” orthodoxy. Many enterprises now run two EDRs: CrowdStrike on the majority + Defender for Endpoint on the rest as fallback. Microsoft Sysmon adds a free always-on telemetry source independent of the EDR.
4.2 Coverage
- Windows + macOS + Linux servers + workstations: 100% EDR coverage non-negotiable
- Mobile (iOS + Android): MDM-managed via Intune, Jamf, Workspace ONE; mobile threat defense (Lookout, Zimperium) for sensitive populations only
- Containers + Kubernetes nodes: cluster-side runtime security (Wiz Runtime, Sysdig Falco, Aqua Security, CrowdStrike Container)
- Cloud workloads / serverless: agentless cloud security posture management (CSPM) — Wiz, Orca, Tenable Cloud Security
- OT / ICS: Dragos, Claroty, Nozomi Networks for industrial protocol monitoring
4.3 Response automation
EDR endpoint response automation (without human approval):
- Network isolation (host quarantine) on confirmed malware
- Process termination
- File quarantine
- Registry remediation
- AD/Entra account auto-disable on high-confidence credential-theft signals
Requires careful tuning. Pre-approved playbook list signed off by IT operations + business stakeholders quarterly.
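One way to encode the pre-approved list is as a lookup of actions and minimum confidence thresholds, with everything else falling back to a human. The thresholds, the `confidence` field, and the rule that critical assets always need approval are assumptions for illustration; the real list is whatever IT operations and the business sign off.

```python
from dataclasses import dataclass

# Hypothetical pre-approved actions and the minimum detection confidence
# (0.0-1.0) at which each may run without human approval.
PRE_APPROVED = {
    "isolate_host": 0.90,
    "kill_process": 0.80,
    "quarantine_file": 0.80,
    "remediate_registry": 0.85,
    "disable_account": 0.95,
}


@dataclass(frozen=True)
class Detection:
    action: str                  # response the platform proposes
    confidence: float            # assumed score supplied by the EDR/ITDR
    asset_critical: bool = False


def decide(detection: Detection) -> str:
    """Return 'auto' or 'human_approval' for a proposed response action."""
    threshold = PRE_APPROVED.get(detection.action)
    if threshold is None:
        return "human_approval"  # action is not on the signed-off list
    if detection.asset_critical:
        return "human_approval"  # assumption: critical assets are never auto-contained
    return "auto" if detection.confidence >= threshold else "human_approval"
```

Keeping the list as data makes the quarterly sign-off a reviewable diff instead of a change buried in playbook logic.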
5. NDR + network visibility
5.1 NDR vendors
Network Detection and Response complements EDR for east-west traffic + unmanaged devices:
| Vendor | Strengths | Notes |
|---|---|---|
| Darktrace | Self-learning baseline (Bayesian Enterprise Immune System); strong on novel anomaly | Pricier; PhD-tier explainability sometimes weak |
| ExtraHop (owned by Bain Capital + Crosspoint Capital since 2021; CrowdStrike integration partner) | Wire-data analysis; deep packet inspection; strong forensic timeline | Best for high-fidelity east-west monitoring |
| Vectra AI | Attack signal intelligence on top of NDR; multi-cloud | Strong identity threat detection |
| Corelight | Zeek-based; rich session metadata; open NDR | Often pairs with another NDR for richer detection |
| Cisco Secure Network Analytics (Stealthwatch) | NetFlow-based; Cisco-shop fit | Showing age; license consolidation under Cisco XDR (SecureX retired Jul 2024) |
| Arista NDR (ex-Awake Security) | Strong on entity-behavior; Arista network integration | Smaller deployment base |
| Plixer Scrutinizer | NetFlow-only; cheaper | Less detection sophistication |
For 7,500-endpoint enterprise: 1 NDR vendor (typically Darktrace or ExtraHop) covering DC + critical office segments + cloud VPC traffic mirrors. Capex + Year 1 license $300K-900K.
5.2 Network architecture for visibility
- TAP / SPAN ports at every datacenter aggregation switch
- VPC Traffic Mirroring (AWS) / Azure vTAP / Google Cloud Packet Mirroring → NDR appliances
- DNS resolver logs (split-horizon recursive, BIND or Microsoft DNS or Infoblox)
- DHCP logs for asset attribution
- ZTNA telemetry (Zscaler ZIA + ZPA, Netskope, Cloudflare Zero Trust, Palo Alto Prisma Access)
5.3 Asset inventory
Authoritative asset inventory is the foundation. Sources fused:
- AD / Entra ID
- Intune + Jamf + Workspace ONE
- CrowdStrike Falcon Discover
- ServiceNow CMDB
- Cloud provider native inventory (AWS Config, Azure Resource Graph, GCP Asset Inventory)
- DHCP + DNS
Fusion platforms: Axonius, JupiterOne, runZero (formerly Rumble), Lansweeper. ~$80-200K/yr for mid-enterprise.
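The fusion step amounts to merging per-source records on a shared key and then asking which assets a given control has never seen. A minimal sketch, assuming each source can be exported as dictionaries with `source` and `hostname` fields (hypothetical field names) and that hostname is a good enough join key:

```python
from collections import defaultdict


def fuse(records: list[dict]) -> dict[str, dict]:
    """Merge per-source records keyed by lowercase hostname.

    Each fused asset tracks which sources reported it; for other fields the
    first source to report a value wins.
    """
    assets: dict[str, dict] = defaultdict(lambda: {"sources": set()})
    for rec in records:
        key = rec["hostname"].strip().lower()
        asset = assets[key]
        asset["sources"].add(rec["source"])
        for field, value in rec.items():
            if field not in ("hostname", "source") and value is not None:
                asset.setdefault(field, value)
    return dict(assets)


def missing_coverage(assets: dict[str, dict], required_source: str) -> list[str]:
    """Hostnames that the required source (for example the EDR) never reported."""
    return sorted(h for h, a in assets.items() if required_source not in a["sources"])
```

Calling `missing_coverage(assets, "edr")` against DHCP and directory data is the quickest check on the 100% EDR coverage target in section 4.2; commercial platforms add fuzzier matching on MAC, serial number, and cloud instance ID.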
6. Identity threat detection (ITDR)
Identity is the new perimeter. Major incidents 2023-2024 (MGM, Caesars, Change Healthcare, Snowflake-customer cohort) all pivoted on identity compromise.
6.1 IdP stack
- Entra ID (Microsoft) — primary IdP for the design enterprise; P2 license required for risk-based conditional access + Identity Protection + Privileged Identity Management (PIM)
- Okta — secondary or B2B federation
- Duo / Azure MFA / Okta Verify — MFA enforcement; phishing-resistant (FIDO2 keys or platform authenticators) on privileged accounts
- Privileged Access Management (PAM): CyberArk PAM, Delinea Secret Server (formerly Thycotic), HashiCorp Vault for service accounts, BeyondTrust Password Safe
- Active Directory: Tier 0 protection, ESAE / red-forest decommissioning, Microsoft Defender for Identity (formerly Azure ATP)
- See auth-authz
6.2 ITDR vendors
- Microsoft Defender for Identity (on-prem AD) + Entra ID Protection (cloud identities): covered in E5
- CrowdStrike Falcon Identity Protection (formerly Preempt)
- Silverfort — multi-factor + identity protection without agent
- Semperis Directory Services Protector: AD forensics + recovery (the post-NotPetya forest-recovery playbook)
- Authmind, Push Security, Oort — newer entrants
ITDR detections to prioritize:
- Anomalous sign-in (impossible travel, unusual ASN, prior-unseen device)
- Token theft / replay (refresh-token reuse, primary-refresh-token theft) and token forgery with a stolen signing key (the Storm-0558 pattern)
- Privileged role elevation
- Password spray + credential stuffing
- AS-REP roasting + Kerberoasting (on-prem AD)
- DCSync + DCShadow attacks
- Conditional access policy modification
- Service principal credential drift
- Stale + over-privileged service account use
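As an illustration of the first detection in the list, impossible travel reduces to great-circle distance divided by elapsed time. The sketch assumes sign-in events already carry geolocated `lat`/`lon` and a `time` field (hypothetical names); the 900 km/h ceiling and 50 km jitter floor are assumed tuning values.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(prev: dict, curr: dict,
                         max_kmh: float = 900.0, min_km: float = 50.0) -> bool:
    """Flag two sign-ins whose implied speed exceeds max_kmh.

    prev and curr are dicts with 'time' (datetime), 'lat', and 'lon'.
    Distances under min_km are ignored to absorb geo-IP jitter.
    """
    km = haversine_km(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    if km < min_km:
        return False
    hours = (curr["time"] - prev["time"]).total_seconds() / 3600.0
    if hours <= 0:
        return True  # simultaneous sign-ins from distant locations
    return km / hours > max_kmh
```

In practice this rule is noisy around VPN egress points and mobile carriers, so it usually feeds a risk score alongside ASN and device signals rather than firing alone.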
7. SOAR — playbook automation
7.1 Vendors
| Platform | Vendor | Strengths |
|---|---|---|
| XSOAR (Cortex XSOAR) | Palo Alto Networks (Demisto acquisition 2019) | Most mature; 800+ content packs; case management |
| Splunk SOAR (Phantom) | Splunk/Cisco | Tight Splunk ES integration |
| Microsoft Sentinel Automation Rules + Logic Apps | Microsoft | Cheap if Sentinel-native; less mature than XSOAR |
| Tines | Tines (Dublin, founded 2018) | Modern no-code; rapidly growing in modern SOCs |
| Torq | Torq | Code-first hyperautomation; newer |
| Swimlane | Swimlane | Strong case management |
| Chronicle SOAR (formerly Siemplify, Google acquisition 2022) | Google Cloud | Chronicle-native |
For new builds 2024-2026, Tines or Torq are the modern choices — faster playbook authoring + cleaner workflow than XSOAR. XSOAR remains dominant in large legacy SOCs.
7.2 Playbooks — common patterns
- Phishing triage: user-reports-button → URL detonation (urlscan.io, Browserling, Cuckoo, ANY.RUN, VMRay) + attachment sandbox (Joe Sandbox, Hatching Triage, ANY.RUN, VirusTotal) + IOC enrichment (VirusTotal, AbuseIPDB, GreyNoise, AlienVault OTX, Recorded Future) → verdict → mailbox-wide URL block or false-positive close
- Suspicious login: ITDR alert → asset enrichment → user verification via Slack/Teams DM → user confirms = close; user denies or no-response = revoke sessions + reset password + initiate IR
- Malware on endpoint: EDR alert → quarantine host → snapshot memory + disk → enrich (file hash + parent process tree + persistence + lateral indicators) → escalate or close
- DLP / Insider event: data egress detected → SaaS log enrichment → check sensitivity classification → user confirmation → escalate to HR + Legal if confirmed
- Cloud misconfig (CSPM): Wiz / Orca finding → ticket to platform team → patch SLA tracking → close on remediation
- Brute force / spray: lockout pattern → IDP block + IP block at WAF + alert
- Threat hunt automation: scheduled queries → anomaly scoring → analyst review queue
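The suspicious-login pattern above is a small decision function once the user's reply is known. A minimal sketch; the action names are placeholders for whatever the SOAR platform exposes, not a real vendor API.

```python
from enum import Enum


class UserReply(Enum):
    CONFIRMED = "confirmed"      # user says the sign-in was theirs
    DENIED = "denied"            # user says it was not
    NO_RESPONSE = "no_response"  # chat DM timed out


def suspicious_login_actions(reply: UserReply) -> list[str]:
    """Map the user's answer to the ordered response steps from the playbook."""
    if reply is UserReply.CONFIRMED:
        return ["close_alert"]
    # Denial and silence are treated identically: assume compromise.
    return ["revoke_sessions", "reset_password", "open_ir_case"]
```

Treating no-response the same as a denial is the important design choice: an attacker who controls the account can otherwise stall the playbook by ignoring the prompt.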
7.3 Communication automation
- Pager: PagerDuty (~$25-49/user/mo) or Opsgenie (Atlassian) — escalation policies, on-call rotations
- Chat-ops: Slack or Teams with bot integration; severity rooms auto-created per incident
- Status page: Statuspage (Atlassian) or Better Stack for customer-facing comms during major incidents
- IR docs: Notion or Confluence; case mgmt in TheHive (open source) or SOAR-native (XSOAR, Tines)
8. Threat intelligence
8.1 Feeds
| Feed | Cost/yr | Purpose |
|---|---|---|
| Mandiant Threat Intelligence (Google) | $80-300K | Premium APT + nation-state tracking + advisories |
| Recorded Future | $80-250K | Broad coverage + automated ingestion + Insikt Group reports |
| CrowdStrike Falcon Intelligence | included in Falcon tiers | Strong eCrime + adversary attribution |
| Microsoft Defender Threat Intelligence (MDTI) | included E5 | M365-integrated |
| Anomali ThreatStream | $60-180K | Multi-feed aggregation platform |
| ThreatConnect | $50-150K | TIP + SOAR-adjacent |
| Flashpoint | $70-200K | Dark web + fraud + insider |
| Intel 471 | $80-250K | Crimeware + ransomware-as-a-service |
| VirusTotal Enterprise | $5-15K | File + URL + IP enrichment API |
| Shodan + Censys | $5-20K | External-attack-surface enrichment |
| AbuseIPDB + GreyNoise | $5-15K | IP reputation + scanning noise |
| AlienVault OTX | free | OSINT IOC feed |
| MISP (Malware Information Sharing Platform) | free / community | IOC sharing |
| ISACs (FS-ISAC, H-ISAC, A-ISAC, etc.) | $20-100K membership | Sector-specific sharing |
For 7,500-endpoint enterprise: 1 premium feed (Mandiant or Recorded Future) + Microsoft MDTI (E5-bundled) + free OSINT feeds (AlienVault OTX, GreyNoise community) + sector ISAC.
8.2 Threat intel platform (TIP)
Anomali ThreatStream, ThreatConnect, or MISP (open source) consolidate feeds + STIX/TAXII normalization + indicator deduplication + scoring + dispatch to SIEM/EDR/firewall via API integrations.
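Deduplication and scoring are the core of what a TIP does before dispatch. The sketch below assumes indicators arrive as dictionaries with `value`, `feed`, and `confidence` (0-100) fields; the corroboration bonus of 10 points per extra feed is an invented scoring rule for illustration.

```python
def normalize(value: str) -> str:
    """Lowercase, trim, and refang common defanged notations."""
    return value.strip().lower().replace("[.]", ".").replace("hxxp", "http")


def consolidate(indicators: list[dict]) -> dict[str, dict]:
    """Deduplicate indicators across feeds and assign a combined score.

    Returns {normalized_value: {"feeds": set_of_feed_names, "score": int}}.
    """
    merged: dict[str, dict] = {}
    for ind in indicators:
        key = normalize(ind["value"])
        entry = merged.setdefault(key, {"feeds": set(), "score": 0})
        entry["feeds"].add(ind["feed"])
        entry["score"] = max(entry["score"], ind["confidence"])
    # Hypothetical rule: +10 per corroborating feed beyond the first, capped at 100.
    for entry in merged.values():
        entry["score"] = min(100, entry["score"] + 10 * (len(entry["feeds"]) - 1))
    return merged


def dispatchable(merged: dict[str, dict], min_score: int = 80) -> list[str]:
    """Indicators scored high enough to push to SIEM/EDR/firewall blocklists."""
    return sorted(v for v, e in merged.items() if e["score"] >= min_score)
```

A production TIP would also track indicator type, first/last seen, and expiry so that stale IP addresses age out of blocklists.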
8.3 Strategic intel
Quarterly threat-landscape briefings synthesized for exec leadership: trend in ransomware payouts (Chainalysis Crypto Crime Report), top targeting actors for the sector, supply-chain risks, geopolitics-driven shifts (e.g., Iran nexus post-Oct 2023, China nexus around Taiwan flashpoints, Russia + Ukraine cyber escalation).
9. MITRE ATT&CK mapping + threat-informed defense
ATT&CK is the lingua franca of detection 2025-2026:
- Coverage mapping: every detection rule mapped to one or more techniques. Track in DeTT&CT or VECTR. Target: cover 70%+ of techniques relevant to identified threat actors.
- Adversary emulation: Atomic Red Team (open source) for atomic technique tests; CALDERA (MITRE) for chained scenarios; commercial: SCYTHE, AttackIQ, Cymulate, SafeBreach, Picus Security for BAS (Breach and Attack Simulation).
- Purple team exercises: quarterly internal exercises using realistic TTPs (e.g., Scattered Spider playbook for IT-helpdesk social engineering + MFA bombing; BlackCat ransomware staging; APT41 lateral movement).
- External red team: annual ~$100-300K engagement (Bishop Fox, TrustedSec, IOActive, NCC Group, Mandiant Red Team, Outflank).
- Bug bounty / VDP: HackerOne / Bugcrowd / Intigriti ~$50-200K/yr.
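The coverage target in the first bullet is a set computation once rules carry technique IDs. A minimal sketch of what DeTT&CT or VECTR track in far more detail; the rule names are hypothetical and the technique IDs in the usage note are standard ATT&CK identifiers.

```python
def attack_coverage(detections: dict[str, list[str]],
                    relevant_techniques: set[str]) -> dict:
    """Compute technique coverage for a detection library.

    detections maps rule name -> list of ATT&CK technique IDs.
    relevant_techniques is the set of IDs attributed to tracked threat actors.
    """
    covered = {t for techniques in detections.values() for t in techniques}
    hit = covered & relevant_techniques
    ratio = len(hit) / len(relevant_techniques) if relevant_techniques else 0.0
    return {
        "ratio": ratio,
        "gaps": sorted(relevant_techniques - covered),
        "unmapped_rules": sorted(r for r, t in detections.items() if not t),
    }
```

For example, with rules mapped to `T1110.003` (password spraying) and `T1558.003` (Kerberoasting) and a relevant set that also includes `T1003.006` (DCSync) and `T1621` (MFA request generation), the ratio is 0.5 and the two unmapped techniques are returned as gaps. Counting a technique as covered because one rule mentions it overstates real coverage, which is why adversary emulation is still needed to validate each mapping.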
10. Vulnerability + exposure management
10.1 Scanning + EASM
- Vulnerability scanner: Tenable.io / Tenable Security Center, Qualys VMDR, Rapid7 InsightVM. ~$25-90K/yr for 10K assets.
- Web app scanner: Invicti (formerly Netsparker), Burp Suite Enterprise (PortSwigger), Acunetix, Detectify, StackHawk for CI integration.
- External Attack Surface Management (EASM): Tenable Attack Surface Management (from the Bit Discovery acquisition, 2022), Microsoft Defender EASM (RiskIQ acquisition), CrowdStrike Falcon Surface, Censys ASM, Bishop Fox CAST.
- Container + IaC scanning: Wiz, Snyk, Aqua, Sysdig, Tenable Cloud Security, Palo Alto Prisma Cloud, Trend Vision One Cloud Security.
- SBOM + supply chain: GitHub Advanced Security, Snyk, Mend (WhiteSource), Chainguard for hardened images, in-toto / SLSA for build provenance.
10.2 Risk-based vulnerability management (RBVM)
CVSS alone is insufficient. Modern stack:
- EPSS (Exploit Prediction Scoring System, FIRST.org) — probability of exploitation in 30 days
- CISA KEV (Known Exploited Vulnerabilities) catalog — actively-exploited list
- Asset criticality — business-impact weighting
- Compensating controls — patched? mitigated? unreachable from internet?
Tools: Tenable VPR, Qualys TruRisk, Rapid7 ActiveRisk, Vulcan Cyber (Tenable acquisition 2025), Nucleus Security.
SLAs typical:
| Severity (post-context) | Internet-exposed | Internal |
|---|---|---|
| KEV / 9.0+ + exploited | 24-72 hr | 7 days |
| Critical (CVSS 9.0+) | 7 days | 30 days |
| High (7.0-8.9) | 30 days | 60 days |
| Medium (4.0-6.9) | 90 days | 180 days |
| Low (<4.0) | 180 days | next cycle |
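The SLA table translates directly into a lookup. This sketch simplifies the top row by treating any known-exploited finding (KEV-listed or exploitation observed) as top priority regardless of CVSS; EPSS and asset criticality are left out, so it is a starting point rather than a full RBVM score.

```python
def remediation_sla(cvss: float, known_exploited: bool, internet_exposed: bool) -> str:
    """Return the remediation SLA from the table for one finding.

    known_exploited: KEV-listed or exploitation observed (simplifying assumption:
    this outranks the CVSS band).
    """
    if not 0.0 <= cvss <= 10.0:
        raise ValueError("cvss must be between 0.0 and 10.0")
    if known_exploited:
        exposed, internal = "24-72 hr", "7 days"
    elif cvss >= 9.0:
        exposed, internal = "7 days", "30 days"
    elif cvss >= 7.0:
        exposed, internal = "30 days", "60 days"
    elif cvss >= 4.0:
        exposed, internal = "90 days", "180 days"
    else:
        exposed, internal = "180 days", "next cycle"
    return exposed if internet_exposed else internal
```

A natural extension is to promote findings above a chosen EPSS threshold into the known-exploited band, with the threshold set from the team's patch capacity.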
11. IR playbooks + tabletop
11.1 Major scenarios
| Scenario | Playbook owner | Key partners |
|---|---|---|
| Ransomware | IR lead | Backup ops, Legal, PR, ransom negotiation / payment intermediary (Coveware), blockchain tracing (Chainalysis), insurer |
| Business Email Compromise (BEC) / wire fraud | IR lead | Finance, Legal, FBI IC3, bank fraud unit |
| Supply chain compromise | IR lead + Architecture | Vendor risk, Legal, CISA |
| Insider threat / data exfil | IR lead | HR, Legal, e-discovery |
| Cloud misconfig leak | Cloud security lead | Platform engineering, Legal, PR if customer data |
| DDoS | NetOps + IR | ISP, Cloudflare/Akamai/CloudFront, Customers |
| Identity provider compromise (Storm-0558 class) | IR lead + IAM lead | IdP vendor, Customers, Legal |
| OT / safety-critical incident | OT lead + IR | Plant operations, Safety, Regulator |
Each playbook: scope, roles, decision tree, internal comms template, external comms template, evidence preservation checklist, legal hold trigger, regulator-notification timelines.
11.2 IR retainer
Pre-negotiated retainer with external IR firm: rapid-response, forensics depth, expert testimony capability.
- Mandiant (Google): top of market; ~$750-1,500/hr
- CrowdStrike Services — strong with Falcon stack; bundled options
- Unit 42 (Palo Alto Networks) — strong with PAN stack
- Stroz Friedberg (Aon) — financial-services depth
- Kroll — broad IR + e-discovery
- CYPFER + Coveware — ransomware-negotiation specialists
- Microsoft DART — Microsoft-incident-specific via support contracts
- Sygnia — APT + targeted ops; strong in financial + critical infra
Retainer typically $50-200K/yr commitment + drawdown rate; converts to standby engineering hours if no IR.
11.3 Tabletop cadence
- Quarterly internal tabletop (60-90 min, scenario-focused, IR team + IT leadership)
- Semi-annual cross-functional (90-120 min, +Legal +HR +PR +Finance +Exec)
- Annual external-facilitated (Mandiant + insurer-driven; 4-6 hr, full-day)
12. Cost build-up — annual budget
12.1 Year-1 setup (one-time)
| Item | Cost |
|---|---|
| SIEM platform setup + initial content + consulting | $300,000 |
| EDR deployment + config | $80,000 |
| SOAR platform + initial playbooks | $120,000 |
| NDR appliances + deployment | $250,000 |
| Asset inventory platform setup | $35,000 |
| Network TAP + visibility infrastructure | $80,000 |
| SOC build-out (room, screens, ergonomics, comms) | $120,000 |
| External assessment + tabletop + initial pen test | $180,000 |
| Year-1 one-time | ~$1,165,000 |
12.2 Annual recurring (run-rate)
| Line | Annual cost (mid-range) |
|---|---|
| SIEM (Sentinel + Splunk hybrid, 600 GB/day blended) | $750,000 |
| EDR (CrowdStrike Falcon Insight + Prevent + OverWatch, 7,500 endpoints) | $725,000 |
| MDM + Defender for Endpoint (covered by E5; allocate from IT budget) | $0 (allocated elsewhere) |
| NDR (Darktrace or ExtraHop, mid-tier) | $480,000 |
| SOAR (Tines or XSOAR) | $185,000 |
| ITDR (CrowdStrike Identity Protection or Silverfort) | $145,000 |
| Threat intel (Mandiant or Recorded Future + community feeds + ISAC) | $185,000 |
| Vulnerability mgmt (Tenable.io + Wiz + Snyk) | $260,000 |
| EASM + asset inventory (Axonius + Microsoft Defender EASM) | $115,000 |
| MSSP — Tier 1 24×7 (Arctic Wolf or Expel) | $480,000 |
| PAM (CyberArk PAM mid-size) | $185,000 |
| PagerDuty + Slack + tooling | $48,000 |
| IR retainer (Mandiant or Unit 42) | $120,000 |
| Annual external red team + pen test | $180,000 |
| Bug bounty (HackerOne + bounty pool) | $130,000 |
| Cyber insurance premium (~$5M coverage; varies wildly) | $280,000 |
| Compliance audit (SOC 2 + PCI-DSS QSA + others) | $145,000 |
| Training + certifications (SANS, OffSec, vendor) | $80,000 |
| Tabletop facilitation (external annual) | $35,000 |
| In-house headcount (14 FTE × $185K avg loaded) | $2,590,000 |
| Misc operational (travel, conferences, swag, misc) | $60,000 |
| Total annual recurring | ~$7,178,000 |
Total ≈ $7.2M/yr, or ~$36M over 5 years of run rate. For comparison: Change Healthcare reported ~$2.4B impact; MGM 2023 reported $100M+. Insurance + IR retainer + competent SOC is materially cheaper than the cost of a major breach.
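Budget tables like these drift as lines are edited, so it helps to keep them as data and recompute the totals. The figures below are copied from the two tables in this section; the short keys are abbreviations of the row labels.

```python
# Year-1 one-time items (USD), from section 12.1.
ONE_TIME_USD = {
    "SIEM setup": 300_000, "EDR deployment": 80_000, "SOAR setup": 120_000,
    "NDR appliances": 250_000, "Asset inventory setup": 35_000,
    "Network TAP": 80_000, "SOC build-out": 120_000,
    "External assessment": 180_000,
}

# Annual recurring items (USD), from section 12.2.
RECURRING_USD = {
    "SIEM": 750_000, "EDR": 725_000, "MDM + Defender for Endpoint": 0,
    "NDR": 480_000, "SOAR": 185_000, "ITDR": 145_000,
    "Threat intel": 185_000, "Vulnerability mgmt": 260_000,
    "EASM + asset inventory": 115_000, "MSSP Tier 1": 480_000,
    "PAM": 185_000, "PagerDuty + Slack": 48_000, "IR retainer": 120_000,
    "Red team + pen test": 180_000, "Bug bounty": 130_000,
    "Cyber insurance": 280_000, "Compliance audit": 145_000,
    "Training": 80_000, "Tabletop facilitation": 35_000,
    "In-house headcount": 14 * 185_000, "Misc operational": 60_000,
}


def total(items: dict[str, int]) -> int:
    """Sum a budget table."""
    return sum(items.values())


def multi_year_cost(years: int) -> int:
    """One-time setup plus `years` of flat run rate (no inflation or growth assumed)."""
    return total(ONE_TIME_USD) + years * total(RECURRING_USD)


if __name__ == "__main__":
    print(f"One-time:  ${total(ONE_TIME_USD):,}")
    print(f"Recurring: ${total(RECURRING_USD):,}")
    print(f"5-year:    ${multi_year_cost(5):,}")
```

Headcount is by far the largest line, so the multi-year figure is most sensitive to the loaded-cost assumption per FTE.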
12.3 Trimming the budget
If the ~$7.2M run rate is out of reach:
- Drop second SIEM (Splunk) — save $400K
- Drop NDR (rely on EDR + cloud-native) — save $480K
- Drop ITDR vendor; rely on Defender for Identity — save $145K
- Drop premium threat intel; rely on MDTI + free feeds — save $185K
- Reduce headcount to 9-10 FTE; rely more on MSSP — save $700K-900K
- Skip external red team; rely on BAS only — save $100K
- → ~$5.0-5.2M run rate. Still functional; coverage of advanced TTPs noticeably weaker.
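The trimmed run rate follows from summing the listed savings. A small sketch with the savings copied from the bullets above, as (low, high) pairs since only the headcount saving is a range; the starting run rate is passed in rather than hard-coded.

```python
# (low, high) annual savings in USD for each trim listed above.
TRIMS_USD = {
    "Drop second SIEM": (400_000, 400_000),
    "Drop NDR": (480_000, 480_000),
    "Drop ITDR vendor": (145_000, 145_000),
    "Drop premium threat intel": (185_000, 185_000),
    "Reduce headcount to 9-10 FTE": (700_000, 900_000),
    "Skip external red team": (100_000, 100_000),
}


def total_savings(trims=TRIMS_USD) -> tuple[int, int]:
    """Return (low, high) total annual savings across all trims."""
    low = sum(lo for lo, _ in trims.values())
    high = sum(hi for _, hi in trims.values())
    return low, high


def trimmed_run_rate(run_rate: int, trims=TRIMS_USD) -> tuple[int, int]:
    """Return the (lowest, highest) run rate after applying every trim."""
    low_savings, high_savings = total_savings(trims)
    return run_rate - high_savings, run_rate - low_savings
```

Passing a subset of `TRIMS_USD` shows the effect of individual cuts, which is useful when the business accepts some reductions but not others.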
13. Regulatory + compliance overlay
13.1 US
- CIRCIA (Cyber Incident Reporting for Critical Infrastructure Act of 2022): proposed rule Apr 2024; 72-hr covered-incident reporting + 24-hr ransom-payment reporting to CISA; final rule was targeted for ~Oct 2025 and CISA has since signaled slippage into 2026.
- SEC Cybersecurity Disclosure Rule (adopted Jul 2023; 8-K compliance from Dec 2023): Form 8-K Item 1.05 within 4 business days of determining that a cybersecurity incident is material; 10-K annual disclosure of risk management.
- HIPAA Security Rule — if healthcare; OCR enforcement increasingly active.
- PCI-DSS v4.0.1 (v3.2.1 retired Mar 31 2024; v4.0.1 published Jun 2024; future-dated requirements mandatory from Mar 31 2025): payment card data.
- GLBA Safeguards Rule updates (2023) — financial services.
- NYDFS 23 NYCRR 500 + amendments (Nov 2023) — financial services in NY.
- CMMC 2.0 — DoD contractors; final rule late 2024 / 2025 phased implementation.
- See administrative-law + employment-and-environmental-law.
13.2 EU
- NIS2 Directive (effective Oct 17 2024; transposition ongoing 2025) — broad cyber baseline + incident reporting for essential + important entities.
- DORA (Digital Operational Resilience Act, Jan 17 2025 effective) — financial services + critical ICT third-party providers.
- GDPR Art 33 + 34 — 72-hr breach notification to supervisory authority; affected-individual notification when high risk.
- EU Cyber Resilience Act (in force Dec 2024; effective Dec 2027 fully) — product security for connected devices.
13.3 Other jurisdictions
- UK NIS Regulations 2018 + post-Brexit updates
- Singapore Cybersecurity Act 2018 + CSA Code of Practice
- Australian SOCI Act (Security of Critical Infrastructure)
- China DSL / PIPL / CSL for entities operating in mainland China
13.4 Frameworks
- NIST CSF 2.0 (Feb 2024) — Govern function added; broad management framework
- NIST SP 800-53 Rev 5 — federal controls library
- ISO 27001:2022 — international cert (controls reorg vs 2013)
- ISO 27017 + 27018 — cloud + privacy extensions
- CIS Controls v8.1 — operationally pragmatic
- MITRE D3FEND — defensive techniques pair with ATT&CK
- AICPA SOC 2 Type II — see design-saas-platform-launch for SaaS angle
14. Recent attack patterns to defend against (2024-2026)
- Scattered Spider / UNC3944 (MGM, Caesars, Snowflake-adjacent) — IT-helpdesk social engineering, MFA fatigue, Okta abuse, ransomware deploy. Defense: phishing-resistant MFA, ITSM agent verification, helpdesk re-verification protocols.
- BlackCat / ALPHV (Change Healthcare Feb 2024) — Citrix + VPN + ESXi → AD compromise → BlackCat ransomware. Operation disrupted by FBI Dec 2023, group “exit scammed” Mar 2024. Defense: external attack surface, patching, ESXi exposure, AD Tier 0.
- Akira + LockBit + Play + Medusa + Cl0p — top RaaS families 2024-2025; Cl0p MOVEit + GoAnywhere mass exploitation 2023-2024.
- MFA bombing / push fatigue: Microsoft Authenticator number-matching enforced for all users from May 2023 (originally slated for Feb 2023); move highest-privilege accounts to FIDO2.
- Token theft + AiTM proxies (EvilProxy, Tycoon 2FA) — defeat traditional MFA via reverse proxy. Defense: conditional access + token-binding + FIDO2.
- Supply chain (npm / PyPI typosquatting + maintainer takeover): Polyfill.io (Jun 2024, ~110K websites), XZ Utils backdoor (Mar 2024, near-miss for critical infrastructure).
- CVE-2024-3094 (XZ Utils) — multi-year social engineering of OSS maintainer position; serves as cautionary case for OSS dependency monitoring.
- Microsoft Storm-0558 (Jul 2023) + Midnight Blizzard / APT29 (Jan 2024 Microsoft corporate accounts + Hewlett Packard Enterprise) — nation-state cloud-identity attacks.
- Snowflake-customer breaches (UNC5537 / Shiny Hunters, May-Jun 2024) — credential theft via infostealer malware → valid Snowflake credentials → exfil Ticketmaster, AT&T, Santander, LendingTree, Advance Auto Parts. Defense: MFA on all Snowflake (even though “trusted” data), info-stealer detection.
15. Risk register
- Alert fatigue + analyst burnout — Tier 1 turnover 25-50% annual at typical SOCs; rotation + automation + clear escalation criteria mandatory.
- Vendor concentration — CrowdStrike Jul 19 2024 outage demonstrated the single-vendor risk. Defense-in-depth + Sysmon + native OS controls + multi-SIEM mitigate.
- MSSP quality variance — alert-shopping (passing alerts back to customer without enriched triage) is endemic at lower-tier MSSPs; rigorous SLAs + cycle of evidence-of-work audits.
- Detection gaps — coverage of cloud + identity + container + SaaS lag behind endpoint+network; intentional roadmap to close.
- Insider risk — DLP-only approaches miss collusion + planned exfil; UEBA + asset behavioral baselining needed for advanced cases.
- Regulatory clock risk — SEC 4-day Form 8-K, GDPR 72-hr, NIS2 24-hr-initial + 72-hr-full, CIRCIA 72-hr-significant + 24-hr-ransom — overlapping clocks during a major incident; pre-built notification matrix in IR playbook.
- AI-assisted attackers — LLM-generated phishing + voice clones + deepfake video calls accelerating 2024-2026 (FBI IC3 + EU CERT advisories). Companion: AI-assisted defenders (Microsoft Security Copilot, CrowdStrike Charlotte AI, Google Sec-PaLM) ramping in parallel.
- OT / IT convergence — Colonial Pipeline 2021 lesson still relevant; OT segmentation + Purdue model + ICS-specific NDR (Dragos, Claroty, Nozomi).
- Cyber insurance market hardening: premiums up 50-100% 2021-2024 then softened mid-2024; coverage exclusions expanding (war/terrorism, as litigated in Merck v. Ace American 2022; nation-state); careful policy reading + alignment with security controls.
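The pre-built notification matrix mentioned under regulatory clock risk can start as a deadline calculator. The sketch below makes two loud simplifications: it starts every clock from a single trigger time, whereas in practice each regime has its own trigger (awareness, materiality determination, ransom payment), and its business-day logic skips weekends only, not public holidays. The regime labels are shorthand, not legal citations.

```python
from datetime import datetime, timedelta, timezone

# Calendar-hour clocks named in the risk register above.
CLOCKS_HOURS = {
    "NIS2 early warning": 24,
    "NIS2 incident notification": 72,
    "GDPR Art 33": 72,
    "CIRCIA covered incident": 72,
    "CIRCIA ransom payment": 24,
}
SEC_BUSINESS_DAYS = 4  # Form 8-K Item 1.05


def add_business_days(start: datetime, days: int) -> datetime:
    """Advance by whole business days, skipping Saturdays and Sundays only."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 .. Friday=4
            remaining -= 1
    return current


def notification_deadlines(trigger: datetime) -> dict[str, datetime]:
    """Return {regime: deadline}, earliest first, for a single trigger time."""
    deadlines = {name: trigger + timedelta(hours=h) for name, h in CLOCKS_HOURS.items()}
    deadlines["SEC Form 8-K Item 1.05"] = add_business_days(trigger, SEC_BUSINESS_DAYS)
    return dict(sorted(deadlines.items(), key=lambda kv: kv[1]))


if __name__ == "__main__":
    trigger = datetime(2025, 3, 6, 9, 0, tzinfo=timezone.utc)
    for regime, due in notification_deadlines(trigger).items():
        print(f"{due:%a %Y-%m-%d %H:%M %Z}  {regime}")
```

Legal should own the trigger definitions; the value of the calculator is that the IR lead sees every overlapping deadline in one sorted list within minutes of declaring an incident.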
16. Adjacent
- design-saas-platform-launch — SOC perspective on the SaaS that customers depend on
- design-data-center-cooling-system — physical layer for on-prem SOC
- auth-authz — identity + access foundations the SOC monitors
- observability-stack — log pipeline overlaps with SOC ingest pipeline
- distributed-systems-fundamentals — failure modes the SOC reasons about
- administrative-law — CIRCIA, SEC disclosure, NYDFS regulatory layer
- employment-and-environmental-law — insider + investigative employment-law constraints