Winter Soldier

Paste an indicator. Watch Claude investigate.

A Kali-native OSINT and pen-test operator console for cybercrime victims, attribution investigators, and authorised pen-test engagements. The whole stack — Next.js UI, Python MCP bridge, full Kali Linux toolkit, Flask backend — runs inside one Docker image. One docker run, one volume to back up, one OAuth token in Settings.

1 image (Docker artefact)
12+ indicator types
/data (one volume)
0 host prereqs (beyond Docker)
Paste-and-pivot

One indicator. A fleet of investigators.

Drop any indicator into the paste bar — a wallet address, a domain, an email, a file hash, a phone number. Winter Soldier auto-detects the type, spawns a Claude Code CLI subprocess to investigate, and queues every newly-discovered IOC as a pending pivot for the operator to approve.

> ● detected: domain

01 · investigating evil-domain.example.test (domain)
whois · crt.sh · wayback · dns · subfinder · amass
subs: 47 subdomains discovered · 12 with active SSL certs in last 90d
whois: registrant proxied (Whois Privacy Corp, BS) · registered 2024-08-12
reuse: NS records match 3 known phishing clusters · pivots queued ↓
turns 12 · tokens 184k · elapsed 0:47 · +2 pivots queued
complete

02 · pivoted to 203.0.113.42 (ip-v4) · evil-domain.example.test
shodan · rdap · passive-dns · nmap
asn: AS200000, BulletProof Hosting Ltd · BG · 142 hosts on this /24
ports: 22 (OpenSSH 8.2) · 80 (nginx) · 443 (Let's Encrypt) · 3389 (RDP)
history: previously hosted 14 domains across 4 phishing kits since 2024
turns 8 · tokens 96k · elapsed 0:31 · +1 pivot queued
complete

03 · pivoted to bc1q…7t6jzn (wallet-btc) · donation footer link
mempool.space · OFAC list · blockstream · arkham
volume: ₿ 14.82 received · ₿ 14.79 forwarded · 312 deposits since 2024-11
cluster: co-spending cluster of 8 addresses · 2-of-3 multisig pattern
sanctions: no direct OFAC match · 2-hop link to a sanctioned mixer (suspect)
turns 6 · tokens 74k · elapsed 0:24 · leaf node
complete

↳ each subprocess streams findings live via SSE; pivots fan out with no breadth limit until the depth cap is hit
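The fan-out described above can be pictured as a breadth-first walk with a depth cap and an operator approval step. The sketch below is illustrative only: `Pivot`, `fan_out`, the `discover` and `approve` callbacks, and the default cap of 2 are all assumptions, not names or values taken from Winter Soldier.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Pivot:
    indicator: str
    depth: int  # 0 for the pasted indicator, +1 for every hop


def fan_out(root, discover, approve, max_depth=2):
    """Breadth-first walk over approved pivots, stopping at max_depth.

    discover(indicator) -> iterable of newly found indicators (hypothetical).
    approve(indicator)  -> bool, stands in for the operator's decision.
    """
    seen = {root}
    queue = deque([Pivot(root, 0)])
    visited = []
    while queue:
        current = queue.popleft()
        visited.append(current)
        if current.depth >= max_depth:
            continue  # depth cap hit: this node is a leaf
        for found in discover(current.indicator):
            if found in seen:
                continue
            seen.add(found)  # rejected pivots are remembered too, so they are not re-queued
            if approve(found):
                queue.append(Pivot(found, current.depth + 1))
    return visited
```

Breadth is left uncapped here, so the depth cap and the approval callback are the only brakes on the walk.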

12+ indicator types

Everything an investigator pastes, mapped to its own pipeline.

Each indicator type has its own prompt template, its own skill-augmented Claude workflow, and its own auto-pivot rules. The detector recognises them on paste; the dispatcher routes them automatically.

wallet (BTC)
wallet (ETH)
domain
url
ip-v4
ip-v6
email
phone (E.164)
username
person
hash (MD5)
hash (SHA1)
hash (SHA256)
repository
CVE
package
tx-hash
onion
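As a rough idea of how detection on paste could work, the sketch below classifies a string by shape alone. It is an assumption-laden illustration, not the product's detector: the labels, regular expressions, and ordering are hypothetical, and types with no distinctive shape (username, person, repository, package) are left out.

```python
import ipaddress
import re

# Ordered (label, pattern) pairs; more specific shapes come first.
# All labels and patterns are illustrative assumptions.
_PATTERNS = [
    ("cve", re.compile(r"CVE-\d{4}-\d{4,}", re.IGNORECASE)),
    ("wallet-eth", re.compile(r"0x[0-9a-fA-F]{40}")),
    ("tx-hash", re.compile(r"0x[0-9a-fA-F]{64}")),
    ("hash-sha256", re.compile(r"[0-9a-fA-F]{64}")),
    ("hash-sha1", re.compile(r"[0-9a-fA-F]{40}")),
    ("hash-md5", re.compile(r"[0-9a-fA-F]{32}")),
    ("wallet-btc", re.compile(r"bc1[ac-hj-np-z02-9]{11,71}|[13][1-9A-HJ-NP-Za-km-z]{25,34}")),
    ("url", re.compile(r"https?://\S+", re.IGNORECASE)),
    ("email", re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")),
    ("phone", re.compile(r"\+[1-9]\d{6,14}")),
    ("onion", re.compile(r"[a-z2-7]{56}\.onion", re.IGNORECASE)),
    ("domain", re.compile(r"([a-z0-9-]+\.)+[a-z]{2,}", re.IGNORECASE)),
]


def detect(raw: str) -> str:
    """Return a type label for a pasted indicator, or 'unknown'."""
    text = raw.strip()
    try:
        ip = ipaddress.ip_address(text)
        return "ip-v4" if ip.version == 4 else "ip-v6"
    except ValueError:
        pass  # not an IP literal, fall through to the shape patterns
    for label, pattern in _PATTERNS:
        if pattern.fullmatch(text):
            return label
    return "unknown"
```

Shape alone is ambiguous in places: a bare 64-character hex string could be a SHA256 file hash or a Bitcoin transaction ID, so a real detector would need context or an operator override.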
Pen Test Mode

One toggle decides what the investigators can see.

The Kali toolkit is always installed and always running inside the container. The toggle is a pure MCP-config switch: it decides which tools are registered with each new spawned investigator. Toggling is instantaneous — no container start, no compose dance, no healthcheck wait.
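A minimal sketch of what a registration-time switch might look like, assuming Pen Test Mode corresponds to the active classification and everything else is passive. The module names are hypothetical; only the idea of a classification dict comes from this page.

```python
PASSIVE, ACTIVE = "passive", "active"

# Hypothetical module names, used only to illustrate the filter.
_MODULE_CLASSIFICATION = {
    "whois_lookup": PASSIVE,
    "cert_transparency": PASSIVE,
    "port_scan": ACTIVE,
    "web_fuzz": ACTIVE,
}


def modules_for_mode(pen_test_mode: bool, classification=_MODULE_CLASSIFICATION):
    """Return the module names to register for a newly spawned investigator."""
    if pen_test_mode:
        return sorted(classification)  # active mode: everything is registered
    # passive mode: invasive modules are never registered at all
    return sorted(name for name, kind in classification.items() if kind == PASSIVE)
```

Because the filter runs before registration, a passive-mode investigator has no handle on the invasive tools at all, rather than a handle that gets refused at call time.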

Architecture

One container, three concerns, one published port.

s6-overlay runs as PID 1 and supervises two long-running services. Per-investigator Claude Code CLI subprocesses spawn on demand inside the same container. The audited bearer-token boundary between the MCP bridge and Flask survives intact — just on loopback instead of host.docker.internal.

winter-soldier (kalilinux/kali-rolling)

s6-overlay (PID 1)
supervises both services · forwards SIGTERM from docker stop cleanly

nextjs (standalone) · :3000
operator UI · OAuth-token Settings · paste-and-pivot

kali-flask · 127.0.0.1:5000
Kali HTTP API · bearer-token gate · zebbern fork (audited)

↑ spawns on demand ↑

claude CLI
one per active investigation · --mcp-config scratch/mcp.json

python mcp_server.py
MCP bridge · stdio in, HTTP to flask via loopback
one command, anywhere

docker run -d \
  --name winter-soldier \
  -p 3000:3000 \
  -v winter-soldier-data:/data \
  --cap-add NET_RAW --cap-add NET_ADMIN \
  --device /dev/net/tun \
  ghcr.io/flightxcaptain/winter-soldier:latest

# open http://localhost:3000
# paste OAuth token in Settings → done.
What's in the box

A Kali-powered operator console, ready to docker run.

Winter Soldier collapses what used to be four host prerequisites (Node, npm, Python 3, Docker Desktop) into one. The full Kali pentest toolkit, the OSINT investigator pipeline, the Settings UI, and the audited security boundary all live inside one image. The operator runs docker run, pastes a token, starts investigating.

Single unified container

Next.js on 0.0.0.0:3000, Kali Flask on 127.0.0.1:5000, Python MCP bridge as a loopback child, supervised by s6-overlay. One docker run, one volume to back up.

Paste-and-pivot OSINT

Wallet, domain, URL, IP, email, phone, username, person, file hash, repository, CVE, package, tx-hash, onion — auto-detected and routed to the right investigator template.

Pen Test Mode toggle

Switches what tools investigators see. No docker compose dance. Mode applies to new pastes only; in-flight investigations keep whatever mode they were spawned with.

Audited security boundary

Inherited verbatim from the upstream zebbern-kali-mcp fork audit. Stripped arbitrary-shell tools. Bearer auth on every Flask request. /api/exec returns 403 regardless of caller.

Subscription-billed

Paste a year-long OAuth token into Settings. The container injects CLAUDE_CODE_OAUTH_TOKEN into every spawned investigator. No ANTHROPIC_API_KEY required.

Single-volume persistence

OAuth token, auto-generated Kali bearer, audit logs, per-paste workspace, case files — every operator-state artefact lives under /data. Container is otherwise immutable.

Per-investigator MCP

Each Claude Code subprocess gets its own MCP server registration in its scratch dir at spawn time. The operator's global Claude config is never touched.
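One way to picture per-investigator registration: write an mcp.json into the scratch directory at spawn time and hand its path to the CLI, together with an environment that carries the OAuth token. This is a sketch under assumptions. The server name `kali`, the `WS_MODE` variable, and the `prepare_investigator` helper are hypothetical; only `--mcp-config`, `mcp_server.py`, and `CLAUDE_CODE_OAUTH_TOKEN` appear on this page.

```python
import json
from pathlib import Path


def prepare_investigator(scratch: Path, mode: str, oauth_token: str, base_env: dict):
    """Write scratch/mcp.json and return (argv, env) for one investigator."""
    scratch.mkdir(parents=True, exist_ok=True)
    config = {
        "mcpServers": {
            "kali": {  # hypothetical server name
                "command": "python",
                "args": ["mcp_server.py"],
                "env": {"WS_MODE": mode},  # hypothetical: mode fixed at spawn time
            }
        }
    }
    config_path = scratch / "mcp.json"
    config_path.write_text(json.dumps(config, indent=2))

    env = dict(base_env)  # copy, so the caller's environment is not mutated
    env["CLAUDE_CODE_OAUTH_TOKEN"] = oauth_token
    argv = ["claude", "--mcp-config", str(config_path)]
    return argv, env
```

Writing the mode into the file at spawn time fits the behaviour described for the toggle: a later change affects only investigators spawned afterwards.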

Glass-card streaming UI

Live SSE event stream — indicator detection, structured findings, tool-call chips, severity rollups, follow-up chat. Per-card abort + per-paste kill switches.
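For readers unfamiliar with Server-Sent Events, each message on the wire is a few `field: value` lines followed by a blank line. The helper below is a generic sketch of that framing; the event name and payload keys in the test are hypothetical and do not describe Winter Soldier's actual event schema.

```python
import json
from typing import Optional


def sse_event(event: str, payload: dict, event_id: Optional[int] = None) -> str:
    """Frame one Server-Sent Events message as text."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # json.dumps without indent emits a single line, which keeps the data field valid
    lines.append(f"data: {json.dumps(payload, separators=(',', ':'))}")
    return "\n".join(lines) + "\n\n"  # the blank line terminates the message
```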

Evidence-grade case files

Every active-mode Kali tool invocation appended to /data/workspace/audit.jsonl. cases/ holds long-form material ready for law-enforcement handover.
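An append-only JSON Lines log is simple to sketch: one JSON object per line, opened in append mode so earlier records are never rewritten. The record fields below (`ts`, `tool`, `target`, `mode`) and the `append_audit` helper are assumptions for illustration; the page does not document the actual audit.jsonl schema.

```python
import json
import time
from pathlib import Path


def append_audit(log_path: Path, tool: str, target: str, mode: str) -> dict:
    """Append one invocation record to a JSON Lines audit log."""
    record = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),  # UTC timestamp
        "tool": tool,
        "target": target,
        "mode": mode,
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:  # "a": never truncates
        fh.write(json.dumps(record, sort_keys=True) + "\n")
    return record
```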

Trail of Bits skills

YARA-X rule authoring for hash investigations. Chain-specific scanners for wallet pivots. Burp Suite parser for victim-supplied evidence. Skill-driven workflow upgrades per indicator type.

Deploy anywhere

The image is an artefact, not a deploy script. Any host that accepts NET_RAW / NET_ADMIN / /dev/net/tun works. Operator state migrates with the volume.

Authorisation-first

Never scan anything you don't own or have written permission to test. No hack-back. No vigilantism. Investigate, document, hand to law enforcement.

Security boundary

Five hardening patches, all inherited from the upstream audit.

The vendored Kali MCP fork ships with five Winter Soldier security patches applied before the bridge ever talks to Flask. They survive the unified-container consolidation byte-for-byte; the boundary just moved from host networking to container loopback.

01 Arbitrary-shell tools stripped

zebbern_exec, exec_stream, send_input, read_output in mcp_tools/command_exec.py are removed. Only health and system_network_info survive from that module.

02 /api/exec neutralised at Flask

Both the /api/exec and /api/command endpoints return 403 regardless of caller. Defence in depth: even if the MCP layer's patches were bypassed, the backend itself refuses to run arbitrary shell commands.

03 Bearer-token auth on every endpoint

Flask refuses to start without WS_KALI_API_TOKEN set; every request from the MCP client carries the token in Authorization: Bearer …. The token is auto-generated on first boot and lives in /data/.config/kali-api-token (mode 600).

04 Per-tool passive/active classification

A _MODULE_CLASSIFICATION dict drives passive-vs-active loading. Passive mode filters every invasive module out at registration time — Claude literally cannot see those tools. Active mode logs every invocation to audit.jsonl.

05 Build from reviewed source

Container always builds locally from the source pinned in REVIEWED_COMMIT.txt — never pulls GHCR latest. Upstream CI pushes on every main commit; that image may not match the code that was audited in this repo.
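Patches 02 and 03 above can be sketched together as a framework-free request gate: blocked paths are refused before the token is even examined, and every other path needs a matching bearer token. This is an assumption-laden illustration, not the fork's code. The helper names, the token length, and the choice of 401 for a bad token are hypothetical.

```python
import hmac
import os
import secrets
from pathlib import Path
from typing import Optional

BLOCKED_PATHS = {"/api/exec", "/api/command"}


def load_or_create_token(path: Path) -> str:
    """Return the stored token, generating it with mode 600 on first use."""
    if path.exists():
        return path.read_text().strip()
    token = secrets.token_urlsafe(32)
    path.parent.mkdir(parents=True, exist_ok=True)
    # O_EXCL: fail rather than overwrite; 0o600: owner read/write only
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(token)
    return token


def gate(path: str, auth_header: Optional[str], token: str) -> int:
    """Return the HTTP status a request should receive."""
    if path in BLOCKED_PATHS:
        return 403  # refused regardless of caller, even with a valid token
    expected = f"Bearer {token}".encode()
    supplied = (auth_header or "").encode()
    if not hmac.compare_digest(supplied, expected):  # constant-time comparison
        return 401
    return 200
```

Checking the blocked paths first means a leaked token still cannot reach the arbitrary-shell endpoints, which is the defence-in-depth point patch 02 makes.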

Tech stack

Boring choices where it matters; sharp ones where it counts.

Next.js 16 · React 19 · TypeScript (strict) · Tailwind v4 · Kali Linux (kali-rolling) · Docker · s6-overlay · Python 3 · Flask · Claude OAuth · MCP · Playwright (headless)
Rules of engagement (non-negotiable)
  • Authorisation is mandatory. Never scan, intrude on, or attack any system you do not own or have explicit written permission to test.
  • Legal tools and sources only. OSINT means publicly or legitimately accessible. No stolen credentials, no illicit data brokers, no unauthorised access.
  • No hack-back. No vigilantism. Retaliation against identified attackers is illegal in most jurisdictions and destroys evidence. Investigate → document → hand off to law enforcement.
  • Protect the victim twice. Redact PII before sharing. Treat their data with at least the care you'd want for your own.
  • No fabrication. Unverified leads are labelled as such. Attribution needs corroboration — single-source claims are hypotheses, not conclusions.
Engagement-only · not for resale or redistribution

Winter Soldier bundles a full Kali Linux pentest stack with an AI investigator pipeline. It is a dual-use toolkit, and Alchatex does not license or sell it as an off-the-shelf product. There is no public download, no SaaS tenant, and no reseller channel.

Engagements are scoped per client, gated on demonstrated written authorisation to test the targets in question, and delivered as a private container build owned by the operator commissioning the work. Redistribution, sublicensing, or transfer of the image to third parties is not permitted under those terms.

The case study on this page is public so you can decide whether the capability fits the work you have. The artefact itself is not.

Have a case that needs a console?

Whether you're responding to a phishing incident, helping a small business through a ransomware tail, hardening a stack before launch, or running an authorised pen-test engagement — let's talk about how Winter Soldier (or a tailored variant) fits.