Whitepaper 08: Architecting Autonomous Penetration Testing — A Systems Design Approach

Author: Khushal Suthar
Date: June 2026
Series: Autonomous Penetration Testing with AI Agents
Category: Systems Design & Reference Architecture

Executive Summary

Building an autonomous penetration testing system is a systems design problem, not a prompt engineering problem. The agent is not a single model call; it is a distributed system with state management, tool execution, context orchestration, safety enforcement, and human-in-the-loop interfaces. This paper presents a reference architecture for autonomous pentesting, decomposed into six layers with defined responsibilities, interfaces, and failure modes. The architecture integrates five innovations developed across this whitepaper series — the Tri-Con 3-Layer Index (WP01), the Token Engine (WP02), the Custom Orchestrator (WP03), the Phase Map Architecture (WP04), and the Skill-Based Platform (WP05) — into a single coherent system where each innovation occupies a precise architectural position and interacts with the others through well-defined contracts.

The architecture is informed by production experience and is designed to be implementable today with available components, while remaining adaptable as model capabilities evolve. We present a layered architecture diagram, component interaction flows, data flow between layers, deployment topology, cross-cutting concerns (security, observability, error handling), the adaptability properties that emerge from decoupling, and a comparison with monolithic AI pentesting approaches that treat the entire engagement as a single prompt chain.

1. The System as a Whole

An autonomous pentesting system is best understood as a control loop:

Observe → Orient → Decide → Act → Record → (repeat)

Observe: Execute a tool, capture output.
Orient: Interpret the output, update the world model, compress observations through Tri-Con.
Decide: Select the next action based on objectives, constraints, and phase-map topology.
Act: Execute the selected tool or sub-agent through the Skill Platform.
Record: Persist findings, update memory, log decisions, advance phase-map state.

This loop runs continuously for the duration of the engagement, at a rate of one cycle every few seconds to several minutes depending on tool execution time. The architecture's job is to implement this loop reliably, safely, and cost-effectively across hours of unattended operation.
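The control loop can be sketched in a few lines of Python. This is a minimal illustration, not the series' implementation: `LoopState`, `run_loop`, and the two-step `plan` are hypothetical names, and because nothing has been observed on the first cycle, the sketch enters the loop at the Decide step.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class LoopState:
    """Hypothetical stand-in for the world model and decision log."""
    observations: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def run_loop(
    decide: Callable[[LoopState], Optional[str]],
    act: Callable[[str], str],
    max_cycles: int = 100,
) -> LoopState:
    """Repeat decide -> act/observe -> orient -> record until decide returns None."""
    state = LoopState()
    for _ in range(max_cycles):
        action = decide(state)                     # Decide: next action, or None to stop
        if action is None:
            break
        output = act(action)                       # Act + Observe: run tool, capture output
        state.observations.append(output)          # Orient: fold output into state
        state.log.append(f"{action} -> {output}")  # Record: persist the decision
    return state


# Hypothetical fixed plan; a real agent would decide from the world model.
plan = ["scan", "enumerate"]


def decide(state: LoopState) -> Optional[str]:
    done = len(state.log)
    return plan[done] if done < len(plan) else None


def act(action: str) -> str:
    return f"output of {action}"


final = run_loop(decide, act)
print(final.log)
```

The `max_cycles` bound is a design choice of the sketch: an unattended loop needs a hard stop that does not depend on the decision function behaving well.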

The system is not monolithic. It decomposes into six layers, each with a distinct responsibility, and most of them host one or more of the five innovations:

Execution

State

Cognitive

Memory

Orchestration

Interface

We examine each in turn, tracing how the five innovations compose into a unified system.

2. The Layered Architecture Diagram

INTERFACE LAYER
    Operator Console │ Reporting Engine │ Telemetry/Observability
    (real-time activity, world model browser, decision log, escalation)
        │
        │ (reads from all layers, writes escalations)
        │
ORCHESTRATION LAYER
    Custom Orchestrator (WP03) ◄──► Phase Map Graph Walker (WP04)
    Custom Orchestrator (WP03)
        - Capability reg.
        - Task matching
        - Result assessment
        - Reassignment
    Phase Map Graph Walker (WP04)
        - Declarative engagement topology
        - Conditional gate evaluation
        - Parallel branches, nesting
        - Runtime modification
        │
        │ delegates to sub-agents (recon, enum, exploit, lateral)
        │
COGNITIVE LAYER
    Reasoning Pipeline ───► Tri-Con Index (WP01)
    Reasoning Pipeline
        - Context assembly
        - LLM call
        - Output validation
        - Model router
    Tri-Con Index (WP01)
        L3 always loaded (~500-2K tok)
        L2/L1 cascaded retrieval on demand
    All text flows through:
    Token Engine (WP02)
        L1 Dedup → L2 Shorthand → L3 Wordlist → L4 Compress
        (18-35% token reduction, zero data loss, reversible)
        │
        │ reads/writes state, retrieves from memory
        │
STATE LAYER
    World Model
        - Assets, findings
        - Credentials
        - Sessions, chains
        - History (events)
    Phase Map State
        - Active nodes, worklist
        - Completed phases
        - Gate evaluation results
        - Parallel branch tracking
    Event-sourced, append-only, checkpointed
        │
MEMORY LAYER
    L1 Working: LLM context window (Tri-Con L3 + active drill)
    L2 Session: Engagement store + vector index + graph store
                (Tri-Con L1/L2 indexes, world model, chain graph)
    L3 Long-term: Knowledge base (CVEs, playbooks, heuristics)
        │
EXECUTION LAYER
    Skill-Based Platform (WP05)
        - Skill loader: reads skill.yaml manifests from library
        - Runtime composition: assembles skill components at runtime
        - Safety policy: scope checks, rate limits, authorization
        - Sandboxing: isolated containers per tool execution

This is a distributed system. Each layer communicates with adjacent layers through defined contracts — not through shared mutable state or implicit coupling. The innovations are not bolted on; they are structural components that define the layer's internal architecture. We now trace each layer in detail.

3. The Execution Layer

3.1 Responsibility

The execution layer runs tools. It is the system's hands — the only layer that touches the target environment. It must:

Execute tools deterministically and capture complete output
Enforce timeouts and resource limits
Handle failures gracefully (tool crashes, network errors, invalid arguments)
Provide a uniform interface to the cognitive layer regardless of tool type
Enforce safety as deterministic code, not LLM judgment

3.2 The Skill-Based Platform as Execution Substrate

The execution layer is built on the Skill-Based Platform (WP05). Rather than hardcoding tool integrations into the core binary, the platform's core is deliberately tiny — it does exactly four things: load skills, compose them into runnable modules at request time, enforce safety policy, and stream results. The knowledge of what to do lives entirely in skills; the knowledge of how to do it safely lives in the core.

Every tool the agent can run is defined as a skill package — a declarative manifest (skill.yaml) plus assets (knowledge documents, command definitions, pattern libraries, parser scripts, report templates). A skill's manifest declares its capabilities, input requirements, output characteristics, side effects, and dependencies on other skills. The execution layer reads these manifests at engagement start, materializes the skill components, and wires them into the runtime.

Skill Package (web/apache-path-traversal@1.2.0):

├── skill.yaml                # Manifest: capabilities, params, safety rules
├── knowledge/overview.md     # CVE-2021-41773 technical context
├── commands/nuclei-cve.yaml  # Command spec: nuclei -t cves/2021/...
├── patterns/traversal.yaml   # Matchers: 403 + ../ → Path Traversal
├── parser/parse.py           # Sandboxed parser → normalized Finding objects
└── report/template.j2        # Jinja2 template for report rendering

This design means that adding a new tool or technique is not an engineering project; it is a documentation project. A domain expert writes a skill package, validates it against the schema, and pushes it to the skill library. The core binary never changes, so the core's exposure to regressions stays fixed while the capability surface grows continuously.

3.3 Uniform Tool Interface

Every skill that produces commands exposes a uniform execution interface:

ToolExecutionResult:

    tool_name: str
    skill_id: str             # e.g. "web/apache-path-traversal@1.2.0"
    args: dict
    exit_code: int
    stdout: bytes
    stderr: bytes
    duration_ms: int
    metadata: dict            # timestamps, target host, source agent
    structured_output: dict   # parsed/normalized by skill parser

The structured_output field is produced by the skill's parser component — a sandboxed Python script or declarative jq/JSONata expression that transforms raw output into normalized Finding objects. This is the compact representation that the cognitive layer's Tri-Con indexer consumes. The raw stdout/stderr is retained in the state layer for audit but never loaded into model context directly.
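One way to express this interface in Python is a frozen dataclass. The field names follow the listing above; the `context_view` helper and the example values are assumptions added for illustration, showing how the raw streams can be kept out of anything handed to the cognitive layer.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class ToolExecutionResult:
    tool_name: str
    skill_id: str
    args: Dict[str, Any]
    exit_code: int
    stdout: bytes
    stderr: bytes
    duration_ms: int
    metadata: Dict[str, Any] = field(default_factory=dict)
    structured_output: Dict[str, Any] = field(default_factory=dict)

    def context_view(self) -> Dict[str, Any]:
        """Hypothetical helper: compact view without raw stdout/stderr."""
        return {
            "tool_name": self.tool_name,
            "skill_id": self.skill_id,
            "exit_code": self.exit_code,
            "structured_output": self.structured_output,
        }


# Illustrative values only.
result = ToolExecutionResult(
    tool_name="nuclei",
    skill_id="web/apache-path-traversal@1.2.0",
    args={"target": "10.0.0.5"},
    exit_code=0,
    stdout=b"raw tool output",
    stderr=b"",
    duration_ms=4200,
    structured_output={"findings": []},
)
view = result.context_view()
print(sorted(view))
```

Freezing the dataclass stops callers from reassigning fields on a result after the fact, which suits a record that is retained for audit.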

3.4 Safety Enforcement

The execution layer is the last line of defense before actions touch the target. Safety enforcement is implemented in the platform core as deterministic code, not delegated to the LLM:

Scope boundaries: IP ranges, host lists, URL scopes. A tool invocation targeting an out-of-scope host is blocked before execution. The skill manifest declares supported_targets and the core cross-references against the engagement's scope definition.
Rule-of-engagement constraints: No DoS, no social engineering, no physical access, time windows. These are enforced by the core's safety policy engine, which reads the engagement rules and rejects non-compliant invocations.
Rate limiting: The core enforces per-target rate limits defined in the skill manifest's side_effects section. A skill that generates high traffic (e.g., directory brute-forcing) carries a stricter rate limit than a single-request skill.
Authorization verification: Each high-risk action (exploit execution, credential dumping) must be traceable to an authorization in the rules of engagement. The core checks this before execution.

The LLM may decide to run an exploit; the execution layer checks whether that exploit is permitted against that target and refuses if not. Safety is a code path, not a prompt instruction.
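A deterministic scope and authorization check might look like the sketch below. It uses only the standard-library `ipaddress` module. The function names, the action-class strings, and the fail-closed handling of non-IP targets are assumptions for illustration; URL scopes, time windows, and rate limits are left out.

```python
import ipaddress
from typing import Iterable, Set, Tuple


def in_scope(target: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True only if target is a literal IP inside an allowed network."""
    try:
        addr = ipaddress.ip_address(target)
    except ValueError:
        return False  # not a literal IP address: fail closed
    return any(addr in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def check_invocation(
    target: str,
    action_class: str,
    allowed_cidrs: Iterable[str],
    permitted_actions: Set[str],
) -> Tuple[bool, str]:
    """Block out-of-scope targets first, then unauthorized action classes."""
    if not in_scope(target, allowed_cidrs):
        return (False, "target out of scope")
    if action_class not in permitted_actions:
        return (False, f"action class '{action_class}' not authorized")
    return (True, "ok")


# Hypothetical engagement rules.
scope = ["10.0.0.0/24"]
permitted = {"scan", "exploit"}

print(check_invocation("10.0.0.5", "exploit", scope, permitted))
print(check_invocation("192.168.1.10", "scan", scope, permitted))
print(check_invocation("10.0.0.5", "credential_dump", scope, permitted))
```

Because the check is a plain function over the scope definition, its outcome does not depend on what the model was told or how it was prompted.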

3.5 Execution Isolation

Tools run in isolated execution contexts — containers, VMs, or sandboxes — for two reasons:

Containment: A compromised tool (e.g., a malicious exploit payload that backfires) cannot affect the agent's control plane. The skill's parser runs in the same sandbox, so even a malicious output cannot escape into the orchestrator.
Reproducibility: Isolated environments with known tool versions (pinned in the skill manifest) produce reproducible results.

The execution layer supports parallel tool execution where skills target independent hosts, and serial execution where skills share state (e.g., an exploit followed by a post-exploitation command on the same session). The orchestrator coordinates this through its task assignment system.

4. The State Layer

4.1 The World Model

The state layer maintains the system's world model: a structured, typed, queryable representation of everything the agent knows about the target environment. It is the single source of truth — not a bag of text that the LLM remembers, but a durable data structure that survives context eviction.

WorldModel:
    assets: List[Asset]            # hosts, services, web apps
    findings: List[Finding]        # vulnerabilities, misconfigurations
    credentials: List[Credential]  # discovered creds, hashes, tokens
    sessions: List[Session]        # active shells, pivots
    attack_chains: List[Chain]     # multi-step paths under construction
    history: List[Action]          # executed actions and results (event log)
    scope: ScopeDefinition         # boundaries, rules of engagement
    objectives: List[Objective]    # what we're trying to achieve

Each entity has typed fields, provenance (which skill/action produced it), timestamps, and confidence scores. The world model is updated by the cognitive layer after each tool execution and is the substrate for all reasoning.

4.2 The Phase Map as Engagement Topology

Alongside the world model, the state layer maintains the Phase Map state — the runtime representation of the declarative engagement topology defined in WP04. A Phase Map is a directed graph where nodes are phases (recon, enumeration, vulnerability, exploitation, post-exploitation, reporting) or sub-phases (enum_smb, enum_http, enum_ssh), and edges are transitions with optional conditional gates.

The state layer tracks:

Active nodes: Phases currently being executed (supporting parallel branches)
Worklist: Phases queued for execution, pending gate evaluation
Completed phases: Phases that have finished, with their outcome (success, partial, failed)
Gate evaluation cache: Results of conditional gate checks, to avoid re-evaluation
Nesting context: If a pivot creates a nested engagement, the nested phase map and its relationship to the parent

The Phase Map is not a static plan — it is a living graph that the orchestrator walks dynamically. When a finding triggers a conditional gate (e.g., "if SQL injection found → transition to database extraction sub-phase"), the state layer records the gate evaluation, and the orchestrator enqueues the successor node. When post-exploitation reveals a missed service, the orchestrator can re-enter enumeration from a privileged position through a nested sub-map.

Phase Map State (mid-engagement example):
    Active:    [enum_http, enum_smb]   (parallel branch)
    Worklist:  [vuln_scan (gate: enum_complete)]
    Completed: [recon (success), enum_ssh (success)]
    Gates:     enum_complete → FALSE (enum_http still running)
    Nesting:   none active

4.3 Why a Structured World Model?

A common mistake is to let the world model be "whatever the LLM remembers." This works for demos and breaks for real engagements. A structured world model provides:

Queryability: "Give me all hosts running OpenSSH < 9.0" is a query, not a prompt. The cognitive layer issues structured queries rather than re-reading raw output.
Consistency: The same finding is represented once, not re-derived on each call. When a new nmap scan updates a service version, the old version is superseded, not duplicated.
Auditability: Every finding has provenance — which skill produced it, which agent requested it, at what timestamp. The report can cite the exact tool and output.
Persistence: The world model survives context eviction. When the LLM's context window is cleared and rebuilt from Tri-Con L3, the world model is not affected — it lives in the state layer, not in the LLM's attention.
Human readability: A human reviewer can inspect the world model to understand what the agent knows at any point.
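The queryability point can be made concrete with a small sketch. The `Service` type, the parsed-version tuple, and the sample data are assumptions; the paper does not specify the world model's storage or query API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass
class Service:
    host: str
    port: int
    product: str
    version: Tuple[int, ...]  # parsed version, e.g. (8, 9) for "8.9"


def hosts_running_below(
    services: Iterable[Service], product: str, ceiling: Tuple[int, ...]
) -> List[str]:
    """Hosts with at least one service of `product` below the `ceiling` version."""
    return sorted(
        {s.host for s in services if s.product == product and s.version < ceiling}
    )


# Hypothetical world-model contents.
services = [
    Service("10.0.0.5", 22, "OpenSSH", (8, 9)),
    Service("10.0.0.7", 22, "OpenSSH", (9, 3)),
    Service("10.0.0.9", 22, "OpenSSH", (7, 4)),
    Service("10.0.0.9", 80, "Apache httpd", (2, 4, 49)),
]

old_ssh = hosts_running_below(services, "OpenSSH", (9, 0))
print(old_ssh)
```

Storing versions as parsed tuples rather than strings is what lets the comparison be a data operation instead of a model judgment.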

4.4 Session and Chain Management

Session management tracks active access — shells, pivots, established credentials. A session is a resource that can be used, lost, or escalated. The state layer tracks session liveness (is the shell still responsive?) and session capability (what can this session do?). Losing a session mid-chain is a common failure; the state layer must detect it and signal the cognitive layer to re-establish access.

Attack chain management tracks multi-step paths. A chain is a graph of findings and sessions that connects an entry point to a target objective. The state layer stores chains as graphs, not linear narratives, because real attack paths branch: a credential may be usable on multiple hosts; a pivot may enable multiple next steps. The cognitive layer reasons over the chain graph to select the next step.

4.5 Concurrency and Consistency

In a multi-agent topology (see Section 7.4), multiple sub-agents update the world model concurrently. The state layer handles this through event sourcing: all updates are append-only events, and the world model is a projection of the event log. This provides complete auditability, allows replay (re-running the engagement from the event log to reproduce a finding or debug a failure), and resolves concurrent updates through deterministic merge rules: last-write-wins for independent fields, conflict resolution for shared entities.

Event sourcing is recommended for production systems. The overhead is modest, and the audit and replay capabilities are invaluable for debugging agent behavior, reproducing findings for report evidence, and recovering from orchestrator crashes.
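A minimal sketch of the event-sourced pattern follows, assuming an in-memory log and a flat per-entity, per-field state. The `Event` and `EventLog` names are hypothetical, and only the last-write-wins rule is shown; conflict resolution for shared entities is out of scope here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class Event:
    seq: int          # position in the append-only log
    entity_id: str
    field_name: str
    value: Any


class EventLog:
    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, entity_id: str, field_name: str, value: Any) -> Event:
        """Append a new event; existing events are never modified."""
        event = Event(len(self._events), entity_id, field_name, value)
        self._events.append(event)
        return event

    def project(self, upto: Optional[int] = None) -> Dict[str, Dict[str, Any]]:
        """Rebuild state by replaying events in order; later writes win per field."""
        state: Dict[str, Dict[str, Any]] = {}
        for event in self._events[:upto]:
            state.setdefault(event.entity_id, {})[event.field_name] = event.value
        return state


log = EventLog()
log.append("10.0.0.5:22", "version", "OpenSSH 8.2")
log.append("10.0.0.5:22", "state", "open")
log.append("10.0.0.5:22", "version", "OpenSSH 8.9")  # supersedes the first event

current = log.project()
earlier = log.project(upto=2)  # replay only the first two events
print(current, earlier)
```

Passing `upto` replays a prefix of the log, which is the mechanism behind reproducing what the agent knew at an earlier point in the engagement.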

5. The Cognitive Layer

5.1 Responsibility

The cognitive layer is where LLM reasoning happens. It receives the current world-model state (or a Tri-Con-compressed subset), the current objective, and produces a decision: what tool to run next, what hypothesis to pursue, what finding to record. It is the system's brain — but a brain that operates under severe constraints: finite context, finite tokens, finite cost budget.

The cognitive layer hosts two of the five innovations: the Tri-Con 3-Layer Index (WP01) for context management and the Token Engine (WP02) for token optimization. These are not optional add-ons; they are structural components that make long-horizon reasoning feasible at all.

5.2 Tri-Con: Cascaded Context Management

The Tri-Con 3-Layer Index solves the context-drowning problem: a single nmap scan can emit 12,000+ tokens; a full engagement easily generates millions. No context window can hold this volume while leaving room for reasoning. Tri-Con's solution is a three-tier cascaded index that distills every tool output into progressively more compact semantic layers (L1, L2, and L3, from most detailed to most compact).

An indexing agent sits alongside the orchestrator and processes every tool output asynchronously. Raw data is persisted to disk unaltered. Three progressively compressed semantic indexes are derived. The orchestrator operates exclusively on L3, keeping its steady-state context footprint to ~500–2000 tokens regardless of engagement length. When deeper detail is needed for a specific decision, it triggers cascaded retrieval — L3 → L2 → L1 → Raw — pulling only the granularity required, only for the finding in question, and only for the duration of the reasoning step.

Cognitive Cycle (Tri-Con-aware):
Assemble context: L3 snapshot (~850 tokens) + persistent context + active drill
LLM reasons over L3, decides: "Pursue vsftpd backdoor"
Drill-down: fetch L2 for FTP group (~312 tokens) → L1 for nmap (~1200 tokens)
LLM reasons over L3 + L2 + L1 (~2362 tokens), decides: "Run exploit"
After execution, indexing agent compresses new output → updates L1/L2/L3
L3 snapshot refreshed for next cycle

This architecture is what enables unbounded engagement length at bounded orchestrator context cost. Without Tri-Con, the cognitive layer would drown in its own observations by turn 20. With Tri-Con, it can run for thousands of turns across hours, always reasoning over a complete but compressed picture of everything discovered so far.
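The drill-down in the example cycle can be sketched as follows. The `TriConStore` class and its `assemble` method are hypothetical; only the token counts (850, 312, 1200) come from the example above, and the sketch stops at L1 rather than fetching raw output.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Entry:
    text: str
    tokens: int


class TriConStore:
    """Hypothetical store: one L3 snapshot plus per-finding L2 and L1 entries."""

    def __init__(self, l3: Entry, l2: Dict[str, Entry], l1: Dict[str, Entry]) -> None:
        self.l3 = l3
        self.l2 = l2
        self.l1 = l1

    def assemble(self, finding: Optional[str] = None, depth: str = "L3") -> List[Entry]:
        """Always include L3; add L2, then L1, only for the finding being drilled."""
        entries = [self.l3]
        if finding is not None and depth in ("L2", "L1"):
            entries.append(self.l2[finding])
        if finding is not None and depth == "L1":
            entries.append(self.l1[finding])
        return entries


def total_tokens(entries: List[Entry]) -> int:
    return sum(entry.tokens for entry in entries)


store = TriConStore(
    l3=Entry("engagement snapshot", 850),
    l2={"ftp": Entry("FTP group summary", 312)},
    l1={"ftp": Entry("nmap detail for FTP", 1200)},
)

steady_state = total_tokens(store.assemble())
drilled = total_tokens(store.assemble("ftp", depth="L1"))
print(steady_state, drilled)  # 850 2362
```

The drilled context is discarded after the reasoning step, so the next cycle starts again from the L3 snapshot alone.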

5.3 The Token Engine: Reversible Token Optimization

While Tri-Con manages what enters the context window, the Token Engine manages how efficiently tokens are used within it. The Token Engine is a four-stage reversible pipeline that reduces LLM token consumption by 18–35% with zero data loss — every transformation is reversible and the original input can be reconstructed bit-for-bit.

Total (compounded) reduction across the four stages (L1 Dedup, L2 Shorthand, L3 Wordlist, L4 Compress): 18–35%

The Token Engine sits in the cognitive layer's text pipeline. On the send side (before the LLM call), input text — whether it is a Tri-Con L1 summary, an L2 group, or the L3 snapshot itself — flows through L1→L2→L3→L4 compression, producing a compressed payload and a metadata sidecar. On the receive side (after the LLM call), output text flows through the inverse pipeline, restoring the original.

The engine is implemented in pure Python with no external LLM calls in the compression path. The only runtime dependency is a tokenizer (tiktoken or equivalent). This means the token savings come at no quality cost and add no model round-trip to the critical path: the LLM receives the same semantic content in fewer tokens and produces the same decisions at lower API spend.
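To illustrate what "reversible with a metadata sidecar" can mean, here is a sketch of a single line-level deduplication stage. It is not the Token Engine's actual L1 stage, whose algorithm is not described here; the function names and the sidecar format (a list of line indexes) are assumptions.

```python
from typing import Dict, List, Tuple


def dedup_compress(text: str) -> Tuple[str, List[int]]:
    """Keep each distinct line once; the sidecar records the original line order."""
    unique: List[str] = []
    position: Dict[str, int] = {}
    order: List[int] = []
    for line in text.split("\n"):
        if line not in position:
            position[line] = len(unique)
            unique.append(line)
        order.append(position[line])
    return "\n".join(unique), order


def dedup_restore(payload: str, order: List[int]) -> str:
    """Inverse of dedup_compress: rebuild the original text exactly."""
    unique = payload.split("\n")
    return "\n".join(unique[i] for i in order)


original = "22/tcp open ssh\n80/tcp open http\n22/tcp open ssh\n"
payload, sidecar = dedup_compress(original)
restored = dedup_restore(payload, sidecar)
print(len(original), len(payload), restored == original)
```

The round trip is exact because no unique line contains a newline, so splitting the payload recovers the same list that was joined.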

In practice, a naive engagement costing $1,240 in token spend is reduced to approximately $68–$90 after Tri-Con compression (eliminating the need to load raw output) combined with Token Engine optimization (compressing what does enter context). This is the difference between an economically viable autonomous pentesting system and one that costs more per engagement than a human consultant.

5.4 The Reasoning Pipeline

A single cognitive cycle is not one LLM call. It is a pipeline:

Context assembly: Gather the L3 snapshot, persistent context (objectives, scope), and any active drill-down (L2/L1 for the finding under consideration). Run all text through the Token Engine.
Primary reasoning call: The LLM produces a decision — tool selection, finding interpretation, plan update, or gate evaluation.
Output validation: The Token Engine decompresses the output, which is then validated against a structured-output schema. If the output is malformed, retry with simplified context.
Safety pre-check: The proposed action is checked against rules of engagement. The execution layer does the final check, but the cognitive layer does an early check to avoid wasted calls.
State update: The decision and its rationale are recorded in the world model's event log.

Each step is instrumented. Failures at any step are logged and handled through the error-handling taxonomy (retry, fallback, human escalation).
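The validate-and-retry step can be sketched like this. The decision schema (`action`, `target`, `rationale`), the function names, and the stand-in model are all assumptions for illustration; a real pipeline would call an LLM and would likely use a fuller schema validator.

```python
import json
from typing import Callable, List, Optional

# Hypothetical decision schema: field name -> required Python type.
REQUIRED_FIELDS = {"action": str, "target": str, "rationale": str}


def parse_decision(raw: str) -> Optional[dict]:
    """Return the decision dict if raw is valid JSON matching the schema, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected):
            return None
    return data


def reason_with_retry(
    call_model: Callable[[str], str], contexts: List[str]
) -> Optional[dict]:
    """Try each context in order (full first, simplified later); None means escalate."""
    for context in contexts:
        decision = parse_decision(call_model(context))
        if decision is not None:
            return decision
    return None


def fake_model(context: str) -> str:
    """Stand-in for an LLM call: malformed on the full context, valid on retry."""
    if context == "full context":
        return "I think we should scan the host"
    return '{"action": "run_tool", "target": "10.0.0.5", "rationale": "enumerate HTTP"}'


decision = reason_with_retry(fake_model, ["full context", "simplified context"])
print(decision)
```

Returning `None` after the last context gives the caller an explicit signal to move to the next rung of the error-handling taxonomy, such as human escalation.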

5.5 Model Selection

Not all cognitive calls require the same model. The cognitive layer includes a model router that selects the model for each call based on:

Call type: Planning and exploit selection → frontier model. Enumeration interpretation → mid-tier. Classification and routing → fast model.
Stakes: High-stakes decisions (actions that could disrupt a service) → frontier. Low-stakes (interpreting a benign scan result) → mid-tier.
Context size: Large context calls (multi-finding synthesis) → model with appropriate window. Small context (L3-only reasoning) → cheaper model.

The router is a cheap, rule-based or small-model classifier — not itself an LLM call in the hot path. The Tri-Con + Token Engine combination makes model tiering more effective: because the L3 snapshot is small, the orchestrator can use a frontier model for high-stakes planning calls without prohibitive cost, while sub-agents handling routine enumeration use mid-tier models on their own bounded contexts.
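A rule-based router of this kind fits in a few lines. The tier names, call-type strings, and precedence (stakes override call type) are assumptions; the context-size criterion is omitted to keep the sketch short.

```python
from enum import Enum


class Tier(Enum):
    FRONTIER = "frontier"
    MID = "mid-tier"
    FAST = "fast"


# Hypothetical call-type labels, following the categories listed above.
FRONTIER_CALLS = {"planning", "exploit_selection"}
FAST_CALLS = {"classification", "routing"}


def route(call_type: str, high_stakes: bool = False) -> Tier:
    """Pick a model tier from call type and stakes; no LLM call involved."""
    if high_stakes or call_type in FRONTIER_CALLS:
        return Tier.FRONTIER
    if call_type in FAST_CALLS:
        return Tier.FAST
    return Tier.MID  # e.g. enumeration interpretation


print(route("planning"))
print(route("enumeration_interpretation"))
print(route("classification"))
print(route("classification", high_stakes=True))
```

Checking stakes before call type means a nominally cheap call is still promoted when the action it leads to could disrupt a service.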

5.6 Reasoning Patterns

The cognitive layer supports several reasoning patterns, selected by the orchestrator based on the phase-map node and situation:

Reactive: Tool produced output → interpret → decide next tool. Fast, single-call. Used in enumeration sub-phases.
Deliberative: Multiple findings accumulated → synthesize → identify chains → plan multi-step attack. Multi-call, may involve sub-agents. Used in vulnerability and exploitation phases.
Exploratory: Objective is broad ("characterize this network") → generate hypotheses → test each. Multi-call, branching. Used in reconnaissance.
Defensive: Something went wrong (session lost, exploit failed) → diagnose → recover. Reactive with fallback. Used when error handling triggers.

Each pattern has a different call profile, context requirement, and cost. The orchestrator selects the pattern based on the phase-map node's configuration; the cognitive layer executes it.

6. The Memory Layer

6.1 Three-Tier Memory Architecture

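The memory layer implements a hierarchical memory architecture. Critically, the Tri-Con indexes (WP01) serve as the structural backbone of this hierarchy. They are not a separate system bolted onto memory, but the memory's primary indexing mechanism:

L1 Working: LLM context window (Tri-Con L3 snapshot + active drill-down)
L2 Session: Engagement store + vector index + graph store (Tri-Con L1/L2 indexes, world model, chain graph)
L3 Long-term: Knowledge base (CVEs, playbooks, heuristics)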

This mapping is intentional. Tri-Con's L1/L2/L3 compression tiers and the memory layer's L1/L2/L3 storage tiers share a common insight: information should be stored at multiple granularities and retrieved at the granularity the current decision requires. The Tri-Con indexing agent that compresses tool output is simultaneously building the session memory's structured index. There is no separate "memory writer" — the indexing agent IS the memory writer.

6.2 L2: The Engagement Memory

The engagement memory (session memory, not to be confused with Tri-Con's L2 contextual tier) is the most operationally critical component. It holds the world model for the current engagement and provides:

Structured queries: "All open ports on 10.0.0.7," "All credentials with confidence > 0.8" — served from the world model's typed entities.
Semantic search: "Findings related to SMB misconfiguration" — served from the vector index built over Tri-Con L1/L2 entries.
Graph traversal: "What attack chains include host 10.0.0.7?" — served from the chain graph store.
Temporal queries: "Findings from the last hour," "Findings from enumeration phase" — served from the event log with phase-map metadata.
Cascaded retrieval: L3 → L2 → L1 → Raw — the Tri-Con drill-down protocol, served from the on-disk index stores.

The engagement memory is backed by a combination of a document store (for structured data), a vector index (for semantic search), and a graph store (for chain traversal). In practice, a single PostgreSQL database with pgvector and a graph extension (Apache AGE) can serve all three, though specialized stores (Qdrant, Neo4j) offer better performance at scale.

6.3 L3: The Knowledge Base

The long-term memory persists beyond a single engagement. It holds:

Historical findings: From prior engagements (with customer consent and PII redaction), forming a corpus that the agent can search for transfer learning.
CVE and exploit knowledge: Structured vulnerability data, exploit availability, reliability ratings — sourced from public databases and enriched by the agent's own experience.
Playbooks: Reusable attack patterns ("how to exploit a misconfigured Jenkins," "how to pivot through a Windows jump host") — authored by security engineers and refined by the agent through engagement experience.
Skill knowledge documents: The knowledge/ component of every skill package (WP05) is loaded into the knowledge base at engagement start, giving the agent domain-specific context for each active skill.
Tool heuristics: What tools work well for what targets, learned from experience — feeding back into the orchestrator's capability registry.

The knowledge base enables transfer learning without fine-tuning: the agent retrieves relevant experience from prior engagements and applies it to the current one. This is where autonomous pentesting gets better with scale — each engagement makes the next one more efficient. A new CVE discovered in the wild becomes a new skill package in the library, a new entry in the knowledge base, and a new capability in the orchestrator's registry — all without code changes.

6.4 Memory Consistency

Memory updates must be consistent across tiers. When a finding is added to L2 (session memory), the Tri-Con indexing agent must finish updating the L1/L2/L3 indexes before the next cognitive cycle begins, because the agent may need to retrieve the finding on that cycle. Long-term memory (L3 knowledge base) updates are asynchronous, batched at engagement end. The memory layer handles the case where session memory and long-term memory disagree (e.g., a finding in the current engagement contradicts a historical pattern) by surfacing the conflict to the cognitive layer rather than silently resolving it.

7. The Orchestration Layer

7.1 Responsibility

The orchestration layer manages the agent topology: which sub-agent is running, what its objective is, how sub-agents communicate, and when to escalate. It is the system's executive function. The orchestration layer hosts two innovations: the Custom Orchestrator (WP03) for capability-aware task assignment and the Phase Map (WP04) for declarative engagement topology.

7.2 The Custom Orchestrator: Capability-Aware Routing

The Custom Orchestrator replaces the free-form tool-calling loop that is common in LLM agent frameworks. Rather than handing the model a list of tool schemas and asking it to decide which tool to invoke, the orchestrator maintains a capability registry: a structured database of what each skill actually does, what inputs it requires, what outputs it produces, what side effects it has, and what downstream consumers its output feeds.

The orchestrator operates in two phases:

Phase 1 — Tool Understanding: At engagement start, the orchestrator loads capability maps from every skill in the active skill set. Each capability map (defined in the skill's skill.yaml) captures not just the interface signature but the full operational profile: capabilities with proficiency ratings, input requirements, output characteristics (volume, format, noise profile), side effects (traffic generated, IDS detectability, destructive potential), downstream consumers (which Tri-Con tier the output feeds), and alternatives. The orchestrator builds a capability registry indexed by capability tags, input types, output types, and side-effect classes.

Phase 2 — Task Assignment: The orchestrator decomposes the current engagement objective (derived from the phase-map node) into atomic task specifications, matches each task's requirements against the capability registry using a weighted scoring algorithm, and selects the optimal skill — or chain of skills — for the task. After the assigned agent executes, the orchestrator assesses the results against the task's success criteria and can reassign to an alternative skill or agent if the outcome is insufficient.

Orchestrator Task Assignment Cycle:
Phase Map node → objective: "Enumerate HTTP services on 10.0.0.5"
Decompose into task spec: {capability: web_directory_enum, target: 10.0.0.5, ...}
Match against capability registry → scored candidates:
    gobuster: 0.91 (expert, high output, needs wildcard tuning)
    feroxbuster: 0.88 (expert, recursive, better soft-404)
    ffuf: 0.74 (flexible, but overpowered for this task)
Select: gobuster (highest score, matches stealth requirements)
Inject context via Tri-Con: L3 snapshot + drill to relevant L2
Execute: gobuster runs via Skill Platform
Assess: output parsed by skill parser → 47 directories found → SUCCESS
Decide: advance to next task (vuln_scan on found directories)

This capability-aware routing eliminates the four failure modes that plague free-form tool-calling: wrong tool selection (model picks by name familiarity, not capability fit), parameter misuse (model doesn't know tool-specific tuning), context pollution (model injects raw high-volume output), and side-effect blindness (model runs noisy tools against production during business hours). The capability registry encodes all of this as structured data that the matching algorithm consults deterministically.
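A weighted match against a capability registry could be sketched as below. The scoring formula, the weights, the skill names, and the proficiency and noise values are all hypothetical; they are not the algorithm or the scores from WP03, only an example of deterministic matching over structured data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class CapabilityEntry:
    skill: str
    capabilities: Dict[str, float]  # capability tag -> proficiency in [0, 1]
    noise: float                    # side-effect level in [0, 1]; higher is noisier


def score(
    entry: CapabilityEntry,
    capability: str,
    max_noise: float,
    w_fit: float = 0.7,
    w_quiet: float = 0.3,
) -> float:
    """Weighted sum of capability fit and quietness; 0.0 if the skill is ineligible."""
    proficiency = entry.capabilities.get(capability)
    if proficiency is None or entry.noise > max_noise:
        return 0.0  # lacks the capability, or violates the side-effect limit
    return w_fit * proficiency + w_quiet * (1.0 - entry.noise)


def rank(
    registry: List[CapabilityEntry], capability: str, max_noise: float
) -> List[Tuple[str, float]]:
    """Eligible skills ordered from best to worst score."""
    scored = [(e.skill, round(score(e, capability, max_noise), 3)) for e in registry]
    return sorted((s for s in scored if s[1] > 0), key=lambda s: s[1], reverse=True)


# Hypothetical registry entries.
registry = [
    CapabilityEntry("dir_enum_a", {"web_directory_enum": 0.9}, noise=0.5),
    CapabilityEntry("dir_enum_b", {"web_directory_enum": 0.8}, noise=0.2),
    CapabilityEntry("port_scanner", {"port_scan": 0.95}, noise=0.4),
]

candidates = rank(registry, "web_directory_enum", max_noise=0.6)
quiet_only = rank(registry, "web_directory_enum", max_noise=0.3)
print(candidates, quiet_only)
```

Treating the noise ceiling as a hard filter rather than a weight keeps a side-effect constraint from being traded away for a higher proficiency score.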

7.3 The Phase Map: Declarative Engagement Topology

The Phase Map (WP04) provides the orchestrator with a declarative, graph-based engagement topology that it interprets at runtime. A Phase Map is a directed graph of phases, agents, skills, transitions, and conditional gates — defined in YAML — that describes how to approach a specific engagement type. Each phase map defines:

Phases (nodes): recon, enumeration, vulnerability, exploitation, post-exploitation, reporting — or sub-phases like enum_smb, enum_http.
Agent assignments: Which specialist agent runs in each phase (recon agent, web exploitation agent, privilege escalation agent).
Skill assignments: Which skills are available within each phase. This prevents the agent from running exploit tools during enumeration — the skill set is scoped to the phase.
Transitions (edges): Ordering between phases, with optional conditional gates.
Parallel branches: Service-specific enumeration runs concurrently.
Nesting: Pivot engagements re-enter the graph from a new target via sub_map references.

The orchestrator's graph walker maintains a worklist of active nodes, executes each node's assigned agents with the node's assigned skills, evaluates outgoing gates against the finding graph, and enqueues successor nodes whose gates pass. The graph is walked dynamically — branching, merging, backtracking, and nesting based on findings rather than following a fixed sequence.
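The worklist traversal can be sketched as follows. This is a simplified illustration under stated assumptions: the phase map is a plain dictionary rather than YAML, gates are predicates over a set of finding labels, and merging, backtracking, and nesting are left out. The names PHASE_MAP, walk, and fake_run are hypothetical.

```python
from collections import deque

# Hypothetical phase map: node -> list of (successor, gate) edges.
# A gate is a predicate over the set of findings collected so far.
PHASE_MAP = {
    "recon": [("enum_http", lambda f: "host_alive" in f)],
    "enum_http": [("vuln_scan", lambda f: True)],
    "vuln_scan": [
        ("exploit", lambda f: "vulns_found" in f),
        ("report", lambda f: "vulns_found" not in f),
    ],
    "exploit": [("report", lambda f: True)],
    "report": [],
}

def walk(phase_map, start, run_node):
    """Visit nodes from a worklist, enqueueing successors whose gates pass."""
    findings, visited, order = set(), set(), []
    worklist = deque([start])
    while worklist:
        node = worklist.popleft()
        if node in visited:
            continue
        visited.add(node)
        order.append(node)
        findings |= run_node(node)  # stands in for running agents and skills
        for successor, gate in phase_map[node]:
            if gate(findings) and successor not in visited:
                worklist.append(successor)
    return order

def fake_run(node):
    """Pretend execution: recon finds a live host, the scan finds a vuln."""
    return {"recon": {"host_alive"}, "vuln_scan": {"vulns_found"}}.get(node, set())

order = walk(PHASE_MAP, "recon", fake_run)
print(order)
```

The gates are evaluated only after the node has run, so the path taken depends on what was found rather than on a fixed sequence.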

Phase Map: OWASP Web (simplified)
[recon] ──gate: host_alive──► [enum_http]
                                   │
                          ┌────────┼────────┐
                          ▼        ▼        ▼
                  [enum_dirs] [enum_tech] [enum_auth]
                          │        │        │
                          └────────┼────────┘
                                   ▼
                              [vuln_scan]
                          gate: vulns_found?
                          ├── yes ──► [exploit]
                          │             │
                          │             ▼
                          │       [post_exploit]
                          │             │
                          │             ▼
                          └── no ──►  [report]

Pre-built phase maps ship with the platform for common engagement types: OWASP Web, IoT Assessment, PTES Network, Active Directory, API Testing, Red Team. These are selectable at engagement start. Custom phase maps can be authored by security analysts without touching orchestrator code — the workflow is described, not programmed. Phases can be skipped, added, or reordered mid-engagement through a runtime modification protocol, allowing the agent to adapt to emerging findings without restarting.

7.4 Single-Agent vs. Multi-Agent

A single-agent architecture has one cognitive loop that handles all phases. It is simpler to build and debug but hits the context window crisis and does not parallelize.

A multi-agent architecture has specialized sub-agents coordinated by the orchestrator. It is more complex but:

Parallelizes independent work: Enumerate two subnets simultaneously, each in its own sub-agent with its own bounded context.
Isolates context: Each sub-agent has a bounded, coherent context — it does not see the full L3 snapshot, only the subset relevant to its phase-map node.
Enables model tiering per sub-agent: Exploitation sub-agent uses a frontier model; enumeration sub-agent uses a mid-tier model. The orchestrator's model router makes this decision per assignment.
Improves resilience: One sub-agent's failure does not crash the system. The orchestrator detects the failure, assesses it, and either retries, reassigns, or escalates.

The recommended architecture for production is multi-agent with a thin orchestrator. The orchestrator works from a small set of high-level state (current phase-map node, active sub-agents, engagement objectives, budget) and delegates all detailed work. It is stateless in the sense that it owns none of this state: everything is persisted in the state layer and read back as needed. If the orchestrator crashes, it can resume from the last persisted phase-map state and world-model checkpoint.

7.5 Escalation and Human-in-the-Loop

The orchestrator must know when to escalate to a human:

Safety-critical decisions: Actions that could cause service disruption (DoS-adjacent exploits, production database access).
Ambiguity: The agent has multiple plausible paths and insufficient information to choose.
Authorization boundaries: Actions outside the current rules of engagement that may be permissible with customer approval.
Stuck states: The agent has not made progress in N cycles (repeated failures, no new findings).

Escalation is not failure; it is a designed interaction point. The orchestrator packages the current state, the decision point, and the options into a human-readable brief and pauses. The human's decision is recorded in the event log and the agent resumes. The interface layer surfaces the escalation in real time through the operator console.
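As an illustration of the stuck-state trigger and the packaged brief, here is a small sketch. The threshold of three idle cycles and the names StuckDetector and build_brief are assumptions made for the example; the paper does not fix a value for N or a brief format.

```python
from dataclasses import dataclass

@dataclass
class StuckDetector:
    max_idle_cycles: int = 3   # the "N" from the text; 3 is an assumed value
    idle_cycles: int = 0

    def record_cycle(self, new_findings: int) -> bool:
        """Return True once N consecutive cycles produced no new findings."""
        self.idle_cycles = 0 if new_findings > 0 else self.idle_cycles + 1
        return self.idle_cycles >= self.max_idle_cycles

def build_brief(reason, state_summary, options):
    """Package the decision point into a human-readable brief."""
    lines = [f"Escalation reason: {reason}", f"State: {state_summary}", "Options:"]
    lines += [f"  ({chr(ord('a') + i)}) {opt}" for i, opt in enumerate(options)]
    return "\n".join(lines)

detector = StuckDetector()
signals = [detector.record_cycle(n) for n in [2, 0, 0, 0]]
brief = build_brief(
    "stuck state",
    "enum_http, 3 idle cycles",
    ["retry with another skill", "skip node"],
)
print(signals)
print(brief)
```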

8. The Interface Layer

8.1 The Operator Console

Even a fully autonomous system needs an operator console — a human-readable view of what the agent is doing, in real time. The console shows:

Live activity feed: Current tool executions, LLM calls, findings as they are discovered — streaming from the execution layer and cognitive layer telemetry.
World model browser: Interactive view of assets, findings, chains, sessions — read directly from the state layer.
Phase Map visualizer: The current engagement topology, with active nodes highlighted, completed nodes marked, and gate evaluation status visible. This is the most important view for understanding where the agent is in the engagement.
Decision log: Every decision the agent made, with rationale, context (the L3 snapshot at decision time), and cost. This is the audit trail.
Escalation queue: Pending human decisions, with the packaged brief for each.
Cost tracker: Token spend (pre- and post-Token Engine compression), call count, engagement cost-to-date against budget.

The console is not optional. Without it, the operator is flying blind, and "autonomous" becomes "unobservable." An unobservable autonomous system with access to exploitation tools is a liability.

8.2 Reporting

The final deliverable of a pentest is a report. The interface layer generates the report from the world model at engagement end (or on demand). The report is not a stream of LLM prose; it is a structured document generated from findings:

Executive summary: LLM-generated from the world model's high-level state, human-reviewed.
Findings list: From the world model's structured findings, each with provenance, severity, and evidence reference.
Evidence: Tool outputs (raw, from disk), screenshots, session logs — linked by provenance to the finding.
Attack chain narratives: From the chain graph, LLM-narrated. The graph structure ensures the narrative covers all branches, not just the successful path.
Remediation recommendations: From finding templates (defined in skill packages' report/ component) plus LLM elaboration.

Structured generation ensures completeness (every finding is reported), consistency (format is uniform), and auditability (every claim traces to evidence). The LLM's role is narration and synthesis, not authorship from scratch. The skill platform's report templates (Jinja2/Markdown) ensure that findings from each skill are rendered in the appropriate format.

8.3 Observability and Telemetry

The system emits telemetry for debugging, cost analysis, and quality improvement:

Per-call logs: Input context (token count pre- and post-Token Engine), output decision, model used, latency, cost. These feed the cost tracker and the context-health metrics.
Tool execution logs: Skill ID, args, result, duration, target, sandbox ID. These feed the execution layer's audit log.
Decision traces: Why did the agent choose skill X over skill Y? What capability scores did each receive? What context (L3 snapshot + drill) was considered? These feed agent improvement analysis.
Finding provenance: Which skill, which agent, which cognitive cycle produced each finding? These feed the report's evidence chain.
Phase Map traversal log: Which nodes were visited, in what order, what gates passed or failed, what branches were taken. This feeds engagement quality analysis and phase-map optimization.
Cost telemetry: Real-time spend tracking against budget, with Token Engine savings quantified separately. This feeds the budget enforcement system.

This telemetry is the substrate for post-engagement analysis, agent improvement, and the context-health and cost metrics. It is also the basis for the continuous learning loop: tool heuristics learned during an engagement are written back to the long-term memory (L3 knowledge base) and improve the next engagement.

9. Component Interaction Flows

9.1 A Single Cognitive Cycle (End-to-End)

To show how the layers and innovations interact, we trace a single cognitive cycle from observation to action:

STEP 1: OBSERVE (Execution Layer)
  Orchestrator assigns task: "Enumerate HTTP on 10.0.0.5"
    → Skill Platform loads skill: web/http-dir-enum@1.2.0
    → Execution layer runs gobuster in sandbox
    → Result: 47 directories found (raw: 8,200 tokens)
    → Skill parser normalizes → structured_output: [{path, status}, ...]

STEP 2: ORIENT (Cognitive + Memory Layers)
  Tri-Con indexing agent processes structured_output:
    → Persist raw to disk: raw/eng_001/...gobuster.out
    → Generate L1 entry: structured summary (~1,200 tokens)
    → Update L2 group "web_dirs_10.0.0.5": merge with existing (~380 tokens)
    → Update L3 entry: one-liner (~80 tokens)
    → Notify orchestrator: L3 delta available
  State layer:
    world model updated with 47 new Asset entries (paths)
    event log appended with Action record

STEP 3: DECIDE (Cognitive + Orchestration Layers)
  Orchestrator: phase-map node "enum_http" → evaluate gates
    → Gate "enum_complete": still running (parallel branch not finished)
    → Continue in current node
  Cognitive layer assembles context:
    → L3 snapshot (~850 tokens) + persistent context (~200 tokens)
    → Token Engine compresses: 1050 → ~780 tokens (26% savings)
    → LLM call (mid-tier model): "47 dirs found, /admin and /config.php.bak are interesting. Next: vuln scan on these paths."
  Output validation: schema-valid → decompress via Token Engine
  Safety pre-check: vuln scan within scope → PASS

STEP 4: ACT (Orchestration → Execution)
  Orchestrator: decompose into task spec
    → Match against capability registry: nuclei scores 0.93
    → Assign to vuln_agent with skill web/nuclei-scan@3.1.0
  Execution layer: load skill, run nuclei in sandbox

STEP 5: RECORD (State + Memory + Interface)
  State layer: event log updated, world model checkpointed
  Memory layer: L2 session store updated synchronously
  Interface layer: operator console streams live activity
    cost tracker updates: +$0.03 this cycle
  → (cycle repeats from STEP 1)

9.2 Cascaded Retrieval Flow

When the cognitive layer needs deeper detail than the L3 snapshot provides:

Orchestrator LLM sees L3 entry:
"[FTP] 10.10.10.5:21 vsftpd 2.3.4 — CVE-2011-2523 backdoor candidate" → Decides to pursue this finding
Drill-down request: l2_grp_ftp_10.10.10.5
  → Memory layer retrieves L2 group (~312 tokens)
  → Token Engine compresses: 312 → ~230 tokens
Still need exact nmap script output?
  → Drill-down request: l1_20260621_001427_nmap
  → Memory layer retrieves L1 entry (~1,200 tokens)
  → Token Engine compresses: 1200 → ~880 tokens
Still need raw banner?
  → Drill-down request: raw/eng_001/...nmap.out
  → Memory layer reads from disk, extracts relevant section
Total context for this reasoning step: L3 (~850) + L2 (~230) + L1 (~880) = ~1,960 tokens (vs. 124,600 if all raw output were loaded — 98.4% reduction)
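The arithmetic behind these figures can be checked with a few lines. All numbers come from the flow above; only the variable names are introduced here.

```python
RAW_TOKENS = 124_600  # size of the raw corpus quoted in the flow above

# Tokens actually placed in context at each tier of the cascade.
tiers = {
    "L3 snapshot": 850,
    "L2 group (compressed)": 230,
    "L1 entry (compressed)": 880,
}

total = sum(tiers.values())
reduction = 1 - total / RAW_TOKENS

# Token Engine savings on each drill-down, from the before/after figures.
l2_saving = 1 - 230 / 312
l1_saving = 1 - 880 / 1200

print(f"context: {total} tokens, reduction vs raw: {reduction:.1%}")
print(f"L2 saving: {l2_saving:.1%}, L1 saving: {l1_saving:.1%}")
```

Both per-drill savings land inside the 18–35% range quoted for the Token Engine elsewhere in this paper.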

9.3 Phase Map Transition Flow

When a finding triggers a conditional gate:

Vuln scan completes → finding: "SQL injection in /api/users?id="

→ State layer: Finding added to world model
→ Tri-Con: L1/L2/L3 updated
→ Orchestrator: evaluate outgoing gates from [vuln_scan] node
    → Gate "vulns_found": TRUE
    → Gate "vuln_type == sqli": TRUE
    → Transition: enqueue [exploit_sqli] node
→ Orchestrator: activate [exploit_sqli] node
    → Assign exploit_agent with skill web/sqli-exploit@1.0.0
    → Inject context: L3 + drill to sqli finding L2
→ Execution: sqlmap runs in sandbox

9.4 Escalation Flow

Orchestrator detects: exploit may cause DoS on production DB

→ Safety pre-check: "destructive_confirmed" flag on skill
→ Orchestrator: escalate to human
→ Interface layer: escalation appears in operator console
    Brief:
      "Proposed: time-based blind SQLi on /api/users
       Risk: heavy DB queries, possible table lock
       Target: production DB (10.0.0.5:3306)
       Options: (a) proceed with --risk=2, (b) limit to --risk=1 (sqlmap's lowest level),
                (c) skip this finding, (d) schedule for off-hours"
→ Human selects: (d) schedule for off-hours
→ Decision recorded in event log
→ Orchestrator: defer task, advance to next finding

10. Data Flow Between Layers

The architecture's data flow is unidirectional at the cycle level but bidirectional across cycles. Understanding the data contracts between layers is essential for implementation:

                         ┌─────────┐
                         │INTERFACE│
                         └────┬────┘
                escalations ▲ │ ▼ telemetry, reports
                         ┌────┴────┐
                         │ORCHESTR.│
                         └────┬────┘
          task specs, agent ▲ │ ▼ assignments, decisions
                assignments   │
                         ┌────┴────┐
                         │COGNITIVE│
                         └────┬────┘
              L3 snapshots, ▲ │ ▼ decisions, drill requests
              drill results   │
                         ┌────┴────┐
                         │  STATE  │
                         └────┬────┘
            queries, events ▲ │ ▼ world model updates, checkpoints
                              │   phase-map state
                         ┌────┴────┐
                         │ MEMORY  │
                         └────┬────┘
              retrieved ctx ▲ │ ▼ indexed entries, knowledge
                              │   knowledge updates
                         ┌────┴────┐
                         │EXECUTION│
                         └─────────┘
                              ▲ tool results
                              │ (structured_output)

Key data contracts:

Execution → Cognitive: ToolExecutionResult with structured_output. Raw output goes to state layer, not cognitive.
Cognitive → State: World-model update events (append-only). Each event carries provenance: which agent, which skill, which cognitive cycle.
Cognitive → Memory: Tri-Con index updates (L1/L2/L3 entries). The indexing agent writes these as it processes tool output.
State → Cognitive: World-model query results, L3 snapshot (via memory layer's Tri-Con store), drill-down responses (L2/L1/raw).
Orchestration → Cognitive: Task specifications with context injection payload (which L3 entries are relevant, which skills are available for this phase-map node).
Orchestration → Execution: Skill IDs and arguments, sandbox configuration, safety policy context.
All layers → Interface: Telemetry streams (per-call logs, tool execution logs, decision traces, cost data, phase-map traversal log).

The Token Engine operates within the Cognitive layer's data flow — it transforms text as it enters and exits the LLM call, but the data contracts between layers are in uncompressed (original) form. This ensures that other layers never need to know about compression; the Token Engine is transparent to the rest of the system.
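The transparency property can be illustrated with a wrapper that compresses on the way into the model call and decompresses on the way out. This is a sketch under a strong assumption: the substitution table below is a toy stand-in, since the actual Token Engine algorithm (WP02) is not described in this paper. The function names are hypothetical.

```python
from typing import Callable

# Toy reversible substitution table. The round trip is exact only if the
# original text never contains the short codes themselves.
ABBREVIATIONS = {"vulnerability": "~v", "directory": "~d", "enumeration": "~e"}

def compress(text: str) -> str:
    for long_form, code in ABBREVIATIONS.items():
        text = text.replace(long_form, code)
    return text

def decompress(text: str) -> str:
    for long_form, code in ABBREVIATIONS.items():
        text = text.replace(code, long_form)
    return text

def call_llm_transparently(context: str, llm: Callable[[str], str]) -> str:
    """Callers pass and receive uncompressed text; only the model sees codes."""
    return decompress(llm(compress(context)))

seen_by_model = []

def fake_llm(prompt: str) -> str:
    """Stand-in for a model call that answers in the compressed vocabulary."""
    seen_by_model.append(prompt)
    return "run ~v scan on each ~d"

result = call_llm_transparently("directory enumeration found 47 entries", fake_llm)
print(seen_by_model[0])
print(result)
```

Other layers interact only with call_llm_transparently, so swapping or removing the compression step does not change any inter-layer contract.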

11. Deployment Architecture

A production deployment of this architecture is a distributed system. Each layer can be deployed as an independent service, scaled independently, failed over independently, and upgraded independently.

DEPLOYMENT TOPOLOGY

  Operator Console (Web UI)
    WebSocket → telemetry stream; REST → world model query
        │
  Orchestrator Service (stateless, horizontally scalable)
    - Phase-map graph walker
    - Capability registry (loaded at startup)
    - Task assignment engine
    - Sub-agent process manager
        │
  Sub-agents (each connected to the orchestrator above and the cognitive service below)
    Recon Agent | Enum Agent | Exploit Agent | Later. Agent | Report Agent
        │
  Cognitive Service (per-agent LLM call pipeline)
    - Tri-Con indexing agent (async queue consumer)
    - Token Engine (inline in LLM call path)
    - Model router
        │
        ├── State Service (stateful)
        │     - PostgreSQL: world model, event log, phase-map state
        │     - Redis: checkpoint
        │
        └── Memory Service (stateful)
              - L2: pgvector + AGE (session memory)
              - L3: Qdrant + document store (long-term memory)
              - Raw output: S3/disk

  State Service
        │
  Execution Service
    - Skill loader (pulls from Skill Library/Marketplace)
    - Container orchestrator (Docker/K8s for sandboxing)
    - Safety policy engine
    - Skill parser runtime (sandboxed Python)

  Skill Library (shared, versioned)
    - Git repo or OCI registry
    - Marketplace for community skills
    - On-prem mirror for air-gapped deployments

11.1 Scaling Properties

Orchestrator and sub-agents are stateless and horizontally scalable. Multiple engagements can run on the same orchestrator cluster; sub-agents are spawned per task and terminated on completion.
State and Memory services are stateful and require persistent storage. They scale vertically (bigger database) and through read replicas. For high-concurrency deployments, the state service can be sharded by engagement ID.
Execution service scales with sandbox capacity. Each tool execution consumes a container; the container orchestrator handles scheduling and resource limits.
Cognitive service scales with LLM API rate limits. The model router distributes calls across providers to avoid hitting a single provider's rate limit. The Token Engine reduces the token volume per call, effectively increasing the number of calls possible within a budget.

11.2 Deployment Modes

Full distributed: All services on separate nodes. For production multi-engagement deployments.
Single-node: All services in one process, backed by SQLite and local disk. For development, single-engagement, or on-prem deployments.
Air-gapped: Skill Library mirrored locally; LLM served by an on-prem model (e.g., a self-hosted Llama or Mistral deployment). The architecture is the same; only the LLM provider and skill library source change.

12. Cross-Cutting Concerns

12.1 Security of the Agent Itself

The agent is a privileged system: it holds credentials, runs exploits, and has access to target infrastructure. It must be secured as carefully as the targets it tests:

Credential vault: Discovered credentials are stored in an encrypted vault (e.g., HashiCorp Vault or an OS keychain), not in plaintext in the world model. The world model stores a reference to the vault entry, not the credential itself.
Output sanitization: Reports and logs are redacted of sensitive data (passwords, PII, session tokens) before leaving the system. The redaction rules are defined in skill manifests — each skill declares what patterns in its output constitute sensitive data.
API key isolation: Model API keys are held by the cognitive service only. A compromised tool in the execution layer's sandbox cannot exfiltrate API keys — they are not present in the execution environment.
Sandbox containment: Tool execution and skill parsers run in containers with no network access except to the target. A malicious skill package (if one were to pass validation) cannot phone home.
Audit log: Every action is logged immutably in the state layer's event log. In the event of a dispute, the audit log is the source of truth. The event log is append-only and tamper-evident (chained hashes).
Skill provenance: Every skill package is signed by its author and verified at load time. The skill library's marketplace enforces signing; on-prem deployments can pin specific skill versions.
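The tamper-evident property of a hash-chained, append-only log can be sketched as follows. The entry layout and function names are assumptions for illustration; the paper specifies only that the event log uses chained hashes.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def append_event(log, payload):
    """Append a payload whose hash covers the previous entry's hash."""
    prev = log[-1]["hash"] if log else GENESIS
    body = json.dumps(payload, sort_keys=True)
    digest = hashlib.sha256((prev + body).encode()).hexdigest()
    log.append({"prev": prev, "payload": payload, "hash": digest})

def verify(log):
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        body = json.dumps(entry["payload"], sort_keys=True)
        expected = hashlib.sha256((prev + body).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

log = []
append_event(log, {"action": "tool_run", "skill": "web/http-dir-enum@1.2.0"})
append_event(log, {"action": "finding_added", "finding": "SQL injection"})
print(verify(log))
```

Editing any earlier payload changes its hash, which no longer matches the value stored in the following entry, so verification fails from that point on.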

12.2 Observability

Observability is not a feature; it is a requirement for safe autonomous operation. The system emits structured telemetry at every layer:

Metrics: Token consumption (pre/post Token Engine), LLM call count and latency, tool execution count and duration, findings discovered per phase, cost per engagement, context-health score (L3 size, drill frequency, context eviction rate).
Logs: Structured JSON logs for every cognitive cycle, tool execution, and orchestrator decision. Logs carry correlation IDs that trace a single decision from orchestrator assignment through cognitive call to tool execution to finding creation.
Traces: Distributed tracing across services — orchestrator → cognitive → execution → state → memory — with timing at each hop. This is essential for diagnosing latency issues in long engagements.
Dashboards: The operator console provides real-time dashboards for active engagements and historical dashboards for post-engagement analysis. Key metrics: engagement progress (phase-map completion), cost trajectory, finding rate, context-health trend.
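A correlation ID that follows one decision across layers can be sketched with the standard logging module. The logger names and the JSON field names are assumptions for the example and do not reflect a schema defined by this architecture.

```python
import io
import json
import logging
import uuid

class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "layer": record.name,
            "message": record.getMessage(),
            "correlation_id": getattr(record, "correlation_id", None),
        })

stream = io.StringIO()  # stands in for a log shipper
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())
base = logging.getLogger("pentest")
base.setLevel(logging.INFO)
base.addHandler(handler)
base.propagate = False

# One ID is minted per decision and passed along with every log call.
correlation_id = str(uuid.uuid4())
extra = {"correlation_id": correlation_id}
logging.getLogger("pentest.orchestrator").info("task assigned", extra=extra)
logging.getLogger("pentest.cognitive").info("llm call complete", extra=extra)
logging.getLogger("pentest.execution").info("tool finished", extra=extra)

lines = [json.loads(line) for line in stream.getvalue().splitlines()]
for entry in lines:
    print(entry["layer"], entry["correlation_id"])
```

Filtering the combined log stream on a single correlation_id then reconstructs the path of one decision from assignment to tool execution.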

12.3 Error Handling

An autonomous system running for hours will encounter failures. The architecture handles these through a layered error taxonomy:

Tool failure (transient)

Tool failure (persistent)

LLM failure (malformed output)

LLM failure (rate limit)

Session loss

Agent stuck

Orchestrator crash

The system is crash-resilient: if any service crashes, it can resume from the last persisted state. The state layer checkpoints the world model and phase-map state at each finding (or at each cycle if findings are sparse). This is a trade-off between checkpoint overhead and recovery granularity — for most engagements, per-finding checkpointing is sufficient.

12.4 Idempotency and Replay

Tool executions should be idempotent where possible (running the same scan twice against an unchanged target produces the same result). The system supports replay: re-running the engagement from the event log to reproduce a finding or debug a failure. Event sourcing in the state layer enables this: the event log is the complete record of every action and its result, and the world model is a deterministic projection of that log, so replay can rebuild state from recorded results without touching the target again.

Replay is essential for three scenarios:

Debugging: "Why did the agent decide to exploit service X?" — replay the cognitive cycle with the exact context that was present at decision time.
Report evidence: "Show me the exact nmap output that produced this finding." — replay to the event and read the raw output from disk.
Quality assurance: "Would the agent have found this vulnerability with a different model?" — replay with a different model router configuration.
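The idea that the world model is a deterministic projection of the event log can be sketched as a fold over events. The event types and field names below are hypothetical; the point is that replaying a prefix of the log reconstructs the state at any earlier decision point.

```python
def project(events):
    """Fold an append-only event log into a world model."""
    model = {"assets": set(), "findings": []}
    for event in events:
        if event["type"] == "asset_discovered":
            model["assets"].add(event["asset"])
        elif event["type"] == "finding_added":
            model["findings"].append(event["finding"])
    return model

# Hypothetical event records; real events would also carry provenance.
EVENT_LOG = [
    {"type": "asset_discovered", "asset": "10.0.0.5:80"},
    {"type": "asset_discovered", "asset": "10.0.0.5:80/admin"},
    {"type": "finding_added", "finding": "SQL injection in /api/users?id="},
]

world_now = project(EVENT_LOG)
world_at_decision = project(EVENT_LOG[:2])  # state just before the finding

print(len(world_now["findings"]), len(world_at_decision["findings"]))
```

Because project has no side effects and reads nothing but the log, running it twice on the same events always yields the same model.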

13. How the Five Innovations Compose

The five innovations are not independent components that happen to coexist. They are architectural elements that compose into a single system, each addressing a failure mode that the others cannot:

13.1 The Composition Matrix

Tri-Con (WP01)

Token Engine (WP02)

Custom Orchestrator (WP03)

Phase Map (WP04)

Skill Platform (WP05)

13.2 Interaction Chains

The innovations form interaction chains where each enables the next:

Tri-Con → Token Engine: Tri-Con compresses what enters the context (in the running example, from a 124,600-token raw corpus to a ~850-token L3 snapshot). The Token Engine compresses how efficiently those tokens are represented (18–35% further reduction). Together, they reduce the 124,600-token raw corpus to ~620 tokens in the LLM's context, a 99.5% reduction. Nothing is discarded: the L1 and L2 entries and the raw output remain retrievable through drill-down.

Skill Platform → Custom Orchestrator: The Skill Platform's skill.yaml manifests include capability declarations — what the skill does, its inputs, outputs, side effects, and alternatives. The Custom Orchestrator's Phase 1 loads these declarations into its capability registry. Without the Skill Platform's structured manifests, the orchestrator would have no capability data to build its registry from. Without the orchestrator's capability-aware routing, the Skill Platform's skills would be selected by the LLM based on name familiarity, not capability fit.

Phase Map → Custom Orchestrator: The Phase Map defines what to do in each phase (which agents, which skills, which gates). The Custom Orchestrator decides how to do it (which specific skill to assign, with what parameters, to which agent). The phase map provides the graph; the orchestrator walks it. Without the phase map, the orchestrator would have no topology to walk — it would be a capability-aware router with no engagement structure. Without the orchestrator, the phase map would be a static plan with no adaptive execution.

Tri-Con → Custom Orchestrator: The orchestrator's context injection protocol is built on Tri-Con. When the orchestrator assigns a task to a sub-agent, it injects context via Tri-Con: the L3 snapshot (always), the relevant L2 group (if the task targets a specific finding), and the L1 entry (if detailed tool output is needed). Without Tri-Con, the orchestrator would have no way to give a sub-agent a coherent, bounded context — it would either dump everything (context drowning) or give nothing (blind execution).

Skill Platform → Tri-Con: The skill platform's parser component produces structured_output — normalized Finding objects. This is what the Tri-Con indexing agent consumes to generate L1 entries. Without the skill platform's parsers, Tri-Con would have to parse raw tool output itself, which would require tool-specific parsing logic baked into the cognitive layer. With the skill platform, each skill brings its own parser, and Tri-Con indexes the already-structured output.

13.3 The Full Composition

In a single cognitive cycle, all five innovations interact:

Phase Map (State + Orchestration): determines the current phase-map node and its assigned agents and skills.
Custom Orchestrator (Orchestration): decomposes the node's objective into tasks, matches against the capability registry (built from Skill Platform manifests), and selects the optimal skill.
Skill Platform (Execution): loads the selected skill, runs its command in a sandbox, and produces structured_output via the skill's parser.
Tri-Con (Cognitive + Memory): the indexing agent compresses the structured output into L1/L2/L3 entries. The L3 snapshot is refreshed.
Token Engine (Cognitive): compresses the L3 snapshot (and any drill-down) before the LLM call. Decompresses the LLM's output afterward.
Cognitive Layer: the LLM reasons over the compressed context, produces a decision.
Orchestrator: assesses the result against success criteria, decides to accept (advance), reassign (return to step 2), retry, or escalate.
Phase Map: if the current node is complete, evaluates gates and transitions to the next node.

This cycle repeats for the duration of the engagement, each cycle costing ~$0.01–0.05 in LLM tokens (post-optimization), producing ~1–5 new findings, and advancing the phase-map state.

14. Adaptability Through Decoupling

This architecture is designed for 2026-era models and tools, but it must adapt as capabilities evolve. The key design choice that enables adaptability is decoupling — each layer communicates with adjacent layers through defined contracts, not through shared implementation. This means any component can be replaced without touching the others:

14.1 Model Adaptability

The cognitive layer is decoupled from the model through the model router. When a new model with a 10M-token context window arrives, the Tri-Con L1/L2/L3 boundary shifts (more can fit in L1 working memory, less drill-down needed) but the architecture does not change. The model router is updated to include the new model; the reasoning pipeline is unchanged. When a new provider offers a cheaper mid-tier model, the router routes enumeration calls to it automatically.

The Token Engine is tokenizer-aware: when a new model uses a different tokenizer, the engine is re-parameterized for that tokenizer. Savings in the same 18–35% range are expected, though the exact figure should be re-measured per tokenizer. No architectural change is required.

14.2 Tool Adaptability

The execution layer is decoupled from tools through the Skill Platform's uniform interface. When a new tool (e.g., a novel BLE pentesting tool) is needed, a domain expert writes a skill package, pushes it to the library, and the execution layer can run it. The orchestrator's capability registry is updated from the new skill's manifest. No core code change. No regression risk.

When a tool is updated (e.g., nmap 7.95 → 8.0), the skill package is versioned (network/nmap-scan@2.0.0), and the old version remains available for engagements that require reproducibility. The semver system ensures that breaking changes are major-version bumps and dependent skills declare their compatibility ranges.
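A compatibility range check of the kind described can be sketched as below. It assumes plain MAJOR.MINOR.PATCH version strings and caret-style semantics (same major version, at least the required version); pre-release tags and the special handling of 0.x versions in the semver specification are ignored. The function names are hypothetical.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

def compatible(installed, required):
    """Caret-style check: same major version, and installed >= required."""
    have, need = parse(installed), parse(required)
    return have[0] == need[0] and have >= need

def resolve(available, required):
    """Pick the newest available version that is compatible, if any."""
    matches = [v for v in available if compatible(v, required)]
    return max(matches, key=parse) if matches else None

available = ["1.2.0", "1.4.1", "2.0.0"]
print(resolve(available, "1.2.0"))
print(resolve(available, "2.0.0"))
```

An engagement pinned to the 1.x line keeps resolving to a 1.x skill even after the 2.0.0 package is published, which is what preserves reproducibility.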

14.3 Methodology Adaptability

The orchestration layer is decoupled from engagement topology through the Phase Map. When a new engagement type is needed (e.g., "cloud container pentest"), a security analyst authors a new phase map YAML and selects it at engagement start. No orchestrator code change. The graph walker interprets the new topology at runtime.

When a methodology evolves (e.g., OWASP WSTG adds a new testing category), the phase map is updated to include a new node or sub-phase. Existing engagements running on the old phase map version are unaffected — phase maps are versioned and pinned per engagement.

14.4 Memory Adaptability

The memory layer is decoupled from storage through the L1/L2/L3 abstractions. When a better vector store becomes available, the L2 session memory's vector index is swapped. When a graph database with better traversal performance appears, the chain graph store is swapped. The cognitive layer's retrieval interface is unchanged — it asks for L2 groups and L1 entries by ID; it does not know which store serves them.
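The retrieval interface can be sketched as a structural type that any backend satisfies. The method names and both store classes are assumptions for illustration; they do not correspond to a published API of the memory layer.

```python
from typing import Protocol

class MemoryStore(Protocol):
    """The only surface the cognitive layer depends on."""
    def get_l2_group(self, group_id: str) -> str: ...
    def get_l1_entry(self, entry_id: str) -> str: ...

class DictStore:
    """Backend A: two in-memory dictionaries."""
    def __init__(self, l2, l1):
        self._l2, self._l1 = l2, l1
    def get_l2_group(self, group_id):
        return self._l2[group_id]
    def get_l1_entry(self, entry_id):
        return self._l1[entry_id]

class KeyValueStore:
    """Backend B: a single flat key space, as a key-value service might use."""
    def __init__(self, kv):
        self._kv = kv
    def get_l2_group(self, group_id):
        return self._kv["l2/" + group_id]
    def get_l1_entry(self, entry_id):
        return self._kv["l1/" + entry_id]

def drill_down(store: MemoryStore, group_id: str) -> str:
    """Cognitive-layer code: asks by ID, unaware of the backend."""
    return store.get_l2_group(group_id)

GROUP = "l2_grp_ftp_10.10.10.5"
store_a = DictStore({GROUP: "vsftpd 2.3.4 summary"}, {})
store_b = KeyValueStore({"l2/" + GROUP: "vsftpd 2.3.4 summary"})
print(drill_down(store_a, GROUP) == drill_down(store_b, GROUP))
```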

14.5 Deployment Adaptability

The deployment topology is decoupled from the architecture. The same codebase runs as a single process (development), a multi-service deployment on a single node (small team), or a distributed deployment across a cluster (production). The layer contracts are the same whether they are in-process function calls or network API calls. This means a team can start with a single-process deployment and scale to distributed as engagement volume grows, without re-architecting.

15. Comparison with Monolithic AI Pentesting Approaches

The architecture presented here stands in deliberate contrast to the monolithic approach taken by existing AI pentesting frameworks such as PentestGPT, HackingBuddyGPT, CAI, and PentAGI. These systems share a common pattern: a single LLM agent (or a small fixed set of agents) is given a list of tool schemas and asked to run the entire engagement in a free-form tool-calling loop. The differences between them are in prompt engineering and tool packaging, not in architecture.

15.1 Architectural Comparison

Context management

Token efficiency

Tool selection

Engagement workflow

Tool extensibility

State management

Multi-agent

Safety

Observability

Recovery

Adaptability

Cost per engagement

15.2 The Monolithic Failure Cascade

The monolithic approach fails not because individual components are bad, but because the architecture creates failure cascades — a weakness in one area amplifies weaknesses in others:

No context management → context drowns by turn 20 → LLM loses early findings → makes exploitation decisions on stale data → wrong exploit → failed engagement.
No token optimization → raw output sent to LLM → $1,200/engagement → economically unviable for continuous operation → engagement time-limited → incomplete testing.
No capability-aware routing → LLM picks wrong tool → wrong parameters → noisy scan on production → IDS alert → engagement terminated.
Linear pipeline → no parallelism → 40 services enumerated sequentially → engagement takes 3× longer → cost overruns → incomplete coverage.
Hardcoded tools → customer asks for new protocol → engineering project → 3-month wait → customer leaves.
No observability → agent makes bad decision → operator can't see why → can't fix it → repeats on next engagement.

The layered architecture breaks these cascades by isolating each concern. Context drowning is solved by Tri-Con before it reaches the LLM. Cost explosion is solved by the Token Engine before it reaches the budget. Wrong tool selection is solved by the orchestrator before it reaches the target. Linear workflow is solved by the phase map before it reaches the engagement timeline. Hardcoded tooling is solved by the skill platform before it reaches the engineering backlog. Missing observability is solved by the interface layer before a bad decision can hide in the operator's blind spot.

15.3 When Monolithic Is Sufficient

The monolithic approach is not universally wrong. It is sufficient for:

Short engagements: A single-host CTF challenge, a 30-minute ad-hoc scan, a demonstration. The context window does not drown; the cost is acceptable; the workflow is simple.
Single-domain testing: A web-only scan with a fixed toolset. No methodology diversity needed.
Research and prototyping: Exploring whether LLMs can pentest at all. The monolithic approach is faster to build and iterate on.

The layered architecture is necessary when the requirements exceed what a monolithic approach can sustain: multi-hour engagements, multi-domain testing, economic viability for continuous operation, regulatory audit requirements, multi-agent parallelism, and continuous capability growth without regression risk.
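
The criteria above can be written down as a simple checklist. The sketch below is one possible encoding, not a rule from this series: `EngagementProfile`, `layered_reasons`, and the 2-hour threshold for "multi-hour" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed cutoff for "multi-hour"; the text gives no exact figure.
MULTI_HOUR_THRESHOLD = 2.0

@dataclass(frozen=True)
class EngagementProfile:
    duration_hours: float
    domain_count: int  # e.g. web, network, cloud counted separately
    continuous_operation: bool = False
    audit_required: bool = False
    parallel_agents: bool = False
    growing_toolset: bool = False

def layered_reasons(profile: EngagementProfile) -> List[str]:
    """Return the requirements that push past a monolithic design.

    An empty list means the monolithic approach is likely sufficient.
    """
    checks = [
        (profile.duration_hours >= MULTI_HOUR_THRESHOLD, "multi-hour engagement"),
        (profile.domain_count > 1, "multi-domain testing"),
        (profile.continuous_operation, "economic viability for continuous operation"),
        (profile.audit_required, "regulatory audit requirements"),
        (profile.parallel_agents, "multi-agent parallelism"),
        (profile.growing_toolset, "continuous capability growth"),
    ]
    return [reason for triggered, reason in checks if triggered]

ctf = EngagementProfile(duration_hours=0.5, domain_count=1)
enterprise = EngagementProfile(duration_hours=8, domain_count=3, audit_required=True)
```

Here `layered_reasons(ctf)` returns an empty list, while the enterprise profile returns three reasons, each of which maps to one of the requirements listed above.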

16. Conclusion

Architecting an autonomous pentesting system is a distributed systems problem with a cognitive core. The architecture presented here — six layers with clear responsibilities, five innovations precisely positioned within them, structured state, hierarchical memory, multi-agent orchestration, and strong safety enforcement — is not the only possible design, but it is one that has been validated by the constraints of real engagements: context limits, cost pressure, safety requirements, and the need for observability.

The system is complex, but the complexity is organized. Each layer can be built, tested, and improved independently. Each innovation can be understood in isolation and composed into the whole. The cognitive layer — the part that gets the most attention — is perhaps 20% of the total system by code volume. The other 80% — state, memory, execution, orchestration, interfaces — is what makes the cognitive layer's decisions reliable, safe, and useful.

The five innovations compose into a system that is greater than the sum of its parts. Tri-Con makes long-horizon reasoning feasible. The Token Engine makes it affordable. The Custom Orchestrator makes it competent. The Phase Map makes it adaptable. The Skill Platform makes it extensible. Together, they transform autonomous pentesting from a research demonstration into a production-capable system.

In the next paper, we zoom into the orchestration layer and examine the design patterns that coordinate AI security agents at scale — the orchestration topologies, delegation protocols, and coordination algorithms that make multi-agent pentesting reliable under real-world conditions.

This whitepaper is part of a series on autonomous penetration testing with AI agents. For the full series index and related work, see the accompanying documentation.

| Whitepaper | Title | Innovation |
|---|---|---|
| WP01 | The Tri-Con 3-Layer Index | Cascaded context management |
| WP02 | The Token Engine | Reversible token optimization |
| WP03 | The Custom Orchestrator | Capability-aware task assignment |
| WP04 | The Phase Map Architecture | Declarative engagement topology |
| WP05 | The Skill-Based Platform | Never-changing core + skill library |
| WP06 | The Context Window Crisis | Diagnosis and mitigation strategies |
| WP07 | Model Selection and Cost Optimization | Tiered model routing |
| WP08 | Architecting Autonomous Pentesting | This paper: system integration |
| WP09 | Orchestration Patterns at Scale | Multi-agent coordination |