Whitepaper 08: Architecting Autonomous Penetration Testing — A Systems Design Approach
Author: Khushal Suthar
Date: June 2026
Series: Autonomous Penetration Testing with AI Agents
Category: Systems Design & Reference Architecture
Executive Summary
Building an autonomous penetration testing system is a systems design problem, not a prompt engineering problem. The agent is not a single model call; it is a distributed system with state management, tool execution, context orchestration, safety enforcement, and human-in-the-loop interfaces. This paper presents a reference architecture for autonomous pentesting, decomposed into six layers with defined responsibilities, interfaces, and failure modes. The architecture integrates five innovations developed across this whitepaper series — the Tri-Con 3-Layer Index (WP01), the Token Engine (WP02), the Custom Orchestrator (WP03), the Phase Map Architecture (WP04), and the Skill-Based Platform (WP05) — into a single coherent system where each innovation occupies a precise architectural position and interacts with the others through well-defined contracts.
The architecture is informed by production experience and is designed to be implementable today with available components, while remaining adaptable as model capabilities evolve. We present a layered architecture diagram, component interaction flows, data flow between layers, deployment topology, cross-cutting concerns (security, observability, error handling), the adaptability properties that emerge from decoupling, and a comparison with monolithic AI pentesting approaches that treat the entire engagement as a single prompt chain.
1. The System as a Whole
An autonomous pentesting system is best understood as a control loop:
Observe → Orient → Decide → Act → Record → (repeat)
- Observe: Execute a tool, capture output.
- Orient: Interpret the output, update the world model, compress observations through Tri-Con.
- Decide: Select the next action based on objectives, constraints, and phase-map topology.
- Act: Execute the selected tool or sub-agent through the Skill Platform.
- Record: Persist findings, update memory, log decisions, advance phase-map state.
This loop runs continuously for the duration of the engagement, at a rate of one cycle every few seconds to several minutes depending on tool execution time. The architecture's job is to implement this loop reliably, safely, and cost-effectively across hours of unattended operation.
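The loop can be sketched as a small driver. This is a minimal illustration under stated assumptions, not the implementation described in this series: `run_loop`, `toy_decide`, `toy_act`, and the dictionary standing in for the state layer are hypothetical names introduced here.

```python
from typing import Callable, Optional

def run_loop(decide: Callable[[dict], Optional[str]],
             act: Callable[[str], str],
             max_cycles: int = 100) -> dict:
    """Run Observe -> Orient -> Decide -> Act -> Record until decide() returns None."""
    world = {"observations": [], "log": []}   # stand-in for the state layer
    output = None                             # nothing to observe before the first action
    for cycle in range(max_cycles):
        if output is not None:
            world["observations"].append(output)                  # Observe + Orient
        action = decide(world)                                    # Decide
        if action is None:
            break
        output = act(action)                                      # Act
        world["log"].append({"cycle": cycle, "action": action})   # Record
    return world

# Toy policy (hypothetical): run two planned actions, then stop.
def toy_decide(world: dict) -> Optional[str]:
    plan = ["port_scan", "http_enum"]
    done = len(world["log"])
    return plan[done] if done < len(plan) else None

def toy_act(action: str) -> str:
    return f"output of {action}"

result = run_loop(toy_decide, toy_act)
print([entry["action"] for entry in result["log"]])
```

The `max_cycles` bound is a design choice for the sketch: an unattended loop should always have a hard stop that does not depend on the policy deciding to finish.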
The system is not monolithic. It decomposes into six layers (interface, orchestration, cognitive, state, memory, and execution), each with a distinct responsibility and most hosting one or more of the five innovations.
We examine each in turn, tracing how the five innovations compose into a unified system.
2. The Layered Architecture Diagram
┌──────────────────────────────────────────────────────────────────────────┐
│                             INTERFACE LAYER                              │
│  Operator Console │ Reporting Engine │ Telemetry/Observability           │
│  (real-time activity, world model browser, decision log, escalation)     │
└──────────────────────────────┬───────────────────────────────────────────┘
                               │ (reads from all layers, writes escalations)
┌──────────────────────────────┴───────────────────────────────────────────┐
│                           ORCHESTRATION LAYER                            │
│                                                                          │
│  ┌─────────────────────┐    ┌──────────────────────────────────────┐     │
│  │ Custom Orchestrator │    │ Phase Map Graph Walker               │     │
│  │ (WP03)              │    │ (WP04)                               │     │
│  │ - Capability reg.   │    │ - Declarative engagement topology    │     │
│  │ - Task matching     │◄──►│ - Conditional gate evaluation        │     │
│  │ - Result assessment │    │ - Parallel branches, nesting         │     │
│  │ - Reassignment      │    │ - Runtime modification               │     │
│  └──────────┬──────────┘    └──────────────────────────────────────┘     │
│             │ delegates to sub-agents (recon, enum, exploit, lateral)    │
└─────────────┼────────────────────────────────────────────────────────────┘
              │
┌─────────────┼────────────────────────────────────────────────────────────┐
│             │               COGNITIVE LAYER                              │
│             │                                                            │
│  ┌──────────┴──────────┐    ┌──────────────────────────────────┐         │
│  │ Reasoning Pipeline  │    │ Tri-Con Index (WP01)             │         │
│  │ - Context assembly  │───►│ L3 always loaded (~500-2K tok)   │         │
│  │ - LLM call          │    │ L2/L1 cascaded retrieval         │         │
│  │ - Output validation │    │ on demand                        │         │
│  │ - Model router      │    └──────────────────────────────────┘         │
│  └──────────┬──────────┘                                                 │
│             │ all text flows through:                                    │
│  ┌──────────┴──────────────────────────────────────────────┐             │
│  │ Token Engine (WP02)                                     │             │
│  │ L1 Dedup → L2 Shorthand → L3 Wordlist → L4 Compress     │             │
│  │ (18-35% token reduction, zero data loss, reversible)    │             │
│  └──────────┬──────────────────────────────────────────────┘             │
└─────────────┼────────────────────────────────────────────────────────────┘
              │ reads/writes state, retrieves from memory
┌─────────────┼────────────────────────────────────────────────────────────┐
│             │                 STATE LAYER                                │
│             │                                                            │
│  ┌──────────┴──────────┐    ┌──────────────────────────────────┐         │
│  │ World Model         │    │ Phase Map State                  │         │
│  │ - Assets, findings  │    │ - Active nodes, worklist         │         │
│  │ - Credentials       │    │ - Completed phases               │         │
│  │ - Sessions, chains  │    │ - Gate evaluation results        │         │
│  │ - History (events)  │    │ - Parallel branch tracking       │         │
│  └─────────────────────┘    └──────────────────────────────────┘         │
│  Event-sourced, append-only, checkpointed                                │
└─────────────┬────────────────────────────────────────────────────────────┘
              │
┌─────────────┼────────────────────────────────────────────────────────────┐
│             │                 MEMORY LAYER                               │
│             │                                                            │
│  ┌──────────┴──────────────────────────────────────────────────┐         │
│  │ L1 Working: LLM context window (Tri-Con L3 + active drill)  │         │
│  │ L2 Session: Engagement store + vector index + graph store   │         │
│  │     (Tri-Con L1/L2 indexes, world model, chain graph)       │         │
│  │ L3 Long-term: Knowledge base (CVEs, playbooks, heuristics)  │         │
│  └─────────────────────────────────────────────────────────────┘         │
└─────────────┬────────────────────────────────────────────────────────────┘
              │
┌─────────────┼────────────────────────────────────────────────────────────┐
│             │               EXECUTION LAYER                              │
│             │                                                            │
│  ┌──────────┴──────────────────────────────────────────────────┐         │
│  │ Skill-Based Platform (WP05)                                 │         │
│  │ - Skill loader: reads skill.yaml manifests from library     │         │
│  │ - Runtime composition: assembles skill components at runtime│         │
│  │ - Safety policy: scope checks, rate limits, authorization   │         │
│  │ - Sandboxing: isolated containers per tool execution        │         │
│  └─────────────────────────────────────────────────────────────┘         │
└──────────────────────────────────────────────────────────────────────────┘
This is a distributed system. Each layer communicates with adjacent layers through defined contracts — not through shared mutable state or implicit coupling. The innovations are not bolted on; they are structural components that define the layer's internal architecture. We now trace each layer in detail.
3. The Execution Layer
3.1 Responsibility
The execution layer runs tools. It is the system's hands — the only layer that touches the target environment. It must:
3.2 The Skill-Based Platform as Execution Substrate
The execution layer is built on the Skill-Based Platform (WP05). Rather than hardcoding tool integrations into the core binary, the platform's core is deliberately tiny — it does exactly four things: load skills, compose them into runnable modules at request time, enforce safety policy, and stream results. The knowledge of what to do lives entirely in skills; the knowledge of how to do it safely lives in the core.
Every tool the agent can run is defined as a skill package — a declarative manifest (skill.yaml) plus assets (knowledge documents, command definitions, pattern libraries, parser scripts, report templates). A skill's manifest declares its capabilities, input requirements, output characteristics, side effects, and dependencies on other skills. The execution layer reads these manifests at engagement start, materializes the skill components, and wires them into the runtime.
Skill Package (web/apache-path-traversal@1.2.0):
├── skill.yaml                # Manifest: capabilities, params, safety rules
├── knowledge/overview.md     # CVE-2021-41773 technical context
├── commands/nuclei-cve.yaml  # Command spec: nuclei -t cves/2021/...
├── patterns/traversal.yaml   # Matchers: 403 + ../ → Path Traversal
├── parser/parse.py           # Sandboxed parser → normalized Finding objects
└── report/template.j2        # Jinja2 template for report rendering
This design means that adding a new tool or technique is not an engineering project — it is a documentation project. A domain expert writes a skill package, validates it against the schema, and pushes it to the skill library. The core binary never changes. The attack surface for regression stays frozen at zero while the capability surface grows continuously.
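The validation step can be illustrated with a small check over an already-parsed manifest. This is a sketch under assumptions: the field names in `REQUIRED_FIELDS` are chosen here for illustration and are not the actual skill.yaml schema from WP05, and YAML parsing is omitted so the example needs only the standard library.

```python
# Hypothetical minimal schema: field name -> required Python type.
REQUIRED_FIELDS = {
    "id": str,
    "capabilities": list,
    "supported_targets": list,
    "side_effects": dict,
}

def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of problems; an empty list means the manifest passes."""
    problems = []
    for name, expected_type in REQUIRED_FIELDS.items():
        if name not in manifest:
            problems.append(f"missing field: {name}")
        elif not isinstance(manifest[name], expected_type):
            problems.append(f"{name} must be of type {expected_type.__name__}")
    return problems

good = {
    "id": "web/apache-path-traversal@1.2.0",
    "capabilities": ["path_traversal_check"],
    "supported_targets": ["http", "https"],
    "side_effects": {"traffic": "low", "destructive": False},
}
bad = {"id": "web/example@0.1.0", "capabilities": "path_traversal_check"}

print(validate_manifest(good))
print(validate_manifest(bad))
```

Returning every problem at once, rather than failing on the first, suits a workflow where the skill author is a domain expert fixing a document, not a developer stepping through a debugger.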
3.3 Uniform Tool Interface
Every skill that produces commands exposes a uniform execution interface:
ToolExecutionResult:
    tool_name: str
    skill_id: str            # e.g. "web/apache-path-traversal@1.2.0"
    args: dict
    exit_code: int
    stdout: bytes
    stderr: bytes
    duration_ms: int
    metadata: dict           # timestamps, target host, source agent
    structured_output: dict  # parsed/normalized by skill parser
The structured_output field is produced by the skill's parser component — a sandboxed Python script or declarative jq/JSONata expression that transforms raw output into normalized Finding objects. This is the compact representation that the cognitive layer's Tri-Con indexer consumes. The raw stdout/stderr is retained in the state layer for audit but never loaded into model context directly.
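One way to express this interface in Python is a frozen dataclass. The fields mirror the ToolExecutionResult listing; the `for_model_context` helper is a hypothetical addition that illustrates keeping raw output out of model context, and the sample values are invented.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ToolExecutionResult:
    tool_name: str
    skill_id: str
    args: dict
    exit_code: int
    stdout: bytes
    stderr: bytes
    duration_ms: int
    metadata: dict = field(default_factory=dict)
    structured_output: dict = field(default_factory=dict)

    def for_model_context(self) -> dict:
        """Compact view for the cognitive layer; raw stdout/stderr are left out."""
        return {
            "skill_id": self.skill_id,
            "exit_code": self.exit_code,
            "findings": self.structured_output,
        }

result = ToolExecutionResult(
    tool_name="gobuster",
    skill_id="web/http-dir-enum@1.2.0",
    args={"target": "10.0.0.5"},
    exit_code=0,
    stdout=b"(raw tool output, retained for audit only)",
    stderr=b"",
    duration_ms=4200,
    structured_output={"paths": [{"path": "/admin", "status": 403}]},
)
view = result.for_model_context()
print(view)
```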
3.4 Safety Enforcement
The execution layer is the last line of defense before actions touch the target. Safety enforcement is implemented in the platform core as deterministic code, not delegated to the LLM:
- Scope checks: each skill declares supported_targets, and the core cross-references them against the engagement's scope definition.
- Rate limits: limits are taken from the skill's side_effects section. A skill that generates high traffic (e.g., directory brute-forcing) carries a higher rate limit than a single-request skill.
The LLM may decide to run an exploit; the execution layer checks whether that exploit is permitted against that target and refuses if not. Safety is a code path, not a prompt instruction.
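A deterministic gate of this kind might look like the following sketch. `SafetyGate`, its parameters, and the CIDR-based scope representation are assumptions made for illustration; the clock is passed in explicitly so the behavior is reproducible.

```python
import ipaddress

class SafetyGate:
    """Deterministic pre-execution checks: scope first, then a sliding-window rate limit."""

    def __init__(self, scope_cidrs, max_per_window, window_s=1.0):
        self.scope = [ipaddress.ip_network(cidr) for cidr in scope_cidrs]
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._sent = []   # timestamps of recently allowed requests

    def in_scope(self, target: str) -> bool:
        addr = ipaddress.ip_address(target)
        return any(addr in net for net in self.scope)

    def allow(self, target: str, now: float):
        if not self.in_scope(target):
            return (False, "out_of_scope")
        # Drop timestamps that have aged out of the window.
        self._sent = [t for t in self._sent if now - t < self.window_s]
        if len(self._sent) >= self.max_per_window:
            return (False, "rate_limited")
        self._sent.append(now)
        return (True, "ok")

gate = SafetyGate(["10.0.0.0/24"], max_per_window=2)
print(gate.allow("10.0.0.5", now=0.0))
print(gate.allow("192.168.1.1", now=0.1))
```

The gate returns a reason code rather than raising, so the refusal can be logged and surfaced to the cognitive layer as an ordinary observation.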
3.5 Execution Isolation
Tools run in isolated execution contexts — containers, VMs, or sandboxes — for two reasons:
The execution layer supports parallel tool execution where skills target independent hosts, and serial execution where skills share state (e.g., an exploit followed by a post-exploitation command on the same session). The orchestrator coordinates this through its task assignment system.
4. The State Layer
4.1 The World Model
The state layer maintains the system's world model: a structured, typed, queryable representation of everything the agent knows about the target environment. It is the single source of truth — not a bag of text that the LLM remembers, but a durable data structure that survives context eviction.
WorldModel:
    assets: List[Asset]            # hosts, services, web apps
    findings: List[Finding]        # vulnerabilities, misconfigurations
    credentials: List[Credential]  # discovered creds, hashes, tokens
    sessions: List[Session]        # active shells, pivots
    attack_chains: List[Chain]     # multi-step paths under construction
    history: List[Action]          # executed actions and results (event log)
    scope: ScopeDefinition         # boundaries, rules of engagement
    objectives: List[Objective]    # what we're trying to achieve
Each entity has typed fields, provenance (which skill/action produced it), timestamps, and confidence scores. The world model is updated by the cognitive layer after each tool execution and is the substrate for all reasoning.
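A reduced sketch of such typed entities follows, covering findings only. The class and field names (`produced_by`, `observed_at`, `findings_for`) are illustrative assumptions, not the schema used in the series.

```python
from dataclasses import dataclass, field

@dataclass
class Finding:
    host: str
    title: str
    confidence: float   # 0.0 to 1.0
    produced_by: str    # provenance: id of the skill that produced the finding
    observed_at: str    # ISO-8601 timestamp

@dataclass
class WorldModel:
    findings: list[Finding] = field(default_factory=list)

    def findings_for(self, host: str, min_confidence: float = 0.0) -> list[Finding]:
        """Typed query instead of asking the LLM what it remembers."""
        return [f for f in self.findings
                if f.host == host and f.confidence >= min_confidence]

wm = WorldModel()
wm.findings.append(Finding("10.0.0.5", "Directory listing on /backup", 0.9,
                           "web/http-dir-enum@1.2.0", "2026-06-21T00:14:27Z"))
wm.findings.append(Finding("10.0.0.5", "Possible default credentials", 0.4,
                           "web/nuclei-scan@3.1.0", "2026-06-21T00:20:03Z"))
confident = wm.findings_for("10.0.0.5", min_confidence=0.5)
print([f.title for f in confident])
```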
4.2 The Phase Map as Engagement Topology
Alongside the world model, the state layer maintains the Phase Map state — the runtime representation of the declarative engagement topology defined in WP04. A Phase Map is a directed graph where nodes are phases (recon, enumeration, vulnerability, exploitation, post-exploitation, reporting) or sub-phases (enum_smb, enum_http, enum_ssh), and edges are transitions with optional conditional gates.
The state layer tracks:
The Phase Map is not a static plan — it is a living graph that the orchestrator walks dynamically. When a finding triggers a conditional gate (e.g., "if SQL injection found → transition to database extraction sub-phase"), the state layer records the gate evaluation, and the orchestrator enqueues the successor node. When post-exploitation reveals a missed service, the orchestrator can re-enter enumeration from a privileged position through a nested sub-map.
Phase Map State (mid-engagement example):
    Active:    [enum_http, enum_smb] (parallel branch)
    Worklist:  [vuln_scan (gate: enum_complete)]
    Completed: [recon (success), enum_ssh (success)]
    Gates:     enum_complete → FALSE (enum_http still running)
    Nesting:   none active
4.3 Why a Structured World Model?
A common mistake is to let the world model be "whatever the LLM remembers." This works for demos and breaks for real engagements. A structured world model provides:
4.4 Session and Chain Management
Session management tracks active access — shells, pivots, established credentials. A session is a resource that can be used, lost, or escalated. The state layer tracks session liveness (is the shell still responsive?) and session capability (what can this session do?). Losing a session mid-chain is a common failure; the state layer must detect it and signal the cognitive layer to re-establish access.
Attack chain management tracks multi-step paths. A chain is a graph of findings and sessions that connects an entry point to a target objective. The state layer stores chains as graphs, not linear narratives, because real attack paths branch: a credential may be usable on multiple hosts; a pivot may enable multiple next steps. The cognitive layer reasons over the chain graph to select the next step.
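Reasoning over a chain graph can be as simple as a shortest-path search. The sketch below uses breadth-first search over an adjacency map; the node labels and the `find_chain` helper are hypothetical, and a real chain graph would carry typed findings and sessions rather than strings.

```python
from collections import deque

def find_chain(edges: dict, start: str, goal: str):
    """Breadth-first search; returns the shortest node path from start to goal, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Illustrative graph: one credential is usable on two hosts, so the path branches.
edges = {
    "session:web@10.0.0.5": ["cred:svc_backup"],
    "cred:svc_backup": ["session:ssh@10.0.0.7", "session:smb@10.0.0.9"],
    "session:smb@10.0.0.9": ["objective:file_server_access"],
}
chain = find_chain(edges, "session:web@10.0.0.5", "objective:file_server_access")
print(chain)
```

Storing the chain as a graph means that when a session is lost, the same search can be rerun with that node removed to find an alternative path.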
4.5 Concurrency and Consistency
In a multi-agent topology (see Section 6), multiple sub-agents update the world model concurrently. The state layer handles this through event sourcing: all updates are append-only events, and the world model is a projection of the event log. This provides complete auditability, allows replay (re-running the engagement from the event log to reproduce a finding or debug a failure), and resolves concurrent updates through deterministic merge rules — last-write-wins for independent fields, conflict resolution for shared entities.
Event sourcing is recommended for production systems. The overhead is modest, and the audit and replay capabilities are invaluable for debugging agent behavior, reproducing findings for report evidence, and recovering from orchestrator crashes.
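The projection idea can be shown in a few lines. This is a minimal sketch: the event types `asset_seen` and `finding_added` and the `project` function are assumptions for illustration, and only the last-write-wins rule for independent fields is modeled.

```python
def project(events: list) -> dict:
    """Fold an append-only event log into a world-model projection."""
    world = {"assets": {}, "findings": []}
    for ev in events:
        if ev["type"] == "asset_seen":
            # Independent fields: the most recent write wins.
            world["assets"].setdefault(ev["host"], {}).update(ev["fields"])
        elif ev["type"] == "finding_added":
            world["findings"].append(ev["finding"])
    return world

log = []
log.append({"type": "asset_seen", "host": "10.0.0.5", "fields": {"os": "linux"}})
log.append({"type": "asset_seen", "host": "10.0.0.5",
            "fields": {"os": "Ubuntu 22.04", "http": "open"}})
log.append({"type": "finding_added", "finding": "directory listing on /backup"})

world = project(log)          # current state
earlier = project(log[:2])    # replay: state as of the second event
print(world["assets"]["10.0.0.5"], earlier["findings"])
```

Because the projection is a pure function of the log, crash recovery and replay are the same operation applied to different prefixes of the log.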
5. The Cognitive Layer
5.1 Responsibility
The cognitive layer is where LLM reasoning happens. It receives the current world-model state (or a Tri-Con-compressed subset), the current objective, and produces a decision: what tool to run next, what hypothesis to pursue, what finding to record. It is the system's brain — but a brain that operates under severe constraints: finite context, finite tokens, finite cost budget.
The cognitive layer hosts two of the five innovations: the Tri-Con 3-Layer Index (WP01) for context management and the Token Engine (WP02) for token optimization. These are not optional add-ons; they are structural components that make long-horizon reasoning feasible at all.
5.2 Tri-Con: Cascaded Context Management
The Tri-Con 3-Layer Index solves the context-drowning problem: a single nmap scan can emit 12,000+ tokens; a full engagement easily generates millions. No context window can hold this volume while leaving room for reasoning. Tri-Con's solution is a three-tier cascaded index that compresses every tool output into progressively more compressed semantic layers:
An indexing agent sits alongside the orchestrator and processes every tool output asynchronously. Raw data is persisted to disk unaltered. Three progressively compressed semantic indexes are derived. The orchestrator operates exclusively on L3, keeping its steady-state context footprint to ~500–2000 tokens regardless of engagement length. When deeper detail is needed for a specific decision, it triggers cascaded retrieval — L3 → L2 → L1 → Raw — pulling only the granularity required, only for the finding in question, and only for the duration of the reasoning step.
Cognitive Cycle (Tri-Con-aware):
Assemble context: L3 snapshot (~850 tokens) + persistent context + active drill
LLM reasons over L3, decides: "Pursue vsftpd backdoor"
Drill-down: fetch L2 for FTP group (~312 tokens) → L1 for nmap (~1200 tokens)
LLM reasons over L3 + L2 + L1 (~2362 tokens), decides: "Run exploit"
After execution, indexing agent compresses new output → updates L1/L2/L3
L3 snapshot refreshed for next cycle
This architecture is what enables unbounded engagement length at bounded orchestrator context cost. Without Tri-Con, the cognitive layer would drown in its own observations by turn 20. With Tri-Con, it can run for thousands of turns across hours, always reasoning over a complete but compressed picture of everything discovered so far.
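Cascaded retrieval can be sketched as a tiered lookup keyed by finding. `CascadedIndex`, its methods, and the sample entries are hypothetical; the sketch shows only the retrieval path (L3 down to the requested depth, for one finding), not how the tiers are produced.

```python
TIERS = ("L3", "L2", "L1", "raw")   # most compressed first

class CascadedIndex:
    def __init__(self):
        self._store = {}   # (finding_id, tier) -> text

    def put(self, finding_id: str, tier: str, text: str) -> None:
        self._store[(finding_id, tier)] = text

    def drill(self, finding_id: str, depth: str) -> list:
        """Return the stored tiers from L3 down to `depth`, for one finding only."""
        wanted = TIERS[: TIERS.index(depth) + 1]
        return [self._store[(finding_id, tier)]
                for tier in wanted if (finding_id, tier) in self._store]

idx = CascadedIndex()
idx.put("ftp", "L3", "[FTP] 10.10.10.5:21 vsftpd 2.3.4, backdoor candidate")
idx.put("ftp", "L2", "FTP group: anonymous login disabled; version banner matches 2.3.4")
idx.put("ftp", "L1", "nmap summary: 21/tcp open ftp vsftpd 2.3.4")

context = idx.drill("ftp", "L2")   # L3 + L2 only; L1 stays out of context
print(len(context))
```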
5.3 The Token Engine: Reversible Token Optimization
While Tri-Con manages what enters the context window, the Token Engine manages how efficiently tokens are used within it. The Token Engine is a four-stage reversible pipeline that reduces LLM token consumption by 18–35% with zero data loss — every transformation is reversible and the original input can be reconstructed bit-for-bit.
The Token Engine sits in the cognitive layer's text pipeline. On the send side (before the LLM call), input text — whether it is a Tri-Con L1 summary, an L2 group, or the L3 snapshot itself — flows through L1→L2→L3→L4 compression, producing a compressed payload and a metadata sidecar. On the receive side (after the LLM call), output text flows through the inverse pipeline, restoring the original.
The engine is implemented in pure Python with no external LLM calls in the compression path. The only runtime dependency is a tokenizer (tiktoken or equivalent). This means the token savings come at no quality cost and add no model or network round trip: the LLM receives the same semantic content in fewer tokens and produces the same decisions at lower API spend.
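To make the reversibility property concrete, here is a two-stage sketch (line deduplication followed by phrase shorthand) with a metadata sidecar. It is not the Token Engine from WP02: the phrase table, the use of private-use Unicode code points as shorthand codes, and the character-count comparison (instead of a real tokenizer) are all assumptions made for illustration.

```python
PHRASES = ["Service Unavailable", "Connection refused", "HTTP/1.1 "]
CODES = [chr(0xE000 + i) for i in range(len(PHRASES))]   # private-use code points

def compress(text: str):
    """Return (payload, sidecar). Refuses input that already contains a reserved code."""
    if any(code in text for code in CODES):
        raise ValueError("input contains a reserved shorthand code")
    lines = text.split("\n")
    unique = list(dict.fromkeys(lines))               # stage 1: dedup, order preserved
    position = {line: i for i, line in enumerate(unique)}
    payload = "\n".join(unique)
    for phrase, code in zip(PHRASES, CODES):          # stage 2: shorthand
        payload = payload.replace(phrase, code)
    return payload, {"order": [position[line] for line in lines]}

def decompress(payload: str, sidecar: dict) -> str:
    for phrase, code in reversed(list(zip(PHRASES, CODES))):   # undo stage 2
        payload = payload.replace(code, phrase)
    unique = payload.split("\n")
    return "\n".join(unique[i] for i in sidecar["order"])      # undo stage 1

raw = ("GET /a HTTP/1.1 503 Service Unavailable\n"
       "GET /b HTTP/1.1 503 Service Unavailable\n"
       "GET /a HTTP/1.1 503 Service Unavailable")
payload, sidecar = compress(raw)
restored = decompress(payload, sidecar)
print(len(raw), len(payload), restored == raw)
```

The up-front check for reserved codes is what makes the inverse exact: every code found in the payload is known to have come from a substitution.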
In practice, a naive engagement costing $1,240 in token spend is reduced to approximately $68–$90 after Tri-Con compression (eliminating the need to load raw output) combined with Token Engine optimization (compressing what does enter context). This is the difference between an economically viable autonomous pentesting system and one that costs more per engagement than a human consultant.
5.4 The Reasoning Pipeline
A single cognitive cycle is not one LLM call. It is a pipeline:
Each step is instrumented. Failures at any step are logged and handled through the error-handling taxonomy (retry, fallback, human escalation).
5.5 Model Selection
Not all cognitive calls require the same model. The cognitive layer includes a model router that selects the model for each call based on:
The router is a cheap, rule-based or small-model classifier — not itself an LLM call in the hot path. The Tri-Con + Token Engine combination makes model tiering more effective: because the L3 snapshot is small, the orchestrator can use a frontier model for high-stakes planning calls without prohibitive cost, while sub-agents handling routine enumeration use mid-tier models on their own bounded contexts.
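A rule-based router can be very small. In the sketch below the routing inputs (`call_kind`, `est_tokens`, `destructive`), the tier names, and the 500-token threshold are hypothetical and chosen only to illustrate the shape of such a router.

```python
def route_model(call_kind: str, est_tokens: int, destructive: bool) -> str:
    """Pick a model tier with plain rules; no LLM call is involved."""
    if destructive or call_kind == "planning":
        return "frontier"          # high-stakes decisions get the strongest model
    if call_kind == "format_check" and est_tokens < 500:
        return "small"             # cheap, mechanical calls
    return "mid"                   # routine enumeration and parsing

print(route_model("planning", 850, destructive=False))
print(route_model("enumeration", 1200, destructive=False))
```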
5.6 Reasoning Patterns
The cognitive layer supports several reasoning patterns, selected by the orchestrator based on the phase-map node and situation:
Each pattern has a different call profile, context requirement, and cost. The orchestrator selects the pattern based on the phase-map node's configuration; the cognitive layer executes it.
6. The Memory Layer
6.1 Three-Tier Memory Architecture
The memory layer implements a hierarchical memory architecture. Critically, the Tri-Con indexes (WP01) serve as the structural backbone of this hierarchy — they are not a separate system bolted onto memory, but the memory's primary indexing mechanism:
This mapping is intentional. Tri-Con's L1/L2/L3 compression tiers and the memory layer's L1/L2/L3 storage tiers share a common insight: information should be stored at multiple granularities and retrieved at the granularity the current decision requires. The Tri-Con indexing agent that compresses tool output is simultaneously building the session memory's structured index. There is no separate "memory writer" — the indexing agent IS the memory writer.
6.2 L2: The Engagement Memory
The engagement memory (session memory, not to be confused with Tri-Con's L2 contextual tier) is the most operationally critical component. It holds the world model for the current engagement and provides:
The engagement memory is backed by a combination of a document store (for structured data), a vector index (for semantic search), and a graph store (for chain traversal). In practice, a single PostgreSQL database with pgvector and a graph extension (Apache AGE) can serve all three, though specialized stores (Qdrant, Neo4j) offer better performance at scale.
6.3 L3: The Knowledge Base
The long-term memory persists beyond a single engagement. It holds:
- The knowledge/ component of every skill package (WP05) is loaded into the knowledge base at engagement start, giving the agent domain-specific context for each active skill.
The knowledge base enables transfer learning without fine-tuning: the agent retrieves relevant experience from prior engagements and applies it to the current one. This is where autonomous pentesting gets better with scale: each engagement makes the next one more efficient. A new CVE discovered in the wild becomes a new skill package in the library, a new entry in the knowledge base, and a new capability in the orchestrator's registry, all without code changes.
6.4 Memory Consistency
Memory updates must be consistent across tiers. When a finding is added to L2 (session memory), the Tri-Con indexing agent updates the L1/L2/L3 indexes synchronously — the agent may need to retrieve the finding on the next cognitive cycle. Long-term memory (L3 knowledge base) updates are asynchronous, batched at engagement end. The memory layer handles the case where session memory and long-term memory disagree (e.g., a finding in the current engagement contradicts a historical pattern) by surfacing the conflict to the cognitive layer rather than silently resolving it.
7. The Orchestration Layer
7.1 Responsibility
The orchestration layer manages the agent topology: which sub-agent is running, what its objective is, how sub-agents communicate, and when to escalate. It is the system's executive function. The orchestration layer hosts two innovations: the Custom Orchestrator (WP03) for capability-aware task assignment and the Phase Map (WP04) for declarative engagement topology.
7.2 The Custom Orchestrator: Capability-Aware Routing
The Custom Orchestrator replaces the free-form tool-calling loop common to LLM agent frameworks. Rather than handing the model a list of tool schemas and asking it to decide which tool to invoke, the orchestrator maintains a capability registry: a structured database of what each skill actually does, what inputs it requires, what outputs it produces, what side effects it has, and what downstream consumers its output feeds.
The orchestrator operates in two phases:
Phase 1 — Tool Understanding: At engagement start, the orchestrator loads capability maps from every skill in the active skill set. Each capability map (defined in the skill's skill.yaml) captures not just the interface signature but the full operational profile: capabilities with proficiency ratings, input requirements, output characteristics (volume, format, noise profile), side effects (traffic generated, IDS detectability, destructive potential), downstream consumers (which Tri-Con tier the output feeds), and alternatives. The orchestrator builds a capability registry indexed by capability tags, input types, output types, and side-effect classes.
Phase 2 — Task Assignment: The orchestrator decomposes the current engagement objective (derived from the phase-map node) into atomic task specifications, matches each task's requirements against the capability registry using a weighted scoring algorithm, and selects the optimal skill — or chain of skills — for the task. After the assigned agent executes, the orchestrator assesses the results against the task's success criteria and can reassign to an alternative skill or agent if the outcome is insufficient.
Orchestrator Task Assignment Cycle:
Phase Map node → objective: "Enumerate HTTP services on 10.0.0.5"
Decompose into task spec: {capability: web_directory_enum, target: 10.0.0.5, ...}
Match against capability registry → scored candidates:
    gobuster:    0.91 (expert, high output, needs wildcard tuning)
    feroxbuster: 0.88 (expert, recursive, better soft-404)
    ffuf:        0.74 (flexible, but overpowered for this task)
Select: gobuster (highest score, matches stealth requirements)
Inject context via Tri-Con: L3 snapshot + drill to relevant L2
Execute: gobuster runs via Skill Platform
Assess: output parsed by skill parser → 47 directories found → SUCCESS
Decide: advance to next task (vuln_scan on found directories)
This capability-aware routing eliminates the four failure modes that plague free-form tool-calling: wrong tool selection (model picks by name familiarity, not capability fit), parameter misuse (model doesn't know tool-specific tuning), context pollution (model injects raw high-volume output), and side-effect blindness (model runs noisy tools against production during business hours). The capability registry encodes all of this as structured data that the matching algorithm consults deterministically.
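The matching step can be sketched as a weighted score over registry profiles. Everything specific here is an assumption: the weights, the profile fields (`noise`, `output_format`), and the registry entries are invented for illustration and do not reproduce the scoring algorithm or the scores quoted in the example cycle.

```python
WEIGHTS = {"capability": 0.6, "stealth": 0.25, "output_fit": 0.15}

def score(profile: dict, task: dict) -> float:
    """Deterministic fit score in [0, 1]; 0 means the skill cannot do the task."""
    proficiency = profile["capabilities"].get(task["capability"])
    if proficiency is None:
        return 0.0
    stealth = 1.0 - profile["noise"] if task["needs_stealth"] else 1.0
    fit = 1.0 if profile["output_format"] in task["accepted_formats"] else 0.0
    total = (WEIGHTS["capability"] * proficiency
             + WEIGHTS["stealth"] * stealth
             + WEIGHTS["output_fit"] * fit)
    return round(total, 3)

def rank(registry: dict, task: dict) -> list:
    scored = [(score(profile, task), name) for name, profile in registry.items()]
    return [(name, s) for s, name in sorted(scored, reverse=True) if s > 0]

registry = {
    "dir_enum_a": {"capabilities": {"web_directory_enum": 0.9},
                   "noise": 0.6, "output_format": "json"},
    "dir_enum_b": {"capabilities": {"web_directory_enum": 0.8},
                   "noise": 0.2, "output_format": "json"},
    "port_scanner": {"capabilities": {"port_scan": 0.95},
                     "noise": 0.5, "output_format": "xml"},
}
task = {"capability": "web_directory_enum", "needs_stealth": True,
        "accepted_formats": {"json"}}
ranked = rank(registry, task)
print(ranked)
```

In this toy registry the quieter skill outranks the more proficient one because the task asks for stealth, which is the kind of trade-off a name-based tool choice would miss.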
7.3 The Phase Map: Declarative Engagement Topology
The Phase Map (WP04) provides the orchestrator with a declarative, graph-based engagement topology that it interprets at runtime. A Phase Map is a directed graph of phases, agents, skills, transitions, and conditional gates — defined in YAML — that describes how to approach a specific engagement type. Each phase map defines:
- Nesting of other phase maps through sub_map references.
The orchestrator's graph walker maintains a worklist of active nodes, executes each node's assigned agents with the node's assigned skills, evaluates outgoing gates against the finding graph, and enqueues successor nodes whose gates pass. The graph is walked dynamically: branching, merging, backtracking, and nesting based on findings rather than following a fixed sequence.
Phase Map: OWASP Web (simplified)
[recon] ──gate: host_alive──► [enum_http]
                                   │
                      ┌────────────┼────────────┐
                      ▼            ▼            ▼
                 [enum_dirs]  [enum_tech]  [enum_auth]
                      │            │            │
                      └────────────┼────────────┘
                                   ▼
                              [vuln_scan]
                          gate: vulns_found?
                          ├── yes ──► [exploit]
                          │               │
                          │               ▼
                          │        [post_exploit]
                          │               │
                          │               ▼
                          └── no ───► [report]
Pre-built phase maps ship with the platform for common engagement types: OWASP Web, IoT Assessment, PTES Network, Active Directory, API Testing, Red Team. These are selectable at engagement start. Custom phase maps can be authored by security analysts without touching orchestrator code — the workflow is described, not programmed. Phases can be skipped, added, or reordered mid-engagement through a runtime modification protocol, allowing the agent to adapt to emerging findings without restarting.
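A worklist-based walker over a gated graph can be sketched as follows. This is a simplification under assumptions: the map is a Python dictionary with lambda gates rather than YAML, the enumeration sub-phases are collapsed into one node, and parallel branches, merging, and nesting are omitted. `walk` and `make_runner` are hypothetical names.

```python
from collections import deque

# node -> list of (successor, gate); a gate is a predicate over accumulated findings
PHASE_MAP = {
    "recon":        [("enum_http", lambda f: f.get("host_alive", False))],
    "enum_http":    [("vuln_scan", lambda f: True)],
    "vuln_scan":    [("exploit", lambda f: f.get("vulns_found", 0) > 0),
                     ("report",  lambda f: f.get("vulns_found", 0) == 0)],
    "exploit":      [("post_exploit", lambda f: True)],
    "post_exploit": [("report", lambda f: True)],
    "report":       [],
}

def walk(phase_map: dict, start: str, run_node) -> list:
    """Walk the map from `start`, enqueueing successors whose gates pass."""
    findings, visited = {}, []
    worklist = deque([start])
    while worklist:
        node = worklist.popleft()
        if node in visited:
            continue
        visited.append(node)
        findings.update(run_node(node))        # execute the node, merge its findings
        for successor, gate in phase_map[node]:
            if gate(findings):
                worklist.append(successor)
    return visited

def make_runner(results: dict):
    return lambda node: results.get(node, {})

with_vulns = walk(PHASE_MAP, "recon",
                  make_runner({"recon": {"host_alive": True},
                               "vuln_scan": {"vulns_found": 2}}))
clean = walk(PHASE_MAP, "recon", make_runner({"recon": {"host_alive": True}}))
print(with_vulns)
print(clean)
```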
7.4 Single-Agent vs. Multi-Agent
A single-agent architecture has one cognitive loop that handles all phases. It is simpler to build and debug but hits the context window crisis and does not parallelize.
A multi-agent architecture has specialized sub-agents coordinated by the orchestrator. It is more complex but:
The recommended architecture for production is multi-agent with a thin orchestrator. The orchestrator works only with high-level state (current phase-map node, active sub-agents, engagement objectives, budget) and delegates all detailed work. It keeps no durable state of its own: everything it needs is persisted in the state layer. If the orchestrator crashes, it can resume from the last persisted phase-map state and world-model checkpoint.
7.5 Escalation and Human-in-the-Loop
The orchestrator must know when to escalate to a human:
Escalation is not failure; it is a designed interaction point. The orchestrator packages the current state, the decision point, and the options into a human-readable brief and pauses. The human's decision is recorded in the event log and the agent resumes. The interface layer surfaces the escalation in real time through the operator console.
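The brief-and-record interaction can be sketched with a small data structure. `EscalationBrief`, `record_decision`, and the event shape are hypothetical; the option texts are illustrative.

```python
from dataclasses import dataclass

@dataclass
class EscalationBrief:
    proposed: str
    risk: str
    target: str
    options: dict   # option key -> human-readable description

    def render(self) -> str:
        lines = [f"Proposed: {self.proposed}",
                 f"Risk: {self.risk}",
                 f"Target: {self.target}",
                 "Options:"]
        lines += [f"  ({key}) {text}" for key, text in self.options.items()]
        return "\n".join(lines)

def record_decision(event_log: list, brief: EscalationBrief, choice: str) -> dict:
    """Append the human's choice to the event log; reject choices not in the brief."""
    if choice not in brief.options:
        raise ValueError(f"unknown option: {choice}")
    event = {"type": "human_decision", "proposed": brief.proposed,
             "choice": choice, "meaning": brief.options[choice]}
    event_log.append(event)
    return event

brief = EscalationBrief(
    proposed="time-based blind SQLi test on /api/users",
    risk="heavy DB queries, possible table lock",
    target="production DB (10.0.0.5:3306)",
    options={"a": "proceed", "c": "skip this finding", "d": "schedule for off-hours"},
)
event_log = []
decision = record_decision(event_log, brief, "d")
print(brief.render())
```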
8. The Interface Layer
8.1 The Operator Console
Even a fully autonomous system needs an operator console — a human-readable view of what the agent is doing, in real time. The console shows:
The console is not optional. Without it, the operator is flying blind, and "autonomous" becomes "unobservable." An unobservable autonomous system with access to exploitation tools is a liability.
8.2 Reporting
The final deliverable of a pentest is a report. The interface layer generates the report from the world model at engagement end (or on demand). The report is not a stream of LLM prose; it is a structured document generated from findings:
- Finding details, rendered from skill report templates (each skill's report/ component) plus LLM elaboration.
Structured generation ensures completeness (every finding is reported), consistency (format is uniform), and auditability (every claim traces to evidence). The LLM's role is narration and synthesis, not authorship from scratch. The skill platform's report templates (Jinja2/Markdown) ensure that findings from each skill are rendered in the appropriate format.
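Structured generation can be illustrated with the standard library's `string.Template`, used here as a stand-in for the Jinja2 templates the skill packages ship. The template text, field names, and sample findings are assumptions for the sketch.

```python
from string import Template

SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}
FINDING_TEMPLATE = Template("[$severity] $title\n  Host: $host\n  Evidence: $evidence")

def render_report(findings: list) -> str:
    """Render every finding, most severe first, each with its evidence reference."""
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])
    return "\n\n".join(FINDING_TEMPLATE.substitute(f) for f in ordered)

findings = [
    {"severity": "medium", "title": "Directory listing enabled",
     "host": "10.0.0.5", "evidence": "event:42"},
    {"severity": "critical", "title": "SQL injection in /api/users",
     "host": "10.0.0.5", "evidence": "event:57"},
]
report = render_report(findings)
print(report)
```

`Template.substitute` raises `KeyError` when a finding lacks a field the template needs, so an incomplete finding fails loudly instead of producing a report with silent gaps.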
8.3 Observability and Telemetry
The system emits telemetry for debugging, cost analysis, and quality improvement:
This telemetry is the substrate for post-engagement analysis, agent improvement, and the context-health and cost metrics. It is also the basis for the continuous learning loop: tool heuristics learned during an engagement are written back to the long-term memory (L3 knowledge base) and improve the next engagement.
9. Component Interaction Flows
9.1 A Single Cognitive Cycle (End-to-End)
To show how the layers and innovations interact, we trace a single cognitive cycle from observation to action:
STEP 1: OBSERVE (Execution Layer)
  Orchestrator assigns task: "Enumerate HTTP on 10.0.0.5"
    → Skill Platform loads skill: web/http-dir-enum@1.2.0
    → Execution layer runs gobuster in sandbox
    → Result: 47 directories found (raw: 8,200 tokens)
    → Skill parser normalizes → structured_output: [{path, status}, ...]

STEP 2: ORIENT (Cognitive + Memory Layers)
  Tri-Con indexing agent processes structured_output:
    → Persist raw to disk: raw/eng_001/...gobuster.out
    → Generate L1 entry: structured summary (~1,200 tokens)
    → Update L2 group "web_dirs_10.0.0.5": merge with existing (~380 tokens)
    → Update L3 entry: one-liner (~80 tokens)
    → Notify orchestrator: L3 delta available
  State layer:
    world model updated with 47 new Asset entries (paths)
    event log appended with Action record

STEP 3: DECIDE (Cognitive + Orchestration Layers)
  Orchestrator: phase-map node "enum_http" → evaluate gates
    → Gate "enum_complete": still running (parallel branch not finished)
    → Continue in current node
  Cognitive layer assembles context:
    → L3 snapshot (~850 tokens) + persistent context (~200 tokens)
    → Token Engine compresses: 1050 → ~780 tokens (26% savings)
    → LLM call (mid-tier model): "47 dirs found, /admin and /config.php.bak are interesting. Next: vuln scan on these paths."
  Output validation: schema-valid → decompress via Token Engine
  Safety pre-check: vuln scan within scope → PASS

STEP 4: ACT (Orchestration → Execution)
  Orchestrator: decompose into task spec
    → Match against capability registry: nuclei scores 0.93
    → Assign to vuln_agent with skill web/nuclei-scan@3.1.0
  Execution layer: load skill, run nuclei in sandbox
    → (cycle repeats from STEP 1)

STEP 5: RECORD (State + Memory + Interface)
  State layer: event log updated, world model checkpointed
  Memory layer: L2 session store updated synchronously
  Interface layer: operator console streams live activity
    cost tracker updates: +$0.03 this cycle
9.2 Cascaded Retrieval Flow
When the cognitive layer needs deeper detail than the L3 snapshot provides:
Orchestrator LLM sees L3 entry:
  "[FTP] 10.10.10.5:21 vsftpd 2.3.4 — CVE-2011-2523 backdoor candidate"
    → Decides to pursue this finding

Drill-down request: l2_grp_ftp_10.10.10.5
    → Memory layer retrieves L2 group (~312 tokens)
    → Token Engine compresses: 312 → ~230 tokens

Still need exact nmap script output?
    → Drill-down request: l1_20260621_001427_nmap
    → Memory layer retrieves L1 entry (~1,200 tokens)
    → Token Engine compresses: 1200 → ~880 tokens

Still need raw banner?
    → Drill-down request: raw/eng_001/...nmap.out
    → Memory layer reads from disk, extracts relevant section

Total context for this reasoning step:
  L3 (~850) + L2 (~230) + L1 (~880) = ~1,960 tokens
  (vs. 124,600 if all raw output were loaded: a 98.4% reduction)
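The token arithmetic of the cascade can be checked with a short sketch. The per-level costs and the raw corpus size are the figures from the example above; the `cascaded_context` helper and its `depth` parameter are hypothetical names introduced only for illustration.

```python
RAW_CORPUS_TOKENS = 124_600  # size of all raw tool output in the example

# (level, token cost after Token Engine compression), shallowest first.
LEVELS = [("L3", 850), ("L2", 230), ("L1", 880)]


def cascaded_context(depth):
    """Return (levels loaded, total tokens) when drilling `depth` levels deep."""
    if not 1 <= depth <= len(LEVELS):
        raise ValueError("depth must be between 1 and 3")
    loaded = LEVELS[:depth]
    return [name for name, _ in loaded], sum(cost for _, cost in loaded)


levels, total = cascaded_context(3)
reduction = 1 - total / RAW_CORPUS_TOKENS
print(levels, total, f"{reduction:.1%}")  # ['L3', 'L2', 'L1'] 1960 98.4%
```

Because each level is loaded only on request, a reasoning step that stops at L3 pays 850 tokens rather than the full 1,960.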
9.3 Phase Map Transition Flow
When a finding triggers a conditional gate:
Vuln scan completes → finding: "SQL injection in /api/users?id="
  → State layer: Finding added to world model
  → Tri-Con: L1/L2/L3 updated
  → Orchestrator: evaluate outgoing gates from [vuln_scan] node
      → Gate "vulns_found": TRUE
      → Gate "vuln_type == sqli": TRUE
      → Transition: enqueue [exploit_sqli] node
  → Orchestrator: activate [exploit_sqli] node
      → Assign exploit_agent with skill web/sqli-exploit@1.0.0
      → Inject context: L3 + drill to sqli finding L2
  → Execution: sqlmap runs in sandbox
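A gate evaluation of this kind might look like the following sketch. It assumes gates are predicates over a world-model view; the `Gate` dataclass, the `PHASE_MAP` dictionary, and the `no_vulns` gate with its `report` target are hypothetical and are not taken from the Phase Map format in WP04.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Gate:
    name: str
    condition: Callable[[dict], bool]  # evaluated against the world-model view
    target: str                        # node to enqueue when the gate is TRUE


# Hypothetical phase-map fragment: outgoing gates of the vuln_scan node.
PHASE_MAP = {
    "vuln_scan": [
        Gate("vuln_type == sqli",
             lambda w: bool(w["findings"]) and w["findings"][-1]["vuln_type"] == "sqli",
             "exploit_sqli"),
        Gate("no_vulns", lambda w: not w["findings"], "report"),
    ],
}


def evaluate_gates(node, world):
    """Return the nodes to enqueue: every outgoing gate whose condition holds."""
    return [g.target for g in PHASE_MAP.get(node, []) if g.condition(world)]


world = {"findings": [{"vuln_type": "sqli", "location": "/api/users?id="}]}
print(evaluate_gates("vuln_scan", world))  # ['exploit_sqli']
```

Keeping gates as data rather than code paths is what lets the graph walker stay unchanged when a new topology is loaded.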
9.4 Escalation Flow
Orchestrator detects: exploit may cause DoS on production DB
  → Safety pre-check: "destructive_confirmed" flag on skill
  → Orchestrator: escalate to human
  → Interface layer: escalation appears in operator console
      Brief: "Proposed: time-based blind SQLi on /api/users
              Risk: heavy DB queries, possible table lock
              Target: production DB (10.0.0.5:3306)
              Options: (a) proceed with --risk=1, (b) use --risk=0,
                       (c) skip this finding, (d) schedule for off-hours"
  → Human selects: (d) schedule for off-hours
  → Decision recorded in event log
  → Orchestrator: defer task, advance to next finding
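The last three steps of the escalation (human choice, event-log record, orchestrator action) can be sketched as below. The `Escalation` dataclass, the `ACTIONS` mapping, and the event field names are assumptions made for illustration; the source specifies only that the decision is recorded and the task deferred.

```python
from dataclasses import dataclass

# Hypothetical mapping from the operator's choice to the orchestrator's next action.
ACTIONS = {"a": "proceed", "b": "proceed_reduced_risk", "c": "skip", "d": "defer"}


@dataclass
class Escalation:
    proposed: str
    risk: str
    target: str


def resolve(escalation, choice, event_log):
    """Record the human decision in the event log and return the orchestrator action."""
    if choice not in ACTIONS:
        raise ValueError(f"unknown option: {choice!r}")
    action = ACTIONS[choice]
    event_log.append({"type": "escalation_resolved", "proposed": escalation.proposed,
                      "target": escalation.target, "choice": choice, "action": action})
    return action


log = []
esc = Escalation("time-based blind SQLi on /api/users",
                 "heavy DB queries, possible table lock", "10.0.0.5:3306")
print(resolve(esc, "d", log))  # defer
```

Writing the decision to the log before returning the action means a replay of the engagement reproduces the human's choice without asking again.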
10. Data Flow Between Layers
The architecture's data flow is unidirectional at the cycle level but bidirectional across cycles. Understanding the data contracts between layers is essential for implementation:
                   ┌─────────┐
                   │INTERFACE│
                   └────┬────┘
         escalations ▲  │  ▼ telemetry, reports
                   ┌────┴────┐
                   │ORCHESTR.│
                   └────┬────┘
   task specs, agent ▲  │  ▼ assignments, decisions
         assignments    │
                   ┌────┴────┐
                   │COGNITIVE│
                   └────┬────┘
       L3 snapshots, ▲  │  ▼ decisions, drill requests
       drill results    │
                   ┌────┴────┐
                   │  STATE  │
                   └────┬────┘
     queries, events ▲  │  ▼ world model updates,
         checkpoints    │    phase-map state
                   ┌────┴────┐
                   │ MEMORY  │
                   └────┬────┘
       retrieved ctx ▲  │  ▼ indexed entries,
           knowledge    │    knowledge updates
                   ┌────┴────┐
                   │EXECUTION│
                   └─────────┘
                        ▲ tool results
                        │ (structured_output)
Key data contracts:
- ToolExecutionResult with structured_output. Raw output goes to state layer, not cognitive.

The Token Engine operates within the Cognitive layer's data flow: it transforms text as it enters and exits the LLM call, but the data contracts between layers are in uncompressed (original) form. This ensures that other layers never need to know about compression; the Token Engine is transparent to the rest of the system.
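The transparency property can be illustrated with a toy wrapper. This is a sketch only: the codebook substitution below is a stand-in chosen for brevity and is not the Token Engine's actual scheme (see WP02); `CODEBOOK`, `reason`, and `fake_llm` are hypothetical names.

```python
# Toy codebook standing in for the Token Engine; the real scheme is described in WP02.
CODEBOOK = {"directories": "§d", "vulnerability scan": "§v"}


def compress(text):
    for phrase, code in CODEBOOK.items():
        text = text.replace(phrase, code)
    return text


def decompress(text):
    for phrase, code in CODEBOOK.items():
        text = text.replace(code, phrase)
    return text


def reason(prompt, llm):
    """Cognitive-layer entry point: callers pass and receive uncompressed text."""
    return decompress(llm(compress(prompt)))


seen = []


def fake_llm(compressed_prompt):
    # Stand-in for a model call; it only ever sees the compressed form.
    seen.append(compressed_prompt)
    return "Next: §v on /admin"


print(reason("47 directories found", fake_llm))  # Next: vulnerability scan on /admin
```

Only `fake_llm` ever sees the short codes; every caller of `reason` works with the original text, which is why the inter-layer contracts can ignore compression entirely.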
11. Deployment Architecture
A production deployment of this architecture is a distributed system. Each layer can be deployed as an independent service, scaled independently, failed over independently, and upgraded independently.
┌──────────────────────────────────────────────────────────────────┐
│                       DEPLOYMENT TOPOLOGY                        │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Operator Console (Web UI)                                  │  │
│  │ WebSocket → telemetry stream; REST → world model query     │  │
│  └─────────────────────────────┬──────────────────────────────┘  │
│                                │                                 │
│  ┌─────────────────────────────┴──────────────────────────────┐  │
│  │ Orchestrator Service (stateless, horizontally scalable)    │  │
│  │ - Phase-map graph walker                                   │  │
│  │ - Capability registry (loaded at startup)                  │  │
│  │ - Task assignment engine                                   │  │
│  │ - Sub-agent process manager                                │  │
│  └───┬────────────┬────────────┬────────────┬────────────┬────┘  │
│      │            │            │            │            │       │
│  ┌───┴───┐    ┌───┴───┐    ┌───┴───┐    ┌───┴───┐    ┌───┴───┐   │
│  │ Recon │    │ Enum  │    │Exploit│    │Later. │    │Report │   │
│  │ Agent │    │ Agent │    │ Agent │    │ Agent │    │ Agent │   │
│  └───┬───┘    └───┬───┘    └───┬───┘    └───┬───┘    └───┬───┘   │
│      │            │            │            │            │       │
│  ┌───┴────────────┴────────────┴────────────┴────────────┴────┐  │
│  │ Cognitive Service (per-agent LLM call pipeline)            │  │
│  │ - Tri-Con indexing agent (async queue consumer)            │  │
│  │ - Token Engine (inline in LLM call path)                   │  │
│  │ - Model router                                             │  │
│  └───┬────────────────────────────────────────┬───────────────┘  │
│      │                                        │                  │
│  ┌───┴─────────────────────┐  ┌───────────────┴───────────────┐  │
│  │ State Service           │  │ Memory Service                │  │
│  │ (stateful)              │  │ (stateful)                    │  │
│  │ - PostgreSQL:           │  │ - L2: pgvector + AGE          │  │
│  │     world model,        │  │     (session memory)          │  │
│  │     event log,          │  │ - L3: Qdrant + document       │  │
│  │     phase-map state     │  │     store (long-term memory)  │  │
│  │ - Redis: checkpoint     │  │ - Raw output: S3/disk         │  │
│  └───┬─────────────────────┘  └───────────────────────────────┘  │
│      │                                                           │
│  ┌───┴────────────────────────────────────────────────────────┐  │
│  │ Execution Service                                          │  │
│  │ - Skill loader (pulls from Skill Library/Marketplace)      │  │
│  │ - Container orchestrator (Docker/K8s for sandboxing)       │  │
│  │ - Safety policy engine                                     │  │
│  │ - Skill parser runtime (sandboxed Python)                  │  │
│  └────────────────────────────────────────────────────────────┘  │
│                                                                  │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │ Skill Library (shared, versioned)                          │  │
│  │ - Git repo or OCI registry                                 │  │
│  │ - Marketplace for community skills                         │  │
│  │ - On-prem mirror for air-gapped deployments                │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
11.1 Scaling Properties
11.2 Deployment Modes
12. Cross-Cutting Concerns
12.1 Security of the Agent Itself
The agent is a privileged system: it holds credentials, runs exploits, and has access to target infrastructure. It must be secured as carefully as the targets it tests:
12.2 Observability
Observability is not a feature; it is a requirement for safe autonomous operation. The system emits structured telemetry at every layer:
12.3 Error Handling
An autonomous system running for hours will encounter failures. The architecture handles these through a layered error taxonomy:
The system is crash-resilient: if any service crashes, it can resume from the last persisted state. The state layer checkpoints the world model and phase-map state at each finding (or at each cycle if findings are sparse). This is a trade-off between checkpoint overhead and recovery granularity — for most engagements, per-finding checkpointing is sufficient.
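One way to express this checkpoint policy is sketched below. It assumes a tunable fallback: `max_quiet_cycles` is a hypothetical parameter (the default of 1 gives the per-cycle behaviour described above), and `save` is any callable that persists a snapshot.

```python
class Checkpointer:
    """Checkpoint on every new finding, or after `max_quiet_cycles` cycles without one."""

    def __init__(self, save, max_quiet_cycles=1):
        self.save = save                    # callable that persists a snapshot
        self.max_quiet_cycles = max_quiet_cycles
        self.quiet_cycles = 0

    def end_of_cycle(self, world, new_findings):
        """Call once per cycle; returns True if a checkpoint was written."""
        self.quiet_cycles = 0 if new_findings else self.quiet_cycles + 1
        if new_findings or self.quiet_cycles >= self.max_quiet_cycles:
            self.save(dict(world))          # shallow copy of the current world model
            self.quiet_cycles = 0
            return True
        return False


snapshots = []
cp = Checkpointer(snapshots.append, max_quiet_cycles=3)
written = [cp.end_of_cycle({"cycle": i}, n) for i, n in enumerate([1, 0, 0, 0, 0, 2])]
print(written)  # [True, False, False, True, False, True]
```

Raising `max_quiet_cycles` trades checkpoint overhead against how many cycles must be re-run after a crash.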
12.4 Idempotency and Replay
Tool executions should be idempotent where possible (running the same scan twice produces the same result). The system supports replay — re-running the engagement from the event log to reproduce a finding or debug a failure. Event sourcing in the state layer enables this: the event log is the complete record of every action and its result, and the world model is a deterministic projection of that log.
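The projection idea can be sketched as a pure fold over the event log. The event types and field names here are hypothetical; the point is only that the world model is computed from the log and from nothing else, so replaying a prefix reproduces any earlier state.

```python
def project(events):
    """Fold an event log into a world model. Pure function: same log, same model."""
    world = {"assets": [], "findings": []}
    for event in events:
        if event["type"] == "asset_discovered":
            if event["asset"] not in world["assets"]:
                world["assets"].append(event["asset"])
        elif event["type"] == "finding_recorded":
            world["findings"].append(event["finding"])
    return world


EVENT_LOG = [
    {"type": "asset_discovered", "asset": "10.0.0.5:80"},
    {"type": "asset_discovered", "asset": "10.0.0.5:80"},  # duplicate scan result
    {"type": "finding_recorded", "finding": "SQL injection in /api/users?id="},
]

# Replay the full log, or only a prefix to inspect the state before the finding.
print(project(EVENT_LOG))
print(project(EVENT_LOG[:2]))
```

The duplicate `asset_discovered` event leaves the model unchanged, which is the idempotency property that makes re-running a scan safe for the world model.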
Replay is essential for three scenarios:
13. How the Five Innovations Compose
The five innovations are not independent components that happen to coexist. They are architectural elements that compose into a single system, each addressing a failure mode that the others cannot:
13.1 The Composition Matrix
13.2 Interaction Chains
The innovations form interaction chains where each enables the next:
Tri-Con → Token Engine: Tri-Con compresses what enters the context, taking raw output that can run to millions of tokens down to a ~850-token L3 snapshot. The Token Engine compresses how efficiently those tokens are represented (18–35% further reduction). In the worked example from Section 9.2, the two together reduce a 124,600-token raw corpus to ~620 tokens in the LLM's context, a 99.5% reduction. Nothing is discarded along the way: the raw output stays on disk and remains reachable through drill-down.
Skill Platform → Custom Orchestrator: The Skill Platform's skill.yaml manifests include capability declarations — what the skill does, its inputs, outputs, side effects, and alternatives. The Custom Orchestrator's Phase 1 loads these declarations into its capability registry. Without the Skill Platform's structured manifests, the orchestrator would have no capability data to build its registry from. Without the orchestrator's capability-aware routing, the Skill Platform's skills would be selected by the LLM based on name familiarity, not capability fit.
Phase Map → Custom Orchestrator: The Phase Map defines what to do in each phase (which agents, which skills, which gates). The Custom Orchestrator decides how to do it (which specific skill to assign, with what parameters, to which agent). The phase map provides the graph; the orchestrator walks it. Without the phase map, the orchestrator would have no topology to walk — it would be a capability-aware router with no engagement structure. Without the orchestrator, the phase map would be a static plan with no adaptive execution.
Tri-Con → Custom Orchestrator: The orchestrator's context injection protocol is built on Tri-Con. When the orchestrator assigns a task to a sub-agent, it injects context via Tri-Con: the L3 snapshot (always), the relevant L2 group (if the task targets a specific finding), and the L1 entry (if detailed tool output is needed). Without Tri-Con, the orchestrator would have no way to give a sub-agent a coherent, bounded context — it would either dump everything (context drowning) or give nothing (blind execution).
Skill Platform → Tri-Con: The skill platform's parser component produces structured_output — normalized Finding objects. This is what the Tri-Con indexing agent consumes to generate L1 entries. Without the skill platform's parsers, Tri-Con would have to parse raw tool output itself, which would require tool-specific parsing logic baked into the cognitive layer. With the skill platform, each skill brings its own parser, and Tri-Con indexes the already-structured output.
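The Skill Platform → Custom Orchestrator chain described above can be sketched as follows. The dictionaries stand in for parsed skill.yaml manifests, and both the field names and the coverage score are assumptions; the source does not specify how the orchestrator computes a match score such as the 0.93 shown earlier.

```python
# Stand-ins for parsed skill.yaml manifests; the field names are hypothetical.
MANIFESTS = [
    {"skill": "web/http-dir-enum@1.2.0", "capabilities": {"http", "dir_enum"}},
    {"skill": "web/nuclei-scan@3.1.0",
     "capabilities": {"http", "vuln_scan", "template_based"}},
    {"skill": "web/sqli-exploit@1.0.0", "capabilities": {"http", "sqli", "exploit"}},
]


def build_registry(manifests):
    """Map each skill id to its declared capability set."""
    return {m["skill"]: set(m["capabilities"]) for m in manifests}


def best_skill(required, registry):
    """Pick the skill covering the largest share of the required capabilities."""
    def coverage(skill):
        return len(required & registry[skill]) / len(required)
    skill = max(sorted(registry), key=coverage)  # sorted for a deterministic tie-break
    return skill, coverage(skill)


registry = build_registry(MANIFESTS)
print(best_skill({"http", "vuln_scan"}, registry))  # ('web/nuclei-scan@3.1.0', 1.0)
```

Selection depends only on declared capabilities, so a newly published skill becomes routable as soon as its manifest is loaded.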
13.3 The Full Composition
In a single cognitive cycle, all five innovations interact:
structured_output via the skill's parser.

This cycle repeats for the duration of the engagement, each cycle costing ~$0.01–0.05 in LLM tokens (post-optimization), producing ~1–5 new findings, and advancing the phase-map state.
14. Adaptability Through Decoupling
This architecture is designed for 2026-era models and tools, but it must adapt as capabilities evolve. The key design choice that enables adaptability is decoupling — each layer communicates with adjacent layers through defined contracts, not through shared implementation. This means any component can be replaced without touching the others:
14.1 Model Adaptability
The cognitive layer is decoupled from the model through the model router. When a new model with a 10M-token context window arrives, the Tri-Con L1/L2/L3 boundary shifts (more can fit in L1 working memory, less drill-down needed) but the architecture does not change. The model router is updated to include the new model; the reasoning pipeline is unchanged. When a new provider offers a cheaper mid-tier model, the router routes enumeration calls to it automatically.
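A router with this behaviour might be sketched as below. The tier names, the task-to-tier policy, the model names, and the prices are all hypothetical placeholders; the sketch only shows why registering a cheaper model changes routing without touching the reasoning pipeline.

```python
class ModelRouter:
    """Routes each task type to the cheapest registered model of the required tier."""

    # Hypothetical policy: which tier each kind of call needs.
    TIER_FOR_TASK = {"enumeration": "mid", "exploit_planning": "frontier"}

    def __init__(self):
        self.models = []  # (name, tier, cost per 1M tokens in USD)

    def register(self, name, tier, cost_per_mtok):
        self.models.append((name, tier, cost_per_mtok))

    def route(self, task_type):
        tier = self.TIER_FOR_TASK[task_type]
        candidates = [m for m in self.models if m[1] == tier]
        if not candidates:
            raise LookupError(f"no model registered for tier {tier!r}")
        return min(candidates, key=lambda m: m[2])[0]


router = ModelRouter()
router.register("provider-a-mid", "mid", 3.00)
router.register("provider-a-large", "frontier", 15.00)
print(router.route("enumeration"))              # provider-a-mid
router.register("provider-b-mid", "mid", 1.20)  # a cheaper mid-tier model appears
print(router.route("enumeration"))              # provider-b-mid
```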
The Token Engine is tokenizer-aware: when a new model uses a different tokenizer, the engine is re-parameterized with that tokenizer. Savings in the same 18–35% range are expected, though they should be re-measured for each tokenizer. No architectural change.
14.2 Tool Adaptability
The execution layer is decoupled from tools through the Skill Platform's uniform interface. When a new tool (e.g., a novel BLE pentesting tool) is needed, a domain expert writes a skill package, pushes it to the library, and the execution layer can run it. The orchestrator's capability registry is updated from the new skill's manifest. No core code change. No regression risk.
When a tool is updated (e.g., nmap 7.95 → 8.0), the skill package is versioned (network/nmap-scan@2.0.0), and the old version remains available for engagements that require reproducibility. The semver system ensures that breaking changes are major-version bumps and dependent skills declare their compatibility ranges.
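Version pinning of this kind reduces to a small resolution step, sketched here under the assumption that an engagement pins a major version and accepts the newest release within it. The `resolve` helper and the release history are hypothetical; pre-release and build-metadata tags from the full semver specification are not handled.

```python
def parse(version):
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for ordering."""
    return tuple(int(part) for part in version.split("."))


def resolve(skill, major, available):
    """Return the newest available version of `skill` within the given major line."""
    matching = [v for v in available if parse(v)[0] == major]
    if not matching:
        raise LookupError(f"{skill} has no {major}.x release")
    return f"{skill}@{max(matching, key=parse)}"


AVAILABLE = ["1.4.2", "1.10.0", "2.0.0"]  # hypothetical release history
print(resolve("network/nmap-scan", 1, AVAILABLE))  # network/nmap-scan@1.10.0
print(resolve("network/nmap-scan", 2, AVAILABLE))  # network/nmap-scan@2.0.0
```

Comparing parsed tuples rather than strings matters: as text, "1.4.2" sorts after "1.10.0", which would pin the engagement to the older release.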
14.3 Methodology Adaptability
The orchestration layer is decoupled from engagement topology through the Phase Map. When a new engagement type is needed (e.g., "cloud container pentest"), a security analyst authors a new phase map YAML and selects it at engagement start. No orchestrator code change. The graph walker interprets the new topology at runtime.
When a methodology evolves (e.g., OWASP WSTG adds a new testing category), the phase map is updated to include a new node or sub-phase. Existing engagements running on the old phase map version are unaffected — phase maps are versioned and pinned per engagement.
14.4 Memory Adaptability
The memory layer is decoupled from storage through the L1/L2/L3 abstractions. When a better vector store becomes available, the L2 session memory's vector index is swapped. When a graph database with better traversal performance appears, the chain graph store is swapped. The cognitive layer's retrieval interface is unchanged — it asks for L2 groups and L1 entries by ID; it does not know which store serves them.
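The storage swap can be illustrated with two interchangeable backends behind one retrieval call. Both classes are toy stand-ins (neither is a vector or graph store), and the names `DictStore`, `TupleScanStore`, and `drill_down` are hypothetical.

```python
class DictStore:
    """Backend A: entries held in a dict keyed by id."""

    def __init__(self, entries):
        self._entries = dict(entries)

    def get(self, entry_id):
        return self._entries[entry_id]


class TupleScanStore:
    """Backend B: same contract, different internals (a linear scan over pairs)."""

    def __init__(self, entries):
        self._rows = list(entries.items())

    def get(self, entry_id):
        for key, value in self._rows:
            if key == entry_id:
                return value
        raise KeyError(entry_id)


def drill_down(store, entry_id):
    """Cognitive-layer retrieval: asks by id and never inspects the backend."""
    return store.get(entry_id)


DATA = {"l2_grp_ftp_10.10.10.5": "vsftpd 2.3.4 on 10.10.10.5:21, CVE-2011-2523 candidate"}
for backend in (DictStore(DATA), TupleScanStore(DATA)):
    print(drill_down(backend, "l2_grp_ftp_10.10.10.5"))
```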
14.5 Deployment Adaptability
The deployment topology is decoupled from the architecture. The same codebase runs as a single process (development), a multi-service deployment on a single node (small team), or a distributed deployment across a cluster (production). The layer contracts are the same whether they are in-process function calls or network API calls. This means a team can start with a single-process deployment and scale to distributed as engagement volume grows, without re-architecting.
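The claim that contracts are identical in-process and over the network can be sketched with a typed protocol and two implementations. Everything here is illustrative: `StateLayer`, `InProcessState`, and `RemoteState` are hypothetical names, and a JSON round trip stands in for a real network hop.

```python
import json
from typing import Protocol


class StateLayer(Protocol):
    def add_finding(self, finding: dict) -> int: ...


class InProcessState:
    """Single-process mode: a direct method call."""

    def __init__(self):
        self.findings = []

    def add_finding(self, finding):
        self.findings.append(finding)
        return len(self.findings)


class RemoteState:
    """Distributed mode: same contract, but the call crosses a serialized boundary.

    A JSON round trip stands in for the network hop; no real transport is used.
    """

    def __init__(self, backend):
        self._backend = backend

    def add_finding(self, finding):
        wire = json.dumps({"method": "add_finding", "finding": finding})
        request = json.loads(wire)
        return self._backend.add_finding(request["finding"])


def record(state: StateLayer, finding: dict) -> int:
    """Caller code is the same for both deployment modes."""
    return state.add_finding(finding)


local = InProcessState()
remote = RemoteState(InProcessState())
finding = {"title": "SQL injection in /api/users?id="}
print(record(local, finding), record(remote, finding))  # 1 1
```

One practical consequence of this design is that every value crossing a layer contract must be serializable, even in single-process mode, or the move to a distributed deployment will surface hidden coupling.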
15. Comparison with Monolithic AI Pentesting Approaches
The architecture presented here stands in deliberate contrast to the monolithic approach taken by existing AI pentesting frameworks: PentestGPT, HackingBuddyGPT, CAI, and PentAGI. These systems share a common pattern: a single LLM agent (or a small fixed set of agents) is given a list of tool schemas and asked to run the entire engagement in a free-form tool-calling loop. The differences between them are in prompt engineering and tool packaging, not in architecture.
15.1 Architectural Comparison
15.2 The Monolithic Failure Cascade
The monolithic approach fails not because individual components are bad, but because the architecture creates failure cascades — a weakness in one area amplifies weaknesses in others:
The layered architecture breaks these cascades by isolating each concern. Context drowning is solved by Tri-Con before it reaches the LLM. Cost explosion is solved by the Token Engine before it reaches the budget. Wrong tool selection is solved by the orchestrator before it reaches the target. Linear workflow is solved by the phase map before it reaches the engagement timeline. Tool extensibility is solved by the skill platform before it reaches the engineering backlog. Observability is solved by the interface layer before it reaches the operator's blind spot.
15.3 When Monolithic Is Sufficient
The monolithic approach is not universally wrong. It is sufficient for:
The layered architecture is necessary when the requirements exceed what a monolithic approach can sustain: multi-hour engagements, multi-domain testing, economic viability for continuous operation, regulatory audit requirements, multi-agent parallelism, and continuous capability growth without regression risk.
16. Conclusion
Architecting an autonomous pentesting system is a distributed systems problem with a cognitive core. The architecture presented here — six layers with clear responsibilities, five innovations precisely positioned within them, structured state, hierarchical memory, multi-agent orchestration, and strong safety enforcement — is not the only possible design, but it is one that has been validated by the constraints of real engagements: context limits, cost pressure, safety requirements, and the need for observability.
The system is complex, but the complexity is organized. Each layer can be built, tested, and improved independently. Each innovation can be understood in isolation and composed into the whole. The cognitive layer — the part that gets the most attention — is perhaps 20% of the total system by code volume. The other 80% — state, memory, execution, orchestration, interfaces — is what makes the cognitive layer's decisions reliable, safe, and useful.
The five innovations compose into a system that is greater than the sum of its parts. Tri-Con makes long-horizon reasoning feasible. The Token Engine makes it affordable. The Custom Orchestrator makes it competent. The Phase Map makes it adaptable. The Skill Platform makes it extensible. Together, they transform autonomous pentesting from a research demonstration into a production-capable system.
In the next paper, we zoom into the orchestration layer and examine the design patterns that coordinate AI security agents at scale — the orchestration topologies, delegation protocols, and coordination algorithms that make multi-agent pentesting reliable under real-world conditions.
This whitepaper is part of a series on autonomous penetration testing with AI agents. For the full series index and related work, see the accompanying documentation.