AI-Driven · Agentic

Pen Testing That Reasons Like an Attacker

Scanners run signatures. Human testers run tools. Scrutex AI agents think — observing your environment, reasoning about what to try next, chaining vulnerabilities into real attack paths, and adapting when one route fails. Exactly what a skilled adversary does. At machine speed.

Intelligence-led, not signature-led · Multi-step attack chain simulation · Adaptive — re-routes around blocked paths
The fundamental difference

Agents reason. Scripts don’t.

Most automated pen testing tools are smarter scripts. Scrutex deploys agents that make decisions, and that changes what they find.

Agentic AI (Scrutex)

Observes, decides, adapts, persists

The agent has goals, not a script. At every step it evaluates what it found, decides what to try next, re-routes around blocked paths, and keeps going — exactly like a skilled attacker working a target.

  • Discovers an open port → reasons about what service is likely → probes intelligently
  • One exploit fails → pivots to a different technique for the same goal
  • Finds a low-severity issue → evaluates if it chains to something critical
  • Achieves foothold → autonomously attempts privilege escalation and lateral movement
  • Encounters a WAF → adapts payload encoding and retry strategy
Traditional Automation

Executes a predefined list and stops

Traditional scanners run a fixed playbook. When a check returns nothing, they move on. They cannot chain findings, adapt strategy, or reason about context.

  • ✗ No reasoning between steps — each check is independent
  • ✗ Cannot chain low-severity findings into critical attack paths
  • ✗ Does not adapt when an approach is blocked
  • ✗ Stops at the perimeter — no post-exploitation reasoning
  • ✗ Reports vulnerabilities in isolation, not as attack narratives
How agents work

Five phases. One continuous reasoning loop.

At every phase, the agent reasons about what it found and decides what to do next. There is no fixed playbook — only goals, observations, and decisions.
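
As a rough illustration of that loop, here is a toy sketch in Python. It is not the Scrutex engine: the action names, the preconditions, and the simulated results are all invented for this example, and nothing touches a network.

```python
from dataclasses import dataclass, field


@dataclass
class AgentState:
    goal: str
    observations: list = field(default_factory=list)
    tried: set = field(default_factory=set)


# Hypothetical action catalogue: each action names the observation it depends on.
ACTIONS = {
    "enumerate_subdomains": None,
    "probe_staging_api": "staging env exposed",
    "test_jwt_handling": "JWT in cookie",
}

# Simulated environment (made-up data): what each action would reveal.
SIMULATED_RESULTS = {
    "enumerate_subdomains": "staging env exposed",
    "probe_staging_api": "JWT in cookie",
    "test_jwt_handling": "goal reached",
}


def decide(state):
    """Pick the first untried action whose precondition has been observed."""
    for action, needs in ACTIONS.items():
        if action in state.tried:
            continue
        if needs is None or needs in state.observations:
            return action
    return None  # options exhausted


def run(state, max_steps=10):
    """Observe, decide, act; stop when the goal is reached or nothing is left to try."""
    trail = []
    for _ in range(max_steps):
        action = decide(state)
        if action is None:
            break
        state.tried.add(action)
        result = SIMULATED_RESULTS[action]
        state.observations.append(result)
        trail.append((action, result))
        if result == "goal reached":
            break
    return trail


trail = run(AgentState(goal="reach admin panel"))
for action, result in trail:
    print(f"{action}: {result}")
```

The point of the sketch is the shape of the loop: each decision depends on what earlier steps observed, so the order of actions is not fixed in advance.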

01 / Reconnaissance

Observe & Map

Agent enumerates the full external attack surface. It reasons about what each finding implies — not just whether a port is open, but what that port likely runs and why it matters.

discovered: 47 subdomains
inferred: staging env exposed
decision: prioritise api.staging
02 / Discovery

Enumerate & Hypothesise

The agent forms hypotheses about likely weaknesses based on what it observes — tech stack, auth patterns, endpoint structure. It prioritises what to test first based on impact probability.

observed: JWT in cookie
hypothesis: alg confusion
decision: test HS/RS256 swap
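
The observe-then-hypothesise step can be sketched for the JWT case. This is a minimal illustration under stated assumptions: the `hypotheses_for` rules and the demo token are invented here, and the code only reads a token header; it does not forge or send anything.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_header(token: str) -> dict:
    """Return the decoded header (first dot-separated segment) of a JWT."""
    return json.loads(b64url_decode(token.split(".")[0]))


def hypotheses_for(header: dict) -> list:
    """Illustrative rules only: map the observed 'alg' to ideas worth testing."""
    alg = header.get("alg", "")
    ideas = []
    if alg.startswith(("RS", "ES", "PS")):
        ideas.append("alg confusion: does the verifier pin the expected algorithm?")
    if alg.lower() == "none":
        ideas.append("unsigned token: is a signature actually required?")
    return ideas


def make_demo_token(header: dict) -> str:
    """Build a placeholder token for the demo; the signature is not real."""
    def enc(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc(header)}.{enc({'sub': 'demo'})}.signature-placeholder"


token = make_demo_token({"alg": "RS256", "typ": "JWT"})
header = jwt_header(token)
ideas = hypotheses_for(header)
print(header["alg"], ideas)
```

An asymmetric algorithm in the header is what makes the HS/RS256 swap worth testing first; a token already signed with HS256 would produce no such hypothesis in this sketch.
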
03 / Exploitation

Attempt & Adapt

The agent attempts its hypothesis. If it fails, it reasons about why and selects the next most likely approach. It never stops at the first failure — it adapts until it either succeeds or exhausts options.

attempt: RS256→HS256 swap
result: WAF blocked
adapt: encode + retry → SUCCESS
04 / Escalation

Chain & Pivot

With a foothold established, the agent reasons about what else it can reach — privileged APIs, internal data, other users’ sessions. It builds a chain, turning individual findings into a full attack narrative.

escalated: user → admin
discovered: /internal/admin
chained: IDOR + priv-esc path
05 / Reporting

Narrate & Evidence

Every finding is reported as a full attack narrative — the reasoning chain, each step taken, evidence captured, impact assessed, and remediation steps provided. MITRE ATT&CK techniques tagged throughout.

severity: Critical (CVSS 9.8)
technique: T1550.001
evidence: session + screenshot
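
One way to picture a narrative-style finding is as a small record that carries its own steps and evidence. The sketch below is an assumption about shape, not the Scrutex report format: the `Finding` and `Step` types and the finding title are hypothetical. The severity bands follow the published CVSS v3.x qualitative rating scale.

```python
from dataclasses import dataclass


def cvss_rating(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS base scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


@dataclass
class Step:
    action: str
    result: str


@dataclass
class Finding:
    title: str
    cvss: float
    technique: str  # MITRE ATT&CK technique ID
    steps: list
    evidence: list

    def narrative(self) -> str:
        """Render the finding as an ordered story rather than a bare ID."""
        head = f"{self.title} [{cvss_rating(self.cvss)}, CVSS {self.cvss}] ({self.technique})"
        lines = [head]
        for number, step in enumerate(self.steps, 1):
            lines.append(f"  {number}. {step.action} -> {step.result}")
        lines.append("  evidence: " + ", ".join(self.evidence))
        return "\n".join(lines)


finding = Finding(
    title="JWT algorithm confusion on staging API",  # hypothetical title
    cvss=9.8,
    technique="T1550.001",
    steps=[Step("RS256 to HS256 swap", "WAF blocked"),
           Step("encode + retry", "success")],
    evidence=["session", "screenshot"],
)
print(finding.narrative())
```
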
Agent capabilities

What the agent can do autonomously

These are the reasoning and execution capabilities that separate agentic testing from automated scanning.

CAP-01

Contextual reasoning

The agent understands what it finds — inferring technology stacks, auth patterns, and likely weaknesses from observable signals, not just matching signatures.

CAP-02

Vulnerability chaining

The agent connects low-severity findings into high-impact attack paths. An information disclosure plus a misconfigured API plus a predictable token becomes a full account takeover chain.
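
Chaining can be pictured as a path search: each finding needs some starting position and grants a new one. The sketch below is a simplified illustration, not the Scrutex algorithm. The findings, severities, and state names are made up, and each state is a single label rather than a set of capabilities.

```python
from collections import deque

# (finding, severity, requires, grants): illustrative, made-up data
FINDINGS = [
    ("information disclosure", "low", "anonymous", "user ids known"),
    ("misconfigured API", "medium", "user ids known", "reset endpoint reachable"),
    ("predictable token", "low", "reset endpoint reachable", "account takeover"),
    ("verbose error page", "low", "anonymous", "stack trace known"),
]


def find_chain(findings, start, goal):
    """Breadth-first search for the shortest sequence of findings from start to goal."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for name, _severity, requires, grants in findings:
            if requires == state and grants not in seen:
                seen.add(grants)
                queue.append((grants, path + [name]))
    return None  # no chain reaches the goal


chain = find_chain(FINDINGS, "anonymous", "account takeover")
print(" -> ".join(chain))
```

No single finding here is rated above medium, yet the three together reach account takeover, while the verbose error page leads nowhere and drops out of the chain.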

CAP-03

Adaptive strategy

When a technique is blocked — by a WAF, rate limiter, or patched endpoint — the agent selects the next best approach rather than marking the check failed and moving on.

CAP-04

Goal-directed persistence

The agent is given objectives — compromise the admin panel, access customer data, escalate to root — and keeps working toward those goals across multiple sessions if needed.

CAP-05

Multi-surface coverage

A single agent session covers external attack surface, web applications, APIs, and authentication flows — correlating findings across surfaces to find cross-system attack paths.

CAP-06

TTP-driven approach

Agents are loaded with the TTPs of threat actors targeting your sector. They test what your specific adversaries are known to try, not just generic checks.

CAP-07

Attack narrative generation

Every finding is reported as a step-by-step attack narrative — the agent’s reasoning, what it tried, what worked, what the attacker could do next. Not a CVE list — a story.

CAP-08

Guardrailed autonomy

You define the scope, rules of engagement, and exclusion zones. The agent operates with full autonomy within those bounds — no manual approvals for each step.
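
A guardrail of this kind can be thought of as a check that runs before every action. The sketch below is a minimal example under assumptions: the `RulesOfEngagement` type and its fields are hypothetical, the domains and addresses are reserved documentation values, and exclusions match exact hosts only.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class RulesOfEngagement:
    allowed_domains: list
    allowed_networks: list
    excluded_hosts: set = field(default_factory=set)

    def permits(self, target: str) -> bool:
        """Return True only if the target is in scope and not explicitly excluded."""
        target = target.lower().rstrip(".")
        if target in self.excluded_hosts:
            return False
        try:
            address = ipaddress.ip_address(target)
        except ValueError:
            # Not an IP literal: treat it as a hostname and match on domain suffix.
            return any(
                target == domain or target.endswith("." + domain)
                for domain in self.allowed_domains
            )
        return any(
            address in ipaddress.ip_network(network)
            for network in self.allowed_networks
        )


roe = RulesOfEngagement(
    allowed_domains=["example.com"],
    allowed_networks=["203.0.113.0/24"],
    excluded_hosts={"payments.example.com"},
)

for target in ["api.staging.example.com", "payments.example.com",
               "203.0.113.7", "198.51.100.1"]:
    print(target, "in scope" if roe.permits(target) else "out of scope")
```

Checking the exclusion list first means an excluded host stays off limits even when it sits inside an allowed domain or network.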

CAP-09

Continuous re-testing

After remediation, the agent re-runs the same attack chains to verify fixes. No scheduling delays, no billable hours — re-test runs on demand in hours, not weeks.
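
Re-testing a chain amounts to replaying its recorded steps in order and noting where it now breaks. The sketch below assumes a hypothetical `ChainStep` type whose `check` callable reports whether the step still succeeds; the checks here are stubs, not real probes.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ChainStep:
    name: str
    check: Callable[[], bool]  # True if the step still succeeds against the target


def retest(chain):
    """Replay steps in order and stop at the first one that no longer succeeds."""
    reproduced = []
    for step in chain:
        if not step.check():
            return {"chain_broken": True, "blocked_at": step.name,
                    "reproduced": reproduced}
        reproduced.append(step.name)
    return {"chain_broken": False, "blocked_at": None, "reproduced": reproduced}


# Stubbed checks standing in for real probes (illustrative only).
chain = [
    ChainStep("information disclosure", lambda: True),
    ChainStep("misconfigured API", lambda: False),  # pretend this was fixed
    ChainStep("predictable token", lambda: True),
]

result = retest(chain)
print(result)
```

A broken chain does not prove every step is fixed: in this example the first step still reproduces, which is worth reporting separately.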

Full test coverage

Every surface. Every standard.

Coverage across OWASP, MITRE ATT&CK, and API security, with partially covered areas marked as such and findings mapped to the compliance frameworks your regulators require.

OWASP Top 10 — Web Applications

2021 · All 10 categories

A01 — Broken Access Control · Covered
A02 — Cryptographic Failures · Covered
A03 — Injection (SQL, NoSQL, Cmd, LDAP) · Covered
A04 — Insecure Design · Covered
A05 — Security Misconfiguration · Covered
A06 — Vulnerable & Outdated Components · Covered
A07 — Identification & Auth Failures · Covered
A08 — Software & Data Integrity · Partial
A09 — Logging & Monitoring Failures · Covered
A10 — Server-Side Request Forgery · Covered

OWASP API Security Top 10

2023 · REST, GraphQL, gRPC

API1 — Broken Object Level Authorisation · Covered
API2 — Broken Authentication · Covered
API3 — Broken Object Property Auth · Covered
API4 — Unrestricted Resource Consumption · Covered
API5 — Broken Function Level Auth · Covered
API6 — Unrestricted Access to Sensitive Flows · Covered
API7 — Server-Side Request Forgery · Covered
API8 — Security Misconfiguration · Covered
API9 — Improper Inventory Management · Covered
API10 — Unsafe Consumption of APIs · Partial

MITRE ATT&CK Coverage

Enterprise + Cloud matrices

Reconnaissance (TA0043) · Covered
Resource Development (TA0042) · Partial
Initial Access (TA0001) · Covered
Execution (TA0002) · Covered
Persistence (TA0003) · Covered
Privilege Escalation (TA0004) · Covered
Defence Evasion (TA0005) · Partial
Credential Access (TA0006) · Covered
Discovery (TA0007) · Covered
Collection & Exfiltration (TA0009/10) · Covered

External Attack Surface

Continuous + on-demand

Subdomain enumeration & takeover · Covered
Open port & service exposure · Covered
TLS/SSL misconfiguration · Covered
Exposed credentials & sensitive files · Covered
Cloud misconfiguration (S3, blobs) · Covered
Email security (SPF, DKIM, DMARC) · Covered
Supply chain & third-party exposure · Partial

Findings mapped to your regulatory obligations

TIBAS / GL20 · iCAST / C-RAF · APRA CPS 234 · Essential Eight · DORA TLPT · SAMA FEER · MAS TRM · PCI DSS v4.0 · ISO 27001:2022 · SOC 2 Type II · NIST CSF 2.0 · CWE Top 25
Why agentic

The honest comparison

Scanners are fast but shallow. Manual testers are deep but slow and expensive. Agentic testing gives you depth at speed.

Capability · Vuln Scanner · Manual Pen Test · Scrutex Agentic
Reasons about findings in context · —
Chains vulnerabilities into attack paths · —
Adapts when a technique is blocked · —
TTP-driven by real threat actor intelligence · — · Sometimes
Attack narrative with reasoning trail · — · Varies
Full OWASP Web + API Top 10 · Partial
MITRE ATT&CK mapped output · — · Sometimes
Compliance-ready report · — · Varies
On-demand re-testing after fixes · —
Turnaround time · Hours · 4–12 weeks · Days
Approx. cost · $1K–$5K/yr · $20K–$150K · Subscription
Human-in-the-loop options

When you need a human behind the report

Regulatory submissions, board sign-off, and formal TLPT requirements sometimes need a qualified practitioner. We’ve built that in.

Expert co-sign

A qualified security practitioner reviews agent findings, validates critical attack chains, and co-signs the report for regulatory or board use.

Regulatory submission package

Full TIBAS / iCAST / APRA-formatted documentation prepared by our team, ready for direct submission to your regulator.

Purple team debrief

Joint attacker-defender session replaying every agent decision — your blue team sees exactly what the agent found and why, improving detection playbooks.

Verified closure

Post-remediation re-run with formal closure attestation confirming each identified risk has been addressed. On-demand, no scheduling delays.

See an agent test your environment live

Book a demo. We’ll run the agent against your external attack surface and show you its reasoning chain in real time.