K² Adversarial Context Demo: Multimodal Threat Intelligence Case Study

K² ships scoped threat context to your adversarial generator.

Adversarial generators need scoped, current, traceable context. K² is the knowledge layer that supplies it. This case study shows how an existing generator can retrieve threat patterns, policy boundaries, target-system context, and past-finding precedent with citations on every fact and lineage on every evaluation-plan row.

This is a demo of what can be built with K², not a standalone red-teaming or AI-security product. Keep PyRIT, NeMo Guardrails, garak, commercial platforms, internal harnesses, scorers, guardrails, and GRC workflows in place; K² supplies governed, cited context before those tools act.

Try the demo in 10 minutes

Run a pilot on your stack

K² stays under the test generator. Your red-team platform, scorer, guardrails, and drift detection remain in place. MCP-capable clients, including Claude Agent SDK, call get_evaluation_plan.

Context path

Role-separated corpora: Threats, policy, target facts, and findings keep their roles.

Named agents: Each worker retrieves only the evidence it owns.

MCP boundary: The external generator receives a cited plan before it acts.

Planned public workload

~150 public threat patterns in the planned workload

3 context-layer arms, with no commercial vendor comparison

4 lineage fields checked on each plan entry

These are methodology anchors, not outcome claims. The published v1 should replace them with frozen run results.

K² primitives

Collections, metadata filters, Agents, Knowledge Feed, Pipeline, and MCP keep context scoped before generation.

MCP integrations

Works with MCP-capable agent and evaluation stacks, including Claude Agent SDK.

K² exposes the read-only get_evaluation_plan tool. The calling stack keeps ownership of the agent loop, adversarial generation, execution, scoring, guardrails, and review workflow.
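As a rough illustration of what an MCP client sends when it calls this tool, the sketch below wraps the tool name and arguments in a JSON-RPC 2.0 `tools/call` request, which is the envelope MCP uses for tool invocation. The argument names are taken from the integration contract on this page; the helper name `build_tools_call` and the request-id counter are assumptions made for the example, and a real client SDK would normally build this envelope for you.

```python
import json
from itertools import count

# Hypothetical request-id source; any unique id per request works.
_request_ids = count(1)


def build_tools_call(name: str, arguments: dict) -> dict:
    """Wrap a tool name and its arguments in an MCP JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names mirror the integration contract shown on this page.
request = build_tools_call(
    "get_evaluation_plan",
    {
        "target_id": "target-supportbot-v2.3",
        "modalities": ["text", "image"],
        "environment": "staging",
        "max_plan_entries": 8,
        "include_watchlist": True,
    },
)

print(json.dumps(request, indent=2))
```

The calling stack still decides what to do with the returned plan; this only shows the shape of the read-only request.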

Claude Agent SDK stdio MCP, PyRIT, NeMo Guardrails, garak, internal harnesses, AI security platforms

Architecture first

K² connects adversarial generators to scoped, cited threat context.

The topology mirrors the sibling context demo: role-separated corpora, named agents, Knowledge Feed, Pipeline, MCP boundary, and the customer's existing tool on the right side of the diagram.

Customer knowledge: Threats, policy, target facts, findings

K² primitives: Collections, Agents, Feed, Pipeline

MCP server: Cited evaluation plan JSON

Existing generator: PyRIT, NeMo, internal, or vendor

Customer stack: Execution, scoring, reviewer, GRC

Role-separated adversarial context architecture

K² keeps retrieved facts role-aware, composes a plan through bounded agents, and hands that plan to an existing red-team tool over MCP.

K² is the knowledge layer underneath adversarial testing

This demo is intentionally complementary. AI security platforms keep ownership of the adversarial outcome; K² supplies the scoped context that makes their generation step easier to review.

What this demo claims

K² makes adversarial test generation more scoped, current, and traceable by improving pre-generation context.
Every candidate test seed can trace back to the threat pattern, policy clause, and target fact that justified it.
Threat intelligence stays fresh through Knowledge Feeds without rewriting the customer generator.

What this demo does not claim

K² does not score adversarial outcomes.
K² does not enforce runtime guardrails.
K² does not detect production drift on its own.
K² does not replace an eval framework, red-team platform, or AI security product.
K² does not produce compliance attestation.

For AI security platforms: K² is infrastructure under your product

Your scanner, scorer, reviewer UI, reporting workflow, and customer relationship stay yours. K² improves the context payload before your generator creates tests.

What a partner can sell with K² underneath

Cited test rationale, fresh threat-feed inclusion, policy-scope explanations, and regression-targeting without rebuilding a knowledge layer inside the AI security product.

Cited test rationale: Every plan line can explain which public threat pattern, policy clause, target fact, and finding justified it.

Fresh threat inclusion: New public threat intel and resurfaced regressions can enter the next plan without changing the partner generator.

Policy-scope explanations: Reviewers see why a seed is in scope, review-required, or excluded before the platform generates inputs.

Reusable context substrate: The buyer-facing AI security product keeps its UI, workflows, scoring, reports, and commercial relationship.

PyRIT: Open-source illustrative generator for the quickstart; K² hands it a cited plan before prompt generation.

NeMo Guardrails: Scenario and red-team flows can consume scoped plan rows while keeping execution downstream.

garak: Scanner-style probes can use K² lineage to explain why a probe set applies to a target.

Internal harness: Platform teams can keep bespoke runners and attach K² citations to plan rows.

Commercial AI security platform: Partner products can brand the workflow while K² supplies the knowledge substrate.

The core insight: adversarial facts have roles

A generator can use a fact more precisely when it knows whether that fact is a threat pattern, policy boundary, target-system fact, or past finding. K² preserves that role through collections, filters, agents, and citations.

Threat patterns propose risk: Public techniques, modalities, model classes, and origin citations.

Policy scope bounds the plan: Environment rules, severity definitions, and review gates.

Target facts create relevance: Accepted modalities, tools, prompts, model version, and surface.

Past findings carry memory: Prior success, mitigation, resurfacing, and regression status.

Business-relevant storyline: enterprise support copilot

The demo keeps SupportBot synthetic, but the buyer pattern is concrete: a support copilot with RAG, tools, uploaded images, PII boundaries, and repeated regression risk.

Enterprise support copilot: RAG, ticket creation, uploaded screenshots, and customer-policy boundaries create a realistic multimodal test target.

Why it matters: Support copilots naturally combine PII rules, tool-use restrictions, retrieval injection, image context, and regression history.

Pilot path: Replace the synthetic SupportBot profile with one production support assistant already covered by the customer red-team workflow.

Why K² fits: The same plan contract can serve customer-support, finance, healthcare, and SOC-assistant pilots.

Concrete integration contract

The partner or customer tool calls one plan endpoint and receives structured context. It still owns prompt generation, execution, scoring, guardrails, and reporting.

{
  "tool": "get_evaluation_plan",
  "arguments": {
    "target_id": "target-supportbot-v2.3",
    "modalities": ["text", "image"],
    "environment": "staging",
    "max_plan_entries": 8,
    "include_watchlist": true
  }
}

{
  "evaluation_plan": [
    {
      "seed_id": "seed-0142-regression",
      "generator_hint": "indirect-prompt-injection",
      "lineage": {
        "threat": "threat-2024-0142",
        "policy": "policy-fin-001",
        "target_fact": "target-supportbot-v2.3.tools_enabled",
        "past_finding": "finding-supportbot-2026-03-018"
      },
      "citations": [
        {"id": "threat-2024-0142", "source": "public-research"},
        {"id": "policy-fin-001", "source": "customer-policy"},
        {"id": "target-supportbot-v2.3", "source": "customer-target"},
        {"id": "finding-supportbot-2026-03-018", "source": "customer-finding"}
      ],
      "boundary": "customer_generator_creates_final_inputs"
    }
  ]
}
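One way a consuming tool might check the contract above before generating anything is to confirm that each plan entry carries the four lineage fields and that every lineage value is covered by a citation. The sketch below is an assumption about how such a check could look, not part of K²: the function name `check_plan_entry` and the rule that a dotted lineage value (such as `target-supportbot-v2.3.tools_enabled`) is covered by a citation for its parent record are both illustrative choices.

```python
REQUIRED_LINEAGE_FIELDS = ("threat", "policy", "target_fact", "past_finding")


def check_plan_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one plan entry (empty means it passed)."""
    problems = []
    lineage = entry.get("lineage", {})
    cited_ids = [c["id"] for c in entry.get("citations", []) if "id" in c]
    for field in REQUIRED_LINEAGE_FIELDS:
        value = lineage.get(field)
        if not value:
            problems.append(f"missing lineage field: {field}")
            continue
        # Assumption: a lineage value may point at a sub-field of a cited record,
        # e.g. "target-supportbot-v2.3.tools_enabled" under "target-supportbot-v2.3".
        covered = any(value == cid or value.startswith(cid + ".") for cid in cited_ids)
        if not covered:
            problems.append(f"no citation covers {field}={value}")
    return problems


# The sample entry from the response shown above.
entry = {
    "seed_id": "seed-0142-regression",
    "generator_hint": "indirect-prompt-injection",
    "lineage": {
        "threat": "threat-2024-0142",
        "policy": "policy-fin-001",
        "target_fact": "target-supportbot-v2.3.tools_enabled",
        "past_finding": "finding-supportbot-2026-03-018",
    },
    "citations": [
        {"id": "threat-2024-0142", "source": "public-research"},
        {"id": "policy-fin-001", "source": "customer-policy"},
        {"id": "target-supportbot-v2.3", "source": "customer-target"},
        {"id": "finding-supportbot-2026-03-018", "source": "customer-finding"},
    ],
    "boundary": "customer_generator_creates_final_inputs",
}

problems = check_plan_entry(entry)
print(problems or "entry passes the lineage check")
```

A check like this runs on the customer side of the MCP boundary, so it does not change what K² returns.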

What K² should answer before generation

Before the red-team tool generates a single adversarial input, K² should answer questions like these with citations from the customer's threat, policy, target, and findings corpora.

Ask K²: Which threat patterns apply to a vision-capable chat assistant on text-plus-image inputs?

Ask K²: Which severity band does each pattern carry under our policy?

Ask K²: Which patterns have already been mitigated for this target?

Ask K²: Which past findings are due for regression-style re-testing?

Ask K²: Which proposed seeds would step outside the agreed scope?

K² primitives value map

Each card maps one K² primitive to the adversarial-context value it delivers.

Collections

Separate roles stay queryable

Threats, policy, target context, and findings are indexed as different corpora instead of one prompt pile.

Demo corpora: adv-threats, adv-policy, adv-target, adv-findings.
Reviewers can see why each fact was retrieved.
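To make the role separation concrete, here is a minimal in-memory sketch in which each retrieved fact keeps the collection it came from, so its role can be derived rather than guessed. The collection names are the demo corpora listed above; the role labels, the `RetrievedFact` class, and `group_by_role` are hypothetical names for illustration and do not describe the K² API.

```python
from dataclasses import dataclass

# Collection names come from the demo; the role labels are illustrative.
COLLECTION_ROLES = {
    "adv-threats": "threat_pattern",
    "adv-policy": "policy_boundary",
    "adv-target": "target_fact",
    "adv-findings": "past_finding",
}


@dataclass(frozen=True)
class RetrievedFact:
    fact_id: str
    collection: str
    text: str

    @property
    def role(self) -> str:
        # The role is derived from the source collection, never stored separately.
        return COLLECTION_ROLES[self.collection]


def group_by_role(facts: list) -> dict:
    """Group fact ids under their role so a reviewer can see why each was retrieved."""
    grouped = {role: [] for role in COLLECTION_ROLES.values()}
    for fact in facts:
        grouped[fact.role].append(fact.fact_id)
    return grouped


facts = [
    RetrievedFact("threat-2024-0142", "adv-threats", "placeholder text"),
    RetrievedFact("policy-fin-001", "adv-policy", "placeholder text"),
]
grouped = group_by_role(facts)
print(grouped)
```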

Metadata filters

Scope before generation

Modality, model class, severity, environment, and mitigation filters narrow retrieval before the generator sees context.

Example: text plus image, vlm, staging, high severity.
Less irrelevant context reaches the prompt.
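The filter example above can be sketched as a plain predicate over record metadata. This is an in-memory illustration under stated assumptions, not K²'s filter syntax: the field names (`modalities`, `model_class`, `environment`, `severity`), the sample records, and the `matches` helper are all hypothetical.

```python
def matches(record: dict, filters: dict) -> bool:
    """Return True when the record satisfies every filter.

    A list-valued filter requires all listed values to appear in the
    record's list field; a scalar filter requires equality.
    """
    for key, wanted in filters.items():
        actual = record.get(key)
        if isinstance(wanted, (list, set, tuple)):
            if not set(wanted) <= set(actual or []):
                return False
        elif actual != wanted:
            return False
    return True


# Hypothetical records standing in for indexed threat patterns.
records = [
    {"id": "t-1", "modalities": ["text", "image"], "model_class": "vlm",
     "environment": "staging", "severity": "high"},
    {"id": "t-2", "modalities": ["text"], "model_class": "llm",
     "environment": "staging", "severity": "high"},
    {"id": "t-3", "modalities": ["text", "image"], "model_class": "vlm",
     "environment": "staging", "severity": "low"},
]

# Mirrors the example above: text plus image, vlm, staging, high severity.
filters = {
    "modalities": ["text", "image"],
    "model_class": "vlm",
    "environment": "staging",
    "severity": "high",
}

scoped = [r["id"] for r in records if matches(r, filters)]
print(scoped)
```

Only the record that satisfies all four filters survives, which is the sense in which scoping happens before the generator sees any context.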

Hybrid search

Semantic plus exact match

K² can match attack-family language while preserving exact tags, benchmark names, and citation identifiers.

Useful for OWASP categories and paper-derived technique names.
Prevents taxonomy terms from being washed out.
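As a toy illustration of why exact matching matters alongside fuzzy matching, the sketch below adds an exact-tag bonus to a fuzzy text score. Plain token overlap (Jaccard) stands in for semantic similarity here; K²'s actual ranking is not described on this page, and the tag strings, documents, weight, and function names are all assumptions made for the example.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fuzzy_score(query: str, text: str) -> float:
    """Jaccard token overlap, used here as a stand-in for semantic similarity."""
    q, t = tokens(query), tokens(text)
    union = q | t
    return len(q & t) / len(union) if union else 0.0


def hybrid_rank(query: str, exact_tags: set[str], docs: list[dict],
                tag_weight: float = 1.0) -> list[str]:
    """Rank doc ids by fuzzy score plus a bonus for each exactly matching tag."""
    scored = []
    for doc in docs:
        exact_hits = len(exact_tags & set(doc["tags"]))
        score = fuzzy_score(query, doc["text"]) + tag_weight * exact_hits
        scored.append((score, doc["id"]))
    # Highest score first; ties broken by id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


# Hypothetical documents and tags.
docs = [
    {"id": "d1", "text": "indirect prompt injection through retrieved documents",
     "tags": ["family:prompt-injection"]},
    {"id": "d2", "text": "prompt injection hidden in an uploaded image",
     "tags": ["family:prompt-injection", "modality:image"]},
    {"id": "d3", "text": "training data poisoning",
     "tags": ["family:poisoning"]},
]

ranking = hybrid_rank("injection via image", {"modality:image"}, docs)
print(ranking)
```

Because the tag comparison is an exact set intersection, a taxonomy term either matches or it does not, regardless of how the fuzzy score treats similar wording.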

Agents

Bounded context workers

Threat, policy, target, and strategist agents each answer one part of the plan with explicit corpus access.

The strategist consumes cited upstream outputs.
Adversarial input generation remains external.
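A minimal sketch of the bounded-worker idea follows, assuming each agent is given an explicit allow-list of collections and is refused anything else. The `BoundedAgent` class, the in-memory `store`, and the choice to give the strategist no direct corpus access are illustrative assumptions, not the K² Agents API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedAgent:
    name: str
    allowed_collections: frozenset

    def retrieve(self, store: dict, collection: str) -> list:
        """Read a collection only if this agent has been granted access to it."""
        if collection not in self.allowed_collections:
            raise PermissionError(f"{self.name} may not read {collection}")
        return list(store.get(collection, []))


# Hypothetical in-memory stand-in for the demo corpora.
store = {
    "adv-threats": ["threat-2024-0142"],
    "adv-policy": ["policy-fin-001"],
    "adv-target": ["target-supportbot-v2.3"],
}

threat_agent = BoundedAgent("threat", frozenset({"adv-threats"}))
policy_agent = BoundedAgent("policy", frozenset({"adv-policy"}))
target_agent = BoundedAgent("target", frozenset({"adv-target"}))
# Assumption: the strategist reads no corpus directly and only sees upstream outputs.
strategist = BoundedAgent("strategist", frozenset())

upstream = {
    "threat": threat_agent.retrieve(store, "adv-threats"),
    "policy": policy_agent.retrieve(store, "adv-policy"),
    "target": target_agent.retrieve(store, "adv-target"),
}
print(upstream)
```

Asking `threat_agent` for `adv-policy` raises `PermissionError`, which is the property that keeps each worker's evidence attributable.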

Knowledge Feed

Freshness without rework

New public threats and resurfaced regressions promote into scoped corpora so future plans include them automatically.

Public threat intel lands in adv-threats.
Regression watchlist tags keep old failures visible.
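The promotion step can be pictured as a small routing function: each incoming feed item is placed in the collection that matches its kind, and resurfaced regressions receive a watchlist tag. This is a sketch under assumptions. The item kinds, the `regression-watchlist` tag string, the choice of `adv-findings` as the destination for regressions, and the `promote` function are hypothetical and do not describe how Knowledge Feed is configured.

```python
def promote(item: dict, corpora: dict) -> str:
    """Append a feed item to its scoped collection and return the collection name."""
    tags = set(item.get("tags", []))
    if item["kind"] == "public_threat":
        collection = "adv-threats"
    elif item["kind"] == "resurfaced_regression":
        # Assumption: regressions live with findings and carry a watchlist tag.
        collection = "adv-findings"
        tags.add("regression-watchlist")
    else:
        raise ValueError(f"unknown feed item kind: {item['kind']}")
    corpora.setdefault(collection, []).append(
        {"id": item["id"], "tags": sorted(tags)}
    )
    return collection


corpora = {}
promote({"id": "threat-new-001", "kind": "public_threat", "tags": ["image"]}, corpora)
promote({"id": "finding-old-007", "kind": "resurfaced_regression"}, corpora)
print(corpora)
```

Because promotion only changes what sits in the corpora, the generator that later consumes a plan needs no code change to pick up the new items.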

Pipeline + MCP

Auditable handoff

A declared topology exposes one MCP endpoint that hands a cited evaluation plan to the customer tool.

Tool contract: get_evaluation_plan(...).
Downstream scorer and reviewer stay in place.

Two ways to act on the demo

Developers need a short reproducible setup. Enterprise buyers need a controlled pilot against their current red-team workflow. The same context layer supports both paths.

Developer quickstart: Try the demo in 10 minutes

Load the public corpora bundle, connect the MCP server, and inspect one cited evaluation plan.

Enterprise pilot: Run a pilot on your stack

Freeze 10 to 20 plans, compare context fidelity, and keep your existing scorer unchanged.

Jump to the boundary proof

Inspect how the plan line, citation panel, and benchmark framing keep K² below the adversarial outcome.