Developer path

Try K² adversarial context in 10 minutes

Load the public adversarial corpora bundle, connect the K² MCP server to an existing generator, and inspect a cited evaluation plan before any adversarial input is generated.

This quickstart validates a platform pattern built with K² primitives. It is not a replacement red-team product; your generator, scorer, guardrails, and review workflow stay downstream.

Quickstart steps

The local MCP smoke test runs entirely from the public bundle. Live K² ingestion is not part of this quickstart and remains a pilot step.

Clone and inspect the public bundle

The bundle contains public threat-pattern records, example policy scope, a synthetic target profile, past findings, and MCP examples.

git clone https://github.com/knowledge2-ai/k2-adversarial-context-demo.git
cd k2-adversarial-context-demo

Run the local MCP server

The public bundle includes a dependency-free stdio MCP server that returns a cited plan from the sample corpora.

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"smoke","version":"0"}}}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}' \
  | python scripts/k2_adversarial_mcp_server.py
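The same smoke test can be driven from Python when a shell pipeline is inconvenient. The following is a minimal sketch, not part of the bundle: the helper names are hypothetical, and it assumes the server writes one JSON response per line to stdout and exits when stdin closes, as the shell pipeline above implies.

```python
import json
import subprocess
import sys


def build_smoke_messages():
    """Return the two JSON-RPC requests used by the shell smoke test."""
    initialize = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            "clientInfo": {"name": "smoke", "version": "0"},
        },
    }
    list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}
    return [initialize, list_tools]


def encode_lines(messages):
    """Serialize messages as newline-delimited JSON, one message per line."""
    return "".join(json.dumps(message) + "\n" for message in messages)


def parse_responses(stdout_text):
    """Parse newline-delimited JSON output, skipping blank lines."""
    return [json.loads(line) for line in stdout_text.splitlines() if line.strip()]


def run_smoke(script="scripts/k2_adversarial_mcp_server.py"):
    """Pipe the smoke messages into the server and return its parsed responses.

    Assumption: the server exits once stdin is closed.
    """
    completed = subprocess.run(
        [sys.executable, script],
        input=encode_lines(build_smoke_messages()),
        capture_output=True,
        text=True,
        check=True,
        timeout=30,
    )
    return parse_responses(completed.stdout)
```

Calling run_smoke() from the bundle root should yield one response per request id. Check that the tools/list result names get_evaluation_plan before wiring up a generator.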

Connect Claude Agent SDK or your generator

Claude Agent SDK, PyRIT, or any MCP-capable harness can call the same get_evaluation_plan tool. Grant only this read-only tool.

allowedTools: ["mcp__k2-adversarial-context__get_evaluation_plan"]

Call get_evaluation_plan

Ask K² for a plan scoped to the synthetic SupportBot target, the text and image modalities, and the staging environment.

get_evaluation_plan(
  target_id="target-supportbot-v2.3",
  modalities=["text", "image"],
  environment="staging"
)

Verify lineage survives handoff

Run one cited plan entry through your generator and confirm the threat, policy, target, and finding references remain attached downstream.

python scripts/pyrit_plan_smoke.py --plan-id seed-0142-regression --dry-run
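The check this step describes can also be expressed as a small assertion in your own harness. The sketch below assumes the downstream test case carries a "lineage" object with the same four keys as the cited response entry in the integration contract; the function names and the downstream record shape are hypothetical.

```python
# Lineage keys taken from the cited response entry in the integration contract.
REQUIRED_LINEAGE_KEYS = ("threat", "policy", "target_fact", "past_finding")


def missing_lineage(record):
    """Return the lineage keys that are absent or empty on a record."""
    lineage = record.get("lineage") or {}
    return [key for key in REQUIRED_LINEAGE_KEYS if not lineage.get(key)]


def lineage_survived(plan_entry, downstream_case):
    """True when every lineage reference reaches the downstream case unchanged."""
    if missing_lineage(plan_entry):
        # The plan entry itself is incomplete, so there is nothing to preserve.
        return False
    source = plan_entry["lineage"]
    target = downstream_case.get("lineage") or {}
    return all(target.get(key) == source[key] for key in REQUIRED_LINEAGE_KEYS)
```

A failing check means the generator or an adapter dropped or rewrote a reference during handoff, and missing_lineage on the downstream case shows which one.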

MCP config snippets

All examples point at the same plan-only K² boundary. Keep live credentials out of committed files.

{
  "mcpServers": {
    "k2-adversarial-context": {
      "command": "python",
      "args": ["scripts/k2_adversarial_mcp_server.py"],
      "env": {
        "K2_ADV_BUNDLE_DIR": "docs/customer-demos/demo-adversarial-context/k2-assets"
      }
    }
  }
}

mcpServers: {
  "k2-adversarial-context": {
    command: "python",
    args: ["scripts/k2_adversarial_mcp_server.py"],
    env: {
      K2_ADV_BUNDLE_DIR: "docs/customer-demos/demo-adversarial-context/k2-assets"
    }
  }
},
allowedTools: ["mcp__k2-adversarial-context__get_evaluation_plan"]

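# `k2` stands for your harness's client handle to the
# k2-adversarial-context MCP server; it is not defined in this snippet.
plan = k2.get_evaluation_plan(
    target_id="target-supportbot-v2.3",
    modalities=["text", "image"],
    environment="staging",
)
# PyRIT or your harness owns prompt generation, execution, and scoring.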

Integration contract

The contract is small enough for a partner engineer to adopt directly: one request shape, one cited response shape, and no change to downstream scoring. The request comes first, followed by one cited entry from the response.

{
  "tool": "get_evaluation_plan",
  "arguments": {
    "target_id": "target-supportbot-v2.3",
    "modalities": ["text", "image"],
    "environment": "staging",
    "include_watchlist": true
  }
}

{
  "seed_id": "seed-0142-regression",
  "lineage": {
    "threat": "threat-2024-0142",
    "policy": "policy-fin-001",
    "target_fact": "target-supportbot-v2.3.tools_enabled",
    "past_finding": "finding-supportbot-2026-03-018"
  },
  "boundary": "customer_generator_creates_final_inputs"
}
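To make the request shape concrete, the following sketch builds it in Python and wraps it in an MCP tools/call JSON-RPC message. The helper names and the validation rules are assumptions for illustration; only the field names come from the request above.

```python
def build_plan_request(target_id, modalities, environment, include_watchlist=True):
    """Build the get_evaluation_plan request shape shown in the contract."""
    if not target_id:
        raise ValueError("target_id is required")
    if not modalities:
        raise ValueError("at least one modality is required")
    return {
        "tool": "get_evaluation_plan",
        "arguments": {
            "target_id": target_id,
            "modalities": list(modalities),
            "environment": environment,
            "include_watchlist": include_watchlist,
        },
    }


def to_tools_call(request, request_id):
    """Wrap the request in an MCP tools/call JSON-RPC message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": request["tool"], "arguments": request["arguments"]},
    }
```

Keeping the builder separate from the transport wrapper lets the same request be sent over stdio MCP or passed to an SDK client without changing its shape.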

Downstream adapter examples

PyRIT: Use the plan rows as scenario seeds while PyRIT owns prompt generation, targets, memory, and scoring.

NeMo Guardrails: Translate cited plan rows into challenge definitions while keeping Guardrails execution downstream.

garak: Use K² lineage to select and explain probe families before garak runs probes against a target.

Internal harness: Attach threat, policy, target, and finding ids to each test case in your existing runner.

Commercial platform: Keep the platform UI and reports buyer-facing while K² supplies pre-generation context.
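For the internal-harness case, attaching ids can be as simple as copying the lineage into each test case's metadata. This is a minimal sketch under assumptions: the test case is a plain dict with an optional "metadata" field, and the k2_* metadata key names are hypothetical rather than a K² convention.

```python
def attach_lineage(plan_row, test_case):
    """Return a copy of a runner test case with K2 lineage ids in its metadata."""
    lineage = plan_row["lineage"]
    metadata = dict(test_case.get("metadata", {}))
    metadata.update(
        {
            "k2_seed_id": plan_row["seed_id"],
            "k2_threat_id": lineage["threat"],
            "k2_policy_id": lineage["policy"],
            "k2_target_fact": lineage["target_fact"],
            "k2_past_finding_id": lineage["past_finding"],
        }
    )
    # Copy instead of mutating so the runner's original case stays untouched.
    enriched = dict(test_case)
    enriched["metadata"] = metadata
    return enriched
```

The prompt text and scoring fields are left alone, which keeps generation and scoring owned by the existing runner.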

Expected response

The first query should return scoped, cited plan context, not generated adversarial prompts.

Successful shape

A target profile for SupportBot v2.3 with accepted modalities and tools.
Threat-pattern candidates filtered by modality and target model class.
Policy verdicts that mark in-scope, out-of-bounds, and review-required entries.
Plan lines with threat, policy, target, and past-finding citations.

What to watch for

Generated jailbreak prompts appearing at this stage mean the K² boundary has been crossed; K² should return plan context only.
Plan lines without policy citations are not ready for reviewer handoff.
Target-irrelevant modalities should be filtered before generation.
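The three warning signs above can be turned into an automated pre-handoff check. The sketch below assumes each plan line is a dict; "lineage.policy" matches the cited response entry, while the "generated_prompt" and "modality" field names are hypothetical and should be mapped to whatever your plan rows actually contain.

```python
def review_plan_line(line, target_modalities):
    """Return a list of problems found on a single plan line."""
    problems = []
    # Any generated adversarial input here means the plan-only boundary was crossed.
    if line.get("generated_prompt"):
        problems.append("generated_prompt_present")
    # Lines without a policy citation are not ready for reviewer handoff.
    if not (line.get("lineage") or {}).get("policy"):
        problems.append("missing_policy_citation")
    # Modalities the target does not accept should be filtered before generation.
    modality = line.get("modality")
    if modality is not None and modality not in target_modalities:
        problems.append("modality_not_accepted_by_target")
    return problems
```

An empty list means the line passed all three checks; anything else should block the line from reaching the generator until a reviewer has looked at it.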

Next steps

Once the cited plan works, the useful question is how this K²-built context layer maps to your current red-team process.

Replace the synthetic target profile with one production target you already red-team.

Freeze 10 to 20 evaluation plans before running a pilot.

Keep your existing generator, scorer, guardrails, and GRC workflow unchanged.
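One way to freeze plans before a pilot is to record a content digest for each plan and compare it later. This is a sketch of one possible approach, not a K² feature: it hashes a canonical JSON serialization with SHA-256, and the function name is hypothetical.

```python
import hashlib
import json


def plan_digest(plan):
    """Return a SHA-256 hex digest of a plan's canonical JSON form."""
    # Sorted keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(plan, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing the digest next to each frozen plan makes it possible to confirm during the pilot that the plan being executed is the one that was reviewed.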