Public documentation

Glossary

Agent

The AI system being tested or protected.

API Key

A project-scoped workbench credential used by the CLI or CI to upload sanitized findings. Raw key values are shown once and stored hashed.
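The "shown once, stored hashed" behavior can be illustrated with a minimal sketch. This is not the workbench's actual implementation: the `rp_` prefix, the function names `issue_api_key` and `verify_api_key`, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import hmac
import secrets


def issue_api_key(prefix: str = "rp_") -> tuple[str, str]:
    """Return (raw_key, stored_hash).

    The raw key is displayed to the user once; only the hash is persisted.
    The prefix and hash algorithm are hypothetical choices.
    """
    raw_key = prefix + secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(raw_key.encode("utf-8")).hexdigest()
    return raw_key, stored_hash


def verify_api_key(presented_key: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    presented_hash = hashlib.sha256(presented_key.encode("utf-8")).hexdigest()
    return hmac.compare_digest(presented_hash, stored_hash)
```

Because only the hash is kept, a lost key cannot be recovered from storage under this scheme; it would have to be replaced with a newly issued one.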

Attack Pack

A curated set of social-engineering scenarios for a failure category or regression suite.

Billing

Workbench subscription management. New customers choose Builder or Team before workspace creation, then use billing to review plan state, invoices, and subscription details.

CI Upload

A workbench upload from a CI job, usually using sanitized findings mode.

Evidence

The workbench investigation view that shows transcript evidence, failed turns, the failed invariant, impact, and remediation.

Finding

A workbench work item derived from failed scenario evidence.

Full Transcript Opt-In

Upload mode that sends full transcript evidence only when both CLI and project policy opt in.
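The two-sided opt-in amounts to a logical AND with sanitized findings as the fallback. A minimal sketch, assuming hypothetical names (`UploadMode`, `resolve_upload_mode`) that are not taken from the CLI itself:

```python
from enum import Enum


class UploadMode(Enum):
    SANITIZED_FINDINGS = "sanitized_findings"
    FULL_TRANSCRIPT = "full_transcript"


def resolve_upload_mode(cli_opt_in: bool, project_policy_opt_in: bool) -> UploadMode:
    """Send full transcripts only when both the CLI and the project policy opt in."""
    if cli_opt_in and project_policy_opt_in:
        return UploadMode.FULL_TRANSCRIPT
    # Either side declining falls back to the default sanitized mode.
    return UploadMode.SANITIZED_FINDINGS
```

Under this reading, a CLI option alone cannot override a project policy that has not opted in, and the reverse also holds.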

Included CLI

The local execution engine included with Builder and Team. It runs attack packs and scenarios in your environment, uses your supported LLM provider key for real adaptive attacker turns and judging, and stores local reports under .roleplay. Mock mode is for smoke tests only.

Hidden Context

Scenario context that defines policies, boundaries, or facts the agent should respect.

Judge

The evaluator that scores a transcript against success and failure criteria.

Provider

The simulator that plays the roleplayed user or attacker.

Project

A workbench container for one protected agent product area. Test runs, findings, API keys, agents, CI setup, and evidence are scoped to a project.

Run

One execution of a scenario.

Scenario

A YAML definition of a target, persona, goal, hidden context, success criteria, failure criteria, and judge settings.
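The shape of a scenario can be sketched as a typed record built from an already-parsed YAML mapping. The field names below mirror the parts listed in the definition, but the exact keys, their types, and the helper `scenario_from_mapping` are assumptions, not the real schema.

```python
from dataclasses import dataclass, field

# Hypothetical key names; the real scenario schema may differ.
REQUIRED_FIELDS = (
    "target",
    "persona",
    "goal",
    "hidden_context",
    "success_criteria",
    "failure_criteria",
)


@dataclass
class Scenario:
    target: str
    persona: str
    goal: str
    hidden_context: str
    success_criteria: list[str]
    failure_criteria: list[str]
    judge: dict[str, str] = field(default_factory=dict)


def scenario_from_mapping(data: dict) -> Scenario:
    """Build a Scenario from a parsed YAML mapping, rejecting missing parts."""
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"scenario is missing fields: {', '.join(missing)}")
    return Scenario(
        target=data["target"],
        persona=data["persona"],
        goal=data["goal"],
        hidden_context=data["hidden_context"],
        success_criteria=list(data["success_criteria"]),
        failure_criteria=list(data["failure_criteria"]),
        judge=dict(data.get("judge", {})),
    )
```

Validating the mapping up front means an incomplete scenario is rejected before any run starts.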

Sanitized Findings

The default Cloud upload mode. It uploads finding-level evidence without the full transcript, raw scenario YAML, or metadata artifacts.
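As an illustration of what this mode leaves out, the sketch below filters a finding record before upload. The key names (`full_transcript`, `scenario_yaml`, `metadata_artifacts`) and the function `sanitize_finding` are hypothetical; the actual payload format is not specified here.

```python
# Hypothetical keys for the three kinds of data excluded in sanitized mode.
EXCLUDED_KEYS = frozenset({"full_transcript", "scenario_yaml", "metadata_artifacts"})


def sanitize_finding(finding: dict) -> dict:
    """Return a copy of the finding without transcript, raw YAML, or metadata artifacts."""
    return {key: value for key, value in finding.items() if key not in EXCLUDED_KEYS}
```

The function returns a new dictionary, so the local report that holds the full data is left unchanged.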

Social Engineering Core

The built-in attack pack for repeatable social-engineering regression testing.

Specialized Vertical Packs

Built-in packs for specific people-facing agent workflows: Customer Relationship Agents, Sales Pipeline Agents, and Recruiting and HR Agents. These packs organize scenarios by business boundary, external actor, action risk, data sensitivity, and regression key.

Target

The interface Roleplay uses to send messages to the agent under test.

Workbench

The paid shared workspace for uploaded findings, run history, project API keys, billing, members, and triage. Builder is about $49/month and Team is about $199/month.

Workspace

The workbench account boundary for members, billing, settings, and projects.
