CLI Overview
The roleplay CLI is the included local runner for the Roleplay Workbench. It runs tests in your environment, saves replayable local evidence, and uploads sanitized proof when you provide a Workbench project key.
Basic Usage
roleplay setup
roleplay run social-engineering-core --target mock --provider mock --judge rules
roleplay report latest
roleplay replay latest
Use roleplay upload only after the Workbench has created a project and a project API key for you.
Commands
| Command | Purpose |
|---|---|
| roleplay setup | Guided Workbench and local-runner setup. |
| roleplay init | Scriptable starter config for CI or manual setup. |
| roleplay scenario:create | Create a scenario from a built-in template. |
| roleplay run | Run a scenario or the built-in attack pack. |
| roleplay report | Print a saved report. |
| roleplay replay | Replay a saved transcript. |
| roleplay upload | Upload local runs to Workbench. |
| roleplay list | List local scenarios or runs. |
| roleplay doctor | Check local, Workbench, provider, and judge readiness. |
| roleplay mcp | Start a local MCP stdio server. |
Real Runs
Real HTTP or CLI targets require explicit attacker and judge choices:
roleplay run social-engineering-core \
--target http://localhost:3000/agent \
--provider <provider> \
--judge hybrid \
--project <project-id> \
--api-key <project-api-key> \
--fail-on critical
Provider identifiers are openai, anthropic, google, and openai-compatible. They are reference options, not defaults.
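One way to keep the project key out of scripts and shell history is to read it from the environment and pass it to the documented flags. The sketch below is a minimal wrapper under stated assumptions: run_real_target, ROLEPLAY_PROJECT_ID, and ROLEPLAY_API_KEY are hypothetical names chosen for this example, not settings the CLI is known to read on its own.

```shell
# Sketch: run the core pack against a real target with credentials taken
# from the environment. ROLEPLAY_PROJECT_ID and ROLEPLAY_API_KEY are
# hypothetical variable names; the CLI only receives the documented flags.
run_real_target() {
  local target="$1" provider="$2"

  # Refuse to start if either value is missing, so the run fails early.
  if [ -z "${ROLEPLAY_PROJECT_ID:-}" ] || [ -z "${ROLEPLAY_API_KEY:-}" ]; then
    echo "ROLEPLAY_PROJECT_ID and ROLEPLAY_API_KEY must be set" >&2
    return 2
  fi

  roleplay run social-engineering-core \
    --target "$target" \
    --provider "$provider" \
    --judge hybrid \
    --project "$ROLEPLAY_PROJECT_ID" \
    --api-key "$ROLEPLAY_API_KEY" \
    --fail-on critical
}

# Usage (commented out so sourcing this file has no side effects):
# run_real_target http://localhost:3000/agent <provider>
```

The wrapper returns the exit status of roleplay run unchanged, so it can sit directly in a CI step.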
Judge Modes
- rules: deterministic local judge for smoke/offline checks.
- semantic: provider-backed judge for transcript evaluation.
- hybrid: semantic judge plus deterministic guardrails, recommended for CI and serious real-agent tests.
Rules-only judging against real targets requires --allow-rules-only so it is not mistaken for full semantic evaluation.
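The rule above can be encoded in a small wrapper so that the opt-in flag is added only when it is needed. This is a sketch, not part of the CLI: run_with_judge is a hypothetical helper, and it assumes that mock is the only non-real target, based on the Basic Usage example.

```shell
# Sketch: add --allow-rules-only only for rules judging against a real target.
# Assumption: "mock" is the only target that does not count as real.
run_with_judge() {
  if [ "$#" -lt 4 ]; then
    echo "usage: run_with_judge <scenario> <target> <provider> <judge> [extra flags...]" >&2
    return 2
  fi
  local scenario="$1" target="$2" provider="$3" judge="$4"
  shift 4

  # Rebuild the argument list; any extra flags are passed through untouched.
  set -- run "$scenario" --target "$target" --provider "$provider" --judge "$judge" "$@"

  if [ "$judge" = "rules" ] && [ "$target" != "mock" ]; then
    set -- "$@" --allow-rules-only
  fi

  roleplay "$@"
}

# Usage (commented out so sourcing this file has no side effects):
# run_with_judge social-engineering-core http://localhost:3000/agent <provider> rules
```

Making the flag conditional keeps mock smoke runs unchanged while leaving a visible trace in the command whenever a real target is judged by rules alone.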
JSON Output And Exit Codes
Use --json on supported commands for machine-readable output.
roleplay run social-engineering-core --target http://localhost:3000/agent --provider <provider> --judge hybrid --project <project-id> --api-key <project-api-key> --json
roleplay report latest --json
roleplay list runs --json
roleplay doctor --cloud --json
roleplay run exits non-zero when the run crosses the configured --fail-on threshold. Accepted thresholds are warning, failed, and critical.
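In CI, the JSON output and the exit status can be combined: keep the JSON as a build artifact and let the status decide whether the job passes. The following is a minimal sketch; ci_gate is a hypothetical helper, and it assumes that --json writes the machine-readable output to standard output. It does not interpret the JSON, since the schema is not described here.

```shell
# Sketch: save the JSON output of a run and propagate its exit status.
# Assumption: --json prints the machine-readable result on stdout.
ci_gate() {
  local report_file="$1"
  shift
  local status=0

  # Capture the status without aborting under "set -e".
  roleplay run "$@" --json > "$report_file" || status=$?

  if [ "$status" -ne 0 ]; then
    echo "roleplay run exited with status $status; see $report_file" >&2
  fi
  return "$status"
}

# Usage (commented out so sourcing this file has no side effects):
# ci_gate run.json social-engineering-core \
#   --target http://localhost:3000/agent --provider <provider> --judge hybrid \
#   --project <project-id> --api-key <project-api-key> --fail-on critical
```

A non-zero status here covers the threshold case described above; the sketch does not try to distinguish it from other failures.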
Output Directory
Local artifacts are stored in .roleplay/runs by default.
roleplay run .roleplay/scenarios/install-smoke.yml --out ./artifacts/roleplay
roleplay report latest --out ./artifacts/roleplay
roleplay upload all --out ./artifacts/roleplay
Use the same --out value across run, list, report, replay, and upload.
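A thin wrapper is one way to make sure every command sees the same directory. In this sketch, rp and ROLEPLAY_OUT are hypothetical names used only by the shell function; the CLI itself receives nothing but the documented --out flag.

```shell
# Sketch: append the same --out value to every roleplay invocation.
# ROLEPLAY_OUT is a hypothetical shell variable, not a CLI setting.
ROLEPLAY_OUT="${ROLEPLAY_OUT:-./artifacts/roleplay}"

rp() {
  roleplay "$@" --out "$ROLEPLAY_OUT"
}

# Usage (commented out so sourcing this file has no side effects):
# rp run .roleplay/scenarios/install-smoke.yml
# rp report latest
# rp upload all
```

Defining the directory once avoids the case where report or upload looks in the default .roleplay/runs while the run wrote somewhere else.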