Command Reference
roleplay setup
Guided Workbench and local-runner setup.
roleplay setup
roleplay setup --project <project-id> --provider <provider> --judge hybrid --target http://localhost:3000/agent
The setup command writes safe placeholders to .env.example and .roleplay/config.json. It does not store raw provider keys by default.
Flags:
| Flag | Values | Default | Description |
|---|---|---|---|
| --cloud-url | URL | ROLEPLAY_CLOUD_URL or https://app.roleplay.sh | Workbench URL. |
| --project | string | ROLEPLAY_PROJECT_ID | Workbench project ID. |
| --provider | provider | ROLEPLAY_LLM_PROVIDER | Attacker provider. |
| --judge | rules, semantic, hybrid | ROLEPLAY_JUDGE_MODE or hybrid | Judge mode. |
| --judge-provider | provider | ROLEPLAY_JUDGE_PROVIDER or --provider | Judge provider for semantic or hybrid mode. |
| --target | URL | ROLEPLAY_TARGET_URL | HTTP target URL. |
| --target-command | command | ROLEPLAY_TARGET_COMMAND | CLI target command. |
| --yes, -y | boolean | false | Accept defaults without prompting. |
| --json | boolean | false | Print JSON output only. |
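The Default column above reads as a fallback chain: an explicit flag, then the named environment variable, then a built-in value or another flag. A minimal sketch of that chain follows. `resolve_setting` is a hypothetical helper, not part of the roleplay CLI, and it assumes that an explicit flag takes precedence over the environment variable.

```shell
# Sketch only: resolve_setting is a hypothetical helper, not a roleplay command.
# Prints the first non-empty value among: flag value, environment value, fallback.
# Assumption: an explicit flag overrides the environment variable.
resolve_setting() {
  flag_value="$1"
  env_value="$2"
  fallback_value="$3"
  if [ -n "$flag_value" ]; then
    printf '%s\n' "$flag_value"
  elif [ -n "$env_value" ]; then
    printf '%s\n' "$env_value"
  else
    printf '%s\n' "$fallback_value"
  fi
}

# --cloud-url not passed: use ROLEPLAY_CLOUD_URL if set, else the documented URL.
resolve_setting "" "${ROLEPLAY_CLOUD_URL:-}" "https://app.roleplay.sh"

# --judge-provider not passed: use ROLEPLAY_JUDGE_PROVIDER if set,
# else fall back to the value given to --provider ("openai" here is a placeholder).
resolve_setting "" "${ROLEPLAY_JUDGE_PROVIDER:-}" "openai"
```

Keeping the fallback as the third argument mirrors how the table lists either a literal default or another flag as the last resort.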
roleplay init
Initialize Roleplay in the current repository for scripted setup.
roleplay init
roleplay init --json
Creates .roleplay/config.json, a local install-smoke scenario, and .roleplay/runs.
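In a scripted setup it can help to confirm that the files listed above exist before continuing. The sketch below is one way to do that. `check_init_layout` is a hypothetical helper, and the scenario path `.roleplay/scenarios/install-smoke.yml` is an assumption taken from the `roleplay run` example rather than a documented guarantee of `roleplay init`.

```shell
# Sketch only: check_init_layout is a hypothetical helper, not a roleplay command.
# Returns 0 when the expected files and directories exist under the given root.
check_init_layout() {
  root="${1:-.}"
  status=0
  if [ ! -f "$root/.roleplay/config.json" ]; then
    echo "missing .roleplay/config.json" >&2
    status=1
  fi
  if [ ! -d "$root/.roleplay/runs" ]; then
    echo "missing .roleplay/runs" >&2
    status=1
  fi
  # Assumed location of the install-smoke scenario (taken from the run example).
  if [ ! -f "$root/.roleplay/scenarios/install-smoke.yml" ]; then
    echo "missing .roleplay/scenarios/install-smoke.yml" >&2
    status=1
  fi
  return "$status"
}
```

Calling `check_init_layout .` after `roleplay init` reports each missing path on stderr and returns non-zero if any is absent.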
roleplay run
Run a local scenario or fetch an entitled Workbench attack pack.
roleplay run .roleplay/scenarios/install-smoke.yml
roleplay run social-engineering-core --target http://localhost:3000/agent --provider <provider> --judge hybrid --project <project-id> --api-key <project-api-key>
roleplay run social-engineering-core --target-command "node ./agent.js" --yes --provider <provider> --judge hybrid --project <project-id> --api-key <project-api-key>
roleplay run social-engineering-core --target mock --provider mock --judge rules
Flags:
| Flag | Values | Default | Description |
|---|---|---|---|
| --target | URL or mock | ROLEPLAY_TARGET_URL | HTTP target URL for social-engineering-core, or mock for local smoke tests. |
| --target-command | command | ROLEPLAY_TARGET_COMMAND | CLI target command for social-engineering-core. |
| --endpoint | URL | ROLEPLAY_CLOUD_URL or https://app.roleplay.sh | Workbench URL for entitlement checks. |
| --project | string | ROLEPLAY_PROJECT_ID | Workbench project ID. Required for real agent runs. |
| --api-key | string | ROLEPLAY_API_KEY | Workbench API key. Required for real agent runs. |
| --provider | provider | ROLEPLAY_LLM_PROVIDER | Shared attacker and judge provider. Required for real targets. |
| --attacker-provider | provider | ROLEPLAY_ATTACKER_PROVIDER or --provider | Provider for adaptive attacker turns. |
| --judge | rules, semantic, hybrid | ROLEPLAY_JUDGE_MODE | Judge mode. Required for real targets. |
| --judge-provider | provider | ROLEPLAY_JUDGE_PROVIDER or --provider | Provider for semantic or hybrid judging. |
| --allow-rules-only | boolean | false | Permit rules-only judging for real targets. |
| --model | model name | ROLEPLAY_LLM_MODEL or provider default | Shared model. |
| --attacker-model | model name | ROLEPLAY_ATTACKER_MODEL or --model | Model for attacker turns. |
| --judge-model | model name | ROLEPLAY_JUDGE_MODEL, scenario judge.model, or --model | Model for judging. |
| --llm-base-url | URL | ROLEPLAY_LLM_BASE_URL | Base URL for OpenAI-compatible providers. |
| --max-turns | integer | scenario value | Override scenario max turns. |
| --fail-on | warning, failed, critical | failed | Exit non-zero at this threshold. |
| --json | boolean | false | Print JSON only. |
| --out | path | .roleplay/runs | Run artifacts directory. |
| --yes, -y | boolean | false | Allow local CLI target command execution. |
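The `--fail-on` threshold can be read as a severity comparison. The sketch below assumes the levels are ordered warning, then failed, then critical, and that a run exits non-zero when its worst result is at or above the threshold. Both the ordering and the helper names `severity_rank` and `should_fail` are assumptions for illustration, not the CLI's implementation.

```shell
# Sketch only: hypothetical helpers that model the --fail-on threshold.
# Assumed ordering: warning < failed < critical. Unknown values rank 0.
severity_rank() {
  case "$1" in
    warning) echo 1 ;;
    failed) echo 2 ;;
    critical) echo 3 ;;
    *) echo 0 ;;
  esac
}

# should_fail WORST [THRESHOLD]
# Returns 0 (true) when a run whose worst result is WORST should exit non-zero.
# THRESHOLD falls back to "failed", the documented default for --fail-on.
should_fail() {
  worst="$(severity_rank "$1")"
  threshold="$(severity_rank "${2:-failed}")"
  [ "$worst" -gt 0 ] && [ "$threshold" -gt 0 ] && [ "$worst" -ge "$threshold" ]
}

# A run with only warnings under the default threshold.
if should_fail warning failed; then
  echo "exit non-zero"
else
  echo "exit zero"
fi
```

Under these assumptions, `--fail-on warning` is the strictest setting and `--fail-on critical` the most permissive.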
For social-engineering-core, provide one target source: --target, --target-command, ROLEPLAY_TARGET_URL, or ROLEPLAY_TARGET_COMMAND. Real attack-pack scenario bundles are fetched from the Workbench for entitled projects and are not bundled in the public CLI package.
Real runs, meaning any non-mock target or non-mock provider, require a valid Builder or Team project API key. Create one from onboarding or Monitor, then pass it with --project and --api-key or the matching environment variables.
Provider identifiers are openai, anthropic, google, and openai-compatible. They are listed for reference; none of them is a default.
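The requirements for a real run can be checked before invoking the CLI. The following sketch tests only the environment variables named in the flag table: one target source, project ID, API key, provider, and judge mode. `preflight_real_run` is a hypothetical helper, and it does not account for values passed as flags.

```shell
# Sketch only: preflight_real_run is a hypothetical helper, not a roleplay command.
# Checks the environment variables a real (non-mock) run relies on.
# Prints "ready" on success; lists missing names on stderr and returns 1 otherwise.
preflight_real_run() {
  missing=""
  # One target source is enough: an HTTP URL or a CLI command.
  if [ -z "${ROLEPLAY_TARGET_URL:-}" ] && [ -z "${ROLEPLAY_TARGET_COMMAND:-}" ]; then
    missing="$missing ROLEPLAY_TARGET_URL|ROLEPLAY_TARGET_COMMAND"
  fi
  for name in ROLEPLAY_PROJECT_ID ROLEPLAY_API_KEY ROLEPLAY_LLM_PROVIDER ROLEPLAY_JUDGE_MODE; do
    # Indirect lookup of a fixed, trusted variable name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing" >&2
    return 1
  fi
  echo "ready"
}
```

In a CI job, `preflight_real_run || exit 1` placed before `roleplay run` stops early with a list of unset variables.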
Judge guidance:
- rules: deterministic local judge for smoke/offline checks.
- semantic: provider-backed transcript evaluation.
- hybrid: semantic judge plus deterministic guardrails, recommended for CI and real-agent tests.
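The judge modes interact with the `--allow-rules-only` flag: the flag table implies that rules-only judging against a real target needs explicit permission. A minimal sketch of that gate follows. `judge_mode_allowed` is a hypothetical helper, and the exact enforcement inside the CLI is an assumption.

```shell
# Sketch only: judge_mode_allowed is a hypothetical helper, not a roleplay command.
# judge_mode_allowed MODE TARGET [ALLOW_RULES_ONLY]
# Returns 0 when MODE is acceptable for TARGET under the assumed gating rule:
# rules-only judging is fine for the mock target, and for real targets only
# when ALLOW_RULES_ONLY is "true".
judge_mode_allowed() {
  mode="$1"
  target="$2"
  allow_rules_only="${3:-false}"
  case "$mode" in
    semantic|hybrid)
      return 0
      ;;
    rules)
      [ "$target" = "mock" ] || [ "$allow_rules_only" = "true" ]
      ;;
    *)
      # Not one of the documented judge modes.
      return 1
      ;;
  esac
}

if judge_mode_allowed rules "http://localhost:3000/agent"; then
  echo "rules judge accepted"
else
  echo "rules judge needs --allow-rules-only for a real target"
fi
```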
roleplay upload
Upload one run or all local runs to the Workbench.
roleplay upload latest --project <project-id> --api-key <project-api-key>
roleplay upload all --source ci --mode sanitized_findings
Uploads require a real Workbench project ID and project API key from onboarding or Monitor.
roleplay report
Show a saved report.
roleplay report latest
roleplay report <runId> --json
roleplay report <runId> --markdown
Reports include judge metadata when available: mode, provider, model, and whether deterministic guardrails contributed findings.
roleplay replay
Replay a saved transcript.
roleplay replay latest
roleplay replay <runId> --no-delay
roleplay doctor
Check install, Workbench, provider, judge, and upload readiness.
roleplay doctor
roleplay doctor --cloud
roleplay doctor --cloud --project <project-id> --api-key <project-api-key>
Useful flags:
| Flag | Default | Description |
|---|---|---|
| --json | false | Print JSON output. |
| --cloud | false | Check Workbench /api/health. |
| --cloud-url | ROLEPLAY_CLOUD_URL or https://app.roleplay.sh | Workbench base URL. |
| --project | ROLEPLAY_PROJECT_ID | Project ID for API key verification. |
| --api-key | ROLEPLAY_API_KEY | API key for verification. |
| --provider | ROLEPLAY_LLM_PROVIDER | Attacker provider key check. |
| --judge | ROLEPLAY_JUDGE_MODE | Judge mode readiness check. |
| --judge-provider | ROLEPLAY_JUDGE_PROVIDER | Judge provider key check. |
Other Commands
roleplay scenario:create my-scenario
roleplay list scenarios
roleplay list runs
roleplay mcp