Command Reference

`roleplay setup`

Guided Workbench and local-runner setup.

roleplay setup
roleplay setup --project <project-id> --provider <provider> --judge hybrid --target http://localhost:3000/agent

The setup command writes safe placeholders to `.env.example` and `.roleplay/config.json`. It does not store raw provider keys by default.

Flags:

| Flag | Values | Default | Description |
| --- | --- | --- | --- |
| `--cloud-url` | URL | `ROLEPLAY_CLOUD_URL` or `https://app.roleplay.sh` | Workbench URL. |
| `--project` | string | `ROLEPLAY_PROJECT_ID` | Workbench project ID. |
| `--provider` | provider | `ROLEPLAY_LLM_PROVIDER` | Attacker provider. |
| `--judge` | `rules`, `semantic`, `hybrid` | `ROLEPLAY_JUDGE_MODE` or `hybrid` | Judge mode. |
| `--judge-provider` | provider | `ROLEPLAY_JUDGE_PROVIDER` or `--provider` | Judge provider for semantic or hybrid mode. |
| `--target` | URL | `ROLEPLAY_TARGET_URL` | HTTP target URL. |
| `--target-command` | command | `ROLEPLAY_TARGET_COMMAND` | CLI target command. |
| `--yes`, `-y` | boolean | false | Accept defaults without prompting. |
| `--json` | boolean | false | Print JSON output only. |
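
Because each flag falls back to the environment variable listed in the Default column, a scripted setup can be driven from the environment. The following is a minimal sketch, assuming the CLI reads these variables as the table describes. The values are placeholders, and the `command -v` guard is only there so the snippet is harmless on a machine where the CLI is not installed.

```shell
# Placeholder values; substitute your own project ID, provider, and target.
export ROLEPLAY_PROJECT_ID="<project-id>"
export ROLEPLAY_LLM_PROVIDER="<provider>"
export ROLEPLAY_JUDGE_MODE="hybrid"
export ROLEPLAY_TARGET_URL="http://localhost:3000/agent"

# Accept defaults without prompting and print JSON output only.
if command -v roleplay >/dev/null 2>&1; then
  roleplay setup --yes --json
else
  echo "roleplay CLI not found on PATH" >&2
fi
```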

`roleplay init`

Initialize Roleplay in the current repository for scripted setup.

roleplay init
roleplay init --json

Creates `.roleplay/config.json`, a local install-smoke scenario, and `.roleplay/runs`.

`roleplay run`

Run a local scenario, or fetch and run a Workbench attack pack that your project is entitled to.

roleplay run .roleplay/scenarios/install-smoke.yml
roleplay run social-engineering-core --target http://localhost:3000/agent --provider <provider> --judge hybrid --project <project-id> --api-key <project-api-key>
roleplay run social-engineering-core --target-command "node ./agent.js" --yes --provider <provider> --judge hybrid --project <project-id> --api-key <project-api-key>
roleplay run social-engineering-core --target mock --provider mock --judge rules

Flags:

| Flag | Values | Default | Description |
| --- | --- | --- | --- |
| `--target` | URL or `mock` | `ROLEPLAY_TARGET_URL` | HTTP target URL for `social-engineering-core`, or `mock` for local smoke tests. |
| `--target-command` | command | `ROLEPLAY_TARGET_COMMAND` | CLI target command for `social-engineering-core`. |
| `--endpoint` | URL | `ROLEPLAY_CLOUD_URL` or `https://app.roleplay.sh` | Workbench URL for entitlement checks. |
| `--project` | string | `ROLEPLAY_PROJECT_ID` | Workbench project ID. Required for real agent runs. |
| `--api-key` | string | `ROLEPLAY_API_KEY` | Workbench API key. Required for real agent runs. |
| `--provider` | provider | `ROLEPLAY_LLM_PROVIDER` | Shared attacker and judge provider. Required for real targets. |
| `--attacker-provider` | provider | `ROLEPLAY_ATTACKER_PROVIDER` or `--provider` | Provider for adaptive attacker turns. |
| `--judge` | `rules`, `semantic`, `hybrid` | `ROLEPLAY_JUDGE_MODE` | Judge mode. Required for real targets. |
| `--judge-provider` | provider | `ROLEPLAY_JUDGE_PROVIDER` or `--provider` | Provider for semantic or hybrid judging. |
| `--allow-rules-only` | boolean | false | Permit rules-only judging for real targets. |
| `--model` | model name | `ROLEPLAY_LLM_MODEL` or provider default | Shared model. |
| `--attacker-model` | model name | `ROLEPLAY_ATTACKER_MODEL` or `--model` | Model for attacker turns. |
| `--judge-model` | model name | `ROLEPLAY_JUDGE_MODEL`, scenario `judge.model`, or `--model` | Model for judging. |
| `--llm-base-url` | URL | `ROLEPLAY_LLM_BASE_URL` | Base URL for OpenAI-compatible providers. |
| `--max-turns` | integer | scenario value | Override scenario max turns. |
| `--fail-on` | `warning`, `failed`, `critical` | `failed` | Exit non-zero at this threshold. |
| `--json` | boolean | false | Print JSON only. |
| `--out` | path | `.roleplay/runs` | Run artifacts directory. |
| `--yes`, `-y` | boolean | false | Allow local CLI target command execution. |
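
The `--fail-on` threshold makes the exit code usable as a CI gate. The sketch below assumes only that `roleplay run` exits non-zero at or above the chosen threshold, as the table states. `run_gate` is a hypothetical wrapper for illustration and is not part of the CLI.

```shell
# run_gate: run the given command and report its exit status.
# Hypothetical helper for CI scripts; not part of the roleplay CLI.
run_gate() {
  if "$@"; then
    echo "gate: pass"
  else
    # Inside the else branch, $? still holds the exit status of the command.
    status=$?
    echo "gate: fail (exit $status)"
    return "$status"
  fi
}
```

A CI step could then call `run_gate roleplay run social-engineering-core --target mock --provider mock --judge rules --fail-on critical`. The step fails whenever the run exits non-zero, which may also happen for CLI errors unrelated to the threshold.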

For `social-engineering-core`, provide one target source: `--target`, `--target-command`, `ROLEPLAY_TARGET_URL`, or `ROLEPLAY_TARGET_COMMAND`. Real attack-pack scenario bundles are fetched from the Workbench for entitled projects and are not shipped in the public CLI package.

A real run is any run with a non-mock target or a non-mock provider. Real runs require a valid Builder or Team project API key. Create one from onboarding or Monitor, then pass it with `--project` and `--api-key`, or set `ROLEPLAY_PROJECT_ID` and `ROLEPLAY_API_KEY`.
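
The definition of a real run above can be expressed as a small pre-flight check in a wrapper script. This is a sketch of the documented rule only, not the CLI's actual implementation. `is_real_run` and `check_credentials` are hypothetical helper names.

```shell
# is_real_run TARGET PROVIDER
# Succeeds when either the target or the provider is something other than "mock".
is_real_run() {
  [ "$1" != "mock" ] || [ "$2" != "mock" ]
}

# check_credentials TARGET PROVIDER
# For real runs, require non-empty project ID and API key variables.
# Mock-only runs pass without credentials.
check_credentials() {
  if is_real_run "$1" "$2"; then
    if [ -z "${ROLEPLAY_PROJECT_ID:-}" ] || [ -z "${ROLEPLAY_API_KEY:-}" ]; then
      echo "real run: set ROLEPLAY_PROJECT_ID and ROLEPLAY_API_KEY" >&2
      return 1
    fi
  fi
  return 0
}
```

Calling `check_credentials "$target" "$provider"` before `roleplay run` lets a script stop early with a clear message instead of relying on the entitlement check to fail.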

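Supported provider identifiers are `openai`, `anthropic`, `google`, and `openai-compatible`, plus `mock` for local smoke tests. None of them is selected by default; choose one with `--provider` or `ROLEPLAY_LLM_PROVIDER`.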

Judge guidance:

`rules`: deterministic local judge for smoke/offline checks.
`semantic`: provider-backed transcript evaluation.
`hybrid`: semantic judge plus deterministic guardrails, recommended for CI and real-agent tests.

`roleplay upload`

Upload one run or all local runs to the Workbench.

roleplay upload latest --project <project-id> --api-key <project-api-key>
roleplay upload all --source ci --mode sanitized_findings

Uploads require a real Workbench project ID and project API key from onboarding or Monitor.

`roleplay report`

Show a saved report.

roleplay report latest
roleplay report <runId> --json
roleplay report <runId> --markdown

Reports include judge metadata when available: mode, provider, model, and whether deterministic guardrails contributed findings.

`roleplay replay`

Replay a saved transcript.

roleplay replay latest
roleplay replay <runId> --no-delay

`roleplay doctor`

Check install, Workbench, provider, judge, and upload readiness.

roleplay doctor
roleplay doctor --cloud
roleplay doctor --cloud --project <project-id> --api-key <project-api-key>

Useful flags:

| Flag | Default | Description |
| --- | --- | --- |
| `--json` | false | Print JSON output. |
| `--cloud` | false | Check Workbench `/api/health`. |
| `--cloud-url` | `ROLEPLAY_CLOUD_URL` or `https://app.roleplay.sh` | Workbench base URL. |
| `--project` | `ROLEPLAY_PROJECT_ID` | Project ID for API key verification. |
| `--api-key` | `ROLEPLAY_API_KEY` | API key for verification. |
| `--provider` | `ROLEPLAY_LLM_PROVIDER` | Attacker provider key check. |
| `--judge` | `ROLEPLAY_JUDGE_MODE` | Judge mode readiness check. |
| `--judge-provider` | `ROLEPLAY_JUDGE_PROVIDER` | Judge provider key check. |

Other Commands

roleplay scenario:create my-scenario
roleplay list scenarios
roleplay list runs
roleplay mcp