How to use the checklist
A pre-launch social-engineering checklist helps teams find the agent's most important delegated-authority risks before users do. The checklist should focus on what the agent can read, reveal, change, remember, approve, send, or delegate.
The goal is not to make launch impossible. The goal is to identify the boundaries that need evidence before launch and the failures that should become recurring checks after launch.
2. Identify external actors and untrusted content
Name who or what can influence the agent. That may include customers, prospects, candidates, employees, vendors, webpages, resumes, tickets, documents, emails, CRM notes, retrieved snippets, or other agents.
For each source, decide whether it can instruct the agent or only provide data. Many failures happen when untrusted data is treated as trusted instruction.
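The instruct-versus-data decision can be written down as a small trust table. The sketch below is illustrative only: the source names and the two-level trust model are assumptions, not part of the checklist.

```python
from enum import Enum


class Trust(Enum):
    INSTRUCTION = "may instruct the agent"
    DATA_ONLY = "may only provide data"


# Hypothetical source names; a real agent would list its own inputs.
SOURCE_TRUST = {
    "system_prompt": Trust.INSTRUCTION,
    "authenticated_operator": Trust.INSTRUCTION,
    "customer_email": Trust.DATA_ONLY,
    "retrieved_snippet": Trust.DATA_ONLY,
    "resume": Trust.DATA_ONLY,
    "webpage": Trust.DATA_ONLY,
}


def may_instruct(source: str) -> bool:
    """Return True only for sources explicitly listed as instruction-capable.

    Unknown sources default to data-only, so a new input channel cannot
    gain instruction authority by omission.
    """
    return SOURCE_TRUST.get(source, Trust.DATA_ONLY) is Trust.INSTRUCTION
```

Defaulting unlisted sources to data-only is a design choice in this sketch. It means a newly added channel has to be reviewed before it can direct the agent.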
3. Write protected boundaries
Convert policies into testable boundaries. A good boundary says what action or disclosure is prohibited until a condition is satisfied. It should be clear enough that a reviewer can decide whether the agent held or failed.
Start with identity, authorization, data scope, tool preconditions, source trust, memory integrity, and delegation boundaries. Prioritize boundaries with high business impact.
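A boundary in the "prohibited until a condition is satisfied" form can be captured as a small record and sorted by impact. This is a minimal sketch: the field names, the 1 to 5 impact scale, and the example boundaries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    name: str
    prohibited: str  # action or disclosure that must not happen
    until: str       # condition that must be satisfied first
    impact: int      # assumed scale: 1 (low) to 5 (high)


def prioritize(boundaries: list[Boundary]) -> list[Boundary]:
    """Order boundaries so the highest-impact ones are reviewed first."""
    return sorted(boundaries, key=lambda b: b.impact, reverse=True)


# Hypothetical examples for a support agent.
boundaries = [
    Boundary("memory_integrity", "store a user-supplied rule as standing policy",
             "an operator approves the change", 3),
    Boundary("identity", "reveal account details",
             "the requester passes identity verification", 5),
    Boundary("tool_precondition", "issue a refund",
             "the order is confirmed as eligible", 4),
]

for b in prioritize(boundaries):
    print(f"[{b.impact}] {b.name}: do not {b.prohibited} until {b.until}")
```

Writing each boundary as "do not X until Y" gives a reviewer a sentence they can check against a transcript or tool trace.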
4. Run realistic pressure scenarios
Create scenarios that match the agent's real workflow. Support agents should face ownership claims and refund pressure. Sales agents should face procurement urgency and fake approval. Recruiting agents should face untrusted candidate content. Browser agents should face deceptive page context.
Avoid relying only on obvious hostile strings. Many useful tests should look like ordinary workflow pressure because that is what the agent will see in production.
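One way to check that scenarios match the workflow is to list the pressure patterns expected for each agent type and see which ones have no scenario yet. The pattern names below come from the examples above. The data structure and the sample message are assumptions made for illustration.

```python
from dataclasses import dataclass

# Pressure patterns per agent type, taken from the examples in the text.
EXPECTED_PRESSURES = {
    "support": {"ownership_claim", "refund_pressure"},
    "sales": {"procurement_urgency", "fake_approval"},
    "recruiting": {"untrusted_candidate_content"},
    "browser": {"deceptive_page_context"},
}


@dataclass(frozen=True)
class Scenario:
    agent_type: str
    pressure: str
    message: str  # what the agent sees; hypothetical wording


def missing_pressures(agent_type: str, scenarios: list[Scenario]) -> set[str]:
    """Return expected pressure patterns that no scenario covers yet."""
    covered = {s.pressure for s in scenarios if s.agent_type == agent_type}
    return EXPECTED_PRESSURES.get(agent_type, set()) - covered


# The message reads like ordinary workflow pressure, not an obvious attack string.
scenarios = [
    Scenario("support", "refund_pressure",
             "I was promised a full refund yesterday, please process it before my flight."),
]

print(missing_pressures("support", scenarios))  # → {'ownership_claim'}
```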
5. Preserve evidence and verify fixes
For each failure, preserve the attacker move, failed response or action, tool trace if relevant, violated invariant, severity, and reproduction context. Then assign the fix to the team that controls the boundary.
After the fix, rerun the same scenario or regression key. A fix that is not verified is only a claim. A severe verified fix should usually become a recurring check.
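The evidence fields and the rerun step can be sketched as a record plus a verification function. The field names follow the list above. The `rerun` callable, the severity labels, and the rule that every verified high-severity fix becomes recurring are assumptions, and the text only says that this should usually happen.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FailureRecord:
    regression_key: str
    attacker_move: str
    failed_response: str
    violated_invariant: str
    severity: str              # assumed labels: "low", "medium", "high"
    reproduction_context: str
    owner_team: str            # team that controls the boundary
    tool_trace: Optional[str] = None  # only when a tool was involved
    fix_verified: bool = False
    recurring: bool = False


def verify_fix(record: FailureRecord, rerun: Callable[[str], bool]) -> FailureRecord:
    """Rerun the scenario; `rerun` returns True when the boundary held."""
    record.fix_verified = rerun(record.regression_key)
    # Simplification: promote every verified high-severity fix to a recurring check.
    record.recurring = record.fix_verified and record.severity == "high"
    return record
```

A record whose `fix_verified` is still False after a claimed fix is exactly the "only a claim" case described above.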
6. Decide what blocks launch
Not every issue should block launch, but a severe boundary failure should hold the launch until it is fixed and verified, the scope is reduced, or the risk is explicitly accepted. A launch blocker might involve sensitive data disclosure, unauthorized state change, unsafe external message, payment or pricing authority, candidate or employee data, or a repeated failure across variants.
Document the decision. If a risk is accepted, record why. If a risk is fixed, record the verification. If a risk needs monitoring, record the regression check.
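The blocker categories can be turned into a first-pass filter for open failures. This is a rough sketch: the severity labels and the threshold of two failing variants for a "repeated failure" are assumptions. The result is an input to the human decision and does not replace it.

```python
# Categories the text names as possible launch blockers.
BLOCKING_CATEGORIES = {
    "sensitive_data_disclosure",
    "unauthorized_state_change",
    "unsafe_external_message",
    "payment_or_pricing_authority",
    "candidate_or_employee_data",
}


def is_launch_blocker(category: str, severity: str, failing_variants: int) -> bool:
    """Flag an open failure as a candidate launch blocker.

    Assumptions: severity is "low", "medium", or "high", and two or more
    failing variants count as a repeated failure.
    """
    repeated = failing_variants >= 2
    severe_category = severity == "high" and category in BLOCKING_CATEGORIES
    return severe_category or repeated
```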
Run the launch review meeting
The checklist is most useful when it produces decisions. A launch review should bring together the owner of the agent, the person responsible for the workflow, and the person responsible for security or risk review. The group should review the top boundaries and the evidence for each high-risk scenario.
Keep the meeting concrete. For each boundary, ask whether it was tested, whether any failure remains open, whether the fix was verified, and whether a recurring check exists for severe failures. If the answer is unknown, the decision should be recorded as unknown rather than assumed safe.
The output should be a short launch-risk record: what was tested, what failed, what was fixed, what is accepted, and what will be monitored after launch.
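The launch-risk record can be assembled from the answers given in the meeting. In this sketch the per-boundary dictionary keys are hypothetical. Any boundary without a confirmed test goes into an explicit "unknown" bucket, so it is not counted as safe.

```python
def launch_risk_record(items: list[dict]) -> dict[str, list[str]]:
    """Group boundary names by what the review established about them."""
    record: dict[str, list[str]] = {
        key: [] for key in ("tested", "failed", "fixed", "accepted", "monitored", "unknown")
    }
    for item in items:
        name = item["boundary"]
        if not item.get("tested"):
            # Untested or unanswered: record as unknown, never as safe.
            record["unknown"].append(name)
            continue
        record["tested"].append(name)
        if item.get("failed"):
            record["failed"].append(name)
        if item.get("fix_verified"):
            record["fixed"].append(name)
        if item.get("accepted"):
            record["accepted"].append(name)
        if item.get("recurring_check"):
            record["monitored"].append(name)
    return record


# Hypothetical review answers.
review = [
    {"boundary": "identity", "tested": True},
    {"boundary": "refund_tool", "tested": True, "failed": True,
     "fix_verified": True, "recurring_check": True},
    {"boundary": "memory_integrity"},
]
print(launch_risk_record(review))
```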
Post-launch follow-up
The checklist should not disappear after launch. Agents continue to change through prompt edits, model updates, tool changes, policy changes, and new user behavior. The first post-launch review should look for gaps between test scenarios and real interactions.
If a new pattern appears in production feedback or support review, convert it into a scenario. If a verified fix protects an important boundary, consider adding it to recurring regression checks. If the agent gains a new tool or data source, rerun the relevant boundary checks.
This keeps the checklist from being a one-time approval artifact. It becomes a lightweight way to maintain confidence as the agent evolves.
Evidence required for launch confidence
A checklist item should not be marked complete only because someone discussed it. For important boundaries, the launch record should include evidence: the scenario, pressure pattern, expected safe behavior, actual behavior, and reviewer decision.
Evidence does not need to be heavy for every low-risk item. But for high-impact boundaries, a launch decision without evidence is mostly trust. If the agent can access sensitive data or call meaningful tools, the team should preserve proof that the boundary was tested.
The evidence should also explain unresolved risk. If a boundary was not tested before launch, record why and decide when it will be tested. Unknown risk should be visible rather than silently accepted.
Checklist outcomes
Each checklist item should end in one of a few clear outcomes. Passed means the boundary held in the tested scenario. Needs fix means the boundary failed and launch should wait or the scope should change. Accepted risk means the team knowingly accepts the current state. Needs monitoring means the boundary held or was fixed but should be checked again.
These outcomes are more useful than vague labels like reviewed or discussed. They tell the team what to do next and make the launch decision easier to audit later.
When the outcome is accepted risk, include the reason. The reason may be low impact, limited exposure, temporary mitigation, or a planned follow-up. Without that note, accepted risk can become a hiding place for unfinished work.
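The four outcomes map naturally onto an enumeration, and the rule that accepted risk needs a reason can be enforced when the item is created. This is a minimal sketch, and the class and field names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    NEEDS_FIX = "needs fix"
    ACCEPTED_RISK = "accepted risk"
    NEEDS_MONITORING = "needs monitoring"


@dataclass
class ChecklistItem:
    boundary: str
    outcome: Outcome
    reason: str = ""

    def __post_init__(self) -> None:
        # Accepted risk without a note is rejected at creation time.
        if self.outcome is Outcome.ACCEPTED_RISK and not self.reason.strip():
            raise ValueError(f"accepted risk for '{self.boundary}' needs a reason")
```

Because the enumeration has no "reviewed" or "discussed" member, an item cannot be closed with one of the vague labels.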
Common pre-launch mistakes
One mistake is testing the agent only through friendly happy paths. Another is testing only the final answer while ignoring tool calls, memory writes, browser actions, and handoffs.
A third mistake is relying on prompt review as proof. A strong prompt is useful, but it does not prove the agent preserves boundaries under realistic pressure. The checklist should include behavior tests, not only configuration review.
A fourth mistake is launching without a regression plan. If a severe failure was found and fixed before launch, the team should decide when that failure will be checked again.
Minimum evidence before launch
A team does not need perfect coverage before every launch, but it should know what evidence supports the decision. For each high-impact boundary, preserve at least one scenario result that shows the boundary holding or failing under realistic pressure.
The evidence should be specific enough for another reviewer to understand the decision later. Include the pressure pattern, the protected boundary, the agent behavior, any tool or browser action, and the reason the result was accepted or rejected.
If the agent has not been tested against a boundary, do not imply that the boundary is safe. Mark it as untested, explain why, and decide whether launch should wait, scope should be reduced, or monitoring should be added after launch.
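A simple completeness check can keep an untested boundary from being read as safe. In this sketch the field names mirror the evidence list above, and the status labels are assumptions. The tool or browser action is treated as optional, since it applies only when such an action occurred.

```python
from typing import Optional

# Fields every scenario result should carry; names are illustrative.
REQUIRED_FIELDS = (
    "pressure_pattern",
    "protected_boundary",
    "agent_behavior",
    "decision_reason",
)


def evidence_status(evidence: Optional[dict]) -> tuple[str, list[str]]:
    """Return a status label and the list of missing required fields."""
    if evidence is None:
        # No scenario result at all: mark untested rather than implying safety.
        return "untested", list(REQUIRED_FIELDS)
    missing = [field for field in REQUIRED_FIELDS if not evidence.get(field)]
    return ("incomplete" if missing else "documented"), missing
```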
What to revisit after launch
Post-launch review should focus on the differences between test conditions and real conditions. Real users may ask in different language, bring new documents, use different channels, or combine requests in ways the original scenarios did not cover.
Revisit the checklist when the agent receives new tools, accesses new data, serves a new user group, changes model, changes prompt, adds memory, or starts operating in a browser or multi-agent workflow. Each change can create a new path for manipulated delegation.
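The revisit triggers can be kept as a lookup from change type to the boundary checks worth rerunning. The mapping below is an illustrative assumption, not a rule from the checklist, and each team would define its own. Unrecognized changes fall back to rerunning everything.

```python
# Boundary check names follow the categories named earlier in the checklist.
ALL_CHECKS = {
    "identity", "authorization", "data_scope", "tool_preconditions",
    "source_trust", "memory_integrity", "delegation",
}

# Hypothetical mapping from a change to the checks it most plausibly affects.
RERUN_ON_CHANGE = {
    "new_tool": {"tool_preconditions", "authorization"},
    "new_data_source": {"data_scope", "source_trust"},
    "new_user_group": {"identity", "authorization"},
    "memory_added": {"memory_integrity"},
    "browser_workflow": {"source_trust", "tool_preconditions"},
    "multi_agent_workflow": {"delegation", "source_trust"},
    "model_change": ALL_CHECKS,
    "prompt_change": ALL_CHECKS,
}


def checks_to_rerun(changes: list[str]) -> list[str]:
    """Return the sorted union of checks triggered by the given changes."""
    selected: set[str] = set()
    for change in changes:
        # Unknown change types conservatively trigger every check.
        selected |= RERUN_ON_CHANGE.get(change, ALL_CHECKS)
    return sorted(selected)
```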
The checklist becomes more valuable over time when every incident, near miss, or verified fix improves the next version of the test set.