🏛️ Govern
Best practices checklist
For product and engineering teams.
The one-page takeaway. Each item maps back to the pillars and frameworks covered in earlier lessons.
Before you build
- Agent Canvas filled and reviewed — All eight cells, with an explicit "not for" list.
- REMIT worksheet completed — Named owner, envelope in code, monitoring plan, identity, trust level.
- System Prompt Builder run — Six ingredients in order; no single section longer than ~200 words.
- NIST GenAI Profile mapped — Which of the twelve risks apply, with explicit mitigations for each.
- EU AI Act classification — Risk tier documented (prohibited / high-risk / limited / minimal / GPAI), plus your role as provider or deployer.
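The "envelope in code" item above means the agent's permissions live in one reviewable object rather than scattered across prompts. A minimal sketch, with hypothetical field and tool names, deny-by-default:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentEnvelope:
    """Sketch of an envelope in code: permitted tools and limits in one place."""
    owner: str                      # named owner from the REMIT worksheet
    allowed_tools: frozenset[str]   # anything not listed is denied
    max_spend_usd: float            # hard per-session spend cap
    not_for: frozenset[str]         # the explicit "not for" list from the canvas

    def permits(self, tool: str, spent_usd: float) -> bool:
        """Deny-by-default check, run before every tool call."""
        return tool in self.allowed_tools and spent_usd < self.max_spend_usd

envelope = AgentEnvelope(
    owner="jane@example.com",
    allowed_tools=frozenset({"search_kb", "create_ticket"}),
    max_spend_usd=5.0,
    not_for=frozenset({"legal advice", "refund approval"}),
)

print(envelope.permits("create_ticket", spent_usd=1.2))  # True
print(envelope.permits("send_email", spent_usd=1.2))     # False: not in envelope
```

Because the envelope is plain code, it can be reviewed in a pull request and diffed between authority levels, which is harder to do with prompt-only guardrails.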
Before you launch
- Golden dataset of 20+ cases — Covering happy path, edge, adversarial, ambiguous, and handoff.
- Five tests passed — Happy, Edge, Adversarial, Ambiguous, Handoff.
- Red-team suite run — OWASP LLM Top 10 + your domain-specific attacks.
- Circuit breakers wired — Spend caps, tool-call caps, canary failures trigger automatic halt.
- Human oversight model chosen — Based on the risk × complexity matrix, and documented.
- Monitoring dashboard live — Action logs, tool-call regressions, quality regressions, cost, and latency.
- Kill-switch path — Exists outside the agent's runtime, and is known to on-call.
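The circuit-breaker item above can be sketched in a few lines. This is a minimal illustration, not a production pattern: it trips on a spend cap, a tool-call cap, or a canary failure, and stays open until a human resets it (the kill-switch path and paging hooks are assumed to live elsewhere).

```python
class CircuitBreaker:
    """Trips on spend cap, tool-call cap, or canary failure; stays tripped."""

    def __init__(self, max_spend_usd: float, max_tool_calls: int):
        self.max_spend_usd = max_spend_usd
        self.max_tool_calls = max_tool_calls
        self.spend_usd = 0.0
        self.tool_calls = 0
        self.tripped = False

    def record(self, cost_usd: float = 0.0, canary_failed: bool = False) -> None:
        """Called after each tool call; trips the breaker if any cap is hit."""
        self.spend_usd += cost_usd
        self.tool_calls += 1
        if (self.spend_usd >= self.max_spend_usd
                or self.tool_calls >= self.max_tool_calls
                or canary_failed):
            self.tripped = True  # automatic halt; page the on-call here

    def allow(self) -> bool:
        return not self.tripped

breaker = CircuitBreaker(max_spend_usd=5.0, max_tool_calls=10)
breaker.record(cost_usd=1.0)
print(breaker.allow())   # True: under both caps
breaker.record(cost_usd=4.5)
print(breaker.allow())   # False: spend cap exceeded, agent halted
```

The key design choice is that the breaker latches: once tripped, no further agent action is allowed until the named owner investigates and resets it.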
After you launch
- Per-deploy regression run — Full golden dataset executed on every deploy; alert on any regression.
- Daily drift checks — Input distribution, output distribution, tool-call mix.
- Weekly human review — Sampled traces read, patterns noted, rubric updated.
- Authority review cadence — Monthly for new agents; quarterly for mature ones. Evidence-based promotion or demotion.
- Incident response plan — If a circuit breaker fires: who gets paged, what they do, when the board finds out.
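One way to make the daily drift check on tool-call mix concrete is total variation distance between yesterday's baseline distribution and today's. A minimal sketch with made-up tool names and an illustrative alert threshold:

```python
from collections import Counter

def tool_call_drift(baseline: list[str], today: list[str]) -> float:
    """Total variation distance between two tool-call distributions (0 to 1)."""
    b, t = Counter(baseline), Counter(today)
    nb, nt = sum(b.values()), sum(t.values())
    return 0.5 * sum(abs(b[k] / nb - t[k] / nt) for k in set(b) | set(t))

baseline = ["search_kb"] * 80 + ["create_ticket"] * 20
today = ["search_kb"] * 50 + ["create_ticket"] * 30 + ["send_email"] * 20

drift = tool_call_drift(baseline, today)
print(f"drift = {drift:.2f}")  # 0.30: well above a threshold like 0.15, so alert
```

The same comparison works for input and output distributions once they are bucketed; the threshold itself should come from observing a few weeks of normal variation, not from a guess.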