AI Agent Ops Leak Checklist | GnawClaw

Use this before trusting any AI-agent or automation workflow with real business work.

Cost leaks

  • Which model or tool runs each task?
  • Are cheap tasks routed to expensive models?
  • Are failed retries capped?
  • Is monthly spend visible in one place?
  • Are duplicate tools doing the same job?
  • Are unused subscriptions cancelled or scheduled for cancellation?
  • Is there an alert before spend crosses the monthly comfort line?
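As a rough illustration of the cost checks above, here is a minimal Python sketch that routes cheap tasks to a cheap model, caps retries, and flags spend before it reaches the comfort line. The model names, prices, task kinds, retry cap, and budget are all made-up assumptions, not figures from any real provider.

```python
from dataclasses import dataclass

# Hypothetical per-attempt prices in USD; real prices depend on your providers.
MODEL_COST = {"small": 0.01, "large": 0.20}

MAX_RETRIES = 2               # assumed hard cap on retries per task
MONTHLY_COMFORT_LINE = 50.0   # assumed monthly budget in USD
ALERT_AT = 0.8                # alert once 80% of the comfort line is spent


def pick_model(task_kind):
    """Route simple task kinds to the cheap model, everything else to the large one."""
    cheap_kinds = {"classify", "extract", "summarize_short"}
    return "small" if task_kind in cheap_kinds else "large"


@dataclass
class SpendTracker:
    """Keeps all spend in one place so it can be inspected and alerted on."""
    spent: float = 0.0

    def record(self, model):
        self.spent += MODEL_COST[model]

    def should_alert(self):
        return self.spent >= ALERT_AT * MONTHLY_COMFORT_LINE


def run_with_cap(task_kind, attempt_fn, tracker):
    """Try a task at most 1 + MAX_RETRIES times; every attempt is billed."""
    model = pick_model(task_kind)
    for _ in range(1 + MAX_RETRIES):
        tracker.record(model)
        if attempt_fn(model):
            return True
    return False  # give up instead of retrying forever
```

Here attempt_fn stands in for whatever actually calls the model. The point of the sketch is that the retry loop and the spend counter live in the same code path, so a failing task cannot quietly burn budget.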

Reliability leaks

  • What happens if the preferred model or provider fails?
  • Is the fallback route tested, or only written in config?
  • Does the workflow fail closed when fallback also fails?
  • Is there a log proving completion?
  • Is there an alert when the workflow stops silently?
  • Is there a stale-lock or dead-queue check?
  • Does a human know how to rerun or roll back the workflow?
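One way to picture "fail closed" is the sketch below: it tries each route in order, logs which route completed the task, and raises if every route fails so that nothing downstream acts on partial output. The function and exception names are hypothetical, and each route is assumed to be a plain callable.

```python
import logging

logger = logging.getLogger("workflow")


class WorkflowFailed(Exception):
    """Raised when every route fails; callers must not act on partial output."""


def run_with_fallback(task, routes):
    """Try each (name, fn) route in order; fail closed if all of them fail."""
    errors = []
    for name, fn in routes:
        try:
            result = fn(task)
        except Exception as exc:  # deliberately broad in this sketch
            errors.append(f"{name}: {exc}")
            logger.warning("route %s failed: %s", name, exc)
            continue
        logger.info("task completed via %s", name)  # the completion record
        return name, result
    raise WorkflowFailed("; ".join(errors))
```

Calling this in a test with a primary route that always raises is a cheap way to confirm the fallback is exercised, not only written in config.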

Security leaks

  • Are API keys or tokens present in files, screenshots, logs, or prompts?
  • Are browser sessions and OAuth credentials treated as sensitive?
  • Are external or public actions separated from internal drafts?
  • Are payment, trading, posting, and messaging actions approval-gated?
  • Can the agent access more than it needs?
  • Is there a clear do-not-touch list?
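Two of the security checks above lend themselves to a small sketch: scanning text for things that look like credentials, and refusing gated actions without approval. The patterns are illustrative assumptions and will miss many real secret formats. The gated action names are taken from the checklist item above.

```python
import re

# Illustrative patterns only; real secret scanners use far larger rule sets.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)\b(api[_-]?key|token|secret)\b\s*[:=]\s*\S{8,}"),
]

# Actions that should never run without a human saying yes.
GATED_ACTIONS = {"payment", "trading", "posting", "messaging"}


def find_secrets(text):
    """Return every substring of text that matches a secret-like pattern."""
    return [m.group(0) for p in SECRET_PATTERNS for m in p.finditer(text)]


def is_allowed(action, approved):
    """Gated actions need explicit approval; everything else passes."""
    return approved or action not in GATED_ACTIONS
```

Running find_secrets over logs and prompt templates before they leave the machine is one possible use. It is a first pass and not a substitute for a dedicated scanner.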

Workflow leaks

  • Who owns the workflow after it breaks?
  • What input starts it?
  • What output proves it worked?
  • Which system consumes the output?
  • Does a producer write data that nobody reads?
  • Are handoffs documented?
  • Is the automation still useful if one API changes?
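The question about producers writing data that nobody reads can be checked mechanically if each workflow declares its inputs and outputs. The sketch below assumes two hypothetical dictionaries that map workflow names to the outputs they write and the inputs they read.

```python
def orphaned_outputs(produces, consumes):
    """Return outputs that some workflow writes but no workflow reads."""
    written = {out for outs in produces.values() for out in outs}
    read = {inp for inps in consumes.values() for inp in inps}
    return sorted(written - read)


# Hypothetical workflow names and files, for illustration only.
produces = {"scraper": ["leads.csv", "raw.json"], "scorer": ["scores.csv"]}
consumes = {"scorer": ["leads.csv"], "mailer": ["scores.csv"]}
print(orphaned_outputs(produces, consumes))
```

In this made-up example the only orphan is raw.json, which the scraper writes but nothing consumes.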

Decision leaks

  • What decisions can the agent make alone?
  • What decisions require approval?
  • What is the maximum reversible change allowed?
  • What is the maximum spend allowed?
  • What is the escalation path?
  • Is the business owner alerted only after a repair attempt, rather than on every small failure?
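The decision limits above can be written down as an explicit policy so they are not left implicit in a prompt. This is a minimal sketch. The field names, the default spend limit, and the two outcome labels are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    max_spend_usd: float = 20.0       # assumed per-action spend limit
    allow_irreversible: bool = False  # irreversible changes need a human


def decide(policy, spend_usd, reversible):
    """Return "auto" if the agent may act alone, else "needs_approval"."""
    if not reversible and not policy.allow_irreversible:
        return "needs_approval"
    if spend_usd > policy.max_spend_usd:
        return "needs_approval"
    return "auto"
```

Anything that comes back as "needs_approval" would then follow whatever escalation path the workflow owner has defined.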

Quick score

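Give one point for every item with a safe, confident answer. For risk questions (for example, "Are cheap tasks routed to expensive models?" or "Are API keys or tokens present in files, screenshots, logs, or prompts?"), the safe answer is no.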

  • 0-10: fragile stack. Audit before adding more tools.
  • 11-20: useful but leaky. Fix reliability and cost visibility.
  • 21-30: operationally serious. Add monitoring and documentation.
  • 31-33: strong. Keep testing fallbacks monthly.
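For anyone tracking the score over time, the bands above map directly onto a small function. The function name is hypothetical; the thresholds and labels are the ones listed above.

```python
def score_band(points):
    """Map a checklist score to the band it falls in."""
    if points < 0:
        raise ValueError("score cannot be negative")
    if points <= 10:
        return "fragile stack"
    if points <= 20:
        return "useful but leaky"
    if points <= 30:
        return "operationally serious"
    return "strong"
```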