Overview

CelerFlow continuously monitors your agents and generates risk alerts when it detects anomalous behavior. Alerts help you catch problems before they escalate — a sudden spike in token usage, an agent going offline, or an unusually high rate of blocked operations.

Alert types

| Type | Trigger | Severity | What to do |
| --- | --- | --- | --- |
| token_spike | Token usage exceeds 3x the 7-day average | High | Check if the agent is in a loop or processing unexpectedly large inputs |
| latency_spike | Tool call latency exceeds 3x the 7-day average | Medium | Check the target service for outages or rate limiting |
| high_block_rate | More than 30% of tool calls blocked in the last hour | High | Review service permissions; the agent may need access to a new service |
| security_anomaly | Unusual pattern of write operations to sensitive services | Critical | Investigate immediately; the agent may be compromised or misconfigured |
| offline | Agent hasn’t sent a health check in 30+ minutes | Low | Check if the agent process is running. Run celerflow doctor for diagnostics |
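
The numeric triggers above can be read as simple threshold checks. The following is a minimal sketch of that reading, not CelerFlow's actual implementation: the `AgentSnapshot` type, its field names, and the `triggered_alerts` function are hypothetical, and the measurement window for the "current" token and latency values is an assumption. `security_anomaly` is left out because its detection pattern is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class AgentSnapshot:
    """Hypothetical per-agent metrics; field names are illustrative only."""
    tokens_now: float           # current token usage (window is an assumption)
    tokens_avg_7d: float        # 7-day average token usage
    latency_now_ms: float       # current tool call latency
    latency_avg_7d_ms: float    # 7-day average tool call latency
    calls_last_hour: int        # tool calls attempted in the last hour
    blocked_last_hour: int      # tool calls blocked in the last hour
    last_health_check: datetime


def triggered_alerts(s: AgentSnapshot, now: datetime) -> List[str]:
    """Return the alert types whose documented thresholds are crossed."""
    alerts = []
    # token_spike: usage exceeds 3x the 7-day average
    if s.tokens_avg_7d > 0 and s.tokens_now > 3 * s.tokens_avg_7d:
        alerts.append("token_spike")
    # latency_spike: latency exceeds 3x the 7-day average
    if s.latency_avg_7d_ms > 0 and s.latency_now_ms > 3 * s.latency_avg_7d_ms:
        alerts.append("latency_spike")
    # high_block_rate: strictly more than 30% of calls blocked in the last hour
    if s.calls_last_hour > 0 and s.blocked_last_hour / s.calls_last_hour > 0.30:
        alerts.append("high_block_rate")
    # offline: no health check for 30 minutes or more
    if now - s.last_health_check >= timedelta(minutes=30):
        alerts.append("offline")
    return alerts
```

In this sketch, an agent with exactly 30% of its calls blocked does not raise `high_block_rate`, because the trigger is worded as "more than 30%".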

Where alerts appear

  • Workspace overview — the alert count badge on the stat card
  • Agent detail page — alerts specific to that agent
  • Notifications — if configured, alerts are forwarded to Telegram, Slack, or WhatsApp

Resolving alerts

Alerts can be resolved from the dashboard:
  1. Open the alert detail
  2. Review the evidence (trace data, timeline, metrics)
  3. Click Resolve with a resolution note
Resolved alerts are kept in the audit log for compliance purposes.

Alert evaluation

Risk alerts are evaluated periodically by a background job. Each evaluation compares an agent’s recent metrics against its own historical baseline. Agents with fewer than 7 days of data are compared against workspace-level baselines instead.
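
The baseline fallback described above could look roughly like the sketch below. It assumes the baseline is a mean over the most recent 7 daily values, which matches the 7-day averages used by the triggers but is not confirmed here. The function name `pick_baseline` and its parameters are hypothetical.

```python
from statistics import mean
from typing import Sequence

MIN_HISTORY_DAYS = 7  # agents with less history use the workspace baseline


def pick_baseline(agent_daily: Sequence[float], workspace_baseline: float) -> float:
    """Choose the baseline an agent's recent metrics are compared against.

    agent_daily holds one metric value per day of history, oldest first
    (an assumed representation, for illustration only).
    """
    if len(agent_daily) < MIN_HISTORY_DAYS:
        # New agent: not enough history, fall back to the workspace level
        return workspace_baseline
    # Established agent: average of its own most recent 7 days
    return mean(agent_daily[-MIN_HISTORY_DAYS:])
```

For example, an agent with three days of history is compared against the workspace baseline, while an agent with seven or more days is compared against its own average.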