Alerting Troubleshooting

Why is my alert missing?

Start with these checks, in order (a combined triage sketch follows the list):

  1. Confirm the tenant itself is active.
  2. Confirm jobs.alerts.config.enabled is true.
  3. Confirm the specific alert definition inside jobs.alerts.config.alerts[] has enabled: true.
  4. Confirm the upstream runtime source exists and is fresh.
  5. Confirm the alert loop is no longer waiting on packages.
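
A minimal sketch that combines the five checks; the jobs.alerts.config paths come from this page, while the tenant_config and runtime_state shapes and the remaining key names are assumptions, not the tool's actual storage model:

```python
# Combined triage sketch for the checklist above. The jobs.alerts paths
# are from this page; tenant_config / runtime_state shapes are assumed.
def triage_missing_alert(tenant_config, runtime_state, alert_type):
    if not tenant_config.get("active"):
        return "1: tenant inactive"
    cfg = tenant_config.get("jobs", {}).get("alerts", {}).get("config", {})
    if not cfg.get("enabled"):
        return "2: jobs.alerts.config.enabled is false"
    definition = next(
        (a for a in cfg.get("alerts", []) if a.get("type") == alert_type), None
    )
    if definition is None or not definition.get("enabled"):
        return "3: alert definition missing or not enabled"
    if runtime_state.get("source_stale", True):
        return "4: upstream runtime source missing or stale"
    if runtime_state.get("waiting_on_packages", False):
        return "5: alert loop still waiting on package sync"
    return "all checks pass; inspect the alert evaluation itself"
```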

Typical root causes

Symptom                        | Likely cause                   | Where to look
-------------------------------|--------------------------------|----------------------------------
no alerts at all               | jobs.alerts disabled           | tenant config and Worker Runtime
message alerts missing         | message sync stale or filtered | Message Popups and Worker Runtime
iFlow alerts missing           | package or artifact sync stale | Artifacts and Worker Runtime
keystore alerts missing        | keystore sync stale            | Keystore Entries
daily no-message alert missing | weekday or window mismatch     | alert definition in daily_check

Concrete checks

Check 1: alert config exists

Expected structure:

  • jobs.alerts.config.enabled
  • jobs.alerts.config.repeat_interval
  • jobs.alerts.config.alerts[]

Typical example values (combined into the sketch after this list):

  • repeat_interval = 60
  • type = alert_messages
  • type = alert_iflows
  • type = alert_keystore
  • type = alert_iflow_no_messages_daily
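
Putting the expected structure and example values together, a config fragment could look roughly like this, expressed as a Python dict. The dotted paths, alert types, repeat_interval value, and per-alert enabled flag are from this page; any other fields an alert definition carries are not shown and the surrounding shape is an assumption:

```python
# Illustrative fragment only: paths, types, repeat_interval, and the
# per-alert "enabled" flag are from this page; the rest is assumed shape.
jobs_alerts_config = {
    "enabled": True,
    "repeat_interval": 60,
    "alerts": [
        {"type": "alert_messages", "enabled": True},
        {"type": "alert_iflows", "enabled": True},
        {"type": "alert_keystore", "enabled": True},
        {"type": "alert_iflow_no_messages_daily", "enabled": True},
    ],
}
```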

Check 2: upstream data is available

Alerting evaluates stored runtime data, not live CPI responses.

Check these related areas (per the root-cause table above):

  • Message Popups (for alert_messages)
  • Artifacts (for alert_iflows)
  • Keystore Entries (for alert_keystore)
  • Worker Runtime (for overall sync freshness)

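One way to reason about staleness, as a sketch: compare each stored sync timestamp against a threshold. The per-area last-synced timestamps and the one-hour threshold are assumptions for illustration, not documented values:

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness check; per-area sync timestamps are an assumed
# representation of the stored runtime data, and MAX_AGE is illustrative.
MAX_AGE = timedelta(hours=1)

def stale_areas(sync_times: dict) -> list:
    now = datetime.now(timezone.utc)
    return [area for area, ts in sync_times.items() if now - ts > MAX_AGE]

sync_times = {
    "messages": datetime.now(timezone.utc) - timedelta(minutes=5),
    "artifacts": datetime.now(timezone.utc) - timedelta(hours=3),
    "keystore": datetime.now(timezone.utc) - timedelta(minutes=30),
}
print(stale_areas(sync_times))  # -> ['artifacts']
```
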
Check 3: worker dependency is satisfied

The alert loop explicitly waits for package sync to complete before evaluating. If package sync has not yet completed, alerts can be delayed even when the alert configuration looks correct.

See Worker Runtime Troubleshooting.
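
The gating can be pictured like this; only the wait-on-packages behavior is from this page, the function and flag names are assumptions:

```python
# Sketch of the dependency: alert evaluation only runs once package sync
# reports completion. Names are illustrative; the gating is the point.
def run_alert_cycle(worker_state, evaluate_alerts):
    if not worker_state.get("package_sync_done"):
        # Alerts are delayed this cycle even if the alert config is valid.
        return "waiting on package sync"
    return evaluate_alerts()
```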

Why do I only see acknowledged or outdated alerts?

Observed runtime states include:

  • alerted
  • acknowledged
  • outdated

What each state means:

  • alerted means currently open in the alert lifecycle
  • acknowledged means a user changed the alert state
  • outdated means an earlier alert row was superseded by a newer row for the same trigger chain

Some overview queries explicitly exclude outdated, so a historical alert can exist in storage but not appear in the main operational overview.
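
As a sketch of that filtering behavior (the row shape is assumed):

```python
# Why a stored alert may not appear in the overview: overview-style
# filters exclude 'outdated' rows. Row shape is an assumed representation.
rows = [
    {"id": 1, "state": "alerted"},
    {"id": 2, "state": "acknowledged"},
    {"id": 3, "state": "outdated"},  # exists in storage...
]
overview = [r for r in rows if r["state"] != "outdated"]
print([r["id"] for r in overview])  # -> [1, 2]  ...but row 3 is hidden
```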

Why is an alert counted as open for more than 48 hours?

Open-too-long filtering uses:

  • origin_time_alerted if present
  • otherwise time_alerted

That means a renewed alert chain can still be treated as long-open if it inherits the original open time.
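
The fallback rule is easy to mirror in a sketch; the origin_time_alerted / time_alerted preference is from this page, while the row shape is an assumed representation:

```python
from datetime import datetime, timedelta, timezone

# Open-too-long rule from above: prefer origin_time_alerted, otherwise
# fall back to time_alerted. Row shape is assumed.
def is_long_open(row, threshold=timedelta(hours=48)):
    opened = row.get("origin_time_alerted") or row["time_alerted"]
    return datetime.now(timezone.utc) - opened > threshold

# A renewed row inheriting the original open time still counts as long-open:
renewed = {
    "time_alerted": datetime.now(timezone.utc) - timedelta(hours=2),
    "origin_time_alerted": datetime.now(timezone.utc) - timedelta(hours=72),
}
print(is_long_open(renewed))  # -> True
```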

Why did acknowledgement not solve the problem?

Acknowledgement changes the alert state in persistence, but it does not clear the underlying technical condition.

If the underlying problem persists:

  • the alert may remain logically relevant
  • later evaluations may continue the alert chain rather than closing it (see the sketch after this list)
  • the next place to inspect is the source runtime data, not the alert row itself
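
A sketch of that continuity, with all names and the row shape assumed: acknowledging updates the stored state, but the next evaluation still sees the condition and continues the chain:

```python
# Assumed shapes throughout. The point: acknowledgement does not stop a
# later evaluation from continuing the chain while the condition holds.
def evaluate(condition_still_true, previous_row):
    if not condition_still_true:
        return None  # condition cleared; nothing new to alert on
    # A new row supersedes the old one but inherits the original open
    # time, so the chain stays "long open" (see the 48-hour rule above).
    return {
        "state": "alerted",
        "origin_time_alerted": previous_row.get("origin_time_alerted")
        or previous_row.get("time_alerted"),
    }
```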

Fast diagnosis paths