Health dashboard

The Health tab is the main system-level diagnostics page. It starts with a small set of summary cards, then shows detail sections for specific risks.

Use it when you want to answer questions like:

  • Is the instance under checkpoint or I/O pressure?
  • Are long transactions blocking cleanup?
  • Is wraparound risk building up?
  • Are replication slots or statement statistics becoming a problem?

Top-level cards

Card | What it means
--- | ---
Cache hit ratio | Whether reads are mostly served from cache or are falling through to disk more often than expected.
Connections | Current connection usage relative to the configured ceiling.
Oldest transaction | Whether a long-running transaction may be blocking vacuum progress or tuple cleanup.
TXID wraparound risk | Whether transaction ID age or multixact age is approaching a dangerous range.

The UI shows the current value, severity, and explanation. The exact internal scoring thresholds are not documented in detail because they may change as the product logic improves.
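
To make the cards more concrete, here is a minimal sketch of how the raw values behind them could be derived from standard PostgreSQL statistics views. It is not the product's implementation: the query constants, the helper names, and the use of 2^31 as the wraparound ceiling are assumptions chosen for illustration, and no severity thresholds are implied.

```python
from typing import Optional

# Illustrative queries against standard PostgreSQL views (assumed, not the
# product's own SQL). Run them with any client and feed the results below.
CACHE_HIT_SQL = "SELECT sum(blks_hit) AS hit, sum(blks_read) AS read FROM pg_stat_database"
CONNECTIONS_SQL = """
SELECT count(*) AS used, current_setting('max_connections')::int AS max_conn
FROM pg_stat_activity
WHERE backend_type = 'client backend'
"""
OLDEST_XACT_SQL = """
SELECT max(now() - xact_start) AS oldest
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
"""
WRAPAROUND_SQL = """
SELECT max(age(datfrozenxid)) AS xid_age, max(mxid_age(datminmxid)) AS mxid_age
FROM pg_database
"""

# Assumption: treat 2^31 as the hard ceiling for transaction ID age.
# PostgreSQL stops accepting writes somewhat before this point.
XID_CEILING = 2**31


def cache_hit_ratio(blks_hit: int, blks_read: int) -> Optional[float]:
    """Share of block reads served from shared buffers, or None if no reads."""
    total = blks_hit + blks_read
    return blks_hit / total if total else None


def connection_usage(used: int, max_connections: int) -> float:
    """Fraction of the configured connection ceiling currently in use."""
    return used / max_connections


def wraparound_fraction(xid_age: int) -> float:
    """How far the oldest frozen XID age is toward the assumed ceiling."""
    return xid_age / XID_CEILING
```

For example, `cache_hit_ratio(990, 10)` gives 0.99, and an `xid_age` of about 1.07 billion puts `wraparound_fraction` at roughly one half.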

Detail sections

  • Checkpoint stats show timed vs requested checkpoints, checkpoint write/sync time, and related pressure signals. On managed providers, checkpoint frequency may partly reflect provider activity rather than only application writes.
  • Sequences at risk show sequences that are getting close to their maximum value.
  • Replication slots show WAL retained by non-temporary slots. This matters because an inactive or stalled slot can keep WAL files on disk for too long.
  • pg_stat_statements capacity shows whether the statement statistics store is evicting entries while they are still useful.
  • bgwriter maxwritten shows whether the background writer is repeatedly hitting its configured limit.
  • Vacuum horizon blockers show sessions with old snapshots that can hold back dead tuple cleanup.
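
The signals in these sections can be approximated by hand. The sketch below is a hedged illustration, assuming PostgreSQL 10 or later and a connection to a primary; the constant and function names are hypothetical, and the product may compute these values differently.

```python
from typing import Optional

# Illustrative queries (assumed, not the product's own SQL).
# Checkpoint counters: in PostgreSQL 16 and earlier they live in
# pg_stat_bgwriter; PostgreSQL 17 moved them to pg_stat_checkpointer
# (num_timed, num_requested). maxwritten_clean stays in pg_stat_bgwriter.
CHECKPOINT_SQL = """
SELECT checkpoints_timed, checkpoints_req, maxwritten_clean
FROM pg_stat_bgwriter
"""
SEQUENCES_SQL = """
SELECT schemaname, sequencename, last_value, min_value, max_value, increment_by
FROM pg_sequences
"""
SLOTS_SQL = """
SELECT slot_name, active,
       pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes
FROM pg_replication_slots
WHERE NOT temporary
"""
HORIZON_SQL = """
SELECT pid, state, age(backend_xmin) AS xmin_age
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
"""


def requested_checkpoint_share(timed: int, requested: int) -> Optional[float]:
    """Fraction of checkpoints that were requested rather than timed."""
    total = timed + requested
    return requested / total if total else None


def sequence_fraction_used(last_value: Optional[int], min_value: int,
                           max_value: int, increment_by: int = 1) -> Optional[float]:
    """Fraction of a sequence's range already consumed."""
    if last_value is None:  # the sequence has never been used
        return 0.0
    span = max_value - min_value
    if span <= 0:
        return None
    if increment_by > 0:  # ascending sequence moves toward max_value
        return (last_value - min_value) / span
    return (max_value - last_value) / span  # descending moves toward min_value
```

A high requested share suggests checkpoints are being forced by WAL volume rather than by the timer, and a 32-bit sequence whose `last_value` is near 1.07 billion is about half consumed.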

Failure readiness

Two summary cards estimate recovery posture:

  • Crash Recovery (RTO) is a rough signal for how much recovery work may be needed after a crash.
  • Data Safety (RPO) is a rough signal based on replica state, sync mode, and WAL retention hazards.

These are operational summaries, not a substitute for tested disaster recovery procedures.
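
One rough input to a crash recovery estimate is the amount of WAL written since the last checkpoint's redo point, because that WAL must be replayed after a crash. The sketch below assumes this is a reasonable proxy; it is not confirmed to be what the RTO card uses. The query is meant for a primary, and the helper names are hypothetical.

```python
# Illustrative query (assumed, not the product's own SQL), for a primary.
REDO_SQL = """
SELECT redo_lsn, pg_current_wal_lsn() AS current_lsn
FROM pg_control_checkpoint()
"""


def parse_lsn(lsn: str) -> int:
    """Convert a textual LSN such as '0/3000000' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_since(redo_lsn: str, current_lsn: str) -> int:
    """Bytes of WAL between the last redo point and the current position."""
    return parse_lsn(current_lsn) - parse_lsn(redo_lsn)
```

For instance, `wal_bytes_since("0/1000000", "0/3000000")` is 32 MiB of WAL to replay. The time that replay takes depends on hardware and workload, which is why a tested recovery procedure remains the real measure.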

Snooze

Findings can be snoozed locally so accepted conditions do not keep interrupting the current investigation.

Caching

Health data is cached for a short time. While a fresh read loads in the background, the existing content stays visible.
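
This behavior resembles the common stale-while-revalidate pattern. The following is a minimal sketch of that pattern, not the product's actual cache: the class name, the `ttl_seconds` parameter, and the threading approach are all assumptions for illustration, and the real cache lifetime is not documented here.

```python
import threading
import time


class StaleWhileRevalidate:
    """Serve the cached value immediately and refresh it in the background."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # callable that fetches fresh health data
        self._ttl = ttl_seconds    # hypothetical cache lifetime
        self._clock = clock
        self._lock = threading.Lock()
        self._value = None
        self._loaded_at = None
        self._refresh = None       # background refresh thread, if any

    def get(self):
        with self._lock:
            if self._loaded_at is None:
                # First read: nothing to show yet, so load synchronously.
                self._value = self._loader()
                self._loaded_at = self._clock()
                return self._value
            stale = self._clock() - self._loaded_at >= self._ttl
            running = self._refresh is not None and self._refresh.is_alive()
            if stale and not running:
                # Start one background reload; keep serving the old value.
                self._refresh = threading.Thread(target=self._reload, daemon=True)
                self._refresh.start()
            return self._value

    def _reload(self):
        fresh = self._loader()
        with self._lock:
            self._value = fresh
            self._loaded_at = self._clock()

    def wait(self):
        """Block until any in-flight background refresh has finished."""
        thread = self._refresh
        if thread is not None:
            thread.join()
```

A caller would wrap the function that reads health data, for example `StaleWhileRevalidate(load_health, ttl_seconds=30)` with a hypothetical `load_health`, and call `get()` on every render. Readers never wait on a refresh after the first load, at the cost of occasionally seeing data that is one refresh behind.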