
How the Reflex Brain repair cycle works

The Reflex Team · 7 min · 2 May 2026

If you only remember one sentence: the Brain is a policy engine on top of reflexd facts — it never guesses in a vacuum, and it never silently escalates privilege beyond what your team configured.

1. Detect

reflexd runs on your server. It collects host-level truth: systemd and process health, PHP-FPM pool pressure, nginx upstream errors, disk trends, queue depth, certificate expiry windows, and (for PHP) deeper runtime signals when the Zend extension is present.

Detection is not "one metric crossed a line". The agent normalises signals into structured events: what changed, how severe, what was running nearby (deploy markers, recent releases).
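To make "structured event" concrete, here is a minimal sketch in Python. It is not reflexd's actual schema: the `Event` fields, the `Severity` tiers, and the `normalise` helper are all hypothetical names chosen for illustration. The point is only that an event carries the trend and the nearby context along with the raw value.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import IntEnum
from typing import Optional, Tuple


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Event:
    """Hypothetical normalised event; every field name here is illustrative."""
    signal: str               # which signal moved, e.g. "php_fpm.listen_queue"
    change: str               # what changed, in plain words
    severity: Severity        # how severe
    host: str
    observed_at: datetime
    nearby: Tuple[str, ...]   # what was running nearby: deploy markers, releases


def normalise(signal: str, previous: float, current: float, limit: float,
              host: str, nearby: Tuple[str, ...] = ()) -> Optional[Event]:
    """Build an Event from a reading plus its trend and context.

    Returns None when the value is under the limit and not rising.
    """
    over = current >= limit
    rising = current > previous
    if not over and not rising:
        return None
    if over and rising:
        severity = Severity.HIGH      # over the limit and still climbing
    elif over:
        severity = Severity.MEDIUM    # over the limit but recovering
    else:
        severity = Severity.LOW       # under the limit but trending up
    return Event(
        signal=signal,
        change=f"{previous:g} -> {current:g} (limit {limit:g})",
        severity=severity,
        host=host,
        observed_at=datetime.now(timezone.utc),
        nearby=tuple(nearby),
    )
```

In this sketch the same reading of 35 against a limit of 30 is HIGH if it climbed from 20 and MEDIUM if it fell from 40, which is the difference between an event and a threshold alarm.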

2. Evaluate

The Brain receives those events with your team policy: which playbooks exist, which severities auto-run, which always require human confirmation, and which are blocked entirely on production.

Evaluation answers three questions in order:

  1. Is there a matching playbook? If not, you get a crisp alert — not a generic CPU graph.
  2. Is it safe under current policy? Risk tier, environment, and playbook metadata gate execution.
  3. Is this a duplicate storm? Debouncing stops fifty identical repairs during a single upstream outage.
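The three questions above can be sketched as a small evaluator that asks them in the same order. This is an assumption-laden illustration, not the Brain's implementation: `Playbook`, `Evaluator`, the string outcomes, and the 300-second debounce window are all invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple


@dataclass(frozen=True)
class Playbook:
    name: str
    signal: str   # the event signal this playbook answers
    risk: str     # assumed tier label, e.g. "LOW" or "HIGH"


class Evaluator:
    """Hypothetical evaluator: match, then policy gate, then debounce."""

    def __init__(self, playbooks: Iterable[Playbook],
                 blocked: Set[Tuple[str, str]], window_s: float = 300.0):
        self.playbooks: Dict[str, Playbook] = {p.signal: p for p in playbooks}
        self.blocked = blocked        # (environment, risk) pairs that never run
        self.window_s = window_s      # assumed debounce window in seconds
        self._last_run: Dict[Tuple[str, str], float] = {}

    def evaluate(self, signal: str, host: str, environment: str, now: float) -> str:
        # 1. Is there a matching playbook? If not, fall back to an alert.
        playbook = self.playbooks.get(signal)
        if playbook is None:
            return "alert"
        # 2. Is it safe under current policy?
        if (environment, playbook.risk) in self.blocked:
            return "blocked"
        # 3. Is this a duplicate storm? One repair per host and playbook per window.
        key = (host, playbook.name)
        last = self._last_run.get(key)
        if last is not None and now - last < self.window_s:
            return "debounced"
        self._last_run[key] = now
        return "proceed"
```

The order matters in this sketch: a blocked playbook never touches the debounce table, so a policy change takes effect immediately instead of waiting for a window to expire.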

3. Confirm or auto-run

Some teams want manual confirm for everything until they trust the defaults. Others graduate to auto-repair for well-understood incidents (disk cleanup for known-safe log paths, graceful FPM reload after crash loops) while keeping HIGH-risk actions paused.

The UI shows what will run, why it matches, and what rollback looks like when rollback exists.
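One way to picture the graduation from "confirm everything" to selective auto-repair is as two policy tables over risk tiers. The sketch below is hypothetical: only the HIGH tier is named in this post, so `LOW` and `MEDIUM`, the `Mode` enum, the `Proposal` fields, and both policy tables are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Mode(Enum):
    AUTO = "auto"
    CONFIRM = "confirm"


@dataclass(frozen=True)
class Proposal:
    action: str               # what will run
    reason: str               # why it matches
    risk: str                 # assumed tier names: "LOW", "MEDIUM", "HIGH"
    rollback: Optional[str]   # None when no rollback exists


# A team that does not yet trust the defaults confirms every tier.
CONFIRM_EVERYTHING: Dict[str, Mode] = {
    "LOW": Mode.CONFIRM, "MEDIUM": Mode.CONFIRM, "HIGH": Mode.CONFIRM,
}
# A team that has graduated lets well-understood, low-risk repairs run alone.
GRADUATED: Dict[str, Mode] = {
    "LOW": Mode.AUTO, "MEDIUM": Mode.CONFIRM, "HIGH": Mode.CONFIRM,
}


def mode_for(policy: Dict[str, Mode], proposal: Proposal) -> Mode:
    """Unknown tiers fall back to CONFIRM, the more cautious choice."""
    return policy.get(proposal.risk, Mode.CONFIRM)


def preview(proposal: Proposal) -> str:
    """Text a confirmation prompt might show; rollback only when one exists."""
    lines = [f"Will run: {proposal.action}", f"Because: {proposal.reason}"]
    if proposal.rollback is not None:
        lines.append(f"Rollback: {proposal.rollback}")
    return "\n".join(lines)
```

Defaulting unknown tiers to `CONFIRM` is a deliberate choice in this sketch: a typo in a playbook's risk label should cost a click, not an unattended repair.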

4. Verify

After a repair action, reflexd re-checks the same signals: did the pool come back healthy? Did the listen queue drain? Did TLS handshake succeed with the renewed certificate?

If verification fails, the Brain stops patting itself on the back — it opens an incident with the verification context attached.
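The verify step can be sketched as re-running named checks and turning any failure into an incident that carries the full results. Again this is illustrative only: `verify`, `Incident`, and the check names are hypothetical, and real checks would query the host rather than return constants.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Incident:
    playbook: str
    failed_checks: Tuple[str, ...]
    results: Dict[str, bool]   # the verification context, attached in full


def verify(playbook: str,
           checks: Dict[str, Callable[[], bool]]) -> Optional[Incident]:
    """Re-run every check after a repair.

    Returns None when all checks pass, otherwise an Incident that lists the
    failed checks alongside the results of every check that ran.
    """
    results = {name: bool(check()) for name, check in checks.items()}
    failed = tuple(name for name, ok in results.items() if not ok)
    if not failed:
        return None
    return Incident(playbook=playbook, failed_checks=failed, results=results)


# Example: the pool recovered but the listen queue did not drain.
incident = verify("fpm-reload", {
    "pool_healthy": lambda: True,
    "listen_queue_drained": lambda: False,
})
```

Keeping the passing checks in the incident is intentional here: knowing that the pool is healthy while the queue is not narrows the diagnosis before anyone logs in.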

5. Audit

Every decision, and every action that crosses the shell boundary, is written to an audit log: who approved, what command class ran, timestamps, and correlation IDs. That is not compliance theatre; it is how you answer "what did Reflex change at 03:12?" without grepping root history across forty boxes.
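As a rough picture of what such a record could look like, the sketch below writes one JSON line per decision and answers the 03:12 question with a filter. The field names, the `audit_line` and `changes_at` helpers, and the JSON-lines format are assumptions for illustration, not Reflex's actual log format.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Dict, Iterable, List, Optional


def audit_line(approved_by: str, command_class: str, decision: str,
               correlation_id: Optional[str] = None,
               at: Optional[datetime] = None) -> str:
    """Serialise one audit record as a single JSON line (hypothetical schema)."""
    at = at or datetime.now(timezone.utc)
    return json.dumps({
        "at": at.isoformat(timespec="seconds"),
        "approved_by": approved_by,          # who approved
        "command_class": command_class,      # what class of command ran
        "decision": decision,
        "correlation_id": correlation_id or str(uuid.uuid4()),
    }, sort_keys=True)


def changes_at(lines: Iterable[str], hhmm: str) -> List[Dict[str, str]]:
    """Return the records whose timestamp falls in the given HH:MM minute."""
    matches = []
    for line in lines:
        record = json.loads(line)
        if datetime.fromisoformat(record["at"]).strftime("%H:%M") == hhmm:
            matches.append(record)
    return matches
```

With records shaped like this, `changes_at(log_lines, "03:12")` returns every decision made in that minute, and the correlation ID ties each one back to the event and verification that surrounded it.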

How this differs from a runbook wiki

Your wiki describes what a human should do. The Brain encodes what a machine may do, under caps you set, with evidence attached. The wiki still matters for novel failures; Reflex is for the boring repeats that steal sleep.

Next steps