What does self-healing mean in Reflex?

Detectors observe production signals, the Brain evaluates safe repair steps, and playbooks execute with an audit trail — not silent restarts.

How do deploys relate to repair?

Eligible tiers record deploy markers and health gates so regressions after a release correlate with metrics and rollback paths.

How Reflex works

A technical walkthrough of the five layers that monitor, diagnose and automatically repair your servers.

Five-layer architecture

Reflex is not a monitoring tool with a notification system bolted on. It is a full-stack observability and repair platform, built from the ground up to act — not just alert.

LAYER 1: Reflex Local (Dev)
LAYER 2: Reflex Pipeline (Dev)
LAYER 3: Reflex Server (Runtime)
LAYER 4: Reflex Brain (Brain)
LAYER 5: Reflex Dashboard

LAYER 1

Reflex Local

Reflex Local runs directly in your development environment. The VS Code extension provides inline warnings when it detects configuration patterns known to cause server issues in production — for example, a PHP memory_limit that will cause OOM kills under load, or a queue timeout shorter than the slowest scheduled task. The CLI exposes the same checks as a command you can run manually or integrate into git hooks. The pre-commit hook runs a fast subset of checks — config validation, environment parity checks — before every commit reaches your repository.

The philosophy is simple: the cheapest incident is the one that never reaches production. Every misconfiguration caught in a developer's editor is one fewer 3am alert. Reflex Local integrates into the existing workflow without adding friction — warnings appear inline, the CLI command exits non-zero when issues are found, and the pre-commit hook is a one-line addition to any project.

Configuration validation covers the most common PHP, nginx, and MySQL settings that cause production incidents. The environment parity checks compare your local .env against a manifest of required production keys — so you don't deploy code that will silently fail because a key wasn't propagated.
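
The parity check described above can be pictured as a set difference between the keys an env file defines and the keys a manifest requires. The following is a minimal sketch under that assumption, not Reflex's actual implementation; the function names `parse_env_keys` and `missing_keys` and the manifest format (a plain list of key names) are hypothetical.

```python
from typing import List, Set


def parse_env_keys(env_text: str) -> Set[str]:
    """Return the variable names defined in dotenv-style text."""
    keys = set()
    for raw_line in env_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # Tolerate an optional "export " prefix.
        if line.startswith("export "):
            line = line[len("export "):]
        name, sep, _value = line.partition("=")
        if sep and name.strip():
            keys.add(name.strip())
    return keys


def missing_keys(env_text: str, required: List[str]) -> List[str]:
    """List required keys the env file does not define, in manifest order."""
    present = parse_env_keys(env_text)
    return [key for key in required if key not in present]
```

Calling `missing_keys(env_text, ["APP_ENV", "APP_KEY", "DB_HOST"])` on a file that defines only APP_ENV and DB_HOST returns `["APP_KEY"]`. Only key names are compared, never values.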

What Reflex can detect in this layer:

→Configuration drift detection
→Environment parity warnings
→Pre-deploy config validation
→Memory limit misconfiguration detection

What Reflex cannot yet do:

→Runtime monitoring (that is Reflex Server)
→Database schema validation
→Container health checks

# Run Reflex Local checks before deploying
reflex check --env production --config ./config

# Output
✓  PHP memory_limit: 256M (recommended: 512M for this queue depth)
⚠  queue.timeout: 30s (your slowest job: crop_images averages 47s)
✗  APP_KEY not found in .env.production

LAYER 2

Reflex Pipeline

Reflex Pipeline is the deployment and provisioning layer. It understands the state of your infrastructure before a deployment runs, and it knows what a healthy deployment looks like on the other side. Deployments are atomic — code changes are staged, health-checked, and promoted or rolled back automatically.

The Brain is involved in deployment decisions: if a recent deployment is correlated with an anomaly pattern, Reflex can initiate an automatic rollback before the incident escalates. This is the layer that makes Reflex a Forge and DeployHQ replacement for teams who want the intelligence layer and the deployment layer in one place.

Pipeline integrates with your existing git workflow. Push to your deployment branch, and Pipeline handles the rest: SSH connection, dependency installation, migration execution, asset compilation, process restart, and health verification. The health check is not a simple HTTP 200 — it verifies that PHP-FPM workers are accepting requests, that queue workers are running, and that database connectivity is live. Only when all checks pass does the deployment get promoted. If any check fails, the atomic rollback fires automatically.
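
The promote-or-rollback rule can be sketched as a gate over named health checks: promote only when every check passes, otherwise roll back and report what failed. This is an illustrative sketch, not Pipeline's real code; `gate_release` and the callable-per-check shape are assumptions.

```python
from typing import Callable, Dict, List, Tuple


def gate_release(checks: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run every health check and decide whether to promote or roll back.

    Returns ("promote", []) only when all checks pass; otherwise
    ("rollback", [names of the failed checks]).
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = bool(check())
        except Exception:
            # A check that raises counts as a failed check.
            ok = False
        if not ok:
            failed.append(name)
    if failed:
        return ("rollback", failed)
    return ("promote", [])
```

In practice each callable would probe one component (PHP-FPM, queue workers, database). All checks run even after a failure, so a rollback report can name every failing component at once.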

What Reflex can repair in this layer:

→Automatic rollback on failed deployments
→Health-check gating before promotion
→Zero-downtime release management
→Streaming deploy logs and change preview in the dashboard
→VCS webhooks for GitHub, GitLab, and Bitbucket
→Optional Laravel Forge server import

What Reflex cannot yet do:

→Database migration rollback (complex schema changes — planned)
→Multi-region deployment orchestration

# reflex.deploy.yml
deployment:
  strategy: atomic
  health_check:
    timeout: 30s
    checks:
      - php-fpm
      - queue-workers
      - database
  rollback_on_anomaly: true
  notify: slack

LAYER 3

Reflex Server

Reflex Server is the deepest telemetry layer — this is where the technical moat lives. The Go agent runs on every monitored server and collects standard metrics (CPU, memory, disk, network). On top of this, a PHP Zend extension instruments the PHP runtime directly — it can observe request lifecycle events, memory allocation patterns, and exception rates at the interpreter level. This is data that no external monitoring tool can access.

An nginx module provides request telemetry at the web server level. An eBPF base layer (Linux kernel 5.8+) provides system-call level observability for process lifecycle events. The combination of these four observability layers gives the Brain a complete picture of what is happening inside your server stack at any given moment.

The Go agent is a single static binary. It does not require a runtime environment, does not open inbound ports, and consumes less than 20MB of RAM at steady state. The PHP Zend extension is loaded via php.ini and adds approximately 2–4ms overhead per request — negligible in production workloads. All telemetry is transmitted over an encrypted outbound connection to the Brain cluster. No inbound firewall rules are required.

What Reflex can repair in this layer:

→PHP-FPM crashes and worker exhaustion
→Memory-related process failures (OOM)
→Disk full incidents
→Nginx upstream failures
→Database connection exhaustion

What Reflex cannot yet do:

→Windows Server monitoring
→Kubernetes pod orchestration (planned)
→Hardware fault detection

// Agent telemetry sample (emitted every 10s)
{
  "host": "prod-web-01",
  "timestamp": "2026-04-10T03:41:22Z",
  "php_fpm": {
    "active_processes": 8,
    "idle_processes": 4,
    "max_children_reached": false,
    "slow_requests": 0
  },
  "memory": {
    "used_pct": 73.4,
    "php_heap_mb": 412
  },
  "disk": { "root_used_pct": 61.2 },
  "nginx": { "active_connections": 43, "upstream_errors_1m": 0 }
}
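
To show how a payload of this shape might be consumed, here is a small sketch that reads the fields from the sample above and raises simple flags. The field names come from the sample; the function name `flag_signals` and the 90% disk threshold are assumptions chosen for illustration, not documented Reflex behaviour.

```python
from typing import Dict, List


def flag_signals(payload: Dict, disk_limit_pct: float = 90.0) -> List[str]:
    """Return coarse warning flags derived from one telemetry payload."""
    flags = []
    if payload["php_fpm"]["max_children_reached"]:
        flags.append("php_fpm_worker_exhaustion")
    if payload["disk"]["root_used_pct"] >= disk_limit_pct:
        flags.append("disk_nearly_full")
    if payload["nginx"]["upstream_errors_1m"] > 0:
        flags.append("nginx_upstream_errors")
    return flags
```

For the sample payload above, no flag is raised: max_children_reached is false, root disk usage is 61.2%, and there are no upstream errors in the last minute.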

LAYER 4

Reflex Brain

The Brain is the repair engine. It receives a continuous stream of telemetry from Reflex Server, feeds it through pattern detection models, and generates repair hypotheses when anomalies are detected. A hypothesis is a structured object: the identified root cause, the proposed repair action, a confidence score (0–100%), and a risk classification (LOW, MEDIUM, HIGH).

Every repair hypothesis is run through a dry-run simulation before execution. The dry-run checks: is the playbook applicable to this exact server state? Will the repair action have any side effects that are not accounted for? If the dry-run passes, the Brain executes and monitors the recovery curve.
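
The two dry-run questions can be modelled as two functions over a snapshot of server state: one decides whether the playbook applies, the other lists unaccounted side effects. The sketch below assumes that model. The state keys (`php_fpm_active`, `open_db_transactions`) and helper names are hypothetical, and the result mirrors the `dry_run` shape used in the hypothesis object.

```python
from typing import Callable, Dict, List


def dry_run(state: Dict,
            applicable: Callable[[Dict], bool],
            side_effects: Callable[[Dict], List[str]]) -> Dict:
    """Simulate a playbook against a state snapshot without changing anything."""
    if not applicable(state):
        # The playbook does not match this server state, so do not proceed.
        return {"passed": False, "side_effects": []}
    effects = side_effects(state)
    # Pass only when the playbook applies and nothing unexpected would happen.
    return {"passed": not effects, "side_effects": effects}


def php_fpm_is_down(state: Dict) -> bool:
    """Hypothetical applicability test for a PHP-FPM restart playbook."""
    return state["php_fpm_active"] == 0


def restart_side_effects(state: Dict) -> List[str]:
    """Hypothetical side-effect check: a restart would cut open transactions."""
    open_tx = state["open_db_transactions"]
    if open_tx > 0:
        return [f"would interrupt {open_tx} open database transactions"]
    return []
```

A snapshot with PHP-FPM down and no open transactions yields `{"passed": True, "side_effects": []}`. Any reported side effect fails the dry-run.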

The entire process — from anomaly detection to resolution — typically completes in under 90 seconds. Every action is written to the immutable audit log. The Brain's decision-making is fully transparent: you can inspect the hypothesis object, the confidence score at decision time, the dry-run result, and the recovery outcome for every action ever taken on every server in your fleet.

What Reflex can repair in this layer:

→All 10+ built-in playbooks (OOM, 502s, queue death, disk full, and more)
→Custom playbooks (Enterprise tier)

What Reflex cannot yet do:

→Hardware-level failures (disk failure, RAM failure)
→Network infrastructure issues outside the server
→Application-level bugs (code errors, not infrastructure)

// Brain hypothesis object
{
  "hypothesis_id": "hyp_01jbb3k9f4",
  "host": "prod-web-01",
  "detected_at": "2026-04-10T03:41:44Z",
  "root_cause": "PHP_FPM_OOM_KILL",
  "confidence": 94,
  "risk": "LOW",
  "playbook": "restart-php-fpm-with-memory-adjustment",
  "dry_run": { "passed": true, "side_effects": [] },
  "action": "EXECUTE",
  "resolved_at": "2026-04-10T03:42:51Z",
  "resolution_time_seconds": 67
}
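
When inspecting hypothesis objects from the audit log, a few consistency checks are easy to script. This sketch uses only the fields shown in the sample above; the function names are hypothetical. It checks that confidence falls in 0 to 100, that the risk class is one of the three documented values, and that resolution_time_seconds agrees with the two timestamps.

```python
from datetime import datetime
from typing import Dict, List

VALID_RISKS = {"LOW", "MEDIUM", "HIGH"}


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp that uses a trailing "Z" for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def validate_hypothesis(h: Dict) -> List[str]:
    """Return a list of problems found in a hypothesis object (empty if none)."""
    problems = []
    if not 0 <= h["confidence"] <= 100:
        problems.append("confidence outside 0-100")
    if h["risk"] not in VALID_RISKS:
        problems.append("unknown risk class")
    elapsed = parse_utc(h["resolved_at"]) - parse_utc(h["detected_at"])
    if int(elapsed.total_seconds()) != h["resolution_time_seconds"]:
        problems.append("resolution_time_seconds does not match timestamps")
    return problems
```

For the sample object, detected_at 03:41:44 and resolved_at 03:42:51 are 67 seconds apart, which matches the recorded resolution_time_seconds, so the list of problems is empty.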

LAYER 5

Reflex Dashboard

The Dashboard is the control plane for humans. It shows you what the Brain is doing, what it has done, and what the current health state of every server in your fleet looks like. The real-time activity feed gives you a running log of Brain decisions, repairs executed, and health state changes.

The repair history is a complete audit trail — you can see every action the Brain has taken, the confidence score at decision time, the dry-run result, and the recovery outcome. For agencies, the white-label option lets you surface this as your own branded server health product to clients. The RBAC system lets you grant granular access — a client can see their server's health without being able to trigger manual repairs or see other clients' data.

The Dashboard is a read/control layer. It does not perform repair actions — that is the Brain's job. But it gives you the confidence that the Brain is working correctly, and the ability to intervene manually when needed. Every repair can be paused, confirmed, or overridden from the Dashboard in real time.

What Reflex can repair in this layer:

→N/A — the Dashboard is a read/control layer. Repairs are executed by the Brain.

What Reflex cannot yet do:

→Mobile app (planned for 2026 Q3)
→Custom dashboard widgets
→Exported PDF reports (coming soon)

A repair cycle, step by step

Here is exactly what happens when the Brain detects an OOM kill on php-fpm.

Telemetry stream arrives

The Go agent on prod-web-01 transmits a telemetry payload. The PHP-FPM active_processes count has dropped to zero. Memory pressure spiked to 94% in the previous 30-second window.

Anomaly detection fires

The Brain's time-series anomaly detector flags the PHP-FPM process count as an anomaly. The pattern matches the OOM_KILL signature: memory spike followed by process count drop to zero.
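
The signature described here (a memory spike, then the process count at zero) can be expressed as a single ordered scan over a window of payloads. The sketch below is a deliberately simple rule-based stand-in, not the Brain's detection model; the function name and the 90% spike threshold are assumptions.

```python
from typing import Dict, List


def matches_oom_signature(window: List[Dict], spike_pct: float = 90.0) -> bool:
    """True when a memory spike is followed, in a later payload, by zero
    active PHP-FPM processes. `window` is ordered oldest first."""
    spike_seen = False
    for payload in window:
        # The zero-process check runs first so the drop must come after the spike.
        if spike_seen and payload["php_fpm"]["active_processes"] == 0:
            return True
        if payload["memory"]["used_pct"] >= spike_pct:
            spike_seen = True
    return False
```

Ordering matters: a process count of zero without a preceding spike, or a spike that arrives after the drop, does not match.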

Root cause hypothesis generated

The Brain generates a hypothesis: root_cause = PHP_FPM_OOM_KILL, confidence = 94%, risk = LOW. It selects the "restart-php-fpm-with-memory-adjustment" playbook.

Dry-run simulation

Before executing, the Brain runs a dry-run against the current server state. It checks: is PHP-FPM actually down? Is the server in a state where a restart is safe? Are there any active database transactions that would be interrupted? Dry-run passes.

Repair execution

The Brain instructs the agent to restart php-fpm with an adjusted pm.max_children value appropriate for the current memory headroom. The agent executes the command via its local secure executor.
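
How the adjusted value is derived is not specified here. A common rule of thumb for sizing pm.max_children is to divide the memory available to PHP-FPM by the average worker footprint, after holding back a reserve. The sketch below applies that rule; the function name, the 20% reserve, and the example figures are assumptions, not Reflex's formula.

```python
def suggest_max_children(available_mb: float, avg_worker_mb: float,
                         reserve_pct: float = 20.0, minimum: int = 1) -> int:
    """Estimate pm.max_children from memory headroom.

    available_mb   memory PHP-FPM may use, in MB
    avg_worker_mb  average resident size of one worker, in MB
    reserve_pct    share of available memory kept free as a safety margin
    """
    if avg_worker_mb <= 0:
        raise ValueError("avg_worker_mb must be positive")
    usable_mb = available_mb * (1 - reserve_pct / 100)
    # Round down so the estimate never exceeds the usable memory.
    return max(minimum, int(usable_mb // avg_worker_mb))
```

With 2048 MB available and workers averaging 64 MB, the 20% reserve leaves 1638.4 MB, which fits 25 workers.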

Recovery curve monitoring

The Brain watches the next 3 telemetry payloads (30 seconds) to verify the recovery curve: PHP-FPM processes coming back online, memory pressure declining, nginx upstream errors clearing.
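
The recovery check can be sketched as three conditions over the payloads that follow the repair. Field names are taken from the telemetry sample; the function name and the reading of "declining" as non-increasing memory usage are assumptions.

```python
from typing import Dict, List


def recovery_confirmed(payloads: List[Dict], required: int = 3) -> bool:
    """Check the first `required` post-repair payloads (oldest first)."""
    if len(payloads) < required:
        return False  # not enough evidence yet
    window = payloads[:required]
    last = window[-1]
    fpm = last["php_fpm"]
    workers_back = fpm["active_processes"] + fpm["idle_processes"] > 0
    memory = [p["memory"]["used_pct"] for p in window]
    # "Declining" is read here as never rising from one payload to the next.
    memory_declining = all(b <= a for a, b in zip(memory, memory[1:]))
    errors_cleared = last["nginx"]["upstream_errors_1m"] == 0
    return workers_back and memory_declining and errors_cleared
```

At one payload every 10 seconds, three payloads cover the 30-second observation window.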

Outcome recorded

Resolution confirmed. The Brain writes the outcome to the immutable audit log: resolution_time_seconds = 67, outcome = RESOLVED. The hypothesis object is closed.

Notification sent

A summary notification is sent (Slack, email, or both — configurable). "The Brain resolved a PHP-FPM OOM kill on prod-web-01 at 03:42. Resolution time: 67 seconds." You wake up to a resolved incident, not an alert.

How the agent communicates

The Reflex agent authenticates with a one-way token exchange and connects outbound to the Brain cluster over TLS 1.3. No inbound ports are required, so there is nothing to add to your firewall rules.
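
For readers who want to see what a TLS 1.3-only outbound client looks like, here is a sketch using Python's standard `ssl` module. The real agent is a Go binary, so this illustrates the policy rather than the agent's code, and `brain_client_context` is a hypothetical name. It does not implement certificate pinning.

```python
import ssl


def brain_client_context() -> ssl.SSLContext:
    """Build a client-side TLS context that refuses anything older than TLS 1.3."""
    # The default context verifies the server certificate and hostname.
    ctx = ssl.create_default_context()
    # Reject TLS 1.2 and earlier during the handshake.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

A context like this would be passed to an outbound socket or HTTP client. Because the agent only dials out, nothing listens on the monitored server.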

Token exchange

Each agent authenticates with a per-server token generated at installation time. Tokens can be rotated from the Dashboard at any time.

Data in transit

All telemetry is transmitted over TLS 1.3. Older TLS versions are not accepted. Certificate pinning is used for the Brain cluster.

Data at rest

Telemetry stored in the Brain cluster is encrypted at rest with AES-256. Log snippets are rotated on a configurable schedule.

What is collected

System metrics, PHP-FPM worker counts, nginx connection stats, process lifecycle events, and log snippets (configurable).

What is NOT collected

Application user data, database contents, environment variable values, or source code. Reflex has no access to your application data.

Ready to see it in action?

Start free on one server — no credit card. Install the agent in 60 seconds.
