THE PROBLEM
Why we built Reflex
It was 2:47am when my phone lit up. Not a monitoring alert — one of our own servers was down. The PHP-FPM process had OOM-crashed, nginx was returning 502s across the board, and the monitoring tool I was paying £40 a month for had dutifully sent me an email to tell me about it. Thanks.
I spent 45 minutes SSHing into the server, checking logs, restarting processes, waiting for workers to come back up. The fix took 30 seconds once I knew what it was. The diagnosis took 44 minutes and 30 seconds.
That's the problem Reflex solves.
Why self-healing, not just monitoring
Monitoring is a solved problem. There are dozens of tools that will tell you your server is down, some of them for free. What nobody had built — properly — was a system that would fix it. Not just alert and wait. Fix.
The Reflex approach is to treat every server incident as a pattern that has been seen before. Not all of them: truly novel incidents require human intervention. But most incidents? PHP-FPM ran out of memory, nginx lost its upstream, a disk filled up with log files, a queue worker died and nobody restarted it. These are solvable problems with known solutions. The Brain runs those solutions, automatically, while you sleep.
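To make the "known pattern, known solution" idea concrete, here is a minimal sketch in Go. It is an illustration, not how the Brain is actually implemented: the Playbook type, the Match function, and the specific signatures and repair descriptions are all assumptions made up for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Playbook pairs a log signature with a known repair.
// Hypothetical type: names, signatures, and repairs are illustrative only.
type Playbook struct {
	Name      string
	Signature string // substring to look for in a log line
	Repair    string // human-readable description of the repair
}

// A tiny, hand-written catalogue covering three of the incidents named above.
var playbooks = []Playbook{
	{"php-fpm-oom", "Out of memory: Killed process", "restart php-fpm"},
	{"nginx-upstream", "while connecting to upstream", "restart the upstream and reload nginx"},
	{"disk-full", "No space left on device", "rotate and compress old logs"},
}

// Match returns the first playbook whose signature appears in the log line.
// The boolean is false for novel incidents, which should go to a human.
func Match(logLine string) (Playbook, bool) {
	for _, p := range playbooks {
		if strings.Contains(logLine, p.Signature) {
			return p, true
		}
	}
	return Playbook{}, false
}

func main() {
	lines := []string{
		"kernel: Out of memory: Killed process 4121 (php-fpm)",
		"write /var/log/app.log: No space left on device",
		"segfault at 0 ip 00007f error 4",
	}
	for _, l := range lines {
		if p, ok := Match(l); ok {
			fmt.Printf("%s -> %s\n", p.Name, p.Repair)
		} else {
			fmt.Println("novel incident -> escalate to a human")
		}
	}
}
```

The first two lines match a playbook; the third matches nothing and falls through to a human, which is the split described above. A real matcher would presumably look at more than a single substring, but the shape is the same: recognise the pattern, look up the repair, and hand off anything unfamiliar.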
Technical choices
- We chose Go for the agent because it compiles to a single binary, has minimal memory overhead, and doesn't require a runtime environment on the server.
- We built a PHP Zend extension because PHP runtime telemetry is only available from inside the PHP process. No external tool can see what we see.
- We chose an eBPF base layer because system-call-level observability is the most reliable way to detect process lifecycle events: more reliable than polling, and safer to run than custom kernel modules.
Dogfooding and public telemetry
We run Reflex on our own stacks first. Aggregated incident and repair summaries are published monthly so you can see what the Brain actually did in production — without turning internal experiments into pretend customer logos.
How we think about infrastructure
Repair > Alert
If a machine can fix it, a machine should fix it. Humans should be notified of resolutions, not problems.
Specificity over vagueness
Every repair action is logged with a confidence score, a dry-run result, and an outcome. No black boxes.
Don't break what works
Every repair runs in dry-run mode first. If validation fails, no action is taken.
Honest about limitations
The Brain can't fix everything. Hardware failures, application bugs, and novel incidents require human judgment. We're honest about this.
Privacy by design
We collect the minimum data needed to power the Brain. Server metrics and log snippets are not stored longer than necessary.
Mission
Make production Linux boring again: detect failure early, propose or run safe repairs with an audit trail, and tie every incident to deploy context so operators spend minutes verifying — not hours reconstructing timelines across three dashboards.
A small team building in public
Reflex is not a 50-person company pretending to be a startup. It is a UK-based engineering team that got tired of being paged at 2am and decided to fix the problem properly — first for our own fleets, then for everyone who runs Laravel (and friends) on real Linux servers. We build in public, ship weekly-ish, and read every serious support thread.
We are deliberately light on vanity headshots until more of the core team opts into public bios. In the meantime, enterprise buyers can evaluate us on evidence: start with the Trust centre, the changelog, and the incident reports.
Hiring
We are not running a high-volume recruiting pipeline yet. If you live in infrastructure, PHP internals, or Go systems programming and want to help build honest self-healing tooling, send a short note to careers@getreflex.dev with subject line "Careers — Reflex" and a link to something you shipped.