Systems Over Symptoms
There's a recurring meeting on many engineering teams. Someone presents an incident from the previous week. The team discusses what went wrong and identifies action items. Everyone feels good about preventing it from happening again.
Two weeks later, a similar incident occurs. Different symptoms, same underlying problem.
This is the symptom trap, and it's everywhere in engineering organizations.
Symptoms vs Systems
Symptoms are what you see: the outage, the bug, the missed deadline, the frustrated engineer.
Systems are what create the symptoms: the processes, incentives, constraints, and feedback loops that make those outcomes likely.
When you fix a symptom, you solve the immediate problem. When you fix a system, you prevent a class of problems.
The Urge to Treat Symptoms
Treating symptoms feels good. It's concrete. You can point to it and say "I fixed that."
Treating systems is harder. It's abstract. The payoff is delayed. You might not even see the problems you prevented because... they never happened.
But treating only symptoms creates a treadmill. You're constantly fighting fires instead of making the system fire-resistant.
Examples of System Fixes
Symptom: An engineer deploys a breaking change to production.
Symptom fix: Revert the change, chastise the engineer, add a checklist item.
System fix: Implement CI/CD gates that catch breaking changes automatically. Build a culture that keeps staging representative of production. Make rollbacks a one-click operation.
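To make the idea of an automated gate concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the BuildReport fields, the three checks, and the function names are hypothetical, and a real pipeline would pull these signals from its own CI tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BuildReport:
    """Hypothetical summary of one CI run; the field names are illustrative."""
    tests_passed: bool
    breaking_api_changes: int  # count from an assumed compatibility check
    staging_smoke_ok: bool     # result of an assumed smoke test on staging


def gate_failures(report: BuildReport) -> List[str]:
    """Return the reasons a deploy should be blocked; an empty list means proceed."""
    failures = []
    if not report.tests_passed:
        failures.append("test suite failed")
    if report.breaking_api_changes > 0:
        failures.append(
            f"{report.breaking_api_changes} breaking API change(s) detected"
        )
    if not report.staging_smoke_ok:
        failures.append("staging smoke test failed")
    return failures


def can_deploy(report: BuildReport) -> bool:
    """The gate opens only when no check has failed."""
    return not gate_failures(report)
```

The point of the design is that the gate reports reasons, not just a yes or no, so the engineer learns what the system caught instead of being told to be more careful.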
Symptom: A team misses their sprint commitment three times in a row.
Symptom fix: Ask them to work harder, extend the sprint, or reduce scope.
System fix: Examine why estimates are off. Is there enough discovery before commitment? Are dependencies clear? Is scope creep happening? Fix the planning process.
Symptom: Engineers are burning out and leaving.
Symptom fix: Offer them raises, send them to conferences, give them better titles.
System fix: Examine workload distribution. Are on-call rotations fair? Is there psychological safety to push back on unrealistic deadlines? Are managers supporting their teams or just driving them?
How to Think Systemically
Ask "Why" Five Times
When something goes wrong, don't stop at the immediate cause. Keep asking why until you get to the underlying system.
- Why did the bug make it to production? → No test caught it.
- Why didn't the test catch it? → The test environment is different from prod.
- Why is the test environment different? → It's expensive to keep them in sync.
- Why is it expensive? → We have too many microservices with different configs.
- Why do we have so many microservices? → We didn't set architecture guardrails.
Now you know the system problem: missing architectural standards, and the lack of environment parity that follows from them.
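If you want to keep these chains around after the meeting, one option is to record them as plain data. The sketch below stores the chain above as question and answer pairs; the names five_whys, root_cause, and format_chain are made up for this example, not part of any tool.

```python
from typing import List, Tuple

# Each entry pairs a "why" question with its answer, in the order asked.
five_whys: List[Tuple[str, str]] = [
    ("Why did the bug make it to production?", "No test caught it."),
    ("Why didn't the test catch it?", "The test environment is different from prod."),
    ("Why is the test environment different?", "It's expensive to keep them in sync."),
    ("Why is it expensive?", "We have too many microservices with different configs."),
    ("Why do we have so many microservices?", "We didn't set architecture guardrails."),
]


def root_cause(chain: List[Tuple[str, str]]) -> str:
    """Treat the last answer in the chain as the candidate system-level cause."""
    if not chain:
        raise ValueError("the chain needs at least one question")
    return chain[-1][1]


def format_chain(chain: List[Tuple[str, str]]) -> str:
    """Render the chain with one extra indent level per 'why'."""
    lines = []
    for depth, (question, answer) in enumerate(chain):
        lines.append("  " * depth + f"{question} -> {answer}")
    return "\n".join(lines)
```

Calling root_cause(five_whys) returns the final answer in the chain, which is where the system fix should aim.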
Look for Patterns
One-off incidents are symptoms. Recurring incidents are system problems.
If you're having the same type of incident every quarter, the fix isn't another post-mortem. It's changing the system that allows those incidents to happen.
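Spotting the pattern can be as simple as counting. Here is a rough sketch that assumes each incident has already been tagged with a cause category; the incident IDs, the categories, and the recurring_causes helper are invented for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def recurring_causes(
    incidents: Iterable[Tuple[str, str]], min_count: int = 2
) -> Dict[str, int]:
    """Return cause categories that appear at least min_count times.

    Each incident is an (incident_id, cause_category) pair.
    """
    counts = Counter(cause for _, cause in incidents)
    return {cause: n for cause, n in counts.items() if n >= min_count}


# Made-up incident log, purely for illustration.
incident_log = [
    ("INC-1", "config drift"),
    ("INC-2", "missing test coverage"),
    ("INC-3", "config drift"),
    ("INC-4", "expired certificate"),
    ("INC-5", "config drift"),
]

print(recurring_causes(incident_log))  # {'config drift': 3}
```

The hard part is the tagging, not the counting. If every post-mortem invents its own label for the cause, the recurrence stays invisible.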
Study Incentives
People behave according to incentives. If teams are incentivized to ship features over quality, you'll get bugs. If they're incentivized to minimize costs over reliability, you'll get outages.
Fix the incentives, fix the behavior.
Map Feedback Loops
Good systems have feedback loops. Bad systems break them.
- Do engineers know when they've introduced technical debt?
- Do teams see the business impact of their work?
- Does leadership understand the engineering constraints?
If feedback loops are broken or slow, the system will drift into undesirable states.
Making the Shift
Moving from symptom-focused to system-focused requires:
Patience: Systems change slowly. The payoff isn't immediate.
Investment: Fixing systems often requires upfront cost for long-term gain.
Measurement: You need to track leading indicators that warn you before something breaks, not just lagging ones that count the damage afterward.
Culture: Everyone needs to understand that preventing fires is more valuable than fighting them.
When to Treat Symptoms
I'm not saying never treat symptoms. Sometimes you need to:
- Stop the bleeding when there's an active incident
- Buy time while you work on the system fix
- Show progress to maintain trust
But always ask: "What's the system change that would prevent this class of problem?"
Then do that too.
Conclusion
Great engineering leaders think in systems. They understand that their job isn't to solve every problem that arises. It's to build systems that prevent problems from arising in the first place.
It's harder. It's less visible. But it's the only way to scale.
Systems over symptoms. Always.