Anya Petrova
Field notes on reliability, chaos, and the systems we keep alive.
A note from the editor
I have spent eleven years carrying a pager. This publication is what I learned in the quiet hours after the incident bridge cleared — about why systems fail, why our diagrams lie to us, and what it actually costs to be the person on call.
New essays land on Tuesdays. Long pieces, no listicles, no checklists you could have found anywhere else. If a post here doesn't change how you think about your on-call rotation, I would rather you unsubscribe and tell me why.
Free. Weekly. Unsubscribe with one click.
Essay · Reliability
The Quiet Hour: What Nobody Tells You About the Pager
On-call is not a technical problem. It is a relationship — with your team, your sleep, your partner, and a small black device that has decided, against all evidence, that 3:47 a.m. is the right time to be honest with you.
Read the essay
Recently Published
Full archive →
Reliability · May 26, 2026 · 14 min
The Quiet Hour: What Nobody Tells You About the Pager
A meditation on being woken at 3:47 a.m. by a service that, at the time of paging, was already fine.
Process · May 19, 2026 · 11 min
Against the Runbook: Why I Stopped Trusting My Own Documentation
After an Auth0 outage taught me that step-by-step instructions were lying to me, I had to rewrite how our team teaches itself.
Architecture · May 12, 2026 · 17 min
A Eulogy for the Monolith I Spent Two Years Killing
The migration finished in March. The thing I miss is not the code. It is something quieter, and I want to name it before I forget.
Chaos · May 5, 2026 · 12 min
Chaos Engineering Without the Cosplay
We do not need a Netflix-branded Simian Army. We need someone willing to kill one node on a Wednesday afternoon and watch what happens.
Culture · April 28, 2026 · 9 min
On Blame, and Why "Blameless" Post-Mortems Are Often Neither
We adopted the template from the Google book in 2019. It took us four years to notice we were still firing people, just more politely.