A Eulogy for the Monolith I Spent Two Years Killing

We finished the migration in March. There was no party. There was a Slack message from our VP that said "Congratulations to the platform team for completing the strangler-fig migration of the order service," and there was a quiet little PR, number 4,847 in a series, that finally deleted the last route from the old codebase. I merged it on a Wednesday afternoon, around 3 p.m., and then I went to get a coffee and noticed I felt, of all things, sad.

This is the essay I have been trying to write since that afternoon. I keep starting and stopping. The reason I keep starting is that I think there is something honest in here that the migration-success-story genre does not have room for. The reason I keep stopping is that the honest thing reflects badly on a lot of decisions I championed for two years. I have decided, today, to just write it down and see where it lands.

What we killed

It was called orderbook. It was a Django monolith, started in 2014 by an engineer named Reza who left the company in 2018. By the time I joined in 2023, it had grown to about 340,000 lines of Python, served forty-one HTTP endpoints, and contained the core of how our fintech actually moved money. It had 11,200 tests, of which roughly 9,400 worked on any given day. It deployed via a fifty-minute Jenkins pipeline that everyone had a personal trauma about.

Senior leadership had been trying to migrate off it since 2021. There had been two previous attempts. Both had been quietly de-scoped after about nine months. When I was hired, in the technical interview, my future director leaned forward and said, "We are going to break orderbook into microservices, and you are going to help us do it." I said yes, because that is what you say in interviews, and also because I genuinely believed it was the right call.

I still believe it was the right call. That is what makes this essay difficult to write.

What we got, on paper

The numbers are real and I am not going to pretend they aren't. By March 2026:

  • Deploy frequency went from twice a week to forty-three times a day across the seventeen new services. We can ship a fix to the order-validation flow in about six minutes, end to end.
  • Mean time to restore for order-pathway incidents went from forty-eight minutes (Q1 2024) to nine minutes (Q1 2026). Most of this is because the blast radius of a bad deploy is now a single small service, not the whole monolith.
  • Onboarding for a new platform engineer went from "you are productive in six weeks" to "you are productive in eight days." This was, honestly, the metric our director cared about most.
  • We shrank our largest pull request from a 4,200-line refactor to a 120-line interface change. Code reviews now happen, instead of being a thing we lie about doing.

These are good numbers. I would defend every one of them in a planning meeting. If you are at the start of a migration like this, do not let the rest of this essay talk you out of it.

But.

What we lost, that we do not talk about

The monolith had one thing that none of our seventeen services has, and I did not realise this until I was three months into using the new world: the monolith was understandable in one sitting.

If you had a question — why does this order get a 1.5x markup on Sundays? — you could, in 2024, open VS Code, hit Cmd+Shift+F, type sunday_markup, and find the answer in about ninety seconds. The answer was in three files. The three files were in one repository. The repository ran on one machine. You could put a print statement in. You could see the print statement come out. The whole thing fit in a brain.

In 2026, the same question requires me to open seven repositories, grep across two service-to-service contract definitions, check the configuration of the pricing-rules service, possibly run a query against the feature-flag system, and — if I am unlucky — reconstruct what a now-deprecated message-bus topic was doing in February. The answer exists. It is in the system. It is no longer in a head.
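For the part of that hunt that is still plain text, a small script can stand in for Cmd+Shift+F across several checkouts. This is a minimal sketch, not anything we actually ran: the function name `find_identifier`, the default file suffixes, and the idea that every repository is cloned locally are all assumptions made for illustration.

```python
import sys
from pathlib import Path


def find_identifier(roots, needle, suffixes=(".py", ".yaml", ".yml", ".json")):
    """Return (path, line number, stripped line) for each line containing needle.

    roots is a list of local repository checkouts (assumed to exist on disk).
    Only files whose suffix is in suffixes are searched.
    """
    hits = []
    for root in roots:
        root = Path(root)
        if not root.is_dir():
            continue  # a missing checkout is skipped, not treated as an error
        for path in sorted(root.rglob("*")):
            if not path.is_file() or path.suffix not in suffixes:
                continue
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if needle in line:
                    hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # Usage: python find_identifier.py sunday_markup repo_a repo_b ...
    if len(sys.argv) >= 3:
        for path, lineno, line in find_identifier(sys.argv[2:], sys.argv[1]):
            print(f"{path}:{lineno}: {line}")
```

Even with this, the search only covers what lives in files. Feature-flag state and the history of a deprecated message-bus topic are not in any checkout, which is the real difference from the ninety-second answer.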

We replaced one kind of complexity with another. The first kind — the monolith is too big and too tangled — was the kind we knew how to complain about, because the SRE literature has been complaining about it for fifteen years. The second kind — the system as a whole no longer fits in any single engineer's mental model — is the kind we are only now learning to name, because none of our books have caught up yet.

The phantom-limb feeling

There is a particular kind of debugging that I used to do in the monolith that is no longer possible. I would get a vague report — something feels slow on Tuesdays after 4 p.m. — and I would just read code for an hour. I would open a file, follow a function, jump to its callers, follow a related file, and after about forty-five minutes I would have a hypothesis. The hypothesis was usually right. The skill was the slow accumulation of a mental model by reading.

I cannot do this anymore. The system is too big. The boundaries are too sharp. When I want to chase a vague performance feeling now, I have to start with traces, then logs, then metrics, then maybe code. The traces will tell me which service is slow. The logs will tell me which endpoint. The metrics will tell me when. By the time I get to the code, I am no longer reading to understand. I am reading to confirm. The skill that used to be reading is now mostly observability tooling.

I have a colleague, Dmitri, who has been writing infrastructure for twenty-six years. He calls this the loss of code-craft. He thinks we have, as a profession, traded the slow contemplative craft of reading code for the faster, shallower craft of querying telemetry. I am not sure he is entirely right. But I am not sure he is wrong, either, and I notice that when I am the most genuinely confused about a system, I still want to open VS Code, not Datadog. The monolith let me do that. The new world doesn't.

What I would tell myself in 2023

If I could send a message back to me in October 2023, sitting in the kickoff meeting for the migration, I would not tell myself to call it off. The numbers are real. The on-call burden is genuinely lighter. We ship faster. We onboard faster. The thing we built is, by every measurable criterion, better.

But I would tell myself two things.

The first: plan for the loss of model. Write down, while the monolith is still legible, what you understand about how it works. Not the code — the behaviour. The fact that orders get a 1.5x markup on Sundays in Quebec because of a five-year-old promo that nobody dares remove. The fact that the email queue is processed slightly slower than the SMS queue because of a long-forgotten compliance rule. The fact that the 4 p.m. Tuesday slowdown is a cron job. The new world will preserve none of this for you. The new world will give you a clean architecture and ask you to remember why the old, ugly one worked.
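One way to write down behaviour rather than code is a characterization test: a test whose job is to record what the system does and why, so the reason survives the rewrite. The sketch below is illustrative only. `legacy_markup` is a hypothetical stand-in, not orderbook's real pricing function, and the province codes and the 1.0 baseline outside Quebec are assumptions.

```python
from datetime import date
from decimal import Decimal


def legacy_markup(order_date: date, province: str) -> Decimal:
    """Hypothetical stand-in for the old pricing rule (illustration only)."""
    # date.weekday() numbers Monday as 0, so Sunday is 6.
    if province == "QC" and order_date.weekday() == 6:
        return Decimal("1.5")
    return Decimal("1.0")  # assumed baseline: no markup anywhere else


def test_sunday_orders_in_quebec_get_markup():
    """Why: a five-year-old promo that nobody dares remove."""
    sunday = date(2024, 3, 3)
    assert legacy_markup(sunday, "QC") == Decimal("1.5")


def test_markup_is_limited_to_quebec_sundays():
    """Assumed: other days and other provinces are unaffected."""
    sunday = date(2024, 3, 3)
    monday = date(2024, 3, 4)
    assert legacy_markup(monday, "QC") == Decimal("1.0")
    assert legacy_markup(sunday, "ON") == Decimal("1.0")
```

The docstrings matter more than the assertions here. The assertion pins the number; the docstring carries the reason, which is the part a clean new architecture will not remember for you.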

The second: mourn the monolith before you delete it. Have a wake. Take a screenshot of orderbook on its last day in production. Print out the original 2014 commit message, which in our case was "Initial commit. Lord help us." Frame it. Hang it. Buy Reza a beer if you can find him. The migration is real engineering, but it is also a small grief, and SRE culture is not very good at making room for either grief or beer. I think it should be better at both.

We did not have a wake. I did not save the screenshot. The commit message is in our archive somewhere but I have not been able to find it. This essay is what I have instead.

Goodbye, orderbook. You were stubborn, you were ugly, you were full of things nobody could explain. You served us for eleven years. You held the money. We replaced you because we had to, and the thing we replaced you with is, in most ways, better, and I miss you anyway.

Next Tuesday: what I learned at SREcon last month from an engineer at the New York Times who has been doing the opposite of what we did, and why she might be right. Until then, go read some code, if you can still find a file long enough to bother with.

Anya Petrova

About the author

Site Reliability Engineer in Vancouver. Writes about chaos, on-call, and the slow craft of keeping production alive. New essays every Tuesday.
