Legacy Won’t Die on Its Own: How AI Agent Factories Are Changing the Business Case for IT Modernization
In almost every large organization, there is the same monster lurking somewhere: an old, bloated system without which “the company won’t get up in the morning,” but which no one understands anymore. Code written decades ago, no documentation, and the last “dinosaur” who could handle it is thinking about retirement. Until recently, modernizing such legacy systems was akin to open-heart surgery: expensive, time-consuming, and risky, so it was put off indefinitely.
Generative and agent-based AI won’t make the legacy system disappear. But for the first time, it’s actually changing the equation: it cuts modernization time by up to half, reduces the costs of technical debt, and shifts this task from the category of “necessary IT evil” to “strategic business lever.” There’s one condition: you have to stop thinking of modernization as rewriting code and start thinking in terms of an agent factory.
These are no longer just vendor promises. McKinsey's empirical research on developer productivity shows that generative AI tools cut the time spent writing new code and documenting existing code to less than half of the original time, and legacy refactoring to about two-thirds (McKinsey, "Unleashing developer productivity with generative AI"). A separate McKinsey initiative, LegacyX, which uses specialized “squadrons” of AI agents to modernize systems in banks, reports a 20–30% acceleration in its first projects, with further gains when moving from single tools to a multi-agent architecture.

Legacy as an anchor, not as the “natural state of affairs”
Let’s start with a sober assessment: legacy is not a “normal stage in a system’s life.” It is an anchor.
In sectors such as finance, insurance, government, and energy, the backbone of operations consists of systems that are well over a decade, sometimes several decades, old. Many of them were written in languages no longer taught in universities. Documentation is patchy or nonexistent, and the business logic has accumulated dozens of “quick fixes.” Every change is painful.
From a business perspective, this means:
slow time-to-market – new products, channels, or regulatory requirements collide with a monolith that moves at the pace of “multi-year projects,”
integration issues – legacy systems don’t play nice with APIs, events, and real-time data, so bridges, synchronizations, and ETL processes are multiplying everywhere,
costs – maintaining a “technology museum” eats up a budget that could go toward development,
risk – the dwindling number of people who actually understand what’s going on inside increases the chance of serious outages and regulatory slip-ups.
This is no longer just an “IT department” problem. It is a problem of the organization’s competitive advantage.
Why traditional approaches to modernization have failed
Over the years, various paths have been tried that are now clearly seen as dead ends.
Big bang rewrite. Huge 5–7-year projects, hundreds of people, the promise: “We’ll rewrite everything to a modern platform and be done with it.” In practice: business requirements change along the way, the target architecture becomes outdated before completion, and migration risks rise as the gap between the old and new systems widens.
Lift & shift to the cloud. Moving a monolith one-to-one to cloud infrastructure. The result: some infrastructure costs decrease or are simplified, but technical debt and architectural complexity remain untouched, and the system continues to limit what can be done with data and processes.
Code and load. Rewriting code from one language to another without changing the logic or architecture, e.g., COBOL → Java. All the historical workarounds, the “three ifs for an exception from 2007,” and the strange code paths are still there; no one asks whether the process is still necessary at all; and the business still doesn’t understand what’s really in there, only now it’s harder to admit, because after all, “it’s a modern stack.”
The common denominator: a focus on technology (lines of code, infrastructure), not on the system’s intent and business value.
What does generative/agent-based AI really bring to the table?
It’s tempting to say: “OK, now we have AI, so it’ll write the code faster.” Except that the greatest value isn’t in just writing fast.
When used well, AI can:
understand the existing system: analyze code, logs, configurations, fragments of documentation, screenshots, and even transcripts of conversations with “dinosaurs,” and piece together a picture of what the system does,
describe this knowledge in simple language, in a form understandable to the business,
help separate business logic from technical logic, name processes, exceptions, and rules,
automate repetitive steps: generating new modules, tests, mappings, documentation, and refactoring legacy data schemas.
And here comes the second thing that classic approaches to modernization didn’t cover at all. These transcripts of conversations with the “dinosaurs”—with people who actually remember why that strange exception was added to the credit rule twenty years ago, and what happened when it was once disabled—are not just raw material for a discovery agent. Together with hundreds of other historical decisions, industry intuitions, and patterns of reasoning that the expert has built up over years of working in this specific system, they form the material for that expert’s cognitive twin: a model that not only knows their decisions but understands the way they made them.
The system twin describes what the legacy system does. The cognitive twin describes why it does things this way and not another—and why the “dinosaur” hasn’t allowed this to change for twenty years. In legacy modernization, you need both. Without the system twin, you modernize blindly. Without the cognitive twin, you modernize precisely, but you lose the knowledge of why the existing solution looked the way it did—and you will likely recreate in the new code the same problems that the “dinosaur’s” workarounds suppressed for a decade.
This shifts the center of gravity: people stop being “transcription hands” and become designers of what the system should look like, while AI and agents handle the technical “grunt work.”
For regulated sectors, however, there is another dimension—the core code of banking, insurance, or administrative systems cannot simply be moved to a foreign cloud for an agent to analyze, so data sovereignty and the requirements of DORA and the EU AI Act become an integral part of the factory’s architecture, not an add-on.
A pitfall to avoid: AI as a turbocharged “code and load”
The simplest (and worst) way to use AI for modernization looks like this:
We take legacy code.
We feed it into a gen-AI tool.
We ask: "Rewrite this into modern language X / framework Y."
Deploy, done.
In the short term, it looks like a success: we have new code, new tests are running, everything is in a trendy technology. But underneath there’s a process structure from the ’90s, historical workarounds, and “temporary solutions” carried over into the new world. Zero new business value; just the old engine in a nicer shell.
This is exactly the same as “lift & shift” to the cloud, only at the code level. From a business perspective—a costly operation that doesn’t address what really hurts.
The AI Agent Factory: A New Modernization Model
The real breakthrough begins when you stop thinking in terms of “modernization projects” and start thinking in terms of an agent factory. In the CDF methodology, which I outlined in the first article of this series, I call this pattern the agent factory: a specific configuration of a team, tools, and processes, not just a rhetorical metaphor. Below, I’ll explain what this means in practice.
From Project to Factory
Instead of assembling an ad hoc team, selecting tools, and devising an approach each time, you build a permanent capability:
a team that specializes in modernizing systems using AI agents,
a set of specialized agents that handle the successive steps of the modernization “production line,”
standardized processes, patterns, and metrics.
As a result, each subsequent system is analyzed faster, leverages the experience of previous projects, and generates new "building blocks" that can be reused.
What agents work in such a factory?
An example set:
Discovery/reverse-engineering agent – analyzes code, logs, and configurations; identifies dependencies; builds a map of modules and workflows. The result of its work is not an “analysis report,” but a living, dynamic representation of the existing system—what I will refer to in this text as the system twin.
Process description agent – based on the system analysis, creates a clear, step-by-step description of “what is happening” in business-friendly language.
Data mapping agent – catalogs fields, relationships, and data flows; identifies redundancies, inconsistencies, and “gaps” between systems.
Target state design agent – helps translate current functionality into the target architecture (e.g., modular, event-driven), and proposes decomposition.
Code and test generation agents – create new components, unit tests, integration tests, and contract tests in the target stack.
QA and security agents – scan for regressions, vulnerabilities, and compliance with internal standards.
Documentation agent – updates technical and architectural documentation, diagrams, and changelogs.
An orchestrator agent oversees all of this, managing the order of tasks, ensuring that the results of one agent serve as input for the next, and escalating issues to humans when something goes beyond established limits.
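The orchestration logic above can be sketched in a few lines. This is a minimal illustration, not a real framework: the `Agent` class, the lambda-based toy agents, and the escalation-by-exception pattern are all assumptions chosen for brevity.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    run: Callable[[dict], dict]           # takes the shared context, returns this agent's output
    needs_review: Callable[[dict], bool]  # guardrail: does this output require a human?

def run_pipeline(agents: list[Agent], context: dict) -> dict:
    """Run agents in order; each agent's output becomes input for the next."""
    for agent in agents:
        output = agent.run(context)
        if agent.needs_review(output):
            # Escalate to a human before the pipeline continues.
            raise RuntimeError(f"{agent.name}: human review required")
        context[agent.name] = output
    return context

# Toy stand-ins for the discovery and documentation agents.
discovery = Agent(
    name="discovery",
    run=lambda ctx: {"modules": ["billing", "ledger"]},
    needs_review=lambda out: len(out["modules"]) == 0,  # empty map => escalate
)
docs = Agent(
    name="docs",
    run=lambda ctx: {"pages": len(ctx["discovery"]["modules"])},
    needs_review=lambda out: False,
)

result = run_pipeline([discovery, docs], {"repo": "legacy-core"})
```

The point of the sketch is the data flow: the orchestrator owns the shared context, each agent reads its predecessors' output from it, and the guardrail check is what turns a fully automatic pipeline into a human-in-the-loop one.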
And now, the thing that changes the entire framework of thinking about legacy modernization. The output of the discovery agent—the system twin—is not post-hoc documentation. It is the subject of the factory’s further work. All subsequent steps (business decisions about what to keep, designing the target state, generating new code, testing) take place on the twin, not on the live system.
This is a fundamental difference from classic modernization approaches: understanding the system is separated from its modification. You can make radical decisions—simplifications, restructurings, eliminating entire paths—without the risk of breaking something in production that business owners won’t let you touch. Modernization ceases to be “open-heart surgery.” It becomes designing on a precise model, with verification on the live system only after the decisions have been made.
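To make "working on the twin" concrete, here is a deliberately tiny sketch of a system twin as a queryable data model. The class names, the `still_needed` flag, and the credit-rule example are illustrative assumptions; a real twin would carry far richer structure (flows, dependencies, data schemas).

```python
from dataclasses import dataclass, field

@dataclass
class BusinessRule:
    rule_id: str
    description: str
    still_needed: bool = True  # business decision recorded on the twin

@dataclass
class SystemTwin:
    # module name -> the business rules the discovery agent found in it
    modules: dict[str, list[BusinessRule]] = field(default_factory=dict)

    def retire_rule(self, module: str, rule_id: str) -> None:
        """Mark a rule for elimination -- on the model, not in production."""
        for rule in self.modules.get(module, []):
            if rule.rule_id == rule_id:
                rule.still_needed = False

    def surviving_rules(self, module: str) -> list[str]:
        """What the target system should still implement after the decisions."""
        return [r.rule_id for r in self.modules.get(module, []) if r.still_needed]

twin = SystemTwin(modules={"credit": [
    BusinessRule("R-1", "standard scoring"),
    BusinessRule("R-2", "exception added in 2007, origin unclear"),
]})
twin.retire_rule("credit", "R-2")  # a radical decision, at zero production risk
```

The design choice worth noticing: decisions are recorded as state on the twin, so the code-generation agents downstream consume the decided model, never the raw legacy system.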
The role of people: from “coding” to architects of change
The agent factory does not replace people—it redefines their work.
Business / Product Owner. They don’t need to read COBOL. They receive from the agents a “human-readable” description of existing processes, a map of functions and exceptions, and proposals for simplifications. Based on this, they decide which functions are still needed, what can be combined or discarded, and what the user experience and KPIs should look like in the new system.
Architect / lead engineer. Designs the target architecture (module boundaries, integration standards, security patterns), verifies the agents’ proposals, ensures consistency and avoids anti-patterns, sets guardrails: what the agent can change automatically, and what always requires review.
Dev/QA team. Reviews the generated code and tests, adds logic that the agent cannot generate correctly, designs edge cases, "strange cases," and tests from which the agent will learn.
Together, they function as a team of designers and quality controllers for the production line, rather than as a “manual assembly line.”
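The guardrails the architect sets can often be expressed as an explicit policy rather than tribal knowledge. A minimal sketch, assuming a hypothetical classification of change types and a size threshold; the categories and the 200-line limit are invented for illustration.

```python
# Change types the agents may merge automatically (illustrative assumption).
AUTO_ALLOWED = {"docs", "tests", "formatting"}
# Change types that always go to a human, regardless of size.
ALWAYS_REVIEW = {"schema_migration", "auth", "payment_logic"}

def review_required(change_type: str, lines_changed: int) -> bool:
    """Decide whether an agent-proposed change needs human review."""
    if change_type in ALWAYS_REVIEW:
        return True
    if change_type in AUTO_ALLOWED and lines_changed <= 200:
        return False
    return True  # default to caution for anything unclassified
```

Defaulting unclassified changes to review is the important part: the policy fails safe, and the team widens `AUTO_ALLOWED` only as the agents earn trust.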
What the modernization "production line" with agents looks like
This can be viewed as several repetitive stages:
Discovery and understanding. Agents analyze code, logs, configurations, and documents; they generate a map of the system, workflows, dependencies, and a verbal description of processes.
Business decisions and target state design. IT + business workshops based on material from the agents; decisions: what to keep, what to simplify, what to remove; draft of the target architecture (modules, services, contracts).
Generation and migration. Agents generate new code, data mappings, and tests; the orchestrator compiles these into coherent PRs compatible with CI/CD pipelines, automated tests, security scans, and static code analysis.
Review and correction. The team reviews, corrects, and trains the agents (feedback loop); based on this, new rules and patterns are established in the factory.
Implementation and learning. Phased implementation (canary, A/B, batch migrations), collection of metrics: performance, stability, business impact; insights are fed back into the factory—strengthening future projects.
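The phased-implementation stage above can be reduced to one deterministic rule: widen the canary only while the new module performs no worse than the legacy baseline. A toy sketch; the doubling/halving policy and the metric are assumptions, not a prescribed rollout algorithm.

```python
def next_canary_share(share: float, new_error_rate: float, baseline: float) -> float:
    """Adjust the fraction of traffic routed to the new module.

    Widen the canary while the new code is at or below the legacy
    system's baseline error rate; shrink it on any regression.
    """
    if new_error_rate <= baseline:
        return min(1.0, share * 2)  # good results: double the canary, cap at 100%
    return max(0.0, share / 2)      # regression: halve it and investigate
```

The same measurements that drive this gate (error rates, latency, business KPIs) are what flow back into the factory as training signal for the next project.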
The factory as a permanent capability, not a one-off "Modernization Project X"
A key mindset shift: the agent factory isn’t built “for the sake of this one program.” It’s a capability that starts with one or two systems but ultimately covers the entire legacy portfolio and becomes a tool for continuously “unclogging” IT.
With each subsequent project, the quality of the agents improves (domain knowledge, context, patterns), the time and cost of entering a new system decrease, and the organization becomes less afraid of modernization—because it has the process, people, and tools that have done this many times before.
This is a huge difference compared to the classic model: a series of “modernization programs” that fizzle out once completed, leaving behind a bit of new code and even more fatigue.
What you need to do before launching your own agent factory
If you’re a CIO/CTO or responsible for transformation, a sensible first step is to ask a few simple questions:
What modernization projects do we have today? Are they still based on the “large team + long timeline + no AI plan” model? Do we intend to repeat this pattern in the coming years?
Which systems are good candidates for a “factory pilot”? Important, but not absolutely critical; representative (in terms of logic, technology, and complexity) of other parts of the landscape.
Who do we need on the factory team? Architects who understand both the legacy and target stacks; AI/agent specialists who can build and orchestrate pipelines; business-side product owners who aren’t afraid to dive into processes.
How will we measure success? Not just “how much code we rewrote,” but also: how much maintenance costs have dropped, how much time-to-change has been reduced, how stability has improved, and what this has done to the P&L and user experience.
What modernization projects should management question today
Looking at the big picture, it’s worth having a simple filter:
If someone comes with a plan saying, “We need a few years and hundreds of people; AI probably won’t help here”—raise a red flag.
If the modernization plan boils down to “we’ll port the code 1:1 to a new language”—ask: why?
If, in a discussion about modernization, no one can say how the work model, processes, and business value will change—go back to the drawing board.
Because the worst-case scenario for the coming years looks like this: we’ll spend a fortune rewriting legacy systems “just like before, only faster,” while the competition builds an agent factory that will systematically modernize its IT, accelerating year after year.
So the real question isn’t “can we afford an AI agent factory for modernization?” It is: can we afford to keep modernizing like we did in 2010, when the world around us is learning to do it 40–50% faster and much smarter?
A micro-pattern from practice
The quickest test of whether legacy modernization makes sense isn’t the question “Can AI rewrite this code?” It’s the question: Can a discovery agent, run on this system, generate a comprehensible description of the business processes the system performs—without help from the people who remember them? If so—you have material the agent factory can work on. If not—the problem isn’t the code. The problem is the lack of captured domain knowledge within the system itself. And before you start thinking about rewriting anything, you must first capture what currently exists only in the minds of a few people who are getting closer to retirement.
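The quick test above can even be given a crude metric: what share of the system's entry points did the discovery agent manage to describe without human input? A hedged sketch; the 70% readiness threshold is an illustrative assumption, not an industry standard.

```python
def discovery_coverage(entry_points: list[str], described: set[str]) -> float:
    """Fraction of entry points the discovery agent could describe unaided."""
    if not entry_points:
        return 0.0
    return sum(1 for e in entry_points if e in described) / len(entry_points)

def factory_ready(entry_points: list[str], described: set[str],
                  threshold: float = 0.7) -> bool:
    """Crude go/no-go signal for starting factory work on this system."""
    return discovery_coverage(entry_points, described) >= threshold
```

A low score doesn't mean the factory is useless; it means the first deliverable is knowledge capture from the remaining experts, not code generation.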
This series breaks down AI transformation in regulated sectors into seven layers.
These posts appear weekly on the allclouds.pl product blogs: genesis-ai.app/blog and savant-ai.app/blog. The entire series is a record of what I’ve learned from working in regulated sectors: decisions that had to be made faster than caution allowed, mistakes that taught me more than successes, intuition honed in unscripted conversations, and the will to build something that doesn’t yet exist.




