I've been having the same conversation for the better part of a year. I get on a call with someone running a central tools team or a developer productivity org, and they say some version of: "We can generate PRs all day. That's not the problem. The problem is nobody's reviewing them."
Last week, I got four people who've been living this problem in the same room:
- Aubrey Chipman, Senior Engineer, Netflix
- Moses Nakamura, Senior Staff Engineer, Airbnb
- Prince Valluri, Principal Staff Software Engineer, LinkedIn
- Jonathan Schneider, CEO & Co-founder, Moderne, and creator of OpenRewrite
Here are the things that actually stuck with me.
Why automated PRs get rejected: the trust problem
It doesn't matter whether the PR was produced by a human or a bot: the receiving team perceives it as risk. The exact same change gets a much higher merge rate when the team triggers it themselves than when it arrives in their queue from outside.
At Airbnb, code review runs on social pressure and relationship. A real person asking you to look at something gets a very different response than a cron job dropping PRs into your inbox. When automation removes that human ask, Moses said, getting anyone to review the PR becomes genuinely hard.
Prince Valluri called it "migration fatigue." Back-to-back migrations, dependency bumps, and framework upgrades mean that eventually every PR, even a reasonable one, feels like more work arriving from upstream. The problem isn't any single change. It's the weight of all of them.
Building developer trust before the pull request opens
Stop trying to earn trust at review time. Earn it upstream.
Jonathan Schneider made the case for testing proposed changes across a whole business unit, not one repository at a time. Any single codebase has imperfect coverage, but across dozens of repos, their imperfect tests collectively cover each other's edges.
He called it "herd safety." One repo catches a failure that 50 others would have missed.
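A hedged sketch of the "herd safety" idea: apply one candidate change across many checkouts, run each repo's suite, and let any failure block the change for everyone. The `make test` command and repo layout are assumptions for illustration, not OpenRewrite's actual mechanics.

```python
import subprocess
from pathlib import Path

def run_suite(repo: Path) -> bool:
    """Run one repo's tests; `make test` is a placeholder command."""
    return subprocess.run(["make", "test"], cwd=repo).returncode == 0

def failing_repos(results: dict[str, bool]) -> list[str]:
    """Repos whose suites caught the change. One failure here protects
    the dozens of repos whose coverage would have missed it."""
    return sorted(name for name, passed in results.items() if not passed)
```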
Netflix takes this further with something Aubrey Chipman called "shadow validation": running the full validation stack before a PR is even opened. By the time a developer sees it, the change has already passed compilation, artifact creation, and all the pre-test phases. "An external PR is kind of like unwelcome advice coming from an in-law. You're just looking for a reason to reject it," Aubrey said.
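A minimal sketch of a shadow-validation flow, assuming a simple ordered pipeline. The phase names mirror the ones above, but the structure is illustrative, not Netflix's implementation.

```python
from typing import Callable

# A phase is a name plus a check that returns True on success.
Phase = tuple[str, Callable[[], bool]]

def shadow_validate(phases: list[Phase]) -> tuple[bool, list[str]]:
    """Run each pre-review phase in order, stopping at the first failure.
    Returns (all_passed, names_of_phases_that_ran)."""
    ran: list[str] = []
    for name, check in phases:
        ran.append(name)
        if not check():
            return False, ran
    return True, ran
```

A caller would open the PR only when the first element of the result is `True`, so a failing change never reaches a reviewer's queue.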
At Airbnb, teams also build lightweight throwaway verification tools for specific migrations. During an email rendering migration, someone built a quick visual comparison showing how emails looked before and after. Nobody's shipping that tool to production. It answered the one question the test suite couldn't.
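In the same throwaway spirit, a before/after comparison can be as small as a text diff of rendered output. The HTML inputs here are stand-ins for the email templates in the anecdote.

```python
import difflib

def render_diff(before_html: str, after_html: str) -> list[str]:
    """Unified diff of two rendered emails; an empty diff means the
    migration preserved the output exactly."""
    return list(difflib.unified_diff(
        before_html.splitlines(), after_html.splitlines(), lineterm=""))
```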
What a trustworthy automated code change actually looks like
Automated PRs fail partly because they look like no one is accountable for them.
Aubrey Chipman's fix: make the PR explain itself.
- Link to the git tags of changed versions.
- Include release notes for before and after.
- Name who to contact if something looks wrong.
It turns an anonymous automated change into evidence that a real person thought it through.
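Here's a sketch of what that checklist can look like as a generated PR body. The URL scheme, field names, and owner handle are invented for illustration, not any team's actual format.

```python
def pr_body(dep: str, old: str, new: str, owner: str) -> str:
    """Assemble a self-explaining description for an automated bump.
    The release-tag URL scheme is hypothetical; adapt to your forge."""
    tag = f"https://github.com/example/{dep}/releases/tag"  # assumed scheme
    return "\n".join([
        f"Bumps {dep} from {old} to {new}.",
        f"- Old release notes: {tag}/v{old}",
        f"- New release notes: {tag}/v{new}",
        f"- Something look wrong? Contact {owner}.",
    ])
```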
Prince Valluri made the same point about agent evals. When your agent setup changes, do you know if it got better? Scoring on objective dimensions like compile and test, and subjective ones like readability, feels slow, but he argued it pays off in the long run.
Code review at scale: the new engineering bottleneck
If that's right, and I think it is, code review infrastructure is a strategic investment right now, not a maintenance concern.
Netflix has a "campaigns" concept: one view of all PRs from a specific migration, with merge rates, blocked states, and who to contact. LinkedIn runs AI-powered review agents that filter PRs before a human ever picks one up, and tags them by confidence level and blast radius so reviewers can calibrate before diving in. Neither of those is a trivial thing to build. Both teams built them anyway.
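One way to picture confidence-and-blast-radius tagging is a small triage function. The thresholds and tier names here are invented for illustration, not LinkedIn's.

```python
def review_tier(confidence: float, files_touched: int) -> str:
    """Tag an automated PR so a reviewer can calibrate before diving in."""
    if confidence >= 0.9 and files_touched <= 5:
        return "fast-track"       # high confidence, small blast radius
    if confidence >= 0.6:
        return "standard-review"  # a normal human pass
    return "deep-review"          # low confidence: read carefully
```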
The dark factory: deterministic AI agents for code migration
The teams doing this well have stopped asking "how do we generate better changes" and started asking "how do we make engineers comfortable saying yes."
We're building something at Moderne that takes this further. We're calling it the dark factory.
Think of it as an agent-driven system running continuously against your code estate, working toward a goal you set, no one watching.
- Stay current on Java.
- Remediate vulnerabilities as they surface.
- Upgrade a framework across 300 repositories.
What makes it different from just running an agent is that each repository it processes makes the next run smarter. The 50th repo gets a better pass than the first.
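A speculative sketch of that loop: lessons learned on each repo feed the next pass. The `migrate` stand-in is an assumption; the real system is built on deterministic OpenRewrite recipes, not this code.

```python
def migrate(repo: str, lessons: list[str]) -> list[str]:
    """Stand-in for one deterministic migration pass; returns anything
    new it learned (an edge case, a fix pattern) for later runs."""
    return [f"pattern-from-{repo}"]

def dark_factory(repos: list[str]) -> tuple[list[str], list[str]]:
    """Process repos in order, carrying accumulated lessons forward,
    so the 50th repo gets a better pass than the first."""
    lessons: list[str] = []
    done: list[str] = []
    for repo in repos:
        lessons += migrate(repo, lessons)
        done.append(repo)
    return done, lessons
```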
It produces changes engineers actually trust because the underlying work is done by deterministic tools, not inference. Recipes that verify their own output. Changes you can read and understand. An agent driving the process without hallucinating through it.
Jonathan built the first one in about a day and a half.
More to share soon.
Join in on this conversation live at Code Remix Summit, Miami, May 11–13.

