The PR is the easy part: What Netflix, Airbnb, and LinkedIn taught me about automated code changes at scale

Rooz Mohazzabi | March 23, 2026

Key Takeaways

Last month, four engineers who've spent years running automated code migrations at some of the world's largest software organizations shared what actually works and what doesn't. Aubrey Chipman (Netflix), Moses Nakamura (Airbnb), Prince Valluri (LinkedIn), and Jonathan Schneider (Moderne) joined a panel about the gap between generating a pull request and getting it merged.

Here's what stuck:

  • Generating the PR is the easy part. Getting engineers to trust and merge it is the unsolved problem at every company represented on this panel.
  • Trust has to be built before the PR exists. Shadow validation, herd testing, and rich attribution all earn credibility upstream, not at review time.
  • Migration fatigue is real. Even a good automated change gets rejected when engineers have been hit with dozens before it. Volume erodes trust faster than quality builds it.
  • Code review is becoming the most important engineering activity, not the least. As AI writes more of the code, the thinking happens in review. Teams that treat it as overhead are going to fall behind.
  • Deterministic tools are what make AI-generated changes trustworthy at scale. Agents that hallucinate their way through a migration don't get merged. Changes that verify their own output do.

Watch the full panel on demand → Watch the recording

I've been having the same conversation for the better part of a year. I get on a call with someone running a central tools team or a developer productivity org, and they say some version of: "We can generate PRs all day. That's not the problem. The problem is nobody's reviewing them."

Last month, I got four people who've been living this problem into the same room:

  • Aubrey Chipman, Senior Engineer, Netflix
  • Moses Nakamura, Senior Staff Engineer, Airbnb
  • Prince Valluri, Principal Staff Software Engineer, LinkedIn
  • Jonathan Schneider, CEO & Co-founder, Moderne, and creator of OpenRewrite

Here are the things that actually stuck with me.

Why automated PRs get rejected: the trust problem

"An external PR is kind of like unwelcome advice coming from an in-law. You're just looking for a reason to reject it."

Jonathan Schneider, Moderne

It doesn't matter whether the PR was produced by a human or a bot. The receiving team perceives it as risk. The exact same change gets a much higher merge rate when the team triggers it themselves rather than finding it in their queue from outside.

At Airbnb, code review runs on social pressure and relationship. A real person asking you to look at something gets a very different response than a cron job dropping PRs into your inbox. When automation removes that human ask, Moses said, getting anyone to review the PR becomes genuinely hard.

Prince Valluri called it "migration fatigue." Back-to-back migrations, dependency bumps, and framework upgrades mean that eventually every PR, even a reasonable one, feels like more work arriving from upstream. The problem isn't any single change. It's the weight of all of them.

Building developer trust before the pull request opens

Stop trying to earn trust at review time. Earn it upstream.

Jonathan Schneider made the case for testing proposed changes across a whole business unit, not one repository at a time. Any single codebase has imperfect coverage, but across dozens of repos, their imperfect tests collectively cover each other's edges.

He called it "herd safety." One repo catches a failure that 50 others would have missed.
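The herd rule itself is simple. Here's a minimal Python sketch with invented names (`RepoResult`, `herd_verdict` are not anyone's actual API): every repo's imperfect test suite gets a vote, and a single failure anywhere blocks the change for the whole herd.

```python
from dataclasses import dataclass

@dataclass
class RepoResult:
    repo: str           # repository name
    tests_passed: bool  # did this repo's suite pass with the change applied?

def herd_verdict(results: list[RepoResult]) -> tuple[bool, list[str]]:
    """Herd safety: the change is safe only if every repo's (imperfect)
    test suite passes; one failure anywhere blocks the rollout."""
    failures = [r.repo for r in results if not r.tests_passed]
    return (len(failures) == 0, failures)

# One repo catches what the other 50 would have missed.
results = [RepoResult(f"service-{i}", True) for i in range(50)]
results.append(RepoResult("billing-core", False))
safe, blockers = herd_verdict(results)  # safe is False
```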

Netflix takes this further with something Aubrey Chipman called "shadow validation": running the full validation stack before a PR is even opened. By the time a developer sees it, the change has already passed compilation, artifact creation, and all the pre-test phases.
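The gating logic reads as a short pipeline. This is an illustrative Python sketch, not Netflix's actual stack; the phase functions are stand-ins for real build steps:

```python
# Stand-in phases; a real stack would shell out to build tooling here.
def compiles(change) -> bool: return True
def builds_artifacts(change) -> bool: return True
def passes_pretest(change) -> bool: return True

PHASES = [("compile", compiles),
          ("artifacts", builds_artifacts),
          ("pre-test", passes_pretest)]

def shadow_validate(change):
    """Run the whole validation stack before any PR exists; stop at the
    first failing phase so a failing change never reaches a reviewer."""
    log = []
    for name, phase in PHASES:
        ok = phase(change)
        log.append(f"{name}: {'pass' if ok else 'fail'}")
        if not ok:
            return False, log
    return True, log
```

Only a change that survives every phase goes on to become a PR.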

At Airbnb, teams also build lightweight throwaway verification tools for specific migrations. During an email rendering migration, someone built a quick visual comparison showing how emails looked before and after. Nobody's shipping that tool to production. It answered the one question the test suite couldn't.


Dive Deeper

Automated Code Remediation at Scale: The Role of AI in Application Modernization and Security

How top engineering teams are combining AI and deterministic automation to tackle tech debt and security at scale.

What a trustworthy automated code change actually looks like

Automated PRs fail partly because they look like no one is accountable for them.

Aubrey Chipman's fix: make the PR explain itself.

  • Link to the git tags of changed versions. 
  • Include release notes for before and after. 
  • Name who to contact if something looks wrong. 

It turns an anonymous automated change into evidence that a real person thought it through.
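One way to picture that self-explaining PR body, as a hedged Python sketch (the URL and field names here are invented for illustration, not a real registry):

```python
def pr_body(artifact: str, old: str, new: str, owner: str) -> str:
    """Assemble a PR description that explains itself: version tags,
    release notes for both versions, and a human to contact.
    The release-notes URL is a placeholder."""
    return "\n".join([
        f"## Upgrade {artifact}: {old} -> {new}",
        f"- Old tag: {artifact}@v{old}",
        f"- New tag: {artifact}@v{new}",
        f"- Release notes: https://example.com/{artifact}/releases",
        f"- Something look wrong? Contact {owner}",
    ])

body = pr_body("guava", "31.1", "33.0", "@platform-team")
```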

Prince Valluri made the same point about agent evals. When your agent setup changes, do you know if it got better? Scoring on objective dimensions like compile and test, and subjective ones like readability, feels slow, but, he argued, it pays off in the long run.
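A toy version of such an eval, blending hard objective gates with a subjective rubric score. The 70/30 weights and field names are assumptions for illustration, not LinkedIn's actual scheme:

```python
def eval_score(run: dict) -> float:
    """Blend hard objective gates (compile, tests: 0 or 1) with a
    subjective rubric score (readability in [0, 1]). The 70/30 weights
    are illustrative, not a recommendation."""
    objective = (run["compiles"] + run["tests_pass"]) / 2
    return 0.7 * objective + 0.3 * run["readability"]

# Did the new agent setup actually get better? Score both and compare.
before = {"compiles": 1, "tests_pass": 0, "readability": 0.9}
after  = {"compiles": 1, "tests_pass": 1, "readability": 0.8}
improved = eval_score(after) > eval_score(before)
```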

Code review at scale: The new engineering bottleneck

"Before, the tippy-top coding was where you did most of your thinking. Now that the tippy-top coding is disappearing, you still need a place where you're doing your thinking about solutions. That's really becoming code review."

Moses Nakamura, Airbnb

If that's right, and I think it is, code review infrastructure is a strategic investment right now, not a maintenance concern.

Netflix has a "campaigns" concept: one view of all PRs from a specific migration, with merge rates, blocked states, and who to contact. LinkedIn runs AI-powered review agents that filter PRs before a human ever picks them up, and tag each one by confidence level and blast radius so reviewers can calibrate before diving in. Neither of those is a trivial thing to build. Both teams built them anyway.
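At its core, a campaign roll-up is just aggregation over PR states. A minimal Python sketch, with an invented `state` field rather than any real tracker's API:

```python
from collections import Counter

def campaign_summary(prs: list[dict]) -> dict:
    """Roll every PR from one migration into a single view: merge rate,
    plus how many are blocked and need a human nudge."""
    states = Counter(pr["state"] for pr in prs)
    merged = states.get("merged", 0)
    return {
        "total": len(prs),
        "merge_rate": merged / len(prs) if prs else 0.0,
        "blocked": states.get("blocked", 0),
    }
```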

The Dark Factory: Deterministic AI agents for code migration

The teams doing this well have stopped asking "how do we generate better changes" and started asking "how do we make engineers comfortable saying yes."

We're building something at Moderne that takes this further. We're calling it the dark factory.

Think of it as an agent-driven system running continuously against your code estate, working toward a goal you set, no one watching. 

  • Stay current on Java. 
  • Remediate vulnerabilities as they surface. 
  • Upgrade a framework across 300 repositories. 

What makes it different from just running an agent is that each repository it processes makes the next run smarter. The 50th repo gets a better pass than the first.
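That learning loop can be sketched as a fold over repositories, where fixes discovered on one repo become known patterns for every later repo. This is an illustrative sketch of the idea, not the dark factory's actual design; `apply_change` is a stand-in for the deterministic recipe engine:

```python
def run_dark_factory(repos, apply_change):
    """Process repos in order; fixes discovered on one repo are carried
    forward as known patterns, so repo N+1 gets a smarter pass than
    repo N. `apply_change` stands in for the recipe engine and returns
    the set of new fixes it had to discover on that repo."""
    known_fixes: set[str] = set()
    history = []
    for repo in repos:
        known_fixes |= apply_change(repo, known_fixes)
        history.append((repo, len(known_fixes)))
    return history
```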

It produces changes engineers actually trust because the underlying work is done by deterministic tools, not inference. Recipes that verify their own output. Changes you can read and understand. An agent driving the process without hallucinating through it.

Jonathan built the first one in about a day and a half. 

More to share soon. 

Join the conversation live at Code Remix Summit, Miami, May 11–13.

See how Moderne handles this in practice

Watch the full panel with engineers from Netflix, Airbnb, LinkedIn, and Moderne.