Code authorship vs. code remediation

Jonathan Schneider
March 23, 2023

Key Takeaways

Just as traditional technical debt was addressed through refactoring, the new kind of technical debt plaguing our modern assembled applications can be addressed by expanding refactoring capabilities. The remediation work is highly repetitive across organizations, because everyone integrates some subset of the same third-party components to create business value. That high level of repeatability points us to automation.

When we write code, our hands lag behind our thoughts. We think much faster than we are able to type the code out. 

When we need to remediate some part of the codebase, we usually have a set of instructions: update the dependency version, replace this API call with another, or add arguments. When a major framework migration is necessary, we read release notes, contemplate our codebase, and go into despair. The types of changes we need are enumerable almost immediately, but again our ability to implement them lags far behind our recognition of the problem. 

Code authorship and code maintenance are quite different activities. There are multiple ways to implement new functionality. Writing new code is like sitting in a restaurant when the waiter asks whether you want red wine or white: neither choice is wrong per se at that point.

Remediations such as upgrading from JUnit 4 to 5 do not offer multiple valid branches. There are edge cases that require creativity in how they are approached, but this is the enjoyable part of the work and should be left to developers. The majority of the changes are immediately enumerable.

Because of this one-to-one correspondence between the code before and after a fix, a rules-based refactoring engine for code transformations can be developed. Our IDEs use similar underlying technology but are oriented toward single-repository manipulation, not the management of organization-wide software assets. Moreover, when transforming large bodies of existing code, we can make a simplifying assumption that the code is in a working state to begin with, so we can assemble much more complex rules while still being 100% accurate. We can progressively encapsulate lower-level building blocks into larger and larger transformations.
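To make the idea of composable rules concrete, here is a minimal sketch in Python. It is an illustration under loud assumptions, not how a production engine works: the `Rule`, `Recipe`, and `change_type` names are hypothetical, the rewrites are plain regular expressions over source text (a real engine would operate on a syntax tree with type information), and only three of the many JUnit 4 to 5 changes are covered.

```python
import re
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Rule:
    """A single deterministic rewrite: one pattern, exactly one replacement."""
    pattern: str
    replacement: str

    def apply(self, source: str) -> str:
        return re.sub(self.pattern, self.replacement, source)


@dataclass(frozen=True)
class Recipe:
    """A named sequence of rules or smaller recipes, applied in order."""
    name: str
    steps: Tuple[Union[Rule, "Recipe"], ...]

    def apply(self, source: str) -> str:
        for step in self.steps:
            source = step.apply(source)
        return source


def change_type(old_fqn: str, new_fqn: str) -> Recipe:
    """Hypothetical building block: move an annotation type to a new name.

    Text-based simplification: rewrites the import and every use of the
    simple name as an annotation.
    """
    old_simple = old_fqn.rsplit(".", 1)[1]
    new_simple = new_fqn.rsplit(".", 1)[1]
    return Recipe(
        f"{old_fqn} -> {new_fqn}",
        (
            Rule(rf"\bimport {re.escape(old_fqn)};", f"import {new_fqn};"),
            # \b keeps @Before from matching @BeforeClass or @BeforeEach.
            Rule(rf"@{re.escape(old_simple)}\b", f"@{new_simple}"),
        ),
    )


# A larger recipe assembled from the smaller building block.
JUNIT_4_TO_5 = Recipe(
    "JUnit 4 to 5 (partial sketch)",
    (
        change_type("org.junit.Test", "org.junit.jupiter.api.Test"),
        change_type("org.junit.Before", "org.junit.jupiter.api.BeforeEach"),
        change_type("org.junit.After", "org.junit.jupiter.api.AfterEach"),
    ),
)

before_source = """import org.junit.Before;
import org.junit.Test;

public class CartTest {
    @Before
    public void setUp() {}

    @Test
    public void addsItem() {}
}
"""

after_source = JUNIT_4_TO_5.apply(before_source)
print(after_source)
```

Each rule has exactly one output for a given input, which is what makes the result predictable, and applying the recipe a second time leaves the already-migrated source unchanged.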

One last difference between code authoring and code remediation is that remediation needs to be coordinated across multiple places in the codebase, whether within a single repository or across repository boundaries. For example, changes need to be coordinated across repository boundaries if we want to change an API and its consumers (dependency management is a case of this kind of change in which the producer and the consumers belong to different organizations). Code authoring, on the other hand, is single-threaded: we can only author in one place in the code at a time. So an IDE is perfect for authoring, but a new large-scale distributed system is needed to manage code remediation at scale.
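As a rough sketch of what coordinating one change across repository boundaries could look like, the Python example below applies the same rewrite to an API producer and its consumers and collects a patch per repository. Everything in it is assumed for illustration: the repository names, the file contents, the `fetchUser` to `getUser` rename, and the in-memory dictionaries standing in for real checkouts. The rewrite itself is again a text-level simplification.

```python
import difflib
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple


def rename_method(source: str) -> str:
    """Hypothetical remediation: rename fetchUser(...) to getUser(...)."""
    return re.sub(r"\bfetchUser\(", "getUser(", source)


# In-memory stand-ins for three repositories (illustrative content only).
REPOSITORIES: Dict[str, Dict[str, str]] = {
    "user-api": {
        "UserClient.java": "public interface UserClient {\n"
                           "    User fetchUser(String id);\n"
                           "}\n",
    },
    "billing-service": {
        "Invoice.java": "class Invoice {\n"
                        "    User owner(UserClient c, String id) {\n"
                        "        return c.fetchUser(id);\n"
                        "    }\n"
                        "}\n",
    },
    "docs-site": {
        "Index.java": "class Index {}\n",
    },
}


def remediate(
    repo: str, files: Dict[str, str], rewrite: Callable[[str], str]
) -> Tuple[str, List[str]]:
    """Apply the rewrite to every file in one repository; return its patches."""
    patches = []
    for path, old in sorted(files.items()):
        new = rewrite(old)
        if new != old:
            diff = difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
                fromfile=f"a/{path}",
                tofile=f"b/{path}",
            )
            patches.append("".join(diff))
    return repo, patches


def remediate_all(
    repos: Dict[str, Dict[str, str]], rewrite: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Run the same remediation over all repositories in parallel."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(
            lambda item: remediate(item[0], item[1], rewrite), repos.items()
        )
        # Keep only repositories where the rewrite changed something.
        return {repo: patches for repo, patches in results if patches}


changes = remediate_all(REPOSITORIES, rename_method)
for repo, patches in sorted(changes.items()):
    print(f"== {repo} ==")
    print("".join(patches))
```

The producer and the consumer receive matching patches from one run, while the repository that never used the API is left alone. A real system would also have to handle cloning, building, review, and merging, none of which is shown here.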

Note that new AI-based autocompletion tools like GitHub Copilot are also primarily code authoring tools. In the IDE, which is traditionally a rules-based engine, when we hit a shortcut, we know exactly what code is going to be generated. AI-based autocompletion, by contrast, is likely to generate a block of code that we cannot predict in advance. As an authorship experience localized to a single point in the code, this can be valuable: developers can review and accept or reject the suggestion because they are already working on the code in that place. Look for future blog entries where we will show how generative AI can work alongside the authoritative refactoring engine to speed up code remediation in a specifically targeted way.