Book Analysis: Working Effectively with Legacy Code

I have had the pleasure of reading, and using at work, Working Effectively with Legacy Code by Michael C. Feathers. It has provided me with a pool of ideas to draw from and put into practice right away.

As the name implies, the book focuses on dealing with the issues present in any system with legacy code. By legacy code, the author means code without tests: code for which, in essence, it is not possible to know whether it is getting better or worse.

It covers several mainstream topics for legacy systems, such as:

Getting legacy code into a test harness.
Writing tests that shield you from new problems.
Common techniques that are applicable in any language.
Identification of places where code must be changed and safe ways to proceed.
Coping with systems with no recognizable structure.
Handling systems that aren’t object-oriented.

The book also includes a catalog of techniques that help the reader break dependencies among system components and make safer changes.

Book Structure

The book is split into three main parts that cover related topics:

The Mechanics of Change
Changing Software
Dependency-Breaking Techniques

The Mechanics of Change

The first part of the book introduces concepts that are essential when we dive into legacy code. It is divided into 5 chapters, each devoted to one of the main ideas.

It starts by analyzing the different reasons to change code and how they affect structure, functionality and resource consumption. In turn, this leads us to ask what changes are needed, how we can be certain those changes are correct and, last but not least, how we can be confident we did not break anything.

Now that we have changed things, how do we get feedback?

The book addresses this question by showing the role of tests in software development. To support its argument, it delves into different aspects of testing. First, it mentions different goals of writing tests, such as testing to show correctness and testing to detect change. Then, it explains what unit testing is exactly and how it differs from other types of tests. Finally, it analyzes how to use tests consistently as a safety net when changing code.

When writing tests for legacy code, it is common to run into dependencies among classes. Usually these dependencies make it very difficult to instantiate the class under test, and we must resort to breaking them. The author distinguishes between two reasons to break dependencies:

Sensing: Used when we cannot access the values our code computes in its original state.
Separation: Used when we cannot even get a section of code into a test harness.

However, in certain situations, breaking dependencies is not the best solution. Instead, it makes more sense to fake the classes our object depends on. The book discusses the merits of using fake collaborators to write tests and the differences between Fake Objects and Mock Objects.
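To make the idea of a fake collaborator concrete, here is a minimal Python sketch (the book's own examples are in other languages). The names `Sale`, `FakeDisplay` and `show_line` are assumptions chosen for illustration: the fake stands in for a real display and records what it was asked to show, so the test can sense the result.

```python
class FakeDisplay:
    """Hypothetical fake collaborator: records the last line it was given."""

    def __init__(self):
        self.last_line = None

    def show_line(self, line):
        # No real hardware involved; we only remember the value for the test.
        self.last_line = line


class Sale:
    """Hypothetical class under test; it only needs an object with show_line()."""

    def __init__(self, display, prices):
        self.display = display
        self.prices = prices

    def scan(self, barcode):
        price = self.prices[barcode]
        self.display.show_line(f"{barcode} ${price:.2f}")


# Usage: the fake lets the test sense what Sale sent to its display.
fake = FakeDisplay()
Sale(fake, {"milk": 3.99}).scan("milk")
assert fake.last_line == "milk $3.99"
```

A mock object would go one step further and carry the expectation check itself; the fake above leaves the assertion to the test.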

On the other hand, breaking dependencies is not always enough to get our classes under test. In the presence of global calls or external API calls, we need a slightly different approach. This is where the concept of a seam comes in: according to the author, a seam is a place where you can alter behavior in your program without editing in that place. Every seam has an enabling point, a place where we can decide which behavior to use, depending on whether we are running in production or in a test. Depending on our language of choice, several types of seams are available. The book lists some common ones, such as preprocessing seams, link seams and object seams.
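As a rough illustration of an object seam, consider the Python sketch below. All names (`ReportGenerator`, `current_timestamp`, `build_header`) are hypothetical. The call to `self.current_timestamp()` is the seam, because its behavior can change without editing `header()`; the enabling point is wherever we choose which object to create.

```python
import time


class ReportGenerator:
    """Hypothetical production class that depends on the system clock."""

    def current_timestamp(self):
        # In production this reaches out to a global call.
        return time.time()

    def header(self):
        # Seam: this call can be redirected without editing this method.
        return f"Report generated at {int(self.current_timestamp())}"


class TestingReportGenerator(ReportGenerator):
    """Subclass used only in tests; replaces the clock with a fixed value."""

    def current_timestamp(self):
        return 1_000_000


def build_header(generator):
    # Enabling point: the caller decides which object is passed in.
    return generator.header()


assert build_header(TestingReportGenerator()) == "Report generated at 1000000"
```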

Finally, the first part discusses the importance of using the proper tools to facilitate our work. To deal with legacy code we need, at a minimum, an editor (or IDE), a build system and a testing framework. But to boost our efficiency, the book recommends taking advantage of an automated refactoring tool, which automates many of the changes we must make while improving our codebase. To conclude the topic, the author compares test frameworks for different languages, such as JUnit, NUnit and FIT.

Changing Software

The next part of the book comprises 19 chapters that address questions that commonly arise when working with legacy code.

It starts by covering several techniques we can use when we need to change our code. These techniques entail writing fresh code separately, with the goal of making it easier to test those changes in isolation. Then, the author turns to the question of how long it takes to implement a change and get feedback, which depends mainly on:

Understanding: As projects grow, it becomes more difficult to have a clear picture of what changes we should make. The difference between a well-maintained system and a legacy system lies in how easily a change can be implemented. In a well-maintained system changing code should be relatively easy, while it can be nigh-impossible in a legacy project.
Lag time: Lag time, as the author defines it, is the amount of time that passes between making a change and getting feedback from it. According to the book, recompiling our changes and seeing their effects should not take more than 10 seconds. This means our build should be quick, and the way to achieve it is by properly managing (i.e. breaking) dependencies among classes.
Build dependencies: As the dependencies in our code grow, our build takes longer to run, thus delaying our feedback. Breaking dependencies makes our average build faster, which helps us reduce the number of errors we introduce and speeds up test execution.

Then, it follows with a discussion of two common ways of programming a new feature: TDD, which means writing our tests first and then writing the code that makes them pass, and Programming by Difference, which relies on inheritance to add new behavior without modifying existing functionality in our code.
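A small Python sketch may help picture Programming by Difference. The class names and the anonymizing rule below are assumptions made up for this example: the new feature lives in a subclass that overrides a single method, and the existing class stays untouched.

```python
class MessageForwarder:
    """Hypothetical existing class; we do not modify it."""

    def __init__(self, list_address):
        self.list_address = list_address

    def from_address(self, original_sender):
        return original_sender

    def forward(self, original_sender, body):
        return {
            "from": self.from_address(original_sender),
            "to": self.list_address,
            "body": body,
        }


class AnonymousMessageForwarder(MessageForwarder):
    """New behavior expressed only as a difference from the parent class."""

    def from_address(self, original_sender):
        domain = self.list_address.split("@")[1]
        return f"anon-members@{domain}"


# In TDD style, a check like this would be written before the subclass exists.
message = AnonymousMessageForwarder("team@example.com").forward("bob@corp.org", "hi")
assert message["from"] == "anon-members@example.com"
```

Inheritance used this way is a quick first step; once tests are in place, the design can be revisited.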

The next chapters bring a set of recipes for getting a class or method into a test harness for the first time. They are organized around recurring cases (e.g. Construction Blob and Onion Parameter), and the strategies for handling them become patterns we can apply iteratively to improve the project’s health.

The author continues by focusing on a set of skills to use in preparation for changing the code. He shows how to reason about the effects of our modifications and presents techniques to predict them, such as effect sketching and interception points. By writing Characterization Tests, we can pin down the current behavior of our code before attempting to introduce our changes.
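To illustrate what a characterization test looks like, here is a minimal Python sketch. The function `format_amount` is an invented stand-in for legacy code; the point is that the assertions record what the code actually does today, including behavior that looks wrong.

```python
def format_amount(cents):
    """Hypothetical legacy function whose exact behavior nobody remembers."""
    sign = "-" if cents < 0 else ""
    cents = abs(cents)
    return f"{sign}{cents // 100}.{cents % 100}"


def test_characterize_format_amount():
    # These values were obtained by running the code, not from a specification.
    assert format_amount(1250) == "12.50"
    assert format_amount(-1250) == "-12.50"
    # Looks like a bug (missing zero padding), but we document it for now
    # so that any change to this behavior is a deliberate decision.
    assert format_amount(1205) == "12.5"


test_characterize_format_amount()
```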

The book moves on to tackle different issues that can arise in any application. It offers the reader different approaches to deal with Dependency Hell, API calls and the absence of structure in the codebase. The author gives several suggestions, such as using wrapper classes and isolating API code from the rest of the project. For structureless code, it recommends different techniques that help keep the system architecture alive in the team.
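The suggestion to wrap API calls can be sketched in Python as follows. This is only one possible reading of the idea; `DiskSpace`, `FakeDiskSpace` and `can_store` are hypothetical names, and the standard library call `shutil.disk_usage` plays the role of the external API.

```python
import shutil


class DiskSpace:
    """Thin wrapper: the only place that touches the external API."""

    def free_bytes(self, path):
        return shutil.disk_usage(path).free


class FakeDiskSpace:
    """Test stand-in with the same method, returning a fixed value."""

    def __init__(self, free):
        self._free = free

    def free_bytes(self, path):
        return self._free


def can_store(upload_size, disk, path="."):
    # Business logic depends on the wrapper, not on shutil directly.
    return upload_size <= disk.free_bytes(path)


assert can_store(10, FakeDiskSpace(100))
assert not can_store(200, FakeDiskSpace(100))
```

Because the rest of the code only sees `free_bytes`, the logic in `can_store` can be tested without depending on the real state of the disk.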

What if we are dealing with a huge amount of code? A method with 500 lines of code? A class with 50 or 70 methods?

Feathers discusses why teams get into this situation. As code is added piece after piece, classes and methods that were nicely designed in the first place become a huge mess, and later changes take more time and effort. Instead, it is better to focus our attention on breaking them down, which facilitates future development and test creation. For that, the book includes several methods for finding independent responsibilities and extracting them.

Finally, the author offers some suggestions for tackling these long-term goals in legacy code. Since it is very easy to lose perspective day-to-day about how the project is improving, Feathers suggests finding what motivates you about programming and realizing that, as you start taking control of your legacy code, oases of good code will start to appear and make your work more pleasant.

Dependency-Breaking Techniques

The last part of the book provides a variety of refactoring techniques. As the author writes, they are special because they can be used without tests, since they are the initial grunt work we have to do in order to get our target code into a test harness. These procedures won’t make your design better; in fact, in some cases, the project design becomes worse. But it is important to keep in mind that once we have the classes under test, we can (and must) go back and improve the design.

These refactorings can be grouped into several main categories, such as:

Element Parameterization: Refers to techniques such as Adapt Parameter and Parameterize Constructor, which allow us to break direct dependencies on external classes by using new interfaces and/or injecting dependencies.
Component Extraction: Comprises approaches that focus on separation of concerns and extensibility, such as Extract Interface and Extract and Override Factory Method.
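As an example of the first category, the sketch below shows roughly what Parameterize Constructor could look like in Python. The classes are hypothetical, and the optional argument with a production default is an assumption about how the technique translates to a language with default parameters.

```python
class SmtpMailer:
    """Hypothetical production dependency we do not want to run in tests."""

    def send(self, to, subject):
        raise RuntimeError("would contact a real mail server")


class RecordingMailer:
    """Test double that only records the messages it was asked to send."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))


class InvoiceService:
    def __init__(self, mailer=None):
        # Before the refactoring, the constructor always created SmtpMailer().
        # Existing callers still get that default; tests can inject their own.
        self.mailer = mailer if mailer is not None else SmtpMailer()

    def bill(self, customer_email, amount):
        self.mailer.send(customer_email, f"Invoice for {amount}")


mailer = RecordingMailer()
InvoiceService(mailer).bill("ann@example.com", 40)
assert mailer.sent == [("ann@example.com", "Invoice for 40")]
```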

These techniques and their variations, combined with the previous strategies for changing software, should cover many of the problems we can find in our projects.

Conclusions

Let’s now recap our analysis. First, we saw several concepts that serve as the foundation for tackling legacy code. Knowing what changes we are making and how to get feedback from them allows us to make safe decisions and move forward with our refactorings. We saw that tests are an essential component of this process and discussed several strategies for writing them by breaking our dependencies or faking them. Then, we reviewed different strategies that the author uses to address common problems in our projects, such as adding new features and getting code into a test harness. Finally, the last part comprised several refactoring recipes that facilitate making changes in the wild without corresponding test coverage. Overall, the book delivers on its promise, and I already feel much more prepared to handle the tricky situations that are common in legacy codebases.

I hope you enjoyed reading this analysis as much as I enjoyed writing it, and that you have learned a thing or two that will help you deal with legacy code in the future. See you soon and stay tuned!