Breaking Hidden Dependencies

This is part of my series on practising TDD while working with legacy code. The first problem most developers encounter when trying to get a class under test is hidden dependencies. This is usually what makes developers give up their TDD initiative and conclude that unit testing legacy code is next to impossible.

Seams and Dependencies

The main reason legacy code is hard to test is that it is hard to get at the hidden dependencies of a class. The class you are trying to change might depend on a database or file streams or an email server or on twenty modules with their own dependencies. The setup for a test that initialises all these dependencies might be 50 lines of code or involve setting up a new test database. To make unit testing viable, you have to break these dependencies and make the code under test more loosely coupled. I don’t want to log or save something to the database while testing a unit of code. I only want to test the logic in that unit. To achieve this I need to be able to create a fake database (or a fake logging class) that doesn’t interfere with my test.

Another reason to break dependencies is to use them for sensing. Sometimes the class under test affects other coupled classes but we have no way of seeing the result. If we can get at the dependency (or replace it with a fake) then we can see what message the class under test tried to send.
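Here is a minimal sketch of sensing with a fake. The ILogger interface, RecordingLogger and DiscountCalculator are hypothetical names invented for this illustration, and it assumes the dependency can already be passed in through the constructor, which is often not the case in legacy code.

```csharp
using System.Collections.Generic;

// Hypothetical abstraction over logging; not taken from a real code base.
public interface ILogger
{
    void Log(string message);
}

// Fake used only in tests: it records messages instead of writing them anywhere.
public class RecordingLogger : ILogger
{
    public List<string> Messages { get; } = new List<string>();

    public void Log(string message)
    {
        Messages.Add(message);
    }
}

// Hypothetical class under test that reports what it did through the logger.
public class DiscountCalculator
{
    private readonly ILogger logger;

    public DiscountCalculator(ILogger logger)
    {
        this.logger = logger;
    }

    public decimal Apply(decimal total)
    {
        if (total >= 100m)
        {
            // This is the "message" a test can sense through the fake.
            logger.Log("Discount applied");
            return total * 0.9m;
        }

        return total;
    }
}
```

In a test I would pass in a RecordingLogger, call Apply and then inspect Messages to see what the class under test tried to tell its dependency.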

The first problem is getting at the dependency so that I can swap it out or disable it. It could be buried a few levels down in the code and be inaccessible to me when I new up the object I want to test. I need to find a seam: a point in the code where I can write tests or make a change to enable testing. I work mostly with C# (and JavaScript, but this stuff is way easier there) so if I want to test without changing the code then I usually have three options.

The Easy Option

The first option is to write a test for the class and see if it works. If it doesn’t, then I try passing null instead of a real value for the constructor parameters. And if there are still dependencies causing problems then I can try setting them to null via a property (if the dependency is a public property). This works very occasionally but usually I have to do more work than this to get the class under test.

orderService = new OrderService(null, null, null, dependencyICareAbout);
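To show why passing null sometimes works and sometimes doesn't, here is a small sketch. PricingService and its three collaborator interfaces are hypothetical and exist only for this example. The trick works only when the method under test never touches the dependencies that were set to null.

```csharp
using System;

// Hypothetical collaborators, invented for this sketch.
public interface IAuditLog { void Write(string entry); }
public interface IMailer { void Send(string to, string body); }
public interface IPriceList { decimal PriceOf(string sku); }

public class PricingService
{
    private readonly IAuditLog auditLog;
    private readonly IMailer mailer;
    private readonly IPriceList priceList;

    public PricingService(IAuditLog auditLog, IMailer mailer, IPriceList priceList)
    {
        this.auditLog = auditLog;
        this.mailer = mailer;
        this.priceList = priceList;
    }

    // Only uses priceList, so the other two dependencies can be null in a test.
    public decimal CalculateTotal(string sku, int quantity)
    {
        return priceList.PriceOf(sku) * quantity;
    }

    // Uses mailer and auditLog, so a test that passed null for them
    // gets a NullReferenceException here.
    public void NotifyCustomer(string address)
    {
        mailer.Send(address, "Your order has shipped");
        auditLog.Write("Notified " + address);
    }
}

// The one dependency I care about, replaced with a trivial fake.
public class FixedPriceList : IPriceList
{
    public decimal PriceOf(string sku)
    {
        return 10m;
    }
}
```

With new PricingService(null, null, new FixedPriceList()) I can test CalculateTotal, but as soon as I call NotifyCustomer the nulls bite and I need one of the other options.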

The Test Without Changing Any Code Option

The second option is a pattern from Michael Feathers (again!) called Subclass and Override. This pattern uses inheritance to nullify the dependencies of a class that I don’t care about and allows me to get at dependencies I do care about. There are a lot of variations on this pattern so once you understand it then you can add a bunch of new tools to your refactoring toolbox.

To test a class I can create a new test class that inherits from it (a subclass) and then override methods to do something else. This test class is not part of the production code and is only for testing purposes. An example would be a class that has a dependency on the file system and that has a method that saves data to a file.

public void SaveOrder(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);
    GetOrderChanges();
    SaveOrderToFile(order);
}

public virtual void SaveOrderToFile(Order order)
{
    //File stream stuff
}

I could then override this method to do nothing and test the rest of the class without worrying about data being written to files. So in my example, I can write a test for the SaveOrder method (using my new TestOrderService class) to test the first two lines and not worry about data being saved to a file in the SaveOrderToFile method.

public class TestOrderService: OrderService
{
    public override void SaveOrderToFile(Order order)
    {
        //Do nothing
    }
}

A subtle variant of this pattern is changing the access modifier of a method from private to protected so that you can inherit the method in a test class and get access to it. In C# you will also have to make the method virtual to be able to override it.
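A rough sketch of this variant, using a hypothetical InvoiceService with a made-up 25% tax rule. The comment shows what the method signature looked like before the change.

```csharp
public class InvoiceService
{
    public decimal GetInvoiceTotal(decimal net)
    {
        return net + CalculateTax(net);
    }

    // Was: private decimal CalculateTax(decimal net)
    // Now protected (visible to subclasses) and virtual (overridable).
    protected virtual decimal CalculateTax(decimal net)
    {
        return net * 0.25m;
    }
}

// Test-only subclass that exposes the protected method to a test.
public class TestInvoiceService : InvoiceService
{
    public decimal CallCalculateTax(decimal net)
    {
        return CalculateTax(net);
    }
}

// Test-only subclass that pins the tax to zero so that
// GetInvoiceTotal can be tested in isolation.
public class NoTaxInvoiceService : InvoiceService
{
    protected override decimal CalculateTax(decimal net)
    {
        return 0m;
    }
}
```

The first subclass gives a test direct access to the formerly private logic, and the second one nullifies it. Neither belongs in the production assembly.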

The Subclass and Override pattern can also be used to inject a fake class instead of a dependency or to get at other internals to be able to sense if our test succeeded or failed. It is a very powerful pattern especially if you use one of the variations of it called Extract and Override (Call, Factory Method etc.).
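As an illustration of injecting a fake this way, here is a sketch of the Factory Method variation. All of the names (OrderExporter, IOrderFileWriter, the orders.txt file) are assumptions made up for this example. The idea is that a dependency created in the constructor is moved into a virtual factory method, which a testing subclass then overrides to return a fake.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical abstraction over the file system dependency.
public interface IOrderFileWriter
{
    void Write(string line);
}

public class DiskOrderFileWriter : IOrderFileWriter
{
    public void Write(string line)
    {
        // "orders.txt" is a placeholder path for this sketch.
        File.AppendAllText("orders.txt", line + Environment.NewLine);
    }
}

public class OrderExporter
{
    private readonly IOrderFileWriter writer;

    public OrderExporter()
    {
        // Was: writer = new DiskOrderFileWriter();
        writer = CreateWriter();
    }

    // The extracted factory method; subclasses can replace the dependency.
    protected virtual IOrderFileWriter CreateWriter()
    {
        return new DiskOrderFileWriter();
    }

    public void Export(int orderId)
    {
        writer.Write("order:" + orderId);
    }
}

// Fake that records lines instead of touching the disk (sensing).
public class FakeOrderFileWriter : IOrderFileWriter
{
    public List<string> Lines { get; } = new List<string>();

    public void Write(string line)
    {
        Lines.Add(line);
    }
}

public class TestOrderExporter : OrderExporter
{
    // In C#, property and field initializers run before the base constructor,
    // so FakeWriter already exists when the base class calls CreateWriter.
    public FakeOrderFileWriter FakeWriter { get; } = new FakeOrderFileWriter();

    protected override IOrderFileWriter CreateWriter()
    {
        return FakeWriter;
    }
}
```

Calling a virtual method from a constructor is something to be careful with, since the subclass constructor body has not run yet at that point. Here it is safe only because the fake is created in an initializer rather than in the subclass constructor.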

The Making Code Testable First Option

The third option is to make the code testable by making some changes to the code that open up the class for testing but do not change any logic. This sounds risky and it is, so to avoid any unpleasant side-effects I try to make any changes as small and safe as possible. Using a tool like ReSharper and refactorings like Extract Method (from Martin Fowler’s classic book, Refactoring) is how I usually accomplish this. There are lots of different techniques for making your code more testable but I will start us off by showing how to use the Extract and Override Call pattern (from Michael Feathers once again).

Here is a typical hard to test method. I need to write a test to check that the email subject and body are generated correctly but if I call this method it will try to send an email (or throw an exception if I don’t have an email server).

public void SendOrderConfirmationToCustomer(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);

    var email = new MailMessage(DefaultSender, order.CustomerEmailAddress)
                    {
                        Subject = "Order Confirmation",
                        Body = BuildOrderConfirmationMessage()
                    };
    smtpClient.Send(email);
}

The first step is to extract the last line into a new SendEmail method and then make that method protected and virtual.

public void SendOrderConfirmationToCustomer(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);

    var email = new MailMessage(DefaultSender, order.CustomerEmailAddress)
                    {
                        Subject = "Order Confirmation",
                        Body = BuildOrderConfirmationMessage()
                    };
    SendEmail(email);
}

protected virtual void SendEmail(MailMessage email)
{
    smtpClient.Send(email);
}

The second step is to introduce a new testing subclass that records the message to be sent by saving it in the SentMessage property. This property is for testing purposes only.

public class TestOrderService: OrderService
{
    public MailMessage SentMessage { get; set; }

    protected override void SendEmail(MailMessage email)
    {
        SentMessage = email;
    }
}

Now I can write a unit test for the SendOrderConfirmationToCustomer method and confirm that the Subject is set to “Order Confirmation”.

[Test]
public void SendOrderConfirmationToCustomer_WithAnOrderThatExists_ShouldSendAnEmailWithSubjectSetToOrderConfirmation()
{
    var orderService = new TestOrderService();

    orderService.SendOrderConfirmationToCustomer(1);

    Assert.That(orderService.SentMessage.Subject, Is.EqualTo("Order Confirmation"));
}

Conclusion

I have shown how to override a method to nullify it for testing purposes and also how to do sensing in an overridden method (recording the sent message). The Subclass and Override pattern allows me to test a class without changing anything and so is very low risk. The Extract and Override Call pattern requires some changes to the code but I would still classify it as low risk. It has the added bonus of improving the code slightly. It is also the easiest way to break up large methods and the first step towards extracting low-level code into separate classes.

I use both the Subclass and Override pattern and the Extract and Override Call pattern quite often while working with legacy code. They are really great, both for beginners and for programmers used to working with legacy code. I use these patterns less now as I tend to use mocking frameworks and other techniques to help test my code. But many mocking frameworks are basically a variation of the Subclass and Override pattern: they generate a subclass or proxy at runtime and override its virtual members.

But we still have one problem left with our SendOrderConfirmationToCustomer method: it fetches an order from the database. How can we test that in a simple way? See my next blog post about Dependency Injection for a description of how we can build on our test/fake classes to make legacy code testable.

The Legacy Code Lifecycle

If you have worked for more than a couple of years as a programmer then you’ve seen a legacy system. If you are lucky then you have only dabbled with them and not been drawn into the epicentre of a full-blown legacy code project. If you’ve been unlucky then your very first job was maintaining an old legacy system that made you question your decision to choose this career.

Inception of a Legacy System

So what do I mean when I say Legacy Code? There are lots of definitions: old code, somebody else’s code, bad code etc. But in the extreme cases nearly everyone recognizes legacy code. Have you ever made a change to a system that should have taken one day but took five? Or made a seemingly simple change that rippled through the system introducing a bug in a totally different module?

I think that the concept of legacy code is tightly coupled to how we as programmers write code. To illustrate this, let’s start with the lifecycle of a project gone wrong.

First Phase – The Land of Milk and Honey

A company decides to build a new system. The developers rejoice as they get to work on a shiny, new greenfield project. This is developer paradise. If the original team is reasonably competent then this phase is a joy to work in. It is easy to add new features and the customers are happy as their requests are fulfilled within a couple of days. The developers get to use the latest, shiny technology and enjoy the feeling of fast flowing development.

However, it is very possible to sabotage the whole system from the very start. An example is the prototype trap: first the team hacks together a prototype which the customer loves, then the team decides to build on the prototype instead of throwing it away. And with that decision made, they are well on their way to building a spaghetti code jungle.

Second Phase – Getting Fatter

Life is not too bad. The features are not flying out the door at the same speed anymore but the customer is reasonably happy. The code might be starting to get a bit clunky and the controllers/code-behind classes are not as skinny anymore. Class size is growing as the classes accumulate more methods and more lines per method. The bug count increases as new features involve changing existing code and not just adding new code. It is high time to set up bug tracking and a process for change requests.

Third Phase – A Bump in the Road

It is common for programmers to change jobs regularly and in most companies the key people will eventually leave for greener pastures. This means that the two or three developers that built the system and know every nook and cranny are gone. And so is their knowledge. New developers come in and do not manage to grasp the structure and logic of the system. Perhaps it is not even their main task and they are just helping out occasionally.

The era of quick fixes has begun. The new developers never went through the first phase and don’t remember when the code base was a thing of beauty. They see a chunky, inelegant code base and they want to get in and out as quickly as possible. Maybe the customer is putting the pressure on or they just want to get back to their other (greenfield) project.

Fourth Phase – The Big Project

Good news! The customer is delighted with the system. It is saving/earning them buckets of money. They have loads of ideas for new features and want them implemented pronto.

But this is where everything goes wrong. The current team of developers have allowed the code base to rot by applying quick fixes and not understanding the original vision for the system. There is not much structure left and new code has been dumped in inappropriate places. The team are trying to build on a foundation of sand and it is now that the bug-fix death march begins.

They try to estimate how long a new feature is going to take but this is extremely difficult because no one knows how big the ripple effect will be. So they triple their estimate to be on the safe side. The problem is that this feature will never be done. They might get to 90% done but that last 10% is unattainable. In an entangled code base most large changes introduce bugs. Fixing those bugs introduces new bugs. It is nearly impossible to complete this project without bugs and still meet a deadline.

This is where it turns into a death march. The team has a deadline to meet and just throws code at the bugs in a desperate attempt to get everything done on time. People work long hours and the code quality worsens all the time. In the end, the customer gets a bug-ridden Big Ball of Mud. The customer is not happy and crisis talks begin about the future of the project.

Fifth Phase – What do we do now?

At this stage, you have a burnt-out team (or maybe no team at all) and a system that is a maintenance nightmare. What happens now? Throw away the code (and the money invested) and rewrite it? Limp along for years and spend your days fixing all the bugs and taking an eternity to add new features? This system might only be a few years old and could already be thrown on the scrap heap.

A lot of us have seen and experienced projects like this and agree that this system qualifies as legacy code. But I favour Michael Feathers’ much tougher definition:

Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.

So yes, my example qualifies as legacy code but so does most of the code I’ve written during my career as a programmer. Code without tests inevitably rots and becomes harder to maintain. It might never get to the fifth phase but it will require a lot of care and attention to stay at the second phase and not get out of control.

In my experience, writing unit tests is the key to changing the lifecycle of a project, improving code quality and reducing the bug count. But how can you learn unit testing and TDD if you work with a legacy system? If you are stuck with a legacy system and want to make programming fun again then get started by reading my next article on TDD and Legacy Code.