Breaking Hidden Dependencies

This is part of my series on practising TDD while working with legacy code. The first problem most developers encounter when trying to get a class under test is hidden dependencies. It is usually what makes developers give up their TDD initiative and conclude that unit testing legacy code is next to impossible.

Seams and Dependencies

The main reason legacy code is hard to test is that it is hard to get at the hidden dependencies of a class. The class you are trying to change might depend on a database or file streams or an email server or on twenty modules with their own dependencies. The setup for a test trying to initialise all these dependencies might be 50 lines of code or involve setting up a new test database. To make unit testing viable, you have to break these dependencies and make the code under test more loosely coupled. I don’t want to log or save something to the database while testing a unit of code. I only want to test the logic in that unit. To achieve this I need to be able to create a fake database (or a fake logging class) that doesn’t interfere with my test.

Another reason to try and break dependencies is to use them for sensing. Sometimes the class under test affects other coupled classes but we have no way of seeing the result. If we can get at the dependent class then we can see what message the class under test tried to send.
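As a rough sketch of what sensing through a fake can look like, here is a fake logger that just remembers what it was told. The ILogger interface, the FakeLogger and the OrderTotals class are all made up for this illustration and are not from a real code base; I also assume the dependency can be handed in through the constructor, which legacy code often does not allow yet.

```csharp
using System.Collections.Generic;

// Hypothetical dependency, for illustration only.
public interface ILogger
{
    void Log(string message);
}

// Fake used only in tests: records messages instead of writing them anywhere.
public class FakeLogger : ILogger
{
    public List<string> Messages { get; } = new List<string>();

    public void Log(string message)
    {
        Messages.Add(message);
    }
}

// Hypothetical class under test that talks to its logger.
public class OrderTotals
{
    private readonly ILogger logger;

    public OrderTotals(ILogger logger)
    {
        this.logger = logger;
    }

    public int Total(int unitPrice, int quantity)
    {
        int total = unitPrice * quantity;
        logger.Log("Total calculated");
        return total;
    }
}
```

A test can now check the return value of Total and also look at FakeLogger.Messages to see what the class under test tried to send to its dependency.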

The first problem is getting at the dependency so that I can swap it out or disable it. It could be buried a few levels down in the code and be inaccessible to me when I new up the object I want to test. I need to find a seam: a point in the code where I can write tests or make a change to enable testing. I work mostly with C# (and JavaScript but this stuff is way easier there) so if I want to get a class under test with as few code changes as possible then I usually have three options.

The Easy Option

The first option is to write a test for the class and see if it works. If it doesn’t then I try passing null instead of a value for each of the constructor parameters. And if there are still dependencies causing problems then I can try setting them to null via a property (if the dependency is a public property). This works very occasionally but usually I have to do more work than this to get the class under test.

orderService = new OrderService(null, null, null, dependencyICareAbout);

The Test Without Changing Any Code Option

The second option is a pattern from Michael Feathers (again!) called Subclass and Override. This pattern uses inheritance to nullify the dependencies of a class that I don’t care about and allows me to get at dependencies I do care about. There are a lot of variations on this pattern so once you understand it then you can add a bunch of new tools to your refactoring toolbox.

To test a class I can create a new test class that inherits from it (a subclass) and then override methods to do something else. This test class is not part of the production code and is only for testing purposes. An example would be a class that has a dependency on the file system and that has a method that saves data to a file.

public void SaveOrder(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);
    GetOrderChanges();
    SaveOrderToFile(order);
}

public virtual void SaveOrderToFile(Order order)
{
    //File stream stuff
}

I could then override this method to do nothing and test the rest of the class without worrying about data being written to files. So in my example, I can write a test for the SaveOrder method (using my new TestOrderService class) to test the first two lines and not worry about data being saved to a file in the SaveOrderToFile method.

public class TestOrderService: OrderService
{
    public override void SaveOrderToFile(Order order)
    {
        //Do nothing
    }
}
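To show the whole thing working end to end, here is a cut-down sketch. It is not the real OrderService: I have assumed an in-memory dictionary in place of orderRepository, a one-line stand-in for GetOrderChanges, and a hypothetical LastProcessedOrder property so that there is something to check afterwards.

```csharp
using System.Collections.Generic;
using System.IO;

public class Order
{
    public int Id { get; set; }
    public bool HasChanges { get; set; }
}

// Simplified stand-in for OrderService (assumption: the dictionary replaces orderRepository).
public class OrderService
{
    private readonly Dictionary<int, Order> orders = new Dictionary<int, Order>
    {
        { 1, new Order { Id = 1 } }
    };

    // Hypothetical property, added only so the sketch has something to assert on.
    public Order LastProcessedOrder { get; private set; }

    public void SaveOrder(int orderId)
    {
        Order order = orders[orderId];
        order.HasChanges = true; // stands in for GetOrderChanges()
        LastProcessedOrder = order;
        SaveOrderToFile(order);
    }

    public virtual void SaveOrderToFile(Order order)
    {
        // Production behaviour: touches the real file system.
        File.WriteAllText("order-" + order.Id + ".txt", "order " + order.Id);
    }
}

public class TestOrderService : OrderService
{
    public override void SaveOrderToFile(Order order)
    {
        // Do nothing
    }
}

public static class SaveOrderExample
{
    // Runs SaveOrder through the test subclass, so the file-writing method is skipped.
    // Returns whether the order was flagged as changed.
    public static bool Run()
    {
        var orderService = new TestOrderService();
        orderService.SaveOrder(1);
        return orderService.LastProcessedOrder.HasChanges;
    }
}
```

In an NUnit test the body of Run would be the arrange and act steps, followed by an Assert.That on the order that was processed.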

A subtle variant of this pattern is changing the access modifier of a method from private to protected so that you can inherit the method in a test class and get access to it. In C# you will also have to make the method virtual to be able to override it.
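A small sketch of this variant, using a made-up ReportService class (the class and method names are assumptions for illustration): the method used to be private, and the test subclass adds a public wrapper so that a test can call it directly.

```csharp
public class ReportService
{
    public string CreateReport(int orderId)
    {
        return "Report: " + FormatReference(orderId);
    }

    // Was private. Protected lets a subclass call it, virtual lets a subclass replace it.
    protected virtual string FormatReference(int orderId)
    {
        return "ORD-" + orderId.ToString("D5");
    }
}

// Lives in the test project only.
public class TestReportService : ReportService
{
    // Public wrapper so a test can reach the protected method.
    public string CallFormatReference(int orderId)
    {
        return FormatReference(orderId);
    }
}
```

A test can then call new TestReportService().CallFormatReference(42) and check the result without going through CreateReport.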

The Subclass and Override pattern can also be used to inject a fake class instead of a dependency or to get at other internals to be able to sense if our test succeeded or failed. It is a very powerful pattern especially if you use one of the variations of it called Extract and Override (Call, Factory Method etc.).
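Here is a minimal sketch of the Factory Method variation, under some assumptions: PaymentService, IAuditLog and the other names are invented for this example, and I create the dependency lazily on first use so that the overridable method is not called from a constructor.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical dependency, for illustration only.
public interface IAuditLog
{
    void Write(string entry);
}

public class FileAuditLog : IAuditLog
{
    public void Write(string entry)
    {
        File.AppendAllText("audit.log", entry + Environment.NewLine);
    }
}

public class PaymentService
{
    private IAuditLog auditLog;

    public void Pay(int orderId)
    {
        // Before the refactoring this read: auditLog = new FileAuditLog();
        if (auditLog == null)
        {
            auditLog = CreateAuditLog();
        }
        auditLog.Write("Paid order " + orderId);
    }

    // Extracted factory method: the seam a test subclass can override.
    protected virtual IAuditLog CreateAuditLog()
    {
        return new FileAuditLog();
    }
}

// Fake that records entries so a test can sense what was written.
public class RecordingAuditLog : IAuditLog
{
    public List<string> Entries { get; } = new List<string>();

    public void Write(string entry)
    {
        Entries.Add(entry);
    }
}

public class TestPaymentService : PaymentService
{
    public RecordingAuditLog Log { get; } = new RecordingAuditLog();

    protected override IAuditLog CreateAuditLog()
    {
        return Log;
    }
}
```

After calling Pay(7) on a TestPaymentService, Log.Entries holds the single entry "Paid order 7" and no file has been touched.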

The Making Code Testable First Option

The third option is to make the code testable by making some changes to the code that open up the class for testing but do not change any logic. This sounds risky and it is, so to avoid any unpleasant side-effects I try to make any changes as small and safe as possible. Using a tool like ReSharper and refactorings like Extract Method (from Martin Fowler’s classic book, Refactoring) is how I usually accomplish this. There are lots of different techniques for making your code more testable but I will start us off by showing how to use the Extract and Override Call pattern (from Michael Feathers once again).

Here is a typical hard to test method. I need to write a test to check that the email subject and body are generated correctly but if I call this method it will try to send an email (or throw an exception if I don’t have an email server).

public void SendOrderConfirmationToCustomer(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);

    var email = new MailMessage(DefaultSender, order.CustomerEmailAddress)
                    {
                        Subject = "Order Confirmation",
                        Body = BuildOrderConfirmationMessage()
                    };
    smtpClient.Send(email);
}

The first step is to extract the last line into a new SendEmail method and then make that method protected and virtual.

public void SendOrderConfirmationToCustomer(int orderId)
{
    Order order = orderRepository.GetOrderById(orderId);

    var email = new MailMessage(DefaultSender, order.CustomerEmailAddress)
                    {
                        Subject = "Order Confirmation",
                        Body = BuildOrderConfirmationMessage()
                    };
    SendEmail(email);
}

protected virtual void SendEmail(MailMessage email)
{
    smtpClient.Send(email);
}

The second step is to introduce a new testing subclass that records the message to be sent by saving it in the SentMessage property. This property is for testing purposes only.

public class TestOrderService: OrderService
{
    public MailMessage SentMessage { get; set; }

    protected override void SendEmail(MailMessage email)
    {
        SentMessage = email;
    }
}

Now I can write a unit test for the SendOrderConfirmationToCustomer method and confirm that the Subject is set to “Order Confirmation”.

[Test]
public void SendOrderConfirmationToCustomer_WithAnOrderThatExists_ShouldSendAnEmailWithSubjectSetToOrderConfirmation()
{
    var orderService = new TestOrderService();

    orderService.SendOrderConfirmationToCustomer(1);

    Assert.That(orderService.SentMessage.Subject, Is.EqualTo("Order Confirmation"));
}

Conclusion

I have shown how to override a method to nullify it for testing purposes and also how to do sensing in an overridden method (recording the sent message). The Subclass and Override pattern allows me to test a class with little or no change to the production code (at most making a method virtual or relaxing its access modifier) and so is very low risk. The Extract and Override Call pattern requires some changes to the code but I would still classify it as low risk. It has the added bonus of improving the code slightly and is the easiest way to break up large methods and the first step to extracting low level code into separate classes.

I use both the Subclass and Override pattern and the Extract and Override Call pattern quite often while working with legacy code. They are really great, both for beginners and for programmers used to working with legacy code. I use these patterns less now as I tend to use mocking frameworks and other techniques to help test my code. But many mocking frameworks work by generating a subclass or proxy at runtime, so they are basically a variation of the Subclass and Override pattern.

But we still have one problem left with our SendOrderConfirmationToCustomer method: it fetches an order from the database. How can we test that in a simple way? See my next blog post about Dependency Injection for a description of how we can build on our test/fake classes to make legacy code testable.

The Legacy Code Lifecycle

If you have worked for more than a couple of years as a programmer then you’ve seen a legacy system. If you are lucky then you have only dabbled with them and not been drawn into the epicentre of a full-blown legacy code project. If you’ve been unlucky then your very first job was maintaining an old legacy system that made you question your decision to choose this career.

Inception of a Legacy System

So what do I mean when I say Legacy Code? There are lots of definitions: old code, somebody else’s code, bad code etc. But in the extreme cases nearly everyone recognizes legacy code. Have you ever made a change to a system that should have taken one day but took five? Or made a seemingly simple change that rippled through the system introducing a bug in a totally different module?

I think that the concept of legacy code is tightly coupled to how we as programmers write code. To illustrate this, let’s start with the lifecycle of a project gone wrong.

First Phase – The Land of Milk and Honey

A company decides to build a new system. The developers rejoice as they get to work on a shiny, new greenfield project. This is developer paradise. If the original team is reasonably competent then this phase is a joy to work in. It is easy to add new features and the customers are happy as their requests are fulfilled within a couple of days. The developers get to use the latest, shiny technology and enjoy the feeling of fast flowing development.

However, it is very possible to sabotage the whole system from the very start. An example is the prototype trap: first the team hacks together a prototype which the customer loves, then the team decides to build on the prototype instead of throwing it away. And with that decision made, they are well on their way to building a spaghetti code jungle.

Second Phase – Getting Fatter

Life is not too bad. The features are not flying out the door at the same speed anymore but the customer is reasonably happy. The code might be starting to get a bit clunky and the controllers/code-behind classes are not as skinny anymore. Class size is growing as the classes accumulate more methods and more lines per method. The bug count increases as new features involve changing existing code and not just adding new code. It is high time to set up bug tracking and a process for change requests.

Third Phase – A Bump in the Road

It is common for programmers to change jobs regularly and in most companies the key people will eventually leave for greener pastures. This means that the two or three developers that built the system and know every nook and cranny are gone. And so is their knowledge. New developers come in and do not manage to grasp the structure and logic of the system. Perhaps it is not even their main task and they are just helping out occasionally.

The era of quick fixes has begun. The new developers never went through the first phase and don’t remember when the code base was a thing of beauty. They see a chunky, inelegant code base and they want to get in and out as quickly as possible. Maybe the customer is putting the pressure on or they just want to get back to their other (greenfield) project.

Fourth Phase – The Big Project

Good news! The customer is delighted with the system. It is saving/earning them buckets of money. They have loads of ideas for new features and want them implemented pronto.

But this is where everything goes wrong. The current team of developers have allowed the code base to rot by applying quick fixes and not understanding the original vision for the system. There is not much structure left and new code has been dumped in inappropriate places. The team are trying to build on a foundation of sand and it is now that the bug-fix death march begins.

They try to estimate how long a new feature is going to take but this is extremely difficult due to no-one knowing how big the ripple effect will be. So they triple their estimate to be on the safe side. The problem is this feature will never be done. They might get to 90% done but that last 10% is unattainable. In an entangled code base most large changes introduce bugs. Fixing those bugs introduces new bugs. It is near impossible to complete this project without bugs and meet a deadline.

This is where it turns into a death march. The team has a deadline to meet and just throws code at the bugs in a desperate attempt to get everything done on time. People work long hours, the code quality worsens all the time. In the end, the customer gets a bug-ridden Big Ball of Mud. The customer is not happy and crisis talks begin about the future of the project.

Fifth Phase – What do we do now?

At this stage, you have a burnt out team (or maybe no team at all) and a system that is a maintenance nightmare. What happens now? Throw away the code (and the money invested) and rewrite it? Limp along for years and spend your days fixing all the bugs and taking an eternity to add new features? This system might only be a few years old and could already be thrown on the scrap heap.

A lot of us have seen and experienced projects like this and agree that this system qualifies as legacy code. But I favour Michael Feathers’ much tougher definition:

Code without tests is bad code. It doesn’t matter how well written it is; it doesn’t matter how pretty or object-oriented or well-encapsulated it is. With tests, we can change the behavior of our code quickly and verifiably. Without them, we really don’t know if our code is getting better or worse.

So yes, my example qualifies as legacy code but so does most of the code I’ve written during my career as a programmer. Code without tests inevitably rots and becomes harder to maintain. It might never get to the fifth phase but will require a lot of care and attention to stay at the second phase and not get out of control.

In my experience, writing unit tests is the key to changing the lifecycle of a project and to start improving code quality and reducing the bug count. But how can you learn unit testing and TDD if you work with a legacy system? If you are stuck with a legacy system and want to make programming fun again then get started by reading my next article on TDD and Legacy Code.

Review of The Art of Unit Testing

The Art of Unit Testing With Examples in .NET by Roy Osherove, published by Manning. ISBN: 1933988274

Introduction

The Art of Unit Testing aims to teach developers how to write maintainable, readable and trustworthy unit tests. The author Roy Osherove has recently moved over to the Ruby community but is well-known in the .NET world for his TDD courses and TDD katas. He previously worked at Typemock as Chief Architect so he lived and breathed unit tests for a few years.

This book is not what I had expected after reading all the online reviews. It is most definitely a beginner’s book, a book for someone just starting out with TDD. The first two books I read when getting started with TDD were Working Effectively with Legacy Code by Michael Feathers and Growing Object-Oriented Software, Guided by Tests by Steve Freeman and Nat Pryce. The Art of Unit Testing refers heavily to them and I would describe it as an introduction to unit testing. I noticed that The Art of Unit Testing placed above them on Jurgen Appelo’s Top 100 Agile Books list and I do not understand that to be honest. It would have been the perfect text for me during the first month of beginning TDD but feels mostly redundant after reading the two aforementioned books.

The Meat of the Book

The Art of Unit Testing is well-written and easy to read; I flew through it in a few sessions. The first three chapters are a very basic introduction to unit testing. It gets a bit more interesting after that with nice, clear explanations of stubbing and mocking in chapters 4 to 6. Chapter 6 also contains a section on creating a test API that I really liked. It is the section of the book that I am most likely to reread later. Chapter 7 goes a bit deeper into writing tests and is the most interesting part of the book. It manages to gather together most of the good tips on unit testing that I have seen before e.g. enforcing test isolation, practising DRY even in your tests, keeping setup code simple, avoiding overspecification. Chapter 8 is non-technical and is about getting your organisation to start unit testing. The rest of the book lists the various unit testing tools and frameworks available as well as the classic TDD books that you should read.

The parts that could be better

The list of tools has not dated too well. The book was written in 2009 and a lot has changed in two years. Mocking frameworks like NSubstitute and FakeItEasy are not included and Rhino Mocks gets a lot of attention. The newer BDD tools like SpecFlow are also not included. Those sections will be near worthless in a couple of years’ time.

The last chapter on working with legacy code is very skimpy for a subject that is both huge and difficult. And generally, a lot more could have been written about unit testing. Nothing about Object Mothers or Test Builders (check out this article from Los Techies or read Growing Object-Oriented Software, Guided by Tests). Not much on GUI testing with Selenium or integration testing the database or the BDD-style of testing. When I write code I do not really separate writing production code and writing unit tests; the two go hand in hand. The book treats unit testing on its own, and this makes it less valuable than a good book on TDD even if I can understand why the author chose to do so. There are a lot of .NET developers out there who know nothing about unit testing or TDD and something like GOOS would scare off most of them.

The formatting of the code samples is dreadful; the alignment is off in almost half of them. You can still read and understand them but it feels low quality. I am reading another Manning book at the moment and its formatting is fine so I cannot understand how that got into the final version.

And the verdict is…

I would recommend The Art of Unit Testing for anyone getting started with writing unit tests but not if you have been doing TDD for a while and have already read some of the recommended texts on TDD. It’s a nice, smooth read and very easy to absorb but it is only about unit testing and not TDD (a minus for me) and there is almost nothing about writing testable code, a huge part of testing in my opinion.