Tuesday 26 July 2011

SRP: Simplicity and Complexity

Simplicity does not precede complexity, but follows it. - Alan Perlis

The Single Responsibility Principle (SRP) - one of the SOLID principles - is definitely one of my favourite principles in Object-Oriented Design. I find its simplicity and, at the same time, its complexity fascinating. I have known this principle for years and wrote about it for the first time over a year ago. Still, I can't stop thinking about how hard it is to apply. Speaking to and pairing with many developers over the years, I've noticed that every single one of them understands what the principle is about, but when it comes to writing code, either they forget it or they simply can't find a way to adhere to it. I'll give you my view of the problem and what I've seen happening on many occasions.

The outside view

 Every class and method should have just one responsibility

This is what I call outside view and naming plays a big role here.

This view is generally used when we need to add some totally new code. We think about what the new code is about, create a class and/or a method, give it an appropriate name that represents what the new code does, and we are done. We also use this outside view when reading code. When looking at a class or method name, it should be clear to us what it does.

I don't think anyone ever had a problem understanding this principle.

However, if it is so simple to understand, why don't we all follow it? Why do we still find classes and methods with hundreds, if not thousands, of lines? A few things come to mind to explain that. Maybe we are just bad at naming classes and methods. Maybe the names we give are too generic or too broad, allowing us to write loads of code in there. Maybe some of us just don't care.

The inside view

Classes and methods should have one, and only one, reason to change

This is what I call inside view and it is the other (often forgotten) side of the Single Responsibility Principle.

This view is generally used when we need to change some existing code or add more code to it. That's when we look inside each class and method to see exactly what they do. More often than not, we get frustrated because they do much more than they should do and we need to go through many lines of frustration and WTFs to understand the bloody thing.

However, it is much easier to adhere to the SRP when we keep the inside view in mind instead of the outside view. That means always thinking about the number of reasons to change a class or method instead of thinking purely about their responsibilities.
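To make that concrete, here is a tiny, made-up example (the class and the names are mine, purely for illustration). "Produce a tax report" sounds like one responsibility, but the class below has two reasons to change - a change in the tax rules and a change in the report layout:

public class TaxReport {

    // Reason to change #1: the tax rules change.
    public double calculateTax(double orderTotal) {
        return orderTotal * 0.2;
    }

    // Reason to change #2: the report layout changes.
    public String formatAsHtml(double tax) {
        return "<p>Tax due: " + tax + "</p>";
    }
}

Splitting it into something like a TaxCalculator and a TaxReportFormatter leaves each class with a single reason to change.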

So, when and where do things go wrong?

NOTE: In general I find more SRP violations in systems where developers are not doing TDD and refactoring.

Here is some food for thought. I would risk saying that the majority of SRP violations are found in the classes and methods that are closer to the interface of the system. For example, in a web application, it's very common to find large controllers, actions or backing bean classes. For Swing applications, we tend to find large event handlers. For systems where the entry point is a queue, we tend to find large methods responsible for processing the messages.

In general, this happens because the classes and methods closer to the interface of the system have a more generic and broader responsibility. Quite often, these methods are responsible for triggering loads of business rules and/or complex workflows.

Imagine that the input to our system is a huge XML file representing a trade. Also imagine that the first method invoked to handle it is called "processTrade(tradeXML)", located in a TradeService class. What is the responsibility of this method? It is to process a trade, right? Should it have a different name? Our system is expected to "process" all trades received, so it is fair to say that the first method handling the request (or reading from a queue) should be called processTrade.

In another example, imagine that a user added a few items to her shopping basket (web application), provided some payment details and clicked on the "place order" button. In our back end, we will have a method handling this request that will probably be called something like placeOrder(order). Fair, right?

Developing an idea

In general, the closer the code is to the system's interface, the broader and more generic its responsibility will be. The further away the code is from the system's interface, the narrower and more specific its responsibility will be. 

In both examples above, by their names, you can argue that the methods processTrade and placeOrder have a single responsibility. One processes the incoming trade and the other places a customer order. So, when developers take into account just the outside view of the SRP, they feel comfortable adding as much code as they need to satisfy these responsibilities.

The problem is that processing a trade or placing an order may be extremely complicated tasks, where a huge workflow is triggered, many business rules and requirements need to be satisfied, and we will need to write hundreds, if not thousands, of lines for that. So clearly, adding all these lines to a single method doesn't just violate the SRP. It's also plain stupid.

So, in order to make our code compliant with the SRP, we need to have just a single reason to change it. This leads to a complementary idea.

In general, the closer the class is to the system's interface, the more delegation the class will do. The further away the class is from the system's interface, the less delegation the class will do.
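As a rough sketch of what that looks like for the trade example (the collaborators here - TradeParser, TradeValidator, TradeRepository - are hypothetical names, not code from a real system), processTrade keeps its broad responsibility but delegates every step, so its only reason to change is a change in the overall workflow:

public class TradeService {

    private final TradeParser parser;
    private final TradeValidator validator;
    private final TradeRepository repository;

    public TradeService(TradeParser parser, TradeValidator validator, TradeRepository repository) {
        this.parser = parser;
        this.validator = validator;
        this.repository = repository;
    }

    public void processTrade(String tradeXml) {
        // Broad responsibility, but every step is delegated: the only reason
        // for this method to change is a change in the workflow itself.
        Trade trade = parser.parse(tradeXml);
        validator.validate(trade);
        repository.save(trade);
    }
}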

A general example: in a traditional Java web application, the controllers, which are closest to the UI, have a broad responsibility and tend to delegate all the business logic to other objects. They just control the flow. At the other end, we have the DAOs, which have a very specific and narrow responsibility, almost never delegating anything to another class. In the middle, we have services, which do some business logic of their own but also delegate work to other collaborating classes. Services tend to have a narrower responsibility when compared to a controller, but a broader one when compared to DAOs. Still, each class and method has a single responsibility.
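Here is a minimal sketch of that gradient. The names (OrderController, OrderService, PaymentGateway, OrderDAO) are illustrative assumptions, not code from a real project:

// Closest to the UI: broad responsibility, pure delegation - it just controls the flow.
public class OrderController {

    private final OrderService orderService;

    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    public String placeOrder(Order order) {
        orderService.placeOrder(order);
        return "orderConfirmation";
    }
}

// In the middle: some business logic of its own, plus delegation to collaborators.
public class OrderService {

    private final PaymentGateway paymentGateway;
    private final OrderDAO orderDAO;

    public OrderService(PaymentGateway paymentGateway, OrderDAO orderDAO) {
        this.paymentGateway = paymentGateway;
        this.orderDAO = orderDAO;
    }

    public void placeOrder(Order order) {
        order.markAsPlaced();
        paymentGateway.charge(order);
        orderDAO.save(order);
    }
}

// Furthest from the UI: narrow, specific responsibility, no delegation.
public class OrderDAO {

    public void save(Order order) {
        // talks directly to the database
    }
}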

Asking a different question

When mentoring and/or pair-programming with other developers, quite often we end up discussing the amount of code in a method or the number of methods in a class. Just using the SRP's outside view as an argument is sometimes not enough to convince some developers that the code is doing too much. That's why I find the inside view more useful. Instead of asking how many responsibilities a method or class has, I now ask how many reasons we would have to change them.

Monday 18 July 2011

Testing legacy: Hard-wired dependencies (part 2)

In part one, I showed how to unit test a method that uses a Singleton and makes static calls. So now, let's have a look at common code problems we find in legacy code, using the same example:
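Here is, roughly, where we left the code at the end of part one - the original method plus the two protected seams we extracted:

public class TripService {

    public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
        List<Trip> tripList = new ArrayList<Trip>();
        User loggedUser = loggedUser();
        boolean isFriend = false;
        if (loggedUser != null) {
            for (User friend : user.getFriends()) {
                if (friend.equals(loggedUser)) {
                    isFriend = true;
                    break;
                }
            }
            if (isFriend) {
                tripList = findTripsByUser(user);
            }
            return tripList;
        } else {
            throw new UserNotLoggedInException();
        }
    }

    protected List<Trip> findTripsByUser(User user) {
        return TripDAO.findTripsByUser(user);
    }

    protected User loggedUser() {
        return UserSession.getInstance().getLoggedUser();
    }
}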


How many problems can you see? Take your time before reading the ones I found... :-)

Refactoring
NOTE: When I did this refactoring, I did it step by step, running the tests after every step. Here I'll just summarise my decisions.

The first thing I noticed is that the tripList variable does not need to be created when the logged user is null, since an exception is thrown and nothing else happens. I decided to invert the outer if and extract a guard clause:
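Roughly, that leaves the method looking like this (validateLoggedUser is the name I chose for the extracted guard clause):

public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
    validateLoggedUser();

    User loggedUser = loggedUser();
    List<Trip> tripList = new ArrayList<Trip>();
    boolean isFriend = false;
    for (User friend : user.getFriends()) {
        if (friend.equals(loggedUser)) {
            isFriend = true;
            break;
        }
    }
    if (isFriend) {
        tripList = findTripsByUser(user);
    }
    return tripList;
}

private void validateLoggedUser() throws UserNotLoggedInException {
    if (loggedUser() == null) {
        throw new UserNotLoggedInException();
    }
}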



Feature Envy

When a class gets data from another class in order to do some calculation or comparison on that data, quite often it means that the client class envies the other class. This is called Feature Envy (a code smell). It is a very common occurrence in long methods and is everywhere in legacy code. In OO, data and the operations on that data should live in the same object.

So, looking at the code above, it's clear that determining whether a user is friends with another doesn't belong in the TripService class. Let's move it to the User class. First, the unit test:
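It could look something like this (isFriendsWith and addFriend are the names I'm assuming for the new behaviour on User):

public class UserTest {

    @Test public void
    shouldTellWhenUsersAreNotFriends() {
        User user = new User();
        User stranger = new User();

        assertThat(user.isFriendsWith(stranger), is(false));
    }

    @Test public void
    shouldTellWhenUsersAreFriends() {
        User user = new User();
        User friend = new User();
        user.addFriend(friend);

        assertThat(user.isFriendsWith(friend), is(true));
    }
}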


Now, let's move the code to the User class. Here we can use the Java collections API a bit better and remove the whole for loop and the isFriend flag altogether.
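Using contains, the User class ends up with something like this (the trips() method is assumed from the tests in part one):

public class User {

    private List<User> friends = new ArrayList<User>();
    private List<Trip> trips = new ArrayList<Trip>();

    public void addFriend(User friend) {
        friends.add(friend);
    }

    public List<User> getFriends() {
        return friends;
    }

    public boolean isFriendsWith(User anotherUser) {
        return friends.contains(anotherUser);
    }

    public List<Trip> trips() {
        return trips;
    }
}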


After a few refactoring steps, here is the new code in the TripService:
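In my reconstruction, it looks something like this:

public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
    validateLoggedUser();

    List<Trip> trips = new ArrayList<Trip>();
    if (user.isFriendsWith(loggedUser())) {
        trips = findTripsByUser(user);
    }
    return trips;
}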


Right. This is already much better but it is still not good enough.

Layers and dependencies

Some of you may still be annoyed by the protected methods we created in part one in order to isolate dependencies and test the class. Changes like that are meant to be temporary, that is, they are done so we can unit test the whole method. Once we have tests covering the method, we can start doing our refactoring and thinking about the dependencies we could inject.

Many times we would think that we should just inject the dependency into the class. That sounds obvious. TripService should receive an instance of UserSession. Really?

TripService is a service. That means it dwells in the service layer. UserSession knows about logged users and sessions. It probably talks to the MVC framework and/or HttpSession, etc. Should TripService be dependent on this class (even if it were an interface instead of a Singleton)? Probably the whole check of whether the user is logged in should be done by the controller, or whatever the client class may be. In order NOT to change that much (for now), I'll make the TripService receive the logged user as a parameter and remove the dependency on the UserSession completely. I'll need to do some minor changes and clean-up in the tests as well.

Naming

No, unfortunately we are not done yet. What does this code do anyway? It returns trips from a friend. Looking at the name of the method and parameters, or even the class name, there is no way to know that. The word "friend" is nowhere to be seen in the TripService's public interface. We need to change that as well.

So here is the final code:
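This is my reconstruction of it; names such as getFriendTrips and findTripsByFriend are assumptions on my part, so check the repository linked below for the exact version:

public class TripService {

    public List<Trip> getFriendTrips(User loggedUser, User friend) throws UserNotLoggedInException {
        validateLoggedUser(loggedUser);

        List<Trip> trips = new ArrayList<Trip>();
        if (friend.isFriendsWith(loggedUser)) {
            trips = findTripsByFriend(friend);
        }
        return trips;
    }

    private void validateLoggedUser(User loggedUser) throws UserNotLoggedInException {
        if (loggedUser == null) {
            throw new UserNotLoggedInException();
        }
    }

    protected List<Trip> findTripsByFriend(User friend) {
        return TripDAO.findTripsByUser(friend);
    }
}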


Better, isn't it? We still have the issue with the other protected method and the TripDAO static call, but I'll leave this last bit for another post on how to remove dependencies on static methods. I'll park my refactoring for now. We can't refactor the entire system in one day, right? We still need to deliver some features. :-)

Conclusion

This was just a toy example and may not even make sense. However, it represents many of the problems we find when working with legacy (existing) code. It's amazing how many problems we can find in such a tiny piece of code. Now imagine all those classes and methods with hundreds, if not thousands of lines.

We need to keep refactoring our code mercilessly so we never get to a position where we don't understand it any more and the whole business starts slowing down because we cannot adjust the software quickly enough.

Refactoring is not just about extracting methods or making a few tweaks in the logic. We need to think about the dependencies, the responsibilities that each class and method should have, the architectural layers, the design of our application and also the names we give to every class, method, parameter and variable. We should try to have the business domain expressed in the code.

We should treat our code base as if it were a big garden. If we want it to be pleasant and maintainable, we need to be constantly looking after it.

If you haven't read it yet, check part one of this post. If you want to give this code a go or find more details about the implementation, check: https://github.com/sandromancuso/testing_legacy_code

Sunday 17 July 2011

Testing legacy: Hard-wired dependencies (part 1)

When pairing with some developers, I've noticed that one of the reasons they are not unit testing existing code is because, quite often, they don't know how to overcome certain problems. The most common one is related to hard-wired dependencies - Singletons and static calls.

Let's look at this piece of code:

public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
    List<Trip> tripList = new ArrayList<Trip>();
    User loggedUser = UserSession.getInstance().getLoggedUser();
    boolean isFriend = false;
    if (loggedUser != null) {
        for (User friend : user.getFriends()) {
            if (friend.equals(loggedUser)) {
                isFriend = true;
                break;
            }
        }
        if (isFriend) {
            tripList = TripDAO.findTripsByUser(user);
        }
        return tripList;
    } else {
        throw new UserNotLoggedInException();
    }
}

Horrendous, isn't it? The code above has loads of problems, but before we change it, we need to have it covered by tests.

There are two challenges when unit testing the method above. They are:

User loggedUser = UserSession.getInstance().getLoggedUser(); // Line 3  
   
tripList = TripDAO.findTripsByUser(user);                    // Line 13

As we know, unit tests should test just one class and not its dependencies. That means we need to find a way to mock the Singleton and the static call. In general we do that by injecting the dependencies, but we have a rule, remember?

We can't change any existing code if not covered by tests. The only exception is if we need to change the code to add unit tests, but in this case, just automated refactorings (via IDE) are allowed.

Besides that, many mocking frameworks are not able to mock static methods anyway, so injecting the TripDAO would not solve the problem.

Overcoming the hard-dependencies problem

NOTE: In real life I would be writing tests first and making each change only when I needed it, but in order to keep the post short and focused I will not go step by step here.

First of all, let's isolate the Singleton dependency in its own method, and let's make it protected as well. But wait, this needs to be done via an automated "extract method" refactoring. Select just the following piece of code in TripService.java:

UserSession.getInstance().getLoggedUser()

Go to your IDE's refactoring menu, choose "extract method" and give it a name. After this step, the code will look like this:

public class TripService {

    public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
        ...
        User loggedUser = loggedUser();
        ...
    }

    protected User loggedUser() {
        return UserSession.getInstance().getLoggedUser();
    }
}

Doing the same thing for TripDAO.findTripsByUser(user), we will have:

public List<Trip> getTripsByUser(User user) throws UserNotLoggedInException {
    ...
    User loggedUser = loggedUser();
    ...
        if (isFriend) {
            tripList = findTripsByUser(user);
        }
    ...
}  
 
protected List<Trip> findTripsByUser(User user) {
    return TripDAO.findTripsByUser(user);
} 
 
protected User loggedUser() {
    return UserSession.getInstance().getLoggedUser();
}

In our test class, we can now extend the TripService class and override the protected methods we created, making them return whatever we need for our unit tests:

private TripService createTripService() {
    return new TripService() {
        @Override protected User loggedUser() {
            return loggedUser;
        }
        @Override protected List<Trip> findTripsByUser(User user) {
            return user.trips();
        }
    };
}

And this is it. Our TripService is now testable.

First we write all the tests we need to make sure the class/method is fully tested and all code branches are exercised. I use Eclipse's EclEmma plugin for that and I strongly recommend it. If you are not using Java and/or Eclipse, try to use a code coverage tool specific to your language/IDE while writing tests for existing code. It helps a lot.

So here is my final test class:

public class TripServiceTest {
        
    private static final User UNUSED_USER = null;
    private static final User NON_LOGGED_USER = null;
    private User loggedUser = new User();
    private User targetUser = new User();
    private TripService tripService;

    @Before
    public void initialise() {
        tripService  = createTripService();
    } 
        
    @Test(expected=UserNotLoggedInException.class) public void 
    shouldThrowExceptionWhenUserIsNotLoggedIn() throws Exception {
        loggedUser = NON_LOGGED_USER;
                 
        tripService.getTripsByUser(UNUSED_USER);
    }
        
    @Test public void 
    shouldNotReturnTripsWhenLoggedUserIsNotAFriend() throws Exception {             
        List<Trip> trips = tripService.getTripsByUser(targetUser);
                 
        assertThat(trips.size(), is(equalTo(0)));
    }
        
    @Test public void 
    shouldReturnTripsWhenLoggedUserIsAFriend() throws Exception {
        User john = anUser().friendsWith(loggedUser)
                            .withTrips(new Trip(), new Trip())
                            .build();
                 
        List<Trip> trips = tripService.getTripsByUser(john);
                 
        assertThat(trips, is(equalTo(john.trips())));
    }

    private TripService createTripService() {
        return new TripService() {
            @Override protected User loggedUser() {
                return loggedUser;
            }
            @Override protected List<Trip> findTripsByUser(User user) {
                return user.trips();
            }
        };
    }        
}

Are we done?

Of course not. We still need to refactor the TripService class. Check part two of this post.

If you want to give it a go, here is the full code: https://github.com/sandromancuso/testing_legacy_code

Sunday 3 July 2011

Working with legacy code

Context

Large organisations' systems may have from tens of thousands to a few million lines of code, and a good part of those lines is legacy code. By legacy code I mean code without tests. Many of these systems started being written many years ago, before the existence of the cool things and frameworks we take for granted today. Due to how some systems are configured (database, properties files, proprietary XML), we cannot simply change a class constructor or method signature to pass in the dependencies without undertaking a much larger refactoring. Changing one piece of code can break completely unknown parts of the system. Developers are not comfortable making changes in certain areas of the system. Test-first development and unit testing are not widely used by developers.

The Rule

In our commitment to make the existing systems better, more reliable and easier to deal with (changing and adding more features), we established the following rule: we cannot change existing code if it is not covered by tests. In case we need to change the existing code to be able to write the tests, we should do it only using the automated refactoring tools provided by our IDEs. Eclipse and IntelliJ are great for that, if you are a Java developer like me.

That means, before we make any change, we need to have the current code under test, guaranteeing that we don't break its current behaviour. Once the existing code is under test, we then write a new test for the new feature (or change an existing test if it's a change in existing behaviour) and finally we can change the code.

Problem

There is an eternal discussion on forums, mailing lists and user groups about TDD being responsible for increasing or decreasing a team's velocity. I don't want to start this discussion here because we are not doing TDD for the majority of the time. With legacy code, we spend the majority of our time trying to test existing code before we can do TDD to satisfy the new requirement. And this is slow. Very slow. In our experience, it can be somewhere between 5 and 20 times slower than just making the change we need and manually testing it.

You may be asking, why does it take so much longer? My answer is: if you haven't, try unit testing all the possible execution paths in a 2000+ line class. Have you tried to do it with a 1000-line method, full of multiple return statements, hard-wired dependencies, loads of duplication, a big chain of nested IFs and loops, etc.? Believe me, it will take you a bloody long time. Remember, you can't change the existing code before writing tests. Just automated refactorings. Safety first. :-)

Why are we "under-performing"?

Quite often, after we start following this rule, management will enquire why the team is under-performing. Although it is a fact that the team is taking longer to implement a feature, this is also a very naive view of the problem. When management and development teams talk about a team's velocity, they just take into consideration the amount of time taken to write the code related to a few specific requirements.

Over time, management and teams just forget how much time they spend on things that are caused by the lack of quality and tests in their code. They never take into consideration that there is absolutely no way that developers would be able to manually test the entire system and guarantee they didn't break anything. That would take days, for every piece of code they write or change. So they leave that to QA.

By not increasing the test coverage, the team will never have a regression test suite. Manually testing the system, as it grows in size and complexity, will take the QA team longer and longer. It will also be more error prone, since testing scripts need to be kept in sync with the current behaviour of the system. This is also a waste of time.

Also, just hacking code all the time will make the system more fragile and way more complicated to understand, which will make developers take longer to hack more code in.

Because we can't press a button and just unit test a specific piece of code, the team is forced to spend a lot of time debugging the application, meaning that the team needs to spend time compiling and deploying it. In certain systems, in order to test a small piece of code, you need to rely on other systems to trigger something so your system can respond. Or you need to manually send messages to a queue so your piece of code is invoked. Sometimes, depending on the complexity of your system and its dependencies, you can't even run it locally, meaning that you need to copy your application to another environment (smoke, UAT, or whatever name you use in your company) in order to test the small piece of code you wrote or changed. And when you finally manage to do all that to execute that line of code you just added, you realise that the code is wrong. Sigh. Let's start over.

Since I started practising TDD, I realised that I very rarely use my debugger and very rarely have to deploy and run my application in order to test it.

So basically, when people think we are spending too much time to write a feature because we were writing tests for the existing code first, they are rarely considering the time spent elsewhere. I've seen it many times: "We don't have time to do it now. Let's just do a quick fix." And then, contradicting the laws of physics, more time is created when bugs are found and the QA phase needs to be extended. Panic takes over: "Oh, we need to abort the release. We can't go live like that." Exactly.




The graphic above is not based on any formal research; it is purely based on my own experience during my career. I also believe that the decrease in the team's velocity is related to the state of the untested code. There is untested code out there that is very well written, and writing tests for it is quite straightforward. However, in my experience, that is the exception.

In summary, if you constantly apply the "Boy Scout Rule" and always keep improving the code you need to touch, over time you will be making your code better and you will be going faster. If you don't do that, you will have a false impression of good velocity at the start, but gradually, and quite often without noticing, you will start to slow down. Things that used to take a few days now take a few weeks. Because this loss of velocity happened slowly (over months or even years), no one really noticed until it was too late and everything now takes months.

Drifting away and losing focus

One interesting side effect of constantly making legacy code better, writing tests and refactoring, is that it is too easy to be carried away. More than once I've seen pairs, while trying to clean a piece of code, drifting away from the task/story they were working on. That also happened to me quite a few times. We get carried away when making progress and making things better.

Very recently I was asked by one of the developers: "When do we stop?" My answer was: "When we finish the task." In that context, that meant that the focus is always on the task at hand. We need to keep delivering what was agreed with our product owners (or whoever is in charge of the requirements). So, make the system better but always keep the focus on the task. Don't go crazy with the refactoring, trying to re-write the whole system in a few days. Do enough to finish the task and try to isolate the parts that are still not good enough. Baby steps.

A more positive side effect

Another very common problem in legacy systems is the lack of understanding about the system and also about the business. Quite often we find situations where the developers and business people are long gone and no one really knows how the system behaves.

Writing tests for the existing code forces us to understand what it does. We try to mine hidden business concepts in the middle of the procedural code and try to make them very explicit when naming our tests. That is a great way to drive our refactoring later on, using the business concepts captured by the tests. Quite often, we also get the business people involved, asking them questions and checking whether certain assumptions make sense.

By writing the tests and refactoring the code, the previously unknown behaviour of the system becomes well documented by the tests and the code.

Quality is a long term investment

In a recent conversation, we were discussing how we could measure our "investment" in quality. A quick and simple answer would be that we could measure it by the number of bugs and the velocity at which the teams are delivering new requirements. However, it is never that simple. The question is, why do you think you need quality? Which problems do you have today that make you think they are quality related? What does quality mean anyway?

When working with legacy, after many years, we tend to forget how things were in the past. We just "accept" that things take a long time to be done because they have been taking a long time to be done for a long time. We just accept a (long) QA phase. It's part of the process, isn't it? And this is just wrong.

Constantly increasing the quality level in a legacy system can make a massive difference in the amount of time and money spent on that project. That's what we are pursuing. We want to deliver more, faster and with no bugs.

It's up to us, professional software developers, to reverse this situation by showing a great attitude and respect for our customers. We should never allow ourselves to get into a situation where the velocity and quality of our work decrease because of our own incompetence.