So we’re doing some pretty tricky stuff with the underlying architecture of our products. Essentially we’re going whole-hog enterprise SOA. I’m distinguishing here between ‘simple’ SOA (just providing information through a REST or WS* interface) and enterprise SOA, with message queues, platform neutrality and all the rest. Right now it seems like overkill, but if things take off the way they are predicted to (and changing regulatory environments are swinging in our favour as well), then it should give us a nice bit of scalability too.

The question, of course, is how do you test this monstrosity of message producers, client libraries, buses and message consumers? Right now the development team is taking the approach of ‘test the service through the client’, which appears to be working and so can’t be too wrong, but it introduces all sorts of environment dependencies, fragile tests and slow tests, and of course means great swaths of code go untested (directly).

As you might have guessed, I’m a bit concerned about this strategy and my gut says it is going to backfire, hard, at some point down the road. But can I come up with something better? I don’t know, but here is what I have so far.

First, a bit of architectural clarification. Our external-facing applications are written in Ruby on Rails (using JRuby) and we use a Ruby gem as a client to interface with an ESB. A Java service listens on the bus, pulls its messages from there, and responses follow the same path back.

The main thrust of what I think is a much safer testing strategy is to unit test each component in isolation (and to do the necessary growing-up of process that requires), followed by more standard testing driven from the UI.

At its core, this is just the classic client/server testing problem only with lots of chaining involved.

  • Application (Producer) – The application is completely unaware of anything down the execution path of the service call. It just knows it needs to call a method a certain way and react to the response in certain other ways. That’s it. And when you think about it in those terms it becomes easy to test: create tests which cause it to generate each possible message and verify the message produced is correct, then create tests which feed it each possible response condition and verify the action taken is correct.

    There is a trick here though. A unit test should be fast, not hit the network, and certainly not hit the disk. (The first condition usually follows from meeting the latter two.) This means we need to fake, or stub, the client gem that the application uses. This is not the place for a full discussion of Test Doubles, but essentially the test pretends to have gone to the server. Test Doubles are super powerful and proper use of them is a sign of development team maturity. (There’s a rough sketch of what this looks like for us just after this list.)

  • Client gem – Much like the Producer, the Client is tested by stubbing out both the Producer and the ESB, manipulating the messages coming in and validating the ones going out (the second sketch below shows the idea against a fake bus). The additional wrinkle here is that Ruby doesn’t have the notion of contracts in the same way Java or C++ do. In those languages you define your interface and those are the only things you code against. In dynamic languages, though, you have to check not only the behaviour of the officially public methods, but also the other ones that might get called. (And of course there are ample unit tests around the internal methods as well; it would be folly not to.)
  • ESB – I don’t know enough about this piece of technology, but I suspect you need to test that messages of a certain form get put into the right queue. Here we run the risk of testing the library we are using rather than our own code; at some point you have to trust the libraries you are using.
  • Service (Consumer) – Exactly the same as Producer
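
To make the Producer side a bit more concrete, here is a rough sketch of the kind of test I have in mind. Everything in it is invented for illustration (the StubServiceClient, the OrderSubmission class, its send_request method and the message format are not our actual gem or application), and I’ve hand-rolled the double rather than reaching for a mocking library, but the shape is the point: hand the application a fake client, check the messages it produces, feed it canned responses and check how it reacts.

    require 'test/unit'

    # Hypothetical stand-in for the client gem: it records every message it is
    # asked to send and hands back a canned response instead of touching the bus.
    class StubServiceClient
      attr_reader :sent_messages

      def initialize(canned_response)
        @canned_response = canned_response
        @sent_messages   = []
      end

      def send_request(message)
        @sent_messages << message
        @canned_response
      end
    end

    # Hypothetical slice of the application; the real thing lives in the Rails
    # app and is handed the real client gem in production.
    class OrderSubmission
      def initialize(client)
        @client = client
      end

      # Returns true on success, false on any error response.
      def submit(attrs)
        response = @client.send_request(:type => 'order.create', :payload => attrs)
        response[:status] == 'ok'
      end
    end

    class OrderSubmissionTest < Test::Unit::TestCase
      def test_submit_produces_a_well_formed_message
        client = StubServiceClient.new(:status => 'ok')
        OrderSubmission.new(client).submit(:sku => 'ABC-123', :quantity => 2)

        message = client.sent_messages.first
        assert_equal 'order.create', message[:type]
        assert_equal 'ABC-123', message[:payload][:sku]
      end

      def test_submit_reports_failure_when_the_service_says_no
        client = StubServiceClient.new(:status => 'error', :reason => 'no stock')
        result = OrderSubmission.new(client).submit(:sku => 'ABC-123', :quantity => 2)
        assert_equal false, result
      end
    end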

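The gem itself gets the same treatment one layer down: swap the real ESB connection for a fake one and assert on what would have been published. Again, FakeBusConnection, its publish method and the queue name are inventions for the sake of the sketch.

    require 'test/unit'

    # Hypothetical fake for whatever ESB connection object the gem wraps;
    # it just remembers what was published to which queue.
    class FakeBusConnection
      attr_reader :published

      def initialize
        @published = []
      end

      def publish(queue, payload)
        @published << [queue, payload]
      end
    end

    # Hypothetical slice of the client gem under test.
    class ServiceClient
      ORDER_QUEUE = 'orders.inbound'

      def initialize(bus)
        @bus = bus
      end

      def send_request(message)
        raise ArgumentError, 'message must have a :type' unless message[:type]
        @bus.publish(ORDER_QUEUE, message)
      end
    end

    class ServiceClientTest < Test::Unit::TestCase
      def test_requests_end_up_on_the_right_queue
        bus = FakeBusConnection.new
        ServiceClient.new(bus).send_request(:type => 'order.create', :payload => {})

        queue, payload = bus.published.first
        assert_equal 'orders.inbound', queue
        assert_equal 'order.create', payload[:type]
      end

      def test_messages_without_a_type_never_reach_the_bus
        bus = FakeBusConnection.new
        assert_raise(ArgumentError) { ServiceClient.new(bus).send_request(:payload => {}) }
        assert_equal [], bus.published
      end
    end
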
Now we’re at the point where everything works within its own little bubble (to some definition of works…), but in the real world all the pieces are going to have to play nicely with each other. If we had contracts we would be slightly over halfway there already, but since we don’t, we’re going to have to rely on the competence of the developers to have changed both sides of the conversation (in all products that have a side). (I think in SOA terms this concept falls under the umbrella of SOA governance.) The way to do this testing is to replicate the testing done at the unit level in the Application portion, but with all the stubs removed and driven through the actual end interface; ideally through judicious use of automation (like Selenium; there is a sketch of that below too).
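
In practice I’m picturing something like the following, with the caveat that the URL, the form fields and even the choice of the selenium-webdriver gem are placeholders rather than decisions:

    require 'test/unit'
    require 'selenium-webdriver'   # assuming the selenium-webdriver gem

    class OrderEndToEndTest < Test::Unit::TestCase
      def setup
        @driver = Selenium::WebDriver.for :firefox
      end

      def teardown
        @driver.quit
      end

      # Same scenario as the unit-level Producer test, but nothing is stubbed:
      # the request really goes through the gem, the bus and the Java service.
      def test_submitting_an_order_through_the_ui
        @driver.get 'http://staging.example.com/orders/new'   # placeholder URL
        @driver.find_element(:name, 'order[sku]').send_keys 'ABC-123'
        @driver.find_element(:name, 'order[quantity]').send_keys '2'
        @driver.find_element(:name, 'commit').click

        assert_match(/Order received/, @driver.find_element(:id, 'notice').text)
      end
    end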

Harnessing CI, all the stubbed unit tests run on every commit and the un-stubbed integration ones run every couple of hours.
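
Mechanically, that probably just means keeping the two suites separately addressable so the CI server can schedule them differently; a hypothetical Rakefile layout:

    # Rakefile - hypothetical layout; the point is just that CI can invoke the
    # fast, fully stubbed suite and the slow un-stubbed suite independently.
    require 'rake/testtask'

    Rake::TestTask.new(:units) do |t|
      t.pattern = 'test/unit/**/*_test.rb'          # stubbed; run on every commit
    end

    Rake::TestTask.new(:integration) do |t|
      t.pattern = 'test/integration/**/*_test.rb'   # un-stubbed; scheduled every couple of hours
    end

    task :default => :units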

The hard part is actually implementing this (of course) and not just being a test astronaut. TDD goes a long way here, since you need to write tests before everything downstream (or upstream) is ready, so you naturally stub it out. Version-itis also becomes a problem without contracts, formal or informal, as a change to a shared service means changing the gem, which then needs to be pushed to every server, plus a restart, to pick up the new version. Version-itis is a tough nut to crack. I’ve seen it argued that a lot of Microsoft’s issues around quality stem from having to support so much stuff from previous releases.

Anyone have any lessons learned or ideas? Most of what I found searching this morning was marketing material or dealt with REST/WS* services rather than large-scale enterprise SOA. Also, anyone have experiences to share around using RR (Double Ruby) to implement stubs in a gem context?
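
For context, this is roughly what I’m picturing with RR, reusing the hypothetical OrderSubmission class from the sketch above (with the caveat that the adapter hookup seems to vary between RR versions):

    require 'test/unit'
    require 'rr'   # assuming the RR gem; older versions also need one of the
                   # RR::Adapters modules mixed into the test case

    class OrderSubmissionRRTest < Test::Unit::TestCase
      def test_submit_with_an_rr_stub
        client = Object.new
        stub(client).send_request { { :status => 'ok' } }   # RR's stub-with-block syntax

        assert_equal true, OrderSubmission.new(client).submit(:sku => 'ABC-123')
      end
    end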