The CI server as a Selenium Grid replacement
As I’ve been involved in more and more Selenium deployments over the last year, I’m less and less convinced that Selenium Grid should have much of a role, if any, in a large-scale deployment. That doesn’t mean you shouldn’t be distributing scripts across various machines for both configuration coverage and pure workload distribution; of course you absolutely should be.
Configuring your CI server as a Se-Grid replacement is easy-peasy; just create a parallel job chain.
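To make "parallel job chain" a little more concrete, here is a minimal sketch of generating one job definition per platform/browser combination. Everything in it is an assumption for illustration: the platform and browser lists, the `SE_OS` and `SE_BROWSER` variable names, and the job naming scheme. It is not any particular CI server's format; you would translate each entry into whatever your server uses for a job.

```python
from itertools import product

# Hypothetical coverage targets; substitute the combinations you care about.
PLATFORMS = ["windows", "linux"]
BROWSERS = ["firefox", "ie"]


def supported(platform, browser):
    """Skip combinations that cannot exist (IE only runs on Windows)."""
    return not (browser == "ie" and platform != "windows")


def build_jobs(platforms, browsers):
    """Return one job definition per valid platform/browser combination.

    Each job carries its own environment, so the CI server can run the
    jobs side by side and report a separate result for each.
    """
    jobs = []
    for platform, browser in product(platforms, browsers):
        if not supported(platform, browser):
            continue
        jobs.append({
            "name": f"selenium-{platform}-{browser}",
            # Hypothetical variable names the test scripts would read.
            "env": {"SE_OS": platform, "SE_BROWSER": browser},
        })
    return jobs


for job in build_jobs(PLATFORMS, BROWSERS):
    print(job["name"], job["env"])
```

The parallelism itself stays in the CI server; the scripts only ever see one combination at a time.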
But the bigger question is why you would want to do this. To me, the big thing is the visibility the CI server gives you into your runs. With Se-Grid, you have a single job and either it passes or it doesn’t. Well, which of your OS/browser/version combinations didn’t work? Time to start digging… With the CI route, you just look at the individual job results and it becomes quite apparent which one didn’t work.
Which leads to the next reason: debugging. If you have four environments in your grid_configuration.yml file and only one is misbehaving, you have to change the file in order to debug that single one. That means either committing a change to the file or logging into a machine and modifying it there. Either is bad. You should never have to log into a machine used for automation; that is why things like Puppet exist. And committing something should have other ramifications, like starting a new run through a build pipeline, which seems silly for just a config. And a debug config at that.
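One way to sidestep the file edit is to have each script take its target from environment variables that the CI job supplies. The sketch below assumes that approach; the variable names (`SE_OS`, `SE_BROWSER`, `SE_BROWSER_VERSION`) and the defaults are made up for illustration.

```python
import os

# Hypothetical defaults, used when a job does not set a variable.
DEFAULTS = {
    "SE_OS": "linux",
    "SE_BROWSER": "firefox",
    "SE_BROWSER_VERSION": "latest",
}


def target_from_env(env=None):
    """Return the OS/browser/version combination this job should test.

    `env` defaults to the process environment; passing a plain dict
    makes the function easy to exercise without touching os.environ.
    """
    env = os.environ if env is None else env
    return {
        "os": env.get("SE_OS", DEFAULTS["SE_OS"]),
        "browser": env.get("SE_BROWSER", DEFAULTS["SE_BROWSER"]),
        "version": env.get("SE_BROWSER_VERSION", DEFAULTS["SE_BROWSER_VERSION"]),
    }
```

With something like this in place, debugging the one misbehaving combination means re-running that single job with different values, with nothing to commit and no machine to log into.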
And those two reasons are enough for me. But if you needed some other reasons:
- Your CI server at this point is likely already distributing tests across multiple machines. By adding Se-Grid on top of that, you have added another way of communicating with remote machines. ‘Another’ is a synonym for ‘something else that could go wrong and have to be debugged’.
- Se-Grid functionality is planned for inclusion in future versions of the Selenium Server, which will address the problem of it falling out of step with the ‘current’ version of things. I believe it currently includes version 1.0.3 of the Selenium Server, which is old.
- Se-Grid relies on your scripting framework’s runner to do the parallelization and the subsequent aggregation of run results. Not every framework supports parallel execution. And even for those that do, having to put the parallelization code into the scripts unnecessarily complicates things. Threads are tricky; avoid trickiness.
- When I was at HP, we had 3 full racks of machines at our disposal for manual and/or automated testing. Maintaining them basically monopolized our poor IT person 100% of the time. So I don’t have any qualms about recommending Sauce OnDemand to handle jobs for you. Building your own grid/cloud might seem the cheaper route, but how much is your hourly rate for maintaining it? That is its real cost. Unfortunately, you cannot add OnDemand to a grid, no matter how often I suggest it. :) When you have multiple single jobs running in parallel via CI, you just point the ones that OnDemand can handle at OnDemand and the ones it cannot at local instances. And no one is the wiser.
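The "point some jobs at OnDemand and the rest at local instances" idea from the last bullet could look something like the sketch below. The hostnames, the port, and the set of combinations the hosted service can handle are all placeholders invented for the example, not real endpoints or a real support matrix.

```python
# Hypothetical endpoints; neither is a real address.
ONDEMAND = ("ondemand.example.com", 4444)
LOCAL = ("selenium.internal.example.com", 4444)

# Hypothetical set of combinations the hosted service can run.
ONDEMAND_SUPPORTED = {
    ("windows", "ie"),
    ("windows", "firefox"),
    ("linux", "firefox"),
}


def endpoint_for(platform, browser):
    """Return the (host, port) a job should send its Selenium commands to.

    Combinations the hosted service handles go there; everything else
    falls back to a locally maintained machine.
    """
    if (platform, browser) in ONDEMAND_SUPPORTED:
        return ONDEMAND
    return LOCAL
```

Because each CI job only asks for its own endpoint, the scripts themselves never need to know which side is actually running the browser.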
Selenium is only a small part of the overall toolset these days. When processes are completely manual and non-linked teams are busy building their own fiefdoms, erm, parts of the product, I can see how Se-Grid fits into the picture. But high-functioning teams are very linked, and the whole process of building stuff is automated as much as [ethically] possible, meaning Se-Grid isn’t up to the task anymore.