I’ve worked on a number of projects with Cucumber now, and I think it is a terrific tool. However, I’ve been seeing it used in ways that it was not originally intended for, the consequence being acceptance-test suites that grow increasingly slow, brittle and difficult to maintain.
These problems occur when developers start using Cucumber to drive overly-detailed tests. The problem of excessive UI testing with Cucumber has been blogged about in the past by Aslak Hellesøy – the creator of Cucumber – but I thought I’d cover some of my own experiences and opinions in this post.
The Thing About Developers
We developers are a funny bunch – we’ll take a new tool and flog it to death, often pushing it way past what it was originally intended for. In some ways, this is only human, but the problem is that these failed experiments stay with us in the form of difficult-to-maintain code.
To make matters worse, when we discover the tool can’t do absolutely everything we hoped, there’ll be a backlash and we’ll declare the tool to be rubbish. Those of us lucky enough to have the chance will move on to the next big thing, sometimes going right back to where we started, and failing to acknowledge the more modest (but not insignificant) advancement that the tool gave us. Those unlucky souls left behind will have to clean up the post-party mess, cursing those who moved on and the wretched tools and techniques that they used.
The technology and process landscape is littered with such roadkill. And I believe Cucumber is in danger of suffering the same fate.
What Should Happen…
So how is Cucumber supposed to be used? Well, it’s worth remembering that Cucumber is primarily intended to be a collaboration tool, and only secondarily an automated testing tool. This is a point emphasised in the recently-published Cucumber Book by Hellesøy and Matt Wynne.
However, in my experience, this collaboration can be difficult to achieve. Business people mightn’t want to write acceptance tests in that format, or might not even be able to; the format might be too far from what they’re used to. They need developers to help them, but developers aren’t particularly well known for their collaboration skills. Which brings me to…
What Actually Happens
So what usually happens is that developers dive in and start to write the features and scenarios themselves. At first this goes well, but it rapidly spirals out of control. For developers are, by necessity, detail-focussed, and their acceptance tests rapidly begin to reflect this fact. Being obsessed with DRY principles, they begin to re-use steps. To achieve this, step arguments are introduced. Data tables next. Examples get out of control. Lots of state begins to be stored between step definitions. Steps need to be executed in a particular order.
And before you know it, you’ve got a real mess on your hands. New team members find it harder and harder to understand the step implementations. When they want to add another small feature, they try and contrive extra data that will wend and weave its way through the system and exercise their code change. And if it all gets too hard, they’ll start copying-and-pasting scenarios and tweaking them slightly to test the particular path they want to cover. The number of scenarios explodes and the test suite slows down.
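As a rough illustration of the shared-state problem, here is a minimal sketch in plain Python. It does not use any Cucumber API; the step functions, the `context` dictionary and the customer and basket names are all hypothetical stand-ins for step definitions that communicate through hidden state.

```python
# Hypothetical step functions that communicate through shared mutable state.
# This is plain Python, not a real Cucumber binding.
context: dict = {}


def given_a_customer(name: str) -> None:
    context["customer"] = name


def when_they_add_item(item: str) -> None:
    # Hidden dependency: this step fails unless given_a_customer ran first.
    owner = context["customer"]
    context.setdefault("basket", []).append((owner, item))


def then_basket_size_is(expected: int) -> bool:
    return len(context["basket"]) == expected


def run_in_order() -> bool:
    """Run the steps in the one order that happens to work."""
    context.clear()
    given_a_customer("Alice")
    when_they_add_item("book")
    return then_basket_size_is(1)


def run_out_of_order() -> str:
    """Skip the Given step and report what the When step was missing."""
    context.clear()
    try:
        when_they_add_item("book")
    except KeyError as missing:
        return f"step failed, missing state: {missing}"
    return "ok"
```

Nothing in the wording of the second step tells a newcomer that it depends on the first, which is the kind of coupling that makes step implementations hard to follow.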
It can get to a point where Cucumber starts to take over the role of unit testing. This is unfortunate because whilst Cucumber is great, it’s a poor choice for a unit-testing tool, for two reasons. Firstly: acceptance tests, functional tests, integration tests; call them what you will, but one property they all share is that they test the system as a whole rather than particular units. Hence the increasingly convoluted test-data gymnastics that we have to contrive in order to test new changes. Secondly, compared to writing your tests in code, all Cucumber adds is a layer of indirection via a pseudo-natural-language syntax. If non-technical people aren’t reading those steps, that layer is just extra hoops to jump through for no benefit.
The Golden Rule
I have a basic criterion for whether Cucumber is going to add value to your project. In short, if your scenarios aren’t understandable to business people, there’s no point in using Cucumber.
If developers are going to write the acceptance tests, they need to exercise extreme diligence. They need to ensure that each scenario is almost immediately comprehensible to a business person. A scenario should not be so full of gory details and twists and turns that a business person glazes over when reading it.
Of course, this is easier said than done. It requires us developers to regularly go and get a business person to take a look at our new feature files and scenario definitions. Yet, for many developers, this is an inexplicably hard thing to do.
A strategy that I’ve seen some developers use to stop their Cucumber tests getting out of control is to avoid tables, examples, and even arguments in steps unless absolutely necessary. Better to have comprehensible steps with some repetition than highly-refactored steps and unreadable scenarios.
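To make that strategy concrete, here is a rough sketch in Python that compares two scenarios for the same behaviour. Both scenarios, the discount domain and the `detail_signals` helper are invented for illustration, and the counts are a crude heuristic of my own rather than anything Cucumber provides.

```python
import re

# Two hypothetical scenarios for the same behaviour, written as Gherkin text.
DETAILED = """\
Scenario Outline: Discount applied
  Given the following customers exist:
    | id | name  | tier |
    | 17 | Alice | gold |
  When customer "17" adds "<qty>" of item "SKU-9" to basket "B1"
  Then the basket "B1" total should be "<total>"
  Examples:
    | qty | total |
    | 1   | 9.00  |
    | 2   | 18.00 |
"""

READABLE = """\
Scenario: Gold customers get a discount
  Given Alice is a gold customer
  When she buys a book
  Then she is charged the discounted price
"""


def detail_signals(scenario: str) -> dict:
    """Count the features the strategy above suggests avoiding."""
    lines = [line.strip() for line in scenario.splitlines()]
    return {
        # Any line starting with a pipe is a row of a data or examples table.
        "table_rows": sum(1 for line in lines if line.startswith("|")),
        "examples_blocks": sum(1 for line in lines if line.startswith("Examples:")),
        # Double-quoted fragments are treated as step arguments.
        "quoted_arguments": sum(len(re.findall(r'"[^"]*"', line)) for line in lines),
    }


if __name__ == "__main__":
    print("detailed:", detail_signals(DETAILED))
    print("readable:", detail_signals(READABLE))
```

The numbers matter less than the contrast: the second scenario repeats itself a little more across a suite, but a business person can read it in one pass.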
If you’re a developer who doesn’t think you’re going to be able to keep your feature files comprehensible, that’s OK – just don’t use Cucumber. Whilst in some ways this is a shame (because I think that it’s healthy and beneficial for developers to write a couple of high-level scenarios for new functionality that they are going to implement), in other ways it’s a relief, because if you just can’t help yourself when it comes to getting down to the nitty-gritty, I’d much rather you wrote your tests in code than that you wrote them in Cucumber.
A Third Way
So what if you still really want to use Cucumber, despite the fact that you can’t trust yourself to write scenarios that are comprehensible to the business, and the business is incapable of writing scenarios at all?
I have found testers to be great people to drive the development of feature files (or at least have a red-hot go at kicking them off). They can interact with the business to get a high-level picture, fill out the details, then get developers to write the step definition code (which developers just love to do because they just love to code). Testers are far enough away from the code to avoid unit-testing, yet close enough to check the important details.
The only trap testers should avoid falling into is this: when the developers have got all of the automated acceptance tests passing, testers should most definitely not use those tests as a script for their own manual testing (a strange phenomenon I have seen on a couple of occasions). Instead, they should use their new-found time and freedom to engage in exploratory testing.
I believe that Cucumber is only appropriate if non-technical people are going to be reading (and preferably writing) the steps. Whilst in theory Cucumber is a great tool for behaviour-driven development at any level, I’ve seen too many developers struggling to write comprehensible scenarios and step-definition code that is understandable to other developers, let alone non-technical people. The result is often a Cucumber test suite that is slow, brittle and difficult to maintain.
I’d rather that developers wrote their tests in plain-old code, unless a business person is going to look at them. I also believe that, rather than trying to put a natural-language wrapper around their tests, developer effort would be better expended just trying to make test code more understandable.
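As a small sketch of what understandable test code might look like, here is the same kind of rule tested directly with Python’s standard `unittest` module. The `price_for` function and the ten-percent gold discount are hypothetical; the point is only that descriptive test names can carry the intent without a natural-language layer on top.

```python
import unittest


def price_for(customer_tier: str, list_price: float) -> float:
    """Hypothetical unit under test: gold customers get 10% off."""
    discount = 0.10 if customer_tier == "gold" else 0.0
    return round(list_price * (1 - discount), 2)


class GoldCustomerDiscountTest(unittest.TestCase):
    # Each test name states the rule it checks, so a failure reads like a sentence.
    def test_gold_customer_pays_ten_percent_less(self):
        self.assertEqual(price_for("gold", 20.00), 18.00)

    def test_standard_customer_pays_list_price(self):
        self.assertEqual(price_for("standard", 20.00), 20.00)


if __name__ == "__main__":
    unittest.main()
```

Edge cases for the pricing rule can be added here as further small tests, without contriving data that has to travel through the whole system.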
High-level acceptance tests written in Cucumber are great, but let’s not go overboard. Let’s not let Cucumber become another technology roadkill, run down by over-zealous developers. And let’s not leave ourselves a test-suite legacy that we’ll all regret.