Wednesday, April 11, 2012

BDD and big datasets

One of the challenges that I hear in my classes on Agile Testing is around Behavior Driven Development and big datasets.  The intro to a lot of BDD tools looks something like this:

Given a customer named 'John Smith' who is 45 years old
When I execute a check for retirement eligibility
Then the result should be false

This is, of course, the Gherkin language from Cucumber, which has implementations in many languages including Ruby, Java, and .NET.  It's pretty compelling -- it succinctly describes preconditions, actions, and expected results.  Until you try to apply it, that is; then the trouble starts.
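If you haven't seen Cucumber before: each of those lines binds to a step definition in code.  Here's a minimal Cucumber-JVM sketch of what that binding might look like -- the class, the regexes, and the eligibility rule are all invented for illustration:

    import cucumber.api.java.en.Given;
    import cucumber.api.java.en.When;
    import cucumber.api.java.en.Then;
    import static org.junit.Assert.assertFalse;

    public class RetirementSteps {
        // Hypothetical domain type, for illustration only
        static class Customer {
            final String name;
            final int age;
            Customer(String name, int age) { this.name = name; this.age = age; }
        }

        private Customer customer;
        private boolean eligible;

        @Given("^a customer named '(.*)' who is (\\d+) years old$")
        public void createCustomer(String name, int age) {
            customer = new Customer(name, age);
        }

        @When("^I execute a check for retirement eligibility$")
        public void checkEligibility() {
            eligible = customer.age >= 65;  // stand-in for the real eligibility rule
        }

        @Then("^the result should be false$")
        public void resultShouldBeFalse() {
            assertFalse(eligible);
        }
    }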

In this post, I'll talk about big datasets.  In the above example, what stands out as too simple is the assumption that we can insert a customer with only two attributes (name and age).  In practice, our customer records have dozens or in some cases almost a hundred fields.  Gherkin can express tables using the vertical pipe symbol, so do we create something like this?

Given a customer with these fields:
    | name       | age | street 1   | street 2 | city    | state | zip   | ssn         | credit card         | ccExpiry | signupDate | currentBalance | blah | blah | blah |
    | john smith | 45  | 4823 third |          | seattle | wa    | 98173 | 592-93-5382 | 4324 4322 3345 2838 | 11/15    | 04/12      | 438.27         | moo  | beep | foo  |

When I execute a check for retirement eligibility
Then the result should be false

I hope not, because that stinks.  It pretty quickly strips away most of the readability benefit that we were getting from this tool in the first place.  From a data point of view, it's also a mess because the tabular format requires us to flatten a lot of our dataset, making it harder to maintain.  And from a communication point of view, it requires us to really pan for gold -- some of these columns have an impact on whether the customer is eligible for retirement, and some of them are just boilerplate junk that we have to provide in order to create a customer.  Which is which?

The distinction (impactful vs. boilerplate) is especially important when we try to maintain the test.  If 80% of the fields are boilerplate, we get into the habit of changing data to make the test pass, which makes it very easy to make a 'maintenance' change that actually invalidates the test.  In our example above, it might be reasonable to assume that only name and age matter.  But what if address mattered, too?  It's possible that retirement eligibility (whatever that means) varies by state or county.  We need to remove the boilerplate attributes from the test, and have the test focus only on the details that impact the outcome.

The first step is to listen.  How do our product owners or business people talk about the customers?  I mean beyond calling them big fat jerks when they're at the bar after work.  Are there certain classes of customers that we can specify?  Let's try that:

Given a West Coast customer aged 45
When I execute a check for retirement eligibility
Then the result should be false

But what does 'West Coast customer' mean?  That definition lives in the step definition, and includes a set of Reasonable Defaults.  Every customer needs a name, so the system makes one up.  And within the context of our West Coast Customer class, it makes up an address in one of the western coastal states.  This makes the real dependencies of our test clear: the address should be somewhere on the West Coast, and the customer should be a particular age.  Everything else about the customer is irrelevant to this test.

On the step definition side, there are some decisions to be made.  First is what to parameterize.  In the beginning, do the Simplest Thing That Could Possibly Work and parameterize nothing beyond the age that already appears in the scenario:

     @Given("a West Coast customer aged (.*)") ...

If, as I would expect, we end up with several additional data points that we want to 'override' on our West Coast Customer, then the Builder pattern will be useful.  Specifically Builder rather than Factory, so that we avoid a combinatorial explosion of override options.  The WestCoastCustomerBuilder might look like this after some time:

   public class WestCoastCustomerBuilder {
      // Each setter overrides one Reasonable Default; anything not
      // set explicitly gets filled in when the customer is built.
      public void setAge(int age) { /* ... */ }
      public void setSsn(String ssn) { /* ... */ }
      public void setCurrentBalance(double balance) { /* ... */ }
      public void buildAndInsertCustomer() { /* ... */ }
   }
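To see why I say Factory would explode: each combination of overrides would need its own creation method.  A hypothetical sketch (Customer is an assumed domain type):

   // Hypothetical: a factory needs one creation method per
   // combination of overrides, so methods multiply quickly.
   public interface WestCoastCustomerFactory {
      Customer createAged(int age);
      Customer createAgedWithSsn(int age, String ssn);
      Customer createAgedWithBalance(int age, double balance);
      Customer createAgedWithSsnAndBalance(int age, String ssn, double balance);
      // ...and every new optional field doubles the count
   }

The builder sidesteps this by letting each scenario compose exactly the setters it needs.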

And here's how it might be used in several step definitions:


    @Given("a West Coast customer aged (.*)")
        public void BuildWestCoastWithAge(int age) {
           builder.SetAge(age);
           builder.BuildAndInsertCustomer();
        }

    @Given("a West Coast customer aged (.*) with ssn (.*)")
    public void BuildWestCoastWithSSN(int age, string ssn) {
            builder.SetSSN(ssn);
            builder.SetAge(age);
            builder.BuildAndInsertCustomer();
    }

This shows the re-use of the builder pattern.  Critics will point out that we could also just have the second definition call the first, or avoid the need altogether with an optional clause in the step definition.  True, but that wouldn't provide as good an example of re-use.
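For completeness, here's a sketch of that optional-clause alternative in the same step class.  I'm relying on Cucumber-JVM passing null for a capture group that didn't match, so treat the details as an assumption to verify:

    @Given("^a West Coast customer aged (\\d+)(?: with ssn (.*))?$")
    public void buildWestCoast(int age, String ssn) {
        builder.setAge(age);
        if (ssn != null) {
            builder.setSsn(ssn);  // override the default only when the scenario supplies an SSN
        }
        builder.buildAndInsertCustomer();
    }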

The defaults could be static data (all West Coast addresses are my house), a random selection from a given dataset (choose a random address among these 50), or pure random generation (make up an address on a numbered street in a west coast city).  As with all things, do the simplest first and see if that works for you.
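A sketch of those three strategies, from simplest to fanciest -- all names and data here are made up for illustration:

    import java.util.Random;

    public class WestCoastDefaults {
        private static final Random RANDOM = new Random();
        private static final String[] CITIES = { "Seattle", "Portland", "San Francisco" };

        // 1. Static data: every generated customer lives at the same address.
        static String staticAddress() {
            return "4823 Third Ave, Seattle, WA 98173";
        }

        // 2. Random selection from a fixed dataset.
        static String randomCity() {
            return CITIES[RANDOM.nextInt(CITIES.length)];
        }

        // 3. Pure random generation: invent a plausible address.
        //    (A real generator would get ordinal suffixes right.)
        static String generatedAddress() {
            int houseNumber = 100 + RANDOM.nextInt(9900);
            int street = 1 + RANDOM.nextInt(99);
            return houseNumber + " " + street + "th St, " + randomCity();
        }
    }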

In summary: whenever our 'required' datasets start to harm readability (and hence maintainability), we should refactor the test to specify only the inputs that affect the observed result.  All the boilerplate should be handled by generalized step definitions that can supply Reasonable Defaults.  This keeps the test clear and focused, while creating reusable step definitions.