Friday, December 04, 2015

Node's Benefits Don't Matter. Use Node Anyway.

Systems that are maintained by a single team will coalesce to the paradigm of their least changeable component. In web development, that's the front end, which is javascript.

One of the problems of being a "full stack web developer" is that you must master very disparate sets of abilities. In the traditional form, that's HTML/CSS, front-end javascript, back-end languages, and SQL. And maybe devops tools on top of that.

What makes Node unique in terms of platform features is that it is designed to be functional and non-blocking from the ground up. Each library is already set up to use some kind of callback pattern, and the platform architecture forces the developer to adhere. There are benefits to this paradigm, but they don't really matter.

Yes, Node's nonblocking IO and other features are great in terms of scalability and performance, but these benefits don't really explain the uptake of Node as a first-implementation language. I would bet that 90% of the applications written in Node never get to a volume where the performance considerations matter. And those that do still end up needing custom performance features written in a compiled language.

But I hear the word "just" a lot, implying simplicity -- "just build it in Node". So what's going on?

The real benefit: paradigm consistency. All the different concepts (CSS, functional javascript, imperative Java, SQL, etc) are too much, and take away from your time concentrating on the business problem. So we skimp. We write Javascript that looks a lot like Java, or vice-versa. But what about Node?

Node is not only the same language as the front-end, but also the same programming concepts: event-based, callbacks, etc. This avoids shifting into the paradigm of imperative style, with blocking multithreading, big classes, etc. This, coupled with avoiding the specialized knowledge of build cycles and compilation and artifact deployment, makes it feel "just easier". And it is.

What about no-sql? Everything I said about Node is probably true for Mongo.

Writing to no-sql databases really looks very similar to the list/hash data structures that appear on the front end. It avoids the specialized knowledge of DDL. Data migration is done in functional/reactive programming.

That's what makes the node/nosql stack so compelling -- it's all the same paradigm, and it matches the browser-side paradigm, which is the one you can't change. This isn't bad, necessarily. In fact, it may be the kind of thinking that unlocks Node for a lot of teams.

Thursday, November 05, 2015

Convincing your team you haven't decided

One of the things I find myself saying a lot as a software architect is, "I don't have an opinion yet." Often the team continues asking questions, ending with, "Is that what you want?" and I say again, "No, really -- I don't have an opinion yet!" Since I am either the expert whose advice is being sought, or the leader who is responsible for making the decision, this is surprising (and sometimes a little frustrating) for the team I'm working with. I can't possibly be the only software leader with this challenge, so here are my insights about it.

There are two main reasons I profess not to have an opinion:

1. I need more information before I can make the decision.
2. I have an opinion, but I don't trust it.

One helpful thing for my team would be just to share these things. I think the reason I don't is that it's subconscious, so writing a blog like this helps me move these things from mental habits to explicit thought processes that I can change.

But sometimes I actually want the team to make the decision, and review it with me in some way (see 7 levels of delegation). In that scenario, I really don't want to taint the results with my opinion in situation #2, but I have to avoid making the team feel "set up" when I ask them to research something, and don't like their results.

Another part of the communication problem can be my tone of voice, coupled with a little bit of a love of philosophy. When we find a good question, my intonation tends to end on a descending tone, as though the sentence ended with a period. For example, "What is the test framework we will use here." If you read it like a sentence, it sounds a lot like your teacher asking a review question. And that tone of voice made the team feel like I already knew the answer, and was just 'testing' them to see if they can come up with it. (Kudos to my friend Matt for this insight)

Anyway, these are some things that happen to me as I work to lead a team of smart people. Share or reblog your insights as well!

Team Metrics Roundup

Potential Project Metrics

The purpose of metrics is to force a conversation, not to jump to a conclusion.

The purpose of metrics is to ultimately result in a change in behavior. But that change does not come from the metric directly, rather the metric indicates that a problem *may* exist, and forces a conversation. That conversation then results in a change to potentially both the team’s behavior and to the metric itself.

Metrics aren’t free -- they cost team time and leadership time. Even if they are collected automatically by a tool, the team and management need to take time to review and discuss the results periodically. For that reason, it is important that teams and projects choose to track only the metrics that point to potential problems worth discussing. It may be that a team tracks a particular metric while they are trying to create a change in behavior, and after that behavior change is achieved, the metric may no longer be tracked.

Productivity and Focus

These metrics attempt to measure the output of the team. In addition to the raw numbers (higher throughput and shorter cycle time are better), it is also valuable to look at the volatility of the numbers. For example, cycle time will tend to be more variable if the size of the work item is more variable.

Cycle Time

Definition: The average amount of time it takes an item to move from “in progress” to “complete”

Measures: Team productivity, focus.

Affected by: Work in Progress, external dependencies, task size

Mechanics: Configure task tracking tool to collect this information.

Work In Progress

Definition: The number of items that have been started, but not completed. This may include work that is awaiting external resources, etc.

Measures: Team Focus

Affected by: External dependencies, team size, team decisions

Mechanics: PM records the number of tasks in progress daily; the task tool tracks/enforces an upper limit.

Throughput

Definition: The average number of issues completed per (day/week/etc).

Measures: Team productivity

Affected by: Task size, team focus, work in progress

Mechanics: Configure task tracking tool to collect this information.
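As a rough sketch of how the three definitions above could be computed from exported tracker data: the `WorkItem` record, the `FlowMetrics` object, and the sample dates are all made up for illustration, and a real task tool will have its own export format.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

// Hypothetical record of one work item, as it might be exported from a task tracker.
final case class WorkItem(id: String, started: LocalDate, completed: Option[LocalDate])

object FlowMetrics {
  // Cycle time: average days from "in progress" to "complete", finished items only.
  def averageCycleTimeDays(items: Seq[WorkItem]): Option[Double] = {
    val durations = items.flatMap { item =>
      item.completed.map(done => ChronoUnit.DAYS.between(item.started, done).toDouble)
    }
    if (durations.isEmpty) None else Some(durations.sum / durations.size)
  }

  // Work in progress: items started on or before `day` that are not complete by then.
  def workInProgress(items: Seq[WorkItem], day: LocalDate): Int =
    items.count { item =>
      !item.started.isAfter(day) && item.completed.forall(_.isAfter(day))
    }

  // Throughput: items completed per week over the inclusive window [from, to].
  def throughputPerWeek(items: Seq[WorkItem], from: LocalDate, to: LocalDate): Double = {
    val completedInWindow = items.count { item =>
      item.completed.exists(d => !d.isBefore(from) && !d.isAfter(to))
    }
    val days = ChronoUnit.DAYS.between(from, to) + 1
    completedInWindow * 7.0 / days
  }

  // Made-up sample data.
  val sample: Seq[WorkItem] = Seq(
    WorkItem("A", LocalDate.of(2015, 11, 2), Some(LocalDate.of(2015, 11, 4))),
    WorkItem("B", LocalDate.of(2015, 11, 3), Some(LocalDate.of(2015, 11, 9))),
    WorkItem("C", LocalDate.of(2015, 11, 5), None)
  )

  def main(args: Array[String]): Unit = {
    println(averageCycleTimeDays(sample))
    println(workInProgress(sample, LocalDate.of(2015, 11, 6)))
    println(throughputPerWeek(sample, LocalDate.of(2015, 11, 2), LocalDate.of(2015, 11, 15)))
  }
}
```

The numbers only start the conversation: in this sample, item B takes three times as long as item A, which is the kind of volatility worth asking about.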

Quality and Feedback

Testing is only one kind of feedback that leads to quality. Demonstrations to Product Owners can identify problems just as readily.

Testing and demonstration should be a practice that happens continuously during development, not just at project end. These metrics help show whether that is actually happening.

Time to first demo

Definition: The amount of time from the item moving “in progress” until the first demo to a Product Owner role

Measures: Early feedback habits

Affected by: team planning choices, Product Owner Availability, type of work done

Mechanics: Record this information manually in task tracking tool (Jira, etc)

Mean time between demos

Definition: The average number of days between demonstrations of any sort on the project.

Measures: Frequency of feedback

Affected by: team planning choices, Product Owner Availability, type of work done

Mechanics: Record when demonstrations take place in task tracking tool, and calculate manually.
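The manual calculation is small enough to sketch. This assumes the demo dates have been copied out of the task tracking tool into a list; `DemoCadence` and the dates are invented for the example.

```scala
import java.time.LocalDate
import java.time.temporal.ChronoUnit

object DemoCadence {
  // Average gap in days between consecutive demos; None when there are fewer than two.
  def meanDaysBetween(demos: Seq[LocalDate]): Option[Double] = {
    val sorted = demos.sortBy(_.toEpochDay)
    val gaps = sorted.zip(sorted.drop(1)).map { case (earlier, later) =>
      ChronoUnit.DAYS.between(earlier, later).toDouble
    }
    if (gaps.isEmpty) None else Some(gaps.sum / gaps.size)
  }

  // Made-up demo dates, deliberately out of order.
  val sampleDemos: Seq[LocalDate] = Seq(
    LocalDate.of(2015, 11, 12),
    LocalDate.of(2015, 11, 2),
    LocalDate.of(2015, 11, 5)
  )

  def main(args: Array[String]): Unit =
    println(meanDaysBetween(sampleDemos))
}
```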

Production Bug Count

Definition: The average number of bugs discovered in production, per release.

Measures: pre-production quality processes

Affected by: quality definitions, testing efforts, regression suites

Mechanics: Record each bug in a bug tracker, review each bug as it is reported to identify the release that introduced the bug.

Time to first passing functional test

Definition: The average number of work hours until the first non-unit test is written and passing.

Measures: Test Driven Development, early testing habits

Affected by: System design for testability

Mechanics: Record manually via task tracking tool.

Code Coverage

Definition: The percentage of the code covered by the automated testing suite. Best when broken down by cyclomatic complexity of target code. (simple code may be untested, but complex code should be well tested)

Measures: Testing Completeness, Change Risk

Affected by: team habits, system design for testability

Mechanics: configure CI to collect, team agrees to respond to these events.

Planning and Estimates

Estimated vs Actual time

Definition: Compare estimates given before starting to the actual time elapsed.

Measures: Accuracy of estimates

Affected by: team familiarity, developer assignment, external dependencies

Mechanics: record both values, review and discuss differences. Best when paired with ‘meterstick stories’ that can be used for comparison in lieu of raw hours.

Friday, February 13, 2015

The Highway Beautification Committee

The story goes that an old farmer was sitting on his porch one morning. He was looking down the state highway that ran in front of his house, and saw a truck off in the distance. As he watched, one of the people in the truck got out, and picked up a shovel from the back of the truck. They dug a hole by the highway, and got back in the truck.

A little bit of time passed, and the other person got out of the truck, and picked up the shovel. They filled in the freshly-dug hole, and returned to the truck. They pulled forward about 50 yards and did it all over again: dug a hole, waited, filled in the hole, and pulled forward.

The farmer watched these two all morning as they approached his house, until finally they were close enough for him to yell out, "Hey! What are you two doing out there?"

The first person yelled back, "We're on the Highway Beautification Committee!"

Puzzled, the farmer said, "Beautification? I don't think I understand."

"Oh, well, the guy who plants the trees is out sick today."

Wednesday, March 12, 2014

PragDave is wrong. And his advice is harmful.

Dave Thomas has posted that "Agile is dead, long live agility." There's some credibility assumed there, as Dave is a signer of the original manifesto itself, at the famous Snowbird meeting that spawned it.
He asserts that Agile itself is corrupted, and we should reject the word and all its related practices, education, literature, trade groups, and conferences.

But he's wrong.

These terms (Eco, Natural, and Agile) aren't an excuse to turn off your brain. You still have to know what Agile (or Eco or Natural) means in order to evaluate that claim. Expecting everyone who uses a term to use it in exactly the same way is ridiculous. Changing it from Agile to "agility" won't make any difference or distinction. The same shady consultants and book writers that create "Doing Agile Right" books and courses will just write "Programming with Agility the Right Way." The same companies that say "we are agile" while writing four-page user stories won't wake up and say, "oh, our practices don't exhibit agility", they'll continue to say it just the same. And you'll still have to evaluate their claims, just the same.

But he's not just wrong, he's harmful.

Dave suggests that agile conferences, trainings, and apparently all literature on the subject are counter to the original spirit of the manifesto. Maybe it is, I wasn't there so I wouldn't know. But I know that they wrote about teams coming together regularly to tune and adjust their behavior. If your conference doesn't feel like that, it's a bad conference. If your training doesn't feel like someone sharing their experiences so that you can use what works, it's a bad training. If you're ignoring that principle by requiring that your teams "do Scrum by the book", then changing from nouns to adjectives in titles and slides isn't going to fix that.

Rejecting the experiences of others isn't going to make you better at doing agile, being agile, or performing with agility. His advice to "just do these four things, and build up your experience," while an accurate general guide (that I like), ignores the fact that he has, himself, taken at least fifteen years to get to this point. Should we all just derive the fundamental principles of calculus ourselves as well? Or would it be better to talk together, as a team or as an industry, about what works and what doesn't?

He talks about "protecting" the word agile -- there is no practical way to do that, unless you want to trademark the term and sue those who use it in a way that you don't approve of. But I get the sense that Dave isn't a big fan of the Scrum Alliance either.

So what do we do? We keep being leaders, we keep sharing what works, we keep pointing out when the emperor or alliance has no clothes. What we don't do is mess around with meaningless semantics, and in the process reject an entire community and network who, on the whole, has changed the industry for the better and continues to push the envelope.

Sunday, August 25, 2013

How to get stakeholders addicted to attending demos

Does one of these characters sound like your business stakeholders?

Waterfall Warrior: "I don't have time to look at mid-sprint demos, it's not like anything changes if I do."

Optimistic PO: "I don't need to take time to look at the software mid-sprint, I already wrote down what I want."

The Waterfall Warrior is carrying the perception of non-agile projects -- all changes go through the Change Board, whose middle name is "Denial." Attending early demos is just a waste of time, as the news (bad or good) is the same whether she hears it every week, or just once at the end of the sprint.

The Optimistic PO believes in your team's perfection -- maybe too much! They don't yet see the problem that their original docs are full of ambiguity.

Both of these are a contract negotiation model. How do we get them to participate, and as the Manifesto says, favor customer collaboration over contract negotiation?

Nicotine.

Have your mid-sprint demos in a very warm room, and as they enter, secretly put a nicotine patch on them, and remove it when they leave. Eventually, they will become trained that early demos deliver a mild euphoria. If a few days pass with no demo, your stakeholders become anxious, and jittery. They come to the Scrum Master and ask, "Hey, can you give me a demo? I could really use a demo right about now."

There's one problem with this solution: it's felony assault. A better idea is to find something more addictive, and easier to deliver.

"Yes."

Saying "yes" to changes, with small or zero change in the cost, is like crack cocaine to stakeholders. But how do we magically make changes cheap or free? Timing. Typically work that is recently created is the cheapest to change. It has nothing built upon it, no disruption cost, and if we give preliminary demos before costly activities like final QA, less rework cost. Try to put yourself in a situation where you're often saying, "Yes, we can change that. No, it's not a schedule impact, I just built it this morning."

It's the role of the Scrum Master to help the team move to this "right time", but also to highlight with some fanfare what just happened. Enthusiastic personalities might say, "Wait, say that again? We just found a change and are going to do it, with no project impact? That's great!" You could say "groovy" here, but that's probably stretching the metaphor. Regardless of your personality, it's the role of the Scrum Master to be sure that the stakeholder realizes that this is different than it has been before, and it's not just luck. It's a result of their attendance. Done properly, a few days later the stakeholder will come to the Scrum Master and say:

"Hey man, you got any demos?"

Friday, December 21, 2012

Scalatron: The best language tutorial / learning environment ever made

Let me expand on how great the Scalatron initial experience has been.

Scalatron has this fantastic introduction in several steps. Each step produces a bot that does something, and by pressing a couple of buttons in the in-browser UI, you can see it work as well as tweak it yourself. Here's a screenshot:

On the left is the tutorial, in the middle is the code, and on the right is the sandbox that allows you to actually see your little 'bot running around, limited to what it's allowed to "see". All this is regular in-browser stuff, no plugins. And the highest praise I can give it is that it "just works". Edit the code in the middle, press 'run in sandbox' and it does. Compile errors? The build step pops up an error box at the bottom with line numbers. Smooth.

And then the interaction with the tutorial: Every block in the tutorial has a "load into editor" button that drops the code directly into the editor pane. Which means no copy/paste errors, and it works in blocks, so if you're working on the missile-launching section and you messed up the movement section with a half-baked idea, the tutorial lets you reset just one part. This is really polished, catching use cases like, "you asked to load it into the editor, but you haven't saved the work there. Save first?"

The tutorials were the foundation of the thoughts and opinions on the language that you can read in my previous post. They take you through the major language features by introducing real problems you need to solve: how do I parse a string? How do I re-use a function? What are vars and vals? None of it feels contrived just to make a point.

And this IDE has room for growth. There's a little [<<] button on the tutorial that lets you take off those training wheels, and continue on with your bot development. Save your work, give it a label, and load it into the Scalatron battle instance that loaded up when you started Scalatron. Instant gratification.

One complaint I have is that it's not very clear from this IDE how to restart the tournament without restarting the Scalatron process, and the process that is running your IDE is also running the tournament. It doesn't always interrupt the IDE, but when it does...

My other complaint is that there's no way to do TDD with it. That's a show stopper for me; if I write too much code without a test I start to get physically ill. So now it's time to get a Scala environment up and running. This has often been the stopping point for me in other languages/environments as I really have very little patience for cobbling together build scripts or downloading disparate parts. More on that in the next post.

Thursday, December 20, 2012

If Socrates was a Scrummaster

This blog moved to my company's site: http://www.davisbase.com/if-socrates-was-a-scrummaster/

Thanks for reading!

Wednesday, December 19, 2012

Getting Scala with Scalatron

I've always liked Scala, but like many of my early dating experiences, it has been from afar... wistfully looking at Scala projects on github, thinking that one day I'd take the plunge and ask one of those nice projects out.

Then I found a geek that was easy for me to connect with: Scalatron. Here ends the dating analogy.

Scalatron is a framework for writing competitive AI bots that compete in an online arena. But Scalatron is also super easy (hey, the dating analogy is over, remember?). Download the zipfile, extract it, type java -jar Scalatron.jar and jump into the very nice tutorials, and get quick visual feedback on the code you're writing. Plus, it's fun and you can kill stuff.

Scala's Goals

Scala's goals remind me a lot of Java's goals when it first came out. Java was a Real OO language, where everything is possible (mostly), but the easiest way to do things was the OO way. At least as compared to C++ or interpreted languages at the time. I think it's maybe being eclipsed by C#, but that's a different article.

Scala's goals are to be a Real Functional language where everything is possible, but the easiest way to do things is the Functional way. So it has all the tools to make it really obvious when you're being non-functional. val vs var, for example really prompts you to say, "Hey man, why you gotta go and make this mutable? You trying to make problems for yourself?"

And it wants to have a type system that doesn't get in the way, or leave you in the dark. This addresses my big complaint with dynamic languages. Dynamic languages are nice right up until you're working with unfamiliar codebases, which unfortunately is immediately. The Ruby framework routinely made me upset by having no types defined at all, not even named. What does this method take? A hash. Ok, not helpful. So I open up the code and ask, "what does it do with that hash? Oh, it passes it to another method. great." And pretty soon you figure out that it's hashes all the way down.

"But Kevin!" the dynamic programmers say, "That's the flexibility! You can pass it anything, that's the point!" Anything? Yeah, what if I pass it some drek? Will it like it? I suspect it won't be good for either of us.

Type Inference to Tuples

The middle ground. Strong types and a compiler that cares (vs leaving us to write unit tests to do what a typing system could do for us) but no more of Collection collection = collection collectioncollectioncollectioncollectioncollection. It's interesting to work with a type inference system, because it's such a perfect window into why we had no inference before. I find myself looking around the code saying, "Wait, what type is this again?" The compiler knows, but I don't. It's like one of those logic puzzles that is 10 steps deep in transitive logic. And if I were writing code in vi or something, it would be a pain. But I don't do that anymore. I'm not using an IDE right now, but I'm sure that will be my main way of working, and I have full confidence that the Scala IDEs will let me hover over something and see its type.

But Functional programming's no-side-effects rule means that you're frequently going to have multiple return values, which in many languages generates a lot of types. But this is Functional programming, so Tuples are built right in with both type inference and static constructors to keep the code from being redundant. Syntactic sugar? That's a funny way to spell 'readability'.

val myTuple = ("something", "somethingElse")

Except I do have a complaint: _1 and _2 aren't particularly good names for the contents of a Tuple. Time will tell if this is a problem, but I suspect it might be. The typing system will tell me I get three strings, but what are they? There's some more syntax so that you can assign things directly to meaningful names, which is nice:

val (firstName, lastName) = someTupleReturningFunc()

But how did I know what that function did in the first place? Especially if it's cleverly passing tuples from its component parts? Ideally I'd like for the API to be discoverable, and type information isn't always enough.

I guess at that point you create an object (via Scala's object keyword, which is distinct from class) and give everything proper names. I read online about using a case class to extend a tuple, which looks nice but apparently has some real complications. I'm just a novice, of course, perhaps there's a better solution still that doesn't fall all the way back to making pojo beans.
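For comparison, here is a minimal sketch of the "give everything proper names" route using a plain case class (not one that extends a tuple). `splitNameAsTuple`, `PersonName`, and `splitName` are names invented for this example.

```scala
object NamedResults {
  // Hypothetical function that returns an anonymous tuple.
  def splitNameAsTuple(fullName: String): (String, String) = {
    val parts = fullName.split(" ", 2)
    (parts(0), if (parts.length > 1) parts(1) else "")
  }

  // The same data, but the fields carry meaningful names.
  final case class PersonName(firstName: String, lastName: String)

  def splitName(fullName: String): PersonName = {
    val (first, last) = splitNameAsTuple(fullName)
    PersonName(first, last)
  }

  def main(args: Array[String]): Unit = {
    val tuple = splitNameAsTuple("Ada Lovelace")
    println(tuple._2) // the caller has to know what _2 means

    val name = splitName("Ada Lovelace")
    println(name.lastName) // the API says what it is
  }
}
```

The case class costs one extra line of declaration, and in exchange the return type documents itself at the call site.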

The first touch that I have with real Functional programming is almost always with some kind of mapping function. And that's even the famous Function Heard Round The World, Google's MapReduce. I like how it looks in Scala, returning a Tuple for a given input.
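A small illustration of that shape, with made-up names (`MapToTuples`, `wordLengths`): each input element is mapped to a tuple, and the compiler infers the tuple type of the lambda.

```scala
object MapToTuples {
  // Map each word to a (word, length) tuple.
  def wordLengths(words: Seq[String]): Seq[(String, Int)] =
    words.map(word => (word, word.length))

  def main(args: Array[String]): Unit =
    wordLengths(Seq("map", "reduce")).foreach(println)
}
```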

No return statements? What about readability and accidents?

And here again is an "aha" moment for me. The Scala 'way' is to have no return statements. This is an element of consistency when looking at quick function definitions. And Scala, like Javascript, loooves functions. And if most functions are going to be one-liners, having a return keyword in there is just clutter. So the quick-closure definition of a function doesn't use the return keyword. And neither should regular functions.
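To make that concrete, a short sketch (the function names are arbitrary): in each form below, the value of the last expression is the result, with no return keyword anywhere.

```scala
object NoReturn {
  // One-liner: the expression is the result.
  def double(x: Int): Int = x * 2

  // Multi-line body: the last expression of the block is the result.
  def describe(x: Int): String = {
    val doubled = double(x)
    if (doubled > 10) "big" else "small"
  }

  // Anonymous function written in the same style.
  val triple: Int => Int = x => x * 3

  def main(args: Array[String]): Unit = {
    println(double(4))
    println(describe(6))
    println(triple(3))
  }
}
```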

Language Habits: names for everything

I can see that my Java habits are going to give my initial Scala code a strong accent. Like the charming Eastern European habit of dropping articles like "a" and "the", I'm going to probably name lots of stuff that I don't need to in the beginning. Right from Scalatron's example:

val rest = tokens(1).dropRight(1)               
val params = rest.split(',')                    
val strPairs = params.map(s => s.split('='))    
val kvPairs = strPairs.map(a => (a(0),a(1)))   
val paramMap = kvPairs.toMap

That's familiar code right there. But we don't need all those names, technically. The way all the cool kids are doing it these days is this:

val paramMap =
    tokens(1)
    .dropRight(1)
    .split(',')
    .map(_.split('='))
    .map(a => (a(0),a(1)))
    .toMap

Get off my lawn! I mean, uh, what a charming way to string everything together. But what it represents here is the Functional point of view. These are all simultaneous operations we are invoking on the original value. The intermediate names aren't useful. I have to admit I'm not convinced. But then I can't explain to my Lithuanian friends why "a" and "the" are important, either.

Function scoped Functions

And hey, if we're going to have functions that we pass around like variables, there's no reason we can't have locally-scoped functions, is there? In fact, this is pretty key. In OO, methods exist to change the state of the object. So we had only a couple of levels of visibility of methods. If you felt like you wanted to hide some methods from some other methods, this was a pretty good hint that you actually need to divide the object into parts.

But now we don't usually have state to deal with, and we're going to be using a LOT more functions. So there you go, dawg.
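A minimal sketch of a function-scoped function, reusing the key=value parsing idea from the Scalatron example. `parseParams` and `toPair` are names chosen for this illustration, and the sketch assumes every segment contains an `=`.

```scala
object LocalFunctions {
  // Parses "key=value,key=value" into a Map.
  def parseParams(input: String): Map[String, String] = {
    // toPair is visible only inside parseParams; nothing else can call it.
    def toPair(segment: String): (String, String) = {
      val parts = segment.split('=')
      (parts(0), parts(1))
    }
    input.split(',').map(toPair).toMap
  }

  def main(args: Array[String]): Unit =
    println(parseParams("name=bot,energy=100"))
}
```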

Objects vs Classes vs Case Classes.

This distinction seems a little fine on first blush. First, terminology. Just like in Java and other OO languages, objects are things that are instantiated, and classes are definitions and names for things that you may choose to instantiate (or not, I suppose). And Case Classes are a special thing that ... some people apparently hate. I think that's where I'm going to have to pick up next time.

Friday, November 30, 2012

Real Discoverability

An important aspect of languages and toolkits is discoverability. The ability to answer questions like:

What other code (including display code, configuration) uses this method/class?

What code or behavior does this configuration option reflect?

Where is this text displayed?

Where is this data from this DTO used? What business logic does it impact, or is it just display?

This is a property that dynamically typed languages can lose as a result of that dynamic typing. (That's not to say that statically typed languages preserve it in all, or even most, cases -- more on that later). One way to think about it is to search for string matches on "getName(" and compare that result with the reality of the codebase. Some problems will include:

False positives: the text search picks up calls to methods on other objects that happen to have the same name

Stupid false negatives: you need to be a little sophisticated to get both "methodName(" and "methodName (" etc.

Method overloading causes more problems: If you're looking for the method that takes a string, not the one that takes a Customer, that's tricky to do by text searching alone. You'll have to go through them by hand.

Metaprogramming totally wrecks your string search. Assigning the method to a variable then passing that off to something else to be executed hoses your text search.
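Here is a toy demonstration of the first two problems and the last one. The four "source lines" are invented; the question being asked is "where is the customer's getName used?"

```scala
object TextSearchDemo {
  // A tiny made-up codebase, one string per source line.
  val lines: Seq[String] = Seq(
    "customer.getName()",        // a real use
    "product.getName()",         // same method name on an unrelated object
    "customer.getName ()",       // a real use, with a space before the paren
    "val f = customer.getName _" // a real use, passed around as a function value
  )

  // Naive text search: keep the lines that contain the literal needle.
  def naiveMatches(needle: String): Seq[String] =
    lines.filter(_.contains(needle))

  def main(args: Array[String]): Unit =
    naiveMatches("getName(").foreach(println)
}
```

The search returns two lines. One of them is the false positive on `product`, and two of the three real uses are missed entirely.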

Statically typed languages aren't a whole lot better, actually. As soon as you add an interface, you've abstracted the call to the method. Even though it's a useful abstraction, you can't tell what happens in practice. And static languages' reflection attributes wreck things as surely as function pointers.

What if the virtual machine recorded things for you, and piped that data back to your IDE? Assuming you're running this in either production or a test environment that has good coverage, that could make it clear what paths are frequent and what sections are never called -- in production meaning they're unused, and in test code meaning untested. For example, a Ruby IDE could use this to generate many of the features of statically typed languages, like automated refactoring, and to the point of this article: call hierarchy discovery, without all the headaches above.
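No virtual machine I know of hands this data to an IDE today, so the following is only a hand-rolled sketch of the recording idea at the application level. `CallRecorder`, `recorded`, and `neverCalled` are hypothetical names, and a real version would need to be thread-safe and far less intrusive.

```scala
import scala.collection.mutable

object CallRecorder {
  // How many times each named call site has run.
  private val counts = mutable.Map.empty[String, Int]

  // Runs `body` and records that `name` was invoked.
  def recorded[A](name: String)(body: => A): A = {
    counts(name) = counts.getOrElse(name, 0) + 1
    body
  }

  def callCount(name: String): Int = counts.getOrElse(name, 0)

  // Of the names we know about, which ones never ran?
  def neverCalled(known: Set[String]): Set[String] =
    known.filter(name => callCount(name) == 0)

  def main(args: Array[String]): Unit = {
    def getName(): String = recorded("getName") { "Kevin" }
    getName()
    getName()
    println(callCount("getName"))
    println(neverCalled(Set("getName", "getAddress")))
  }
}
```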

This same approach might be viable at the framework level. Rather than trying to derive from the code and the configuration where data is displayed, use testing or production to record it in practice. The former is like proving code correct -- technically possible, but very very difficult. The latter is much like the real-world testing we do: rather than trying to construct a proof of correctness of the button, just push the button and see what happens. In this case, we record what did happen and use it to inform the rest of the system and our tools.

Thursday, July 05, 2012

Acceptance Testing: what is it good for?

Overview

Acceptance Testing is an approach to providing fast feedback based on business scenarios. It helps teams avoid the brittleness and long delays associated with automated GUI testing. I'll also look at a specific tool, Gherkin, and explore in detail how its approach allows us to build automation steps that are business-readable, very reusable, and potentially even allow non-technical people to author tests with no development involvement.

What puts the "Acceptance" in "Acceptance Testing?"

First, let's get the context right: the word Acceptance here is the same as the Acceptance in Acceptance Criteria, the "back of the card" additional details on a User Story that the team needs to estimate the work. It's unrelated to the Acceptance in User Acceptance Testing, except maybe at some overarching goal of making people happy.

The Acceptance in Acceptance Criteria does relate to the ceremony at the end of the work where the Product Owner decides whether to accept the work or not. Our Acceptance Criteria then, are the Criteria that the Product Owner will use to make that decision. (Or at least some of the criteria -- not intended to be a contract).

Acceptance Testing is expressing those criteria as tests. At this level, it could be anything that the Product Owner can dream up. All of that should be tested, but not all of it can be automated. This puts it squarely in Q2 of the testing quadrants that are at the heart of Agile Testing by Lisa Crispin and Janet Gregory:

Behavior Driven Development with Gherkin

With all this in mind, let's look at a User Story, some Acceptance Criteria, and a Gherkin test. We'll use the User Story format of "As a .. I want to ... so that ..", and add Acceptance Criteria as bullet points afterward.

As a nurse,
I want to scan a patient's id and get a list of prescribed medications
so that I give the correct medication to the correct patient

- Patient's name and current room number are displayed
- List of all currently prescribed medications is displayed, ordered alphabetically
- Medications already administered are indicated in red
- If the patient is not found, a message is displayed with the info from the scanned id

Let's look at another format, the Gherkin BDD format.

Given precondition
When actor + action
Then observable result

This standard format, like the standard User Story format, is a lot like the tool pegboard that you have in your garage: It tells you where to put things so you can find them quickly, and tells you when something is missing.

Extending the garage metaphor, Gherkin itself is only a tool -- it can be used for many purposes including Unit Testing, GUI testing, load testing, etc. Let's look at how to use Gherkin to create Acceptance Tests from our Acceptance Criteria.

Our first criterion is that the patient's name and current room number are displayed. Let's first re-word that in the Gherkin format, as what Gherkin calls a Scenario:

Given a patient is in a particular room
When the nurse scans the patient's id
Then the patient's name and room number are displayed.

Okay, big whoop. This isn't very helpful -- it's just wordier. When the actor and action are the same as in the user story, the Scenario isn't very interesting. But let's add some specific examples to it:

Given a patient with id 1234 named Kevin Klinemeier is in room 305B
When the nurse scans patient id 1234
Then Kevin Klinemeier and 305B is displayed

Now that we have specifics in place, this is an Acceptance Test. This looks like the kind of thing we could actually automate, and it begins to suggest some other scenarios: What happens if the wrong patient id is scanned? Can there be two patients with the same ID? Can there be two patients in the same room? But first, this is just English -- how does it get automated?

Each step in the BDD Scenario is implemented by a developer. If we were working with a user interface, it might look like this:

First choose the variable parts, marked here with square brackets:

Given a patient with id [1234] named [Kevin Klinemeier] is in room [305B]

Then write some "wiring and glue" code to put the variables into place, written here as UI-automation pseudocode:

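 // UI-automation pseudocode: window and writeIntoTextBox stand in for whatever GUI driver is in use
 public void givenAPatient(String patientID, String patientName, String roomNumber) {
   window.open(PATIENT_ENTRY_SCREEN);
   writeIntoTextBox("patient_id", patientID);
   writeIntoTextBox("patient_name", patientName);
   writeIntoTextBox("room_number", roomNumber);
 }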

Great, that makes sense. Except it stinks! Our goal as Agile Testers is to provide fast feedback, right? But now we can't run our test until the GUI is finished. Furthermore, automated GUI testing is itself notoriously brittle. Instead of a simple writeIntoTextBox method, we're more likely to have something that looks like this:

 writeIntoTextBox("xpath:=/html/body/div/div[2]/div/div/form/label/input[1]",patientID);

That's XPath in there, and it's specifying an input box at a particular place on the page. If the page changes the order of the fields, the test breaks. If it changes the location of the fields, the test breaks. If it removes something that is above the fields, the test breaks. You get the idea.

Instead, we avoid the problems of GUI automation by testing in the "middle". The unit test level is too early, and as a result is too technical and doesn't speak to the business. The GUI layer is too late: it doesn't give the team feedback on both behavior and design at a time when problems can still be resolved. The sweet spot is to test services, and hence to design services for this kind of testing. This is best described by Mike Cohn's testing pyramid, which puts a small number of GUI tests at the top, service tests in the middle, and a broad base of unit tests at the bottom.

Part of the feedback we are providing is on whether a given software design is testable. There are many roadblocks to testability in the GUI, but far fewer at the service level. Furthermore, when we provide these BDD Scenarios as the starting place for testing at the service level, our services as a result are not only testable, but based on the underlying business concepts, and that's just what we want out of a good service layer.

To complete the example, let's look at a service-layer implementation for the step we have above:

Given a patient with id 1234 named Kevin Klinemeier is in room 305B

 public void givenAPatient(String patientID, String patientName, String roomNumber) {
    patientEntryService.createPatient(patientID, patientName, roomNumber);
 }

When the nurse scans patient id 1234

 public void scanPatientId(String patientID) {
    // result is a field, so the Then step can inspect it
    result = patientScanService.scan(patientID);
 }

Then Kevin Klinemeier and 305B is displayed

 public void shouldContainPatientNameAndRoomNumber(String patientName, String roomNumber) {
    // assertThat here assumes an AssertJ-style fluent assertion library
    assertThat(result.getName()).isEqualTo(patientName);
    assertThat(result.getRoomNumber()).isEqualTo(roomNumber);
 }
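To make the service-layer idea concrete, here is a minimal self-contained sketch of the three steps above running against an in-memory stand-in. The names Patient, InMemoryPatientService, and PatientScanSteps are hypothetical and not from any real system, and a real project would wire these methods to Cucumber step annotations rather than call them by hand.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical patient record, for illustration only.
class Patient {
    final String id;
    final String name;
    final String room;

    Patient(String id, String name, String room) {
        this.id = id;
        this.name = name;
        this.room = room;
    }
}

// Hypothetical in-memory stand-in for the patient entry and scan services.
class InMemoryPatientService {
    private final Map<String, Patient> patients = new HashMap<>();

    void createPatient(String id, String name, String room) {
        patients.put(id, new Patient(id, name, room));
    }

    // Returns null when no patient has the scanned id.
    Patient scan(String id) {
        return patients.get(id);
    }
}

// One method per Gherkin step; state is shared through fields.
class PatientScanSteps {
    private final InMemoryPatientService service = new InMemoryPatientService();
    private Patient result;

    // Given a patient with id <id> named <name> is in room <room>
    void givenAPatient(String patientID, String patientName, String roomNumber) {
        service.createPatient(patientID, patientName, roomNumber);
    }

    // When the nurse scans patient id <id>
    void scanPatientId(String patientID) {
        result = service.scan(patientID);
    }

    // Then <name> and <room> is displayed
    void shouldContainPatientNameAndRoomNumber(String patientName, String roomNumber) {
        if (result == null
                || !result.name.equals(patientName)
                || !result.room.equals(roomNumber)) {
            throw new AssertionError("expected " + patientName + " in room " + roomNumber);
        }
    }
}
```

Calling givenAPatient("1234", "Kevin Klinemeier", "305B"), then scanPatientId("1234"), then shouldContainPatientNameAndRoomNumber("Kevin Klinemeier", "305B") on one PatientScanSteps instance plays the Scenario end to end, with no GUI involved.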

BDD and Building Blocks

We now have one working example: the patient scan. It looks like maybe a lot of work. The exciting part of BDD, and the thing that makes this all a scalable part of long-term software development processes, is what happens when we move on to the next Scenario. Let's pick one of the other Scenarios we came up with before: scanning a patient that doesn't exist. First let's write it with Given .. When .. Then:

Given a patient with id 5678 named Joe Justice is in room 205A

When the nurse scans patient id 0001

Then an error is displayed

The first two steps (the Given and the When) are free! Well, we already paid for them, but they're free to re-use in this new scenario. The only new wiring to write is looking at the error that is returned.

Once we have a few of these basic building blocks, we can really take that development effort and amplify it through everyone else on the team who can create these near-English Scenarios: QA, BA, Product Owner, perhaps even Customers and Users. As an example, with just the four different steps we've completed so far, we can create and automatically run all the following test Scenarios without further development help:

Scenario: Duplicate patient ID
Given a patient with id 2838 named Bill Bates is in room 203D
And a patient with id 2838 named Clay Cummings is in room 405F
When the nurse scans patient id 2838
Then an error is displayed

Scenario: Multiple patients in room
Given a patient with id 8432 named Amelia Anderson is in room 403B
And a patient with id 9392 named Nelly Newborn is in room 403B
When the nurse scans patient id 8432
Then Amelia Anderson and 403B is displayed
When the nurse scans patient id 9392
Then Nelly Newborn and 403B is displayed

These scenarios might also drive discussion: should we be able to tell the difference between an error for a patient id that is missing vs an error for a patient id with multiple entries? And that's just the kind of business-level input we are looking to provide feedback for, delivered before the GUI is even started.
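The building-block reuse described above can be sketched as a tiny step runner: each step is registered once as a regular expression, and any Scenario text that matches is run without new code. This is a simplified illustration, not how Cucumber is implemented. MiniStepRunner and its in-memory patient store are hypothetical, and treating "zero or several patients for an id" as the error condition is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MiniStepRunner {
    // Step pattern -> handler. The capture groups are the "variable parts".
    private final Map<Pattern, Consumer<Matcher>> steps = new LinkedHashMap<>();
    // Hypothetical store: patient id -> list of {name, room} entries.
    private final Map<String, List<String[]>> patients = new HashMap<>();
    // Result of the most recent scan.
    private List<String[]> scanned = new ArrayList<>();

    MiniStepRunner() {
        steps.put(Pattern.compile("a patient with id (\\d+) named (.+) is in room (\\w+)"),
            m -> patients.computeIfAbsent(m.group(1), k -> new ArrayList<>())
                         .add(new String[] {m.group(2), m.group(3)}));
        steps.put(Pattern.compile("the nurse scans patient id (\\d+)"),
            m -> scanned = patients.getOrDefault(m.group(1), new ArrayList<>()));
        // Assumption: a missing id and a duplicated id both count as an error.
        steps.put(Pattern.compile("an error is displayed"),
            m -> check(scanned.size() != 1, "expected an error"));
        steps.put(Pattern.compile("(.+) and (\\w+) is displayed"),
            m -> check(scanned.size() == 1
                    && scanned.get(0)[0].equals(m.group(1))
                    && scanned.get(0)[1].equals(m.group(2)),
                "expected " + m.group(1) + " in " + m.group(2)));
    }

    // Runs a Scenario given as newline-separated Given/When/Then/And lines.
    void run(String scenario) {
        for (String line : scenario.split("\n")) {
            // Drop the keyword; the remaining text selects the step.
            String text = line.trim().replaceFirst("^(Given|When|Then|And) ", "");
            if (text.isEmpty()) {
                continue;
            }
            boolean matched = false;
            for (Map.Entry<Pattern, Consumer<Matcher>> step : steps.entrySet()) {
                Matcher m = step.getKey().matcher(text);
                if (m.matches()) {
                    step.getValue().accept(m);
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalStateException("No step matches: " + line);
            }
        }
    }

    private static void check(boolean ok, String message) {
        if (!ok) {
            throw new AssertionError(message);
        }
    }
}
```

With these four registrations in place, new Scenarios such as the duplicate-ID case are just new text passed to run; only a sentence that matches no pattern needs a developer.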

Summary

Acceptance tests that express business concepts are powerful, but automation at the GUI level is problematic, and waiting for the GUI layer to be built introduces long delays for testing. Instead, provide faster feedback by automating these tests at the "middle" layer (service layer) where the business concepts can be expressed, but the GUI details are not yet in the way.

Using the BDD Gherkin format produces tests that are not only automatable, but also built from small building blocks that reduce the overall cost of automation and allow non-technical team members to extend the automated test suite.

Wednesday, April 11, 2012

BDD and big datasets

One of the challenges that I hear in my classes on Agile Testing is around Behavior Driven Development and big datasets. The intro to a lot of BDD tools looks something like this:

Given a customer named 'John Smith' who is 45 years old
When I execute a check for retirement eligibility
Then the result should be false

This is, of course, the Gherkin language from Cucumber, which has implementations in many languages including Ruby, Java, and .NET. It's pretty compelling -- it succinctly describes preconditions, actions, and expected results. Until you think about applying it, then the trouble starts.

In this post, I'll talk about big datasets. In the above example, assuming that we can insert a customer with only two attributes (name and age) is what stands out as being too simple. In practice, our customer records have dozens or in some cases almost a hundred fields. Gherkin has the ability to do tables by using the vertical pipe symbol, so do we create something like this?

Given a customer with these fields:
| name | age | street 1 | street 2 | city | state | zip | ssn | credit card | ccExpiry | signupDate | currentBalance | blah | blah | blah |
| john smith | 42 | 4823 third | | seattle | wa | 98173 | 592-93-5382 | 4324 4322 3345 2838 | 11/15 | 04/12 | 438.27 | moo | beep | foo |

When I execute a check for retirement eligibility
Then the result should be false

I hope not, because that stinks. It pretty quickly strips away most of the readability benefit that we were getting from this tool in the first place. From a data point of view, it's also a mess because the tabular format requires us to flatten a lot of our dataset, making it harder to maintain. And from a communication point of view, it requires us to really pan for gold -- some of these columns have an impact on whether the customer is eligible for retirement, and some of them are just boilerplate junk that we have to provide in order to create a customer. Which is which?

The distinction (impactful vs boilerplate) is especially important when we try to maintain the test. If 80% of the fields are boilerplate, we get into the habit of changing data to make the test pass, which makes it very easy to make a 'maintenance' change to the test that actually makes the test invalid. In our example above, it might be reasonable to assume that only name and age matter. But what if address matters, too? It's possible that retirement eligibility (whatever that means) varies by state or county. We need to remove the boilerplate attributes from the test, and have the test focus only on the details that impact the outcome.

The first step is to listen. How do our product owner or businesspeople talk about the customers? I mean beyond calling them big fat jerks when they're at the bar after work. Are there certain classes of customers that we can specify? Let's try that:

Given a West Coast customer aged 45

When I execute a check for retirement eligibility
Then the result should be false

But what does West Coast customer mean? This definition is contained in the step definition, and includes a set of Reasonable Defaults. Every customer needs a name, so the system makes one. And within the context of our West Coast Customer class, it makes up an address in one of the western coastal states. This makes it clear what the real dependencies are in our test: The address should be somewhere on the West Coast, and a particular age. Everything else about the customer is irrelevant to this test.

On the step definition side, there are some decisions to be made. First is what to parameterize. In the beginning, do the Simplest Thing That Could Possibly Work and parameterize nothing but the age:

@Given("a West Coast customer aged (.*)") ...

If, as I would expect, we end up with several different additional datapoints that we want to 'override' about our West Coast Customer, then the Builder pattern will be useful. Specifically Builder rather than Factory, so that we avoid combinatorial explosion of the options for overrides. The WestCoastCustomerBuilder might look like this after some time:

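public class WestCoastCustomerBuilder {
    // Bodies elided: each setter overrides one default, and the build step inserts the customer.
    public void BuildAndInsertCustomer() { /* ... */ }
    public void SetAge(int age) { /* ... */ }
    public void SetSSN(String ssn) { /* ... */ }
    public void SetCurrentBalance(double balance) { /* ... */ }
}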

And here's how it might be used in several step definitions:

@Given("a West Coast customer aged (.*)")
public void BuildWestCoastWithAge(int age) {
    builder.SetAge(age);
    builder.BuildAndInsertCustomer();
}

@Given("a West Coast customer aged (.*) with ssn (.*)")
public void BuildWestCoastWithSSN(int age, String ssn) {
    builder.SetSSN(ssn);
    builder.SetAge(age);
    builder.BuildAndInsertCustomer();
}

This shows the re-use of the builder. Critics will point out that we could also just have the second definition call the first, or avoid the need altogether with an optional clause in the step definition. True, but that wouldn't provide as good an example of re-use.

The defaults could either be static data (all WestCoast addresses are my house), a random selection of a given dataset (choose a random address among these 50) or a pure random generation (make up an address on a numbered street in a west coast city). As with all things, do the simplest first and see if that works for you.
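As a rough sketch of the simplest option (static defaults), a builder along these lines supplies every boilerplate field itself and lets a test override only what matters. The Customer fields and all default values are made up for illustration. The method names use Java camelCase and return the builder for chaining, and the in-memory list stands in for a real insert; all three are choices of this sketch rather than something the snippets above prescribe.

```java
import java.util.List;

// Hypothetical customer record; the field list is an assumption.
class Customer {
    final String name;
    final int age;
    final String state;
    final String ssn;
    final double currentBalance;

    Customer(String name, int age, String state, String ssn, double currentBalance) {
        this.name = name;
        this.age = age;
        this.state = state;
        this.ssn = ssn;
        this.currentBalance = currentBalance;
    }
}

class WestCoastCustomerBuilder {
    // Reasonable defaults as static data; every value here is invented.
    private String name = "Default Customer";
    private int age = 30;
    private String state = "WA"; // any western coastal state would do
    private String ssn = "000-00-0000";
    private double currentBalance = 0.0;

    // Each setter overrides one default and returns the builder for chaining.
    WestCoastCustomerBuilder setAge(int age) {
        this.age = age;
        return this;
    }

    WestCoastCustomerBuilder setSSN(String ssn) {
        this.ssn = ssn;
        return this;
    }

    WestCoastCustomerBuilder setCurrentBalance(double currentBalance) {
        this.currentBalance = currentBalance;
        return this;
    }

    // Builds the customer and "inserts" it; the list stands in for the real system.
    Customer buildAndInsertCustomer(List<Customer> store) {
        Customer customer = new Customer(name, age, state, ssn, currentBalance);
        store.add(customer);
        return customer;
    }
}
```

A step definition for "a West Coast customer aged 45" would then reduce to new WestCoastCustomerBuilder().setAge(45).buildAndInsertCustomer(store), and the test text names only the inputs that affect the result.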

In summary: whenever our 'required' datasets start to harm the readability (and hence maintainability) we should refactor the test to define only those inputs that affect the observed result. All the boilerplate should be handled by generalized step definitions that are capable of supplying reasonable defaults. This ensures that the test stays clear and focused, while creating clear and reusable step definitions.

Tuesday, February 02, 2010

Relative estimates are like a stagecoach

If you're having trouble getting across the idea of relative estimates (story points) vs velocity to your management, try this approach:

Estimate stories using a distance measure. Furlongs for fun, miles if you're a boring American, kilometers if you're a boring person on the rest of the planet.

Your team velocity is measured in furlongs-per-sprint. This makes clear the separation between how much work needs to be done and how fast a team can complete it. It calls out the knobs that management can turn immediately: go faster, do less work.

For the vehicle, consider a stagecoach. Every new horse added to a stagecoach adds to its potential speed. Four horses is better and faster than one horse, but not four times better. There's a point where you're better off with two stagecoaches.

Breakdowns:

Another popular analogy is rocks-in-a-box. Your velocity is called capacity, and described as a box for the Product Owner to fill with rocks. Rock size is estimated by the team. Lots of small rocks do a better job of filling up the box. This point, that small well-understood stories are easier to complete, is missing from the stagecoach analogy. (If the miles are short, they go by faster? Doesn't quite work.)

Monday, November 09, 2009

Questions to ask when you want something

Can I have it now?

No.

Can you tell me when I can have it?

No.

When can you tell me when I can have it?

I don't know.

Do you need something from me?

No.

Bullshit.

Maybe this should be a flowchart.

Wednesday, August 26, 2009

Testing with Moxy

Overview
Build proxies that pass through most requests, but create mock results on well-known data.

Our situation
We needed to be able to verify the behavior for error conditions from our Selenium (web) tests. We had already created mock services that would return these results. We then had to manage when we connected to which endpoint, which put the expectation about behavior outside the test and in the server configuration. In order to do a single regression pass, we had to stop and reconfigure the server.

Our solution
We created Moxies (mock proxies), which most of the time are just a plain passthrough to the real endpoint from our vendors' test systems. For a well known set of data (addresses in Broken Arrow, Oklahoma) the Moxy returns error conditions. Those addresses and their expectations are defined in a single class, shared both by the Moxy implementation and the tests.
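The idea can be sketched in-process as follows. This is only an illustration: the real Moxies sat in front of HTTP endpoints, while this sketch wraps a plain Java interface. AddressService, KnownErrorData, AddressMoxy, and the error text are all hypothetical names invented for the example.

```java
// Hypothetical vendor-facing service; the single method is an assumption.
interface AddressService {
    String lookup(String address);
}

// Shared by the Moxy and by the tests, so data and expectations stay in sync.
class KnownErrorData {
    static final String ERROR_CITY = "Broken Arrow, OK";
    static final String MOCK_ERROR = "ERROR: address could not be verified";

    static boolean triggersError(String address) {
        return address.contains(ERROR_CITY);
    }
}

// Mock proxy: passes most requests through, fakes failures for well-known data.
class AddressMoxy implements AddressService {
    private final AddressService realEndpoint;

    AddressMoxy(AddressService realEndpoint) {
        this.realEndpoint = realEndpoint;
    }

    @Override
    public String lookup(String address) {
        if (KnownErrorData.triggersError(address)) {
            // Well-known test data: return the mock error without calling the vendor.
            return KnownErrorData.MOCK_ERROR;
        }
        // Everything else goes to the real endpoint unchanged.
        return realEndpoint.lookup(address);
    }
}
```

A test that needs the failure path uses an address containing KnownErrorData.ERROR_CITY and asserts on KnownErrorData.MOCK_ERROR; every other test talks to the real endpoint through the same configuration.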

Benefits
Test real success and mock failure in same server configuration.
Data and its expectations are in sync (shared classes).
Code is explicit about which tests depend on which Moxies, and which specific behaviors.

Future
With the same solution we expect to create a set of test cases that do not rely on the availability of the vendor test systems, based on similar conventions (for example, addresses in Eureka, CA).

Other implementations
We considered using wrapper objects injected via Spring to accomplish the same thing with less monkeying around with endpoints. This would avoid needing to deploy a real http mock service, and avoid issues around certificate verification and authentication. However, an injected solution would require deploying the mock software on all the client machines, and tweaking its configuration so that the wrapper bean is injected in our test environment, but not production. The injection solution also doesn't test the parsing of the error response, though that should probably be done in unit tests anyway.

If we were using an ESB like Mule, we could have configured Mule to redirect the requests based on the data, and used our mock services unchanged, with the test system configuration also unchanged.

Some solutions avoid having to use key data in the request object by manipulating http header info instead. This wasn't an option for us, unless we create some kind of signal for that header all the way through to our public pages. That seemed too invasive for our situation.

Closing
It works for us, and has a cool name*. What more could you want?

* Yes, you could probably more accurately describe this as a decorator. But Mockorator isn't nearly so awesome a name.

Monday, August 03, 2009

Leading Self Organized Teams

Self Organization, Collaboration != Fire Your Leadership

One of the questions I've tackled with the agile teams I've worked with is how to find a balance between the need for direction and quick decisions with the open and self-organizing approaches that are recommended for teams practicing agile processes. A lot of this perceived conflict comes from people overstating the basics. In particular:
Self Organization != No Leadership. Collaborative Approach != Everyone Votes On Everything.

Core Values:

Self Organization: Invite owners and participants rather than assigning people to teams

Transparency: Discuss topics openly, rather than among a separate team.

Collaboration: The point of the self-organized group

Direction: The owner is responsible for driving to dates, providing major guidance (sometimes from above), and deadlock resolution.

Review: Periodic review of practices and applications is key to success.

Example: Spacely Sprockets Quality Issue

Problem: All the sprockets we've got are throwing NullPointerExceptions.

The Development Director in this case wants to delegate this task to the team. In order to do that, she asks for volunteers to be the "owner" for this issue, and Olaf steps up. After consulting with the Development Director for parameters (due dates, budget, etc) the first thing Olaf does is send an invitation: (Self Organization)

To: yourWholeTeam
Subject: Spacely Sprockets Quality Issue
Hello Team,
We need to determine whether to stay with Spacely Sprockets, change to Cogswell Cogs, or pursue some third option. If you're interested in participating, send me an email and I'll include you in tomorrow's meeting.
-Olaf

The group meets a couple of times (Collaboration), trades emails (via whole team mailing list, for Transparency) and works towards a recommendation. One participant suggests Gary's Gears, but Olaf shares that Gary's Gears are outside the budget for the project (Direction). Absent that option, the group finds consensus on staying with Spacely after an impassioned speech by George J., one of their salesmen. Olaf shares that recommendation with the rest of the team, then the Development Director, who puts the recommendation into place after a few additional questions/clarifications (more Direction).

Common Problems

Self Organization Problem: nobody signs on

Often you'll be expecting "the usual suspects" to show up when you invite people to collaborate. Sometimes you'll be surprised to find no responses to your hot topic. As the owner, this gives you the opportunity to find out why. This may be for many reasons:

People are busier than expected
People are tired of working with the issue
People feel that the solution is obvious
People feel that the recommendation won't result in change.

What to do:

Talk to your usual suspects with these possibilities in mind. The major advantage of the process in this case is that as the owner you are aware of these problems at the beginning rather than the end of the process.

Self Organization Problem: everybody signs on

Instead of "the usual suspects", you get the whole department. Reasons for this include:

Concern that some aspects of the issue are being ignored or are unknown to the group
Concern about "the expected outcome"
Size/Impact of recommendation

What to do:

Hold a first meeting and have a round-table where you invite each participant to share what motivated them to participate.

People who feel that they're alone in a concern have an opportunity to share it, and can hear others if they exist. Those who are worried about an "expected outcome" can share their point of view. If the source is that the impact of the recommendation is huge, then team members have an opportunity to voice general concerns and witness for themselves the process by which the recommendation is being made. Sometimes just having the preliminary session is enough -- the team can identify when enough of each viewpoint exists and will drop out satisfied that their viewpoint is represented.

Another approach is to assure the team as a whole that there will be an opportunity to review the recommendation before it is "ratified." This can make team members feel less urgency about allowing others to tackle a difficult or contentious issue.

You may be tempted as the Owner to be aggressive in this case about reducing team size. Be sure that you are keeping in mind that the real goal is not to simply make a recommendation, but to have it understood and implemented by the entire team. To this end, it may be more effective to allow for some up-front "inefficiency" in order to get everyone on the same page and reach the real desired outcome faster as a result.

Direction Problem: small project

Sometimes this all seems like a lot of effort, with more time spent sending invitations and setting up meetings than it would take to do the work.

What to do:

Use IM and/or email for self-organization. Set a deadline for response (self-organization), and list your planned actions by that deadline (Direction, Transparency). Ex:

To: yourWholeTeam
Subject: I hate the PMD "use if x==y not if x!=y" rule
Hey all, Can someone tell me why I shouldn't hate this rule? If nobody objects, I'll remove it on Wednesday.
-Developer Danielle

Collaboration Problem: can't reach consensus

Rational people can disagree on a subject. Time doesn't always allow all avenues to be examined.

What to do:

This is the "big job" of the owner. It's the owner's responsibility not only to identify when it's time to just make the call, but to also make all the participants feel that they've been heard even if they haven't been agreed with.

Every time the process is used without the need for this outcome is like money in the bank. That money (trust, really) is expended in these situations. If you've got a positive balance, then this is just a problem. If you're overdrawn, then this situation can become a fiasco. Team members must feel that the situations where the owner makes the call without consensus are rare, and due to issues that are a toss-up, or due to outside pressure. If you're doing a lot of this, it's probably an issue that should be considered in the process review.

Transparency Problem: When should I include everybody?

Is copying yourWholeTeam on *everything* really the recommendation? What if they're unlikely to care and it's just noise?

What to do:

Use a restating of the golden rule to determine what to send to the whole group: If you weren't an active participant, would you want to know about this part of the discussion? Err on the side of Transparency.

Criteria For Review

These criteria help determine success of this approach and your specific practices:

Does everyone feel they understand the approach?
Does everyone feel that quality recommendations are being made?
Do team members feel involved, not dictated to?
Are recommendations timely and within expected parameters, i.e., conforming to Leadership's direction?

Last Words

To restate a point from above, it's important to keep in mind that the end goal isn't a decision, it's a decision whose spirit is implemented and upheld team-wide. I use this approach not because it makes everyone feel good, but because it's the most effective way to get real results.

Wednesday, April 15, 2009

Rails Hates me

So, I've got a little extra time, and I figured it was time to get back to my rails project. I've half written this character-timeline-thingy maybe three or four times but never quite completed it. As I recall, I had it almost-completely-working maybe a year and a half ago, so I thought I'd take it out for a spin on my new macbook with its built-in rails mojo.

I grabbed my old files off of my external hard drive and: No love. No mysql. Oh, of course. Installed mysql and a visual editor, felt warm and fuzzy about that process. Created the tables with the table-creation script I created when I used this last time, made another mental note to look into rails' migrations and

$ rake...

Fail. My tests are talking about fixtures that have some kind of problem. This is fixed, and something else doesn't work, a nil where I didn't expect it. Oookay, bwuh? Maybe this is just old and I'll start over. It's got deprecation warnings in it too, so meh.

$ mv timemachine timemachineOldAndBusted
$ rails timemachine
$ rake

Fail again. The mysql gem isn't installed by default anymore. Okay, so:

$ sudo gem install mysql

Fail again. Some crap about headers and native extensions. I turn to Google, and go through a lot of gymnastics around installing stuff from source, upgrading gems itself, etc. etc. I follow all the directions I see in a pretty helpful article, everything seems to be passing and:

$ mv timemachine timemachineFail
$ rails -d mysql timemachine
$ rake

Fail. Now it says that the mysql gem isn't installed, but gem list says it is. Back to Google, where the suggestion now is to do some trickery with 64-bit mysql vs 32-bit mysql and binary-hack the broken mysql.bundle file to fix the problem.

This all feels like why Java and platform independence are still important. I'm tired of mucking with building from source into my operating system. AppEngine supports Java, right?

Wednesday, March 25, 2009

Easy self-improvement

What's the easiest thing you can do to improve yourself? Take a compliment, and make it twice as true.

What's your favorite compliment of all time, or of the last year? Chances are, your complimentor has identified something that comes easily to you, a natural gift of yours that was appreciated. It may be more valuable to improve something that you're already pretty good at than try to rebuild yourself in something that is difficult.

Let's say it's some mode of communication, maybe written communication. Where else can you use this skill? How could you be even better at it? What is it that makes you good, exactly?

These kinds of questions, when asked about something you're already doing well, take much less effort. If you've picked an area you're good at, that you're excited about, you may find that they return energy to your day instead of subtracting it.

Monday, February 09, 2009

Cut n Paste to lower communication barriers

At work, we have a weekly newsletter. I used to read it, but I don't anymore. It used to just be in email, and I'd skim some of it as it came by before I deleted it.

Now, it's a document attached to an email. It requires just one more click to see it, and I don't do it. It had some nice information in it -- new hires, birthdays, other company news. Stuff I could read "incidentally". Clicking that link feels like committing to reading the whole document, rather than just skimming over what's already in my box.

I've been thinking about how I can lower barriers when I communicate as well. The easy one is that instead of just mailing a link (which most people won't click on), I paste the whole document into my email.

The same thinking has led to a lot of physical printouts in our office. This strikes some people as funny -- we're a tech company, yet much of our process is documented on giant sticky notes. But the results are clear: a giant sticky note next to the fridge communicates much more effectively than email, or a diagram in a folder in SharePoint.

But what other barriers exist in our group that I'm not aware of? We've worked pretty hard to reduce them: we move desks when we reorganize teams so members can sit together, we aggressively encourage pair programming, our scrum process has its standups, etc.

A better question might be: how do we identify them? I'm not aware of a metric that I can rely on. Things come up in the retrospective, but that's just once every three weeks. Perhaps the real story here is about using all channels of communication.

In my email example, I have only one channel so I need to make sure it's as effective as possible. Even then, I don't really expect everyone to get the information I'm sharing.

When I give presentations, I try to say my point three times, and in three different ways: Once in text, once with a picture, and once verbally. Maybe there's a correlation there when I'm looking to drive change in our office (or just hold us to changes we've already agreed on): A poster, an email, and a verbal reminder during standup.

I'm going to look for things where I'm using only one mode of communication, and either reduce my expectations around that message or increase my modes. (make a poster, send an email, include it in our standups)