Wednesday, April 2, 2014

A San Franciscan in Seattle: #ALMForum Day Two Reflections

Last night, Adam Yuret invited me out to see what the wild world of Seattle Lean Coffee is all about. Having heard from a number of the people who have participated in these events, I decided I wanted to play as well, so my morning was centered around Lean Coffee and meeting a great group of Seattleites and their various roles and areas of expertise.

We covered some interesting topics including the use of Pomodoro and how to make the best use of it (I added the Procrastination Dash to the mix of discussions), the use of SenseMaker and whether or not the adherence to it as a paradigm bordered on religion (it's a framework for helping realize and see results, but it's not magic), some talk about the challenges of defining what technical testing really means (yes, I introduced that topic ;) ), sharing some thoughts on what defines a WIP limit for an organization, and some thoughts about "Motivation 3.0" (based on Daniel Pink's book "Drive").

Great discussions, lots of interesting insights, and an appreciation for the fact that, over time, we see the topics change from being technical to being more humanistic. The humanistic questions are really the more interesting ones, in my estimation. Again, my thanks to Adam and the rest of the Seattle Lean Coffee group for having me attend with them today.


Cloud Testing in the Mainstream is a panel discussion with Steve Winter, Ashwin Kothari, Mark Tomlinson, and Nick Richardson. The discussion has ranged across a variety of topics, staring with what drove these organizations to start doing cloud based solutions (and therefore, cloud based testing) and how they have to focus on more than just the application in their own little environment, or how much they ned to be aware of in between hops to make their application work in the cloud (and how it works in the cloud. as an example, latency becomes a very real challenge, and tests that work in a dedicated lab environment will potentially fail in a cloud environment, mainly because of the distance and time necessary to complete the configuration and setup steps for tests.

Additional technical hurdles have been to get into the idea of continuous integration and needing to test code in production, as well as to push to production regularly. Steven works with FIS Mobile, which caters to banking and financial clients. Talk about a client that was resistant to the idea of continuous deployment, but certain aspects are indeed able to be managed and tested in this way, or at least a conversation is happening where it wasn't before.

Performance testing now takes on additional significance in the cloud, since the environment has aspects that are not as easily controlled (read: gamed) as they would be if the environment were entirely contained in their own isolated lab.

Nike was an organization that went through a time where they didn't have the information that they needed to make a decision. In house lab infrastructure was proving to be a limitation, since they couldn't cover the aspects of their production environment or a real example of how the system would work on the open web. With the fact that OPS was able to demonstrate some understanding through monitoring of services in the cloud, that helped the QA team to decide to collaborate and help understand how to leverage the cloud for testing, and how leveraging the cloud made for a different dialect of testing, so to speak.

A question that came up was to ask if cloud testing was only for production testing, and of course the answer is "no", but it does open up a conversation about how" testing in production" can be performed intentionally and purposefully, rather than something to be terrified about and say "oh man, we're testing in PRODUCTION?!" Of course, not every testing scenario makes sense to be tested in production (many would be just plain insane) but there are times when it does make a lot of sense to do certain tests in production (a live site performance profile, monitoring of a deployment, etc.).

Overall an interesting discussion and some worthwhile pros and cons as to why it makes sense to test in the cloud. Having made this switch recently, I really appreciate the flexibility and the value that it provides, so you'll hear very few complaints from me :).

Mike Brittain is talking about Principles and Practices of Continuous Deployment, and his experiences at Etsy. Companies that are small can spin up quickly, and can outmaneuver larger companies. Larger companies need to innovate of die. There are scaling hurdles that need to be overcome, and they are not going to be solved overnight. There also needs to be a quick recovery time in the event something goes wrong.  Quality os not just about testing before release, it also includes adaptability and response time. Even though the ideas of Continuous Deployment are meant to handle small releases frequently performed, there still needs to be a fair amount of talent in the engineering team to handle that. The core idea behind being able to be successful in Continuous Development is the idea of "rapid experimentation".

Continuous Delivery and Continuous Deployment share  a number of principles. First is to keep the build green, no failed tests. Second is to have a "one button" option. Push the button, all deployment options are performed. Continuous Deployment breaks a bit with the fact that every passing build is deployed to production, where continuous delivery means that the feature is delivered with a business need. Most of the builds deploy "dark changes", meaning code is pushed, but little to no changes are visible to the end user (thin CSS rules, unreferenced code, back end changes, etc.). A Check in triggers a test. If clean that triggers automated acceptance tests. If that passes, then it triggers the need for user acceptance tests. If that's green, then it pushes the release. at any point, if the step is  red, then it will flag the issue and atop the deploy train.

Going from one environment to another can have unexpected changes. How many times have you heard "what do you mean it's not working in production? I tested that before we released!" Well, that's not entirely surprising, since our test environment is not our production environment. Question of course is, where's the bug? Is it in the check ins? Are we missing a unit test(s)? are we missing automated UA tests (or manual UA tests)? Do we have a clear way of being identified if something goes wrong? What does a roll back process look like? All of these are still issues, even in Continuous Deployment environments. One avenue Etsy has provided to help smooth this transition is a setup that does pre-production validation. Smoke tests, Integration tests, Functional and UA tests are performed with hooks into some production environment resources, and active monitoring is performed. All of this without having to commit the entire release to production, or doing so in stages.

Mike made the point that Etsy pushes, approximately, about 50,000 lines of code each month. With a single release, there's a lot of chances for there to be bugs clustered in that single release. By making many releases over the course of days, weeks or months. The odds of a cluster of bugs appearing are minimal. Instead, the bugs that do appear are isolated and considered within their release window, and their fix likewise tightly mirrors their release.

This is an interesting model. My company is not quite to the point that we can do what they are describing, but I realized we are also not way out of the ballpark to consider it. It allows organizations to iterate rapidly, and also to fix problems rapidly (potentially, if there is enough risk tolerance build into the system). Lots to ponder ;).

Peter Varhol is covering one of my favorite topics, which is Bias in Testing (specifically, cognitive bias). Peter started his talk by correlating the book "Moneyball" to testing, and that often, the stereotypical best "hitter/pitcher/runner/fielder/player" does not necessarily correlate to winning games. By overcoming the "bias" that many of the talent scouts had, he was able to build a consistently solid team by going beyond the expectations.

There's a fair amount of bias in testing. That bias can contribute to missing bugs, or testers not seeing bugs, for a variety of reasons. Many of the easy to fix options (missing test cases, missing automated checks, missing requirement parameters) can be added and covered in the future. The more difficult one is our own biases as to what we see. Our brains are great at ambiguity. they love to fill in the blanks and smooth out rough patches. even when we have a "great eye for detail", we can often plaster over and smooth out our own experience, without even knowing it.

Missed bugs are errors in judgment. we make a judgment call, and sometime we get it wrong, especially when we tend to think fast. When we slow down our thinking, we tend to see things we wouldn't otherwise see. case in point: if I just read through my blog to proof-read the text, it's a good bet I will miss half a dozen things, because my brain is more than happy to gloss over and smooth out typos; I get what I mean, so it's good enough... well, no, not really, since I want to publish and have a clean and error-free output.

Contrast that with physically reading out, and vocalizing, the text in my blog as though I am speaking it to an audience. This act alone has helped me find a large number of typos that I would otherwise totally miss. The reason? I have to slow down my thinking, and that slow down helps me recognize issues I would have glossed over completely (this is the premise of Daniel Kahneman's "Thinking, Fast and Slow".  To keep with the Kahneman nomenclature, we'll use System 1 for fast thinking and System 2 for slow thinking.

One key thing to remember is that System 1 and System 2 may not be compatible, and they may even be in conflict. It's important to know when we might need to dial in one thought approach or the other. Our biases could be personal. They could be interactional. they could be historical. they may be right a vast majority of the time, and when they are, we can get lazy. We know what's coming, so we expect it to come. when it doesn't we are either caught off guard, or we don't notice it at all. "Representative Bias" is a more formal way of saying this.

When we are "experts" in a particular aspect, we can have that expertise work against us as well. we may fail to look at it from another perspective, perhaps that of a new user. This is called "The Curse of Knowledge".

"Congruence Bias" is where we plan tests based on a particular hypothesis, whereas we may not have alternative hypotheses . If we think something should work, we will work on the ways to support that a system works, instead of looking at areas where a hypothesis might be proven false.

'Confirmation Bias" is what happens when we search for information or feedback that confirms our initial perceptions.

"The Anchoring Effect" is what happens when we become to convinced on a particular course of action that we become locked into a particular piece of information, or a number, where we miss other possibilities. Numbers can fixate us, and that fixation can cause biases, too.

" Inattentional Blindness" is the classic example where we focus on a particular piece of information that they miss something right in front of them (not a moonwalking bear, but a gorilla this time ;) ). there are other visual images that expand on this.

The "Blind Spot Bias" comes from when we evaluate our decision making process compared to others. With a few exceptions, we tend to think we make better decisions than others in most areas, especially those we feel we have a particular level of expertise.

Most of the time, when we find a bug, it's not because we have missed a requirement or missed a test case (not to say that those don't lead to bugs, but they are less common). Instead, it's a subjective parameter. We're not looking at something in a way that could be interpreted as negative or problematic. This is an excellent reminder of just how much we need to be aware of what and where we can be swayed by our own biases, even by this small and limited list. There's lots more :).


More to come, stay tuned.

No comments: