Tuesday, October 10, 2023

Automation, You're Doing It Wrong With Melissa Tondi (PNSQC)



This may feel a bit like déjà vu because Melissa has given a similar talk in other venues. The cool thing is, I know each time she delivers this talk, it takes some new avenues and ideas. So what will today have in store? Let's find out :).



What I like about Melissa's take is that she emphasizes what automation is NOT over what it is.

I like her opening phrase, "Test automation makes humans more efficient, not less essential," and I really appreciate that framing. Granted, I know a lot of people feel that test automation, and especially its implementation, is a less-than-enjoyable experience. Too often I feel we end up playing a game of metrics rather than making any meaningful testing progress. I've also been part of what I call the "script factory" role, where you learn how to write one test and then 95 out of 100 tests you write are small variations on the theme of that test (login, navigate, find the element, confirm it exists, print out the message, tick the pass number, repeat). Could there be lots more than that, and lots more creativity? Sure. Do we see that? Not often.
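For illustration, here's roughly what one of those "script factory" tests tends to look like, sketched in C# with Selenium WebDriver and NUnit. The site, element IDs, and credentials are all made up; the point is how little varies from one of these tests to the next.

```csharp
// A minimal sketch of the "script factory" template: log in, navigate,
// find an element, confirm it exists, repeat. Selenium WebDriver for .NET
// and NUnit are assumed; every URL, locator, and credential here is made up.
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class ReportsPageTests
{
    private IWebDriver _driver;

    [SetUp]
    public void LogIn()
    {
        _driver = new ChromeDriver();
        _driver.Navigate().GoToUrl("https://example.test/login");
        _driver.FindElement(By.Id("username")).SendKeys("testuser");
        _driver.FindElement(By.Id("password")).SendKeys("not-a-real-password");
        _driver.FindElement(By.Id("login-button")).Click();
    }

    [Test]
    public void ReportsHeading_IsVisible_AfterLogin()
    {
        // Navigate, find the element, confirm it exists -- and that's the whole test.
        _driver.Navigate().GoToUrl("https://example.test/reports");
        var heading = _driver.FindElement(By.CssSelector("h1.page-title"));
        Assert.That(heading.Displayed, Is.True, "Reports heading should be visible");
    }

    [TearDown]
    public void TearDown() => _driver.Quit();
}
```

Swap the URL and the locator and you have test number two, and three, and ninety-five.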

Is that automation's fault? No. Is it an issue with management and their desire to post favorable numbers? Oh yeah, definitely. In short, we are setting up a perverse expectation and reward system. When you gauge success in numbers, people will figure out ways to hit those numbers. Does that add any real value? Sadly, much of the time it does not.

Another killer that I had the opportunity to work on and see change was the serial, monolithic suite of tests that takes a long time to run. I saw this happen at Socialtext, and one of the first big initiatives after I arrived was implementing a Docker-based setup that broke our tests out into four groups. Every test was randomized and shuffled to run across the four server gateways, and we would bring up as many nodes as necessary to run the batches of tests. By doing this, we were able to cut our linear test runs down from 24 hours to just one. That was a huge win, but it also helped us determine where we had tests that were not truly self-contained. It was interesting to see how the tests had been set up, and how many had been written larger than they needed to be, both so we could examine them and so we could divvy up more tests than we would have been able to otherwise.
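The actual Socialtext harness had its own tooling, but the core idea (shuffle the full test list, then deal it out across a fixed number of workers) is simple enough to sketch. This is a generic illustration in C#, not the real implementation; the worker count, seed handling, and method names are my assumptions.

```csharp
// A generic sketch of the split-and-shuffle idea: randomize the test list,
// then deal it across N workers (four, in the Socialtext example) so each
// container or node gets a roughly even share. Not the actual Socialtext tooling.
using System;
using System.Collections.Generic;
using System.Linq;

public static class TestSharder
{
    public static List<List<string>> Shard(IEnumerable<string> testNames, int workers, int seed)
    {
        var rng = new Random(seed); // fixed seed so a given run order can be reproduced
        var shuffled = testNames.OrderBy(_ => rng.Next()).ToList();

        var buckets = Enumerable.Range(0, workers)
                                .Select(_ => new List<string>())
                                .ToList();

        // Round-robin deal keeps the buckets within one test of each other in size.
        for (int i = 0; i < shuffled.Count; i++)
            buckets[i % workers].Add(shuffled[i]);

        return buckets;
    }
}
```

Randomizing the order is also what flushes out the tests that aren't self-contained: a test that quietly depends on another test's leftovers fails as soon as it lands in a different bucket or a different position.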

Melissa brought up the great specter of "automate everything". While, granted, this is impossible, it is still gazed at forlornly as "The Impossible Dream". More often than not, it amounts to taking all of the manual tests and putting them into code. Many of those tests will make sense to automate, sure, but many of them will not. The amount of energy and effort needed to cover all of the variations of certain tests becomes mind-numbing and, often, doesn't tell us anything interesting. Additionally, many of the tests created in this legacy manner exist to test legacy code. Often, that code doesn't have hooks that help with testing, so we have to do end runs to make things work. Often, the code is just resistant to testing or requires esoteric identification methods (and the more esoteric, the more likely it will fail on you someday).

I've also seen a lot of organizations looking for automated tests when they haven't done unit or integration tests at lower levels. This is something I realized recently while teaching a student group C#. We went through the language basics and only later started talking about unit testing and frameworks. Having gone through that, I decided that if I were to do it again, I would do my best to teach unit testing, even at a fundamental level, as soon as participants were creating classes that processed actions or returned a value beyond a print statement. Think about where we could be if every software developer was taught about and encouraged to use unit tests at the training-wheels level!
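To make that concrete, here's the kind of beginner-level pairing I have in mind: a class that returns a value, and a unit test written right alongside it. The Cart class and the choice of NUnit are my own illustration, not something from Melissa's talk.

```csharp
// A beginner-level pairing: a class that returns a value, and a unit test
// written at the same time. NUnit is assumed here; xUnit or MSTest would
// work the same way. The Cart class is a made-up teaching example.
using System.Collections.Generic;
using System.Linq;
using NUnit.Framework;

public class Cart
{
    private readonly List<decimal> _prices = new List<decimal>();

    public void Add(decimal price) => _prices.Add(price);

    public decimal Total() => _prices.Sum();
}

[TestFixture]
public class CartTests
{
    [Test]
    public void Total_SumsAllItemPrices()
    {
        var cart = new Cart();
        cart.Add(9.99m);
        cart.Add(5.01m);

        Assert.That(cart.Total(), Is.EqualTo(15.00m));
    }
}
```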

Another suggestion that I find interesting and helpful is that a test that always passes is probably useless. It may not be passing because the test is working correctly and the code is genuinely good; it may be passing because we got lucky, or because there is nothing challenging enough in the test to actually run the risk of failing. If it's the latter, then yes, the test is relatively worthless. How to remedy that? I encourage creating two tests wherever possible, one positive and one negative. Both should pass if coded accurately, but they approach the problem from opposite directions. If you want to be more aggressive, write some additional negative tests to really push and see if we are doing the right things. This is especially valuable if you have put time into error-handling code: the more error-handling code we have, the more negative tests we need to make sure our ducks are in a row.
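As a small, made-up illustration of that pairing: one test confirms the happy path, and one deliberately exercises the error-handling code to confirm it rejects bad input. The Account class and its rules are hypothetical.

```csharp
// One positive and one negative test for the same behavior. The Account class
// and its rules are hypothetical; the point is that the negative test exercises
// the error-handling path on purpose instead of hoping the happy path covers it.
using System;
using NUnit.Framework;

public class Account
{
    public decimal Balance { get; private set; }

    public Account(decimal openingBalance) => Balance = openingBalance;

    public void Withdraw(decimal amount)
    {
        if (amount <= 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Amount must be positive.");
        if (amount > Balance)
            throw new InvalidOperationException("Insufficient funds.");
        Balance -= amount;
    }
}

[TestFixture]
public class AccountTests
{
    [Test]
    public void Withdraw_ValidAmount_ReducesBalance() // positive path
    {
        var account = new Account(100m);
        account.Withdraw(40m);
        Assert.That(account.Balance, Is.EqualTo(60m));
    }

    [Test]
    public void Withdraw_MoreThanBalance_Throws() // negative path
    {
        var account = new Account(100m);
        Assert.Throws<InvalidOperationException>(() => account.Withdraw(500m));
    }
}
```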

A final item Melissa mentions is the fact that we often rely on the experts too much. We should plan for the possibility that the expert may not be there (and at some point, if they genuinely leave, they WON'T be there to take care of it). Code gets stale rapidly when knowledgeable people are lost. Take the time to include as many people as possible in the chain (within reason) so that everyone who wants to, and can, is able to check out builds, run them, test them, and deploy them.
