Friday, December 31, 2010

TWiST #26: New Year's Eve Episode with Henrik Emilsson and PICT

This will be the last podcast back story for 2010 :).

I'm smiling because at the end of 2009, there wasn't even a podcast back story to be told. There wasn't a podcast. Heck, there wasn't a Software Test Professionals (well, not like we know it now, at least) and for that matter, there wasn't even a TESTHEAD! 2010 has been a big year for a lot of testing initiatives, and I'm happy that I'm part of this one.

During the holidays, it's easy to get off your game and watch time just get away from you. The metric I told you all about last week, ten minutes of editing for every minute of audio, still holds true. This week's episode is 20 minutes long, and true to form, it took me about 4 hours to edit it. Some of my friends who listen to the podcasts wonder what in the world could take that amount of time. Cleaning out dead air, removing repetitive words, sometimes resequencing flow when it makes sense, and leveling the audio between the members of the conversation... yep, it takes that much time.

Since I picked up the new microphone, I've decided to record the intro and outro in their entirety each time, rather than just use the canned audio, which I did for each show from Episode #6 through Episode #21. Back then I would always record the show-specific audio, then work to blend the two together, and that also took time to get the leveling right and make the flow feel natural... it took me more time to blend the two pieces than it now takes to record a new patch each week. Of course, the funny thing is, with each recording there's a little different vibe... this week I sounded very amped (LOL!).

So this week, Matt interviews "the other Henrik from Sweden". This has been a bit of a running gag with us, since back in TWiST #7 when Matt interviewed Jonathan Kohl, Jonathan mentioned "Henrik from Sweden." For TWiST #9, Matt interviewed Henrik Andersson, a well known and respected tester in Sweden with a strong voice in the tester community. However, this wasn't the Henrik Jon was referring to. He meant Henrik "EMILSSON", senior consultant with Qamcom Research and Development. Henrik Emilsson is also one of the key writers for the "Thoughts From the Test Eye" blog. Whoops (LOL!). So we fixed that this week, and now you get to hear from the "Intended Henrik from Sweden" :).

Click on the link to listen to Episode #26.

Also, during the show, Henrik mentions PICT, a free pairwise testing tool from Microsoft.
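For a little context on what pairwise tools like PICT do: instead of testing every full combination of parameter values, they generate a much smaller set of test cases that still covers every pair of values from any two parameters. Here's a rough Python sketch of the counting argument (the parameter names and values are made up for illustration, not from the show):

```python
from itertools import combinations, product

# Hypothetical parameters for a web-app test matrix (illustrative only).
params = {
    "browser": ["Firefox", "IE", "Chrome"],
    "os": ["Windows", "Mac", "Linux"],
    "locale": ["en", "sv", "ja"],
}

# Exhaustive testing: every combination of every value.
exhaustive = list(product(*params.values()))
print(len(exhaustive))  # 27 combinations

# Pairwise testing only requires that every pair of values from any
# two parameters appears together in at least one test case.
names = list(params)
pairs_to_cover = {
    ((a, va), (b, vb))
    for a, b in combinations(names, 2)
    for va in params[a]
    for vb in params[b]
}
print(len(pairs_to_cover))  # 27 distinct pairs to cover
```

A tool like PICT takes a model file of parameters like these and emits a near-minimal covering set; the point is just that covering all pairs needs far fewer runs than the full 27 combinations (here, as few as nine), and the savings grow fast as parameters are added.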

Standard disclaimer:

Each TWiST podcast is free for 30 days, but you have to be a basic member to access it. After 30 days, you have to have a Pro Membership to access it, so either head on over quickly (depending on when you see this) or consider upgrading to a Pro membership so that you can get to the podcasts and the entire library whenever you want to :). In addition, Pro membership allows you to access and download the entire archive of Software Test and Quality Assurance Magazine, as well as its issues under its former name, Software Test and Performance.

TWiST-Plus is all extra material, and as such is not hosted behind STP’s site model. There is no limitation to accessing TWiST-Plus material, just click the link to download and listen.

Again, my thanks to STP for hosting the podcasts and storing the archive. We hope you enjoy listening to them as much as we enjoy making them :).

Thursday, December 30, 2010

Interviewing Your Replacement

Yesterday was an interesting experience for me. Since I've wanted to make the transition into my new role over at SideReel as painless for my current company as possible, I said I would be happy to help interview and talk to testers they were interested in hiring.

This is an interesting place to be in. While I've helped interview testers to join teams I was part of, and helped with other groups' hiring, this is the first time I've ever interviewed candidates who would effectively take over my own job. I had a chance to think "So, if I were going to hire someone to replace me, what would I want to see them do, or be?"

My first interviewee was a really nice lady about my age from Sri Lanka. She had been in the industry for about 18 years, albeit with a little more structure and broader team experience, including management. I was curious as to why she wanted to enter an environment where she'd be a "Lone Tester" and whether she was OK with that. She said she'd had a number of experiences in her career where she'd been the Army of One, and that she was OK with it. She liked when others were around, but it wasn't essential unless the product being tested needed that many resources.

One of the things I've become fond of discussing is the set of ideas that make up the Association for Software Testing's Black Box Software Testing - Foundations class. Having taken it once and staffed it three times, I felt that any tester who would replace me should surely be able to cover the material in that class without much effort, even if they hadn't taken the class. Besides, it would make for an interesting comparison. What did I know about this, and would I feel confident talking about the ideas? Would the candidate?

As it turns out, she handled the questions I had for her splendidly, albeit using different words, and she was not familiar with the term "test oracle" (she knew the idea, just not by that name). I threw some questions at her to see if she'd be able to navigate some common traps (thank you, Weekend Testing (LOL!) ) and again, though the terminology was different, and we both had to spend some time making sure we were talking about the same things, I thought she performed admirably.

What was interesting was that I started to realize that my own filter and what I felt was "good involvement" in learning and skills development was becoming colored by my active involvement in the testing community. I felt a little bad when I sprung the Lee Copeland question "So, tell me about your favorite testing book, or about testing books you have read recently?" and she said she hadn't read any. My heart sank a little, but after further probing, I found out that she was a Ninja at Google, and actively perused sites, some blogs, and forums to get tool tips, news and ideas. I'm glad I pressed on, because I could have been all self righteous and smug with my "oh really? well I have read (yadda yadda yadda)" and missed the whole point of the conversation. She keyed in on the stuff she needed when she needed it, and had her trusted resources when she needed an answer. More times than not she got them, too.

It's interesting at times how we get a chance to review and see what we are doing and how we frame the way we solve problems, learn and even develop friends in the testing world. I've enjoyed my way of doing it, but I was given a reminder that that's exactly what it was... MY way. Her way was different, but in many instances no less effective. What was also great about this experience was that it was genuinely fun. I had a chance to talk with a peer who was enthusiastic, interesting, and had her own take and spin on what testing and Q.A. was all about, and I enjoyed the debate and I hope she did, too.

Wednesday, December 29, 2010

PRACTICUM: Selenium 1.0 Testing Tools: Chapter 1: Selenium IDE

This is the first entry in the TESTHEAD PRACTICUM review of PACKT Publishing's book Selenium 1.0 Testing Tools Beginners Guide by David Burns. As such, the emphasis in this type of review is on the actual exercises and the practical application of what we learn. Rather than reprint the full listing of the exercises in the book (and there are a bunch of them, as that is the format of this particular title), these chapter entries will mostly be a summary of the content, some walk-through of the applications, and my take on what I see and do.

Some notes on this first foray into this type of review. Selenium 1.0 Testing Tools is full of examples, "try it yourself" instructions and explanations of what happened. The book’s cover states “Learn by doing: less theory, more results”. In short, this may well be the perfect book to experiment with this form of review, as the focus will be less on the concepts and more on the actual implementation and methods suggested.

Chapter 1, Getting started with Selenium IDE:

In this introductory chapter, the following topics and goals are introduced to the reader:

  • What is Selenium IDE
  • Recording our first test
  • Updating tests to work with AJAX sites
  • Using variables in our tests
  • Debugging tests
  • Saving tests to be used later
  • Creating and saving test suites

The Selenium framework covers a number of tools and technologies, but in most tutorials and examples on the web, users start with Selenium IDE. The Selenium IDE is a Firefox plug-in that allows the user to record and sequence test steps, identify values, and change and modify test criteria. This is a logical first place to start, since with the IDE, users can create tests quickly and get a feel for the inner workings of the application. Note: Firefox is a prerequisite, as Selenium IDE runs inside of it. As of the writing of this review, Firefox is at version 3.6.13 (download and install it if you haven’t already).

Install Selenium IDE

To get the IDE, the first step is to download the latest version from the Selenium site. As of this review's writing, Selenium IDE is at version 1.0.10.

Selenium IDE comes as an xpi file, which Firefox handily installs directly into the browser.

Once it’s downloaded, the installer asks if it’s OK to install the IDE. Click Install and away you go. Restart Firefox and the IDE is ready to be used. Just click on the Tools menu item, select Selenium IDE, and you’re ready to get your first tests started.

Touring Selenium IDE

Selenium IDE has some simple controls to be aware of. They are as follows:

  • Base URL—This is the URL where the tests reference and start. All open commands will be relative to the Base URL unless a full path is inserted in the open command.
  • Speed Slider—This allows the user to control the overall speed of test execution (fast/slow)
  • The Play button with three bars allows the user to run all of the tests in the IDE.
  • The Play button with a single lit-up bar lets the user run a single test in the IDE.
  • The Pause button allows the user to pause a test that is currently running.
  • The button with the blue downward pointing arrow is the Step Mode Playback option; it allows the user to step through the individual test steps one at a time when the test has been paused.
  • The Red record button is, you guessed it, the Record button. This allows the user the ability to record a sequence of steps to make a test suite.
  • The Playback button in the square is there to allow tests to be run in Selenium Core TestRunner.
  • The Selenese Command select box lists all the commands needed to create a test (more on Selenese in a bit).
  • The Target text box allows you to input the location of the element that you want to work against.
  • The Find button, once the Target box is populated, will when clicked highlight the listed element on the page.
  • The Value textbox is where you place the value to change. You can use this option to enter text into a form’s text box, for example.
  • In the middle of the IDE is a Test Table. This table lists all of the combinations of commands, targets, and values. Selenium owes much of its initial design to a tool called FIT. Tests were originally designed to run from HTML files (and thus have structured the test data into HTML tables). The IDE keeps this format. Clicking on the Source tab shows the HTML source code and the test structure.
  • The bottom of the IDE shows a number of links. The first is the Log, which keeps track of test run times and error messages. If a test item fails, an [error] entry is created. The others we will cover in a later chapter.
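Since the Test Table point above mentions the FIT heritage and the Source tab, here's roughly what that HTML source looks like for a short test (a sketch of the format, not the IDE's exact output):

```html
<!-- Each Selenese step is one three-column row: command, target, value -->
<table>
  <tr><td>open</td><td>/</td><td></td></tr>
  <tr><td>clickAndWait</td><td>link=Chapter1</td><td></td></tr>
  <tr><td>click</td><td>radiobutton</td><td></td></tr>
  <tr><td>select</td><td>selecttype</td><td>label=Selenium RC</td></tr>
</table>
```

This is why a saved Selenium IDE test is just an HTML file: the table *is* the test.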

Creating the First Tests

Just like any tool, Selenium IDE tests will only be as good and as robust as the user makes them. In its simplest sense, the IDE will record and playback procedures and steps as they are entered. It shares this with many automation tools, and likewise, it also shares the brittleness of those tests. To make tests robust and less prone to errors, keep these thoughts in mind:

  • Make sure tests start at a known state, such as from a particular page or base URL.
  • Keep your tests compartmentalized. Have them do one thing or cover just one area.
  • Make sure that any tests that have set-up have a corresponding break down (good tests clean up after themselves :) ).
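Those three points aren't unique to Selenium IDE; they're the same discipline most test frameworks bake in. As a sketch of the pattern in Python's unittest (the test names and state here are invented for illustration, not from the book):

```python
import unittest

class ChapterOneTest(unittest.TestCase):
    """Illustrates the set-up/tear-down discipline from the bullet
    points above; not actual Selenium IDE steps."""

    def setUp(self):
        # Start every test from a known state (think: a base URL).
        self.state = {"page": "home", "created_records": []}

    def test_does_one_thing(self):
        # Keep the test compartmentalized: one action, one check.
        self.state["created_records"].append("record-1")
        self.assertEqual(len(self.state["created_records"]), 1)

    def tearDown(self):
        # Good tests clean up after themselves.
        self.state["created_records"].clear()
```

The same shape applies to IDE tests: an open of the Base URL up front plays the setUp role, and any records or state the test created get removed at the end.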

The test example uses the book's companion site for its test URL. First and foremost, this URL actually takes the user to the Selenium Beginners Guide page (I mention that because I have been infuriated many times when I get a book that is a couple of years old, go to its sample sites, and find the sites are down or expired. Granted, not to be expected with a new book, but very nice to see off the bat that the practice examples won’t be for naught :) ).

The first example is a good and basic one, but it omits a step “0”, which is to open the base URL page in Firefox itself first. Not a big deal, but I found myself wondering why my output didn’t look the same. I figured it out quickly enough, but to save a little time: open up Firefox and navigate to the base URL. THEN open up Selenium IDE, which will start in record mode automatically.

  1. Make the base URL entry for the test the one from the book.
  2. Click on Chapter1 link.
  3. Click on the radio button.
  4. Change the value of the Select (the drop down menu with various Selenium products).
  5. Click on the link (which brings the user back to the home page).

Once the test is recorded, click on the Single test playback button. The IDE will then walk through each of the steps in sequence, and if all goes according to plan, the steps will be colored green, you will have a successful run, no failures and a log listing of all steps performed like this:

[info] Executing: |open | / | |
[info] Executing: |clickAndWait | link=Chapter1 | |
[info] Executing: |click | radiobutton | |
[info] Executing: |select | selecttype | label=Selenium RC |
[info] Executing: |clickAndWait | link=Home Page | |

The book then has the users take a look and see if they can pass a Pop Quiz asking what language drives the IDE and whether Selenium IDE runs on Internet Explorer (I’ll let the readers determine the answers to those questions on their own ;) ).

Updating Tests / Using Asserts

Our first test does perform some rudimentary steps, but what else can we do? We can add “asserts” to check whether particular elements are on the page. An assert statement, if it proves false, will stop the test. By contrast, a “verify” statement will also check to see if an item is on the page, but if it fails, the test will continue.
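That assert/verify distinction is easy to sketch outside the IDE. The helper names below are made up for illustration; Selenium IDE's real commands are assertTextPresent, verifyTextPresent, and so on:

```python
# Sketch of "assert" vs. "verify" semantics (illustrative names only).

class TestAborted(Exception):
    """Raised when an assert fails: the test stops here."""

failures = []  # verify failures get logged, but don't stop the run

def assert_present(page_text, expected):
    # An assert that fails stops the test immediately.
    if expected not in page_text:
        raise TestAborted(f"assert failed: {expected!r} not on page")

def verify_present(page_text, expected):
    # A verify that fails is recorded, but the test keeps running.
    if expected not in page_text:
        failures.append(f"verify failed: {expected!r} not on page")

page = "Assert that this text is on the page"
verify_present(page, "Some text that is not there")  # logged; test continues
assert_present(page, "Assert that this text")        # passes; test continues
print(len(failures))  # 1 recorded verify failure
```

The practical upshot: use verify for checks you want reported without halting the run, and assert for preconditions where continuing would be meaningless.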

To add the assert or verify to the tests, use the context menu that Selenium IDE adds to Firefox. This means going to the actual page, clicking the mouse into a place on the screen or highlighting a spot, and right-clicking to see the element (this method works for Windows & Linux; Mac users use the two-finger click method to display the context menu).

The context menu will show what values can be selected based on where the mouse pointer is located. Each element will show different options, but they will generally be in the following format:

  • open
  • assertTextPresent
  • verifyTextPresent
  • verifyValue
  • storeElementPresent

There are other options beyond these, but this is a good starting point.

With this, let’s modify the test to verify that there are specific items on the page. Record another example test.

  1. Open the IDE so that we can start recording.
  2. Set the Base URL to the same one used in the first test.
  3. Click on Chapter1.
  4. Click on the radio button.
  5. Change the select to Selenium Grid.
  6. Verify that the text on the right-hand side of the page reads Assert that this text is on the page.
  7. Verify that the button is on the page. You will need to use the context menu for this.

When you run this test this time, sections that pass show a green background, and assert statements that pass show a darker green background to differentiate them. The log below gives an example of the output (I added some additional assert statements for variety).

[info] Executing: |storeElementPresent | radiobutton | |
[info] Executing: |verifyValue | radiobutton | on |
[info] Executing: |click | selected(1234) | |
[info] Executing: |verifyValue | selected(1234) | on |
[info] Executing: |assertElementPresent | selected(1234) | |
[info] Executing: |assertTitle | exact:Selenium: Beginners Guide | |

Tests that go beyond the very simple will need to be maintained at some point, possibly by you, possibly someone else. To make life a little more pleasant, it’s helpful to use comments in your tests, like this:

  • Go to a Selenium test and select a step.
  • Right-click on the selected item.
  • Click on Insert New Comment.
  • Click on the Command text box and type your comment (“This checks to make sure the radio button is present”).

Comment text appears as purple in the table section to differentiate it from commands.

So this is all good and straightforward when dealing with a single window and the items on the screen. How about dialog windows, or other pop-ups? Can we work with those in the same way? Yep, but it has some challenges. The browser needs to let Selenium know how many “child browser processes” have been spawned, and keep track of when they are opened and closed, and in which order.

In the examples given next, we will click on an element on the page, which will cause a new window to appear. If you have a pop-up blocker running, it may be a good idea to disable it for this site while you work through these examples.

  1. Open up the Selenium IDE and go to the Chapter1 page on the site.
  2. Click on one of the elements on the page that has the text Click this link to launch another window. This will cause a small window to appear.
  3. Once the window has loaded, click on the Close the window text inside it.
  4. Add a verify command for an element on the page.
  5. Click on the Close the window link.
  6. Verify the element on the original window.

The text below is verbatim from the book, hence the Red Text:

In the test script, we can see that it has clicked on the item to load the new window and then has inserted a waitForPopUp. This is so that your test knows that it has to wait for a web server to handle the request and the browser to render the page. Any commands that require a page to load from a web server will have a waitFor command. 

The next command is the selectWindow command. This command tells Selenium IDE that it will need to switch context to the window called popupwindow and will execute all the commands that follow in that window unless told otherwise by a later command. 

Once the test has finished with the pop-up window, it will need to return to the parent window from where it started. To do this we need to specify null as the window. This will force the selectWindow to move the context of the test back to its parent window.

The next example in the text shows two windows being spawned and closed and how the IDE keeps everything straight using unique IDs for each window.

Selenium w/ AJAX

AJAX applications use JavaScript to create asynchronous calls to a server, which then returns XML containing the data that the user or application requested. AJAX also often uses JavaScript Object Notation (JSON), since it is more lightweight in the way it transfers data.

The text walks the user through a number of examples that show how a test fails because it cannot access an element whose text has not yet been loaded into the DOM; the content is still being requested from the web server and rendered in the browser. To remedy this issue, we add a new command (waitForElementPresent) to the test so that it will pass in the future. There are a variety of waitFor techniques to use. Explore at your leisure :).
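Under the hood, a waitFor-style command is just a polling loop: check for the element, sleep, check again, until it shows up or a timeout expires. A minimal sketch, assuming a stand-in lookup function in place of a real DOM query:

```python
import time

def wait_for_element_present(element_present, timeout=30.0, interval=0.5):
    """Poll until element_present() returns True or the timeout elapses.
    Rough sketch of waitForElementPresent; not Selenium's actual code."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if element_present():
            return True
        time.sleep(interval)
    return False

# Simulate an AJAX response that lands the element in the DOM
# only on the third poll.
calls = {"n": 0}
def fake_lookup():
    calls["n"] += 1
    return calls["n"] >= 3

print(wait_for_element_present(fake_lookup, timeout=5.0, interval=0.01))  # True
```

This is also why waitFor commands need a timeout: without one, a page that never finishes loading would hang the whole test run.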

Storing Variables

You can store values from the page that a test will need to enter later. You can recall these values by requesting them from a JavaScript dictionary that Selenium tracks. The author suggests using the storedVars['variableName'] format, since this is similar to how Selenium is coded internally. The following example shows how to create a variable and recall it later in the run.

  1. Open the Selenium IDE and switch off the Record button.
  2. Right-click on the text Assert that this text is on the page, go to the storeText command in the context menu, and click on it. If it does not display there, go to Show all Available Commands and click on it there.
  3. A dialog will appear. Enter the name of a variable that you want to use. I have used textOnThePage as the name of my variable.
  4. Click on the row below the storeText command in Selenium IDE.
  5. Type the command type into the Command textbox.
  6. Type storeinput into the target box.
  7. Type javascript{storedVars['textOnThePage'];} into the value box.

Once your test has completed running, you will see that it has placed Assert that this text is on the page into the text-box.
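Conceptually, storedVars really is just a dictionary: storeText saves the element's text under a name, and the javascript{...} expression reads it back. A loose Python sketch of that round trip (the helper functions are invented; only the storedVars name and the page text come from the example above):

```python
# Sketch of storedVars behavior: store page text under a name, then
# read it back via a javascript{storedVars['name']} expression.

storedVars = {}

def store_text(page_text, name):
    # storeText: remember the element's text under the given variable name.
    storedVars[name] = page_text

def evaluate(expression):
    # Very loose stand-in for the IDE evaluating a javascript{...} value.
    if expression.startswith("javascript{") and expression.endswith("}"):
        inner = expression[len("javascript{"):-1].rstrip(";")
        return eval(inner, {"storedVars": storedVars})
    return expression

store_text("Assert that this text is on the page", "textOnThePage")
typed_value = evaluate("javascript{storedVars['textOnThePage'];}")
print(typed_value)  # Assert that this text is on the page
```

The type command in step 5 then just feeds that evaluated value into the target text box, which is why the stored sentence shows up there after the run.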


It’s possible to walk through individual tests by highlighting a specific command and then pressing the ‘x’ key. To create messages in the log for debug purposes, you can also create an ‘echo’ command, so that the words following get printed to the test log. If you are having issues with elements on the page, you can type in their location and then click on the Find button. This will surround the element that you are looking for with a green border that flashes for a few seconds.

Test Suites

In addition to running individual tests, we can also run multiple tests in order. This is called a test suite, and the playback button with multiple stacked bars allows us the option to do exactly that. Saving multiple tests with unique names differentiates the individual tests. Saving the suite groups all the tests under a unique suite name. Running the suite runs all of the tests in the suite.
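A test suite, stripped to its essence, is just an ordered list of named tests run one after another, with each result recorded. A tiny sketch (the test names are invented for illustration):

```python
# Sketch of what running a test suite amounts to: execute each test
# in order and record a per-test result under the suite's name.

def test_radio_button():
    return True  # stand-in for a recorded IDE test passing

def test_select_menu():
    return True

def run_suite(suite_name, tests):
    results = {}
    for test in tests:  # the suite runs every test, in order
        results[test.__name__] = "passed" if test() else "failed"
    return suite_name, results

name, results = run_suite("Chapter1Suite", [test_radio_button, test_select_menu])
print(results)  # {'test_radio_button': 'passed', 'test_select_menu': 'passed'}
```

In the IDE, the saved suite file plays the role of that list: it points at the individual test files, and the multi-bar playback button walks through them in order.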

So what can Selenium IDE not record?

As of the book's writing, this is the list of items:

Silverlight and Flex/Flash applications, at the time of writing this, cannot be recorded with Selenium IDE. Both these technologies operate in their own sandbox and do not operate with the DOM to do their work.

HTML5, at the time of writing this, is not fully supported by Selenium IDE. A good example of this is elements that have the contentEditable=true attribute. If you want to see this, you can use the type command to type something into the html5div element. The test will tell you that it has completed the command, but the UI will not have changed.

Selenium IDE does not work with Canvas elements on the page either, so you will not be able to make your tests move items around on a page.

Selenium cannot do file uploads. This is due to the JavaScript sandbox not allowing JavaScript to interact with <input type=file> elements on a page. The tests will be able to insert text, but will not be able to click on the buttons.

Bottom Line

That’s a pretty good run through for a Chapter 1. The examples presented are fairly straightforward and do not require much in the way of second guessing, though a few times I did have to go to the all commands listing to find the command value I was looking for. Also, there are steps where the user is clearly expecting to do something (interact with the browser, for example) and that step isn’t listed. This seems to me to not be an omission in the directions, but rather an expectation on the part of the author that the step in question is obvious. To be fair, after a couple of tries, I was able to read between the lines and realize a couple of times what was necessary, and from there, the steps flowed and the examples were easy to duplicate.

Testing: The Next Generation

Yesterday, there was a hoax running around the Internet. It made its way to Twitter, Facebook, and other places: that Adam Sandler had died in a snowboarding accident in Switzerland. As is often the case, people run with these stories for one reason or another. Part of me was skeptical, though, so when I posted a link to the story I said "I want to believe this is a hoax, but in case it's not..."

Over the course of just a few minutes, people were coming out and saying "oh, so sad" and "wait, how come this isn't on other media"... but what I thought was cool was the fact that, in one of the replies early on, my son posted "you cant click any of the above links on the site, so it looks a little fishy", meaning the links in the "newspaper site's" headers.


My kid beat me to the punch. He didn't just take the story at face value; he decided to see if there was other information about the site reporting it, and found enough inconsistency to question the site's veracity. This started other posters making comments, and finally a friend of mine I'd worked with a decade ago said "Hey Michael, notice the URL? See how the first two prefixes are 'Adam.Sandler'? Put in another name and see what happens. I just put in 'Michael.Larsen'... hey look, you're dead!"

Sure enough, that was the case. The site was a hoax, and I'm happy to report that Adam Sandler is alive and well, not dead from a snowboarding accident. I'm also happy to report that my son is exhibiting the sleuthy traits that make for an excellent tester. I told him as much in the replies to my post, and recommended he either embrace this side of himself or run for the hills screaming. No answer from him on that just yet :).

What I do find cool about this is the fact that our kids are looking at the web and what's being disseminated and they are not as quick to believe or accept on the surface what many of us older folks might. It feels a little depressing to call myself an "older folk"; I'm Generation X, that used to be cool and hip. To my son, though, I realize I have the acumen, attitude and persona of Elmer Fudd (as well as the haircut... and if any of you answer back with "Who's Elmer Fudd?", I swear I'll unleash a middle-age rant on you that you won't soon forget (LOL!) ).

Seriously, though, I used to be concerned that kids would be easily duped or swayed by the Internet or by their sources of information. Instead, many of them have healthy distrust and questioning meters, and they actively use them. I think this bodes well for the future members of our profession. Whether or not my son will be one of them I guess remains to be seen :).

Tuesday, December 28, 2010

Missed Opportunities And Public Goalkeeping

As I approach the end of the year that is/was 2010, I've had a chance to reflect on what I've done this year. The simple fact is that I've done a lot of things, way more than what I actually thought I could. At the same time, I've also gone back and looked at the blog posts that I have written, and I've also seen another pattern.

Over the course of the year, most of the things I set out to do/complete/accomplish I have done, and have actively enjoyed the journey. There is, however, one thing I said a few times that I wanted to do, but didn't really accomplish, at least not to the level that I wanted to, and that was to become more deeply technical in my understanding of writing code.

This has been a frustration of mine for quite some time, and I would like to say that I know why this has been an elusive goal. It is an elusive goal because, compared to all of the other aspects of software testing, it is the aspect I personally enjoy the least. Many times I've written that I wanted to do something about it, yet I never held myself accountable for it. It just managed to drift away into the background, and I didn't revisit it again.

Another tester and I discussed the idea of Public Goalkeeping, or making promises in a public forum as a way to shame ourselves into accomplishing something. We both agreed that it had a benefit, and worked in many cases, but oftentimes just the act of stating a goal in a stream of other conversations (via Twitter, Facebook or even my blog) leaves little in the way of follow-up unless we specifically set a way to follow up. What's worse, the act of stating the goal alone acted as a sort of "candy" for the brain, and the same rush was experienced, without all that pesky hard work. In short, not a long term recipe for success all by itself.

In 2007, I made a goal to lose 52 pounds in 6 months. I did it very publicly, and specifically, I used MySpace as my platform to do it. I was successful in my goal primarily because I placed a solid constraint on myself: I committed to making a post every single Monday, telling where I was, how much I lost (or didn't), putting it into perspective against my starting point and end goal, plus a paragraph about the ups or downs of that week (exercise, diet, stress, etc.). I wasn't a fan of the process. It wasn't fun, it wasn't really all that enjoyable (getting to the weight I targeted was nice, but I'll admit I've back-slid considerably since then). The key to reaching the goal was that I openly, frequently, and regularly made those posts, good news or no, and thus each week I had a simple micro goal, making weight for the week, that kept me focused on the macro goal of reaching my ultimate goal weight.

Thus, with this desire/goal/action plan, I've decided on two initiatives. The first one is a personal one, and it has to do with getting back to "fighting weight" again, which with me being 6'2" is 200 pounds. That means I have 45 pounds to lose to make that goal. I know what works, and I know that the regular reporting to those I have delegated as my accountability partners is how I do that. Don't worry, those who read TESTHEAD will not be my accountability partners for that particular goal... but you will for my other initiative :).

Earlier today I mentioned setting up a different kind of book review I called the TESTHEAD PRACTICUM. This serves two purposes. The first is that it gives me a way to review books in a "practical" setting, where I can delve deeply into the technical aspects in ways that the other reviews don't really allow (the TESTHEAD BOOK CLUB reviews are meant more for a deep dive into concepts, and seeing if I understand them and can explain them to others). The PRACTICUM will let me focus on issues specific to implementation, to hooking things up, to my elations and frustrations when I get it and when I don't. And by their very design, since the entries are regular and, most important, recorded, TESTHEAD itself, and by extension all who read it, become my accountability partners. In short, for an initiative I claim is important, yet which remains my least favorite aspect of testing, I no longer have an excuse: I now have an avenue to make it more focused and, yes, more palatable. I'm actually going to do something I like to do (write about the process and the journey) as opposed to just slugging away at it "just because it's something I should be doing".

Again, I really appreciate everyone who reads this blog, leaves me comments, and encourages me on. You are helping to hone my skills as a writer and as a teacher. Now I ask your indulgence and encouragement once more (and hey, I can take some razzing, too, if it's warranted) as I look to expand that notion to the world of becoming proficient at programming. We're starting with Selenium; we'll take it from there :).

PRACTICUM: Selenium 1.0 Testing Tools

One of the books that I reviewed a couple of months ago was related to the audio editing tool Audacity, and was published by Packt Publishing in the United Kingdom. Now Packt has contacted me directly and asked if I'd be willing to review one of their newer titles, David Burns' "Selenium 1.0 Testing Tools". To this end, they are sending me a paperback copy of the book, and it should arrive within the next few days.

This is interesting in that it gives me a chance to do something a little different with my book reviews once again. My "Wednesday Book Reviews" have generally been quick summaries and a reason for the thumbs up. As veteran readers might know, I don't print reviews of books that I think poorly of. If you see that a book has been reviewed, it automatically means I thought highly enough of it to write about it. If I haven't reviewed a book, it means one of three things:

  1. I haven't read it.
  2. I'm in the process of reading it (and really, some books take a long time to get through; there's a reason why I haven't posted a review yet for William Perry's "Effective Methods for Software Testing" :) ).
  3. I've read the book, and decided I didn't like it.

Additionally, I have typically shied away from specific technologies, because many people may never use that particular technology, and I like my reviews to be consistent and apply to all of my readers whenever possible. I've since come to realize that there is space for focused technology books, since one of the more common search terms related to my site is "Audacity", and of course, the most oft returned listing is my review of "Getting Started with Audacity 1.3".

So what can I provide that would be interesting? How about an actual walk-through of what the book recommends? A practical review of the steps and exercises, my reactions to them, and what I learned in the process? This would be different from the reviews I've done in the past, and I feel it would be a fair and interesting way to review the title as well. Therefore, I'm going to start a new book review section called TESTHEAD PRACTICUM. This also addresses something I've said about other books I have reviewed in a shorter form: "now, this book reads well, but I can't really review it completely because I don't have the time to go through everything and see how well it actually works". A Beginner's Guide to a technology can get exactly that treatment! It's one thing to say that a book is easy and enjoyable to read, but how about the practical working knowledge it imparts?

Selenium 1.0 Testing Tools will be my first experiment with this process. It will be a longer-form review, along with a summary review at the end. It will not be as long or as specifically detailed as the TESTHEAD BOOK CLUB entries, because those are designed to do a deep dive on each chapter, and that may be overkill for what Packt would like to see or use.

So please join me in this new adventure, one that I hope will be instructional for all of us.

Monday, December 27, 2010

BOOK CLUB: How We Test Software At Microsoft (10/16)

This is the second part of Section 3 in “How We Test Software at Microsoft”. This is one of the chapters I have been most interested in reading (intrigued is possibly a more apt description). This chapter focuses on Test Automation, and specifically how Microsoft automates testing. Note, as in previous chapter reviews, Red Text means that the section in question is verbatim (or almost verbatim) from the actual book.

Chapter 10. Test Automation

Alan starts out this chapter with an explanation (and a re-assurance, in a way) that automation doesn't have to be such a scary thing. We've all done automation in one way, shape, form or another. If you have created or tweaked a batch file to run multiple commands, you have performed automation. When you do backups, or archive folders in Outlook, or create macros in Excel, you are creating automation. So at the core, most of us have already done the steps to automate things. Making a series of repeatable steps, however, is not really test automation. It's automation, to be sure, but test automation has a few more steps and some paradigms that are important to understand. More to the point, even with test automation, one size does not fit all, and different groups will use different processes to meet different needs.
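To make that distinction concrete, here's a minimal sketch of my own (not from the book, and the function names are purely illustrative): the first function merely automates a repeatable sequence of steps, while the second adds the verification that turns it into test automation.

```python
# Automation: run a repeatable sequence of steps (like a batch file would).
def archive_files(files):
    archived = []
    for name in files:
        archived.append(name + ".bak")  # stand-in for a real copy/compress step
    return archived

# Test automation: the same steps, plus an explicit check of the outcome.
def test_archive_files():
    result = archive_files(["a.txt", "b.txt"])
    assert result == ["a.txt.bak", "b.txt.bak"], "archive step produced wrong names"
    return "PASS"
```

The difference is small in code but large in intent: without the assertion, the script only does the work; with it, the script can also tell you whether the work succeeded.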

The Value of Automation

Alan takes this debate straight-away, and acknowledges that this is a contentious issue. He ultimately comes down on the side that testers are not shell scripts (let’s face it, if a company could realistically replace testers with shell scripts, we would be replaced). Testing and testers require the human brain to actually understand and evaluate the test results. Also, there are many areas where full automation is great and helpful, and some places where automation gets in the way of actually doing good work (or, for that matter, is just not practical to automate).

To Automate or Not to Automate, That Is the Question

Generally speaking, one of the best rules of thumb for determining which tests might be candidates for automation is "how often will this test be run?" If a test is only going to be run one time, it's probably a low-value candidate for automation. Even ten or twenty runs may not make it a candidate. However, if it's a test you will perform hundreds or even thousands of times, then automation becomes much more valuable as a testing tool.
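As a rough back-of-the-envelope model (my numbers, purely illustrative, not from the book): if a manual run costs 30 minutes, and automating the test costs 10 hours up front plus 1 minute per automated run, the break-even point is easy to compute.

```python
def break_even_runs(manual_minutes, automation_cost_minutes, automated_minutes):
    """Smallest run count where automation becomes cheaper than manual execution."""
    saved_per_run = manual_minutes - automated_minutes
    if saved_per_run <= 0:
        return None  # automation never pays off on time alone
    runs = 1
    while runs * automated_minutes + automation_cost_minutes > runs * manual_minutes:
        runs += 1
    return runs

# With these illustrative numbers, automation pays for itself after 21 runs.
runs_needed = break_even_runs(manual_minutes=30, automation_cost_minutes=600, automated_minutes=1)
print(runs_needed)
```

Run a test once and the up-front cost is pure loss; run it a thousand times and the cost per run all but vanishes, which is exactly the point being made above.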

Beyond times executed, Alan suggests the following as attributes that will help determine if a test is a candidate for automation:

  • Effort: Determining the effort or cost is the first step in determining the return on investment (ROI) of creating automated tests. Some types of products or features are simple to automate, whereas other areas are inherently problematic. For example, application programming interface (API) testing, as well as any other functionality exposed to the user in the form of a programming object, is more often than not straightforward to automate. User interface (UI) testing, on the other hand, can be problematic and frequently requires more effort.

  • Test lifetime: How many times will an automated test run before it becomes useless? Part of the process of determining whether to automate a specific scenario or test case includes estimating the long-term value of the test. Consider the life span of the product under test and the length of the product cycle. Different automation choices must be made for a product with no planned future versions on a short ship cycle than for a product on a two-year ship cycle with multiple follow-up releases planned.

  • Value: Consider the value of an automated test over its lifetime. Some testers say that the value of a test case is in finding bugs, but many bugs found by automated tests are only found the first time the test is run. Once the bug is fixed, these tests become regression tests—tests that show that recent changes do not cause previously working functionality to stop working. Many automation techniques can vary data used by the test or change the paths tested on each run of the test to continue to find bugs throughout the lifetime of the test. For products with a long lifetime, a growing suite of regression tests is an advantage—with so much complexity in the underlying software, a large number of tests that focus primarily on making sure functionality that worked before keeps on working is exceptionally advantageous.

  • Point of involvement: Most successful automation projects I have witnessed have occurred on teams where the test team was involved from the beginning of the project. Attempts at adding automated tests to a project in situations where the test team begins involvement close to or after code complete usually fail.

  • Accuracy: Good automation reports accurate results every time it runs. One of the biggest complaints from management regarding automated tests is the number of false positives automation often generates. (See the following sidebar titled "Positively false?") False positives are tests that report a failure, but the failure is caused by a bug somewhere in the test rather than a product bug. Some areas of a project (such as user interface components that are in flux) can be difficult to analyze by automated testing and can be more prone to reporting false positives.

Positively False?

The dangers with test automation are augmented when we do not take into account false positives or false negatives (using Alan’s non-software example, convicting someone of a crime they didn’t commit is a false positive. Letting someone go from a crime they did commit is a false negative).

We Don’t Have Time to Automate This

Alan describes the process of using Microsoft Test to write UI automation. His first job at Microsoft focused on networking functionality for Japanese, Chinese, and Korean versions of Windows 95. When he was ready to start automating the tests and asked when they needed to be finished, he recalls vividly hearing the words "Oh, no, we don’t have time to automate these tests, you just need to run them on every build."

Thinking the tests must have been really difficult to automate, he went back and started running the manual test cases. After a few days, the inevitable boredom set in, and with it, the just-as-inevitable missing of steps. "Surely," he thought, "a little batch file would help me run these tests more consistently." Within fifteen minutes, some additional batch files answered that question handily. He applied the same approach to the UI tests, and found he was able to automate them after all, saving weeks of time that freed him up to look at other areas that would otherwise never get consideration due to the time needed to finish the manual tests. At least a hundred bugs were found simply because Alan automated a series of tests he had been told couldn't or wouldn’t be automated.

User Interface Automation

APIs and forward-facing functions are good candidates for automation, but most of the time, the tools and promotional materials we as testers are most familiar with are those that automate at the user interface level. As someone who used to participate in the music industry, one of the coolest things you would ever see was a mixing board with “flying faders” that seemed to move up and down of their own accord based on the timing of the music (most often seen at what was referred to as “mix down”).

User interface automation provides a similar thrill. It has the “wow” factor, that “pop” that gets people’s attention, and in many ways it really excites testers because it provides automation at the level where testers and users are most likely to apply their energies when interacting with a product.

Over the past decade and a half, tools that could record and play back sequences of tests at the user interface level have been very popular, but somewhere in the process of recording and playing back tests, problems ensue. The biggest complaint about this paradigm is that the tests are “brittle”: any change to the way the software is called, or any trivial change to the interface or spawning area of a test, can cause the tests to fail.

Microsoft uses methods that bypass the presentation layer of the UI and interact with the underlying objects directly, as opposed to interacting with the UI by simulating mouse clicks and keystrokes. Recorded or simulated mouse clicks are the closest to the way the user interacts with the program, but they are also the most prone to errors. Instead of simulating mouse clicks or keystrokes, another approach is to automate the underlying actions represented by the click and keystroke. This interaction occurs by using what is referred to as an ”object model”. This way, tests can manipulate any portion of the UI without directly touching the specific UI controls.
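A toy sketch of the difference (all class and method names here are hypothetical, invented for illustration, not any real framework's API): driving the UI through an object model invokes the control's underlying action, rather than synthesizing input events at screen coordinates.

```python
class Button:
    """Hypothetical UI control exposing its action through an object model."""
    def __init__(self, label, on_click):
        self.label = label
        self._on_click = on_click

    def invoke(self):
        # Object-model approach: trigger the underlying action directly,
        # with no dependence on pixel coordinates or window layout.
        return self._on_click()

saved = []
save_button = Button("Save", on_click=lambda: saved.append("document"))
save_button.invoke()  # still works if the button moves, resizes, or is restyled
```

A coordinate-based click script would break the moment the button moved; the object-model call above does not, which is why the tests built this way are far less brittle.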

Microsoft Active Accessibility (MSAA) is another approach to writing automation. MSAA utilizes the “IAccessible” interface, which allows testers to get information about specific UI elements. Buttons, text boxes, list boxes, and scroll bars all utilize the IAccessible interface. By pointing to the IAccessible interface using various function calls, automated tests can use methods to get information about various controls or make changes to them. With the release of .NET 3.0, Microsoft UI Automation is the new accessibility framework for all operating systems that support Windows Presentation Foundation (WPF). UI Automation exposes all UI items as an Automation Element. Microsoft testers write automated UI tests using variations on these themes.

Brute Force UI Automation

While using the various models to simulate the UI can save time and effort, and even make for robust automated test cases, they are not in and of themselves foolproof. It’s possible to miss critical bugs if these are the only methods employed.

While testing a Windows CE device, Alan decided to try some automated tests to run overnight to find issues that might take days or weeks to appear for a user. In this case there wasn’t an object model for the application, so he applied “brute force UI automation” to find each of the windows on the screen that made up the application. By writing code that centered the mouse over the given window and sending a “click”, he was able to make a simple application that could connect to a server, verify the connection was created, and then terminate the terminal server session.
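The brute-force loop Alan describes might look roughly like this (my sketch; the window-finding, clicking, and verification hooks are hypothetical stand-ins for real platform calls):

```python
def run_connection_cycle(find_window, click_center, verify_connected, iterations):
    """Repeatedly connect, verify, and disconnect, stopping at the first failure.

    find_window, click_center, and verify_connected are hypothetical hooks
    standing in for real platform API calls.
    """
    for i in range(iterations):
        connect = find_window("Connect")
        click_center(connect)                 # brute force: click the window itself
        if not verify_connected():
            return ("FAIL", i)                # e.g. a slow leak finally manifests
        disconnect = find_window("Disconnect")
        click_center(disconnect)
    return ("PASS", iterations)

# Simulated device that "leaks" and stops connecting after 300 connections.
state = {"connections": 0}
def fake_find(name): return name
def fake_click(win): state["connections"] += (win == "Connect")
def fake_verify(): return state["connections"] <= 300

result = run_connection_cycle(fake_find, fake_click, fake_verify, 1000)
print(result)
```

The point of the overnight run is visible in the simulation: the failure only appears after hundreds of iterations, far more than anyone would perform by hand.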

After some time, he noticed that the application had crashed after a long stretch of successful runs. After debugging, it was determined that there was a memory leak in the application; it required a few hundred connections before it would manifest.

In this case, the difference was that he focused on automating against the UI directly instead of the underlying objects. Had he focused on the underlying objects and not the specific Windows interface elements, this issue might never have been caught. Key takeaway: sometimes directly accessing the UI at the user level is the best way to run a test, so keep all options open when considering how to automate tests.

What’s in a Test?

Test automation is more than just running a sequence of steps. The environment has to be in a state where the test can be run. After running the test, criteria must be examined to confirm a PASS/FAIL, and test results must be saved for review and analysis. Tests also need to be torn down so that the system is in the proper state to rerun the test or run the next one. Additionally, the test needs to be understandable so that it can be maintained or modified if necessary.
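Python's standard unittest module bakes exactly this structure in; a minimal example of mine (not from the book) showing per-test setup, execution with verification, and teardown:

```python
import io
import unittest

class QueueTest(unittest.TestCase):
    def setUp(self):
        # Bring the system to a known state before each test.
        self.queue = []

    def test_enqueue(self):
        self.queue.append("job-1")
        self.assertEqual(self.queue, ["job-1"])  # the pass/fail criteria

    def tearDown(self):
        # Return to a clean state so the next test can proceed.
        self.queue.clear()

# Run the suite programmatically (unittest.main() would do this from a script).
suite = unittest.defaultTestLoader.loadTestsFromTestCase(QueueTest)
outcome = unittest.TextTestRunner(stream=io.StringIO()).run(suite)
```

The framework calls setUp and tearDown around every test method, so each test starts and ends in a known state regardless of what the previous one did.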

There are many tests where an actively thinking and analyzing human being can determine what’s happening in a system much more effectively than the machine itself can. Even with that, automated tests can save lots of time and, by extension, money.

Keith Stobie and Mark Bergman describe the components of test automation in their 1992 paper "How to Automate Testing: The Big Picture" in terms of the acronym SEARCH:


  • Setup: Setup is the effort it takes to bring the software to a point where the actual test operation is ready for execution.

  • Execution: This is the core of the test—the specific steps necessary to verify functionality, sufficient error handling, or some other relevant task.

  • Analysis: Analysis is the process of determining whether the test passes or fails. This is the most important step—and often the most complicated step of a test.

  • Reporting: Reporting includes display and dissemination of the analysis, for example, log files, database, or other generated files.

  • Cleanup: The cleanup phase returns the software to a known state so that the next test can proceed.

  • Help: A help system enables maintenance and robustness of the test case throughout its life.
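The SEARCH phases above can be sketched as a tiny harness (my own sketch; the paper describes the concepts, not this code; Help, the sixth letter, lives in documentation and comments rather than in the flow itself):

```python
def run_search_test(setup, execute, analyze, report, cleanup):
    """Drive one test through Setup, Execution, Analysis, Reporting, Cleanup."""
    context = setup()                       # Setup: reach a runnable state
    try:
        observed = execute(context)         # Execution: the core test steps
        verdict = "PASS" if analyze(observed) else "FAIL"  # Analysis
    finally:
        cleanup(context)                    # Cleanup: always restore a known state
    report(verdict)                         # Reporting: disseminate the result
    return verdict

log = []
verdict = run_search_test(
    setup=lambda: {"input": 21},
    analyze=lambda observed: observed == 42,
    execute=lambda ctx: ctx["input"] * 2,
    report=lambda v: log.append(v),
    cleanup=lambda ctx: ctx.clear(),
)
```

Note that cleanup sits in a finally block: even if execution blows up, the harness still returns the system to a known state, which is the whole point of that phase.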

Oftentimes, the entire approach is automated, so all steps are performed by the automation procedures. At other times, the automation may only go so far and then stop, allowing the tester to continue the rest of the way manually. An example of this is the tool Texter. Texter allows the user to set up a number of fillable forms or provide text for fields in a scripted manner; the user can then take control and walk through the test themselves, calling on other Texter scripts when needed to fill in large amounts of data that would be tedious to enter manually, while still retaining complete manual control of the application’s execution.

Many automation efforts fail because test execution is all that is considered. There are many more steps to take into account: setting up the test, preparing the environment, running the test, gathering the information necessary to determine a PASS/FAIL, storing the results in a report or log for analysis, tearing down the test to put the machine back into a state where other tests can be run (or back to its initial state), and finally providing a reporting mechanism that shows exactly which case was run, whether it passed or failed, and how that relates to the other tests.

Alan includes an extensive layout of a series of tests that cover the entire SEARCH acronym. Due to the amount of space it would take to walk through the entire example, I have omitted it from this review, instead choosing to focus on some areas that interested me (see below). For those who wish to check out the entire example, please reference the book. It’s very thorough :).

I Didn’t Know You Could Automate That

Sometimes automating certain tasks can be anywhere from very difficult to downright impossible. However, before giving up, it may be worth investigating further to see what you might be able to do.

An example is testing PCMCIA devices by simulating plugging in and removing the devices, a task that would be challenging to perform manually and impossible to automate without a “robot”… that is, if the actual test required plugging in actual devices over and over. Microsoft utilizes a device called a "PCMCIA Jukebox", which contains six PCMCIA slots and a serial connection. Through software, testers could turn a device on or off and simulate the insertion and removal of a particular device. Using these jukeboxes and other “switching” tools allowed the testers to simulate and automate tests where physical insertion or removal would be practically impossible.

The Challenge of Oracles

When determining if a test passes or fails, the program needs to be able to compare the results with a reference. This reference is called an “Oracle”. When tests are run manually, it’s often very easy to determine if a test passes or fails, because we are subconsciously comparing the state of the test and the results with what we know the appropriate response should be. A computer program that is automating a test doesn’t have the ability to do that unless it is explicitly told to, so it needs to reference an Oracle directly. Whether it be a truth table, a function call or some other method of comparison, to determine if a test passes or fails, there has to be a structure in place to help make that determination.
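In code, an oracle is simply the reference the automated check consults. A minimal sketch of mine (illustrative, not from the book), using a small truth table of expected results:

```python
# The oracle: a truth table of expected results the automation compares against.
# (Python 3's round() uses banker's rounding, hence 0.5 -> 0 and 2.5 -> 2.)
ROUNDING_ORACLE = {0.4: 0, 0.5: 0, 1.5: 2, 2.5: 2}

def check_against_oracle(func, oracle):
    """Compare a function's actual output against the oracle's expectations."""
    failures = [(x, func(x), want) for x, want in oracle.items() if func(x) != want]
    return ("PASS", []) if not failures else ("FAIL", failures)

status = check_against_oracle(round, ROUNDING_ORACLE)
```

The human tester carries this table in their head; the automated test has to be handed it explicitly, which is exactly the challenge the section describes.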

Determining Pass/Fail

Examples of test reporting also go beyond simple Pass/Fail. Below are some common test results that need to be considered in any test automation scenarios:

  • Pass: The test passed.

  • Fail: The test failed.

  • Skip: Skipped tests typically occur when tests are available for optional functionality. For example, a test suite for video cards might include tests that run only if certain features are enabled in hardware.

  • Abort: The most common example of an aborted result occurs when expected support files are not available. For example, if test collateral (additional files needed to run the test) are located on a network share and that share is unavailable, the test is aborted.

  • Block: The test was blocked from running by a known application or system issue. Marking tests as blocked (rather than failed) when they cannot run as a result of a known bug keeps failure rates from being artificially high, but high numbers of blocked test cases indicate areas where quality and functionality are untested and unknown.

  • Warn: The test passed but indicated warnings that might need to be examined in greater detail.
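These outcomes map naturally onto an enumeration; a small sketch (status names from the list above, the code and helper are mine):

```python
from enum import Enum

class TestResult(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"    # optional functionality not present
    ABORT = "abort"  # required test collateral unavailable
    BLOCK = "block"  # a known bug prevents the test from running
    WARN = "warn"    # passed, but with warnings worth examining

def exercised_the_product(result):
    """Only PASS and WARN show the functionality actually ran and worked."""
    return result in (TestResult.PASS, TestResult.WARN)
```

Distinguishing BLOCK and ABORT from FAIL matters for the reporting described next: they keep failure rates honest while still flagging areas whose quality is unknown.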


In many cases, log files that are created *are* the report, and for certain projects that is sufficient. For larger projects however, the ability to aggregate those log files or condense them to make a more coherent or abbreviated report is essential to understanding or analyzing the results of the tests.

One method is to go through and pull out key pieces of data from log files and append them to another file in a formatted way to help simplify and consolidate the results in an easier to view report. From this output a determination can be made as to which tests passed, which failed, which were skipped, which were aborted, etc.
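A condensed-report pass over raw log lines might look like this sketch (the log format here is invented for illustration):

```python
from collections import Counter

def summarize_logs(log_lines):
    """Pull the RESULT lines out of raw logs and tally them into a summary."""
    tally = Counter()
    for line in log_lines:
        if line.startswith("RESULT:"):
            _, name, status = line.split(":", 2)   # "RESULT: <test>: <status>"
            tally[status.strip().upper()] += 1
    return dict(tally)

raw = [
    "INFO: starting suite",
    "RESULT: login_test: pass",
    "RESULT: upload_test: fail",
    "RESULT: search_test: pass",
    "DEBUG: teardown complete",
]
summary = summarize_logs(raw)
```

The full logs remain available for digging into a specific failure; the summary exists so a human can see the shape of a run at a glance.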

Putting It All Together

An automation system contains lots of moving parts. Apart from the test harness itself, there are automated steps to retrieve tests from the test management system. Those tests are then mapped to existing test scripts or executables that are run to execute the automation. Computers and devices must be in place to run the tests. The results are parsed, reports are made and posted where appropriate, and the results are stored back to the test case manager.

Alan points out that as of this writing, there are more than 100,000 computers at Microsoft dedicated to automated testing.

Large-Scale Test Automation

Many of the automation initiatives at Microsoft are start-to-finish processes:

  1. A command executed from the command line, a Web page, or an application.
  2. The TCM constructs a list of tests to run.
  3. Test cases are cross-referenced with binaries or scripts that execute the test.
  4. A shared directory on one of the database servers or on another computer in the system may contain additional files needed for the tests to run.
  5. The automation database and test case manager contact the test controllers, which configure the test computers to run the specified tests.
  6. Once preparation of the test computer is complete, the test controller runs the test.
  7. The test controller waits for the test to complete (or crash), and then obtains the log file from the test computer.
  8. The controller parses the log file for results or sends the log file back to the TCM or another computer to be parsed.
  9. After the log files are examined, a test result is determined and recorded.
  10. The file is saved and results are recorded.
  11. Some systems will automatically report failures directly to the bug tracking system.
  12. Results are presented in a variety of reports, and the test team examines the failures.
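Stripped to its essentials, that pipeline is: fetch the tests, run each one, parse its log, and record a result. A toy sketch of mine, with in-memory stand-ins for the TCM, the test controller, and the log parser:

```python
def run_pipeline(tcm, execute, parse_log):
    """Toy end-to-end flow: the TCM supplies tests, the controller runs them,
    and logs are parsed into recorded results."""
    results = {}
    for test_id, script in tcm.items():         # build the list of tests to run
        log_file = execute(script)              # dispatch to a controller and run
        results[test_id] = parse_log(log_file)  # parse the log, determine a result
    return results

tcm = {"TC-101": "check_login", "TC-102": "check_upload"}
fake_logs = {"check_login": "exit=0", "check_upload": "exit=1"}
results = run_pipeline(
    tcm,
    execute=lambda script: fake_logs[script],
    parse_log=lambda log: "PASS" if log == "exit=0" else "FAIL",
)
```

The real system distributes these stages across controllers, database servers, and thousands of test machines, but the data flow is the same shape.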

Common Automation Mistakes

Test automation is as much of a coding discipline as software development, and as such, shares many of the benefits and disadvantages of the software development life cycle. In short, testers writing automation code make mistakes and create bugs in automation code, too.

Production code is tested, but that raises the question: who tests the test code? Some could say that repeated test runs and the ability to complete them verify that the test code is working properly… but do they? The goal of nearly every Microsoft team is for test code to have the same quality as the production code it is testing. How do the SDETs do that?

Alan includes the following common errors seen when writing test code:

  • Hard-coded paths: Tests often need external files during test execution. The quickest and simplest method to point the test to a network share or other location is to embed the path in the source file. Unfortunately, paths can change and servers can be reconfigured or retired. It is a much better practice to store information about support files in the TCM or automation database.

  • Complexity: The complexity issues discussed in Chapter 7 are just as prevalent in test code as they are in production code. The goal for test code must be to write the simplest code possible to test the feature sufficiently.

  • Difficult Debugging: When a failure occurs, debugging should be a quick and painless procedure—not a multi-hour time investment for the tester. Insufficient logging is a key contributor to making debugging difficult. When a test fails, it is a good practice to log why the test failed. "Streaming test failed: buffer size expected 2048, actual size 1024" is a much better result than "Streaming test failed: bad buffer size" or simply "Streaming test failed." With good logging information, failures can be reported and fixed without ever needing to touch a debugger.

  • False positives: A tester investigates a failure and discovers that the product code is fine, but a bug in her test caused the test to report a failure result. The opposite of this, a false negative, is much worse—a test incorrectly reports a passing result. When analyzing test results, testers examine failures, not passing tests. Unless a test with a false negative repeats in another test or is caught by an internal user during normal usage, the consequences of false negatives are bugs in the hands of the consumer.
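The logging advice above is easy to encode: put the expected and actual values into every failure message. A small sketch (function and message wording are mine, echoing the book's buffer-size example):

```python
def check_buffer_size(actual, expected=2048):
    """Return a log message that makes failures debuggable without a debugger."""
    if actual == expected:
        return "Streaming test passed"
    # Good: report expected vs. actual values, not just "bad buffer size".
    return (f"Streaming test failed: buffer size expected {expected}, "
            f"actual size {actual}")
```

With messages like this, a failure in an overnight run can often be triaged and fixed from the log alone, without ever reproducing it under a debugger.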

Automated testing is considered a core competency at Microsoft. The products under test are large and complex, and testing them all without significant automation would range from very difficult to practically impossible, even with Microsoft's vast number of testers. Significant automation efforts are required, and are thus in place. Scaling automation and making repeated use of automated test cases is very important to Microsoft’s long-term strategy. With potentially thousands of configuration options and dozens of supported languages, the more robust and extensible the automation platforms, the more testing can be performed automatically, freeing up the testers for more specific and targeted testing where automation is impractical if not impossible.

Day 16, 17 & 18 of 40: TESTHEAD BOOT CAMP: DZone Refcards, a Bunch of Them

Here’s hoping that everyone has had a wonderful Christmas holiday (if you celebrate it), and if not, hey, I hope you had a great weekend, too :).

It’s time to get back in the saddle again, and I’ve discovered that Newton’s First Law of Motion absolutely applies to writing as well. When I get into a groove writing, I can keep on doing it fairly regularly and without interruption. When I’ve left off writing for a couple of days, getting started again is really frustrating. Thankfully, I realized I was behind on my regular updates leading up to my 40 days, so that gave me a perfect excuse to start writing again.

I’ll have another chapter review for How We Test Software at Microsoft later today, but this chapter is fairly meaty. It deals with Test Automation, and that in and of itself is a huge topic, even if it’s specific to just what Microsoft does.

In my regular reading and poking around, I was happy to find a service called Refcardz from DZone. Anyone who has gone to a college or university will probably remember the quick reference cards sold in the University bookstore for a number of topics (I actually still have several related to Pre-Calculus, Trigonometry, Calculus and a series for grammar and writing called “What’s The Word?” that I have been using to help me hone my blog posts). Anyway, DZone has a number of these cards for free (Registration on site required) and the cards are formatted almost exactly like the old university quick reference cards are. There are over a hundred of them, and while perusing the list, I found cards related to the following areas:


  • Selenium 1
  • Selenium 2
  • Flexible Rails

For a quick reference to information and concepts, these are pretty nice.

Saturday, December 25, 2010


Wishing everyone a Merry Christmas. I'm taking a day off from testing, test study, test punditry and looking forward to enjoying the time with my family. I hope you all can do the same.

I'll be back tomorrow with more TESTHEAD fun & games :).

Friday, December 24, 2010

TWiST #25 and TWiST-Plus: Some Tech Talk for Christmas Eve!

TWiST turns 25 today :)!

Well, actually, TWiST has now posted its 25th episode, and with the fact that we had one week where a show was missed early on due to technical issues, that means that TWiST has reached the six month mark as a weekly podcast. It's been interesting to see how the show has progressed, and more to the point, to see how the technical details of producing the show have advanced.

Someone once asked me how much time it takes to do an episode, and I've always had to say "it depends", because the total audio of the shows varies from 15 to 30 minutes depending on the guests and the topics. I've finally come to the realization that every minute of audio takes, on average, ten minutes of production. Why? Because I go through and scrub the audio for various things: I remove unwanted sound artifacts like pops and screeches that occasionally show up when recording digital audio. I level and adjust audio to correct the balance and to remove sibilance in places. I patch around drop-outs. I remove dead air, and I clean out "extra words".

Early on, when I first posted my "tech of TWiST" comments, I said that I tried to walk a fine line. I edit out what I call "pause words" like "um", "er", and "you know", while trying not to chop up the interview too much. I also try not to disrupt the natural flow of the conversation, but in truth, I do apply a pretty heavy hand here. It's a sort of OCD thing for me; I find those words distracting, and I start counting stutters rather than listening to the conversation. So if you have noticed that the interviewer and interviewees tend to be really crystal clear, yeah, there's a reason for that :). Also, if you are really on the ball and sense that there are some clicks that appear here and there inexplicably, you're not imagining it. I'm getting better at smoothing out those edit transitions; in some environments (noisy ones in particular) it's not easy to make edits that are seamless, but each episode I get a little better.

This week's main TWiST episode is with Web Performance's Michael Cziezperger. Matt met Michael at STPCON in October of 2010, and said he gave one of the most highly rated talks there. The main topic was performance testing, and dealing with the insane challenges of people in perpetual crisis mode. He also talked a bit about the program that they use and offer (including a demo version that I talked about yesterday in my TESTHEAD BOOT CAMP post). If you’d like to hear this week's episode, please go to Episode 25.

This week, since it's Christmas Eve, I decided to throw in another of my poster paper interviews from the Pacific Northwest Software Quality Conference I recorded in October. This TWiST-Plus episode covers the idea of adding White-Box techniques to automated testing and was presented by Sushil Karwa and Sasmita Panda, both being test engineers with McAfee and working out of Bangalore, India. Since doing the interview, Sushil has accepted another job in London, UK, so my congratulations to him on his new endeavor. For the episode, you can listen to it here.

Standard disclaimer:

Each TWiST podcast is free for 30 days, but you have to be a basic member to access it. After 30 days, you have to have a Pro Membership to access it, so either head on over quickly (depending on when you see this) or consider upgrading to a Pro membership so that you can get to the podcasts and the entire library whenever you want to :). In addition, Pro membership allows you to access and download to the entire archive of Software Test and Quality Assurance Magazine, and its issues under its former name, Software Test and Performance.

TWiST-Plus is all extra material, and as such is not hosted behind STP’s site model. There is no limitation to accessing TWiST-Plus material, just click the link to download and listen.

Again, my thanks to STP for hosting the podcasts and storing the archive. We hope you enjoy listening to them as much as we enjoy making them :).

Thursday, December 23, 2010

Day 13, 14, & 15 of 40: TESTHEAD BOOT CAMP: AJAX and Performance Test Demo

Man, too many things going on in the days leading up to Christmas; I've fallen behind in my updates. This post covers three days, which have basically been spent reading about and learning about AJAX architecture, editing audio for the TWiST podcast, and doing some priming on performance testing. The TWiST podcast scheduled for this week is actually the impetus for today's tool recommendation.

Load and performance testing can be a long and tedious process. Wouldn't it be nice to have a quick and robust tool that you could do some basic testing with (something where simulating ten active users would be enough to get some answers)? If that sounds like something you would like to play with, head over and download the Web Performance Load Tester demo. It's a demo that does not expire, but it is capped at simulating ten users.

Again, it’s not necessarily something that will blow the roof off of the place or flood your site, but if you want to get some ideas as to how your application handles a moderate load and practice some performance techniques, this is a good option to play with.

Wednesday, December 22, 2010

BOOK CLUB: How We Test Software At Microsoft (9/16)

This is the first part of Section 3 in “How We Test Software at Microsoft”. This section focuses on test workflow and testing tools. For this chapter the focus is on Bugs and Test Cases.

Chapter 9. Managing Bugs and Test Cases

Alan relays his story about writing and managing a bug tracking system prior to his arrival at Microsoft, and some of the myriad challenges he had to deal with in that environment. Prior to developing this system, bugs were tracked on whiteboards, color coded sticky-notes, and email messages. While OK on the surface, they needed to store more information about the issues they found and how issues were fixed so that those fixes could be tested and validated. Also, having a reporting mechanism was considered vital, so that they could monitor which bugs were fixed and when. Alan considered it an OK first draft for a system, but it had its shortcomings, and those shortcomings would be part of the reason the system would be replaced about a year after Alan left.

Microsoft has a professional and fully supported bug tracking system that has been revised and modified a number of times over the years.

With this chapter, the actual tools and methodologies used at Microsoft come into focus. This section and the ones that follow will obviously not match exactly everyone else’s experience outside of Microsoft, but it will give an idea as to what Microsoft actually uses and how they actually manage their bugs and their test cases.

The Bug Workflow

The two largest collections of “artifacts” created during the software development lifecycle by the test team are test cases and bug entries.

At the most basic level, test cases are the “what we do” part, and the bug entries are the “here’s what we saw” part. Alan uses this section to explain the life of a typical bug, and the steps that the teams follow from creation to resolution.

Bug Tracking

Bugs are the single largest artifact that testers create. If we do not create automation scripts or test framework code, bugs and bug reports are often the only tangible artifact that points to our existence as testers (barring test cases and test plan documents). It's possible for a tester to discover thousands of bugs in a given software release (and that may be on the light side if the product in question is complex or has a lot of user features).

Using a bug system is one way to keep track of where and how issues are progressing (or not progressing) and how to work with them. While this chapter deals with Microsoft’s methods, I’ll also be dropping in my own comments about my experiences and how I have handled bugs and how it compares or contrasts with Microsoft’s approach.

A Bug’s Life

Bugs are subjective, but the easiest way I know how to describe one (and I'm paraphrasing a number of people's comments here, including James Bach, Michael Bolton, and Cem Kaner) is that "it's a pain point to someone or some entity that matters". Whether it is design specific, implementation specific, or interface related, if someone who matters determines that it's not what they want to see or expect to see, you have a bug.

Bugs are usually found in three ways: by developers, by testers, or by users. When developers find them, they are often refactored out of the code and the issues are fixed with potentially little to no mention of their existence. If a tester finds them while performing tests, they are often recorded in a bug tracking system or written on note cards to be handed to the developer (I've done both). If they are found by users, often they are entered into a Customer Relationship Management (CRM) application and the bug system (and injected into lots of meetings, often with much visible hand wringing and verbal haranguing about "how could we have let this out into the field"... but that deserves its own post entirely :) ).

At Microsoft, a triage team will often review the issues and determine their severity, whether and to whom issues should be assigned, and at what priority. One thing that Microsoft and the work environments I've been in have in common is that, often, issues that seem like they would be trivial to fix can spiral into huge problems. Sometimes the most practical "fix" is to leave it alone until later.

Often, issues that are not fixed are left to be resolved in a later release, or become the subject of Microsoft Knowledge Base articles (where workarounds or other potential options are published until the issue can be directly addressed). Issues that are determined to require fixing go through a process of being assigned, worked on, submitted, approved, and integrated into a build. If the fix works, the issue is considered resolved. If not, it gets reassigned and the process begins again. Even when a bug is resolved and closed, that may not be the end of the line. Oftentimes, bugs that have been fixed are examined later to determine root cause or uncover related issues.

Alan makes the point that bugs are often seen as "something that has gone wrong," but in truth, bugs are also the lifeblood of engineering, in that those bugs are what prompt changes, design modifications, and even completely different paths of potential future development.

Note: More than 15 million bug and project management entries were created in Microsoft systems in 2007.

Attributes of a Bug Tracking System

Alan points out that most software test teams, regardless of the size of their organization, interact with or are users of a bug tracking system. Even at my company, where I am the lone tester, we do indeed have a bug tracking system and it is the center of my universe :). Seriously, it is very often the most accessed tool in any given test day.

Ease of use is critical, because the ability to enter issues quickly and directly will determine whether people actively use the bug tracking system. A difficult or cumbersome interface will discourage people from using it. Being able to configure the system so that different projects can track different information is also important. Some projects require just a little information, while others that deal with specific components and their interactions will likely require more detail. Bug tracking systems are among the systems most used by testers and developers alike, and thus they are often one of the most mission-critical applications to keep running.

Alan also presents other attributes to consider:

• Bug notification
• Interoperability
• External user access

Why Write a Bug Report?

While it may be easier to go and have a chat with a developer about various issues, it is important to understand how issues come about, what causes a particular bug to appear, and how that bug might interact with other parts of the system. For this reason, it’s important to encourage and have testers document their issues in an issue tracker system.

Reports allow engineers, managers, and testers to see trends, look for root causes, and determine how efficiently groups are finding and fixing issues. They give future development and sustaining teams the ability to look back, see which issues appeared where, and help guide future development initiatives. Sometimes bug reports can even provide legal protection in the event of a lawsuit, by documenting whether an issue, if determined to be a defect, could have been fixed.

Anatomy of a Bug Report

Most reporting systems have the same features when it comes to creating and categorizing bugs. Likewise, across organizations and companies, knowing how to create effective and helpful bug reports is an essential skill for any tester.

Without going into too much detail, issues all have the following in common:

  • A Title that makes clear what the problem is. Good titles are vital to communication. They shouldn't be too short or too wordy, and should be clear as to what the issue is.
  • A Description that gives a clear summary of the issue, including impacts, expected results vs. actual results, and steps to reproduce.
  • A Status to say what the bug's current state is (New, Assigned, Fixed, Closed, etc.)
  • A version number to identify which build or release the issue was found in.
  • The feature area or component where the issue occurs.
  • Steps to reproduce (in my system at my current company, and in most places I’ve used a bug database, the steps to reproduce are part of the description).
  • An assigned to field so that everyone knows who is working on which issues.
  • The severity of the issue (Catastrophic, Severe, Normal, Minor, Cosmetic, etc.)
  • Priority or customer impact
  • Specific Environment details, if relevant.

Additional fields in common use in a bug database include the following:

  • How Found
  • Issue Type
  • Bug Type
  • Source

Having too many fields in a tracking system can make the bug tracking application difficult to interact with, or cause unneeded time to complete issue reporting or tracking. Alan recommends keeping the fields to the relevant ones that will help expedite the reporting, tracking and resolving of issues, whatever those may be. I concur :).
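To make those fields a bit more concrete, here's a minimal sketch in Python of how a bug report with the common fields above might be modeled. Everything here (the names, the sample bug, the version string) is my own hypothetical illustration, not anything from Alan's book or a real Microsoft system:

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    FIXED = "Fixed"
    CLOSED = "Closed"

@dataclass
class BugReport:
    title: str            # clear, not too short or too wordy
    description: str      # summary, expected vs. actual results, repro steps
    version: str          # build or release where the issue was found
    component: str        # feature area where the issue occurs
    severity: str         # e.g. "Catastrophic" through "Cosmetic"
    status: Status = Status.NEW
    assigned_to: str = ""     # so everyone knows who is working on it
    environment: str = ""     # specific environment details, if relevant

# A hypothetical report using the fields above.
bug = BugReport(
    title="Login button unresponsive after session timeout",
    description="Expected: redirect to login page. Actual: button does nothing.",
    version="2.4.1103",
    component="Authentication",
    severity="Severe",
)
```

The point isn't the exact fields; it's that keeping the structure small and relevant makes a report quick to file, and that's what gets people to actually use the system.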

Bug Triage

Triage is a fancy way of saying prioritization, and the term comes from hospital emergency rooms. When faced with many cases at once, it's important to treat the most critical issues first and deal with the less critical ones afterwards.

Bugs get the triage treatment as well, and some bugs are higher priority than others. An oft-used phrase in testing is that the best testers are the ones who find the most bugs. That may be true, but Cem Kaner, in Testing Computer Software, opines that in reality the best testers are the ones who report the most bugs that get fixed. There's a subtle difference there, but it's a profound one. Reporting lots of low priority bugs may look good in reports, but what does it mean for the productivity of the development team or the overall health of the product? The bugs that are the most critical need attention first, along with a determination of whether they must be fixed now, should be fixed, or "would be nice to fix but aren't all that important".

The goal is to reach zero bugs, but that may not be practical or reasonable. Many of the teams I have been part of have had the severity of bugs be the deciding factor: no "A" bugs (catastrophic) or "B" bugs (severe) could remain, but several "C" bugs (moderate or minor) were allowed to ship. The point is that "zero bugs" really means "zero bugs that should prevent a product from shipping". It almost never means that there are zero bugs in the system that has shipped (at least I've never worked on a project that has ever reached that vaunted milestone!).

Common Mistakes in Bug Reports

Oftentimes a bug reporting system can be rendered less effective, or even useless, if bugs are reported in a haphazard way. Some common problems with reporting bugs are:

  • Treating the issue tracker as a messaging tool. Avoid including informal or distracting details that can dilute or derail the main message.
  • Bugs morphing from one problem into another. If there are two issues to be tracked, file two separate reports and keep the details separate.
  • Tracking multiple bugs in the same report. It's too difficult to keep straight which issue is fixed in a subsequent build.
  • Multiple testers entering the same bug at different times.

Out of all the "bug sins", I personally (and from my reading, Alan agrees) consider duplicate bugs the least serious. It's probably the most common issue testers face, and they often get chastised for it. However, if there really are duplicate issues, most systems allow duplicate bugs to be merged, so it shouldn't be treated as a capital offense. Though issues may be duplicates, each entry may provide unique information that helps describe a problem more completely. And if I have to choose between dealing with two or three duplicate bugs, or not getting the information at all because a tester is gun-shy about entering a duplicate issue, heck, enter the duplicate issue. Again, that's what the merge feature in many bug tracking systems is for. It's also possible that the issue isn't a duplicate at all, but a similar issue that points to a different problem than the one originally reported.

Using the Data

One of the things a good bug tracking system will give you is reports, and those reports can, frankly, say anything an organization wants them to say. They can point to progress, show where teams are being slowed down, extrapolate a "burn rate" over time showing how many reported bugs are being fixed, and so on. Additional helpful items that bug entries and their aggregated reports can provide:

  • Bugs found to bugs fixed
  • Total bugs per language
  • Bug fix rate over time
  • Bugs by code area
  • Bugs found by which department
  • Bugs by severity
  • Where found
  • When found
  • How found
  • When introduced
  • Bug reactivation rate
  • Average time to resolve
  • Average time to close
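Several of these are simple aggregations once the bug entries live in one place. As a rough sketch (the sample data below is invented purely for illustration), a few of those report numbers could be computed like this:

```python
from collections import Counter

# Invented sample data: (severity, status) pairs standing in for bug entries.
bugs = [
    ("Severe", "Fixed"), ("Minor", "Open"), ("Severe", "Fixed"),
    ("Cosmetic", "Closed"), ("Normal", "Open"), ("Minor", "Fixed"),
]

found = len(bugs)
fixed = sum(1 for _, status in bugs if status in ("Fixed", "Closed"))
by_severity = Counter(severity for severity, _ in bugs)

print(f"Bugs found to bugs fixed: {found}:{fixed}")  # 6:4
print(f"Bugs by severity: {dict(by_severity)}")
```

Tracked per week or per build, these same counts become the fix rate and "burn rate" trends over time.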

How Not to Use the Data: Bugs as Performance Metrics

Many testers know well the tyranny of the bug database. It's often used as a measure of a tester's ability and performance, a yardstick to see which tester is better than the others, or who is "the weakest link". Alan makes the case that there are too many variables for this to be an effective way to measure tester performance, including the complexity of features, the ability of the developers on different projects, specification details, when bugs are reported, or even whether one group does a lot more unit testing than another, leaving fewer bugs downstream to find. Consider Alan's fishing metaphor from a couple of chapters back: two fishermen in two separate parts of a river. Odds are the conditions might be different enough for one fisherman to catch far more fish than the other, even though they may be of equal ability.

Other factors also come into play, such as severity of issues (one catastrophic bug vs. ten cosmetic issues), time spent tracking down issues, level of detail in bug reports, etc. Specifically, what does a high bug count mean? Does it mean we are looking at a great tester, or a poor developer (or even just an early iteration in a product life cycle)? The opposite may be true as well: a low bug count could mean a poor tester, or it could mean excellent, low-defect code.

Bug Bars

This is an interesting idea that Microsoft uses. It's a system where only a certain number of bugs can be assigned to any one developer at any given time. Of course, this is a system that could be abused or gamed if taken to extremes.

The intent of a bug bar is to encourage issues to be resolved early in the process and not wait until the end of a project. Alan also states that, for this approach to work in its intended way, bugs also have to be found as early as possible, too.

Test Case Management

In most testing environments, the other artifact most commonly associated with testers is the test case. The reason is simple: test cases document what a tester does, and often, following those test cases is what helps the tester find the bugs in the first place.

My company has a few products released, and each has specific testing needs. At the current time, those test cases are managed in Excel spreadsheets. Even then, we still manage thousands of test cases for our projects. With a larger company like Microsoft, the need to handle tens or hundreds of thousands (or millions) of test cases goes way beyond any spreadsheet system. In this case, using a dedicated Test Case Management system makes a lot of sense. A TCM can define, version control, store, and even execute various test cases. Many test case management applications and bug tracking systems can be integrated together, and Microsoft Visual Studio Test Tools allow for that capability.

What Is a Test Case?

Any action that is performed against a software feature, component, or sub-system where an outcome can be seen and recorded can be considered a test case. Test cases can cover things as small as a function to as large as an installation procedure or beyond.

Test cases are typically described as a written method where a user can take an expected input value or action, and observe an expected output or result. Manual tests and automated tests share these attributes.

The Value of a Test Case

Test cases provide more than just steps to see if a system works as expected or determine if a bug exists. Other important purposes of tests include:

  • Historical reference
  • Tracking test progress
  • Repeatability

Drawbacks to test cases include:

  • Documentation time
  • Test cases get out of date as features change
  • Difficult to assume knowledge of reader

Not all test cases are documented specifically, and not all testing goes by scripted test cases. Exploratory testing is often performed “off script” where specific defined test cases are not in place, or the ideas go beyond the test case definitions as written. Good testing uses scripted tests and test cases, but also looks beyond those cases at times, too.

Anatomy of a Test Case

So what goes into a sample Microsoft test case?

  • Purpose
  • Conditions
  • Specific inputs and steps
  • Predicted results

Additionally test cases could also include:

  • Test frequency
  • Configurations
  • Automation (Manual Test Cases, Semi Automated Test Cases, Fully Automated Test Cases)

Test Case Mistakes

Test cases can be just as prone to mistakes as software code, and ill-designed test cases can make testing efforts less effective (or entirely ineffective). Areas where test case design can go wrong include:

  • Missing steps
  • Too verbose
  • Too much jargon
  • Unclear pass/fail criteria

Managing Test Cases

When dealing with very large systems, or systems that involve a lot of testers and a lot of components, a management system for test cases eventually becomes a necessity. Microsoft uses their development tools (Product Studio and Visual Studio Team System) to track test cases alongside their issue tracking system. This allows users to link test cases to issues found. Test cases can also be linked to functional requirements in Microsoft's system.

This system allows the following views:

  • knowing how many test cases exist for each requirement
  • knowing which requirements do not have test cases
  • viewing the mapping between bugs and requirements.

Cases and Points: Counting Test Cases

Test cases allow the tester to confirm that a function works properly or that an error is handled correctly, or that other criteria are met such as performance or load capabilities. A test case may need to be run on multiple operating systems, on multiple processor types, or on multiple device types, or with multiple languages.

To help simplify this, Microsoft often refers to test cases and test points.

  • A test case is the single instance of a set of steps to carry out a testing activity
  • A test point is an instantiation of that test case in a particular environment.

The idea of a test point is that the same test case can be run on multiple hardware platforms and configurations of systems.

Microsoft breaks down the terminology in the following definitions:

  • Test case: A plan for exercising a test.
  • Test point: A combination of a test case and an execution environment.
  • Test suite: A collection of related test cases or test points. Often, a test suite is a unit of tested functionality, limited to a component or feature.
  • Test run: A collection of test suites invoked as a unit. Often, a test run is a collection of all the tests invoked on a single hardware or software context.
  • Test pass: A collection of test runs. Often, a test pass spans several hardware and software configurations and is confined to a specific checkpoint or release plan.
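The case/point distinction is really just a cross-product of test cases and environments. A quick hypothetical sketch (the case names, operating systems, and languages below are made up for illustration):

```python
from itertools import product

# Two test cases, each needing to run in several environments.
test_cases = ["verify_login", "verify_logout"]
operating_systems = ["Windows 7", "Windows Server 2008"]
languages = ["en-US", "de-DE"]

# A test point is one test case paired with one execution environment.
test_points = list(product(test_cases, operating_systems, languages))

print(len(test_cases))   # 2 test cases
print(len(test_points))  # 8 test points (2 cases x 2 OSes x 2 languages)
```

This is why test point counts balloon so quickly compared to test case counts: every new configuration multiplies the work.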

Tracking and Interpreting the Test Results

TCM systems allow testers and test teams to track how many test cases were run, how many passed, how many failed, etc. TCMs can organize tests into groupings, or suites, that can be run each day, each week, or against specific builds of an application. Sophisticated TCMs, if integrated with a bug tracking system, can also determine the number of bugs found in a test pass, or whether a fix for a bug actually resolved the issue.

Some test case metrics that can be examined are:

  • Pass rate
  • Number of pass/fail cases
  • Total number of test cases
  • Ratio of manual/automated cases
  • Number of tests by type/area
  • Number or percentage of tests finding bugs
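Most of these metrics also fall out of simple counting over the recorded results. A small illustrative sketch (the outcome data is invented, not from any real TCM):

```python
# Invented outcomes for five test points in one run.
results = {"tp1": "pass", "tp2": "fail", "tp3": "pass",
           "tp4": "pass", "tp5": "fail"}

total = len(results)
passed = sum(1 for outcome in results.values() if outcome == "pass")
pass_rate = passed / total

print(f"Total: {total}, Passed: {passed}, Pass rate: {pass_rate:.0%}")  # 60%
```

The value a TCM adds over a spreadsheet is doing exactly this kind of rollup automatically, per suite and per build.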

Regardless of whether testing efforts are small or large, simple or complex, every tester deals with test cases and with bugs. How they are tracked and managed varies from organization to organization, but generally most systems have more similarities than differences. While my current company doesn't have an integrated bug management and test case management system (we do have a bug database, but currently we handle our test cases in Excel), the standards that Microsoft uses and the standards that we use are pretty close. If anything, this chapter has given me some additional reasons to consider looking at a test case management system, and to see if there's an option to do so with our current bug tracking system (ah, but that may be worthy of a post all its own, well outside of this HWTSAM review :) ). The key is that the systems need to be simple to use and easy to follow. Overcomplicating either system will result in testers not using either to its full capability.