Wednesday, December 22, 2010

BOOK CLUB: How We Test Software At Microsoft (9/16)

This is the first part of Section 3 in “How We Test Software at Microsoft”. This section focuses on test workflow and testing tools. For this chapter the focus is on Bugs and Test Cases.

Chapter 9. Managing Bugs and Test Cases

Alan relays his story about writing and managing a bug tracking system prior to his arrival at Microsoft, and some of the myriad challenges he had to deal with in that environment. Prior to developing this system, bugs were tracked on whiteboards, color coded sticky-notes, and email messages. While that was OK on the surface, the team needed to store more information about the issues they found and how issues were fixed so that those fixes could be tested and validated. Also, having a reporting mechanism was considered vital, so that they could monitor which bugs were fixed and when. Alan considered it an OK first draft for a system, but it had its shortcomings, and those shortcomings would be part of the reason the system was replaced about a year after Alan left.

Microsoft has a professional and fully supported bug tracking system that has been revised and modified a number of times over the years.

With this chapter, the actual tools and methodologies used at Microsoft come into focus. This section and the ones that follow will obviously not exactly match everyone’s experience outside of Microsoft, but they will give an idea of what Microsoft actually uses and how its teams manage their bugs and their test cases.

The Bug Workflow

The two largest collections of “artifacts” created during the software development lifecycle by the test team are test cases and bug entries.

At the most basic level, test cases are the “what we do” part, and the bug entries are the “here’s what we saw” part. Alan uses this section to explain the life of a typical bug, and the steps that the teams follow from creation to resolution.

Bug Tracking

Bugs are the single largest artifact that testers create. If we do not create automation scripts or test framework code, bugs and bug reports are often the only tangible artifact that points to our existence as testers (barring the creation of test cases and test plan documents). It’s possible for testers to discover thousands of bugs in a given software release (and that may be on the light side if the product in question is rather complex or has a lot of user features).

Using a bug system is one way to keep track of where and how issues are progressing (or not progressing) and how to work with them. While this chapter deals with Microsoft’s methods, I’ll also be dropping in my own comments about my experiences and how I have handled bugs and how it compares or contrasts with Microsoft’s approach.

A Bug’s Life

Bugs are subjective, but the easiest way I know to describe a bug (and I’m paraphrasing a number of people’s comments with this, including James Bach, Michael Bolton, and Cem Kaner) is that “it’s a pain point to someone or some entity that matters”. Whether it be design specific, implementation specific, or interface specific, if someone who matters determines that it’s not what they want or expect to see, you have a bug.

Bugs are usually found in three ways: by developers, by testers, or by users. When developers find them, the fixes are often folded into the code as it is reworked, with little to no mention of the bugs’ existence. If a tester finds them while performing tests, they are often recorded in a bug tracking system or written on note cards to be handed to the developer (I’ve done both methods). If they are found by users, often they are entered into a Customer Relationship Management application (CRM) and the bug system (and injected into lots of meetings, often with much visible hand wringing and verbal haranguing about “how could we have let this out into the field?”... but that deserves its own post entirely :) ).

At Microsoft, a triage team will often review the issues and determine their severity, to whom they should be assigned (or whether they should be assigned at all), and at what priority. One thing that Microsoft and the work environments I’ve been in have in common is that issues that seem like they would be trivial to fix can often spiral into huge problems. Sometimes the most practical “fix” is to leave it alone until later.

Often, issues that are not fixed are left to be resolved in a later release, or become the subject of Microsoft Knowledge Base articles (where workarounds or other potential options are published until the issue can be directly addressed). Other issues that are determined to require fixing go through a process of being assigned, worked on, submitted, approved and integrated into a build. If the fix works, the issue is considered resolved. If not, it gets reassigned and the process begins again. Even when a bug is resolved and closed, that may not be the end of the line. Bugs that have been fixed are often examined later to determine root cause or to uncover related issues.
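The flow described above can be pictured as a small state machine. What follows is only a rough sketch: the state names and the allowed transitions are assumptions made for illustration, not the actual workflow of Microsoft's bug tracking system.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical state names, chosen for illustration only.
    NEW = "New"
    ASSIGNED = "Assigned"
    RESOLVED = "Resolved"
    CLOSED = "Closed"
    POSTPONED = "Postponed"


# Allowed moves between states (an assumption, not Microsoft's real rules).
TRANSITIONS = {
    Status.NEW: {Status.ASSIGNED, Status.POSTPONED},
    Status.ASSIGNED: {Status.RESOLVED, Status.POSTPONED},
    # A fix that fails verification goes back to a developer.
    Status.RESOLVED: {Status.CLOSED, Status.ASSIGNED},
    Status.POSTPONED: {Status.ASSIGNED},
    # A closed bug can be reactivated if the problem resurfaces.
    Status.CLOSED: {Status.ASSIGNED},
}


def move(current, target):
    """Return the new status, or raise if the workflow does not allow the move."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Keeping the transitions in one table makes it easy to see (and argue about) which moves a team allows, such as whether a new bug may be closed without ever being assigned.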

Alan makes the point that bugs are often seen as “something that has gone wrong”, but in truth, bugs are also the lifeblood of engineering, in that those bugs are what prompt changes, design modifications, and even completely different paths of potential future development.

Note: More than 15 million bug and project management entries were created in Microsoft systems in 2007.

Attributes of a Bug Tracking System

Alan points out that most software test teams, regardless of the size of their organization, interact with or are users of a bug tracking system. Even at my company, where I am the lone tester, we do indeed have a bug tracking system and it is the center of my universe :). Seriously, it is very often the most accessed tool in any given test day.

Ease of use is critical, because the ability to enter issues rapidly and directly will be key in determining if people will actively use the bug tracking system. A difficult or cumbersome interface will discourage people from using the system. Being able to configure the system so that different projects can track different information is also important. Some projects require just a little information, while others that are dealing with specific components and the way that they interact with them will likely require more details to be able to save the issues. Bug tracking systems are one of the most used systems by testers and developers alike, and thus they are often one of the most mission-critical applications to have running.

Alan also presents other attributes to consider:

• Bug notification
• Interoperability
• External user access

Why Write a Bug Report?

While it may be easier to go and have a chat with a developer about various issues, it is important to understand how issues come about, what causes a particular bug to appear, and how that bug might interact with other parts of the system. For this reason, it’s important to encourage and have testers document their issues in an issue tracker system.

Reports allow engineers, managers and testers to see trends, look for root causes and determine how efficiently groups find and fix issues. They allow future development and sustaining teams to look back and see which issues appeared where, and they help guide future development initiatives. Sometimes, bug reports can also serve as legal documentation, showing whether an issue that was determined to be a defect could be fixed or not, and providing protection in the event of a lawsuit.

Anatomy of a Bug Report

Most reporting systems have the same features when it comes to creating and categorizing bugs. Likewise, between organizations and companies, knowing how to create effective and helpful bug reports is an essential skill for any tester.

Without going into too much detail, bug reports all have the following in common:

  • A Title that makes clear what the problem is. Good titles are vital to communication. They shouldn't be too short or too wordy, and should be clear as to what the issue is.
  • A Description that gives a clear summary of the issue, including impacts, expected results vs. actual results, and steps to reproduce.
  • A Status to say what the bug’s current state is (New, Assigned, Fixed, Closed, etc.)
  • A version number to identify the build or release in which the issue was found.
  • The feature area or component where the issue occurs.
  • Steps to reproduce (in my system at my current company, and in most places I’ve used a bug database, the steps to reproduce are part of the description).
  • An assigned to field so that everyone knows who is working on which issues.
  • The severity of the issue (Catastrophic, Severe, Normal, Minor, Cosmetic, etc.)
  • Priority or customer impact
  • Specific Environment details, if relevant.

Additional fields in common use in a bug database include the following:

  • How Found
  • Issue Type
  • Bug Type
  • Source

Having too many fields in a tracking system can make the bug tracking application difficult to interact with, or cause unneeded time to complete issue reporting or tracking. Alan recommends keeping the fields to the relevant ones that will help expedite the reporting, tracking and resolving of issues, whatever those may be. I concur :).
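To make the field list concrete, here is a minimal sketch of a bug record in Python. The class name, field names, and default values are all hypothetical (they mirror the list above rather than any real tracker's schema), and the helper simply flags text fields that were left empty.

```python
from dataclasses import dataclass, fields


@dataclass
class BugReport:
    # Field names follow the list above; the defaults are assumptions.
    title: str
    description: str  # summary, expected vs. actual results, steps to reproduce
    status: str = "New"
    version: str = ""
    area: str = ""
    assigned_to: str = ""
    severity: str = "Normal"
    priority: int = 2
    environment: str = ""


def blank_fields(report):
    """Names of text fields left empty, in declaration order."""
    blank = []
    for f in fields(report):
        value = getattr(report, f.name)
        if isinstance(value, str) and not value.strip():
            blank.append(f.name)
    return blank


report = BugReport(
    title="Save button crashes editor on empty file",
    description="Expected: file saves. Actual: editor exits.",
    version="1.2.0",
)
print(blank_fields(report))
```

A short record like this also shows why extra fields have a cost: every one added is another thing a reporter has to fill in or skip.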

Bug Triage

Triage is a fancy way of saying prioritization, and the term is borrowed from hospital emergency rooms. When faced with many different critical cases, it’s important to get to the most critical issues first and then deal with less critical issues afterwards.

Bugs get the triage treatment as well, and some bugs are higher priority than others. An oft-used phrase in testing is that the best testers are the ones that find the most bugs. That may be true, but Cem Kaner in Testing Computer Software opines that in reality, the best testers are the ones that report the most bugs that get fixed. There’s a subtle difference there, but it’s a profound one. Reporting lots of low priority bugs may look good in reports, but what does it mean for the productivity of the development team or the overall health of the product? The most critical bugs need to get attention first, and each bug needs to be sorted into “must be fixed now”, “should be fixed”, or “it would be nice but it’s not all that important”.

The goal is to reach zero bugs, but that may not be practical or reasonable. Many of the teams I have been part of have had severity of bugs be the deciding factor, where no “A” bugs (catastrophic) or “B” bugs (severe) could remain, but several “C” (moderate or minor) bugs were allowed to ship. The point is that “zero bugs” really means “zero bugs that should prevent a product from shipping”. It almost never means that there are zero bugs in the system that has shipped (at least I’ve never worked on a project that has ever reached that vaunted milestone!).
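As a rough sketch of that ship rule, the snippet below orders open bugs for triage and checks whether any “A” or “B” bugs remain. The dictionary keys and the convention that a lower priority number means “sooner” are assumptions for illustration.

```python
def triage_order(open_bugs):
    """Sort by severity letter (A first), then by priority number (lower = sooner)."""
    return sorted(open_bugs, key=lambda b: (b["severity"], b["priority"]))


def ship_blockers(open_bugs):
    """Open bugs severe enough to hold the release ('A' or 'B')."""
    return [b for b in open_bugs if b["severity"] in ("A", "B")]


def can_ship(open_bugs):
    """'Zero bugs' in the practical sense: nothing open that should block shipping."""
    return not ship_blockers(open_bugs)
```

Note that `can_ship` can return True while the list of open bugs is far from empty, which is exactly the distinction between “zero bugs” and “zero bugs that should prevent shipping”.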

Common Mistakes in Bug Reports

A bug reporting system can often be rendered less effective, or even useless, if bugs are reported in a bad or haphazard way. Some common problems with reporting bugs are:

  • Treating the issue tracker as a messaging tool. Avoid including informal or distracting details that can dilute or derail the main message.
  • Bugs morphing from one problem into another. If there are two issues to be tracked, have two separate issues and keep the details separate.
  • Tracking multiple bugs in the same report. It’s too difficult to keep the details straight as to which issue is fixed in a subsequent build.
  • Multiple testers entering the same bug at different times.

Out of all the “bug sins”, I personally (and from my reading, Alan agrees) think that duplicate bugs count lower than the others. It’s probably the most common issue testers may face, and they often get chastised for it. However, if indeed there are duplicate issues, most systems allow for duplicate bugs to be merged, so it shouldn’t be treated as a capital offense. Additionally, though issues may be duplicates, each entry may provide unique information to help describe a problem more completely. And if I have to choose between dealing with two or three duplicate bugs, or not getting the information at all because a tester is gun shy and afraid of entering a duplicate issue, heck, enter the duplicate issue. Again, that’s what the merge feature in many bug tracking systems is for. It’s also possible that the issue isn’t a duplicate at all, but is a similar issue that points to a different problem than the one originally reported.

Using the Data

One of the things a good bug tracking system will give you is reports, and those reports can, frankly, say anything an organization wants them to say. They can point to progress, they can show where a team is being slowed down, they can extrapolate a “burn rate” over time, showing how many reported bugs are being fixed, etc. Additional helpful items that bug data and aggregated reports can provide:

  • Bugs found to bugs fixed
  • Total bugs per language
  • Bug fix rate over time
  • Bugs by code area
  • Bugs found by which department
  • Bugs by severity
  • Where found
  • When found
  • How found
  • When introduced
  • Bug reactivation rate
  • Average time to resolve
  • Average time to close
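A few of these numbers are simple to compute once bug records carry dates. The sketch below is an illustration only: the record keys are hypothetical, and “reactivation rate” is defined here as the share of fixed bugs that were reopened at least once, which is an assumption rather than the book's definition.

```python
from datetime import date


def bug_metrics(bugs):
    """Summarize found vs. fixed, reactivation rate, and average days to resolve."""
    fixed = [b for b in bugs if b["resolved"] is not None]
    reactivated = [b for b in fixed if b["reactivations"] > 0]
    days = [(b["resolved"] - b["opened"]).days for b in fixed]
    return {
        "found": len(bugs),
        "fixed": len(fixed),
        "reactivation_rate": len(reactivated) / len(fixed) if fixed else 0.0,
        "avg_days_to_resolve": sum(days) / len(days) if days else 0.0,
    }


sample = [
    {"opened": date(2010, 12, 1), "resolved": date(2010, 12, 3), "reactivations": 0},
    {"opened": date(2010, 12, 2), "resolved": date(2010, 12, 8), "reactivations": 1},
    {"opened": date(2010, 12, 5), "resolved": None, "reactivations": 0},
]
print(bug_metrics(sample))
```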

How Not to Use the Data: Bugs as Performance Metrics

Many testers know well the tyranny of the bug database. It’s often used as a measure of a tester’s ability and performance, and a yardstick to see which tester is better than the others, or who is “the weakest link”. Alan makes the case that there are too many variables for this to be an effective way to measure tester performance, including the complexity of features, the ability of the developers on different projects, specification details, when bugs are reported, or even whether one group does a lot more unit testing than another, leaving fewer bugs down the river to find. Consider Alan’s example from a couple of chapters back and the fishing metaphor: two fishermen in two separate parts of a river. Odds are the conditions might be different enough to have one fisherman catch lots more fish than the other, even though they may be of equal abilities.

Other factors also come into play, such as severity of issues (one catastrophic bug vs. 10 cosmetic issues), time spent tracking down issues, level of detail in bugs, etc. Specifically, what does a high bug count mean? Does it mean we are looking at a great tester, or a poor developer (or even just an early iteration in a product life cycle)? The opposite may be true as well. A low bug count could mean a poor tester, or it could mean excellent code with few bugs to find.

Bug Bars

This is an interesting idea that Microsoft is using. It’s a system where only a certain number of bugs can be assigned to any one developer at any given time. Of course, this is a system that could be abused or masked if taken to extremes.

The intent of a bug bar is to encourage issues to be resolved early in the process and not wait until the end of a project. Alan also states that, for this approach to work in its intended way, bugs also have to be found as early as possible, too.
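Going by the description above (a cap on how many bugs any one developer can have assigned at a time), a bug bar check could look something like the following sketch. The record keys and the cap value are assumptions; what a team actually does when someone is over the bar is not covered here.

```python
from collections import Counter


def over_bug_bar(open_bugs, bar):
    """Map each developer whose open bug count exceeds the bar to that count."""
    counts = Counter(b["assigned_to"] for b in open_bugs)
    return {dev: n for dev, n in counts.items() if n > bar}
```

For example, with a bar of 2, a developer holding three open bugs would show up in the result and would be expected to fix bugs before taking on new feature work.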

Test Case Management

In most testing environments, the other most common artifact associated with testers is the test case. The reason is simple: test cases document what a tester does, and often, following those test cases is what helps the tester find the bugs in the first place.

My company has a few products released, and each has specific testing needs. At the current time, those test cases are managed in Excel spreadsheets. Even then, we still manage thousands of test cases for our projects. With a larger company like Microsoft, the need to handle tens or hundreds of thousands (or millions) of test cases goes way beyond any spreadsheet system. In this case, using a dedicated Test Case Management system makes a lot of sense. A TCM can define, version control, store, and even execute various test cases. Many test case management applications and bug tracking systems can be integrated together, and Microsoft Visual Studio Test Tools allow for that capability.

What Is a Test Case?

Any action that is performed against a software feature, component, or sub-system where an outcome can be seen and recorded can be considered a test case. Test cases can cover things as small as a function to as large as an installation procedure or beyond.

Test cases are typically described as a written method where a user can take an expected input value or action, and observe an expected output or result. Manual tests and automated tests share these attributes.

The Value of a Test Case

Test cases provide more than just steps to see if a system works as expected or determine if a bug exists. Other important purposes of tests include:

  • Historical reference
  • Tracking test progress
  • Repeatability

Drawbacks to test cases include:

  • Documentation time
  • Test cases get out of date as features change
  • Difficult to assume knowledge of reader

Not all test cases are documented specifically, and not all testing goes by scripted test cases. Exploratory testing is often performed “off script” where specific defined test cases are not in place, or the ideas go beyond the test case definitions as written. Good testing uses scripted tests and test cases, but also looks beyond those cases at times, too.

Anatomy of a Test Case

So what goes into a sample Microsoft test case?

  • Purpose
  • Conditions
  • Specific inputs and steps
  • Predicted results

Additionally, test cases could also include:

  • Test frequency
  • Configurations
  • Automation (Manual Test Cases, Semi Automated Test Cases, Fully Automated Test Cases)

Test Case Mistakes

Test cases can be just as prone to mistakes as software code can be, and ill-designed test cases can cause testing efforts to be less effective (or entirely ineffective). Areas where there may be problems with test case design are as follows:

  • Missing steps
  • Too verbose
  • Too much jargon
  • Unclear pass/fail criteria

Managing Test Cases

When dealing with very large systems or systems that deal with a lot of testers and a lot of components, eventually a case management system for test cases becomes a necessity. Microsoft uses its development tools (Product Studio and Visual Studio Team System) to track test cases alongside its issue tracking system. This allows the users to link test cases to issues found. Test cases can also be linked to functional requirements in Microsoft’s system.

This system allows the following views:

  • knowing how many test cases exist for each requirement
  • knowing which requirements do not have test cases
  • viewing the mapping between bugs and requirements.
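The first two views boil down to counting links between requirements and test cases. As a minimal sketch (the data layout and names here are hypothetical and have nothing to do with how Product Studio or Visual Studio Team System store their data):

```python
def requirement_coverage(requirements, test_cases):
    """Return (test case count per requirement, requirements with no test cases)."""
    counts = {req: 0 for req in requirements}
    for case in test_cases:
        for req in case["requirements"]:
            if req in counts:
                counts[req] += 1
    uncovered = [req for req, n in counts.items() if n == 0]
    return counts, uncovered


requirements = ["REQ-1", "REQ-2", "REQ-3"]
test_cases = [
    {"name": "login with valid password", "requirements": ["REQ-1"]},
    {"name": "login with expired password", "requirements": ["REQ-1", "REQ-2"]},
]
counts, uncovered = requirement_coverage(requirements, test_cases)
print(counts, uncovered)
```

Even with test cases kept in a spreadsheet, a requirement column is enough to produce this kind of report.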

Cases and Points: Counting Test Cases

Test cases allow the tester to confirm that a function works properly or that an error is handled correctly, or that other criteria are met such as performance or load capabilities. A test case may need to be run on multiple operating systems, on multiple processor types, or on multiple device types, or with multiple languages.

To help simplify this, Microsoft often refers to test cases and test points.

  • A test case is the single instance of a set of steps to carry out a testing activity
  • A test point is an instantiation of that test case in a particular environment.

The idea of a test point is that the same test case can be run on multiple hardware platforms and configurations of systems.

Microsoft breaks down the terminology in the following definitions:

  • Test case: A plan for exercising a test.
  • Test point: A combination of a test case and an execution environment.
  • Test suite: A collection of related test cases or test points. Often, a test suite is a unit of tested functionality, limited to a component or feature.
  • Test run: A collection of test suites invoked as a unit. Often, a test run is a collection of all the tests invoked on a single hardware or software context.
  • Test pass: A collection of test runs. Often, a test pass spans several hardware and software configurations and is confined by a specific checkpoint or release plans.
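The difference between test cases and test points is easy to see with a little arithmetic: every test case paired with every environment gives one test point. Here is a small sketch, with made-up case and environment names:

```python
from itertools import product


def expand_test_points(test_cases, environments):
    """Pair every test case with every environment; each pair is one test point."""
    return list(product(test_cases, environments))


cases = ["install", "uninstall"]
environments = ["x86/English", "x64/English", "x64/Japanese"]
points = expand_test_points(cases, environments)
print(len(points))  # 2 test cases x 3 environments = 6 test points
```

This is why test point counts grow so much faster than test case counts: adding one more environment adds a new test point for every existing test case.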

Tracking and Interpreting the Test Results

TCM systems allow testers and test teams to track how many test cases were run, how many cases passed, how many failed, etc. TCMs can organize tests into groupings, or suites, that can be run each day, or each week, or against specific builds of an application. Sophisticated TCMs, if integrated with a bug tracking system, could also determine the number of bugs found in a test pass, or whether a fix for a bug actually resolved the issue.

Some test case metrics that can be examined are:

  • Pass rate
  • Number of pass/fail cases
  • Total number of test cases
  • Ratio of manual/automated cases
  • Number of tests by type/area
  • Number or percentage of tests finding bugs
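Several of these metrics fall out of a simple tally over test results. The sketch below assumes a hypothetical result record with an outcome and an automated flag; real TCM systems will store far more than this.

```python
def summarize(results):
    """Tally pass/fail counts, pass rate, and the share of automated tests."""
    total = len(results)
    passed = sum(1 for r in results if r["outcome"] == "pass")
    failed = sum(1 for r in results if r["outcome"] == "fail")
    automated = sum(1 for r in results if r["automated"])
    return {
        "total": total,
        "passed": passed,
        "failed": failed,
        "pass_rate": passed / total if total else 0.0,
        "automated_ratio": automated / total if total else 0.0,
    }


results = [
    {"name": "login", "outcome": "pass", "automated": True},
    {"name": "logout", "outcome": "pass", "automated": True},
    {"name": "export report", "outcome": "fail", "automated": False},
    {"name": "print report", "outcome": "pass", "automated": False},
]
print(summarize(results))
```

As with bug counts, a pass rate on its own says little; it needs the context of which tests ran and against which build.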

Regardless of whether testing efforts are small or large, simple or complex, every tester deals with test cases and with bugs. How they are tracked and managed varies from organization to organization, but generally most systems have more similarities than differences. While my current company doesn’t have an integrated Bug Management and Test Case Management system (we do have a bug database but currently we handle our test cases in Excel), the standards that Microsoft uses and the standards that we use are pretty close. If anything, this chapter has given me some additional reasons to consider looking at a test case management system and to see if there’s an option to do so with our current bug tracking system (ah, but that may be worthy of a post all its own, well outside of this HWTSAM review :) ). The key is that the systems need to be simple to use and easy to follow. Overcomplicating either system will result in testers not using it to its full capability.
