Wednesday, April 27, 2011

Pushing the Boundaries: A Weekend Testing Follow-Up

Last Saturday I had a chance to do a “double dip” with Weekend Testing. I facilitated the WTAmericas session at 11:00 AM, but prior to that, at 8:00 AM, I joined up with the European contingent to be a participant in their event.

This time out was interesting because we were testing the boundary conditions of Skype, the tool we were using to conduct our session. This had a lot of interesting possibilities… what happens if we bring down the service? Will we kill our session? Will the powers that be at Skype take a negative view of us for doing this? Could we be dealing with an ethical time bomb here?

In truth, every time we test we run into these situations. In some ways, I joke that we deal with a schizophrenic Hippocratic Oath; we must do no harm once the application is out in the wild, but we can be as evil and diabolical as we want to be while it’s in our labs. But how aggressively do we test when the actual app we are testing is already in the wild? And how do we reconcile what one country thinks is fair play and another thinks is illegal, or at best, bad manners?

Weekend Testing is unique among self-directed education opportunities in that it’s a way to be 100% open about your learning progress and experience. Everyone who participates gets their actions published with every experience report. In short, there’s a full and unexpurgated record of everything we say and do. As testers, that gives us proof of our actions, and we can point to it as evidence of the experience we are building. If we test too aggressively, however, that same transcript is attached to our names. It’s a double-edged sword.

With this in mind, we went about looking at conditions that would cause our application to feel stress or otherwise not respond. I decided to try an old favorite tool for when lots of text is required. QAHatesYou described what he calls “the Hamlet Test”, which is all of the text of Hamlet in a copy/paste buffer, applied to various inputs. I use something similar that I call the Lorem Ipsum test because, well, that’s the tool that I use. The key to Lorem Ipsum is that you can designate as much data as you need, and the site will generate that much text. The text is nonsensical Latin, so there’s really no rhyme or reason to the characters other than words formatted in paragraphs. What it does do is allow for an exact byte count or word count to be generated, and then that block can be used to test any text inputs you would like to use. Lorem Ipsum is great for testing buffer overflows, not so great for testing XSS or SQL injection. In this case, because Skype is an active service and application on a live network, I didn’t want to risk injection-style tests anyway.
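For anyone who wants the same kind of input without a generator site, here is a minimal sketch of the idea in Python. It is not the tool used in the session; the word list and the function names `filler_chars` and `filler_words` are made up for illustration. It simply repeats a handful of Lorem Ipsum words until it has exactly the number of characters or words requested.

```python
import itertools

# A short, hypothetical word list; any plain alphanumeric filler would do.
LOREM_WORDS = (
    "lorem ipsum dolor sit amet consectetur adipiscing elit sed do "
    "eiusmod tempor incididunt ut labore et dolore magna aliqua"
).split()


def filler_chars(n_chars):
    """Return exactly n_chars characters of space-separated filler words."""
    if n_chars < 0:
        raise ValueError("n_chars must be non-negative")
    base = " ".join(LOREM_WORDS) + " "
    # Repeat the base text often enough to exceed n_chars, then cut to size.
    repetitions = n_chars // len(base) + 1
    return (base * repetitions)[:n_chars]


def filler_words(n_words):
    """Return exactly n_words filler words joined by single spaces."""
    if n_words < 0:
        raise ValueError("n_words must be non-negative")
    return " ".join(itertools.islice(itertools.cycle(LOREM_WORDS), n_words))


if __name__ == "__main__":
    block = filler_chars(5000)
    print(len(block), "characters ready to paste into a text input")
```

Because the filler is plain ASCII, the character count and the byte count are the same here. The last word may be cut off mid-word to hit the exact length, which is fine for this kind of test.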

Interestingly enough, there’s plenty that one can do with just plain alphanumeric text. I discovered that 29999 characters is the maximum that Skype will display in a message, and that it hides the remaining characters from the user. What I didn’t discover was whether those characters were lopped off from the delivered message, or whether the limit applied only to what was actually displayed on screen. There’s a subtle but not insignificant difference there (i.e. what is the database actually holding beyond what it is displaying). I discovered that Skype was not transmitting messages via HTTP, but via its own protocol. There are plenty of additional probing tests I could have done as well, but they would have required being more intrusive on a public service than I am personally comfortable with. We also found that varying the languages, so that we were entering Unicode text such as Kanji or Hangul, made for some very interesting results as well. In some cases, just a line of text was printed before the message was truncated.
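A rough sketch of how those boundary inputs could be prepared follows. The 29999 figure is the display limit observed in the session; everything else (the sample strings and the names `repeat_to_length` and `build_cases`) is hypothetical. The script builds messages one character below, at, and one above the limit for three scripts, and reports the UTF-8 byte count next to the character count. Whether Skype counts characters or bytes was not established in the session, so the byte column is only there to help investigate the early truncation seen with Kanji and Hangul.

```python
# Display limit observed during the session (in characters).
OBSERVED_DISPLAY_LIMIT = 29999

# Hypothetical sample strings, one per script.
SAMPLES = {
    "latin": "lorem ipsum ",
    "kanji": "日本語",
    "hangul": "한국어",
}


def repeat_to_length(sample, n_chars):
    """Repeat sample until it is exactly n_chars characters long."""
    repetitions = n_chars // len(sample) + 1
    return (sample * repetitions)[:n_chars]


def build_cases(limit, samples):
    """Build messages at limit - 1, limit, and limit + 1 characters per script."""
    cases = []
    for script, sample in samples.items():
        for n_chars in (limit - 1, limit, limit + 1):
            text = repeat_to_length(sample, n_chars)
            cases.append({
                "script": script,
                "chars": len(text),
                "utf8_bytes": len(text.encode("utf-8")),
                "text": text,
            })
    return cases


if __name__ == "__main__":
    for case in build_cases(OBSERVED_DISPLAY_LIMIT, SAMPLES):
        print(case["script"], case["chars"], "chars,", case["utf8_bytes"], "bytes")
```

Each Kanji or Hangul character in these samples takes three bytes in UTF-8, so the same character count carries three times the bytes of the Latin text. If a message in one of those scripts is cut off well before 29999 characters, comparing the two columns is one way to start narrowing down why.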

There are times when “going medieval” on an application is necessary and warranted. Usually that is when the application is in the local testing stage. The more outward facing the application is, the more critical it is that we find those strange and anomalous issues that could be catastrophic, but the closer we get to live deployment, the less opportunity we have to “get medieval”. Crashing a system in a sandbox with clear steps is a great help to developers. Crashing a live app and taking an app out of service that people depend on may be a help to developers, but it definitely adds to the irritation of its users. Fortunately, we did not have that happen during our testing, but it’s a real threat to be aware of, and to know how to deal with, in these endeavors.

All in all, a great session, and a good reminder of our responsibilities as testers: knowing when “going medieval” and “do no harm” need to be balanced and respected.
