Wednesday, February 12, 2014

Fix You Estimating Bad Habits: An Evening with Ted M Young and #BAST

Tonight in San Francisco, BAST (Bay Area Software Testers) will be hosting our first Meetup for 2014. For those who will be there, I'm excited to see you. For those who aren't able to make it, please follow this post in the coming hours, as I will be live blogging what I hear and do.

BAST has made a commitment to bring in topics that go beyond or outside of the common or typical tester Meetup topics. To that end, we are hosting Ted Young (Twitter: @jitterted), who will be talking about how to "Fix Your Estimating Bad Habits".

For those planning on attending, here are the details:

Date: Wednesday, February 12, 2014
Time: 6:15 p.m. to 8:30 p.m.
Location: Salesforce.com
Address: 50 Fremont St, 28th Floor, San Francisco, CA (map)

From our Meetup site description:

Ted is a self-styled Renaissance Coder who has been coding since 1979 and sold his first commercial program at the age of 14. Ted is currently the chief practitioner and evangelist of lean and systems thinking for Guidewire Software. He fits nicely with our mission statement: to present you with outstanding thinkers and practitioners from all across the spectrum of software development and business.

I hope you'll be able to join us at the facilities of our gracious host, Salesforce. Food (pizza) and beer will be provided! Also, we will be giving away two books courtesy of O'Reilly Media, who has graciously started supporting us with books for giveaways. Tonight's titles are "Vagrant: Up and Running" and "Lean UX". Must be present to win :)!

For the rest, you're just going to have to wait until I get there.
---


Ted started off the talk tonight describing some of the environments where he has worked and some of the challenges he's faced. He's asked us to use the slides as a token for conversation, and not to treat this as a "presentation" per se. This is meant to be a conversation, not Ted talking, which means this is likely to be even more fun than normal.

Think about the things you want to do: what are some options you'd like to try? Ted mentioned that he'd love to do TDD in his organization, but it's not practical with their environment as it currently exists. There are trade-offs in all things, and one of the trade-offs he's had to deal with is the nature of their tests. The real world is full of constraints, and we often have to deal with those constraints.

Ted threw out a provocative question... what is an estimate? It took a while for people to reply to this. A best guess based on information that we have (aka "bull---t"). SWAG, or "silly wild ass guess". A combination of everything you need to do, paired with everything you've already done and know, and figuring out how to harmonize the two. Any wonder why we (collectively) are so bad at this? ;)

How about bugs? Have you ever estimated how long a bug will take to fix? Of course, it depends on the bug. A mislabeled element might take five minutes. A (seemingly) random performance drop under load might take days to figure out, maybe more, not even counting the time it might take to actually fix it. Some bugs seem obvious after the fact, but getting to "obvious" might put our programmer and tester through days of hair pulling and mega frustration.

Ted refers to bugs, infrastructure, and other such items as "overhead", and while it's important to know how long it takes to take care of these things, trying to score points off of them is counterproductive. Ted isn't saying "don't track the time it takes to do tasks." Those items are important, but they take time away from real feature coding (and feature testing, too).


Another bad habit is to "ignore all previous or other projects". Why do we save all of our spreadsheets, story point data, and other details if we have no intention of learning from them? The reason is that history proves that time and energy just seek a status quo. We record everything because we once intended to learn from what we found, but over time, we just end up recording everything because we've always recorded everything. Mining the data to make some projections of future work may not be entirely relevant, but it's not worthless either. Has anyone ever really delivered a project early with all scheduled features? Typically, projects come in over time, over budget, and with jettisoned features. Examining previous projects to see trends allows us to counteract biases, if we are willing to pay attention. This approach is called Reference Class Forecasting, and while it may not be very glamorous, it can give amazing guidance. Sadly, companies that are good at doing Reference Class Forecasting are rare.
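
To make the idea concrete, here's a minimal sketch of what Reference Class Forecasting can look like in practice. The history data and the forecast function are entirely made up for illustration; the core move is simply scaling a gut-feel estimate by how badly similar past projects overran.

    # A minimal sketch of Reference Class Forecasting. The history of
    # (estimated, actual) durations in days is invented for illustration.
    history = [(30, 42), (10, 11), (60, 95), (20, 28), (45, 50), (15, 24)]

    # Overrun ratio for each past project: actual / estimated.
    ratios = sorted(actual / estimated for estimated, actual in history)

    def forecast(raw_estimate, percentile=0.8):
        """Scale a raw estimate by the historical overrun at a chosen percentile."""
        index = min(int(percentile * len(ratios)), len(ratios) - 1)
        return raw_estimate * ratios[index]

    # A 25-day gut-feel estimate, corrected so that roughly 80% of
    # comparable past projects would have finished within the forecast.
    print(round(forecast(25), 1))

The point isn't the particular percentile; it's that the correction comes from your own recorded history rather than from optimism.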



Another challenge we face is that we try to sample data without a clear reference of what we are sampling or even why we are sampling it. Think of audio. The lower the sample rate, the lower the quality of the audio. The higher the sample rate, the better the fidelity of the audio you capture. Sampling of data and history (think velocity of projects) could be very high fidelity or very low fidelity. Most of the time, we just don't know what we are really measuring. We are also terrible at statistics, on the whole. We tend to focus on perceived catastrophic instances, where the likelihood of that event happening was so incredibly low that winning the Powerball lottery was more likely. Meanwhile, real and genuine dangers, much more prevalent, were not given the attention they deserved, especially considering their likelihood of occurring was significantly higher.
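
As a quick illustration of the fidelity point (the velocity numbers below are invented, not anything from the talk): the same team can look wildly erratic sprint to sprint and perfectly stable quarter to quarter, depending only on how often you sample.

    # Invented sprint velocities for one team over a year.
    velocities = [21, 35, 8, 30, 12, 38, 10, 32, 15, 36, 9, 34]

    # High fidelity: every sprint. The swings are plainly visible.
    print("per sprint:", velocities)

    # Low fidelity: average each half-year block of six sprints.
    # The signal flattens out and the instability disappears.
    blocks = [sum(velocities[i:i+6]) / 6 for i in range(0, len(velocities), 6)]
    print("per block:", blocks)
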

Another mistake we make is to believe that velocity (or defect rate, scope creep, expertise, skill, etc.) is linear. It's not. There comes a time when technical debt or other issues will slow down the process, or speed up one area to the imbalance and detriment of another.

Ted talked a bit about the flaw of averages, and the danger of always taking the mean and adding it up. Granted, the mean may be correct 68% of the time, but that also means that 32% of the time we are wrong, or off the mark, or potentially *way* off the mark. That 32% chance is one in three. That's a lot of potential for getting it wrong, and all it takes is one unknown variable to totally throw us off. An unknown could, potentially, help us move unexpectedly faster, but generally speaking, unknowns tend to work against our expectations. The mean may be a "3", but one out of three times, it will be a six, or an eight.
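
A quick way to feel this in your bones is to simulate it. The sketch below uses invented, right-skewed (lognormal) task durations, nothing from the talk itself, and shows that a plan built from each task's "feels right" typical value overruns far more often than not, because the long tail only points one way.

    # Monte Carlo sketch of the flaw of averages, with assumed numbers.
    import random

    MU, SIGMA = 0.9, 0.6      # lognormal: median ~2.5 days, long right tail
    NUM_TASKS = 10

    typical_estimate = 2.5    # the per-task number that "feels safe" (the median)
    plan = NUM_TASKS * typical_estimate

    trials = [
        sum(random.lognormvariate(MU, SIGMA) for _ in range(NUM_TASKS))
        for _ in range(10_000)
    ]
    overruns = sum(1 for total in trials if total > plan) / len(trials)
    print(f"a {plan:.0f}-day plan overruns in {overruns:.0%} of simulated projects")

Each individual estimate is perfectly reasonable; it's the addition that lies.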

Another big danger is "attribute substitution". We don't know the answer to A, but we know B very well, and we kind of think B is a lot like A, so we'll provide B as support for why A will take as long as it will. It's a dangerous substitution, and people do it all the time, most of the time without realizing it. It's the analogy stretched too far. Analogies can be helpful, but analogies are never perfect fits. There's a real danger in relying on them too much. Measuring everything the same way is also problematic. Epics, themes, stories, tasks... measure them all the same way, and there be dragons!

Another danger we face is that we spend a lot of time on things that have little value. Why? Because it's much easier. Ted mentioned the idea of the cost of delay, where if I don't have a capability at a certain point in time, there is a cost, and I might be able to calculate that cost. Very often the hard stuff gets pushed back, and it's the hard stuff that's most in danger of falling victim to the cost of delay.
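
Here's a tiny, made-up illustration of calculating that cost. The numbers are invented, and the CD3 ranking (cost of delay divided by duration) is a common technique from the lean literature rather than something Ted prescribed:

    # Invented example: rank work by cost of delay divided by duration (CD3).
    features = [
        # (name, value per week once live, weeks of work to build)
        ("easy cosmetic tweak", 1_000, 1),
        ("hard integration",   20_000, 8),
    ]

    for name, value_per_week, duration in features:
        cd3 = value_per_week / duration
        print(f"{name}: delay costs ${value_per_week:,}/week, CD3 = {cd3:,.0f}")

Even with a far longer build time, the hard integration dominates: every week it sits in the backlog forfeits twenty times what the easy tweak ever earns.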

Ted suggests that we tend to underestimate our own completion time, but we are much better at estimating other people's efforts. Is it wishful thinking on our part, or are we just better observers of others and their capabilities over time? Perhaps averaging our overestimate of another with our underestimate of ourselves might bring us to the sweet spot of how long something will really take. It's an interesting idea.

How many of us tend to only estimate our actual "touch time", meaning our in-the-zone, fully-focused-on-what-we-are-doing time? We tend to think way too highly of how we allocate our time and attention. We forget all of the things that we may need to do, or just plain want to do. Do we need to talk with other stakeholders? Do we need to coordinate with other members of the team? Those external dependencies can have a major impact on how much time and attention we can actually apply to an issue.
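
As a back-of-the-envelope sketch (the focus factor and context-switch figures below are assumptions, not measurements), even an honest touch-time estimate roughly doubles once you account for everything else competing for the day:

    # Touch time vs. calendar time, with assumed (not measured) numbers.
    touch_time_days = 3.0      # honest "in the zone" estimate
    focus_factor = 0.6         # fraction of a day actually spent on the task
    context_switch_tax = 0.15  # extra loss from juggling other work

    elapsed = touch_time_days / (focus_factor * (1 - context_switch_tax))
    print(f"{touch_time_days} days of touch time -> ~{elapsed:.1f} calendar days")
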

Another danger we face is that we tend to associate points and sizes with time in isolation. We completely neglect the actual complexity of the effort. We don't want to think of just the idea that a point equals a perfect engineering day. The whole idea of points was that we would abstract away the time factor, and that a difficult task or a simple task would still have a point value, but the points would be variable. Unfortunately, we've pegged points to time, so changing that attitude may be a lost cause.

Additionally, using things like Fibonacci numbers or planning poker, where the scales stretch toward larger numbers, tends to increase the odds of underestimation. The big numbers scare people. It's more comfortable to go with the values in the middle, an effect referred to as Anchoring Bias. We tend to anchor to what other people say, because their answers are close together. We don't want to be the outlier, even if we have a deep suspicion our pessimism may be well warranted. Our attempts to remove anchoring bias instead put social anxiety into the mix. In short, we go with what feels safe. Remember, safe has a one in three chance of being wrong.

Another real danger is premature estimation. Have you explored the options before you commit? This can happen when someone tells us what to implement, rather than describing the problem and looking for the WHY of the problem. Exploring the WHY lets us see other potential avenues, whereas committing to a WHAT may mean committing to a course of action that is overkill for what is really needed to solve the actual problem. Sometimes there is no obvious course of action; there may be two competing ideas that both look great on the surface. In those cases, we may just have to set up a few experiments to see which approach is best.

This was a deep talk, and there were some interesting aspects I hadn't considered. Some of these items are really familiar, and some are subconscious; there, but under the radar. Lots of food for thought to say the least :).


Our thanks to Salesforce for providing the venue for tonight, our thanks to Ted Young for speaking, and thanks to O'Reilly Media for providing free books for our Meetup. We would of course love to have more to give away, if you feel so inclined ;).

By the way, just want to point out, as the picture indicates, the spelling of "Fix You Estimating Bad Habits" was originally a typo, but it became part of the title because it underlined the whole point of the topic :).
