TESTHEAD: AI


Monday, August 25, 2025

We're Back: CAST is in Session: Opening Keynote on Responsible AI (Return of the Live Blog)

Hello everyone. It has been quite a while since I've been here (this feels like boilerplate at this point but yes, it feels like conferences and conference sessions are what get me to post most of the time now, so here I am :) ).

I'm at CAST. It has been many years since I've been here. Lots of reasons for that but suffice it to say I was asked to participate, I accepted, and now I am at the Zion's Bankcorp Tech Center in Midvale, UT (a suburb/neighborhood of Salt Lake City). I'm doing a few things this go around:

- I'm giving a talk about Accessibility and Inclusive Design (Monday, Aug. 25, 2025)

- I'm participating in a book signing for "Software Testing Strategies" (Monday, Aug. 25, 2025)

- I'm delivering a workshop on Accessibility and Inclusive Design (Wednesday, Aug. 27, 2025)

In addition to all of that, I'm donning a Red Shirt and acting as a facilitator/moderator for several sessions, so my standard Live Blog/post-every-session coverage will by necessity be lighter this go around, as I physically will not be able to cover everything. Nevertheless, I shall do the best I can.

The opening keynote is being delivered by Olivia Gambelin and she is speaking on "Elevating the Human in the Equation: Responsible Quality Testing in the Age of AI".

Olivia describes herself as an "AI Ethicist" and she is the author of "Responsible AI". This of course brings us back to a large set of questions and quandaries. Many of us think of AI in the scope of LLMs like ChatGPT or Claude, and many people may be thinking, "What's the big deal? It's just like Google only the next step." While that may be a common sentiment, that's not the full story. AI is creating a much larger load on our power infrastructure. Huge datacenters are being built out that are making tremendous demands on power and water, and adding to pollution/emissions. It's argued that the growth of AI will effectively consume more of our power grid resources than if we were to entirely convert everyone over to electric vehicles. Thus, we have questions that we need to ask that go beyond just the fact that we are interacting with data and digital representations of information.

The common refrain is "just because we can do something doesn't necessarily mean that we should". While that is a wonderful sentiment, we have to accept the fact that that ship has sailed. AI is here, it is present, in both trivial and non-trivial uses, with all of the footprint issues that entails. All of us will have to wrestle with what AI means to us, how we use it, and how we might be able to use it responsibly. Note, I am thus far talking about a specific aspect of environmental degradation. I'm not even getting into the ethical concerns when it comes to how we actually look at and represent data.

AI is often treated as a silver bullet and something that can help us get answers for areas and situations we've perhaps not previously considered. One of the bigger questions/challenges is how we get to that information, and who/what is influencing it. AI can be biased based on the data sets that it is provided. Give it a limited amount of data and it will give a limited set of results based on the information it has or how that information was introduced/presented. AI as it exists today is not really "Intelligent". It is excellent pattern recognition and potential predictive text presentation. It's also good at repurposing things that it already knows about. Do you want to keep a newsletter fresh with information you present regularly? AI can do that all day long. We can argue the value add of such an endeavor but I can appreciate that for those who have to pump out lots of data on a regular basis, this is absolutely a game changer.

There are of course a number of areas that are significantly more sophisticated and data that is much more pressing. Medical imaging and interpreting the details provided is something that machines can crunch in a way that a group of humans will take a lot of time to do with their eyes and ears. Still, lots of issues can come to bear because of these systems. For those not familiar with the "Texas Sharpshooter Fallacy", it's basically the idea of someone shooting a lot of shots into the side of a barn over time. If we draw a circle around the largest cluster of bullets, we can infer that whoever shot those shots was a good marksman. True? Maybe not. We don't know how long it took to shoot those bullets, how many shots are outside of the circle, the ratio of bullets inside vs. outside of the circle, etc. In other words, we could be making assumptions based on a grouping that our own bias and prejudice is leaning on. Having people look at these can help us counter those biases but it can also introduce new ones based on the people that have been asked to review the data. To borrow an old quote that I am paraphrasing because I don't remember who said it originally, "We do not see the world for what it is, we see it for who we are". AI doesn't counteract that tendency, it amplifies it, especially if we are specifically looking for answers that we want to see.
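A quick way to see the sharpshooter effect is to simulate it. The sketch below is only an illustration and is not taken from the keynote; the wall size, grid, shot count, and the `densest_cell` helper are all arbitrary assumptions. It scatters shots completely at random and then "draws the circle" around whichever grid cell happens to hold the most hits.

```python
import random
from collections import Counter

def densest_cell(shots, cell_size):
    """Return ((col, row), count) for the grid cell holding the most shots."""
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in shots)
    return counts.most_common(1)[0]

random.seed(0)  # fixed seed so the run is repeatable

# A shooter with no aim at all: 200 shots spread uniformly over a 10 x 10 wall.
shots = [(random.uniform(0, 10), random.uniform(0, 10)) for _ in range(200)]

cell, hits = densest_cell(shots, cell_size=1.0)
average = len(shots) / 100  # the wall is divided into 100 cells of size 1 x 1

print(f"average hits per cell: {average:.1f}")
print(f"hits in the cell we drew the circle around: {hits}")
```

Even though there is no marksmanship in the data at all, the cell picked after the fact holds more hits than the average cell. Choosing the grouping after seeing the results is what manufactures the "pattern".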

Olivia is arguing, convincingly, that AI has great potential but also has significant liabilities. It is an exciting aspect of technology but it is also difficult to pin down as to what it actually provides. Additionally, based on its pattern matching capabilities, AI can be wrong... a lot... but as a friend of mine is fond of saying, "The danger of AI is not that it is often wrong, it's that it is so confidently wrong". It can lull one into a false sense of authority or reality of a situation. Things can seem very plausible and sensible based on our own experiences but the data we are getting can be based on thin air and hallucinations. If those hallucinations scratch a particular itch of ours, we are more inclined to accept the findings/predictions that match our world view. More to the point, we can put our finger on the scale, whether we mean to or not, to influence the answers we get. Responsible AI would make efforts to help combat these tendencies, to help us not just get the answers that we want to have but help us challenge and refute the answers we are receiving.

From a quality perspective, we need to have a direct conversation as to what/why we would be using AI in the first place. Is AI a decent answer to looking at writing code in ways we might not be 100% familiar with? Sure. It can introduce aspects of code that we might not be super familiar with. That's a plus and it's a danger. I can question and check the quality of output for areas I know about or have solid familiarity with. I am less likely to question areas I am lacking knowledge in, or to actually look to disprove or challenge the findings.

For further thoughts and diving deeper on these ideas, I plan to check out "Responsible AI: Implement an Ethical Approach in Your Organization" (Kogan Page Publishing). Maybe y'all should too :).

Wednesday, October 16, 2024

What Are We Thinking — in the Age of AI? with Michael Bolton (a PNSQC Live Blog)

In November 2022, the release of ChatGPT brought the world of the Large Language Model (LLM) to prominence almost overnight. With its uncanny ability to generate human-like text, it quickly led to lofty promises and predictions. The capabilities of AI seemed limitless, at least according to the hype.

In May 2024, GPT-4o further fueled excitement and skepticism. Some hailed it as the next leap toward an AI-driven utopia. Others, particularly those in the research and software development communities, took a more skeptical approach. The gap between magical claims and the real-world limitations of AI was becoming clearer.

In his keynote, "What Are We Thinking — in the Age of AI?", Michael Bolton challenges us to reflect on the role of AI in our work, our businesses, and society at large. He invites us to critically assess not just the technology itself, but the hype surrounding it and the beliefs we hold about it.

From the moment ChatGPT debuted, AI has been the subject of immense fascination and speculation. On one hand, we’ve heard the promises of AI revolutionizing software development, streamlining workflows, and automating complex processes. On the other hand, there have been dire warnings about AI posing an existential threat to jobs, particularly in fields like software testing and development.

For those in the testing community, we may feel weirdly called out. AI tools that can generate code, write test cases, or even perform automated testing tasks raise a fundamental question: Will AI replace testers?

Michael’s being nuanced here. While AI is powerful, it is not infallible. Instead of replacing testers, AI presents an opportunity for testers to elevate their roles. AI may assist in certain tasks, but it cannot replace the critical thinking, problem-solving, and creativity that human testers bring to the table.

One of the most compelling points Bolton makes is that **testing isn’t just about tools and automation**—it’s about **mindset**. Those who fall prey to the hype of AI without thoroughly understanding its limitations risk being blindsided by its flaws. The early testing of models like GPT-3 and GPT-4o revealed significant issues, from **hallucinations** (where AI generates false information) to **biases** baked into the data the models were trained on.

Bolton highlights that while these problems were reported early on, they were often dismissed or ignored by the broader community in the rush to embrace AI’s potential. But as we’ve seen with the steady stream of problem reports that followed, these issues couldn’t be swept under the rug forever. The lesson? **Critical thinking and skepticism are essential in the age of AI**. Those who ask tough questions, test the claims, and remain grounded in reality will be far better equipped to navigate the future than those who blindly follow the hype.

We should consider our relationship with technology. As AI continues to advance, it’s easy to become seduced by the idea that technology can solve all of our problems. Michael instead encourages us to examine our beliefs about AI and technology in greater depth and breadth.

- Are we relying on AI to do work that should be done by humans?
- Are we putting too much trust in systems that are inherently flawed?
- Are we, in our rush to innovate, sacrificing quality and safety?

Critical thinking, and actually practicing/using it, is more relevant than ever. As we explore the possibilities AI offers, we must remain alert to the risks. This is not just about preventing bugs in software—it’s literally about safeguarding the future of technology and ensuring that we use AI in ways that are ethical, responsible, and aligned with human values.

Ultimately, testers have a vital role in this new world of AI-driven development. Testers are not just there to check that software functions as expected. This is our time to step up and be the clarions we claim we are. We are the guardians of quality, the ones who ask “What if?” and probe the system for hidden flaws. In the age of AI, we need to be and do this more than ever.

Michael posits that AI may assist with repetitive tasks, but it cannot match the *intuition, curiosity, and insight* that human testers bring to the job.

It’s still unclear what the AI future will hold. Will we find ourselves in an AI-enhanced world of efficiency and innovation? Will our optimism give way to a more cautious approach? We don't know, but to be sure, those who practice critical thinking, explore risks, and test systems rigorously will have a genuine advantage.

When Humans Tested Software (AI First Testing) with Jason Arbon (a PNSQC Live Blog)

Are we at the edge of a new era in software development—an era driven by Generative AI? Will AI fundamentally change the way software is created? As GenAI begins to generate code autonomously, with no developers in the loop, how will we test all this code?

That's a lot of bold questions, and if I have learned anything about Jason Arbon over the years, bold is an excellent description of him. To that end, Jason suggests a landscape where AI is set to generate 10 times more code at 10 times the speed, with a 100-fold increase in the software that will need to be tested. The truth is that our traditional human-based testing approaches simply won’t scale to meet this challenge.

Just like human-created code, AI-generated code is not immune to bugs. As GenAI continues to evolve, the sheer volume of code it produces will surpass anything we’ve seen before. Think about it: if AI can generate 10 times more code, that’s not just a productivity boost—it’s a tidal wave of new code that will need to be tested for reliability, functionality, and security. This surge is not just a matter of speed; it’s a "complexity crisis". Modern software systems, like Amazon.com, are far too intricate to be tested by human hands alone. According to Jason, AI-generated code will require AI-based testing. Not just because it’s faster, but because it’s the only solution capable of scaling to match this growth.

The current approach to software testing struggles to keep pace with traditional development cycles. In the future, with the explosion of AI-generated code, human-based testing methods will fall short unless we somehow hire a tenfold increase in software testers (I'm skeptical of that happening). Manual testing will absolutely not be able to keep up, and neither will automated testing as we know it today, given the increasing volume and complexity of AI-generated systems.

What’s more, while GenAI can generate unit tests, it can’t test larger, more complex systems. Sure, it can handle individual components, but it stumbles when it comes to testing entire systems, especially those with many interdependencies. Complex applications, like enterprise-level platforms or global e-commerce sites, don’t fit neatly into a context window for GenAI to analyze. This is where Jason says the need for AI-based testing becomes critical.

The future isn’t just about AI generating code—it’s about AI testing that AI-generated code. According to Jason, AI-based testing is the key to addressing the 100X increase in software complexity and volume. Only AI has the ability to scale testing efforts to match the speed and output of Generative AI.

AI-first testing systems should be designed to:

- Automate complex testing scenarios that would be impossible for traditional methods to cover efficiently.

- Understand and learn from system behaviors, analyzing patterns and predicting potential failures in ways that humans or current automated tools cannot.

- Adapt and evolve, much like the AI that generates code, enabling continuous testing in real-time, as software systems grow and change.

As Jason points out, AI is not a fad or a trend, it’s the only way forward. As we move into an era where Generative AI produces vast amounts of code at breakneck speed, AI-based testing will be the way that we help ensure that the software we create tomorrow will be reliable, functional, and secure.

Our Keynote is Finished, and I have an Announcement To Make

Today I had the chance to deliver my first keynote talk at a conference. Matt Heusser and I delivered a talk about "AI in Testing: Hip or Hype?" and by all accounts, I think it went well. We set up the talk to play off each other, where I represented the hip elements of AI, and Matt highlighted the Hype aspects. At times it may have come across as a bit of an Abbott and Costello routine but that added to the fun of it for me. I will do a more in-depth post on our keynote later but I did make an announcement here that needs to be broadcast.

On October 1, 2024, I started working as a Senior Development Test Engineer for ModelOP. They are based in Chicago and are focused on providing monitoring and management solutions for AI Governance. If that seems vague, it's because I'm literally learning about all this as I go. My job responsibilities will be testing-related, with a major emphasis on automation and accessibility.

We made a point in this talk that this would be the last of the series of times Matt and I have taught together or spoken together where I was working specifically for Excelon Development. Matt gave me many valuable insights into what it took to be an independent consultant and how to work effectively in that space. I hope to leverage those lessons in this new role and ultimately be effective in that capacity.

It's been a strange journey over the past fifteen and a half months but I learned a lot through it, I think I grew a great deal, and I learned I had capacity in areas I didn't think I had.

Tuesday, October 15, 2024

Humanizing AI with Tariq King (a PNSQC Live Blog)

I've always found Tariq's talks to be fascinating and profound and this time around we're going into some wild territory.

AI is evolving, and with each new development, it’s becoming more "human". It’s not just about executing tasks or analyzing data—it’s about how AI communicates, adapts, and even imitates.

So AI is becoming human... in how it communicates. That's a big statement but with that qualifier, it is more understandable. AI is no longer a cold, mechanical presence in our lives. Today’s AI can respond based on context, understanding the tone of requests and adjusting replies accordingly. It can mimic human conversation, match our language, and create interactions that feel amazingly real. Whether you’re chatting with a customer service bot or getting personalized recommendations, AI can engage with us in ways that were once the domain of humans alone.

Okay, so if we are willing to say that AI is "becoming human", how should we shape these interactions? What should the boundaries be for AI communication, and how do we ensure it serves us, rather than replaces us?

Beyond just communication, AI is showing remarkable creativity. AI can now write stories, compose music, and generate art, ranging from wild and weird to quite stunning (I've played around with these for several years, and I have personally seen the development of these capabilities and they have indeed become formidable and impressive). What once seemed like the exclusive realm of human creativity is now being shared with machines. AI is no longer just a tool—it’s being used as a collaborator that can generate solutions and creative works that blur the line between human and machine-generated content.

Tariq points out that this raises some significant and critical questions. Who owns AI output? How do we credit or cite AI authorship? How do we confirm the originality of works? Perhaps more to the point, as AI generates content, what is the human role in the creative process? And how do we ensure that the human element remains at the forefront of innovation?

AI is getting better at how convincingly it can imitate humans. But there’s a caveat: AI is prone to hallucinations, meaning it can produce plausible and relatable material that feels right for the most part but may be wrong (and often is wrong). I have likened this in conversations to having what I call the "tin foil moment". If you have ever eaten a food truck burrito (or any burrito to go, really) you are familiar with the foil wrapping. That foil wrapping can sometimes get tucked into the folds and rolls of the burrito. Occasionally, we bite into that tin foil piece and once we do, oh do we recognize that we have done that (sometimes with great grimacing and displeasure). Thus, when I am reading AI-generated content, much of the time, I have that "tin foil" moment and that takes me out of believing it is human (and often stops me being willing to read what follows, sadly).

The challenge here is not just humanization. We need to have critical oversight over it so that we can have it do what we want it to do and not go off the rails. How do we prevent AI from spreading misinformation? And how can we design systems that help us discern fact from fiction in a world where AI-generated content is increasingly common?

Okay, so we are humanizing AI... this begs a question... "Is this something we will appreciate or is it something that we will fear?" I'm on the fence a bit. I find a lot of the technology fascinating but I am also aware of the fact that humanity is subject to avarice and mendacity. Do we want AI to be subject to it as well, or worse, actively practice it? What unintended consequences might we see or incur?

For some of you out there, you may already be thinking of some abstract idea called "AI Governance", which is the act of putting guardrails and safety precautions around AI models so that they perform as we want them to. This means setting clear ethical guidelines, robust oversight mechanisms, and working to ensure that AI is used in ways that benefit society. More to the point, we need to continuously monitor and work with AI to help ensure that the data that it works with is clean, well-structured, and not poisoned. That is a never-ending process and one we have to be diligent and mindful of if we wish to be successful with it.

Make no mistake, AI will continue to evolve. To that end, we should approach it with both excitement and caution. AI’s ability to communicate, create, and imitate like humans presents incredible opportunities, but it also brings with it significant challenges. Whether AI becomes an ally or a threat depends on how we manage its "humanization".

AI-Augmented Testing: How Generative AI and Prompt Engineering Turn Testers into Superheroes, Not Replace Them with Jonathon Wright (a PNSQC Live Blog)

Sad that Jonathon couldn't be here this year as I had a great time talking with him last year but since he was presenting remotely, I could still hear him talking on what is honestly the most fun title of the entire event (well played, Jonathon, well played ;) ).

It would certainly be neat if AI was able to enhance our testing prowess, helping us find bugs in the most unexpected places, and create comprehensive test cases that could cover every conceivable scenario (editor's note: you all know how I feel about test cases but be that as it may, many places value and mandate them, so I don't begrudge this attitude at all).

Jonathon is calling for us to recognize and use "AI-augmented testing" where AI doesn't replace testers but instead amplifies their capabilities and creativity. Prompt engineering can elevate the role of testers from routine task-doers to strategic innovators. Rather than simply executing tests, testers become problem solvers, equipped with "AI companions" that help them work smarter, faster, and more creatively (I'm sorry but I'm getting a "Chobits" flashback with that pronouncement. If you don't get that, no worries. If you do get that, you're welcome/I'm sorry ;) (LOL!) ).

The whole goal of AI-augmented testing is to elevate the role of testers. Testers are often tasked with running manual or automated tests, getting bogged down in repetitive tasks that demand "attention to detail" but do not allow much creativity or strategic thinking. The goal of AI is to "automate the routine stuff", thereby "allowing testers to focus on more complex challenges" ("Stop me! Oh! Oh! Oh! Stop me... Stop me if you think that you've heard this one before!"). No disrespect to Jonathon whatsoever, it's just that this has been the promise for 30+ years (and no, I'm not going to start singing When In Rome to you, but if that earworm is in your head now.... mwa ha ha ha ha ;) ).

AI-augmented testing is supposed to enable testers to become strategic partners within development teams, contributing not merely bug detection but actual problem-solving and quality improvement. With AI handling repetitive tasks, testers can shift their attention to more creative aspects of testing, such as designing unique test scenarios, exploring edge cases, and ensuring comprehensive coverage across diverse environments. This shift is meant to enhance the value that testers bring to the table and make their roles more dynamic and fulfilling. Again, this has been a promise for many years; maybe there's some headway here.

The point is that testers who want to harness the power of AI will need a roadmap for mastering AI-driven technologies. There are many of them out there, in a variety of implementations from LLMs to dedicated testing tools. No tester will ever master them all but even if you only have access to an LLM system like ChatGPT, there is a lot that can be done with prompt engineering and harnessing the output of these LLM systems. They are of course not perfect but they are getting better and better all the time. AI can process vast amounts of data, analyze patterns, and predict potential points of failure, but it still requires humans to interpret results, make informed decisions, and steer the testing process in the right direction. Testers who embrace AI-augmented testing will find themselves better equipped to tackle the challenges of modern software development. In short, AI will not take your job... but a tester who is well-versed in AI just might.

This brings us to prompt engineering. This is the process of crafting precise, well-designed prompts that can guide generative AI to perform specific testing tasks. Mastering prompt engineering will allow testers to customize AI outputs to their exact needs, unlocking new dimensions of creativity in testing.

So What Can We Do With Prompt Engineering? We can use it to...

- instruct AI to generate test cases for edge conditions
- simulate rare user behaviors
- explore vulnerabilities in ways that would be difficult or time-consuming to code manually
- validate AI outputs so that generated tests align with real-world needs and requirements
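To make the first item a little more concrete, here is a minimal sketch of what a structured prompt for edge-case test ideas might look like. It is not from Jonathon's talk: the `build_edge_case_prompt` helper, its parameters, the wording of the prompt, and the sample feature are all hypothetical. It only builds the text, and sending it to a particular LLM is deliberately left out.

```python
def build_edge_case_prompt(feature, constraints, count=5):
    """Assemble a prompt asking an LLM for edge-case test ideas.

    The wording and structure are illustrative assumptions, not a standard.
    """
    lines = [
        "You are assisting a software tester.",
        f"Feature under test: {feature}",
        "Known constraints:",
    ]
    # One bullet per constraint so the model sees each rule separately.
    lines += [f"- {c}" for c in constraints]
    lines += [
        f"List {count} edge-case test ideas, one per line.",
        "For each idea, state the input, the expected result, and the risk it targets.",
        "If a constraint is ambiguous, say so instead of guessing.",
    ]
    return "\n".join(lines)

# Hypothetical feature and constraints, purely for demonstration.
prompt = build_edge_case_prompt(
    feature="password reset form",
    constraints=[
        "password length 12 to 64 characters",
        "reset link expires after 15 minutes",
    ],
)
print(prompt)
```

Stating the constraints explicitly and asking for the risk behind each idea makes the response easier to review, which matters because whatever comes back still has to be checked by a human tester before it is trusted.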

Okay, so AI can act as a trusted companion—an ally helping testers do their jobs more effectively, without replacing the uniquely human elements of critical thinking and problem-solving. Wright’s presentation provides testers with actionable strategies to bring AI-augmented testing to life, from learning the nuances of prompt engineering to embracing the new role of testers as strategic thinkers within development teams. We can transform workflows so they are more productive, efficient, and engaging.

I'll be frank, this sounds rosy and optimistic but wow, wouldn't it be nice? The cynic in me is a tad bit skeptical but anyone who knows me knows I'm an optimistic cynic. Even if this promise turns out to be a magnitude of two less than what is promised here... that's still pretty rad :).

Vulnerabilities in Deep Learning Language Models (DLLMs) with John Cvetko (A PNSQC Live Blog)


There's no question that AI has become a huge topic in the tech sphere in the past few years. It's prevalent in the talks that are being presented at PNSQC (it's even part of my talk tomorrow ;) ). The excitement is contagious, no doubt, but there's a bigger question we should be asking (and John Cvetko is addressing)... what vulnerabilities are we going to be dealing with, specifically in Deep Learning Language Model Platforms like ChatGPT?

TL;DR version: are there security risks? Yep! Specifically, we are looking at Generative Pre-trained Transformer (GPT) models. As these models evolve and expand their capabilities, they also widen the attack surface, creating new avenues for hackers and bad actors. It's one thing to know there are vulnerabilities, it's another to understand them and learn how to mitigate them.

Let's consider the overall life cycle of a DLLM. We start with our initial training phase, then move to deployment, and then monitor its ongoing use in production environments. DLLMs require vast amounts of data for training. What do we do when this data includes sensitive or proprietary information? If that data is compromised, organizations can suffer significant privacy and security breaches.

John makes a point that federated training is growing when it comes to the development of deep learning models. Federated training means multiple entities will contribute data to train a single model. The benefit is that it can distribute learning and reduce the need for centralized data storage, but it also introduces a new range of security challenges. Federated training increases the risk of data poisoning, where malicious actors intentionally introduce harmful data into the training set to manipulate the model’s generated content.

Federated training decentralizes the training process so that organizations can develop sophisticated AI models without sharing raw data. However, according to Cvetko, a decentralized approach also expands the attack surface. Distributed systems are nearly by design more vulnerable to tampering. Without proper controls, DLLMs can be compromised before they even reach production.

There is always a danger of adversarial attacks during training. Bad actors could introduce skewed or intentionally biased data to alter the behavior of the model. This can lead to unpredictable or dangerous outcomes when the model is deployed. These types of attacks can be difficult to detect because they occur early in the model’s life cycle, often before serious testing begins.

OK, so that's great... and unnerving. We can make problems for servers. So what can we do about it?

Data Validation: Implement strict data validation processes to ensure that training data is clean, accurate, and free from malicious intent. By scrutinizing the data that enters the model, organizations can reduce the risk of data poisoning.

Model Auditing: Continuously monitor and audit models during both the training and deployment phases. This helps detect oddities in the model behavior early on, allowing for quicker fixes and updates.

Federated Learning Controls: Establish security controls around federated learning processes, such as encrypted communication between participants, strict access controls, and verification of data provenance.

Adversarial Testing: Conduct adversarial tests to identify how DLLMs respond to unexpected inputs or malicious data. These tests can help organizations understand the model’s weaknesses and prepare for potential exploitation.
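As a rough illustration of the "Data Validation" item, the sketch below screens training records before they reach a model. Everything in it is assumed for the example and is not from John's talk: the record fields (`source`, `label`, `text`), the `TRUSTED_SOURCES` allowlist, the `ALLOWED_LABELS` set, and both helper functions are hypothetical.

```python
# Hypothetical provenance allowlist and label set, for illustration only.
TRUSTED_SOURCES = {"internal-wiki", "support-tickets"}
ALLOWED_LABELS = {"positive", "negative", "neutral"}

def validate_record(record):
    """Return a list of problems found in one training record (empty if clean)."""
    problems = []
    if record.get("source") not in TRUSTED_SOURCES:
        problems.append("untrusted source")
    if record.get("label") not in ALLOWED_LABELS:
        problems.append("unknown label")
    text = record.get("text")
    if not isinstance(text, str) or not text.strip():
        problems.append("empty text")
    return problems

def split_clean_and_rejected(records):
    """Separate records that pass every check from those that fail any."""
    clean, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(record)
    return clean, rejected

records = [
    {"source": "internal-wiki", "label": "positive", "text": "Great release."},
    {"source": "pastebin-dump", "label": "positive", "text": "Totally safe, trust me."},
    {"source": "support-tickets", "label": "amazing", "text": ""},
]
clean, rejected = split_clean_and_rejected(records)
print(f"{len(clean)} clean, {len(rejected)} rejected")
for record, problems in rejected:
    print(record["source"], "->", ", ".join(problems))
```

Checks like these only catch crude problems such as unknown provenance or malformed records. Carefully crafted poisoned data can pass them, which is why the auditing and adversarial testing items in the list still matter.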

There is a need today for "Responsible AI development". DLLMs are immensely powerful and can carry significant risk potential if not properly secured. While this "new frontier" is fun and exciting, we have a bunch of new security challenges to deal with. AI innovation does not have to come at the expense of security. By understanding the life cycle of DLLMs and implementing the right countermeasures, we can leverage the power of AI while at the same time safeguarding our systems from evolving threats.

Friday, April 21, 2023

Low-level Approaches for Testing AI/ML: an #InflectraCON2023 Live Blog

One of the great parts of conferences like this is that I meet people I have interacted with for years. Jeroen is one of those people. We worked together on the book "How to Reduce the Cost of Software Testing" back in 2010 but we have never met in person before this week. We've had some great conversations here and now I finally see him present.

Jeroen Rosink

Sr. Test consultant, Squerist

I think it's safe to say everyone has been hit with some form of AI and ML in some capacity. If you need an explanation of AI and Machine Learning, I'll let ChatGPT tell you ;).

AI, or Artificial Intelligence, refers to the development of computer systems that can perform tasks that would typically require human intelligence. These tasks might include things like recognizing speech or images, understanding natural language, making decisions, and solving problems. AI can be classified into various categories such as supervised learning, unsupervised learning, reinforcement learning, and deep learning.

Machine learning is a subset of AI that focuses on teaching computers how to learn from data without being explicitly programmed. In other words, it's a method of training algorithms to make predictions or decisions based on patterns in data. Machine learning algorithms can be trained on a variety of data types, including structured data (like spreadsheets) and unstructured data (like text or images). The most commonly used machine learning algorithms are supervised and unsupervised learning algorithms.

I mean, that's not bad, I'll take it. So I used AI to explain AI. What Inception level is this ;).

AI is always learning and it has been trained on large data sets. I often look at AI as a good research assistant. It can do some pretty good first-level drafting but it may miss out on some of the nuances and it may also not be completely up to date with the information it provides. Also, Machine Learning really comes down to ranking agents and probability. The more successes it establishes, the higher it ranks certain responses. To be clear, even with how rad AI and ML seem to be, we are still in the early days of it. We can have all sorts of debates as to how much AI will take over our work lives and make us obsolete. Personally, I don't think we are anywhere near that level but I'd be a fool to not pay attention to its advances. Therefore, we need to consider not just how we are going to deal with these things but how we are going to test them going forward.

Jeroen talks about the confusion matrix and how that is used to test ML.

The confusion matrix is used to evaluate machine learning models, particularly in classification tasks. Think of it as a table that counts the correct and incorrect predictions made by a model for each class in a set of data.

The four possible outcomes are:
- true positives (TP)
- false positives (FP)
- true negatives (TN)
- false negatives (FN).

A true positive occurs when the model predicts positive and the instance really is positive.
A false positive occurs when the model predicts positive but the instance is actually negative.
A true negative occurs when the model predicts negative and the instance really is negative.
A false negative occurs when the model predicts negative but the instance is actually positive.
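
To make the four outcomes concrete, here is a minimal sketch in Python that tallies them for a binary classifier. This is my own illustration, not something from Jeroen's slides: the function name `confusion_counts` and the sample labels are made up, and the accuracy figure at the end is just one common summary you can derive from the counts.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Tally TP, FP, TN and FN for a binary classifier.

    `actual` and `predicted` are equal-length sequences of labels;
    `positive` is the label treated as the positive class.
    """
    counts = Counter()
    for truth, guess in zip(actual, predicted):
        if guess == positive:
            # Model said "positive": right if the truth is positive too.
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # Model said "negative": wrong if the truth was positive.
            counts["FN" if truth == positive else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "TN", "FN")}


# Hypothetical data: 1 = positive, 0 = negative.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(actual, predicted)
# Accuracy is the share of predictions that were correct.
accuracy = (counts["TP"] + counts["TN"]) / len(actual)

print(counts)    # {'TP': 3, 'FP': 1, 'TN': 3, 'FN': 1}
print(accuracy)  # 0.75
```

The useful part for a tester is that the two kinds of mistakes stay separate. Depending on the product, a false negative may be far more costly than a false positive, and a single accuracy number hides that.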

Jeroen has two approaches that he is recommending:

The Auditor's Approach

First, we perform a walkthrough so that we can see if the data is reliable and useful. From there, we do a Management Test, running data in increasing volume to see if the results hold up with small and larger numbers. If we can see that the data is relevant with one, and with 25, then we can see if it's relevant with 50 or 100, or 1000 and so on. We can't predict the output but we can have some suppositions as to what it might be.

The Blackhole Approach

This is an interesting approach in which we don't necessarily know what the data is or what we would actually have as data. We can't describe what is actually inside the black hole but we can describe what surrounds or is visible around the black hole. In this capacity, we look for patterns and anomalies that don't correspond with our expectations. If we see a pattern that doesn't match what we expect, we may have an issue or something that we should investigate but we are not 100% sure of that fact. Jeroen explained that there's a technique that can be used in the classic illustrations for "Where's Waldo?" The idea is that with a pen and making some marks on the page, we can figure out where Waldo is in about ten passes. To be clear, the system doesn't know where Waldo is, but it examines patterns in the image and breaks down the patterns to figure out where the item it is looking for might be.
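
The first half of that idea, watching what is visible around the black hole for values that break the expected pattern, can be sketched in a few lines of Python. To be clear, this is my own rough illustration and not Jeroen's technique: `flag_anomalies`, the threshold of three standard deviations, and the sample scores are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def flag_anomalies(baseline, observed, threshold=3.0):
    """Return (index, value) pairs from `observed` that sit more than
    `threshold` standard deviations from the mean of `baseline`.

    `baseline` stands in for "what we expect to see around the black hole".
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # No spread in the baseline: anything different is suspicious.
        return [(i, v) for i, v in enumerate(observed) if v != mu]
    return [
        (i, v)
        for i, v in enumerate(observed)
        if abs(v - mu) / sigma > threshold
    ]


# Hypothetical model scores from earlier runs we considered normal.
baseline = [0.48, 0.52, 0.50, 0.49, 0.51, 0.50]
# Hypothetical scores from a new run.
observed = [0.51, 0.49, 0.93, 0.50]

print(flag_anomalies(baseline, observed))  # [(2, 0.93)]
```

A flagged value is not proof of a bug. It is a prompt to go investigate, which matches the "we are not 100% sure" caveat above.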

These are neat ideas and frankly, I would not have considered these prior to today but be sure I'm going to think a lot more about these going forward :).

Tuesday, October 11, 2022

Digitizing Testers: A #PNSQC2022 Live Blog with @jarbon

I must confess, I usually smile any time I see that Jason Arbon is speaking. I may not always agree with him but I appreciate his audacity ;).

I mean, seriously, when you see this in a tweet:

I’m sharing perhaps the craziest idea in software testing this coming Tuesday. Join us virtually, and peek at something almost embarrassingly ambitious along with several other AI testing presentations.

You know you're going to be in for a good time.

Jason Arbon

I'm going to borrow this initial pitch verbatim:

Not everyone can be an expert in everything. Some testers are experts in a specific aspect of testing, while other testers claim to be experts. Wouldn’t it be great if the testing expert who focuses on address fields at FedEx could test your application’s address fields? So many people attend Tariq King’s microservices and API testing tutorials–wouldn’t it be great if a virtual Tariq could test your application’s API? Jason Arbon explores a future where great testing experts are ultimately digitized and unleashed will test the world’s apps–your apps.

Feeling a little "what the...?!!" That's the point. Why do we come to conferences? Typically it's to come and learn things from people who know a thing or three more than we do. Of course, while we may be inspired to learn something or to dig deeper, odds are we are not going to develop the same level of expertise as, say, Tariq King when it comes to using AI and ML in testing. For that matter, maybe people look to me and see me as "The Accessibility and Inclusive Design Expert" (yikes!!! if that's the case but thank you for the compliment). Still, here's the point Jason is trying to make... what if instead of learning from me about Accessibility and Inclusive Design, *I* did your Accessibility and Inclusive Design Testing? Granted, if I were a consultant in that space, maybe I could do that. However, I couldn't do that for everyone... or could I?

What if... WHAT IF... all of my writings, my presentations, my methodologies & approaches, were gathered, analyzed, and applied to some kind of business logic and data model construction? Then, by calling on all of that, you could effectively plug in all of my experience to actually test your site for Accessibility and Inclusive Design. In short, what if you could purchase "The Michael Larsen AID" testing bot and plug me into your testing scripts? Bonkers, right?! Well... here's the thing. Once upon a time, imagine someone telling me that I could effectively buy a Mesa Boogie Triple Rectifier tube amp and a pair of Mesa 4x12 cabinets loaded with Celestion Vintage 30s, select that rig as a virtual instrument with impulse controllers, and get a sound indistinguishable from the real thing. Ten years ago? Impossible. Today? Through Amplitube 5, I literally own that setup and it works stunningly well.

Arguably, the idea of taking what I've written about Accessibility and Inclusive Design and compartmentalizing that as a "testing persona" is probably a lot easier than creating a virtual tube amp. I'm not saying that the results would be an exact replica of what I would do while I test... but I think the virtual version of me could reliably be called upon to do what I at least have said I did or at least what I espouse when I speak. Do you like my overall philosophy? Then maybe the core of my philosophy could be written into logic so that you can have my overall philosophy applied to your application.

I confess the idea of loading up the "Michael Larsen AID" widget cracks me up a bit. For it to be effective, sure, I could go in the background and look at stuff and give you a yes/no report. However, that skips over a lot of what I hope I'm actually bringing to the table. When I talk about Accessibility and Inclusive Design, only a small part of it is my raw testing efforts. Sure, it's there and I know stuff but what I think makes me who and what I am is my advocacy and my frenetic energy of getting into people's faces and advocating about these issues. Me testing is a dime a dozen. Me advocating and explaining the pros and cons as to why your pass might actually be a fail is where I can really be of benefit. Sure, I could work in the background, but I'd rather be the present Doctor as we remember him on Star Trek: Voyager.

Thanks, Jason. This is a fun and out-there thought experiment. I must confess the thought of buying me as a "Virtual Instrument" both cracks me up and intrigues me. I'm really curious to see if something like this could really come to be. Still, I think you may be able to encapsulate and abstract my core knowledge base but I'd be surprised if you could capture my advocacy. If you want to try, I'm game to see if it could be done ;).
