AI Risk Mitigation: The Role of Testing

AI risks have grown with enterprise use but at the intersection of AI and risk mitigation lies the critical testing field

David Colwell, VP of AI & ML, Tricentis

October 15, 2024

3 Min Read
A software tester at a desk
Getty images

Organizations regularly using generative AI have almost doubled since 2023, reporting measurable benefits. Further research reveals that nearly a third of global DevOps teams estimate that AI-augmented tools will save the equivalent of an entire working week each month. Yet, there has also been a 474% increase in Fortune 500 companies listing AI as a risk to their business. As shown by MIT's recent AI Risk Repository, AI risks have grown with enterprise use. At the intersection of AI and risk mitigation lies the critical testing field.

As new legislation passes, testing and quality assurance become the bedrock of safe and responsible AI deployment and regulatory compliance. The EU AI Act and US Executive Order both reference test reports as core assets, while technology vendors such as Microsoft increasingly require them.

AI systems are less transparent than traditional algorithms, introducing new uncertainties and failure types. Therefore, we require novel testing approaches to ensure tools function as intended without unintended consequences. Testing must investigate edge cases, pressure improper responses, and expose unobserved vulnerabilities, biases, and failure modes. Only by doing this can we more confidently establish integrity and stability, defend against security breaches, and ensure optimal performance.

Related:Is AI the Answer to Achieving the 4-Day Week?

Rigorous Testing Approach to AI

Establishing a rigorous testing approach to AI begins with risk assessment. Software development and delivery teams must appraise how users interact with an AI system's functionality to determine the likelihood of failures and their potential severity. Identifying associated risks, whether legal, operational, reputational, security or cost-based, is an essential first step.

Human input is critical. AI systems lack human judgment, ethical reasoning, or an understanding of social nuance. They can produce false, biased, or harmful outputs. To manage these issues and generate the greatest value from AI output, development teams must first understand the system's behavior, capacity limitations, and complexities. They must get to grips with data science basics, the nuances of different AI models, and their training methods. They must also possess insight into their system's unique failure modes, from lack of logical reasoning to hallucinations. 

Red teaming reports are becoming a recognized AI standard, akin to the SOC 2 cybersecurity framework. This structured testing technique uncovers specific AI system flaws and identifies priorities for risk mitigation by recreating real-world attacks and threat actor techniques. Examining an AI model in this way tests the limits of its capabilities and ensures the system is safe, secure, and prepared for real-world scenarios.

Related:Professional Services Face Generative AI-Driven Pricing Conundrum

Transparency, communication, and documentation are also critical elements of a successful AI testing strategy, especially in meeting compliance and audit requirements outlined by recent regulations.

Continuous Evolution

However, we must remember that AI systems are constantly developing, meaning testing strategies must change with them. Continuous testing and regularly monitored testing approaches ensure that AI systems adapt to new developments, requirements, and emerging threats to maintain their integrity and reliability over time. 

New approaches like retrieval-augmented generation (RAG) are emerging as practical testing tools to reduce AI risks. By pulling real-time, relevant information from external knowledge bases, RAG grounds an AI's outputs in verified data, providing more precise and contextually accurate answers and significantly reducing hallucinations. As such, RAG can be implemented to create powerful, specialized AI tools capable of handling complex software testing tasks that general-purpose models might not effectively address.

Without comprehensive testing, software development teams will struggle to secure reliable, accessible, and responsible AI tools, which will, in turn, make regulatory compliance difficult. Therefore, crafting effective testing strategies is crucial for delivering safe and secure user experiences grounded in trust and dependability. Combining human oversight, recognition of AI limitations, and techniques such as red teaming and RAG can create safer, more effective AI systems that better serve human and business needs.

About the Author

David Colwell

VP of AI & ML, Tricentis, Tricentis

David Colwell is a credible voice in the field of AI, having built Tricentis’ AI team from the ground up and with over a decade of deep experience in developing, testing, and deploying AI, Machine Learning and neural networks to production. He is the co-inventor of the Vision AI product, an AI-based test automation feature in Tricentis’ flagship intelligent test automation product, Tosca, for which he received a patent from the USPTO for a new method for single-pass optical character recognition, with the aim of accelerating and enabling faster AI text recognition.

Keep up with the ever-evolving AI landscape
Unlock exclusive AI content by subscribing to our newsletter!!

You May Also Like