Microsoft's Free AI Security Tester for Gen AI Models

Microsoft's PyRIT generates malicious prompts to stress-test models, reducing testing time from weeks to hours

Ben Wodecki, Jr. Editor

February 23, 2024

2 Min Read
Image of a red lock
Getty Images

At a Glance

  • Microsoft is offering free use of its internal security testing tool for language models.

Microsoft is releasing to the public the internal tool it uses to identify security vulnerabilities in its generative AI models.

Dubbed PyRIT (Python Risk Identification Toolkit), the tester can be used to assess language model endpoints for hallucinations, bias and the generation of prohibited content.

It can also identify potential ways the model can be used, like malware generation and jailbreaking, as well as potential privacy harms like identity theft.

The tool automates ‘red teaming’ tasks by sending malicious prompts. Upon receiving a response, it scores the model and then sends a new prompt to provide further testing.

Red teaming is when developers and security professionals stress-test AI models by exploiting gaps within the models' security architecture.

Microsoft used PyRIT to test one of its Copilot AI assistant systems, generating several thousand malicious prompts to evaluate its ability to cope with nefarious inputs. Using PyRIT, the testing process was completed in a matter of hours. Typically, testing would take weeks.

PyRIT, available via GitHub, can be used to find weaknesses in commercial applications under its MIT license, meaning it is free to use, revise and distribute.

The toolkit contains a list of demos including common scenarios notebooks, such as how to use PyRIT to automatically jailbreak systems.

Related:Generative AI and the New Frontier in Cybersecurity

Automate tedious testing tasks

PyRIT performs the red teaming process for you but it is not designed to replace manual evaluations, merely automating the more tedious parts of the process.

Instead, PyRIT’s goal is to provide developers with a baseline of how well their model and entire inference pipeline is doing so they can compare that baseline to future model iterations.

Microsoft said it made PyRIT open in the hopes of empowering security professionals and machine learning engineers to find risks in generative AI systems.

“This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements,” the PyRIT GitHub page reads.

Microsoft previously opened access to its Counterfit testing tool but found it struggled with generative AI systems. Counterfit is still applicable, however, for traditional machine learning systems.

Read more about:

ChatGPT / Generative AI

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.

Keep up with the ever-evolving AI landscape
Unlock exclusive AI content by subscribing to our newsletter!!

You May Also Like