Sponsored by Google Cloud
Choosing Your First Generative AI Use Cases
To get started with generative AI, first focus on areas that can improve human experiences with information.
Microsoft's PyRIT generates malicious prompts to stress-test models, reducing testing time from weeks to hours
Microsoft is releasing to the public the internal tool it uses to identify security vulnerabilities in its generative AI models.
Dubbed PyRIT (Python Risk Identification Toolkit), the tester can be used to assess language model endpoints for hallucinations, bias and the generation of prohibited content.
It can also identify potential ways the model can be used, like malware generation and jailbreaking, as well as potential privacy harms like identity theft.
The tool automates ‘red teaming’ tasks by sending malicious prompts. Upon receiving a response, it scores the model and then sends a new prompt to provide further testing.
Red teaming is when developers and security professionals stress-test AI models by exploiting gaps within the models' security architecture.
Microsoft used PyRIT to test one of its Copilot AI assistant systems, generating several thousand malicious prompts to evaluate its ability to cope with nefarious inputs. Using PyRIT, the testing process was completed in a matter of hours. Typically, testing would take weeks.
PyRIT, available via GitHub, can be used to find weaknesses in commercial applications under its MIT license, meaning it is free to use, revise and distribute.
The toolkit contains a list of demos including common scenarios notebooks, such as how to use PyRIT to automatically jailbreak systems.
PyRIT performs the red teaming process for you but it is not designed to replace manual evaluations, merely automating the more tedious parts of the process.
Instead, PyRIT’s goal is to provide developers with a baseline of how well their model and entire inference pipeline is doing so they can compare that baseline to future model iterations.
Microsoft said it made PyRIT open in the hopes of empowering security professionals and machine learning engineers to find risks in generative AI systems.
“This allows them to have empirical data on how well their model is doing today, and detect any degradation of performance based on future improvements,” the PyRIT GitHub page reads.
Microsoft previously opened access to its Counterfit testing tool but found it struggled with generative AI systems. Counterfit is still applicable, however, for traditional machine learning systems.
Read more about:
ChatGPT / Generative AIYou May Also Like