OpenAI Develops AI Text Detector, Debates Public Release
OpenAI develops an accurate AI text detector but hesitates to release it, focusing instead on image and audio authentication tools
OpenAI says it has developed a solution that would let people detect if ChatGPT generated a piece of text, but reportedly is unsure whether it should release it.
The Microsoft-backed AI firm sought to develop an AI detector tool after its AI Text Classifier system was shut down due to its poor performance.
OpenAI said earlier this week that it developed a “highly accurate” text watermarking method.
However, according to the Wall Street Journal, OpenAI has been hesitant to release it, with internal leaders debating a potential public launch for two years.
The company is concerned that the tool, which has been ready for public use for at least a year, “could stigmatize [the] use of AI as a useful writing tool for non-native English speakers.”
The detection tool can also be circumvented by what OpenAI described as “globalized tampering,” such as using a translation system, rewording the output with another AI model, or asking ChatGPT to insert a special character between every word and then deleting that character.
OpenAI said it’s also been testing whether metadata can be used to detect AI-generated text.
While in the early stages of its research, OpenAI said characteristics of metadata make the approach “particularly promising.”
“Unlike watermarking, metadata is cryptographically signed, which means that there are no false positives,” according to a company blog post. “We expect this will be increasingly important as the volume of generated text increases.”
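To illustrate why a signature check does not produce false positives, here is a minimal Python sketch. It is not OpenAI's method, which the company has not described in detail. It swaps in a shared-secret HMAC from the Python standard library as a stand-in for a real signing scheme, and the key, field names and function names are all hypothetical. The point is only that a record verifies when it was produced with the key and the text is unchanged, so text that was never signed cannot pass by accident.

```python
import hashlib
import hmac
import json

# Hypothetical key for illustration only; a real system would use a
# protected signing key, typically a public/private key pair.
SECRET_KEY = b"demo-key-not-a-real-secret"


def sign_text(text: str, generator: str) -> dict:
    """Build a metadata record for the text and sign it with the key."""
    metadata = {
        "generator": generator,
        "sha256": hashlib.sha256(text.encode("utf-8")).hexdigest(),
    }
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return {"metadata": metadata, "signature": signature}


def verify_text(text: str, record: dict) -> bool:
    """Return True only if the signature is valid and the text is unchanged."""
    payload = json.dumps(record["metadata"], sort_keys=True).encode("utf-8")
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, record["signature"]):
        return False
    text_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return record["metadata"]["sha256"] == text_hash


record = sign_text("A generated sentence.", "demo-model")
print(verify_text("A generated sentence.", record))  # signed text, unchanged
print(verify_text("A reworded sentence.", record))   # text no longer matches
```

The sketch also shows the limitation: once the text is reworded or the record is stripped, nothing is left to verify, so a failed check does not prove that a human wrote the text.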
Focusing on Images, Video and Audio
OpenAI said it’s instead prioritizing plans to launch detection tools for audiovisual content as they “present higher levels of risk at this stage of capabilities of our models.”
Those audiovisual plans include adding C2PA metadata to images generated by DALL-E in ChatGPT.
OpenAI joins TikTok, Adobe and Microsoft in using the content credentials technology. It was developed by the Coalition for Content Provenance and Authenticity and attaches metadata to a piece of content which can then be traced back to its origin, even if the image is later edited.
OpenAI also confirmed that C2PA metadata will be included in content created by its video generation model, Sora. OpenAI said the model is still being developed and that the C2PA metadata will be included when it launches broadly.
“People can still create deceptive content without this information (or can remove it), but they cannot easily fake or alter this information, making it an important resource to build trust,” according to an OpenAI blog post. “Over time, we believe this kind of metadata will be something people come to expect, filling a crucial gap in digital content authenticity practices.”
OpenAI also announced that users can apply to test its new tamper-resistant watermarking solution for images. The company wants feedback on the solution from research labs and research-oriented journalism nonprofits.
The tool predicts the likelihood that an image was generated by OpenAI’s image generation model.
OpenAI said the tool correctly identified around 98% of images created by DALL-E 3 while just 0.5% of non-AI generated images were incorrectly tagged as being made by the model.
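How much a flag can be trusted depends on how common AI-generated images are among the images being checked, a figure OpenAI did not give. The short Python sketch below works through the arithmetic using the two reported rates. The share of AI-generated images (1% and 50%) is a hypothetical assumption chosen for illustration, as is the function name.

```python
def flagged_precision(true_positive_rate: float,
                      false_positive_rate: float,
                      ai_share: float) -> float:
    """Fraction of flagged images that really are AI-generated.

    ai_share is the assumed fraction of checked images that the model
    actually generated; it is not a figure reported by OpenAI.
    """
    true_flags = true_positive_rate * ai_share
    false_flags = false_positive_rate * (1.0 - ai_share)
    total_flags = true_flags + false_flags
    if total_flags == 0:
        raise ValueError("no images would be flagged under these rates")
    return true_flags / total_flags


# Reported rates: about 98% of DALL-E 3 images detected,
# 0.5% of non-AI images wrongly flagged.
for share in (0.01, 0.50):
    precision = flagged_precision(0.98, 0.005, share)
    print(f"AI share {share:.0%}: {precision:.1%} of flags are correct")
```

Under these assumptions, roughly two-thirds of flags are correct when 1% of the checked images are AI-generated, and more than 99% are correct when half of them are.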
Audio detection solutions are also in the works as OpenAI aims to enhance ChatGPT’s capabilities with an upgraded Voice Mode, focusing on expanding its audio generation features.
OpenAI said it’s added audio watermarking into Voice Engine, its new voice generation model currently kept in a limited research preview.
First showcased in April, Voice Engine can generate “emotive and realistic voices.” However, OpenAI stopped short of releasing it over concerns about potential misuse.
“We are committed to continuing our research in these areas to ensure that our advancements in audio technologies are equally transparent and secure,” the company said.