TikTok Parent Allegedly Used OpenAI’s API to Build Rival Models

ByteDance reportedly used OpenAI's API to build rival foundation models in violation of its terms of service

December 19, 2023

At a Glance

The Verge is reporting that ByteDance used OpenAI's API to build rival foundation models, violating its terms of service.
OpenAI has suspended ByteDance's account and is investigating the accusations.
The TikTok parent told AI Business its use of the OpenAI API is “to a very limited extent.”

TikTok parent ByteDance is reportedly violating OpenAI’s terms of service by using its tech to develop rival large language models.

According to The Verge, ByteDance is using OpenAI’s API to gather data to build its own foundation model, under the working name Project Seed. The Chinese company has been working on generative AI for some time, with its researchers creating powerful 3D generation models.

OpenAI’s rules for using its tech explicitly state that output from models like GPT-4 cannot be used to develop rival models. However, ByteDance is reportedly purchasing access to OpenAI’s tech via Microsoft, which has similar rules in place, and has been regularly maxing out its API access.

ByteDance is alleged to have used the API for almost the entirety of Project Seed's development, including training and model evaluation.

The Verge obtained employee conversations on Lark, ByteDance’s internal messaging platform, discussing how to “whitewash” evidence that the company has been using OpenAI’s tech for illicit purposes.

ByteDance developers, largely based in China, are alleged to have obfuscated their use of OpenAI’s API via data desensitization, where data is masked to protect it. This technique is usually used to protect business-sensitive information or personal data.

OpenAI told The Verge that ByteDance's ChatGPT account has since been suspended and that an investigation is ongoing.

ByteDance using OpenAI’s API ‘to a very limited extent’

A ByteDance spokesperson told AI Business that it “places great emphasis on following OpenAI's terms of use.”

“We use GPT to power products and features in non-China markets, but use our self-developed model to power Doubao, which is only available in China.”

Doubao is a conversational AI system built by ByteDance where users interact via images and text. According to ByteDance, a small group of its engineers used OpenAI’s API service for “an internal small experimental model which was never launched.”

The TikTok parent said that practice was “stopped immediately” in April and that it introduced a new internal requirement that text produced by GPT models should not be added to the training datasets of the company's self-developed models.

ByteDance then said its team conducted examinations and took measures to ensure its engineers were compliant, including conducting batch sampling and comparing the similarity of its labeled data to OpenAI’s results to “prevent inappropriate use by data annotators.”
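ByteDance did not describe how that check was implemented. Purely as an illustration of the general idea, a batch-sampling similarity check might look something like the sketch below. Everything in it is an assumption made for the example: the function names, the character-level similarity measure from Python's standard `difflib` module, and the 0.9 threshold are hypothetical and are not taken from ByteDance or OpenAI.

```python
import random
from difflib import SequenceMatcher


def sample_batch(records, batch_size, seed=0):
    """Draw a reproducible random batch of (labeled_text, reference_text) pairs."""
    rng = random.Random(seed)
    return rng.sample(records, min(batch_size, len(records)))


def similarity(a, b):
    """Character-level similarity ratio between 0 and 1, computed by difflib."""
    return SequenceMatcher(None, a, b).ratio()


def flag_suspicious(records, batch_size=100, threshold=0.9, seed=0):
    """Return sampled pairs whose similarity meets or exceeds the threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    batch = sample_batch(records, batch_size, seed)
    return [(label, ref) for label, ref in batch if similarity(label, ref) >= threshold]


if __name__ == "__main__":
    # Hypothetical data: annotator-written labels paired with model output.
    records = [
        ("The cat sat on the mat.", "The cat sat on the mat."),
        ("Paris is the capital of France.", "Bananas are yellow."),
    ]
    for label, ref in flag_suspicious(records, batch_size=2):
        print("Possible copy:", label)
```

In a sketch like this, a label that closely matches the model's output would be flagged for human review; a production check would likely need a more robust similarity measure than character overlap.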

“As of now, the engineering team uses the GPT APIs to a very limited extent during the evaluation/testing process, such as score benchmarking,” according to ByteDance.

Chinese tech giants like ByteDance, Baidu and Alibaba have rushed to build their own large language models in the wake of ChatGPT's popularity. Last week, a new Chinese supercomputer for training AI models was launched to support local efforts.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
