That Was Fast: Stanford Yanks Alpaca Demo for Hallucinating

Removal due to ‘hosting costs and the inadequacies of our content filters’

March 24, 2023

3 Min Read

At a Glance

Stanford researchers confirm to AI Business that the demo was taken down due to ‘inadequacies of our content filters.’
The cost of hosting the demo was also a factor in its shutdown.
The underlying dataset and fine-tuning code remain available for download.

Researchers from Stanford University created a ChatGPT-style application for just $600. Days later, the team took down a demo of the bot over concerns about its responses.

Alpaca, a model built from a Meta language model, was developed as a cheaper, open-source but still powerful alternative that researchers can use to help solve problems found in large language models (LLMs). Most powerful LLMs are not open but remain proprietary to the companies that created them.

The researchers had published an interactive demo of Alpaca to gather feedback from the public, much as OpenAI did with ChatGPT. However, the model hallucinated responses to user queries, producing output that sounds authoritative but is wrong or nonsensical, to the point where the team opted to take the demo down.

In a statement sent to AI Business, researchers from Stanford’s Center for Research on Foundation Models confirmed the move.

“The original goal of releasing a demo was to disseminate our research in an accessible way. We feel that we have mostly achieved this goal and given the hosting costs and the inadequacies of our content filters, we decided to bring down the demo,” the center said.

The Stanford research team acknowledged in its initial publication of Alpaca that the model was susceptible to hallucinating responses, like most large language models. It cited the model's size as the reason for this shortcoming.

“Alpaca likely contains many other limitations associated with both the underlying language model and the instruction tuning data. However, we believe that the artifact will still be useful to the community, as it provides a relatively lightweight model that serves as a basis to study important deficiencies,” the Stanford team had said upon Alpaca’s launch.

Alpaca was created to be accessible: it can run on devices as small as a smartphone.

The research team said only the demo of Alpaca has been taken down; both the dataset used to train the model and the code used to fine-tune it are still available to download via GitHub.

The researchers are planning to release the Alpaca model weights but are waiting for guidance from Meta before doing so, as Alpaca is built on top of the seven-billion-parameter version of LLaMA.

Biggest weakness of large language models

AI language models, whether ChatGPT, LLaMA, Alpaca or GPT-4, share one major weakness: hallucinations.

Hallucinations occur when a conversational AI application powered by a large language model generates a response to a prompt that is either false or irrelevant to the original request.

As Big Tech rushes to get its AI models out as fast as possible to avoid being left behind, the strength of the guardrails around hallucinations could make or break a deployment. Take Google, for example. When the company unveiled its ChatGPT rival, Bard, last month at its ‘Live from Paris’ event, the chatbot gave a wrong answer in its first public demo. The stock of parent Alphabet plunged, wiping out $100 billion of market value overnight.

Ilya Sutskever, chief scientist and cofounder of ChatGPT creator OpenAI, recently said in an interview that such behaviors “stopped happening” with its newest model, GPT-4. While hallucinations may be reduced, LLMs still routinely produce them, and they remain an issue AI researchers are working to resolve.

About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
