November 17, 2022
AI researchers from Meta have joined forces with open-source ML community Papers with Code to develop Galactica: a large language model that can organize the massive trove of content in scientific papers.
Galactica can scour scientific papers for answers, explore available literature and even write scientific code and academic papers. It can also create citations and other references to help authors in their writing.
But renowned experts quickly criticized the output as "statistical nonsense" and "dangerous," warning that it could "usher in an era of deep scientific fakes." After a few days, Galactica was pulled.
The researchers had been ambitious, training Galactica on 48 million papers, textbooks and lecture notes, along with proteins, scientific websites and encyclopedias.
The researchers created five Galactica models, ranging in size from 125 million to 120 billion parameters. Galactica’s performance “increases smoothly with scale,” according to a paper by the AI researchers. All models are open source and freely available on GitHub.
Galactica vs. GPT-3
In terms of benchmarks, the developers behind Galactica state it outperforms other large language models trained with generic text data.
“On technical knowledge probes such as LaTeX equations, Galactica outperforms the latest GPT-3 by 68.2% versus 49.0%,” the paper reads.
The researchers also claim Galactica performs well on reasoning, outperforming DeepMind’s Chinchilla on the mathematical MMLU benchmark test by 41.3% to 35.7%, and Google’s PaLM 540B on MATH with a score of 20.4% versus 8.8%.
“We believe these results demonstrate the potential for language models as a new interface for science. We open source the model for the benefit of the scientific community,” the authors wrote.
Galactica Generates Fake Papers?
After testing Galactica, however, several experts publicly shared their concerns.
“This could usher in an era of deep scientific fakes,” said Michael Black, director of the renowned Max Planck Institute for Intelligent Systems, via Twitter. The answers Galactica surfaced can be incorrect but written in a way that is “grammatical and feels real,” he said.
Black said he asked Galactica about facts he personally knows and “in all cases, it was wrong or biased but sounded right and authoritative. I think it's dangerous.”
“It offers authoritative-sounding science that isn't grounded in the scientific method. It produces pseudo-science based on statistical properties of science ‘writing.’ Grammatical science writing is not the same as doing science. But it will be hard to distinguish,” wrote Black.
Black asked the model for information on estimating realistic 3D human avatars in clothing from a single image or video. The model offered “a fictitious paper and associated GitHub repository” from a real author, Albert Pumarola from Meta’s Reality Labs.
It happened again when Galactica gave Black an abstract from a fictitious paper attributed to a real Google AI researcher, Thiemo Alldieck.
“Alldieck and Pumarola will get citations for papers they didn't write. These papers will then be cited by others in real papers. What a mess this will be,” Black wrote.
Renowned software engineer Grady Booch, one of the trio that developed the Unified Modeling Language, also had concerns. He described Galactica as “little more than statistical nonsense at scale.”
Prof. Emily M. Bender, director of the University of Washington’s Computational Linguistics Laboratory, said it's "not surprising in the least" that Galactica generates text that is both “fluent and … wrong.”
She added that it was “entirely predictable that it would behave this way.”
Galactica itself posted warnings about the results. In every generation, the user sees fine print that reads, "WARNING: Outputs may be unreliable! Language Models are prone to hallucinate text."
AI Business has reached out to Meta and Papers with Code for comment.
The original article was updated on Nov. 21 with the researchers' decision to pull Galactica from public use.