February 6, 2024
The likes of Meta and Mistral have been touting their open source AI models as viable alternatives to proprietary systems from OpenAI.
But what makes an open source machine learning system truly open source? Consider Meta’s Llama 2 – while the model’s weights and evaluation code were made available, the company did not disclose its training data.
Amazon machine learning strategist Julia Ferraioli said that a system being free does not make it open.
Speaking at the State of Open Con event in London, Ferraioli cautioned that being able to view model checkpoints or weights does not, by itself, make a machine learning system open source.
“For [a machine learning system] to be open, I need to be able to question it,” Ferraioli said. She proposed a litmus test for whether a system is truly open source: whether a user can access the model, underlying data, code and metadata.
“As models are essentially just very large matrices, I need all of that other information to be open and disclosed,” Ferraioli said. “If yes, I can verify it. I can reproduce it. I can change it. And what's more, I can vehemently disagree with it, which is an important aspect of open source.”
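As a rough illustration, Ferraioli's four-part test could be written as a simple checklist. The sketch below is only one way to express it: the `Release` class, its field names and both helper functions are hypothetical names chosen for this example, not part of any specification she presented, and the sample release is invented.

```python
from dataclasses import dataclass

# The four components named in the litmus test: model, data, code, metadata.
COMPONENTS = ("model", "data", "code", "metadata")


@dataclass(frozen=True)
class Release:
    """Hypothetical record of which parts of an ML system are accessible."""
    model: bool
    data: bool
    code: bool
    metadata: bool


def missing_components(release: Release) -> list[str]:
    """Return the names of the components a user cannot access."""
    return [name for name in COMPONENTS if not getattr(release, name)]


def passes_litmus_test(release: Release) -> bool:
    """A release passes only if all four components are accessible."""
    return not missing_components(release)


# Invented example: weights and code are published, data and metadata are not.
weights_and_code_only = Release(model=True, data=False, code=True, metadata=False)
print(passes_litmus_test(weights_and_code_only))   # False
print(missing_components(weights_and_code_only))   # ['data', 'metadata']
```

Under this reading, a release that publishes weights but withholds training data fails the test no matter how freely the weights can be downloaded.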
The open source AI field is constantly evolving, with new systems emerging frequently amid the generative AI wave.
Ferraioli described machine learning as the foundation on which generative AI systems have emerged. But for scientists to be able to trust these generative systems, Ferraioli said experts need to know how a system was trained, what it was trained on and which tasks it is appropriate for.
Companies and community groups looking to open source their systems therefore need to disclose far more than model weights: the underlying data, code and metadata as well.
The Amazon strategist said that while some may question whether open source machine learning needs all of those underlying components, providing access to them is what makes a system genuinely open source.
“Just because something is hard, does not mean you should not try,” she added. “By breaking things down into their component parts, and not boiling things down into an overly reductionist model, we can create a specification of open source machine learning that focuses on what is important.”
Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.