6 Latest AI Models from Meta, OpenAI, Apple and More

AI Business highlights the latest and most significant AI models from industry leaders like Meta, OpenAI and Apple

Ben Wodecki, Jr. Editor

May 1, 2024

10 Min Read

It's hard to keep up when AI models launch daily. AI Business presents six of the most significant models that dropped in the past few weeks you may have missed.

Apple’s OpenELM

Is it open source? Yes

OpenELM sees Apple join the growing list of companies developing open source AI systems.

The iPhone maker is a relative newcomer to open source AI and OpenELM marks its first model release. Last month, the company offered a glimpse at a multimodal AI system called MM1.

OpenELM, which stands for Open-source Efficient Language Models, is small in stature. It comes in sizes ranging from 270 million parameters up to 3 billion.


OpenELM comes in two variants: a pre-trained version and an instruction-tuned version suited to responding to natural language instructions.

Each is a text generation model trained using CoreNet, Apple’s new deep neural network library, on a dataset of roughly 1.8 trillion tokens. Apple sourced the training data from publicly available sources, including RefinedWeb (which was used to build the Falcon model), a deduplicated version of the Pile and a subset of the Dolma v1.6 corpus.

Apple’s models employ layer-wise scaling, an architectural technique that allocates parameters more efficiently within each of the model’s layers, to improve the accuracy of their responses.
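Conceptually, layer-wise scaling swaps a uniform layer width for a per-layer multiplier that grows with depth. The sketch below illustrates the idea with made-up multiplier bounds and dimensions, not Apple’s actual configuration:

```python
def layer_wise_scaling(num_layers, min_mult, max_mult):
    """Linearly interpolate a width multiplier across layers.

    Early layers get smaller multipliers (fewer parameters) and
    later layers get larger ones, instead of every layer sharing
    one uniform width.
    """
    if num_layers == 1:
        return [min_mult]
    step = (max_mult - min_mult) / (num_layers - 1)
    return [min_mult + i * step for i in range(num_layers)]

# Hypothetical 8-layer model: feed-forward width multipliers from 0.5x to 4x
mults = layer_wise_scaling(8, 0.5, 4.0)
base_dim = 512
ffn_dims = [int(base_dim * m) for m in mults]
print(ffn_dims)  # widths grow from 256 up to 2048 across the layers
```

The total parameter budget stays comparable to a uniform-width model, but more capacity lands in the layers that benefit from it.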


How to Access OpenELM

Apple’s OpenELM models can be found on Hugging Face while the CoreNet library can be found on GitHub.

Apple’s OpenELM models can be used to power commercial applications. Their license, however, is stricter than those of traditional open source models, with specific terms that must be adhered to, including not implying endorsement by Apple.

Read more about Apple OpenELM

Apple/OpenELM

OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

Read more Apple stories on AI Business

Apple Launches First Multimodal AI Model

Apple Launches Open-Source Tools for AI Developers Using Macs

Apple Puts Brakes on EV Plans to Focus on Gen AI

Snowflake’s Arctic

Is it open source? Yes

Snowflake recently unveiled Arctic, a large language model the company claims is the “most open enterprise-grade large language model.”

Arctic is a large language model designed for enterprises. At 17 billion parameters, it does not need huge amounts of compute to run.

The model is designed to expertly follow instructions and perform tasks like code generation while being more cost-effective to run compared to other open source models.


Users can also build atop Arctic, designing custom models optimized for specific enterprise use cases.

Arctic performs on par with or better than more expensive-to-run models, like the eight billion parameter version of Meta’s new Llama 3, on enterprise-focused benchmarks. Snowflake claims its new model runs at half the cost.

Unlike its Meta rival, Snowflake touts its Arctic model as being truly open, in that users have access to the model, its weights, code and all the data recipes used to power it.

Meta has not disclosed the datasets it used to train its latest Llama models, despite opening access to them.

Mike Finley, CTO and co-founder of AnswerRocket, said the cost to train Arctic will “make your chief financial officer smile.”

“This snowflake trains like a butterfly and thinks like a bee,” Finley said. “The parameter size, at 17 billion, is notable because it is in an empty band in this competitive space. It's smaller than the full-sized models by 50% but CTOs will note that it beats many of those larger models on important benchmarks.”

How to Access Snowflake Arctic

Snowflake Arctic can be downloaded from Hugging Face.

The model is also available from a variety of cloud providers, including AWS, Azure and Nvidia’s API catalog.


The company’s GitHub repository contains recipes to improve Arctic’s inference and fine-tuning.

Read more about Snowflake Arctic

Snowflake Arctic: The Best LLM for Enterprise AI — Efficiently Intelligent, Truly Open

Snowflake Arctic Cookbook

Read more Snowflake stories on AI Business

Snowflake’s $100 Million Startup Fund

Snowflake's New AI Services to Tap Large Language Models

Microsoft’s Phi-3 Mini

Is it open source? Yes

The latest in Microsoft’s small language model efforts, Phi-3 Mini is just 3.8 billion parameters in size but outperforms models more than double its size.

The new model boasts improved reasoning, coding and math capabilities compared to prior Phi models.

It’s the first model of its size to boast a context window of up to 128K tokens, enabling it to handle sizable inputs without degrading the quality of its responses.
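Whether a given document fits a 128K-token window can be estimated with a rough tokens-per-word heuristic. The 1.3 ratio below is an assumption for English text; real counts depend on the model’s tokenizer:

```python
def fits_context(text, context_tokens=128_000, tokens_per_word=1.3):
    """Roughly check whether a document fits a model's context window.

    Uses a crude ~1.3 tokens-per-English-word heuristic; actual token
    counts vary with the tokenizer and the language of the text.
    """
    est_tokens = int(len(text.split()) * tokens_per_word)
    return est_tokens, est_tokens <= context_tokens

doc = "word " * 50_000  # stand-in for a ~50,000-word document
est, ok = fits_context(doc)
print(est, ok)  # ~65,000 estimated tokens, comfortably inside 128K
```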

Phi-3 can be used out of the box as it’s instruction-tuned, meaning it’s suitable for deployments that require it to follow instructions right from the get-go.

Due to its small size, Phi-3 Mini can be used in edge applications, such as on smartphones or on sensors in industrial environments.

Microsoft launched Phi-2 only last December and is continuing to press ahead with small model development.

“What we’re going to start to see is not a shift from large to small but a shift from a singular category of models to a portfolio of models where customers get the ability to make a decision on what is the best model for their scenario,” said Sonali Yadav, Microsoft’s principal product manager for generative AI.

Further Phi-3 models are on the way, with Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters) launching “in the coming weeks.”

How to Access Microsoft Phi-3

The new Phi-3 Mini model is available on Azure AI Studio, Hugging Face and Ollama.

It’s also available on Nvidia’s Nim platform with a standard API interface that can be deployed anywhere.

Read more about Microsoft Phi-3 Mini

Introducing Phi-3: Redefining what’s possible with SLMs

Getting Started - Generative AI with Phi-3-mini: A Guide to Inference and Deployment

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Read more Microsoft stories on AI Business

Microsoft Launches London AI Hub Focused on Language Model Research

Microsoft Invests $2.9 Billion to Enhance Cloud, AI Infrastructure in Japan

Microsoft is Now Worth $3 Trillion, Thanks to Gen AI

Read more about small language models

Microsoft Launches Phi-2: A Small Yet Mighty AI Model

Small Language Models Gaining Ground at Enterprises

3 Most Common Problems with Small Language Models

Megalodon

Who built it? Meta, University of Southern California, Carnegie Mellon University, University of California San Diego

Is it open source? Yes

Meta’s AI work is best known for models named after llamas, though the company has borrowed other animal names before, dubbing a model Humpback last August.

Now, the Facebook parent has teamed up with university researchers to build an AI model named after one of the largest animals in history: the megalodon.

Named after a giant species of shark that lived millions of years ago, Meta’s Megalodon is built for massive scope, designed to handle far longer inputs than the company’s Llama line of models. Meta has given no official word on the model’s parameter count, however.

Meta’s Megalodon is designed to tackle huge and complex tasks, with the model capable of handling incredibly long pieces of information smoothly.

The model can understand and generate responses for extended conversations or documents without losing context.
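One ingredient behind that long-range ability is the exponential moving average mechanism from MEGA, the architecture Megalodon extends, which summarizes everything seen so far in a fixed-size state rather than attending over every past token. A simplified, real-valued sketch with made-up numbers (Megalodon’s actual variant is complex-valued and learned):

```python
def ema_over_sequence(values, alpha=0.1):
    """Exponential moving average across a token sequence.

    Each position blends the new input with a decaying summary of
    everything before it, so long-range context is carried forward
    in a fixed-size state.
    """
    state = 0.0
    out = []
    for v in values:
        state = alpha * v + (1 - alpha) * state
        out.append(state)
    return out

smoothed = ema_over_sequence([1.0, 0.0, 0.0, 1.0])
print(smoothed)  # early inputs decay but still influence later positions
```

Because the state never grows with sequence length, cost per token stays flat no matter how long the input runs.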

It's also designed to be faster and more scalable than older models. This makes Megalodon not just powerful, but also quick when dealing with big tasks that involve lots of data.

However, the new Meta model was not compared against Llama 3, the latest in the Llama series of large language models.

Meta is working on a giant 400-billion-parameter version of its new Llama 3. Megalodon is not that model.

How to Access Meta Megalodon

Meta Megalodon can be found on GitHub, with a Discord server available where users can troubleshoot with other AI experts.

Read more about Meta Megalodon

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Read more Meta stories on AI Business

Meta Unveils Llama 3, the Most Powerful Open Source Model Yet

Meta Ray-Ban Smart Glasses Get AI Boost

Meta Launches AI Assistant Across Facebook, Instagram, WhatsApp

Mistral AI’s Mixtral 8x22B

Is it open source? Yes - It’s available under an Apache 2.0 license

Parameters: 141 billion

Mixtral 8x22B was released by French AI startup Mistral AI. Despite its 141 billion parameters, its file size is just 218GB, meaning most consumer laptops can store it.

Businesses can use Mixtral 8x22B to power their AI applications. It is available under an Apache 2.0 license so users can create their own proprietary software and offer the licensed code to customers.

Benchmark tests show it is as powerful as Meta’s Llama 2 and OpenAI’s GPT-3.5, and it activates only a portion of its parameters to generate a response, a feature Mistral touts as “offering unparalleled cost efficiency for its size.”

Mixtral 8x22B works differently from most models in that it is a mixture-of-experts (MoE) system. In an MoE model, a router sends each input to a small subset of expert subnetworks and combines their outputs, akin to consulting a committee of specialists rather than one sole decision-maker.
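A toy sketch of that routing step, with made-up experts and router weights, shows the mechanics: score every expert, keep only the top two, and blend their outputs by softmax weight:

```python
import math

def moe_forward(x, experts, gate_weights, top_k=2):
    """Minimal mixture-of-experts step for a single input vector.

    A router scores every expert, keeps only the top_k highest-scoring
    ones, and combines their outputs weighted by a softmax over the
    surviving scores. The other experts are never evaluated.
    """
    scores = [sum(w * xi for w, xi in zip(ws, x)) for ws in gate_weights]
    top = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    exp_scores = [math.exp(scores[i]) for i in top]
    total = sum(exp_scores)
    weights = [e / total for e in exp_scores]
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy setup: four "experts" are simple scalar functions of the input sum
experts = [
    lambda x: sum(x) * 1.0,
    lambda x: sum(x) * 2.0,
    lambda x: sum(x) * 3.0,
    lambda x: sum(x) * 4.0,
]
gate_weights = [[0.1, 0.0], [0.9, 0.0], [0.2, 0.0], [0.0, 0.5]]
y = moe_forward([1.0, 1.0], experts, gate_weights, top_k=2)
print(y)  # blend of experts 1 and 3 only; experts 0 and 2 never run
```

This is why a 141-billion-parameter MoE model can be cheap per query: only the selected experts’ parameters do work for any given input.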

Mistral has previously developed MoE models, including Mixtral 8x7B. Other notable examples of MoE AI systems include Switch Transformers from the team at Google Brain (now Google DeepMind) and the new Gemini 1.5, also from Google.

How to Access Mixtral 8x22B

The new Mistral model can be demoed via Together AI's API and tested in Perplexity.ai's model playground, allowing users to experience its language generation capabilities firsthand.

To download it, you’ll need to decipher a cryptic tweet from Mistral.


The tweet contains a torrent link. Pasting it into a BitTorrent client gives users access to the model, which they can then download onto a computer.

The model can also be downloaded from Hugging Face.

Read more about Mixtral 8x22B

Mixtral 8x22B: Cheaper, Better, Faster, Stronger

Read more Mistral stories on AI Business

Mistral Goes Large: Microsoft Deal, New Flagship Model

Mistral AI’s New Language Model Aims for Open Source Supremacy

OpenAI Rival Mistral AI Set to Raise $485 Million

OpenAI’s GPT-4 Turbo

Is it open source? No

Rounding off the list is GPT-4 Turbo from OpenAI, the Microsoft-backed company’s most powerful large language model to date.

Premium ChatGPT subscribers now have access to GPT-4 Turbo, which was first unveiled last November at OpenAI DevDay. It is designed to be an upgrade on GPT-4, offering improved coding, math and reasoning abilities at a reduced cost to run.

OpenAI CEO Sam Altman said the new model is “now significantly smarter and more pleasant to use.”

The new model has a longer context window, the amount of input the model can handle at once.

GPT-4 Turbo has a context length of 128,000 tokens (pieces of words), the equivalent of around 300 pages of a book. The original GPT-4 managed just over 8,000 tokens.
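The 300-page figure holds up as a back-of-envelope calculation, using assumed ratios of roughly 0.75 English words per token and 320 words per book page:

```python
# Rough sanity check of the "300 pages" claim.
# Assumed ratios: ~0.75 English words per token, ~320 words per book page.
context_tokens = 128_000
words = context_tokens * 0.75   # about 96,000 words
pages = words / 320             # about 300 pages
print(int(words), round(pages))
```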

The new model powering ChatGPT’s premium features also has an updated knowledge cutoff of December 2023, and users can go beyond that via the platform’s internet access through Bing.

GPT-4 Turbo with Vision can take in images and respond to user queries about them, a feature previously unavailable in ChatGPT.

Among those using the new GPT-4 Turbo with Vision is Devin, an AI software engineering platform that can automate entire projects. Cognition Labs, which built Devin, uses the Vision model to power visual coding tasks in its platform.
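For developers, an image query to GPT-4 Turbo with Vision is expressed as a chat message mixing text and image parts. The sketch below only constructs the request payload (the image URL is a placeholder); actually sending it requires the `openai` client and an API key:

```python
# Shape of a Chat Completions request combining text and an image,
# as used with GPT-4 Turbo with Vision. This builds the payload only.
payload = {
    "model": "gpt-4-turbo",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in this image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/photo.jpg"},
                },
            ],
        }
    ],
}
print(payload["model"])

# With the client installed, the call would be roughly:
# from openai import OpenAI
# response = OpenAI().chat.completions.create(**payload)
```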


Read more about GPT-4 Turbo

OpenAI DevDay: GPT-4 Turbo, Custom ChatGPT and API Updates

GPT-4 Turbo in the OpenAI API

Read more OpenAI stories on AI Business

Why OpenAI Fired Its CEO Sam Altman

OpenAI: ‘Impossible’ to Train Models Without Copyrighted Content

Microsoft-backed OpenAI Expands, Opens Tokyo Office



About the Author(s)

Ben Wodecki

Jr. Editor

Ben Wodecki is the Jr. Editor of AI Business, covering a wide range of AI content. Ben joined the team in March 2021 as assistant editor and was promoted to Jr. Editor. He has written for The New Statesman, Intellectual Property Magazine, and The Telegraph India, among others. He holds an MSc in Digital Journalism from Middlesex University.
