May 16, 2023
At a Glance
- Qualcomm is planning to demonstrate running a 10 billion-parameter language model on a smartphone later this year.
- CEO Cristiano Amon said this AI processing at the edge will not compromise a smartphone's battery life.
- At a trade show earlier this year, Qualcomm demonstrated an on-device Stable Diffusion with over one billion parameters.
Chipmaker Qualcomm is committed to bringing AI processing to the smartphone – and more companies will likely follow in its footsteps.
AI and large language models (LLMs) are currently processed in data centers full of Intel Xeons and Nvidia GPUs, chewing through parameters and generating enough heat to keep a building warm through an Alaskan winter.
However, that is not stopping tech companies from trying to bring AI processing to other devices, namely smartphones. On its latest quarterly earnings call with Wall Street analysts, Qualcomm CEO Cristiano Amon spoke of running LLMs on handsets.
He said the company will have the ability to run 10 billion-parameter models on the phone without compromising battery life and to "demonstrate that very shortly in this year." Amon noted that this creates "an even larger opportunity for us in automotive as well the entering of next generation of personal computing.”
At a trade show a few months ago, Qualcomm demonstrated what Amon said was the world's first on-device Stable Diffusion: a foundation model with more than 1 billion parameters for text-to-image applications, running entirely on a Snapdragon-powered Android smartphone.
Qualcomm hopes to achieve this through its next-generation Snapdragon compute platform, code-named Oryon, due to arrive later this year. Oryon is built on technology from Nuvia, a CPU design startup that Qualcomm bought in January 2021 for $1.4 billion. It replaces Kryo, the CPU design Qualcomm has used since 2015. The chipmaker has said that 5G and AI will be crucial parts of the new CPU.
Amon believes that LLMs will evolve quickly, continue to grow in popularity and transform user experiences across mobile personal computing and automotive. “Beyond changing internet search, this model will have an impact on content creation, such as text, images, audio and video for both entertainment and productivity. It will also transform many industries for these models. To realize their full potential and scale, they will need to run locally on devices at the edge,” he said on the call.
At its developer conference last week, Google unveiled its PaLM 2 large language model, which comes in four sizes; Google did not disclose parameter counts. The smallest, Gecko, is "so lightweight that it can work on mobile devices and is fast enough for great interactive applications on-device, even when offline."
Sticking to the edge
Qualcomm needs to stick to the edge, including smartphones, because the models Amon is talking about are far smaller than OpenAI's GPT-3, at 175 billion parameters, let alone GPT-4, whose size OpenAI has not disclosed but which is believed to be larger still.
Edge and mobile devices are best suited to inference, not training: a 10- to 15-billion-parameter model is roughly the largest that can run on a device with 10 to 15 TOPS of AI performance, such as a smartphone, give or take. (TOPS, or trillions of operations per second, is a common measure of AI chip performance.)
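Alongside compute throughput, memory is what pins model size to a device class. A rough back-of-envelope sketch (an illustration, not Qualcomm's published methodology; the precision choices are assumptions) shows why a quantized 10 billion-parameter model can plausibly fit in a phone's RAM while a GPT-3-class model cannot:

```python
# Rough memory footprint of model weights at different precisions.
# Assumption: weights dominate memory; activations and KV cache are ignored.

def model_memory_gb(params_billion: float, bytes_per_weight: float) -> float:
    """Approximate weight storage in GB for a given parameter count."""
    return params_billion * 1e9 * bytes_per_weight / 1e9

for params in (1, 10, 175):
    fp16 = model_memory_gb(params, 2)    # 16-bit floating-point weights
    int4 = model_memory_gb(params, 0.5)  # 4-bit quantized weights
    print(f"{params:>4}B params: ~{fp16:.1f} GB at FP16, ~{int4:.1f} GB at INT4")
```

At 4-bit precision, a 10 billion-parameter model needs on the order of 5 GB for weights, within reach of a flagship phone's 8 to 12 GB of RAM, whereas a 175 billion-parameter model still needs tens of gigabytes even when aggressively quantized.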
As performance rises to 30 TOPS or so (Qualcomm's Snapdragon 888 Plus is reportedly capable of 32 TOPS), devices can run larger models, but they still cannot, and likely never will, accommodate a model the size of GPT-4 or above, said Ben Bajarin, principal analyst with Creative Strategies.
This means the models that run on devices will be smaller and more limited, but they will still have a relationship with the cloud. "This is literally all changing so quickly, so it is hard to keep up but the point being is that large models will be trained in the cloud," he explained. "But the real value is in the smaller, more efficient models tuned to use cases that will run on devices."
Jack Gold, president and principal analyst of J.Gold Associates, said Qualcomm can do quite a bit at the edge. "There is a lot of inference processing that will happen there and having a high-power AI accelerator can be very helpful, even in the smartphone device," he said.
Gold believes that in the next couple of years, virtually all chips that process significant amounts of data will include AI accelerators, improving AI performance in everything from small devices to larger edge systems and beyond, including a full range of accelerators built into PCs.
“This is not a direct threat to Nvidia for large scale AI/ML, but it does put a stake in the ground that says everyone will need AI acceleration to handle all the future models running on the local device,” he said.
Apple stays mum
Gold said that virtually all of the mobile chipmakers like MediaTek and Samsung have some kind of AI project in the works, but one company is conspicuously absent, at least publicly. Apple CEO Tim Cook was asked on the recent Apple earnings call about Apple’s plans for AI. He acknowledged that Apple is thinking about it but declined to provide more details.
“I do think it’s very important to be deliberate and thoughtful in how you approach these things,” Cook said. “And there’s a number of issues that need to be sorted, as is being talked about in a number of different places, but the potential is certainly very interesting.”
It could be argued that Apple already embraces AI through the natural language processing in its Siri assistant, as well as the crash detection feature in the iPhone. Where Apple goes from here is a matter of speculation, although it does make impressive silicon with its M2 processor.