With its latest research on running large language models on smartphones, Apple appears to be preparing to catch up with its Silicon Valley rivals in generative artificial intelligence.
According to the researchers, the work, “LLM in a Flash,” provides a “solution to a current computational bottleneck.”
They stated that their method “paves the way for effective inference of LLMs on devices with limited memory.” Inference is the way large language models—the vast models that underpin ChatGPT and other apps—respond to user queries. Chatbots and LLMs typically run in sizable data centers with significantly more processing power than an iPhone.
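For readers unfamiliar with the term, the sketch below shows what inference looks like in practice. It is not Apple's code: it assumes the open-source Hugging Face transformers library and uses a small, publicly available model chosen purely for illustration.

```python
# Minimal illustration of LLM inference: load a small open-source model
# and generate a reply to a user's prompt. The model name is illustrative only.
from transformers import pipeline

# Build a text-generation pipeline around a compact open model.
generator = pipeline("text-generation", model="gpt2")

# "Inference" is this step: the trained model produces a continuation
# of the user's prompt, one token at a time.
result = generator("What is on-device AI?", max_new_tokens=40)
print(result[0]["generated_text"])
```

On a phone, the same step has to fit within a few gigabytes of memory and a battery budget, which is the constraint the Apple paper addresses.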
The paper was published on December 12 but gained more exposure late on Wednesday when Hugging Face, a well-known platform for AI researchers to present their work, featured it. This month, Apple has released two papers on generative AI; the first described a way to run image-generating models, such as Stable Diffusion, on its own custom chips.
According to Counterpoint Research, the smartphone market has had its worst year in a decade, with shipments falling by an estimated 5%. Chipmakers and device manufacturers are hoping that new AI features will help revive the market.
Although Apple introduced Siri, one of the first virtual assistants, back in 2011, the company has largely been left out of the generative AI frenzy that has swept Silicon Valley in the year since OpenAI unveiled ChatGPT, its ground-breaking chatbot. Many in the AI community believe that Apple is falling behind its Big Tech competitors, despite the company having hired John Giannandrea, Google’s top AI executive, in 2018.
Microsoft and Google have primarily concentrated on delivering chatbots and other generative AI services over the internet from their extensive cloud computing platforms. Apple, on the other hand, appears to be focusing on AI that can run directly on an iPhone.
Competitors of Apple, such as Samsung, are getting ready to release a brand-new category of “AI smartphones” next year. According to Counterpoint, more than 100 million smartphones with AI features will be shipped in 2024, and by 2027, 40% of new devices will offer these features.
Cristiano Amon, the chief executive of Qualcomm, the largest mobile chipmaker in the world, predicted that integrating AI into smartphones would transform the user experience and stop the decline in mobile sales.
In a recent interview, he told the Financial Times, “You’re going to see devices launch in early 2024 with a number of generative AI use cases.” “As those things get scaled up, they start to make a meaningful change in the user experience and enable new innovation which has the potential to create a new upgrade cycle in smartphones.”
He stated that devices will also be able to handle new kinds of photo editing, and that increasingly capable virtual assistants will be able to anticipate users’ actions, such as sending a text or setting up a meeting.
This month, Google unveiled a version of its new Gemini LLM that will run “natively” on Pixel smartphones.
Running the kind of large AI model that powers ChatGPT or Google’s Bard on a personal device presents formidable technical challenges, because smartphones lack the massive processing power and energy available in a data center. If this issue is resolved, AI assistants could operate offline and respond faster than when accessed through the cloud.
Answering questions on a user’s own device, without sending data to the cloud, is also expected to bring privacy benefits, which have been a major point of differentiation for Apple in recent years.
The researchers stated, “Our experiment is designed to optimize inference efficiency on personal devices.” Apple tested its approach using Falcon 7B, a smaller version of Falcon, an open-source LLM originally developed by the Technology Innovation Institute in Abu Dhabi.
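Apple has not released the technique described in the paper, but the memory-constrained setting it targets can be approximated with off-the-shelf tooling. The sketch below assumes the Hugging Face transformers and accelerate libraries and loads the same Falcon 7B model, spilling layers that do not fit in memory to disk; it is an analogy for the problem the paper addresses, not the paper’s method.

```python
# Sketch: running Falcon 7B under a tight memory budget by offloading weights
# that do not fit in RAM to disk (requires the transformers and accelerate
# libraries). This mimics the memory-constrained setting, not Apple's technique.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)

# device_map="auto" places layers on the GPU/CPU as capacity allows and
# spills the remainder to the offload folder on disk.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    offload_folder="offload",
)

inputs = tokenizer("The benefits of on-device AI include", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Offloading to disk in this way is typically much slower than keeping weights in memory; the Apple paper’s contribution is making that kind of storage-backed inference fast enough to be practical on a device.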
An increasing number of AI researchers are concentrating on optimizing LLMs to run on battery-powered devices. Academic publications offer a rare window into Apple’s closed research labs and the company’s latest technical advances, but they do not directly reveal how Apple plans to enhance its products.
In the paper’s conclusion, Apple researchers stated, “Our work not only provides a solution to a current computational bottleneck but also sets a precedent for future research.” “We believe as LLMs continue to grow in size and complexity, approaches like this work will be essential for harnessing their full potential in a wide range of devices and applications.”