Apple has announced its next era. Your experience of using an iPhone, Mac, or iPad will be guided by, and suffused with, artificial intelligence. Apple calls it, of course, Apple Intelligence. It’s coming later this year. That’s right: We have another “AI” to deal with.
You may have heard plenty about how it makes Siri smarter, rewrites your emails and essays, creates never-before-seen emoji, and turns rough sketches into bland AI art.
It truly is a vision of the future. And, while not groundbreaking, thanks to the usual Apple gloss it may well be one of the most friendly, intuitive, and useful implementations of generative AI seen to date.
However, the pressing factor for most of us is that we are not invited, and the iPhone is the worst affected of Apple’s devices.
To use Apple Intelligence, you need an iPhone 15 Pro or iPhone 15 Pro Max. A regular iPhone 15 won’t do, meaning a mobile well under a year old is, at least in this specific sense, obsolete. Mac users just need an Apple Silicon computer, meaning one released in 2020 or newer.
Exclusion Zone
A cynical take on this is that these exclusion timescales are tied to the average upgrade cycle of phones and laptops. A person might be considered normal if they upgrade their phone every year. Buying a new laptop every year means you are probably foolish, a theft-magnet, or just plain clumsy.
The reality is a lot more complicated. The computation required for at least some parts of Apple Intelligence is quite different to that of the average iPhone or Mac task.
And this has all been obscured to the average generative AI or chatbot dabbler so far because of the way all of us have been introduced to the form. When you use ChatGPT, Midjourney, or even Adobe Photoshop’s Generative Fill feature, your own computer is doing almost none of the real work.
That is done on remote cloud servers, which perform the necessary calculations then simply beam the final result over to your phone or laptop. In this sense, generative AI to date has been rather like a digital assistant, such as Siri or Alexa. It can, at times, do great stuff. But little to none of it is really happening on the device on which it is used.
Apple Intelligence will try, at least in part, to change that.
Apple’s Familiar Privacy Play
Why? “You should not have to hand over all the details of your life to be warehoused and analyzed in someone’s AI cloud,” said Craig Federighi, Apple's senior vice president of software engineering, during Apple's announcement of the new features.
“The cornerstone of the personal intelligence system is on-device processing. We have integrated it deep into your iPhone, iPad, and Mac, and throughout your apps, so it’s aware of your personal data without collecting your personal data.”
On-device AI processing is a privacy play, a classic Apple strategy. But Apple is not the first to make this move. Microsoft was. Its Copilot+ standard is a similar concept, but based solely around laptops. Copilot+ laptops have dedicated AI hardware designed to allow for on-device AI processing. The two companies even rely heavily on the same core artificial intelligence, that of ChatGPT creator OpenAI.
Which leads us to this completely justified question: Why the hell can’t my $900 iPhone 15 do this stuff too?
Neural Thinking
NPU is the term: the tech under the spotlight that separates older “non-AI” hardware from the new stuff our purses are meant to quiver in anticipation for. It stands for “neural processing unit.”
The latest phones and tablets have one of these in addition to the CPU (central processing unit) and the GPU (graphics processing unit). Its specialization is in carrying out an awful lot of operations simultaneously—and in the context of phones and laptops, doing so while using minimal power is key.
Just as generative AI has pebble-dashed the internet and wider society with what might charitably be called “stuff,” the creation of that stuff also requires a wide and shallow type of work.
For those who want to get slightly more technical, NPUs excel at matrix manipulation. This is the core form of work that powers chatbot LLMs and other generative AI. In each case, an AI starts off with a prompt and gradually homes in on a final result—be it a picture or a sentence of prose—through matrix manipulation.
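To make "matrix manipulation" a little more concrete, here is a minimal sketch of the basic operation involved: multiplying a grid of numbers (the model's weights) by a list of numbers (the input). The sizes and values are made up for illustration and are not taken from any real model. Real models repeat this step with matrices that have millions or billions of entries, which is why hardware built to run many multiply-and-add operations at once is useful.

```python
# A toy "layer": one matrix-vector multiplication.
# All numbers here are invented for illustration only.

def matvec(weights, vector):
    """Multiply a matrix (a list of rows) by a vector.

    Returns the result and the number of multiply-add operations used.
    """
    ops = 0
    result = []
    for row in weights:
        total = 0.0
        for w, x in zip(row, vector):
            total += w * x  # one multiply-add
            ops += 1
        result.append(total)
    return result, ops


weights = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
]
vector = [2.0, 4.0, 1.0]

output, ops = matvec(weights, vector)
print(output, ops)  # [-1.0, 2.5] 6
```

Every one of those multiply-adds is independent of the others within a row, so a chip that can do thousands of them in parallel finishes far sooner than one that works through them in sequence.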
Beyond the NPU
Case closed? Are we simply in an NPU world now? Sure, but we have been since at least 2017. Apple introduced the Apple Neural Engine (ANE) that year, in the iPhone 8, iPhone 8 Plus, and iPhone X. It’s an NPU.
It was needed back then, years before AI was as much of a buzzword as it is today, because Apple has actually used forms of AI and machine learning in iOS for ages.
Perhaps Apple Intelligence isn’t all that different—there’s just more of it. And the software scene has already proved some of these new “AI” features are being artificially gated, paywalled, behind new hardware.
A hobbyist software historian who goes by the name of Albacore online has got some of Microsoft’s Copilot+ features working on hardware that shouldn’t, in theory, be able to run them: a cheap Samsung Galaxy Book2 Go notebook. It costs a fifth to a quarter of the price of a Copilot+ laptop.
“Getting Recall working was moderately challenging, I suppose. I've been reverse engineering Windows bits for close to 10 years, so tracking down checks and restrictions is practically second nature to me,” Albacore tells WIRED.
How does it work? “Track down what guards the features, devise a plan for altering the check so it's always happy no matter the real requirements, and then wrap it in a nice installer. Some testing and edge case hunting later, the app was ready,” Albacore says.
You can read about how to get it working yourself in an article over at Tom’s Hardware.
“I'd say Recall works nicely even on older, less powerful hardware—the only aspect where you can notice lag is searching across the timeline,” says Albacore.
Recall is the most contentious of Microsoft’s new AI features, and the headline pick. Much like the new version of Siri, it can access your past actions, browsing history, and emails to turn Windows’ universal search into something closer to an omniscient PA.
The part that doesn’t work so well on the Samsung Galaxy Book2 Go is Cocreator in the Paint app, Microsoft’s generative equivalent to the many image-generation features in Apple Intelligence.
“You can see the image generation process spin up and take a lot of resources, but it still crashes in the end, unfortunately. Perhaps an equally old chip but with more RAM would fare better, but I don't have that at my disposal at the moment,” says Albacore.
Making Memory
RAM: This may be the crux of why recent iPhones don't support Apple Intelligence even though they have as much AI and machine learning power as a MacBook that does. According to the Geekbench benchmarking tool, an iPhone 14 actually has a more powerful NPU than an M1 MacBook.
However, those older phones have only 6 GB of RAM. All iPhones on the Apple Intelligence guest list have 8 GB of RAM.
Why does RAM matter? When AI models are run locally rather than in the cloud, they have to be stored in RAM or vRAM, the graphics card equivalent. Even the fastest SSDs are not nearly fast enough for the job.
A lot of fanfare was made about the PlayStation 5’s 5,000 MB (5 GB) per second SSD, and how it let game developers stream visual assets off storage in real time. The DDR5 RAM in the average PC can reach a bandwidth of up to 64 GB per second, while the fastest of Nvidia’s cards made for machine learning and AI, the H200, has a bandwidth of 4.8 TB per second. That is almost 1,000 times higher than the transfer rate of the PS5 SSD.
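The gap is easier to feel with some quick arithmetic. The sketch below uses the three bandwidth figures quoted above and asks how long each would take to read through a model's data once. The 4 GB model size is a hypothetical round number chosen for illustration, and the calculation ignores every other bottleneck, so treat the timings as rough orders of magnitude rather than benchmarks.

```python
# Bandwidth figures as quoted in the article, in GB per second.
BANDWIDTH_GB_PER_S = {
    "PS5 SSD": 5.0,
    "DDR5 RAM (average PC)": 64.0,
    "Nvidia H200 memory": 4800.0,
}

# Hypothetical model size, for illustration only.
MODEL_SIZE_GB = 4.0


def seconds_to_read(size_gb, bandwidth_gb_per_s):
    """Time to read size_gb of data once at the given bandwidth."""
    return size_gb / bandwidth_gb_per_s


for name, bandwidth in BANDWIDTH_GB_PER_S.items():
    millis = seconds_to_read(MODEL_SIZE_GB, bandwidth) * 1000
    print(f"{name}: {millis:.1f} ms to read the model once")

# How much faster is the H200's memory than the PS5's SSD?
ratio = BANDWIDTH_GB_PER_S["Nvidia H200 memory"] / BANDWIDTH_GB_PER_S["PS5 SSD"]
print(f"H200 vs PS5 SSD: {ratio:.0f}x")  # 960x
```

As a rough simplification, a language model has to read through most of its data for every word it produces, so a read that takes the best part of a second on an SSD quickly adds up to an unusable wait.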
The Nvidia H200 and its predecessors also provide handy context that should bring expectations for the iPhone 16’s offline AI abilities back into line. These graphics cards power the servers that perform the cloud AI computing required for all our frivolous ChatGPT and Stable Diffusion requests. They have up to 141 GB of vRAM per card, and cost tens of thousands of dollars apiece. And you need a bunch of them to house the latest GPT-4 backend intelligence: hundreds of GB of vRAM in total is necessary.
We’re miles away from consumer tech stuff here.
Even the consumer-grade Nvidia RTX gaming cards are going to be far more adept at most AI tasks than this new supposedly-made-for-AI hardware.
“GeForce RTX GPUs are ideal for AI PC users looking for the best AI experience,” says Jesse Clayton, Nvidia’s director of PC AI. “They deliver the fastest AI performance, up to 1,300 TOPS, and the most mature AI software stack, and accelerate more than 500 AI-enabled applications and games.”
For reference, Microsoft’s Copilot+ PCs need an NPU rated at just 40 TOPS or more, and the first wave of Copilot+ laptops manages 45 TOPS. TOPS stands for trillions of operations per second.
Back to Earth
The iPhone 15 Pro Max can spare maybe 2 to 4 GB of its 8 GB of RAM for an AI model, and its NPU is rated at 35 TOPS. Whatever features Apple Intelligence can use offline will have to be based on a small, relatively simple AI model.
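A back-of-the-envelope sketch shows why a budget of a few gigabytes points to a small model. A model's memory footprint is roughly its number of parameters multiplied by the storage used per parameter. The parameter counts and bit widths below are hypothetical examples, not figures from Apple or anyone else, and the sum ignores the extra working memory a model needs while running.

```python
# Upper end of the article's 2 to 4 GB estimate.
RAM_BUDGET_GB = 4.0


def model_size_gb(params_billions, bits_per_weight):
    """Approximate size of a model's weights in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# (parameters in billions, bits per weight): hypothetical examples only.
candidates = [(3, 4), (3, 16), (7, 4), (70, 4)]

for params, bits in candidates:
    size = model_size_gb(params, bits)
    verdict = "fits" if size <= RAM_BUDGET_GB else "too big"
    print(f"{params}B parameters at {bits}-bit: {size:.1f} GB ({verdict})")
```

Under these assumptions, only the smaller, heavily compressed models squeeze into the budget, while anything approaching the scale of the cloud-hosted systems is out of the question.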
Apple’s own website tells us that Sharing Suggestions, Memories, and scene recognition in Photos are “on-device,” as are Siri Suggestions, voice recognition, and transcription.
A little logic suggests all the generative AI stuff primed to get people excited, and steal all the jobs kids once dreamed of having, can’t be done locally on an iPhone 15 Pro Max. Or an iPhone 16 Pro Max, come to that.
And what’s left starts to look an awful lot like features other phones have already, or similar to what Albacore has managed to get working on modest NPU-free hardware.
The next question is whether the iPhone 16 will even get the full suite of Apple Intelligence features, or whether it will continue to be used as a Pro-series upsell in the next generation.
The first Apple Intelligence features will be available later this year with the public release of iOS 18, expected to land in September alongside the iPhone 16. We'll find out then.