The Compute Foundation Powering AI's Present And Future
From the data center to the edge, Arm will be the foundation for inference as AI goes everywhere.
Dec 4, 2023, 11:54am EST
The explosion of popular interest in artificial intelligence (AI) has clearly captured the public’s imagination about what the future of computing could look like and how it might transform our lives. Yet we’re only scratching the surface of its potential.
That’s because AI and machine learning (ML) are being deployed everywhere, from the big data centers that use countless processors to train large language models (LLMs) to the increased AI capabilities being deployed at “the edge” – on the actual devices.
What ties together AI in the data center and at the edge is the need for efficient computing. As AI evolves, there will be a heightened focus on power efficiency alongside performance. The unique combination of performance and efficiency enabled Arm (NASDAQ: ARM) to power the smartphone revolution, and we see the same happening as part of the ongoing AI story.
AI training vs. inference
Much of the recent news coverage has focused on LLMs – their development, their training, and the high cost and energy consumption both require. A recent study1 estimated that training GPT-3, a model comprising 175 billion parameters, consumed 1,287 MWh of electricity and emitted 552 tons of CO2 – roughly the equivalent of driving a gasoline-powered car 1.3 million miles.
But those huge training workloads running in data centers represent just 15 percent of AI workloads today, according to Omdia’s Data Center Compute Intelligence Service.2
The rest – 85 percent of all data center AI workloads – is AI inference, and that figure doesn't account for inference happening outside the data center at the edge. Inference is the process of using a trained model, deployed into a production environment, to make predictions on new real-world data. Inference will remain a key workload in the cloud, but an increasing share will move to the edge as more efficient models continue to evolve, powering AI applications for countless specialised use cases and industries to benefit everyone in a wide range of different ways.
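The training/inference split described above can be sketched in a few lines. This is a deliberately minimal illustration using a toy one-parameter linear model fit by least squares; real AI workloads use frameworks such as PyTorch or TensorFlow, and the numbers here are made up for demonstration.

```python
def train(examples):
    """Training: fit a slope w to (x, y) pairs by least squares.
    This is the expensive, data-center-style phase."""
    num = sum(x * y for x, y in examples)
    den = sum(x * x for x, _ in examples)
    return num / den

def infer(w, x):
    """Inference: apply the already-trained model to new input.
    This comparatively cheap step is what increasingly runs at the edge."""
    return w * x

# Train once on known data, then reuse the model for predictions.
model = train([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])  # learns w = 2.0
prediction = infer(model, 10.0)                       # predicts 20.0
```

The point of the split is that `train` runs once on large amounts of data, while `infer` runs over and over on new inputs – which is why inference dominates deployed AI workloads.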
Driving efficient computing
The exponential increase in AI inference at the edge will be accelerated by the industry-wide drive for more efficient computing. This brings a number of benefits to users and businesses, including improved bandwidth and data privacy. However, the challenge with AI at the edge is the increasing compute demands across many different devices, especially those in the IoT and consumer technology space. Fortunately, we are already seeing models and the actual devices becoming more power efficient as the industry drives AI to be deployed at scale across all technology touchpoints, from the smallest sensor to the largest supercomputer.
AI on processor technologies
The move towards AI at the edge and efficient computing is taking place on the CPU, whether it is handling workloads in their entirety or in combination with a co-processor like a GPU or NPU. As these processors have evolved and improved over time with greater performance and efficiency capabilities, AI workloads have already begun to shift to the edge.
Although the rise of AI is a relatively recent tech phenomenon, Arm's CPU and GPU performance improvements have doubled AI processing capabilities every two years over the past decade. This puts computing closer to where the data is captured, providing quicker, more secure AI experiences for the end-user.
Developer innovation
AI’s move to the edge is also improving the developer experience. Developers are already building more compact models that run on small microprocessors or even smaller microcontrollers, saving time and costs.
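One common way developers shrink models for microcontroller-class hardware (not named in the article, but a standard technique) is post-training quantization: storing 32-bit float weights as 8-bit integers plus a scale factor, cutting model size roughly 4x. The sketch below shows the idea with hypothetical weight values; production flows would use a toolchain such as TensorFlow Lite rather than hand-rolled code.

```python
def quantize(weights):
    """Map float weights to int8 range [-127, 127] with a per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights on the device at inference time."""
    return [v * scale for v in q]

weights = [0.8, -1.27, 0.05, 0.4]   # hypothetical float32 weights
q, scale = quantize(weights)        # 8-bit storage: ~4x smaller
approx = dequantize(q, scale)       # close to the originals, small rounding error
```

The trade-off is a small loss of precision in exchange for a model that fits in the tight memory and power budgets of edge devices.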