Redefining Mobile Experiences with AI-Optimized Arm CSS for Client and New Arm Kleidi Software
By
Chris Bergey, SVP and GM of the Client Business, Arm
News highlights
- New compute solution, Arm Compute Subsystems (CSS) for Client, brings together Armv9 benefits with validated and verified production-ready implementations of new Arm CPUs and GPUs on 3nm process nodes to enable silicon partners to rapidly innovate and speed time to market
- AI-optimized Arm CSS for Client with next-generation Cortex-X CPU, delivering the highest year-on-year IPC uplift, resulting in a 36% increase in performance; new Immortalis GPU brings a 37% uplift in graphics performance
- New KleidiAI software integrates with popular AI frameworks for seamless developer experiences; KleidiAI with Arm CSS dramatically improves performance of computing applications by leveraging a wide range of Arm’s acceleration technologies (NEON, SVE2 and SME2)
With power efficiency in our DNA, the Arm platform is providing the foundation for the next wave of computing demands as the AI era accelerates. As AI models continue to evolve rapidly, software is beginning to outpace hardware, which means additional innovation is required at all levels of the compute stack. To meet these growing demands, we’re evolving our solution offering to gain the maximum benefits of leading process nodes and announcing the newest Arm compute solution for AI smartphones and PCs: Arm Compute Subsystems (CSS) for Client.
Arm CSS for Client provides the performance, efficiency and accessibility to deliver leading AI-based experiences and makes it easier and faster for our silicon partners to build Arm-based solutions and get to market quickly. CSS for Client provides the foundational computing elements for flagship SoCs and features the latest Armv9.2 CPUs and Immortalis GPUs, production-ready physical implementations for CPU and GPU on 3nm, and the latest CoreLink System Interconnect and System Memory Management Units (SMMUs).
Unprecedented CPU and GPU performance and efficiency
CSS for Client delivers a step change in platform capabilities to continue pushing the boundaries of premium mobile experiences. This is the fastest Arm compute platform addressing demanding real-life Android workloads, with a greater than 30 percent increase in compute and graphics performance and 59 percent faster AI inference for broader AI/ML and computer vision (CV) workloads.
At the heart of CSS for Client is Arm’s most performant, efficient and versatile CPU cluster ever for maximum performance and power efficiency. The new Arm Cortex-X925 delivers the highest year-on-year performance uplift in the history of Cortex-X. Taking advantage of leading-edge 3nm process nodes, and assuming a 3.8GHz clock rate and maximum cache size, the result is a massive 36 percent increase in single-thread performance compared with 2023 flagship smartphone 4nm SoCs. For AI, Cortex-X925 provides an incredible 41 percent performance uplift to dramatically improve the responsiveness of on-device generative AI, such as large language models (LLMs).
The push for leading-edge performance is combined with leading-edge efficiency through our new Arm Cortex-A725 CPU, which delivers a 35 percent improvement in performance efficiency to target AI and mobile gaming use cases. This is supported by a refreshed Arm Cortex-A520 CPU and an updated DSU-120 that provide power efficiency and scalability improvements for consumer devices that adopt the latest Armv9 CPU clusters. Learn more about the new Armv9 CPUs in this blog.
The new Arm Immortalis-G925 GPU, which is our most performant and efficient GPU to date, delivers 37 percent more performance across a wide range of leading mobile gaming applications, as well as 34 percent more performance when measured over multiple AI and ML networks. While Immortalis-G925 is for the flagship smartphone market, the highly scalable new GPU family, including Arm Mali-G725 and Mali-G625 GPUs, targets a broad range of consumer device markets, from premium mobile handsets to smartwatches and XR wearables. Learn more about Arm’s new GPUs in this blog.
Optimizing software for outstanding developer innovation
We are relentlessly focused on millions of developers worldwide, ensuring they have access to the performance, tools and software libraries required to create the next wave of AI-enabled applications. To enable developers to land these innovations quickly at the highest performance, we’re introducing Arm Kleidi, which includes KleidiAI for AI workloads and KleidiCV for computer vision applications. KleidiAI is a set of compute kernels for developers of AI frameworks, providing them with frictionless access to the best performance possible on Arm CPUs, across a wide range of devices, with support for key Arm architectural features such as NEON, SVE2 and SME2. KleidiAI integrates with popular AI frameworks, such as PyTorch, TensorFlow and MediaPipe, with a view to accelerating the performance of key models including Meta Llama 3 and Phi-3. It is also backwards and forwards compatible to ensure Arm is future fit as we bring additional technologies to market. Learn more about Arm Kleidi in this blog.
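To make the idea of one kernel library serving several architectural features more concrete, here is a minimal Python sketch of runtime dispatch. It assumes a Linux-on-Arm system whose /proc/cpuinfo "Features" line lists flags such as asimd (how Linux reports NEON), sve2 and sme2. The function names and the preference order are hypothetical and chosen for illustration; this is not KleidiAI's API or its actual selection logic.

```python
# Illustrative only: not KleidiAI code. Picks a kernel tier from CPU feature flags.

# Assumed preference order, most capable first. Flag names follow the Linux
# arm64 "Features" line in /proc/cpuinfo; "asimd" is how Linux reports NEON.
PREFERENCE = [("sme2", "SME2"), ("sve2", "SVE2"), ("asimd", "NEON")]


def parse_features(cpuinfo_text):
    """Collect the flags from every 'Features' line of a cpuinfo dump."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            flags.update(value.split())
    return flags


def pick_kernel_tier(flags):
    """Return the most capable tier available, or 'scalar' as a fallback."""
    for flag, tier in PREFERENCE:
        if flag in flags:
            return tier
    return "scalar"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not Linux, or the file is unreadable
    print(pick_kernel_tier(parse_features(text)))
```

In this sketch the application code never names a specific instruction set; it asks for the best available tier and falls back gracefully, which is the kind of decision a framework-level kernel library can make on the developer's behalf.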
The compute platform for the future of AI
Through the unique combination of leading-edge CPU and GPU technologies, production-ready physical implementations and continuous software optimizations, CSS for Client combined with Kleidi software will provide the compute platform for the future of AI, a future that is built on Arm.