The ARM-based MCU market is an excellent fit for BRN AI tech, as is the emerging RISC-V-based processor market.
ARM, Renesas, Qualcomm, NXP and STM all get a mention in the following article, as does RISC-V.
Squeezing AI models into microcontrollers
May 13, 2020
Sally Ward-Foxton
What do you get when you cross AI with the IoT? The artificial intelligence of things (AIoT) is the simple answer, but you also get a huge new application area for microcontrollers, enabled by advances in neural network techniques that mean machine learning is no longer limited to the world of supercomputers. These days, smartphone application processors can (and do) perform AI inference for image processing, recommendation engines, and other complex features.
Bringing this kind of capability to the humble microcontroller represents a huge opportunity. Imagine a hearing aid that can use AI to filter background noise from conversations, smart-home appliances that can recognize the user’s face and switch to their personalized settings, and AI-enabled sensor nodes that can run for years on the tiniest of batteries. Processing the data at the endpoint offers latency, security, and privacy advantages that can’t be ignored.
However, achieving meaningful machine learning with microcontroller-level devices is not an easy task. Memory, a key criterion for AI calculations, is often severely limited, for example. But data science is advancing quickly to reduce model size, and device and IP vendors are responding by developing tools and incorporating features tailored for the demands of modern machine learning.
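One widely used way to shrink a model is to store its weights as 8-bit integers instead of 32-bit floats. The article does not name a specific technique, so the following is only a minimal sketch of that idea, assuming simple symmetric quantization with a single shared scale; the weight values and function names are hypothetical.

```python
# Illustrative only: symmetric 8-bit quantization of a small weight list.
# The weights and the scheme are assumptions, not taken from the article.

def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]

weights = [0.42, -1.27, 0.05, 0.90, -0.33]  # hypothetical float32 weights
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

float32_bytes = 4 * len(weights)  # 4 bytes per float32 weight
int8_bytes = 1 * len(weights)     # 1 byte per int8 code (scale stored once)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("int8 codes:", codes)
print("bytes:", float32_bytes, "->", int8_bytes)
print("largest rounding error:", max_error)
```

Weight storage drops by a factor of four, at the cost of a rounding error of at most half a quantization step per weight. Whether that loss of precision is acceptable depends on the model and has to be checked against accuracy on real data.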
TinyML takes off
As a sign of this sector’s rapid growth, the TinyML Summit, a new industry event held in February in Silicon Valley, is going from strength to strength. The first summit, held last year, had 11 sponsoring companies; this year’s event had 27, and slots sold out much earlier, according to the organizers. Attendance at TinyML’s global monthly meet-ups for designers has grown dramatically, organizers said.
“We see a new world with trillions of intelligent devices enabled by TinyML technologies that sense, analyze, and autonomously act together to create a healthier and more sustainable environment for all,” said Qualcomm Senior Director Evgeni Gousev, co-chair of the TinyML Committee, in his opening remarks at a recent conference.
Gousev attributed this growth to the development of more energy-efficient hardware and algorithms, combined with more mature software tools. Corporate and venture-capital investment is increasing, as are startup and M&A activity, he noted.
Today, the TinyML Committee believes that the tech has been validated and that initial products using machine learning in microcontrollers should hit the market in two to three years. “Killer apps” are thought to be three to five years away.
A big part of the tech validation came last spring when Google demonstrated a version of its TensorFlow framework for microcontrollers for the first time. TensorFlow Lite for Microcontrollers is designed to run on devices with only kilobytes of memory (the core runtime fits in 16 KB on an Arm Cortex-M3; with enough operators to run a speech keyword-detection model, it takes up a total of 22 KB). It supports inference but not training.
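To put those figures in context, here is a rough flash-budget tally. The 16 KB and 22 KB runtime sizes come from the article; the model, application, and flash sizes are hypothetical placeholders, and 1 KB is assumed to mean 1,024 bytes.

```python
# Runtime sizes are the figures quoted in the article; everything else
# (model size, application size, flash size) is a hypothetical placeholder.
KB = 1024  # assumption: 1 KB = 1,024 bytes

CORE_RUNTIME_BYTES = 16 * KB      # core runtime on an Arm Cortex-M3
RUNTIME_WITH_OPS_BYTES = 22 * KB  # runtime plus keyword-detection operators

def operator_overhead_bytes():
    """Flash taken by the operators on top of the core runtime."""
    return RUNTIME_WITH_OPS_BYTES - CORE_RUNTIME_BYTES

def fits_in_flash(model_bytes, flash_bytes, app_bytes=0):
    """True if runtime, model, and application code together fit in flash."""
    return RUNTIME_WITH_OPS_BYTES + model_bytes + app_bytes <= flash_bytes

print(operator_overhead_bytes() // KB)             # 6 (KB of operators)
print(fits_in_flash(20 * KB, 64 * KB, 16 * KB))    # True: 58 KB needed
print(fits_in_flash(200 * KB, 64 * KB, 16 * KB))   # False: model alone is too big
```

This counts flash only. The RAM needed for activations during inference is a separate constraint that the sketch does not model.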
Big players
The big microcontroller makers, of course, are watching developments in the TinyML community with interest. As research enables neural network models to get smaller, the opportunities get bigger. Most have some kind of support for machine-learning applications. For example, STMicroelectronics has an extension pack, STM32Cube.AI, that enables mapping and running neural networks on its STM32 family of Arm Cortex-M–based microcontrollers.
Renesas Electronics’ e-AI development environment allows AI inference to be implemented on microcontrollers. It effectively translates the model into a form that is usable in the company’s e2 studio, compatible with C/C++ projects.
NXP Semiconductors said it has customers using its lower-end Kinetis and LPC MCUs for machine-learning applications. The company is embracing AI with hardware and software solutions, albeit primarily oriented around its bigger application processors and crossover processors (between application processors and microcontrollers).
Strong Arm-ed
Most of the established companies in the microcontroller space have one thing in common: Arm. The embedded-processor–core giant dominates the microcontroller market with its Cortex-M series. The company recently announced the brand new Cortex-M55 core, which is designed specifically for machine-learning applications, especially when used in combination with Arm’s Ethos-U55 AI accelerator. Both are designed for resource-constrained environments. But how can startups and smaller companies seek to compete with the big players in this market?
“Not by building Arm-based SoCs, because [the dominant players] do that really well,” laughed XMOS CEO Mark Lippett. “The only way to compete against those guys is by having an architectural edge … [that means] the intrinsic capabilities of the Xcore in terms of performance, but also the flexibility.”
XMOS’s Xcore.ai, its newly released crossover processor for voice interfaces, will not compete directly with microcontrollers, but the sentiment still holds true. Any company making an Arm-based SoC to compete with the big guys had better have something pretty special in its secret sauce.
Scaling voltage and frequency
Startup Eta Compute released its much-anticipated ultra-low-power device during the TinyML show. The ECM3532 can be used for machine learning in always-on image-processing and sensor-fusion applications with a power budget of 100 µW. The chip uses an Arm Cortex-M3 core plus an NXP DSP core — either or both of which can be used for ML workloads. The company’s secret sauce has several ingredients, but the way it scales both clock frequency and voltage on a continuous basis, for both cores, is key. The approach saves a lot of power, particularly because it’s achieved without a phase-locked loop (PLL).
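Why scaling voltage and frequency together saves so much power follows from the standard first-order CMOS approximation, in which dynamic power is proportional to C·V²·f. The sketch below applies that approximation and then estimates ideal battery life at the 100 µW budget quoted above. It says nothing about Eta Compute's actual implementation, and the battery capacity and voltage are hypothetical coin-cell values.

```python
# First-order arithmetic only; not a model of any specific chip.

def dynamic_power_ratio(voltage_scale, frequency_scale):
    """Relative dynamic power under P ~ C * V^2 * f, with C held fixed."""
    return voltage_scale ** 2 * frequency_scale

def battery_life_years(capacity_mah, voltage_v, avg_power_w):
    """Ideal runtime in years, ignoring self-discharge and converter losses."""
    energy_j = capacity_mah / 1000 * 3600 * voltage_v  # mAh -> joules
    seconds = energy_j / avg_power_w
    return seconds / (365 * 24 * 3600)

# Halving both voltage and frequency cuts dynamic power to one-eighth.
print(dynamic_power_ratio(0.5, 0.5))  # 0.125

# Hypothetical 225 mAh, 3 V coin cell drained at a constant 100 microwatts.
years = battery_life_years(capacity_mah=225, voltage_v=3.0, avg_power_w=100e-6)
print(round(years, 2))  # about 0.77 years, a little over nine months
```

Because voltage enters squared, lowering it whenever the workload allows a slower clock pays off more than lowering the frequency alone. The battery figure also shows that multi-year operation on a small cell requires an average draw well below 100 µW, for example through duty cycling.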
With viable competitors to Arm now out there, including the up-and-coming instruction-set architecture offered by the RISC-V foundation, why did Eta Compute choose to use an Arm core for ultra-low-power machine-learning acceleration? “The simple answer is that the ecosystem for Arm is just so well-developed,” Eta Compute CEO Ted Tewksbury told EE Times Europe. “It’s just much easier to go to production [with Arm] than it is with RISC-V right now. That situation could change in the future … RISC-V has its own set of advantages; certainly, it’s good for the Chinese market. But we’re looking primarily at domestic and European markets right now with the ecosystem for [our device].”
Tewksbury noted that the major challenge facing the AIoT is the breadth and diversity of the applications. The market is rather fragmented, with many relatively niche applications commanding only low volumes. Altogether, however, this sector potentially extends to billions of devices. “The challenge for developers is that they cannot afford to invest the time and the money in developing customized solutions for each one of those use cases,” Tewksbury said. “That’s where flexibility and ease of use become absolutely paramount. And that’s another reason why we chose Arm — because the ecosystem is there, the tools are there, and it’s easy for customers to develop products quickly and get them to market quickly without a lot of customization.”
After keeping its ISA under lock and key for decades, Arm finally announced in October that it would allow customers to build their own custom instructions for handling specialist workloads such as machine learning. That capability, in the right hands, may also offer the opportunity to reduce power consumption.
Eta Compute can’t take advantage of it just yet, because the new policy does not apply retroactively to existing Arm cores, so it is not applicable to the M3 core that Eta is using. But could Tewksbury see Eta Compute using Arm custom instructions in future product generations to cut power consumption further? “Absolutely, yes,” he said.
Alternative ISAs
RISC-V has been getting a lot of attention this year. The open-source ISA allows the design of processors without a license fee, while designs based on the RISC-V ISA can still be protected like any other type of IP. Designers can pick and choose which extensions to add, including their own customized extensions.
French startup GreenWaves is one of several companies using RISC-V cores to target the ultra-low–power machine-learning space. Its devices, GAP8 and GAP9, use eight- and nine-core compute clusters, respectively. Each device also has an additional core that handles control functions.
Martin Croome, vice president of business development at GreenWaves, explained to EE Times Europe why the company uses RISC-V cores.
“The first reason is RISC-V gives us the ability to customize the cores at the instruction-set level, which we use heavily,” said Croome, adding that the custom extensions are used to reduce the power of both machine-learning and signal-processing workloads. “When the company was formed, if you wanted to do that with any other processor architecture, it was either impossible or it was going to cost you a fortune. And the fortune it was going to cost you was essentially your investor’s money going to a different company, and that is very difficult to justify.”
GreenWaves’ custom extensions alone give its cores a 3.6× improvement in energy consumption over unmodified RISC-V cores. But Croome also said that RISC-V has fundamental technical benefits that are simply due to its being new. “It’s a very clean, modern instruction set; it doesn’t have any baggage,” he said. “So from an implementation perspective, the RISC-V core is actually a simpler structure, and simple means less power.”
Croome also cited control as an important factor. The GAP8 device has eight cores in its compute cluster, and GreenWaves needs very fine, detailed control over the core execution to allow maximum power efficiency. RISC-V enables that, he said. “In the end, if we could have done all of that with Arm, we would have done all of that with Arm; it would have been a much more logical choice … because no one ever got fired for buying Arm,” he joked. “The software tools are there to a level of maturity which is far higher than RISC-V … but, that said, there’s now so much focus on RISC-V that those tools are increasing in maturity very fast.”
In summary, while some see Arm’s hold on the microcontroller market as weakening, in part because of increased competition from RISC-V, the company is responding by allowing some customized extensions and developing new cores designed for machine learning from the outset.
In fact, there are both Arm and non-Arm devices coming to the market for ultra-low-power machine-learning applications. As the TinyML community continues to work on reducing neural network model size and developing dedicated frameworks and tools, this sector will blossom into a healthy application area that will support a variety of device types.