This condensed article is from Sally Ward-Foxton at EE Times.
Embedded World 2023
STMicroelectronics
Also on the STMicro booth were a couple more fun demos, including a washing machine that could tell how much laundry it contained in order to optimize the amount of water added. The system is sensorless: AI analysis of the current required to drive the motor predicted the weight of an 800-g laundry load to within 30 g. A robot vacuum cleaner equipped with a time-of-flight sensor also used AI to tell what type of floor surface it was cleaning, so it could select the appropriate cleaning method.
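ST did not detail the model; as a minimal sketch of the idea (regressing load mass on features extracted from the motor-current waveform, with invented features and data), it might look like this:

```python
import numpy as np

# Hypothetical training data: features extracted from the motor-current
# waveform during the first seconds of drum rotation. Neither the features
# nor the values come from ST; they are invented for illustration.
#   [mean current (A), peak current (A), time to reach drum speed (s)]
X = np.array([
    [0.9, 1.8, 1.2],   # empty drum
    [1.1, 2.3, 1.6],   # ~300 g
    [1.3, 2.9, 2.1],   # ~800 g
    [1.6, 3.6, 2.8],   # ~1500 g
])
y = np.array([0.0, 300.0, 800.0, 1500.0])  # load mass in grams

# Fit a linear model y ~= X @ w + b with ordinary least squares.
A = np.hstack([X, np.ones((len(X), 1))])   # add bias column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def estimate_load_grams(mean_a, peak_a, spinup_s):
    """Predict laundry mass from motor-current features."""
    return float(np.array([mean_a, peak_a, spinup_s, 1.0]) @ coef)

print(f"estimated load: {estimate_load_grams(1.3, 2.9, 2.1):.0f} g")
```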
Renesas
Next stop was the Renesas booth, to see the Arm Cortex-M85 up and running in a not-yet-announced product (due to launch in June). This is the first time EE Times has seen AI running on a Cortex-M85 core, which Arm announced a year ago. The M85 is a larger core than the Cortex-M55, but both are equipped with Helium (Arm’s vector extension for the Cortex-M series), which is ideal for accelerating ML applications. Renesas’ figures had the M85 running inference 5.3× faster than a Renesas Cortex-M7-based design, though the M85 was also clocked faster (480 MHz versus 280 MHz).
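Note that part of that 5.3× comes simply from the higher clock. Normalizing for frequency (a rough check that assumes throughput scales linearly with clock speed) suggests the per-clock, architectural gain is closer to 3×:

```python
# Back-of-the-envelope: how much of the 5.3x speedup is architecture
# rather than clock speed? Assumes throughput scales linearly with
# frequency, which is only roughly true in practice.
speedup_total = 5.3
f_m85, f_m7 = 480e6, 280e6          # clock rates from Renesas' comparison

speedup_clock = f_m85 / f_m7        # ~1.7x from frequency alone
speedup_arch = speedup_total / speedup_clock

print(f"clock contribution:        {speedup_clock:.2f}x")
print(f"per-clock (architectural): {speedup_arch:.2f}x")  # ~3.1x
```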
Renesas’ demo had Plumerai’s person-detection model up and running at 77 ms per inference.
Renesas’ not-yet-announced Cortex-M85 device is the first we’ve seen running AI on the M85, shown here running Plumerai’s person-detection model. (Source: EE Times/Sally Ward-Foxton)
Renesas field application engineer Stefan Ungerechts also gave EE Times an overview of the DRP-AI (dynamically reconfigurable processor for AI), Renesas’ IP for AI acceleration. A demo of the RZ/V2L device, equipped with a 0.5 TOPS @ FP16 (576 MACs) DRP-AI engine, was running tinyYOLOv2 in 27 ms at 500 mW (1 TOPS/W). This level of power efficiency means no heat sink is required, Ungerechts said.
The DRP-AI is, in fact, a two-part accelerator: the dynamically reconfigurable processor handles acceleration of non-linear functions, with a MAC array alongside it. Non-linear functions in this case might be image-preprocessing functions or the pooling layers of a neural network. While the DRP is reconfigurable hardware, it is not an FPGA, Ungerechts said. The combination is optimized for feed-forward networks such as the convolutional neural networks commonly found in computer vision, and Renesas’ software stack allows either the whole AI workload to be passed to the DRP-AI or a combination of the DRP-AI and the CPU to be used.
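Renesas’ tooling makes that split offline; purely as an illustration of the idea (the supported-operator set and the model below are invented, not Renesas’ actual list), a partitioner that keeps DRP-friendly feed-forward operators on the accelerator and falls back to the CPU for the rest might look like this:

```python
# Illustrative only: a toy partitioner in the spirit of Renesas' stack,
# which can run a whole model on the DRP-AI or split it with the CPU.
# The supported-operator set and the model below are invented.
DRP_AI_SUPPORTED = {"conv2d", "depthwise_conv2d", "relu", "max_pool", "add"}

# A model as a flat list of (layer_name, op_type) pairs, feed-forward order.
model = [
    ("stem",   "conv2d"),
    ("act0",   "relu"),
    ("pool0",  "max_pool"),
    ("block1", "conv2d"),
    ("gru",    "gru"),        # not supported: falls back to the CPU
    ("head",   "dense"),      # not supported: falls back to the CPU
]

def partition(layers):
    """Assign each layer to 'drp_ai' or 'cpu' based on operator support."""
    return [(name, "drp_ai" if op in DRP_AI_SUPPORTED else "cpu")
            for name, op in layers]

for name, target in partition(model):
    print(f"{name:8s} -> {target}")
```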
Also available with a DRP-AI engine are the RZ/V2MA and RZ/V2M, which offer 0.7 TOPS @ FP16 (they run faster than the RZ/V2L, at 630 MHz versus 400 MHz, and have higher memory bandwidth).
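Those headline figures are consistent with the MAC count: counting two operations per multiply-accumulate, 576 MACs give roughly 0.46 TOPS at 400 MHz and 0.73 TOPS at 630 MHz, in line with the quoted 0.5 and 0.7 TOPS. A quick sanity check:

```python
# Sanity check of the quoted TOPS figures from the 576-MAC DRP-AI array.
# Each MAC counts as 2 ops (one multiply + one add) per cycle.
macs = 576
for device, freq_hz in [("RZ/V2L", 400e6), ("RZ/V2M(A)", 630e6)]:
    tops = macs * 2 * freq_hz / 1e12
    print(f"{device}: {tops:.2f} TOPS")   # ~0.46 and ~0.73
```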
A next-generation version of the DRP-AI that supports INT8 for greater throughput, and is scaled up to 4K MACs, will be available next year, Ungerechts said.
Squint
Squint, an AI company launched earlier this year, is taking on the challenge of explainable AI. Squint CEO Kenneth Wenger told EE Times that the company wants to increase trust in AI decision-making for applications like autonomous vehicles (AVs), healthcare and fintech. The company takes pre-production models and tests them for weaknesses, identifying the situations in which they are more likely to make a mistake.
This information can be used to set up mitigating factors, which might include a human in the loop (perhaps flagging a medical image to a doctor) or triggering a second, more specialized model that has been trained specifically for that situation. Squint’s techniques can also be used to tackle “data drift”, helping maintain models over longer periods of time.
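Squint has not published its method; as a generic sketch of the mitigation pattern described (escalate low-confidence cases to a human or a specialist model, and monitor for drift with a standard statistic such as the population stability index), one might write:

```python
import numpy as np

def route(probs, threshold=0.85):
    """Generic mitigation pattern (not Squint's actual method):
    accept confident predictions, escalate the rest."""
    conf = float(np.max(probs))
    if conf >= threshold:
        return "accept", int(np.argmax(probs))
    # Low confidence: hand off to a specialist model or flag to a human,
    # e.g. queue a medical image for review by a doctor.
    return "escalate", None

def psi(expected, actual, bins=10):
    """Population stability index: a common drift statistic.
    Rule of thumb: PSI > 0.2 suggests significant data drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e, _ = np.histogram(expected, bins=edges)
    a, _ = np.histogram(actual, bins=edges)
    e = np.clip(e / e.sum(), 1e-6, None)
    a = np.clip(a / a.sum(), 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

print(route(np.array([0.55, 0.30, 0.15])))   # ('escalate', None)
rng = np.random.default_rng(0)
print(psi(rng.normal(0, 1, 5000), rng.normal(0.5, 1, 5000)))  # drifted
```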
Embedl
Swedish AI company Embedl is working on retraining models to optimize them for specific hardware targets. The company has a Python SDK that fits into the training pipeline. Techniques include replacing operators with alternatives that may run more efficiently on the particular target hardware, as well as quantization-aware retraining. The company’s customers so far have included automotive OEMs and tier 1s, but it is expanding into Internet of Things (IoT) applications. Embedl has also been part of VEDL-IoT, an EU-funded project in collaboration with Bielefeld University that aims to develop an IoT platform that distributes AI across a heterogeneous cluster.
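The article does not describe Embedl’s SDK API, so the following is a generic sketch of the operator-replacement idea rather than Embedl’s interface (PyTorch is used for illustration, and the substitution rule is invented): swap an activation the target executes poorly for one it executes natively, then fine-tune, possibly quantization-aware, to recover accuracy.

```python
import torch.nn as nn

# Generic sketch of hardware-aware operator replacement (not Embedl's SDK).
# Assume a hypothetical target that executes ReLU cheaply in hardware but
# must fall back to the CPU for SiLU; swap the operator, then fine-tune
# (or retrain quantization-aware) to recover any lost accuracy.
def replace_ops(model: nn.Module, old=nn.SiLU, new=nn.ReLU):
    for name, child in model.named_children():
        if isinstance(child, old):
            setattr(model, name, new())
        else:
            replace_ops(child, old, new)   # recurse into submodules
    return model

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.SiLU(),
    nn.Conv2d(16, 16, 3, padding=1),
    nn.SiLU(),
)
print(replace_ops(model))   # SiLU layers replaced by ReLU
```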
The demo showed AI workloads being managed across different hardware: an Nvidia AGX Xavier GPU in a 5G basestation and an NXP i.MX8 application processor in a car. With sufficient 5G bandwidth available, “difficult” layers of the neural network could be computed remotely in the basestation and the rest in the car, for optimum latency. Reduce the available 5G bandwidth, and more or all of the workload moves to the i.MX8. Embedl optimized the same model for both hardware types.
The VEDL-IoT project demo shows splitting AI workloads across 5G infrastructure and embedded hardware. (Source: EE Times/Sally Ward-Foxton)
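The scheduling policy is not detailed in the article; a toy version of the split decision (per-layer timings, activation sizes and the latency model below are all invented) picks whichever split minimizes the sum of local compute, 5G transfer and remote compute:

```python
# Toy split-point chooser for a car/basestation pipeline (illustrative
# only; per-layer numbers and the latency model are invented).
# Layers before the split run on the car's i.MX8, the rest on the
# basestation GPU; the split activation is shipped over 5G.
layers = [  # (name, local_ms, remote_ms, activation_bytes_out)
    ("conv1", 12.0, 1.5, 800_000),
    ("conv2", 18.0, 2.0, 400_000),
    ("conv3", 25.0, 2.5, 100_000),
    ("head",   6.0, 1.0,   4_000),
]

def best_split(bandwidth_bps):
    """Return (split_index, latency_ms); split_index = #layers run locally."""
    best = (len(layers), sum(l[1] for l in layers))  # all-local fallback
    for k in range(len(layers)):
        local = sum(l[1] for l in layers[:k])
        # k == 0 means everything runs remotely: ship the raw input
        # (assumed 600 kB) instead of an intermediate activation.
        tx_ms = (layers[k - 1][3] if k else 600_000) * 8e3 / bandwidth_bps
        remote = sum(l[2] for l in layers[k:])
        total = local + tx_ms + remote
        if total < best[1]:
            best = (k, total)
    return best

for bw in (200e6, 10e6, 1e6):   # plenty, constrained, very little 5G
    k, ms = best_split(bw)
    print(f"{bw/1e6:6.0f} Mbps -> run {k} layer(s) locally, ~{ms:.1f} ms")
```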
Silicon Labs
Silicon Labs had several xG24 dev kits running AI applications. One had a simple SparkFun camera with the xG24 running people counting and calculating the direction and speed of movement. A separate wake-word demo ran in 50 ms on the xG24’s accelerator, and a third board was running a gesture-recognition algorithm.
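The article does not say how the demo derives direction and speed; a common approach, sketched below with an invented frame rate and fake detections, is to track the detection centroid across frames and difference it:

```python
import math

# Minimal centroid-tracking sketch for direction/speed of a detected
# person (illustrative; not Silicon Labs' implementation). Assumes one
# person and a detector that yields a bounding-box centroid per frame.
FPS = 10                      # assumed camera frame rate

def motion(prev_xy, curr_xy):
    """Direction (degrees, 0 = +x) and speed (pixels/s) between frames."""
    dx = curr_xy[0] - prev_xy[0]
    dy = curr_xy[1] - prev_xy[1]
    angle = math.degrees(math.atan2(dy, dx))
    speed = math.hypot(dx, dy) * FPS
    return angle, speed

centroids = [(40, 60), (44, 60), (49, 61), (55, 63)]  # fake detections
for prev, curr in zip(centroids, centroids[1:]):
    a, s = motion(prev, curr)
    print(f"direction {a:6.1f} deg, speed {s:5.1f} px/s")
```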