Looks like some friends in Japan, with a little support from MegaChips, have been playing with Akida & MetaTF.
Apols if already posted as I may have missed it and haven't done a search.
Short video end of post.
Paper HERE
License: arXiv.org perpetual non-exclusive license
arXiv:2408.13018v1 [cs.RO] 23 Aug 2024
Robust Iterative Value Conversion: Deep Reinforcement Learning for Neurochip-driven Edge Robots
Yuki Kadokawa (kadokawa.yuki@naist.ac.jp), Tomohito Kodera (kodera.tomohito.kp9@is.naist.jp), Yoshihisa Tsurumine (tsurumine.yoshihisa@is.naist.jp), Shinya Nishimura (nishimura.shinya@megachips.co.jp), Takamitsu Matsubara (takam-m@is.naist.jp)
Nara Institute of Science and Technology, 630-0192, Nara, Japan; MegaChips Corporation, 532-0003, Osaka, Japan

Abstract
A neurochip is a device that reproduces the signal-processing mechanisms of brain neurons and computes Spiking Neural Networks (SNNs) with low power consumption and at high speed. Neurochips are therefore attracting attention for edge-robot applications, which suffer from limited battery capacity. This paper aims to achieve deep reinforcement learning (DRL) that acquires SNN policies suitable for neurochip implementation. Since DRL requires complex function approximation, we focus on conversion techniques from Floating-Point NNs (FPNNs), one of the most feasible ways to obtain SNNs. However, the DRL learning cycle, which updates the FPNN policy and collects samples with the SNN policy, requires a conversion to an SNN for every policy update, and the accumulated conversion errors can significantly degrade the performance of the SNN policies. We propose Robust Iterative Value Conversion (RIVC), a DRL framework that incorporates both conversion-error reduction and robustness to conversion errors. To reduce the errors, the FPNN is optimized with the same number of quantization bits as the SNN, so that quantization does not significantly change the FPNN's output. To be robust to the remaining errors, the quantized FPNN policy is updated to widen the gap between the probability of selecting the optimal action and that of the other actions, which prevents the conversion from unexpectedly replacing the policy's optimal actions. We verified RIVC's effectiveness on a neurochip-driven robot. The results showed that RIVC consumed 1/15 the power and computed actions five times faster than an edge CPU (quad-core ARM Cortex-A72), while a previous framework with no countermeasures against conversion errors failed to train the policies. Videos from our experiments are available:
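To see why widening that gap matters, here is a toy numeric illustration (mine, not the authors' code): a fixed perturbation standing in for conversion error flips the argmax of a narrow action distribution but not of a widened, RIVC-style one.

```python
import numpy as np

perturb = np.array([-0.06, 0.06, 0.0])  # fixed stand-in for conversion error

narrow = np.array([0.40, 0.37, 0.23])   # small gap between the top two actions
wide = np.array([0.80, 0.15, 0.05])     # widened gap, as RIVC optimizes for

print(np.argmax(narrow), np.argmax(narrow + perturb))  # 0 1: optimal action flips
print(np.argmax(wide), np.argmax(wide + perturb))      # 0 0: the gap absorbs the error
```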
Excerpts:

5.1 Construction of Learning System for Experiments
5.1.1 Entire Experiment Settings
This section describes the construction of the proposed framework shown in Fig. 2. We utilized a desktop PC equipped with a GPU (Nvidia RTX 3090) for updating the policies and an Akida Neural Processor SoC as the neurochip [9, 12]. The robot was controlled by the policies implemented on the neurochip; the SNNs were deployed to it via a software conversion performed by Akida's MetaTF toolkit [9, 12]. Since the target task is neurochip-driven robot control, samples were collected by the SNN policies in both the simulation tasks and the real-robot tasks. For learning, the GPU updates the policies based on the samples collected in the real-robot environment. Concerning the SNN structure, the quantization of the weights w_s described in Eq. (16) and the calculation accuracy of the activation functions described in Eq. (19) are verified in a range from 2 to 8 bits, which are the implementation constraints of the neurochip [9].
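For orientation, here is a minimal, self-contained sketch of that cycle (GPU update → MetaTF conversion → sample collection on the chip). Every function is a hypothetical NumPy stand-in for the paper's components, not the MetaTF API or the authors' code:

```python
import numpy as np

rng = np.random.default_rng(0)

def metatf_convert(fpnn_weights, bits=4):
    """Stand-in for MetaTF's FPNN-to-SNN conversion: n-bit weight quantization."""
    levels = 2 ** (bits - 1) - 1                   # e.g. 7 levels per side at 4 bits
    scale = np.max(np.abs(fpnn_weights)) / levels  # map largest weight to top level
    return np.round(fpnn_weights / scale) * scale  # snap weights to the n-bit grid

def collect_samples(snn_weights, n=32):
    """Stand-in for rollouts on the Akida chip: random states, greedy actions."""
    states = rng.normal(size=(n, snn_weights.shape[0]))
    actions = np.argmax(states @ snn_weights, axis=1)
    return states, actions

def update_fpnn(fpnn_weights, states, actions, lr=0.01):
    """Stand-in for the GPU policy update: nudge weights toward taken actions."""
    grad = np.zeros_like(fpnn_weights)
    for s, a in zip(states, actions):
        grad[:, a] += s
    return fpnn_weights + lr * grad / len(states)

fpnn = rng.normal(size=(8, 4))   # toy policy: 8-dim state, 4 discrete actions
for _ in range(10):              # iterative cycle: convert -> collect -> update
    snn = metatf_convert(fpnn)
    states, actions = collect_samples(snn)
    fpnn = update_fpnn(fpnn, states, actions)
```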
Table 3: Hardware performance of policies. FPNN was evaluated on an edge CPU (Raspberry Pi 4: quad-core ARM Cortex-A72); SNN was evaluated on a neurochip (Akida 1000 [9]). "Power consumption" and "Calculation speed" denote the power consumed and the time taken to obtain one action from the NN policies on each piece of hardware. Power consumption was measured with a voltage checker (TAP-TST8N).

| Hardware | Edge-CPU | Neurochip |
| --- | --- | --- |
| Network | FPNN | SNN |
| Power consumption [mW] | 61 | 4 |
| Calculation speed [ms] | 205 | 40 |
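A quick back-of-envelope figure derived from the table (not stated in the paper): multiplying each column's power by its per-action time gives the energy per action, which is where the chip's advantage compounds.

```python
# Energy per action = power x time per action, from the rows of Table 3.
edge_cpu_mj = 61 * 205 / 1000   # 61 mW for 205 ms -> ~12.5 mJ per action
neurochip_mj = 4 * 40 / 1000    # 4 mW for 40 ms   -> 0.16 mJ per action
print(edge_cpu_mj / neurochip_mj)  # ~78x less energy per action on Akida
```

That ~78x factor is simply the paper's 1/15 power saving multiplied by its 5x speed-up.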