Hi TTM,
While I'm not qualified to mark everybody's essays, I would take issue with the underlying principle of this paper that:
'SoCs equipped with neuromorphic hardware are available, but they rarely consider the integration of non-neuromorphic hardware accelerators. For instance, Akida [6] is an SoC containing an Arm processor that controls the neuromorphic hardware, but does not support the on-chip integration of other hardware accelerators.'
This is a reference to the Akida 1 SoC, not the Akida IP, but, in any event, the suggested limitations apply to neither the SoC nor the IP.
The whole idea of the Akida IP is that it can be integrated with other IC components, from doorbells to lidar, just as Akida 1 has been adapted to work with the whole gamut of processors. As we know, Akida 1 has been demonstrated to work with keyword recognition, lidar, DVS, breath analysers, ... In addition, Akida has been produced in rad-hard form and in 22 nm FD-SOI, and has been proven compatible across the range of Arm processors, which presumably includes the M85 with its Ethos AI hardware and Helium vector extensions (even if it makes them superfluous).
The statement that the Arm processor controls the neuromorphic hardware is not correct. The Arm, or any other host processor, is used only in setting up the SNN (configuration, loading weights and model libraries); the processor plays no part in the inference/classification function of Akida 1, and probably very little in the more advanced functions of Akida 2 (TENNs, ViT, Skip).
Towards this goal, we developed SpikeHard, a runtime-programmable neuromorphic hardware accelerator designed under the premises of high efficiency, scalability, and seamless integration in heterogeneous many-accelerator SoCs. As shown in Figure 1, our design methodology follows innovative restructuring of SNNs, which optimally remaps an SNN model to a desired hardware architecture. This promotes multi-objective design-space exploration (DSE) that helps to meet strict performance and energy-efficiency requirements in a many-accelerator SoC.
Akida is configurable from 2 nodes (8 NPUs) up to 64 nodes, and several Akida devices can be ganged together. MetaTF is used to optimize the configuration of Akida.
Our friends at Renesas are proud of their DRP-AI (not an NN), which is dynamically reconfigurable, but I'm not familiar with the details.
Having lumped Akida in with True North and Loihi, the paper then goes on to discuss perceived deficiencies in the IBM and Intel products and, by implication, attributing those deficiencies to Akida without any analysis of Akida.
The problem the authors addressed:
This model generation has two main weaknesses. First, the sequential generation at the level of each core imposes a dependency between the computational model and the hardware architecture. Second, after model generation, there is no process that checks for redundancy and efficient resource utilization.
As part of developing SpikeHard, we initially focused on the second weakness by optimizing SNNs that were generated by RANC with a dependency to a hardware architecture. In particular, we analyzed the original placement of neurons and axons, and provided an optimized placement according to the original capacity of each core. As shown in Figure 3, our method dramatically improves inner core utilization from an average of 50% for the original model to 85%, while also reducing the overall core count by 40%. In this way, we eliminate the redundancies at the level of a core and of the full neuromorphic processor, thereby solving the second weakness.
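For what it's worth, the repacking they describe boils down to bin-packing neurons into fixed-capacity cores. Here's a toy sketch of the idea (the capacity, numbers, and function names are my own illustration, not taken from the paper):

```python
# Toy illustration of repacking neurons into fewer, fuller cores.
# CORE_CAPACITY and the neuron counts below are assumptions for the example.

CORE_CAPACITY = 256  # assumed max neurons per core

def repack(cores):
    """cores: list of neuron counts per core from the original mapping.
    Returns a new list of per-core neuron counts that fills cores
    to capacity, reducing the total core count."""
    remaining = sum(cores)
    packed = []
    while remaining > 0:
        take = min(CORE_CAPACITY, remaining)
        packed.append(take)
        remaining -= take
    return packed

# Hypothetical original mapping: six half-empty cores (~43% utilization)
original = [128, 96, 140, 60, 200, 40]
optimized = repack(original)
# Same 664 neurons now fit in 3 cores at ~86% utilization
utilization = sum(optimized) / (len(optimized) * CORE_CAPACITY)
```

Of course, the real algorithm also has to respect axon connectivity between cores, which a naive bin-pack like this ignores.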
Remembering that Akida can run different models, e.g., face recognition, gesture recognition, voice recognition, I don't see how the following relates to Akida:
RANC only allows the SNN model to be configured at design time [26]. Specifically, each core stores model parameters, such as how neurons and axons are connected to one another, in immutable local buffers initialized via Verilog initial blocks. Consequently, in the case of an FPGA implementation, loading a new model to RANC requires synthesizing a new bitstream and reconfiguring the FPGA, which is time-consuming and impractical for real-time systems. In the case of an ASIC implementation, the immutable buffers would have to be initialized using an additional mechanism instead of initial blocks. Nevertheless, without programmability, the ASIC design would be unable to accommodate multiple SNN models at runtime.
In contrast, SpikeHard is a flexible hardware accelerator whose configuration can be programmed at runtime to process various SNN models. The neuromorphic processor inside SpikeHard uses the same core architecture as RANC with a few changes to support runtime programmability. To load a new model at runtime, SpikeHard overwrites the aforementioned local buffers with new model parameters. To this end, as illustrated in Figure 1(b), the new model parameters are broadcasted one after the other to all cores while being streamed from main memory. Alongside each parameter is a pointer, which specifies the destination core and buffer address. The destination core then writes the parameter to the specified address. Given a SpikeHard accelerator embedded with a grid of cores and a fixed core capacity, multiple different SNN models can be offloaded onto the accelerator by simply restructuring the models to fit within the core capacity and the grid dimensions (as described in Section 3.4).
So, if I'm reading this correctly, they can only run different models sequentially, not in parallel.
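As I read it, the quoted load mechanism amounts to something like the following (a toy sketch; the class, buffer layout, and names are my own assumptions, not the paper's code):

```python
# Hedged sketch of the broadcast-load mechanism described in the quote:
# each parameter is streamed with a pointer (destination core id + buffer
# address); every core sees the broadcast, but only the addressed core writes.

class Core:
    def __init__(self, core_id, buffer_size=16):
        self.core_id = core_id
        self.buffer = [0] * buffer_size  # local model-parameter buffer

    def on_broadcast(self, dest_core, addr, value):
        # All cores receive the stream; only the addressed core stores it.
        if dest_core == self.core_id:
            self.buffer[addr] = value

def load_model(cores, param_stream):
    """param_stream: (dest_core, addr, value) tuples streamed from memory."""
    for dest_core, addr, value in param_stream:
        for core in cores:  # broadcast each parameter to every core
            core.on_broadcast(dest_core, addr, value)

grid = [Core(i) for i in range(4)]
load_model(grid, [(0, 0, 42), (2, 5, 7)])
```

Note that loading a new model this way overwrites the buffers, which is why I read it as one model at a time.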
Because Akida is readily reconfigurable, I'm not sure that this paper is particularly relevant to Akida.