Right!
Now I've had a cold shower, it should be noted that Socionext has had an NNA since at least 2018:
https://socionextus.com/pressreleases/socionext-ai-accelerator-engine-for-edge-computing/
Socionext Develops AI Accelerator Engine Optimized for Edge Computing
Small-sized and Low Power Engine Supports Broad Range of Applications
SUNNYVALE, Calif., May 11, 2018 – Socionext Inc., a leading provider of SoC-based solutions, has developed a new Neural Network Accelerator (NNA) engine, optimized for AI processing on edge computing devices. The compact, low-power engine has been designed specifically for deep learning inference processing. When implemented, it can achieve a 100x performance boost compared with conventional processors for computer vision processing such as image recognition. Socionext will start delivering the Software Development Kit for the FPGA implementation of the NNA in the third quarter of 2018. The company is also planning to develop its SoC products with the NNA.
Socionext currently provides the graphics SoC "SC1810" with a built-in proprietary Vision Processor Unit (VPU) compatible with the computer vision API "OpenVX" developed by the Khronos Group, a standards organization.
The NNA has been designed to work as an accelerator to extend the capability of the VPU. It performs various computer vision processing functions with deep learning, as well as conventional image recognition, for applications including automotive and digital signage, delivering higher performance and lower power consumption.
The NNA incorporates the company's proprietary architecture using quantization technology that reduces the bits for parameters and activations required for deep learning. The quantization technology is capable of carrying out massive amounts of computing tasks with fewer resources, greatly reducing the data size and significantly lowering the system memory bandwidth. In addition, the newly developed on-chip memory circuit design improves the efficiency of the computing resources required for deep learning, enabling optimum performance in a very small package. A VPU equipped with the new NNA, combined with the latest technologies, will be able to achieve 100 times faster processing speed in image recognition compared with a conventional VPU.
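The press release doesn't say how Socionext's quantization actually works, but the general idea of trading bit-width for memory bandwidth is easy to illustrate. The sketch below is a generic symmetric post-training quantization of float32 weights to int8 in NumPy; the function names and the 8-bit choice are my illustrative assumptions, not Socionext's proprietary scheme.

```python
# Minimal sketch of symmetric post-training quantization (illustrative only,
# NOT Socionext's scheme): float32 weights are mapped to int8, cutting the
# stored data (and hence memory bandwidth) by 4x at the cost of some error.
import numpy as np

def quantize_symmetric(x: np.ndarray, n_bits: int = 8):
    """Map float32 values to signed n_bits integers with a single scale factor."""
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 127 for int8
    scale = float(np.max(np.abs(x))) / qmax
    if scale == 0.0:
        scale = 1.0                        # avoid division by zero for all-zero tensors
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.random.randn(256, 256).astype(np.float32)
q_weights, scale = quantize_symmetric(weights)

print("float32 bytes:", weights.nbytes)     # 262144
print("int8 bytes:   ", q_weights.nbytes)   # 65536 (4x smaller)
print("max abs error:", np.max(np.abs(weights - dequantize(q_weights, scale))))
```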
Now the interesting thing is that Socionext have a patent application dating from mid-2018 whose purpose is to reduce the data movement required for large MAC workloads.
US2021081489A1, "Arithmetic Method", 2018-06-04
[0010] An arithmetic method according to the present disclosure is an arithmetic method of performing convolution operation in convolutional layers of a neural network by calculating matrix products, using an arithmetic unit and an internal memory included in an LSI. The arithmetic method includes: determining, for each of the convolutional layers, whether an amount of input data to be inputted to the convolutional layer is smaller than or equal to a predetermined amount of data; selecting a first arithmetic mode and performing convolution operation in the first arithmetic mode, when the amount of input data is determined to be smaller than or equal to the predetermined amount of data in the determining; selecting a second arithmetic mode and performing convolution operation in the second arithmetic mode, when the amount of input data is determined to be larger than the predetermined amount of data in the determining; and outputting output data which is a result obtained by performing convolution operation.

The performing of convolution operation in the first arithmetic mode includes: storing weight data for the convolutional layer in external memory located outside the LSI; storing the input data for the convolutional layer in the internal memory; and reading the weight data from the external memory into the internal memory part by part as first data of at least one row vector or column vector, and causing the arithmetic unit to calculate a matrix product of the first data and a matrix of the input data stored in the internal memory, the weight data being read, as a whole, from the external memory into the internal memory only once.

The performing of convolution operation in the second arithmetic mode includes: storing the input data for the convolutional layer in the external memory located outside the LSI; storing a matrix of the weight data for the convolutional layer in the internal memory; and reading the input data from the external memory into the internal memory part by part as second data of at least one column vector or row vector, and causing the arithmetic unit to calculate a matrix product of the second data and the matrix of the weight data stored in the internal memory, the input data being read, as a whole, from the external memory into the internal memory only once.
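Reading through the legalese, the claim boils down to a memory-placement trick: whichever operand (input or weights) is small enough to fit on-chip stays resident, and the other operand is streamed in from external memory exactly once. Below is a hedged NumPy sketch of that dispatch logic; the threshold value, the matrix shapes, and the way "external" versus "internal" memory is modelled are my illustrative assumptions, not anything taken from the patent's figures.

```python
# A plain-Python reading of the patent's two arithmetic modes. Real hardware
# would issue DMA transfers between DRAM and an on-chip buffer; here the
# "streaming" is just row/column slicing, purely to show the dispatch logic.
import numpy as np

INTERNAL_MEMORY_BYTES = 64 * 1024  # assumed on-chip buffer size (illustrative)

def conv_as_matmul(weights: np.ndarray, inputs: np.ndarray) -> np.ndarray:
    """Convolution lowered to a matrix product, dispatched per the patent.

    Mode 1 (input fits on-chip): keep the input matrix resident and stream
    the weight matrix in from external memory one row at a time.
    Mode 2 (input too large): keep the weight matrix resident and stream
    the input matrix in one column at a time.
    Either way, the off-chip operand crosses the memory bus only once.
    """
    out = np.empty((weights.shape[0], inputs.shape[1]), dtype=np.float32)

    if inputs.nbytes <= INTERNAL_MEMORY_BYTES:
        # First arithmetic mode: inputs resident, weights streamed row by row.
        for i in range(weights.shape[0]):
            w_row = weights[i, :]          # one row fetched from "external" memory
            out[i, :] = w_row @ inputs     # MACs against the resident input matrix
    else:
        # Second arithmetic mode: weights resident, inputs streamed column by column.
        for j in range(inputs.shape[1]):
            x_col = inputs[:, j]           # one column fetched from "external" memory
            out[:, j] = weights @ x_col    # MACs against the resident weight matrix
    return out

# Quick check against a direct matrix product.
W = np.random.randn(32, 64).astype(np.float32)
X = np.random.randn(64, 1024).astype(np.float32)
assert np.allclose(conv_as_matmul(W, X), W @ X, atol=1e-4)
```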
Now it was around 2018 that BrainChip and Socionext began their cooperation, so Socionext's original NNA was developed in advance of their association with Akida.
If we assume that this patent describes their original NNA, Akida would wipe the floor with it. Akida could perform the functions of the VPU-plus-NNA combination above in a trice. Given that Socionext have undoubtedly seen Akida in action, and bearing in mind their initial enthusiasm for a SynQuacer/Akida engagement, would they persist with their clunky NNA from last millennium?