Fact Finder
Top 20
Hi @Diogenese
We have been tracking Quadric since February on TSE. Given that we are co-partners with MegaChips, it behoves us to keep a weather eye on them.
I find it useful to look at the problem a patent addresses. This one seeks to overcome the deficiencies of GPUs in performing NN tasks.
Basically they have designed a reconfigurable array of processors, much like a GPU, but have eliminated some redundant processing steps such as the addition of padding bits where the sensor data is absent.
The Quadric arrangement relies on synchronous operation and data inputs, whereas Akida utilizes asynchronous data inputs, a particularly sweet adaptation for LiDAR.
For those who don't wish to know the score, turn away now.
US10474398B2 Machine perception and dense algorithm integrated circuit
Autonomous vehicles have been implemented with advanced sensor suites that provide a fusion of sensor data that enable route or path planning for autonomous vehicles. But, modern GPUs are not constructed for handling these additional high computation tasks.
[0006] At best, to enable a GPU or similar processing circuitry to handle additional sensor processing needs including path planning, sensor fusion, and the like, additional and/or disparate circuitry may be assembled to a traditional GPU. This fragmented and piecemeal approach to handling the additional perception processing needs of robotics and autonomous machines results in a number of inefficiencies in performing computations including inefficiencies in sensor signal processing.
1. An integrated circuit comprising:
a plurality of processing cores,
each processing core of the plurality of processing cores comprising:
at least one processing circuit; and
at least one memory circuit;
a plurality of peripheral cores,
each peripheral core of the plurality of peripheral cores comprising:
at least one memory circuit,
wherein:
[i] at least a subset of the plurality of peripheral cores is arranged along a periphery of a first subset of the plurality of processing cores; and
[ii] a combination of the plurality of processing cores and the plurality of peripheral cores define an integrated circuit array;
a dispatch controller that provides data movement instructions, wherein the data movement instructions comprise a data flow schedule that:
defines an automatic movement of data within the integrated circuit array; and
sets one or more peripheral cores of the plurality of peripheral cores to a predetermined constant value if no data is provided to the one or more peripheral cores according to the predetermined data flow schedule.
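To make the last limb of claim 1 concrete, here is a minimal Python sketch of a data flow schedule that falls back to a constant when a peripheral core gets no data. It is only an illustration: the names `apply_schedule` and `PAD_CONSTANT`, the core labels and the choice of zero as the constant are all assumptions of mine, not anything the patent specifies.

```python
from typing import Dict, Optional

# Hypothetical names throughout; the patent does not define an API.
PAD_CONSTANT = 0  # assumed "predetermined constant value"


def apply_schedule(schedule: Dict[str, Optional[int]],
                   constant: int = PAD_CONSTANT) -> Dict[str, int]:
    """Return the value each peripheral core holds after one schedule step.

    A core with scheduled data keeps that data; a core with no data (None)
    is set to the constant, mirroring the final limb of claim 1.
    """
    return {core: (constant if value is None else value)
            for core, value in schedule.items()}


# Four peripheral cores; only two receive sensor data in this step.
schedule = {"p0": 17, "p1": None, "p2": 42, "p3": None}
state = apply_schedule(schedule)
print(state)
```

The point of the sketch is that the fallback is decided by the schedule up front, so no core has to discover at run time that its input is missing.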
View attachment 6877
130 = dispatch controller
140, 150 = periphery controllers
149, 159 = FIFO registers
160 = sensor data memory
View attachment 6883
112 = register file
122 = register file
114 = MAC
118 = ALU
[0039] An array core 110 may, additionally or alternatively, include a plurality of multiplier (multiply) accumulators (MACs) 114 or any suitable logic devices or digital circuits that may be capable of performing multiply and summation functions. In a preferred embodiment, each array core 110 includes four (4) MACs and each MAC 114 may be arranged at or near a specific side of a rectangular shaped array core 110 , as shown by way of example in FIG. 2.
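For anyone unfamiliar with MACs, the following toy Python sketch shows what a multiply-accumulator does and how four of them might share a dot product. The `MAC` class, the `dot_on_four_macs` helper and the round-robin distribution of work are my own assumptions for illustration; the patent only says each array core preferably has four MACs.

```python
class MAC:
    """Toy multiply-accumulator: each step adds a * b to a running total."""

    def __init__(self) -> None:
        self.acc = 0

    def step(self, a: int, b: int) -> int:
        self.acc += a * b
        return self.acc


def dot_on_four_macs(xs, ws):
    """Dot product spread over four MACs (round-robin is an assumption)."""
    macs = [MAC() for _ in range(4)]
    for i, (x, w) in enumerate(zip(xs, ws)):
        macs[i % 4].step(x, w)  # hand each product to the next MAC in turn
    # Combine the four partial sums into the final result.
    return sum(m.acc for m in macs)


result = dot_on_four_macs([1, 2, 3, 4, 5, 6, 7, 8],
                          [1, 1, 1, 1, 2, 2, 2, 2])
print(result)
```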
[0035] An array core 110 preferably functions as a data or signal processing node (e.g., a small microprocessor) or processing circuit and preferably, includes a register file 112 having a large data storage capacity (e.g., 4 kilobyte (KB) or greater, etc.) and an arithmetic logic unit (ALU) 118 or any suitable digital electronic circuit that performs arithmetic and bitwise operations on integer binary numbers. In a preferred embodiment, the register file 112 of an array core 110 may be the only memory element that the processing circuits of an array core 110 may have direct access to. An array core 110 may have indirect access to memory outside of the array core and/or the integrated circuit array 105 (i.e., core mesh) defined by the plurality of border cores 120 and the plurality of array cores 110 .
[0036] The register file 112 of an array core 110 may be any suitable memory element or device, but preferably comprises one or more static random-access memories (SRAMs). The register file 112 may include a large number of registers, such as 1024 registers, that enables the storage of a sufficiently large data set for processing by the array core 110. Accordingly, a technical benefit achieved by an arrangement of the large register file 112 within each array core 110 is that the large register file 112 reduces a need by an array core 110 to fetch and load data into its register file 112 for processing. As a result, a number of clock cycles required by the array core 112 to push data into and pull data out of memory is significantly reduced or eliminated altogether.
[0044] In a traditional integrated circuit (e.g., a GPU or the like), when input image data (or any other suitable sensor data) received for processing compute-intensive application (e.g., neural network algorithm) within such a circuit, it may be necessary to issue padding requests to areas within the circuit which do not include image values (e.g., pixel values) based on the input image data. That is, during image processing or the like, the traditional integrated circuit may function to perform image processing from a memory element that does not contain any image data value. In such instances, the traditional integrated circuit may function to request that a padding value, such as zero, be added to the memory element to avoid subsequent image processing efforts at the memory element without an image data value. A consequence of this typical image data processing by the traditional integrated circuit results in a number of clock cycles spent identifying the blank memory element and adding a computable value to the memory element for image processing or the like by the traditional integrated circuit.
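The difference described in [0044] can be sketched in Python as two ways of computing the same 3x3 neighbourhood sum. The first builds an explicitly padded copy of the image, as a conventional pipeline would; the second treats any read outside the image as coming from a border cell that already holds a constant. Both function names and the zero constant are assumptions of mine, and the sketch shows only that the results match, not the clock cycles saved in hardware.

```python
def box_sum_padded(image, pad=0):
    """Conventional approach: build a padded copy, then sum 3x3 windows."""
    h, w = len(image), len(image[0])
    padded = [[pad] * (w + 2)]
    padded += [[pad] + list(row) + [pad] for row in image]
    padded += [[pad] * (w + 2)]
    return [[sum(padded[r + dr][c + dc]
                 for dr in range(3) for dc in range(3))
             for c in range(w)] for r in range(h)]


def box_sum_border(image, border=0):
    """Border-core style: out-of-range reads return a preset constant."""
    h, w = len(image), len(image[0])

    def read(r, c):
        # Inside the image: real data. Outside: the border constant.
        return image[r][c] if 0 <= r < h and 0 <= c < w else border

    return [[sum(read(r + dr, c + dc)
                 for dr in (-1, 0, 1) for dc in (-1, 0, 1))
             for c in range(w)] for r in range(h)]


image = [[1, 2], [3, 4]]
print(box_sum_padded(image))
print(box_sum_border(image))
```

In the second version nothing is ever written into a padded buffer, which is the software analogue of the border cores being set to a constant ahead of time.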
Brilliant!
When I first read about Quadric I thought about the fact that there is yet to be a single agreed definition of the Edge and where it actually is in a system.
It struck me then why MegaChips have both solutions: AKIDA is out at the far Edge, with its mum saying “Be careful, you should not be that close”, while Quadric is at a safer place back from the Edge, with its mum saying “AKIDA, come back and stand with your cousin Quadric”.
The following, which I extracted from the text you posted, seems to fit this scenario, with AKIDA making all the sensors intelligent and Quadric processing the relevant data that AKIDA produces:
“Autonomous vehicles have been implemented with advanced sensor suites that provide a fusion of sensor data that enable route or path planning for autonomous vehicles. But, modern GPUs are not constructed for handling these additional high computation tasks.
[0006] At best, to enable a GPU or similar processing circuitry to handle additional sensor processing needs including path planning, sensor fusion, and the like, additional and/or disparate circuitry may be assembled to a traditional GPU”
If you generally agree then I can allow the rest of the technological differences to happily go over my head?
My opinion only DYOR
FF
AKIDA BALLISTA