Hi JD,
For what it's worth, I asked ChatGPT for a neutral take on this.
What I thought was interesting was that it wove Anduril's headset into the response. I didn't prompt it to, but earlier in the day, in a separate chat, I had queried ChatGPT on boards or integration possibilities and whether an Akida 1500 + Arm Cortex-M85 module (as an example) could power something like Anduril’s helmet.
ChatGPT's Response
What “supports 128 MACs per node” most likely means
- Each Akida node has a local 4×4 MAC micro-engine (128 MACs) it can invoke for certain ops.
- Those MACs co-exist with the spiking compute (CNP/FNP or equivalent). They’re there so models can include non-spiking layers/ops without leaving the chip.
Why add MACs if you have an SNN?
Because many modern edge models are hybrid. MACs cover things SNNs don’t excel at (or that are simpler to drop in as-is):
- Conv/pointwise (1×1) / depthwise conv blocks
- Linear/projection layers (e.g., classifier heads, QKV in tiny attention)
- Normalisation / pooling / residual glue
- Pre/post-processing (e.g., short FIRs, feature projections)
- Direct reuse of existing INT4/INT8 CNN chunks without full SNN conversion
What a hybrid pipeline might look like in a helmet
Sensors
- Front/rear frame cameras → light MAC preproc (resize/1×1 conv).
- Event camera/radar/audio → feed SNN/TENNs directly.
Perception
- SNN handles temporal/gesture/track (event streams).
- MAC handles small CNN blocks or projection heads.
Fusion & decisions
- On the host MCU (e.g., Cortex-M85).
Bottom line
- The “128 MACs per node” doesn’t signal the end of Akida’s neuromorphic core. It signals a pragmatic hybrid: keep the spiking engine for temporal, sparse, event-driven strengths, and use local MACs to natively run conventional layers (conv/linear/norm) and reuse existing model chunks.
- That combo generally improves accuracy, eases porting, and lowers total latency/power versus forcing everything into SNN or everything into classic CNN on a separate chip.
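To make the hybrid idea a bit more concrete, here's a rough sketch I put together in plain PyTorch (my own illustration, not ChatGPT's, and nothing to do with BrainChip's actual software). The frame-camera branch uses ordinary MAC-friendly layers (conv / 1×1 / linear), the event branch is a stand-in placeholder for the SNN/TENNs side, and a small fusion layer plays the role of the host MCU's decision step:

```python
# Illustrative only: a stand-in for the hybrid "MAC + spiking" split,
# built from plain PyTorch modules. SpikingTemporalBlock is a hypothetical
# placeholder for the event-driven SNN/TENNs side, not a real Akida layer.
import torch
import torch.nn as nn

class SpikingTemporalBlock(nn.Module):
    """Placeholder for an event-driven temporal model (SNN/TENNs-like)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.proj = nn.Linear(in_ch, out_ch)   # stands in for stateful spiking compute
    def forward(self, events):                 # events: (batch, time, in_ch)
        return self.proj(events).mean(dim=1)   # collapse the time axis for fusion

class HybridHead(nn.Module):
    """Frame branch = conventional MAC work; event branch = 'spiking' placeholder."""
    def __init__(self):
        super().__init__()
        self.frame_branch = nn.Sequential(      # conv / pooling / linear: classic MAC ops
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, 32),
        )
        self.event_branch = SpikingTemporalBlock(in_ch=64, out_ch=32)
        self.fusion = nn.Linear(64, 8)          # decision layer, could live on the host MCU

    def forward(self, frames, events):
        f = self.frame_branch(frames)           # (batch, 32)
        e = self.event_branch(events)           # (batch, 32)
        return self.fusion(torch.cat([f, e], dim=1))

model = HybridHead()
out = model(torch.randn(1, 3, 64, 64), torch.randn(1, 100, 64))
print(out.shape)  # torch.Size([1, 8])
```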
Hi Bravo,
It's a pity Chatty didn't provide references and block diagrams for its hybrid SNN/MAC combo.
One thing that disinclines me from the hybrid is that Jonathan/Tony said something to the effect that "We're still honouring the spirit of Peter's invention."
What follows is my inference, and there is probably not much in writing to support it ... and there is a (vanishingly small) possibility that I may be wrong.
The thing is that, once you move away from 1-bit, you're moving away from the original digital spike concept and you need additional circuitry to handle the multiple bits in parallel. As I've said, I'd like to believe that the old NPU was the optimal solution for single bits, which was the original Akida concept. It really is a remarkable invention.
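To illustrate what I mean about the extra circuitry (just a toy illustration of the arithmetic, nothing to do with Akida's actual circuits): with 1-bit spikes the multiply-accumulate collapses to "add the weight whenever a spike arrives", whereas multi-bit activations force a genuine multiplication for every term.

```python
# Toy comparison: 1-bit spikes vs multi-bit activations in a dot product.

def dot_1bit(spikes, weights):
    # Spikes are 0/1, so "multiply-accumulate" degenerates to a conditional add.
    return sum(w for s, w in zip(spikes, weights) if s)

def dot_multibit(acts, weights):
    # 4-/8-bit activations: every term needs an actual multiplication (a real MAC).
    return sum(a * w for a, w in zip(acts, weights))

spikes  = [1, 0, 1, 1]
acts    = [3, 0, 7, 2]          # e.g. 4-bit activation values
weights = [5, -2, 4, 1]

print(dot_1bit(spikes, weights))    # 10
print(dot_multibit(acts, weights))  # 45
```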
When 4-bit was announced, I asked Peter if that would mean including MACs, and he said "No."
It was only after TENNs was announced that the references to MACs started. The applications for TENNs and the associated models have expanded rapidly. I think it was Tony who said that initially they couldn't implement recurrence (RNN) with TENNs, and this would have affected ML, so it would have made sense to keep the original NPU. Once they mastered recurrence with TENNs, the case for the original NPU became weaker, and it is further weakened by the fact that having a hybrid would increase the wafer real-estate footprint per node.
The way I see it is that the requirement for multi-bit meant that the old NPU would need to be repeated in silicon to match the number of bits, and the outputs "blended" in MACs. A 4-bit × 4-bit MAC has about 12 rows and 8 columns of arithmetic cells (multiply or add) to accommodate the partial products, the 8-bit product and the additions. An 8-bit × 8-bit MAC would need about 4 times that number of cells.
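Rough numbers behind that (my own approximation of a textbook array multiplier, not BrainChip's layout): the partial-product array of an n-bit × n-bit multiplier grows roughly as n², which is where the "about 4 times" comes from.

```python
# Rough arithmetic behind the "about 4 times the cells" remark (my approximation):
# an n-bit x n-bit array multiplier has n*n partial-product (AND) cells,
# plus adders to reduce them into a 2n-bit product, so area grows roughly as n^2.

def rough_cell_count(n_bits):
    partial_products = n_bits * n_bits      # one AND cell per weight-bit/activation-bit pair
    adders = n_bits * (n_bits - 1)          # very rough count of reduction adders
    return partial_products + adders

for n in (4, 8):
    print(f"{n}x{n} multiplier: ~{rough_cell_count(n)} cells, {2 * n}-bit product")
# 4x4 multiplier: ~28 cells, 8-bit product
# 8x8 multiplier: ~120 cells, 16-bit product (a bit over 4x the 4x4 count)
```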
This is the BRN patent application which introduced recurrence with TENNs:
US2025209313A1 METHOD AND SYSTEM FOR IMPLEMENTING ENCODER PROJECTION IN NEURAL NETWORKS 20231222
[0054] In some embodiments, the neural network may be configured to perform an encoder projection operation either in a buffer mode or in a recurrent mode. In some embodiments, the buffer mode operation is preferred during training and the recurrent mode operation is preferred during inference for generating processed content based on the input data stream or signal. The preferred operations in the buffer mode and the recurrent mode may be ascertained from the detailed operations of the buffer mode and the recurrent mode as described below with reference to FIGS. 4 to 14 .
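For what it's worth, here is my own toy sketch of the general buffer-vs-recurrent idea (not the patent's actual method): a temporal projection computed over a sliding buffer of samples can also be computed one sample at a time by carrying a small state, and the two give identical outputs.

```python
# Toy sketch of buffer mode vs recurrent mode for a simple temporal projection.
# The 3-tap kernel and the data are made up; this is only the general idea.
import numpy as np

kernel = np.array([0.5, 0.3, 0.2])      # hypothetical 3-tap temporal kernel
stream = np.array([1.0, 2.0, 0.0, 4.0, 3.0])

def buffer_mode(stream, kernel):
    # Hold the last K samples in a buffer and project them in one shot.
    K = len(kernel)
    padded = np.concatenate([np.zeros(K - 1), stream])
    return np.array([padded[t:t + K] @ kernel[::-1] for t in range(len(stream))])

def recurrent_mode(stream, kernel):
    # Keep a small state and update it sample by sample.
    K = len(kernel)
    state = np.zeros(K)
    out = []
    for x in stream:
        state = np.roll(state, 1)
        state[0] = x
        out.append(state @ kernel)
    return np.array(out)

print(buffer_mode(stream, kernel))      # [0.5 1.3 0.8 2.4 2.7]
print(recurrent_mode(stream, kernel))   # same numbers, computed incrementally
```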
The priority is December 2023, so there has been a lot of model development since then.
Now you've made me look at this patent, my head hurts!