Hi wasMADX,
I haven't studied Weebit's tech, but this is what they say about Edge AI:
Edge Artificial Intelligence
Regardless of the specific application, storing weights for artificial Neural Networks (NNs) requires significant on-chip memory.
Depending on the network size, requirements typically range between 10Mb – 100Mb. For AI edge products where low power consumption is so important, what’s needed is small, fast, on-chip (embedded) NVM.
Although it is common and simple for near-memory computation, SRAM won’t work for these applications because it is extremely large and volatile. This volatility means it must stay connected to power, consuming a great deal of power and also risking data loss in the event that power is unexpectedly cut off. Given its size, it would also require additional off-chip NVM, leading to memory bottlenecks and power waste. On-chip flash memory is also far from ideal. As NVM, it can persistently hold weights even during power-off, but it can’t scale below 28nm as embedded on-chip memory. This means a separate chip is needed – leading to memory bottlenecks.
ReRAM (RRAM) is 4x smaller than SRAM so more data can reside locally. It scales well below 28nm, it is non-volatile, and it enables quick memory access. Weebit ReRAM is ideal for advanced edge AI chips.
https://www.weebit-nano.com/market/applications/#edge
They talk about storing weights for ANNs.
They do not propose their ReRAM for in-memory compute, which would be analog.
I would guess that there is the pervasive analog problem of manufacturing variations which would result in unreliable calculations.
One of the advantages of their ReRAM is its robustness in hostile environments:
https://www.weebit-nano.com/market/applications/#aerospace
Aerospace and Defense
ICs for aerospace and defense have unique requirements for robustness, reliability at high temperatures, and tolerance to radiation (rad-hard) and electromagnetic fields. As these products are often required to last for years – mostly without maintenance – longevity is another key trait. Memory must be reliable for the lifetime of the product.
Weebit ReRAM has significantly better endurance than flash, ensuring it can support products with long lifetimes. It is also able to maintain its reliability at a broad range of temperatures, from -55° Celsius up to 175° Celsius. ReRAM (RRAM) cells are inherently immune to various types of radiation and electromagnetic fields. In fact, Weebit ReRAM can withstand 350x more radiation than flash. These features make Weebit ReRAM ideal for aerospace and defense applications.
It could be used as a backup memory for Akida's configuration data (weights, connections, ...) in remote/inaccessible applications.
Hi FF,
This sounds like the pre-Thorpe rate coding.
"An enhanced version of the integrate and fire model is the leaky integrate and fire (LIF) model which also takes the membrane voltage leak into account. SFA, i.e.
increase in the inter-spike interval (ISI) over time for a regular spike train, is an intrinsic feature of biological neurons. In this paper, we will focus on SFA as an important feature to explore in SNNs."
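For anyone unfamiliar with the LIF model the quote refers to, here is a minimal Python sketch (the parameter values are illustrative guesses, not taken from the paper): the membrane voltage integrates input current, leaks back toward rest, and a spike is emitted whenever a fixed threshold is crossed.

```python
import numpy as np

# Minimal leaky integrate-and-fire (LIF) neuron sketch. The membrane
# voltage integrates input and leaks toward rest with time constant
# tau_m; a spike fires on a fixed threshold crossing, then the voltage
# is reset. All parameter values here are illustrative, not from the paper.
def lif(input_current, dt=1.0, tau_m=20.0, v_rest=0.0,
        v_thresh=1.0, v_reset=0.0):
    v = v_rest
    spikes = []
    for t, i_in in enumerate(input_current):
        # Leak toward rest, plus the integrated input for this step.
        v += (dt / tau_m) * (v_rest - v) + dt * i_in
        if v >= v_thresh:          # threshold crossing -> spike
            spikes.append(t * dt)
            v = v_reset            # reset after firing
    return spikes

# A constant input yields a regular spike train with a fixed
# inter-spike interval: this basic model has no adaptation.
print(lif(np.full(100, 0.08)))
```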
Thorpe deduced from Adrian's research of 70 years before that the information in the spike train repetition rate was largely redundant, the relevant information being conveyed by the amplitude of the initial spike, and that the larger spikes (those conveying the most significant information) arrived before weaker spikes (possibly because they reached the firing threshold earlier?).
This then led to N-of-M coding in which only the first N incoming spikes were passed on for processing. This is quite similar to DVS cameras where pixels whose output does not exceed a threshold are ignored.
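As a rough illustration of the N-of-M idea (my own sketch, not Thorpe's or Furber's exact formulation): given the first-spike times on M channels, only the N earliest spikes are passed on and the rest are discarded.

```python
import numpy as np

# Hypothetical N-of-M coding sketch: out of M input channels, only the
# N earliest spikes are passed on for processing; later spikes are
# simply ignored.
def n_of_m(first_spike_times, n):
    """first_spike_times: array of M first-spike times (np.inf = never fired).
    Returns a boolean mask selecting the n earliest-firing channels."""
    order = np.argsort(first_spike_times)   # earliest spikes first
    # Exclude channels that never fired (infinite spike time).
    keep = [i for i in order[:n] if np.isfinite(first_spike_times[i])]
    mask = np.zeros(len(first_spike_times), dtype=bool)
    mask[keep] = True
    return mask

# 8 channels, keep the 3 earliest spikes (channels 5, 1 and 3 here);
# the rest are ignored, much like sub-threshold pixels in a DVS camera.
times = np.array([5.2, 1.1, np.inf, 3.4, 9.0, 0.7, np.inf, 4.8])
print(n_of_m(times, 3))
```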
The paper postulates a number of reasons for SFA.
The biological phenomenon of spike frequency adaptation
In biology, if a neuron is stimulated in a repeated and prolonged fashion, for example by constant sensory stimulation or artificially by applying an electric current, it first shows a strong onset response, followed by an increase in the time between spikes. Hence the spike rate attenuates and the so-called spike frequency adaptation takes place.
Experimental data from the Allen Institute [17] show that a substantial fraction of excitatory neurons of the neocortex, ranging from 20% in the mouse visual cortex to 40% in the human frontal lobe, exhibit SFA, as shown in Fig. 2a, b.
There can be different causes for SFA:
First, short-term depression of the synapse through depletion of the synaptic vesicle pool. This means that at the connection site between neurons, the signal from the pre-synaptic neuron cannot be transmitted to the next neuron.
Second, by an increase in the spiking threshold of the post-synaptic neuron due to the activation of potassium channels by calcium, which has a subtractive effect on the input current. Hence, the same input current that previously caused a spike does not lead to a spike anymore.
Third, lateral and feedback inhibition in the local network reduces the effect of excitatory inputs in a delayed fashion [20]. Therefore, like in the second case, spike generation is hampered.
Advantages of spike frequency adaptation
From a biological standpoint, multiple advantages of the SFA mechanism have been observed. First, it lowers the metabolic costs by facilitating sparse coding [21]: when there is no significant information in the presented inputs, as the input is either being repeated or there is a high-intensity constant stimulant, the firing rate is decreased, leading to a reduction in metabolic cost and hence power consumption. Moreover, the separation of high-frequency signals from noisy environments is facilitated by SFA [22]. In addition, SFA can be seen as a simple form of short-term memory on the single-cell level [23].
In other words, SFA improves the efficiency [24] and accuracy of the neural code and hence optimizes information transmission [25]. SFA can be seen as an adaptation of the spike output range to the statistical range of the environment, meaning that it contrasts fluctuations of the input rather than its absolute intensity [26]. Thereby noise is reduced and, as mentioned above, repetitive information is suppressed, which leads to an increase in entropy. Consequently, the detection of a salient stimulus can be enhanced [27]. These biological advantages of SFA can also be exploited for low-power and high-entropy computations in artificial neural networks.
To introduce SFA in spiking neural networks, a neuron model can be used which includes an adaptive threshold property [28]. SNNs with these kinds of neurons learn quickly, even without synaptic plasticity [29]. Moreover, SFA helps in attaining higher computational efficiency in SNNs [17]. For example, to achieve a store-and-recall cycle (working memory) of duration 1200 ms, a single exponential adaptive model requires a decay constant τ_a = 1200 ms in ref. [17], while a double exponential adaptive threshold model requires decay constants of τ_a1 = 30 ms and τ_a2 = 300 ms [19], the latter being more efficient and sophisticated with four adaptation parameters compared to two parameters in ref. [17].
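To make the adaptive-threshold idea concrete, here is a sketch in the spirit of the double-exponential model the paper describes. The two decay constants echo the quoted τ_a1 = 30 ms and τ_a2 = 300 ms, but everything else (the threshold bumps beta1/beta2, the input level) is my own guess, not the paper's parameterization.

```python
import numpy as np

# Sketch of spike frequency adaptation via an adaptive threshold.
# Each spike bumps the threshold, and the bump decays away on two
# timescales (tau_a1, tau_a2), so sustained input produces progressively
# longer inter-spike intervals. Parameter values are illustrative only.
def adaptive_lif(input_current, dt=1.0, tau_m=20.0, v_thresh0=1.0,
                 tau_a1=30.0, tau_a2=300.0, beta1=0.1, beta2=0.05):
    v, a1, a2 = 0.0, 0.0, 0.0
    spikes = []
    for t, i_in in enumerate(input_current):
        v += (dt / tau_m) * (-v) + dt * i_in
        # Adaptation variables decay back to zero on two timescales.
        a1 -= (dt / tau_a1) * a1
        a2 -= (dt / tau_a2) * a2
        if v >= v_thresh0 + a1 + a2:   # threshold rises after each spike
            spikes.append(t * dt)
            v = 0.0
            a1 += beta1                # fast adaptation component
            a2 += beta2                # slow adaptation component
    return spikes

# Constant drive: the inter-spike intervals lengthen before settling,
# i.e. the firing rate adapts instead of staying regular (SFA).
s = adaptive_lif(np.full(600, 0.08))
print(np.diff(s))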
However, it is not clear that attempting to mimic biological neurons too closely is beneficial in an electronic context. This is where Rain came unstuck.
Does it make the process faster, more power efficient, or more accurate, or does it improve ML?
The claim that using the rate change is more efficient also needs to take into account the cost of monitoring the rate.
N-of-M coding is highly efficient in weeding out the also-rans. On that front, it is notable that a couple of Steve Furber's papers are cited, but Steve came up with N-of-M coding independently of Thorpe, yet there is no mention of this.