BRN Discussion Ongoing

Dang Son

Regular
[GIF via GIPHY: https://giphy.com/embed/gIEpRVmXIOHiQr7fjP]
 
  • Haha
Reactions: 1 users

uiux

Regular
@uiux @misslou

Not sure if this helps, but it does give some insight into GrAI and their original NeuronFlow AI architecture on the GrAI One. They have now moved to the GrAI VIP as an evolution, so I could be wrong, but I expect the underlying architecture of the VIP is just some enhancements rather than something entirely new or ground-breaking.

Whilst obviously suitable for certain cases, and close to Akida in particular areas (e.g. digital, event-based), there would appear to be no on-device learning, which indicates to me it is still not near Akida overall.

Original May 22 paper attached.

View attachment 13972

View attachment 13969

View attachment 13966
View attachment 13967

I am hoping that if I force feed enough of the underlying maths into my head it's going to magically just start making sense one day
 
  • Haha
  • Like
Reactions: 14 users

Bravo

If ARM was an arm, BRN would be its biceps💪!
Hey Brain Fam,

I only just realised this morning, after reading the article linked below, that Sony and Valeo have a connection. I must be a bit slow on the uptake but, better late than never, as the saying goes. At CES 2022, Sony announced two electric cars: the Vision-S 01, a sedan to take on the Tesla Model 3, and the Sony Vision-S 02. For these EVs, Sony had partnered with Magna Steyr, AImotive, Valeo, and Bosch.

I haven't checked the Vision-S 01 yet, but the Vision-S 02 has 40 sensors comprising cameras, radars and LiDARs.

Naturally, I'm not jumping to any conclusions 😇 but I thought this was VERY interesting!

[screenshots attached]

 
  • Like
  • Love
  • Fire
Reactions: 37 users

misslou

Founding Member
“Better” is open for interpretation. I’d say different and limited. It seems to be a chip that can be trained to do a specific task in a way that is generally acceptable to call AI. Not in my opinion, but seemingly accepted by many others. But then I could do the same in a sequential programming language; no AI need be involved.

For each pixel
    if (pixel changed from previous state) then
        do something
    end if
    store current pixel state for next iteration
end for
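
A minimal NumPy sketch of that loop, assuming 8-bit grayscale frames and an arbitrary per-pixel threshold (purely illustrative):

import numpy as np

def changed_pixels(prev_frame, frame, threshold=10):
    # Flag pixels whose value changed by more than `threshold` -- no AI involved.
    diff = np.abs(frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

prev = np.zeros((4, 4), dtype=np.uint8)
cur = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)
mask = changed_pixels(prev, cur)   # "do something" with the changed pixels here
prev = cur.copy()                  # store current pixel state for the next iteration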

A couple of things that stood out for me were:
1) “GrAI VIP can handle MobileNetv1–SSD running at 30fps for 184 mW, around 20× the inferences per second per Watt compared to a comparable GPU”

Comparing it to a power-hungry GPU is a bit naughty. Everyone knows they are power-hungry, and anyway GPUs don’t do inference per se; they just apply sledgehammer, power-hungry, high-level maths (well, considering multiplication to be high-level, that is).

Akida has helped achieve 1000fps and uses µW


2) It uses 16-bit floating point in its calculations. That would be compute-intensive.


3) The system can be trained, but I saw nothing about it learning.

and
4) It seems very specific to processing images only. Although they do also mention audio, their example is only for video.

IMHO it seems like they are closer to a normal, single-tasked CNN and are using the word neuromorphic in a very loose manner, pretty much just as a buzzword, probably to get search engines to find the article. Sure, they call things neurons, but many other implementations also call memory cells neurons and call what they have neuromorphic.

As @jtardif999 stated, they don’t mention synapses, and I don’t accept that if you have neurons, then synapses naturally follow. They should, in a true neuromorphic implementation, but so many are using that term for things that are very loosely modelled on only part of the brain.

As an example I refer to ReRAM implementations of “neuromorphic” systems. They store both state and weight in memory cells, and use the resistive state of memory cells to perform analogue addition and multiplication. But I think all such “neuromorphic” implementations suffer the same limitation of not being able to learn; they can only be trained. And once trained for a task, that is the only task they do until re-trained. And if that is all you want, then that is your definition of “better”.

This raises a VERY relevant question: is Akida too good? The world has time and time again gone with simple-to-understand, simple-to-use solutions over complex, multi-faceted solutions. The world especially likes mass-produced widgets that do a required task well enough. Some of these other “neuromorphic” solutions may prove to be just that. People seem happy to throw money multiple times at an inferior product rather than pay extra for the product they really need.

There’s enough room in the TAM for multiple players. I’m happy for Akida to occupy the top spot, solving the more difficult problems, and leave the more mundane to others.
Fantastic explanation, thanks very much.

And thanks to everyone else who knows stuff and generously shared that knowledge.

I loved being educated, especially when it comes to this investment.
 
  • Like
  • Love
  • Fire
Reactions: 25 users

KMuzza

Mad Scientist
Actually, nviso did port all their models over... so I'm not entirely sure about the number of required nodes for all the models... Either way, it's not a fair comparison with MobileNet; it's apples vs oranges.


View attachment 13959
[quoted in full: the “Better” is open for interpretation… post above]
Hi Guys,
remember this as well.

BrainChip Holdings Ltd

Appendix 4C and Quarterly Activities Report

For the Period Ended 30 June 2022.


Financial Update:

BrainChip incurred $0.77M in third-party licenses and hardware related to the development of next-generation Akida engineering samples.

cheers
AKIDA BALLISTA UBQTS
 
  • Like
  • Fire
  • Love
Reactions: 28 users

uiux

Regular
Fantastic explanation, thanks very much.

And thanks to everyone else who knows stuff and generously shared that knowledge.

I loved being educated, especially when it comes to this investment.


The research paper that @Fullmoonfever shared is amazing. Hands down the biggest educational resource I've seen in a long time, covering a lot of the competitive landscape.


Amazing read. Recommended.

Thanks again @Fullmoonfever
 
  • Like
  • Fire
  • Thinking
Reactions: 31 users

uiux

Regular
[quoted: Bravo's “Hey Brain Fam” Sony/Valeo post above]

Here's a conclusion for you Bravo:

Sony and Valeo have a connection
 
  • Like
  • Haha
Reactions: 22 users

Bravo

If ARM was an arm, BRN would be its biceps💪!
Here's a conclusion for you Bravo:

Sony and Valeo have a connection


Affirmative U-bby-baby! And here's the other conclusion:

Valeo and BrainChip have a connection! 🤭
 
  • Like
  • Haha
  • Love
Reactions: 25 users

alwaysgreen

Top 20
Therefore, it is highly likely that someone at Valeo has advised someone at Sony, that Akida is the best god damn neuromorphic chip on the planet!
 
  • Like
  • Fire
  • Love
Reactions: 31 users

Baisyet

Regular
This is an awesome find, thank you!



I am looking for the slide from a BrainChip presentation that shows Graimatter next to BrainChip with a few of the other chips to the right


Anyone know which one it's from? I thought it was 2022 AGM preso
Is this the one, uiux?
 

Attachments

  • Brain chip 2021.pdf
    4.4 MB · Views: 148
  • Like
  • Fire
Reactions: 5 users

uiux

Regular
Therefore, it is highly likely that someone at Valeo has advised someone at Sony, that Akida is the best god damn neuromorphic chip on the planet!

There's always that one guy
 
  • Like
  • Haha
Reactions: 7 users

uiux

Regular
@Baisyet


Nah, there is one that has 5 or so competing products on it, with Brainchip and Graimatter towards the bottom left of the slide
 
Last edited:
  • Thinking
  • Like
Reactions: 2 users

alwaysgreen

Top 20
  • Haha
  • Like
Reactions: 17 users

uiux

Regular
Is this the one, uiux?

Although that preso has this:


[slide image attached]



Which gives insight into what that $$$$ spend reported in the 4C was for...


AKD1500


Reading this slide makes me think that AKD2000 will be the 2nd iteration of AKD1500, similar to the performance enhancements reported from the 2nd batch of AKD1000.
 
  • Like
  • Fire
  • Love
Reactions: 34 users

equanimous

Norse clairvoyant shapeshifter goddess
[quoted: uiux's “Although that preso has this…” AKD1500 post above]
I'm on the move; is this what AKD3000 is about, below?

Post in thread 'BRN Discussion 2022' https://thestockexchange.com.au/threads/brn-discussion-2022.1/post-117381
 
  • Like
Reactions: 4 users

Diogenese

Top 20
[quoted: uiux's “Although that preso has this…” AKD1500 post above]
Hi Ui,

If I recall correctly, Akida 1500 will include LSTM.

This is a major change which was not contemplated in the original roadmap. I suppose it has been interpolated because it is too important to wait for the full Akida 2000.
 
  • Like
  • Fire
  • Thinking
Reactions: 20 users

buena suerte :-)

BOB Bank of Brainchip

Here is a very comprehensive report including Brainchip and GrAI Matter (will take me forever to get through all this!! :)

Spiking Neural Networks: Research Projects Or Commercial Products?​


Opinions differ widely, but in this space that isn’t unusual.
MAY 18TH, 2020 - BY: BRYON MOYER

Spiking neural networks (SNNs) often are touted as a way to get close to the power efficiency of the brain, but there is widespread confusion about what exactly that means. In fact, there is disagreement about how the brain actually works.
Some SNN implementations are less brain-like than others. Depending on whom you talk to, SNNs are either a long way away or close to commercialization. The varying definitions of SNNs lead to differences in how the industry is seen.
“A few startups are doing their own SNNs,” said Ron Lowman, strategic marketing manager of IP at Synopsys. “It’s being driven by guys that have expertise in how to train, optimize, and write software for them.”
On the other hand, Flex Logix Inference Technical Marketing Manager Vinay Mehta said that, “SNNs are out further than reinforcement learning,” referring to a machine-learning concept that’s still largely in the research phase.
The entire notion of a “neural network” is motivated by attempts to model how the brain works. But current neural networks — like the convolutional neural networks (CNNs) that are so prevalent today — don’t follow the design of the brain. Instead, they rely on matrix multiplication for incorporating synaptic weights and gradient-descent algorithms for supervised training.
Those working on SNNs often refer to these as “classical” networks or “artificial” neural networks (ANNs). That said, Alexandre Valentian, head of advanced technologies and system-on-chip laboratory for CEA-Leti, noted that CNNs reflect more of an approach or type of application, while SNNs reflect an implementation. “CNNs can be implemented in spikes — it’s not CNN vs. SNN.”
Mimicking the brain
The notion of an SNN originates in the fact that the brain uses spikes to relay information. An important question, however, is how information is coded onto those spikes. Several ways are used in both research and development stages. This category of neural network is sometimes referred to as “neuromorphic,” in that it reflects the way the brain works. Classical networks are not neuromorphic, but some SNNs are more neuromorphic than others. As noted in a BrainChip paper, “… Today’s technology… is, at best, only loosely related to how the brain functions.”
Many of the SNN ideas are still in the exploration stage in academic institutions. Several papers at the 2019 IEDM conference dealt with implementations of SNNs with novel circuit techniques to achieve the goals of lower power. But there are also commercial companies working on SNNs. As identified at the recent Linley Spring Processor Conference, Intel has a serious research program going, while BrainChip and GrAI Matter Labs are readying commercial chips. The reason for this wide range between early research and commercial viability reflects a range of interpretations as to how an SNN can be implemented.
Some of the projects underway involve literal spikes, which are an analog phenomenon. But others abstract the notion of a “spike” into that of an “event,” and they implement them digitally as packets traveling through a network from neuron to neuron. The high-level effect, then, is to move from measuring everything all the time, as in a classical CNN, to dealing only with events. The power savings expected from SNNs is often thought to relate to the spikes themselves, but part of the gain comes from dealing with events. In other words, work happens only when there’s an interesting event to work with. Otherwise, no work (or less work) is done, keeping power low.
“If you don’t achieve [a neuron’s] activation threshold, no event is generated,” said Roger Levinson, COO of BrainChip. This corresponds to a high level of sparsity, which is coveted in classical networks.
Another feature of SNNs is the fact that events can excite or suppress a neuron. Events then can compete with each other, with some having an excitatory effect while others have an inhibitory effect. With classical networks, negative weights can reduce the magnitude of the resulting activations, but that’s more of a static representation of a video frame (or other data set) being evaluated rather than events pushing and pulling on the outcomes.
Coding values in spikes
One of the major distinctions between SNN implementations relates to what is referred to as “coding” – how a value is transformed into a stream of spikes. While there are several ways to do this, two appear to predominate many of the discussions: rate coding and temporal coding.
Rate coding takes a value and transforms it into a constant spike frequency for the duration of that value. The benefit of this approach is that classical training techniques can be used, with the resulting values then being transcoded for an SNN inference engine. Classical networks use an enormous amount of multiplication, which is energy-intensive. Spikes, by contrast, are simply accumulated, with no multiplication necessary. That said, each spike results in a synaptic-weight lookup, which also burns power, prompting Valentian to caution that it’s not clear that this approach is lower in power.
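As a toy sketch of rate coding (assuming values normalized to 0–1 and a fixed time window; real encoders vary), it might look like this:

import numpy as np

def rate_code(value, window_steps=100):
    # Deterministic rate coding: spike count over the window is proportional to the value.
    n_spikes = int(round(np.clip(value, 0.0, 1.0) * window_steps))
    spikes = np.zeros(window_steps, dtype=np.uint8)
    if n_spikes > 0:
        spikes[np.linspace(0, window_steps - 1, n_spikes).astype(int)] = 1
    return spikes

print(rate_code(0.25, window_steps=20))  # 5 spikes spread across 20 time steps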
Temporal coding is said by some to be closer to what happens in the brain, although there are differing opinions on that, with some saying that that’s the case only for a small set of examples: “It’s actually not that common in the brain,” said Jonathan Tapson, GrAI Matter’s chief scientific officer. An example where it is used is in owls’ ears. “They use their hearing to hunt at night, so their directional sensitivity has to be very high.” Instead of representing a value by a frequency of spikes, the value is encoded as the delay between spikes. Spikes then represent events, and the goal is to identify meaningful patterns in a stream of spikes.
A major challenge, however, is training, because classical training results cannot be transcoded into this type of SNN. There is no easily-obtained derivative of the spike train, making it impossible to use the gradient-descent approach to training. In general, Tapson said, “Temporal coding is horrible for electronics. It makes it hard to know if a calculation completes, and it is very slow.”
Temporally coded SNNs can be most effective when driven by sensors that generate temporal-coded data – that is, event-based sensors. Dynamic vision sensors (DVS) are examples. They don’t generate full frames of data on a frames-per-second basis. Instead, each pixel reports when its illumination changes by more than some threshold amount. This generates a “change” event, which then propagates through the network. Valentian said these also can be particularly useful in AR/VR applications for “visual odometry,” where inertial measurement units are too slow.
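A simplified, frame-based software model of the events such a sensor produces might look like the sketch below (real DVS pixels fire asynchronously in hardware; the threshold here is an arbitrary placeholder):

import numpy as np

def frames_to_events(prev_frame, cur_frame, t, threshold=0.15):
    # Each pixel emits an (x, y, polarity, t) event when its log-intensity change
    # exceeds the threshold; unchanged pixels emit nothing.
    delta = np.log1p(cur_frame.astype(np.float32)) - np.log1p(prev_frame.astype(np.float32))
    ys, xs = np.nonzero(np.abs(delta) > threshold)
    return [(int(x), int(y), 1 if delta[y, x] > 0 else -1, t) for x, y in zip(xs, ys)]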
It’s possible that temporally-coded SNNs could work with shallower networks than the 50 to 100 (or more) layers we’re seeing with classical networks. “The visual cortex is only six layers deep, although that system isn’t purely feed-forward,” Valentian said. “There’s some feedback, as well.” Still, he noted that what’s lacking here is a killer application that will provide the energy and funding required to push temporal coding forward.
Meanwhile, BrainChip started with rate coding, but decided that wasn’t commercially viable. Instead, it uses rank coding (or rank-order coding), which uses the order of arrival of spikes (as opposed to literal timing) to a neuron as a code. This is a pattern-oriented approach, with arrivals in the prescribed order (along with synaptic weighting) stimulating the greatest response and arrivals in other orders providing less stimulation.
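A toy illustration of the general rank-order idea, in which earlier arrivals contribute more, is sketched below; this is the textbook notion, not BrainChip's proprietary scheme, and the decay factor is made up for illustration:

import numpy as np

def rank_order_response(arrival_order, weights, decay=0.8):
    # arrival_order lists input indices in the order their spikes arrived.
    # Each successive rank is attenuated, so the "preferred" order scores highest.
    return sum(weights[idx] * (decay ** rank) for rank, idx in enumerate(arrival_order))

w = np.array([0.9, 0.5, 0.1])
print(rank_order_response([0, 1, 2], w))  # preferred order -> strongest response
print(rank_order_response([2, 1, 0], w))  # reversed order  -> weaker response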
All of these coding approaches aside, GrAI Matter uses a more direct approach. “We encode values directly as numbers – 8- or 16-bit integers in GrAI One or Bfloat16 in our upcoming chip. This is a key departure from other neuromorphic architectures, which have to use rate or population or time or ensemble codes. We can use those, too, but they are not efficient,” said Tapson.
Neurons
SNN neurons typically are implemented in one of two ways. The approaches are motivated by analog implementations, although they can be abstracted into digital equivalents. Arteris IP fellow and chief architect Michael Frank refers to this as “emulation.” He points to several challenges for an analog implementation: “With analog, you would need to customize the model to the specific chip for inference. No two transistors are the same. And at 7 nm, you can’t do analog.”
Tapson concurs. “For a large circuit, you need to be digital,” he said.
The idea behind the two abstract neural approaches is that a neuron evaluates a signal by accumulating spikes. The simplest implementation is called “integrate-and-fire” (IF). Each spike is accumulated in the neuron until a threshold is reached, at which point the neuron fires an output spike – that is, it creates an event that propagates downstream in the network (at least for a feed-forward configuration). Many of the academic projects ongoing implement this as a literal analog circuit, and in operation it’s philosophically similar to sigma-delta modulation.
The challenge here, especially for temporal coding, is that patterns may inadvertently appear over a long time period. What are actually two separate events in time may be interpreted as a single pattern, since early accumulation remains in place as new spikes arrive.
In order to neutralize older “obsolete” results as newer ones arrive, a “leaky integrate-and-fire” (LIF) circuit can be used. This means that accumulations gradually dissipate over time so that, given enough time between events, accumulation restarts from a low level.
Another element that can reverse accumulation is an inhibitory event. Accumulation assumes excitatory events that add to the accumulation, but inhibitory events accumulate negative values, reducing the level of accumulation.
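An idealized discrete-time version of the IF/LIF behaviour described above could be sketched as follows (threshold, leak factor and inputs are arbitrary illustration values):

class LIFNeuron:
    def __init__(self, threshold=1.0, leak=0.9):
        self.threshold = threshold
        self.leak = leak          # use leak=1.0 for a plain integrate-and-fire neuron
        self.potential = 0.0

    def step(self, weighted_input=0.0):
        # Excitatory events arrive as positive inputs, inhibitory events as negative ones.
        self.potential = self.potential * self.leak + weighted_input
        if self.potential >= self.threshold:
            self.potential = 0.0  # reset after firing
            return True           # emit an output event downstream
        return False

neuron = LIFNeuron()
print([neuron.step(x) for x in [0.4, 0.0, 0.4, -0.2, 0.6]])  # fires only on the last event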

Fig. 1: IF and LIF neuron behavior, idealized for illustration. Note that, in the second case, the threshold is never reached due to the leakage. Neurons may also have a refractory period during which they can accumulate but not fire. Source: Bryon Moyer/Semiconductor Engineering
Synapses

Synapse implementation will depend strongly on how a specific network is implemented. For analog implementations, a spike will result in a certain amount of current injected into or out of the neuron. The amount of current depends on the synaptic weight.
A team from CEA-Leti discussed an analog SNN using RRAM in a paper presented at the 2019 IEDM conference. While RRAM has been used in classical networks as a way of implementing in-memory computation of multiply-accumulate functions, its usage here is different. Eight cells are used, four each for excitation and inhibition, with anywhere from 0 to 4 of the resistors being programmed in a low-resistance state. Low resistance means more current and, hence, a stronger weight. The more cells in a low-resistance state, the greater the overall synaptic current. The following image shows the Leti synapse design.
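A very simplified behavioural model of that synapse might look like the sketch below (the conductance values are placeholders, not Leti's device data):

G_LRS, G_HRS = 1.0, 0.01  # relative conductances of low- and high-resistance cells (placeholders)

def synaptic_current(n_lrs_excitatory, n_lrs_inhibitory, v_spike=1.0):
    # 4 cells each for excitation and inhibition; more cells in the low-resistance
    # state means more current and hence a stronger weight.
    g_exc = n_lrs_excitatory * G_LRS + (4 - n_lrs_excitatory) * G_HRS
    g_inh = n_lrs_inhibitory * G_LRS + (4 - n_lrs_inhibitory) * G_HRS
    return v_spike * (g_exc - g_inh)  # net current injected into the neuron

print(synaptic_current(3, 1))  # strongly excitatory net weight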

Fig. 2: Leti’s synapse implementation. “HRS” stands for “high-resistance state”; “LRS” stands for “low-resistance state.” Source: CEA-Leti
An array of these cells is shown in Figure 3. Each synapse gets its own word line; currents are sensed through the bit lines.

Fig. 3: Leti’s synaptic array. Source: CEA-Leti
The currents are summed into the neuron as shown in Figure 4. The capacitor acts as the accumulator as the membrane voltage varies with the injected currents. Note that there are both positive and negative thresholds, meaning that the neuron can fire an excitation spike or an inhibition spike.

Fig. 4: Neuron accumulation in the presence of excitatory and inhibitory spikes. Source: CEA-Leti
In a digital implementation, the notion of a spike is an abstraction, and multiplication is still required to scale an incoming spike by a synaptic weight. GrAI Matter’s approach is shown in Figure 5.

Fig. 5: GrAI Matter’s digital neuron core. Source: GrAI Matter Labs
NoCs in the Circuit

For digital SNN emulations, the routing of spikes often happens through a network-on-chip, or NoC. NoCs are common in sophisticated systems-on-chip (SoCs), but those networks often carry large payloads. By contrast, spike data is very small. In fact, Arteris IP’s Frank said the packet headers may be longer than the payload itself.
Packets can be broadcast to the destination neurons with an identifying tag. Then receiving neurons will know which tag to pay attention to, giving the effect of multi-cast. In this way, spikes arrive at the intended neurons for processing, while other neurons ignore them. This gives the input side of the neuron a many-to-one relationship, while the output has a one-to-many relationship.
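The tag-based delivery can be modelled roughly as publish/subscribe, as in the sketch below (an illustration only, not any vendor's NoC protocol; the tags and payloads are made up):

from collections import defaultdict

class EventNetwork:
    def __init__(self):
        self.subscribers = defaultdict(list)   # tag -> neuron callbacks

    def subscribe(self, tag, neuron_callback):
        self.subscribers[tag].append(neuron_callback)

    def broadcast(self, tag, payload):
        # Every neuron subscribed to this tag processes the packet; others ignore it.
        for callback in self.subscribers[tag]:
            callback(payload)

net = EventNetwork()
net.subscribe("layer1/ch3", lambda p: print("neuron A got", p))
net.subscribe("layer1/ch3", lambda p: print("neuron B got", p))
net.broadcast("layer1/ch3", {"value": 0.7})   # delivered to A and B (multi-cast effect)
net.broadcast("layer2/ch0", {"value": 0.1})   # no subscribers, so it is simply ignored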
Frank indicated there should not be issues with collisions on the network. Sensor data is generated at a rate of around 500 samples per second, while the network is clocked at hundreds of megahertz. This leaves plenty of room for time-sharing data so that individual spike deliveries can appear to be concurrent. If there is any issue with collisions, Frank noted that the network can be divided into domains to reduce their impact.
Timing also has a role here. Frank noted that Intel’s Loihi network is asynchronous. “If you use a synchronous approach, it’s probably too high power for a large network.”
A selection of projects
The range of approaches to SNNs is illustrated by reviewing several of the more prominent ones. There are many more projects underway at academic institutions and possibly at other commercial companies as well, so this list will by no means be exhaustive.
We’ve already seen some of what CEA-Leti has been working on. Their IEDM paper claims this is the first full network implementation using spikes, analog neurons, and RRAM synapses. It’s a single-layer, fully-connected network with 10 output neurons corresponding to the 10 classes used for MNIST image classification. Inference is considered complete when the difference between the highest-spiking output and the next-highest-spiking one exceeds a threshold. They’ve shown an equivalence between this and the classical tanh activation function.
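That stopping rule amounts to a margin test on the output spike counts, roughly as sketched below (the margin value and counts are made up for illustration):

import numpy as np

def inference_complete(spike_counts, margin=5):
    # Done when the leading output neuron is ahead of the runner-up by at least `margin`.
    counts = np.asarray(spike_counts)
    runner_up, leader = np.sort(counts)[-2:]
    return (leader - runner_up) >= margin, int(np.argmax(counts))

print(inference_complete([2, 0, 14, 7, 1, 0, 3, 5, 2, 8]))  # (True, 2): class 2 wins by enough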
BrainChip has an all-digital implementation, which allows it to be implemented on any CMOS process (unlike analog). A conceptual view of their architecture is shown in Figure 6.

Fig. 6: BrainChip’s architecture. The Akida array is conceptual. It does not reflect the true number and arrangement of NPUs. Source: BrainChip
The neural fabric is fully configurable for different applications. Each node in the array contains four neural processing units (NPUs), and each NPU can be configured for event-based convolution (supporting standard or depthwise convolution) or for other configurations, including fully connected. Events are carried as packets on the network.
While NPU details or images are not available, BrainChip did further explain that each NPU has digital logic and SRAM, providing something of a processing-in-memory capability, but not using an analog-memory approach. An NPU contains eight neural processing engines that implement the neurons and synapses. Each event is multiplied by a synaptic weight upon entering a neuron.
The company noted that its use of event-domain convolution allows it to use IF neurons rather than LIF, since this approach results in much simpler hardware. In order to deal with the issue of straggling spikes creating an inadvertent pattern, BrainChip frames the time so that, once that frame is completed, subsequent spikes will start afresh.
Training is a topic the company does not talk much about. It refers to training as “semi-supervised.” BrainChip bases its proprietary learning algorithms on a training notion referred to as Spike Timing-Dependent Plasticity, or STDP, as well as some reinforcement learning concepts. It does the training with fully connected layers in a feed-forward manner that it says is orders of magnitude faster than what is typical with classical networks. The company also is working on unsupervised learning — that is, the ability to train a network without giving it pre-labeled samples — for its next generation architectures.
Unusually, BrainChip has the ability to do some further training in the field on a deployed device. It refers to this as “incremental training,” which leverages the existing training model but allows for the device to be further trained in the field. This is done by removing the last network layer (which does classification) and replacing it with a fully connected layer. The device can then “relearn” the existing classes (the last layer only, as prior layers remain unchanged) while adding new classes to the capabilities of the network. The company does this with labeled samples, but it can add new classes with a single image instead of hundreds or thousands of images.
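The "replace the last layer, keep the rest unchanged" idea is the same pattern as classical transfer learning. The generic Keras sketch below shows only that pattern; it is not BrainChip's on-chip mechanism or its MetaTF tooling, and the layer sizes are arbitrary:

import tensorflow as tf

base = tf.keras.Sequential([
    tf.keras.Input(shape=(64,)),
    tf.keras.layers.Dense(32, activation="relu", name="feature_extractor"),
    tf.keras.layers.Dense(10, activation="softmax", name="old_head"),   # original classes
])

for layer in base.layers[:-1]:
    layer.trainable = False                       # earlier layers stay unchanged

features = base.get_layer("feature_extractor").output
new_head = tf.keras.layers.Dense(12, activation="softmax", name="new_head")  # room for new classes
model = tf.keras.Model(base.input, new_head(features))
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(...) now updates only the new head's weights.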
GrAI Matter also is doing an all-digital implementation. It uses an on-chip packet-switched network to route the “spikes.” GrAI Matter’s overall architecture is shown below (the node implementation is shown above in Figure 5). The company trains its chip using classical techniques, converting the result to the GrAI Matter format for implementation.

Fig. 7: GrAI Matter’s architecture. Source: GrAI Matter Labs
Even though this is an event-based engine, the network has been optimized to deal with standard video streams instead of DVS event streams. In a manner similar to the ISSCC paper discussed in a prior article, these operate on the differences between frames rather than the full frames. That “diff” is taken both at the input and at each activation layer, creating an enormous amount of sparsity entering and flowing through the network.
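A toy sketch of that layer-wise "diff" idea follows; it illustrates the mechanism (update a layer only from changed inputs), not GrAI's hardware, and the sizes and threshold are arbitrary:

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((16, 8))

def relu(x):
    return np.maximum(x, 0.0)

prev_in = rng.standard_normal(16)
prev_pre = prev_in @ W1            # pre-activations kept between frames
prev_h = relu(prev_pre)

new_in = prev_in.copy()
new_in[:3] += 0.5                  # only three inputs change in the next "frame"
delta_in = new_in - prev_in
active = np.abs(delta_in) > 1e-6   # the input "events"

new_pre = prev_pre + delta_in[active] @ W1[active]   # update uses only the changed inputs
delta_h = relu(new_pre) - prev_h                     # changed activations become the next layer's events
print(f"{active.sum()}/{active.size} inputs changed; "
      f"{(np.abs(delta_h) > 1e-6).sum()}/{delta_h.size} hidden activations changed")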

Fig. 8: GrAI Matter processes only changed pixels in each successive layer. Source: GrAI Matter Labs
Finally, Intel has a sizable research project underway under the direction of Mike Davies, director of their Neuromorphic Computing Lab. Intel called the chip Loihi (lo-EE-hee), and other players in this space appear to be paying close attention.
This is an advanced project, and it operates very differently from the prior projects, appearing to be truly neuromorphic. Details on the architecture aren’t available, but the chip currently has 128 cores, which can be scaled to 4,096. Chips also can be scaled out to a maximum of 16,384 chips. Intel uses LIF neurons, routing spikes as packets on a NoC.
“We are continuing to work on advancing neuromorphic software and hardware, with the goal of eventual commercialization,” Davies said. “Because neuromorphic technology is still at a basic research stage, it’s hard to make firm predictions on the time frame for mainstream use. We hope to have some initial niche applications providing business value in the next few years and would be happy if our neuromorphic systems were starting to be sold commercially to a broad range of customers within a five-year time frame.”
State of the industry
In general, SNNs generate divided opinions. The amount of ongoing research is indicative of the level of industry interest, but not everyone has been quite so enthusiastic. Yann LeCun, a Facebook AI researcher, noted in a 2019 ISSCC presentation, “I’m very skeptical of this [SNNs].”
Others expressed concern, as well. “[Research] papers are aimed at much simpler models [than what are implemented with classical networks],” said Geoff Tate, CEO of Flex Logix. “It’s far from commercialization.”
It’s also not necessarily an either-or situation: “You could have a network that’s partly classical and partly SNN. An example is sensor fusion, with video as classical and sound as SNN,” said Leti’s Valentian.
Arteris IP’s Frank sees a future for SNNs. “SNNs have their domain where they will outrun a standard network. Even a digital emulation of an SNN is better than a classical CNN,” he said.
The success of early commercial entrants, as well as Intel’s Loihi research project, will be indicators of whether SNNs eventually can bring their much-anticipated power savings into the market for good.


Bryon Moyer
Bryon Moyer is a technology editor at Semiconductor Engineering. He has been involved in the electronics industry for more than 35 years. The first 25 were as an engineer and marketer at all levels of management, working for MMI, AMD, Cypress, Altera, Actel, Teja Technologies, and Vector Fabrics. His industry focus was on PLDs/FPGAs, EDA, multicore processing, networking, and software analysis. He has been an editor and freelance ghostwriter for more than 12 years, having previously written for EE Journal. His editorial coverage has added AI, security, MEMS and sensors, IoT, and semiconductor processing to his portfolio. His technical interests are broad, and he finds particular satisfaction in drawing useful parallels between seemingly unrelated fields. He has a BSEE from UC Berkeley and an MSEE from Santa Clara University. Away from work, Bryon enjoys music, photography, travel, cooking, hiking, and languages.



 
  • Like
  • Fire
  • Love
Reactions: 39 users

Boab

I wish I could paint like Vincent
@Baisyet


Nah, there is one that has 5 or so competing products on it, with Brainchip and Graimatter towards the bottom left of the slide
I think we are getting closer, but there is another one that includes other names?

[comparison slide attached]
 
  • Like
  • Fire
  • Love
Reactions: 19 users

uiux

Regular
  • Like
Reactions: 7 users

[quoted: buena suerte :-)'s post above, including the full Semiconductor Engineering article “Spiking Neural Networks: Research Projects Or Commercial Products?”]
When does Bryon have time off work to have 6 interests?
 
  • Like
  • Haha
Reactions: 4 users