Deleted member 118
Guest
Got a nice Guinness stew and dumplings cooking for later, that should cheer me up.
I guess someone got up on the wrong foot. Be happy it’s Friday. New financial year, weekend tomorrow!
Was saying this exact thing to one of the guys last night. This is shorts playing with themselves for sure. The trading volume is incredibly low.
Not many shares needed the last week or so to get our SP moving in either direction quite a lot.
I have been wondering too if that is an option with the M4 for those clients wanting an AI accelerator or add on white label IP (Akida) for their "own" NPU.
Just to be clear and for others not quite following,
Is there a possibility that the ARM M4 contains AKIDA IP (used as illustrated in some context in the above Infineon patent, which describes a neural network), which would explain the low power consumption?
And the fact that no NN details are provided just means you can't tell what's running the NN?
Ps I know I'm just repeating the above post but just dumbing it down so it can sink in for everyone. Correct me if I'm wrong.
Akida Ballista!!!
After the most recent AGM, I would have expected to see some sizable increase in the upcoming BRN company 4C revenue report, especially given the fact that Sean H was so adamant at that meeting that we were to start now watching the quarterly results for anticipated financial progress.
Goodbye to that Financial year... I'd suggest a lot of us made some profit, but I personally am not expecting to see any
revenue of any substance in the upcoming 4C being delivered later next month. If products had been released in the
previous 3 months, well, we, or someone, would have had that information and divulged it by now.
The only revenue would be from IP Licences...correct me if I've forgotten something, which does happen quite a bit these days, as in,
what did I come into this room for ?!!!
Depending on all this shorting behaviour and the "washing" that's been taking place this month (in my opinion) the share price could
well take another hit in late July, as the recurring pattern of no revenue, seems to send a message to shorters, downrampers to try to
savage our stock once again, and (in my opinion) this pattern will only change once "explosive revenue" shows its face, and is repeated
quarter on quarter....which I'm expecting like many of you.
My love affair with Brainchip is as strong as ever, too emotional for some ?....that thought just makes me laugh
From a perfect clear evening in Perth (but cool).....Tech x
Yes correct @Stable Genius, I also want to listen to this guy, especially after joining BRN! I listened to all his speeches and talks before joining our company and he was great!
Morning Rocket,
I’m looking forward to hearing from our Chief Marketing Officer.
He’s very experienced and important to our company’s success so I can’t wait to hear his thoughts and strategy going forward!
From memory he has a degree in psychology, which he applies with an organisational and business mindset.
I love psychology and enjoy seeing positivity pushed. You sell more with a positive mindset than a negative one, so this interview is coming at the right time!
Happy thoughts; Happy Friday!
Well, back in 2016, Kneron had a synchronous NN:
I have been wondering too if that is an option with the M4 for those clients wanting an AI accelerator or add on white label IP (Akida) for their "own" NPU.
Here is another case I'm trying to see how is set up but only just started digging.
These recent boards and latest NPUs from Kneron just come out the past month or so.
Maybe @Diogenese could cast an eye if it hasn't already been looked at or discussed yet?
There was something in the wording of their new board, which I've highlighted.
The NPU designed by ARM architecture...huh. Makes sense to say that about the M4 but the NPU?
Couple links below and the site has datasheet etc but doesn't explain much around the NPU and runs 2 X M4.
Did notice though the reference to LSTM in the supported models, so maybe not related at all?
Mini-AI-720
AI Edge Computing Module with Kneron KL720 NPU
Features
- Kneron KL720 NPU (Designed by ARM architecture)
- Mini card (PCIe[x1] interface,Full size)
- Accelerator for AI Edge Computing
- Enhanced performance to process high resolution video and graphic related computing
AI Edge Computing Module with Kneron KL720 NPU
www.aaeon.com
Kneron edge AI chips: local AI, private and secure
Kneron's edge AI SoCs enable low power AI processing on any device with a sensor. Since none of the data collected goes to the cloud to be inferred, data is private and secure. Contact us to learn more.
www.kneron.com
I believe they sell 40 or 50 million-ish a year (they never release exact sales numbers). Akida inside would definitely make me switch from Google to Apple.
Apple watchOS 9 making some changes to allow low power mode. Recently revealed that it is a hardware exclusive feature change. Chip is not changing so most likely "low power co-processor".......hmmmm... perfect fit (ZONEofTECH) Published yesterday
Please God let it be....
Well, back in 2016, Kneron had a synchronous NN:
US2017330069A1 MULTI-LAYER ARTIFICIAL NEURAL NETWORK AND CONTROLLING METHOD THEREOF
A multi-layer artificial neural network including a plurality of artificial neurons, a storage device, and a controller is provided. The plurality of artificial neurons are used for performing computation based on plural parameters. The storage device is used for storing plural sets of parameters, each set of parameters being corresponding to a respective layer. At a first time instant, the controller controls the storage device to provide a set of parameters corresponding to a first layer to the plurality of artificial neurons so that the plurality of artificial neurons form at least part of the first layer. At a second time instant, the controller controls the storage device to provide a set of parameters corresponding to a second layer to the plurality of artificial neurons so that the plurality of artificial neurons form at least part of the second layer.
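To make the abstract a bit more concrete, here is a minimal Python sketch of the idea as I read it: one bank of neurons is reused for every layer, and a controller swaps in a different stored parameter set at each time instant. The function name, the weights/biases layout and the ReLU activation are all my own assumptions for illustration; the abstract does not specify any of them.

```python
def run_time_multiplexed(inputs, layer_params):
    """Reuse one bank of neurons for every layer by swapping parameter sets."""
    activations = list(inputs)
    # Each loop iteration stands for one "time instant" in the abstract.
    for weights, biases in layer_params:
        # The "controller" hands this layer's parameters to the shared neurons.
        # ReLU is an assumed activation, not something the abstract states.
        activations = [
            max(0.0, sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical stored parameter sets: (weights, biases) per layer.
params = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),  # layer 1: 2 inputs -> 2 neurons
    ([[1.0, 1.0]], [0.5]),                    # layer 2: 2 inputs -> 1 neuron
]
result = run_time_multiplexed([2.0, 1.0], params)
print(result)
```

The point of the design is that the hardware only needs enough neurons for the widest layer, at the cost of running the layers one after another.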
I see they also dabbled in MemRistors:
US10839893B2 Memory cell with charge trap transistors and method thereof capable of storing data by trapping or detrapping charges
A memory cell includes a first charge trap transistor and a second charge trap transistor. The first charge trap transistor has a substrate, a first terminal coupled to a first bitline, a second terminal coupled to a signal line, a control terminal coupled to a wordline, and a dielectric layer formed between the substrate of the first charge trap transistor and the control terminal of the first charge trap transistor. The second charge trap transistor has a substrate, a first terminal coupled to the signal line, a second terminal coupled to a second bitline, a control terminal coupled to the wordline, and a dielectric layer between the substrate of the second charge trap transistor and the control terminal of the second charge trap transistor. Charges are either trapped to or detrapped from the dielectric layer of the first charge trap transistor when writing data to the memory cell.
More recently, they have been dabbling in back propagation training for NNs, but that document is in Chinese:
CN113240075A MSVL-based BP neural network construction and training method, and MSVL-based BP neural network construction and training system
But do they have anything we need to worry about?
https://www.kneron.com/en/news/blog/106/
Kneron Unveils Next-Gen AI Chip — No Compromise AI For Smart Devices
Kneron’s KL720 chip provides best-in-class performance, energy-efficiency, privacy, and security for consumer smart devices
San Diego, CA, August 27th, 2020
KL720 is not only the most powerful and energy-efficient chip Kneron has built, it also outclasses competing offerings. Compared to Intel’s Movidius AI chips, KL720 is twice as energy-efficient for similar performance and at half the cost. A DJI drone that currently uses a Movidius chip would double its battery life by using a Kneron chip, without any loss of power. Kneron’s solution can be used in devices that would not be practical for Intel’s chips, either because they’re too expensive or they require too much battery power to operate. KL720 is also 4x more efficient than Google’s Coral edge TPU according to MobileNetV2 benchmark results.
I have it on good authority from someone who lives in a barrel that for the following patent to be actioned they would need to already have access to a convolutional spiking neural network processor.
WHITE PAPER
Neuromorphic Computing Brings AI to the Edge
How conventional processor architecture is becoming a thing of the past
Connected devices driven by 5G and the Internet of Things (IoT) are everywhere from autonomous vehicles, smart homes, healthcare to space exploration. Devices are becoming more intelligent. Massive amounts of data from multiple sources need to be processed quickly, securely, and in real time, having low latency. Cloud-based architecture may not fulfil these needs of futuristic AI-based systems, that require intelligence at the edge and the ability to process sparse events. Neuromorphic computing resolves the issues of the conventional processor architecture or the von Neumann architecture by separating processing and memory units. It mimics the human brain and its cognitive functions such as interpretation, autonomous adaptation; as well as supports in-memory processing at higher speeds, complexity, and better energy efficiency. As research continues, neuromorphic processors will advance edge computing capabilities and bring AI closer to the edge.
Click on the "Read More" button to read the entire whitepaper.
Click on the contact icon at the bottom right to talk to our subject matter experts.
Read More
Arijit Mukherjee
Senior Scientist, TCS Research
Sounak Dey
Senior Scientist, TCS Research
Vedvyas Krishnamoorthy
Business Development Manager, Technology Business Unit, TCS
There are interesting things happening with Tata. The first is they appear to be in line to produce components and electronics for the India Apple iPhone. The second is what their company Titan is planning to bring to market in India in the wearables area:
I have it on good authority from someone who lives in a barrel that for the following patent to be actioned they would need to already have access to a convolutional spiking neural network processor.
Now some might completely discount the following as proving, or even pointing to, Brainchip providing this chip to Tata, but I am not in that camp: Arijit Mukherjee from the above article is one of the inventors of the following patent, he was a member of the Brainchip Tata team that presented a joint demonstration on 14.12.19 of AKIDA technology performing live gesture recognition, and Brainchip has the only commercially available, patent-protected convolutional spiking neural network chip in the world, 3 years ahead of anyone else.
This is one huge statement for TATA to make in my opinion: "Neuromorphic Computing Brings AI to the Edge How conventional processor architecture is becoming a thing of the past".
My opinion only DYOR
FF
AKIDA BALLISTA
US Patent for System and method of gesture recognition using a reservoir based convolutional spiking neural network Patent (Patent # 11,256,954 issued February 22, 2022) - Justia Patents Search
System and method of gesture recognition using a reservoir based convolutional spiking neural network
Dec 17, 2020
This disclosure relates to method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network. A two-dimensional spike streams is received from neuromorphic event camera as an input. The two-dimensional spike streams associated with at least one gestures from a plurality of gestures is preprocessed to obtain plurality of spike frames. The plurality of spike frames is processed by a multi layered convolutional spiking neural network to learn plurality of spatial features from the at least one gesture. A filter block is deactivated from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt. A spatio-temporal features is obtained by allowing the spike activations from CSNN layer to flow through the reservoir. The spatial feature is classified by classifier from the CSNN layer and the spatio-temporal features from the reservoir to obtain set of prioritized gestures.
Description
PRIORITY CLAIM
This U.S. patent application claims priority under 35 U.S.C. § 119 to: India Application No. 202021025784, filed on Jun. 18, 2020. The entire contents of the aforementioned application are incorporated herein by reference.
TECHNICAL FIELD
This disclosure relates generally to gesture recognition, and, more particularly, to system and method of gesture recognition using a reservoir based convolutional spiking neural network.
BACKGROUND
In an age of artificial intelligence, robots and drones are key enablers of task automation and they are being used in various domains such as manufacturing, healthcare, warehouses, disaster management etc. As a consequence, they often need to share work-space with and interact with human workers and thus evolving the area of research named Human Robot Interaction (HRI). Problems in this domain are mainly centered around learning and identifying of gestures/speech/intention of human coworkers along with classical problems of learning and identification of surrounding environment (and obstacles, objects etc. therein). All these essentially are needed to be done in a dynamic and noisy practical work environment. As of current state of the art vision based solutions using artificial neural networks (including deep neural networks) have high accuracy, however the models are not the most efficient solutions as learning methods and inference frameworks of the conventional deep neural networks require huge amount of training data and are typically compute and energy intensive. They are also bounded by one or more conventional architectures that leads to data transfer bottleneck between memory and processing units and related power consumption issues. Hence, this genre of solutions does not really help robots and drones to do their jobs as they are classically constrained by their battery life.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one aspect, a processor implemented method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network is provided. The processor implemented method includes at least one of: receiving, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocessing, via one or more hardware processors, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivating, via the one or more hardware processors, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtaining, via the one or more hardware processors, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams are represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. 
In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
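As a rough illustration of the preprocessing step described above (AER record in, spike frames out), here is a small Python sketch. The event layout `(x, y, timestamp_us, polarity)`, the fixed frame length and the function name are assumptions of mine; the patent text quoted here does not say how the frames are actually built.

```python
def aer_to_spike_frames(events, width, height, frame_us):
    """Bin (x, y, t_us, polarity) events into binary spike frames."""
    if not events:
        return []
    t0 = min(t for _, _, t, _ in events)
    n_frames = max((t - t0) // frame_us for _, _, t, _ in events) + 1
    frames = [[[0] * width for _ in range(height)] for _ in range(n_frames)]
    for x, y, t, _polarity in events:
        # Polarity is ignored in this sketch; a pixel is marked if it fired.
        frames[(t - t0) // frame_us][y][x] = 1
    return frames


# Hypothetical events from a 2x2 sensor, timestamps in microseconds.
events = [(0, 0, 100, 1), (1, 0, 150, 0), (1, 1, 1200, 1)]
frames = aer_to_spike_frames(events, width=2, height=2, frame_us=1000)
for frame in frames:
    print(frame)
```

Each frame is then what a convolutional spiking layer would slide its windows over.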
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
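The spike compression mentioned at the start of that paragraph can be pictured with a few lines of Python. This is only a sketch under my own assumptions (one neuron's spike train as a list of 0/1 values, a hypothetical `window` and `stride` measured in time steps); the patent does not give these details in the text quoted here.

```python
def compress_spike_train(spikes, window, stride):
    """Sum spikes inside a sliding time window to coarsen time granularity."""
    return [
        sum(spikes[start:start + window])
        for start in range(0, len(spikes) - window + 1, stride)
    ]


train = [0, 1, 1, 0, 0, 1, 0, 0]  # one neuron, 8 time steps
compressed = compress_spike_train(train, window=4, stride=2)
print(compressed)  # [2, 2, 1]
```

Eight time steps become three output values, so later stages see fewer frames while still keeping a count of how active the neuron was.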
In another aspect, there is provided a system to identify a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network. The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces. The one or more hardware processors are configured by the instructions to: receive, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocess, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; process, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivate, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtain, spatiotemporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classify, by a classifier, the at least one of spatial feature from the CSNN layer and the spatiotemporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams is represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. 
In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
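For anyone wondering what a trace-based STDP rule looks like in practice, below is a generic pair-based sketch in Python with one pre-synaptic and one post-synaptic trace. To be clear, this is a textbook-style form and an assumption on my part, not the specific "two trace STDP" rule of the patent, which is not spelled out in the text quoted here. All parameter names and values are hypothetical.

```python
import math


def stdp_two_trace(pre_spikes, post_spikes, w=0.5, a_plus=0.05,
                   a_minus=0.05, tau=20.0, dt=1.0):
    """Update one synaptic weight using a pre trace and a post trace."""
    decay = math.exp(-dt / tau)
    pre_trace = post_trace = 0.0
    for pre, post in zip(pre_spikes, post_spikes):
        # Both traces fade exponentially between spikes.
        pre_trace *= decay
        post_trace *= decay
        if pre:
            w -= a_minus * post_trace  # pre arriving after post: weaken
            pre_trace += 1.0
        if post:
            w += a_plus * pre_trace    # post firing after pre: strengthen
            post_trace += 1.0
        w = min(1.0, max(0.0, w))      # keep the weight in [0, 1]
    return w


# Input spike first, output spike two steps later: weight should grow.
w_causal = stdp_two_trace([1, 0, 0], [0, 0, 1])
# Output spike first, input spike two steps later: weight should shrink.
w_anticausal = stdp_two_trace([0, 0, 1], [1, 0, 0])
print(w_causal, w_anticausal)
```

No labels are involved anywhere, which is what makes the rule unsupervised: the weight moves purely on the relative timing of spikes.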
In yet another aspect, there are provided one or more non-transitory machine readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors causes at least one of: receiving, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocessing, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivating, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtaining, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams are represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
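The reservoir in the last sentence can be sketched as a fixed, sparse, randomly wired recurrent layer of leaky spiking neurons. The Python below is only an illustration under my own assumptions: the leaky integrate-and-fire update, the threshold, the leak factor, the connection density and every name are hypothetical and not taken from the patent.

```python
import random


def build_reservoir(n_neurons, density, seed=0):
    """Sparse random recurrent weights; self-connections are left at zero."""
    rng = random.Random(seed)
    return [
        [rng.uniform(-1.0, 1.0) if i != j and rng.random() < density else 0.0
         for j in range(n_neurons)]
        for i in range(n_neurons)
    ]


def run_reservoir(input_frames, w_in, w_rec, threshold=1.0, leak=0.9):
    """Drive the reservoir with input spikes and record its spike states."""
    n = len(w_rec)
    potential = [0.0] * n
    prev = [0] * n
    states = []
    for spikes_t in input_frames:
        new = []
        for i in range(n):
            drive = sum(w_in[i][k] * s for k, s in enumerate(spikes_t))
            recur = sum(w_rec[i][j] * prev[j] for j in range(n))
            potential[i] = leak * potential[i] + drive + recur
            if potential[i] >= threshold:
                new.append(1)
                potential[i] = 0.0  # reset after firing
            else:
                new.append(0)
        prev = new
        states.append(new)
    return states


rng = random.Random(1)
n_inputs, n_neurons = 3, 8
w_in = [[rng.uniform(0.0, 1.5) for _ in range(n_inputs)]
        for _ in range(n_neurons)]
w_rec = build_reservoir(n_neurons, density=0.2, seed=0)
input_frames = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 0]]
states = run_reservoir(input_frames, w_in, w_rec)
for state in states:
    print(state)
```

Three input channels are projected onto eight recurrent neurons, and because each step feeds on the previous step's spikes, the recorded states carry some memory of earlier inputs. That is the sort of expanded spatio-temporal embedding a classifier would then read.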
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed....
Claims
1. A processor implemented method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network, comprising:
receiving, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocessing, via one or more hardware processors, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivating, via the one or more hardware processors, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtaining, via the one or more hardware processors, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures.
2. The processor implemented method of claim 1, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
3. The processor implemented method of claim 1, wherein a plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
4. The processor implemented method of claim 1, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
5. The processor implemented method of claim 1, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
6. The processor implemented method of claim 1, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
7. A system (100) to identify a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network, comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocess, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
process, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivate, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtain, spatiotemporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classify, by a classifier, the at least one of spatial feature from the CSNN layer and the spatiotemporal features from the reservoir to obtain a set of prioritized gestures.
8. The system (100) of claim 7, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
9. The system (100) of claim 7, wherein plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
10. The system (100) of claim 7, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
11. The system (100) of claim 7, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
12. The system (100) of claim 7, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
13. One or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors perform actions comprising:
receiving, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocessing, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivating, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtaining, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures.
14. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
15. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
16. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
17. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
18. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
References Cited
U.S. Patent Documents
6028626 February 22, 2000 Aviv
6236736 May 22, 2001 Crabtree
6701016 March 2, 2004 Jojic
7152051 December 19, 2006 Commons
7280697 October 9, 2007 Perona
8504361 August 6, 2013 Collobert
8811726 August 19, 2014 Belhumeur
8942466 January 27, 2015 Petre et al.
9015093 April 21, 2015 Commons
9299022 March 29, 2016 Buibas et al.
Foreign Patent Documents
109144260 January 2019 CN
WO2019074532 April 2019 WO
Other references
- Panda, Priyadarshini et al., “Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model,” Frontiers in Neuroscience, Oct. 2017, Publisher: Arxiv Link: https://arxiv.org/pdf/1710.07354.pdf.
Patent History
Patent number: 11256954
Type: Grant
Filed: Dec 17, 2020
Date of Patent: Feb 22, 2022
Patent Publication Number: 20210397878
Assignee: Tata Consultancy Services Limited (Mumbai)
Inventors: Arun George (Bangalore), Dighanchal Banerjee (Kolkata), Sounak Dey (Kolkata), Arijit Mukherjee (Kolkata)
Primary Examiner: Yosef Kassa
Application Number: 17/124,584
Classifications
Current U.S. Class: Intrusion Detection (348/152)
International Classification: G06K 9/62 (20060101); G06K 9/00 (20060101); G06N 3/04 (20060101);
Just made a chocolate and walnut brownie cake for dessert later as well, just needs popping in the oven and served with either custard or ice cream
Is that suet dumplings, Rocket? Just love em, not what one would call gourmet cuisine these days, but oh, they are so good, cheer you up, indeed, they will. Got an appetite now, wonder if we have some suet and let’s see if I can win some loose change from them to get some extra BRN on Monday
hotty...
I have it on good authority from someone who lives in a barrel that for the following patent to be actioned they would need to already have access to a convolutional spiking neural network processor.
Now some might completely discount the following facts as proving, or even pointing to, Brainchip providing this chip to Tata, but I am not in that camp: Arijit Mukherjee from the above article is one of the inventors of the following patent; he was a member of the Brainchip Tata team that presented a joint demonstration on 14.12.19 of AKIDA technology performing live gesture recognition; and Brainchip has the only commercially available, patent protected convolutional spiking neural network chip in the world, 3 years ahead of anyone else.
This is one huge statement for TATA to make in my opinion: "Neuromorphic Computing Brings AI to the Edge How conventional processor architecture is becoming a thing of the past".
My opinion only DYOR
FF
AKIDA BALLISTA
US Patent for System and method of gesture recognition using a reservoir based convolutional spiking neural network Patent (Patent # 11,256,954 issued February 22, 2022) - Justia Patents Search
System and method of gesture recognition using a reservoir based convolutional spiking neural network
Dec 17, 2020
This disclosure relates to method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network. A two-dimensional spike streams is received from neuromorphic event camera as an input. The two-dimensional spike streams associated with at least one gestures from a plurality of gestures is preprocessed to obtain plurality of spike frames. The plurality of spike frames is processed by a multi layered convolutional spiking neural network to learn plurality of spatial features from the at least one gesture. A filter block is deactivated from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt. A spatio-temporal features is obtained by allowing the spike activations from CSNN layer to flow through the reservoir. The spatial feature is classified by classifier from the CSNN layer and the spatio-temporal features from the reservoir to obtain set of prioritized gestures.
Description
PRIORITY CLAIM
This U.S. patent application claims priority under 35 U.S.C. § 119 to: India Application No. 202021025784, filed on Jun. 18, 2020. The entire contents of the aforementioned application are incorporated herein by reference.
TECHNICAL FIELD
This disclosure relates generally to gesture recognition, and, more particularly, to system and method of gesture recognition using a reservoir based convolutional spiking neural network.
BACKGROUND
In an age of artificial intelligence, robots and drones are key enablers of task automation and they are being used in various domains such as manufacturing, healthcare, warehouses, disaster management etc. As a consequence, they often need to share work-space with and interact with human workers and thus evolving the area of research named Human Robot Interaction (HRI). Problems in this domain are mainly centered around learning and identifying of gestures/speech/intention of human coworkers along with classical problems of learning and identification of surrounding environment (and obstacles, objects etc. therein). All these essentially are needed to be done in a dynamic and noisy practical work environment. As of current state of the art vision based solutions using artificial neural networks (including deep neural networks) have high accuracy, however the models are not the most efficient solutions as learning methods and inference frameworks of the conventional deep neural networks require huge amount of training data and are typically compute and energy intensive. They are also bounded by one or more conventional architectures that leads to data transfer bottleneck between memory and processing units and related power consumption issues. Hence, this genre of solutions does not really help robots and drones to do their jobs as they are classically constrained by their battery life.
SUMMARY
Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one aspect, a processor implemented method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network is provided. The processor implemented method includes at least one of: receiving, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocessing, via one or more hardware processors, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivating, via the one or more hardware processors, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtaining, via the one or more hardware processors, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams are represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. 
In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
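For anyone trying to picture the preprocessing step described above, here is a minimal Python sketch of turning an AER record into spike frames. The quoted patent text does not give the event layout or the frame length, so the `Event` fields, the `frame_us` bin width and the decision to ignore polarity are all assumptions made for illustration, not the inventors' implementation.

```python
from collections import namedtuple

# Hypothetical AER event layout (an assumption, not taken from the patent):
# timestamp in microseconds, pixel coordinates, and polarity (+1 or -1).
Event = namedtuple("Event", ["t_us", "x", "y", "polarity"])


def aer_to_spike_frames(events, width, height, frame_us):
    """Bin AER events into binary spike frames, each frame_us microseconds long."""
    if not events:
        return []
    t0 = min(e.t_us for e in events)
    t_end = max(e.t_us for e in events)
    n_frames = (t_end - t0) // frame_us + 1
    # frames[k][y][x] is 1 if pixel (x, y) produced any event in time bin k.
    frames = [[[0] * width for _ in range(height)] for _ in range(n_frames)]
    for e in events:
        k = (e.t_us - t0) // frame_us
        frames[k][e.y][e.x] = 1  # polarity is ignored in this sketch
    return frames


events = [Event(0, 1, 0, 1), Event(400, 2, 1, -1), Event(1200, 0, 1, 1)]
frames = aer_to_spike_frames(events, width=3, height=2, frame_us=1000)
```

With a 1000 microsecond bin the three example events fall into two frames; a real event camera stream would be binned the same way, just with far more events per frame.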
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
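The "accumulating spikes at a sliding window of time" idea in the paragraph above can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the `window` and `stride` parameters and the choice to keep spike counts (rather than a 0/1 flag per window) are mine, since the quoted text does not specify them.

```python
def compress_spike_train(spikes, window, stride):
    """Accumulate one neuron's binary spike train over a sliding time window.

    Returns the spike count per window, so the output has fewer time steps
    (reduced time granularity) than the input.
    """
    counts = []
    for start in range(0, len(spikes) - window + 1, stride):
        counts.append(sum(spikes[start:start + window]))
    return counts


train = [0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
compressed = compress_spike_train(train, window=4, stride=4)
```

Here twelve time steps shrink to three values. Setting `stride` smaller than `window` gives overlapping windows, and thresholding the counts would turn the output back into spikes.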
In another aspect, there is provided a system to identify a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network. The system comprises a memory storing instructions; one or more communication interfaces; and one or more hardware processors coupled to the memory via the one or more communication interfaces. The one or more hardware processors are configured by the instructions to: receive, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocess, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; process, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivate, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtain, spatiotemporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classify, by a classifier, the at least one of spatial feature from the CSNN layer and the spatiotemporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams is represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. 
In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
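The paragraph above names an "unsupervised two trace STDP learning rule" but the quoted text gives no equations. As a rough guide to what such a rule usually looks like, here is a sketch of the textbook pair-based STDP rule with one pre-synaptic and one post-synaptic trace for a single synapse. All constants (`a_plus`, `a_minus`, the time constants, the weight bounds) are placeholder assumptions, and the patent's actual rule may differ.

```python
import math


def stdp_two_trace(pre_spikes, post_spikes, w=0.5, a_plus=0.05, a_minus=0.04,
                   tau_pre=20.0, tau_post=20.0, dt=1.0):
    """Update one synaptic weight with a trace-based STDP rule (illustrative)."""
    x_pre = 0.0   # trace of recent pre-synaptic spikes
    x_post = 0.0  # trace of recent post-synaptic spikes
    decay_pre = math.exp(-dt / tau_pre)
    decay_post = math.exp(-dt / tau_post)
    for pre, post in zip(pre_spikes, post_spikes):
        # Both traces decay exponentially every time step.
        x_pre *= decay_pre
        x_post *= decay_post
        if pre:
            w -= a_minus * x_post  # pre arrives after post: depress
            x_pre += 1.0
        if post:
            w += a_plus * x_pre    # post fires after pre: potentiate
            x_post += 1.0
        w = min(1.0, max(0.0, w))  # keep the weight in [0, 1]
    return w


w_causal = stdp_two_trace([1, 0, 0, 0], [0, 0, 1, 0])  # pre leads post
w_anti = stdp_two_trace([0, 0, 1, 0], [1, 0, 0, 0])    # post leads pre
```

When the input spike comes first the weight ends above its starting value of 0.5, and when the order is reversed it ends below, which is the basic "fire together, wire together" behaviour that lets filters learn without labels.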
In yet another aspect, there are provided one or more non-transitory machine readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors causes at least one of: receiving, from a neuromorphic event camera, two-dimensional spike streams as an input; preprocessing, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames; processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture; deactivating, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt; obtaining, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir; and classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures. In an embodiment, the two-dimensional spike streams are represented as an address event representation (AER) record. In an embodiment, each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers. In an embodiment, the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism. In an embodiment, the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof.
In an embodiment, the spike streams may be compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity. In an embodiment, plurality of learned different spatially co-located features may be distributed on the plurality of filters from the plurality of filter blocks. In an embodiment, a special node between filters of the filter block may be configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters. In an embodiment, a plurality of weights of a synapse between input and the CSNN layer may be learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer. In an embodiment, the reservoir may include a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
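The last sentence above describes the reservoir as sparse, random, recurrent connectivity that projects input spikes into a larger spatio-temporal embedding. A minimal sketch of that idea follows, assuming simple leaky integrate-and-fire units; the connection probabilities, weight ranges, threshold and leak are invented for illustration and are not values from the patent.

```python
import random


def make_reservoir(n_in, n_res, p_in=0.3, p_rec=0.1, seed=0):
    """Build sparse random input and recurrent weight matrices (illustrative)."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(0.5, 1.0) if rng.random() < p_in else 0.0
             for _ in range(n_in)] for _ in range(n_res)]
    # Recurrent (cyclic) links between reservoir neurons, no self-connections.
    w_rec = [[rng.uniform(-0.5, 0.5) if (i != j and rng.random() < p_rec) else 0.0
              for j in range(n_res)] for i in range(n_res)]
    return w_in, w_rec


def run_reservoir(spike_frames, w_in, w_rec, threshold=1.0, leak=0.9):
    """Feed flat binary spike frames through the reservoir, one per time step."""
    n_res = len(w_in)
    v = [0.0] * n_res   # membrane potentials
    prev = [0] * n_res  # reservoir spikes from the previous step
    states = []
    for frame in spike_frames:
        now = []
        for i in range(n_res):
            drive = sum(w * s for w, s in zip(w_in[i], frame))
            recur = sum(w * s for w, s in zip(w_rec[i], prev))
            v[i] = leak * v[i] + drive + recur
            if v[i] >= threshold:
                now.append(1)
                v[i] = 0.0  # reset after a spike
            else:
                now.append(0)
        prev = now
        states.append(now)
    return states


w_in, w_rec = make_reservoir(n_in=4, n_res=16)
states = run_reservoir([[1, 0, 0, 1], [0, 1, 0, 0], [0, 0, 0, 0]], w_in, w_rec)
```

Each 4-element input frame becomes a 16-element reservoir state that also depends on earlier frames through the leak and the recurrent links. A classifier would then be trained on these states, alongside the spatial features, to rank the gestures.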
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed....
Claims
1. A processor implemented method of identifying a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network, comprising:
receiving, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocessing, via one or more hardware processors, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivating, via the one or more hardware processors, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtaining, via the one or more hardware processors, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures.
2. The processor implemented method of claim 1, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
3. The processor implemented method of claim 1, wherein a plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
4. The processor implemented method of claim 1, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
5. The processor implemented method of claim 1, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
6. The processor implemented method of claim 1, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
7. A system (100) to identify a gesture from a plurality of gestures using a reservoir based convolutional spiking neural network, comprising:
a memory (102) storing instructions;
one or more communication interfaces (106); and
one or more hardware processors (104) coupled to the memory (102) via the one or more communication interfaces (106), wherein the one or more hardware processors (104) are configured by the instructions to:
receive, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocess, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
process, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivate, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtain, spatiotemporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classify, by a classifier, the at least one of spatial feature from the CSNN layer and the spatiotemporal features from the reservoir to obtain a set of prioritized gestures.
8. The system (100) of claim 7, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
9. The system (100) of claim 7, wherein plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
10. The system (100) of claim 7, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
11. The system (100) of claim 7, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
12. The system (100) of claim 7, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
13. One or more non-transitory machine-readable information storage mediums comprising one or more instructions which when executed by one or more hardware processors perform actions comprising:
receiving, from a neuromorphic event camera, two-dimensional spike streams as an input, wherein the two-dimensional spike streams are represented as an address event representation (AER) record;
preprocessing, the address event representation (AER) record associated with at least one gestures from a plurality of gestures to obtain a plurality of spike frames;
processing, by a multi layered convolutional spiking neural network, the plurality of spike frames to learn a plurality of spatial features from the at least one gesture, wherein each sliding convolutional window in the plurality of spike frames are connected to a neuron corresponding to a filter among plurality of filters corresponding to a filter block among plurality of filter blocks in each convolutional layer from plurality of convolutional layers;
deactivating, at least one filter block from the plurality of filter blocks corresponds to at least one gesture which are not currently being learnt, wherein the plurality of filter blocks are configured to concentrate a plurality of class-wise spatial features to the filter block for learning associated patterns based on a long-term lateral inhibition mechanism;
obtaining, spatio-temporal features by allowing the spike activations from a CSNN layer to flow through the reservoir, wherein the CSNN layer is stacked to provide at least one of: (i) a low-level spatial features, (ii) a high-level spatial features, or combination thereof; and
classifying, by a classifier, the at least one of spatial feature from the CSNN layer and the spatio-temporal features from the reservoir to obtain a set of prioritized gestures.
14. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein the spike streams are compressed per neuronal level by accumulating spikes at a sliding window of time, to obtain a plurality of output frames with reduced time granularity.
15. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a plurality of learned different spatially co-located features are distributed on the plurality of filters from the plurality of filter blocks.
16. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a special node between filters of the filter block is configured to switch between different filters based on an associated decay constant to distribute learning of different spatially co-located features on the different filters.
17. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein a plurality of weights of a synapse between input and the CSNN layer are learned using an unsupervised two trace STDP learning rule upon at least one spiking activity of the input layer.
18. The one or more non-transitory machine-readable information storage mediums of claim 13, wherein the reservoir comprises a sparse random cyclic connectivity which acts as a random projection of the input spikes to an expanded spatio-temporal embedding.
References Cited
U.S. Patent Documents
6028626 February 22, 2000 Aviv
6236736 May 22, 2001 Crabtree
6701016 March 2, 2004 Jojic
7152051 December 19, 2006 Commons
7280697 October 9, 2007 Perona
8504361 August 6, 2013 Collobert
8811726 August 19, 2014 Belhumeur
8942466 January 27, 2015 Petre et al.
9015093 April 21, 2015 Commons
9299022 March 29, 2016 Buibas et al.
Foreign Patent Documents
109144260 January 2019 CN
WO2019074532 April 2019 WO
Other references
- Panda, Priyadarshini et al., “Learning to Recognize Actions from Limited Training Examples Using a Recurrent Spiking Neural Model,” Frontiers in Neuroscience, Oct. 2017, Publisher: Arxiv Link: https://arxiv.org/pdf/1710.07354.pdf.
Patent History
Patent number: 11256954
Type: Grant
Filed: Dec 17, 2020
Date of Patent: Feb 22, 2022
Patent Publication Number: 20210397878
Assignee: Tata Consultancy Services Limited (Mumbai)
Inventors: Arun George (Bangalore), Dighanchal Banerjee (Kolkata), Sounak Dey (Kolkata), Arijit Mukherjee (Kolkata)
Primary Examiner: Yosef Kassa
Application Number: 17/124,584
Classifications
Current U.S. Class: Intrusion Detection (348/152)
International Classification: G06K 9/62 (20060101); G06K 9/00 (20060101); G06N 3/04 (20060101);