GPT SNN
This paper was referred to by Dylan Muir (Synsense) in the recent interview with Sally Ward-Foxton discussing the use of SNNs in GPT. Who's in the right place at the right time?
2302.13939.pdf (arxiv.org) https://arxiv.org/pdf/2302.13939.pdf
As the size of large language models continues to scale, so do the computational resources required to run them. Spiking Neural Networks (SNNs) have emerged as an energy-efficient approach to deep learning that leverages sparse and event-driven activations to reduce the computational overhead associated with model inference. While they have become competitive with non-spiking models on many computer vision tasks, SNNs have also proven to be more challenging to train. As a result, their performance lags behind modern deep learning, and we are yet to see the effectiveness of SNNs in language generation. In this paper, inspired by the Receptance Weighted Key Value (RWKV) language model, we successfully implement ‘SpikeGPT’, a generative language model with binary, event-driven spiking activation units. We train the proposed model in two variants: 45M and 216M parameters. To the best of our knowledge, SpikeGPT is the largest backpropagation-trained SNN model to date, rendering it suitable for both the generation and comprehension of natural language. We achieve this by modifying the transformer block to replace multi-head self-attention, reducing quadratic computational complexity O(N²) to linear complexity O(N) with increasing sequence length. Input tokens are instead streamed in sequentially to our attention mechanism (as with typical SNNs). Our preliminary experiments show that SpikeGPT remains competitive with non-spiking models on tested benchmarks, while using 20× fewer operations when processed on neuromorphic hardware that can leverage sparse, event-driven activations.
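The linear-complexity claim rests on streaming tokens through a small recurrent state instead of building an N×N attention matrix. As a rough illustration of what an RWKV-style serial recurrence looks like, here is a minimal NumPy sketch; the names (k, v, w, u), the exp(-exp(w)) decay and the "bonus" term are assumptions for illustration, not the authors' exact SpikeGPT equations.

```python
import numpy as np

def rwkv_style_mixing(k, v, w, u):
    """Linear-time, sequential token mixing in the spirit of RWKV (illustrative sketch).

    k, v : (T, d) per-token "key" and "value" features.
    w    : (d,) learned per-channel decay parameter.
    u    : (d,) learned per-channel bonus applied to the current token.
    Returns a (T, d) array of mixed outputs.
    """
    T, d = k.shape
    decay = np.exp(-np.exp(w))      # keep the per-channel decay strictly in (0, 1)
    num = np.zeros(d)               # running weighted sum of values
    den = np.zeros(d)               # running normaliser
    out = np.empty((T, d))
    for t in range(T):
        cur = np.exp(u + k[t])      # current token gets an extra bonus weight
        out[t] = (num + cur * v[t]) / (den + cur)
        # O(1) state update per token -> O(T) overall, no T x T score matrix
        num = decay * num + np.exp(k[t]) * v[t]
        den = decay * den + np.exp(k[t])
    return out
```

The point of the sketch is the cost profile: each token only touches the running (num, den) state, so memory and compute grow linearly with sequence length.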
...
Recall Self-Attention. The self-attention operation lies at the heart of Transformers: it takes an input sequence X and applies scaled dot-product attention.
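For contrast with the linear recurrence above, this is the standard scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V, which forms an N×N score matrix. A minimal NumPy sketch (the projection matrices W_Q, W_K, W_V are placeholders, not values from the paper):

```python
import numpy as np

def scaled_dot_product_attention(X, W_Q, W_K, W_V):
    """Standard (quadratic) self-attention over a sequence X of shape (N, d_model)."""
    Q, K, V = X @ W_Q, X @ W_K, X @ W_V        # project tokens to queries/keys/values
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # (N, N) matrix -> O(N^2) in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                         # weighted sum of values
```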
Artificial Neural Networks (ANNs) have recently achieved widespread, public-facing impact in Natural Language Processing (NLP), but this has come with a significant computational and energy consumption burden across training and deployment. As an example, training GPT-3 was projected to use 190,000 kWh of energy [1; 2; 3]. Deploying ChatGPT into every modern word processor will see millions of users in need of on-demand inference of large language models [4]. SNNs, inspired by neuroscientific models of neuronal firing, offer a more energy-efficient alternative by using discrete spikes to compute and transmit information [5]. Spike-based computing combined with neuromorphic hardware holds great potential for low-energy AI [6 (Mike Davies, Loihi); 7; 8], and its effectiveness in integration with deep learning has been demonstrated through numerous studies [9; 10; 11; 12; 13; 14]. At this stage, the performance of SNNs in NLP and generation tasks remains relatively under-investigated. While SNNs have shown competitiveness in computer vision tasks such as classification and object detection [15; 16; 17], they have yet to attain similar success in generative models. ...
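To make "discrete spikes" concrete: a leaky integrate-and-fire (LIF) unit only emits a binary event when its membrane potential crosses a threshold, so downstream computation is triggered sparsely. A rough sketch, where the decay beta and threshold are illustrative defaults rather than SpikeGPT's actual neuron parameters:

```python
import numpy as np

def lif_forward(inputs, beta=0.9, threshold=1.0):
    """Leaky integrate-and-fire neurons over T timesteps (illustrative sketch).

    inputs : (T, d) array of input currents.
    Returns a (T, d) binary spike train.
    """
    T, d = inputs.shape
    mem = np.zeros(d)                     # membrane potential per neuron
    spikes = np.zeros((T, d))
    for t in range(T):
        mem = beta * mem + inputs[t]      # leaky integration of input current
        fired = mem >= threshold          # binary, event-driven activation
        spikes[t] = fired.astype(float)
        mem = np.where(fired, mem - threshold, mem)   # soft reset after a spike
    return spikes
```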