Another project has recently been created on GitHub.
There is not much information: the author is listed only as gv1x2, with no further details, though I'll share additional information I discovered later in this post.
Whether the GitHub profile belongs to one of the paper's authors or to someone with relevant datasets, I'm not sure, since the paper's results were produced via simulations using SpikingJelly (which also appears among the same GitHub user's files). However, the gv1x2 account now also hosts the Akida Project repository, and the paper's stated "future work" is to implement its methods on actual neuromorphic hardware.
I'm presuming this is now the new Akida Project repository.
The files appear to involve spectrograms.
Applications of Spectrograms
- Speech and Language Analysis:
Analyzing the formant frequencies in vocal sounds to diagnose speech disorders or study language development.
- Music Production:
Isolating specific notes or instruments, analyzing musical structure, or even embedding hidden images within songs.
- Bioacoustics:
Studying wildlife sounds by visualizing and analyzing calls of birds, insects, and other animals.
- Audio Forensics and Repair:
Detecting and isolating problematic noises, helping with the spectral repair of recordings.
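As a quick aside on what a spectrogram actually is: it is just the magnitude of a short-time Fourier transform, slicing audio into overlapping windows and taking an FFT of each. A minimal NumPy sketch of the idea (the 440 Hz test tone and the 256/128 window/hop sizes are my own illustrative choices, nothing taken from the repository's files):

```python
import numpy as np

# Hypothetical input: a 1-second, 8 kHz test tone at 440 Hz.
fs = 8000
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t)

# Slice the signal into overlapping frames: 256-sample windows, 128-sample hop.
win, hop = 256, 128
n_frames = 1 + (len(audio) - win) // hop
frames = np.stack([audio[i * hop : i * hop + win] for i in range(n_frames)])

# Hann window + FFT power -> spectrogram (time frames x frequency bins).
spec = np.abs(np.fft.rfft(frames * np.hanning(win), axis=1)) ** 2

# The strongest frequency bin, averaged over time, should sit near 440 Hz.
freqs = np.fft.rfftfreq(win, d=1 / fs)
peak_hz = freqs[spec.mean(axis=0).argmax()]
```

With these window sizes the frequency resolution is 31.25 Hz per bin, so the peak lands in the bin at 437.5 Hz, right next to the tone.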
https://github.com/gv1x2/akida_project
Searching on the username, I found a reference to the GitHub repository in a recent paper published in August this year.
The abstract reveals the authors are working on telemedicine, and in particular mental health.
From Convolution to Spikes for Mental Health: A CNN-to-SNN Approach Using the DAIC-WOZ Dataset
by Victor Triohin ¹, Monica Leba ²·* and Andreea Cristina Ionica ³
¹ Doctoral School, University of Petroșani, 332006 Petrosani, Romania
² System Control and Computer Engineering Department, University of Petroșani, 332006 Petrosani, Romania
³ Management and Industrial Engineering Department, University of Petroșani, 332006 Petrosani, Romania
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(16), 9032; https://doi.org/10.3390/app15169032
Submission received: 28 July 2025 / Revised: 7 August 2025 / Accepted: 14 August 2025 / Published: 15 August 2025
(This article belongs to the Special Issue eHealth Innovative Approaches and Applications: 2nd Edition)
Featured Application
This work enables energy-efficient, real-time depression screening from speech, with potential applications in mobile health platforms, telemedicine, and low-resource clinical settings.
Abstract
Depression remains a leading cause of global disability, yet scalable and objective diagnostic tools are still lacking. Speech has emerged as a promising non-invasive modality for automated depression detection, due to its strong correlation with emotional state and ease of acquisition. While convolutional neural networks (CNNs) have achieved state-of-the-art performance in this domain, their high computational demands limit deployment in low-resource or real-time settings. Spiking neural networks (SNNs), by contrast, offer energy-efficient, event-driven computation inspired by biological neurons, but they are difficult to train directly and often exhibit degraded performance on complex tasks. This study investigates whether CNNs trained on audio data from the clinically annotated DAIC-WOZ dataset can be effectively converted into SNNs while preserving diagnostic accuracy. We evaluate multiple conversion thresholds using the SpikingJelly framework and find that the 99.9% mode yields an SNN that matches the original CNN in both accuracy (82.5%) and macro F1 score (0.8254). Lower threshold settings offer increased sensitivity to depressive speech at the cost of overall accuracy, while naïve conversion strategies result in significant performance loss. These findings support the feasibility of CNN-to-SNN conversion for real-world mental health applications and underscore the importance of precise calibration in achieving clinically meaningful results.
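For context on the "99.9% mode" the abstract mentions: in ANN-to-SNN conversion, each layer's firing threshold is typically scaled to a high percentile of its activations rather than the absolute maximum, so that rare outlier activations don't crush the spike rates of everything else. The sketch below is my own toy illustration in plain NumPy with invented numbers, not the paper's SpikingJelly pipeline:

```python
import numpy as np

# Toy "calibration" activations for one ReLU layer: 9,990 values
# spread over [0, 2] plus 10 extreme outliers (all values invented).
acts = np.concatenate([np.linspace(0.0, 2.0, 9990), np.full(10, 80.0)])

# Max-based normalization takes the absolute maximum as the firing
# threshold; the percentile ("99.9%") mode ignores the outliers.
v_max = acts.max()                 # 80.0, dominated by the outliers
v_999 = np.percentile(acts, 99.9)  # ~2.08, close to the bulk of the data

# Rate-code one activation with an integrate-and-fire neuron:
# threshold v, reset by subtraction, T timesteps.
def firing_rate(a, v, T=100):
    mem, spikes = 0.0, 0
    for _ in range(T):
        mem += a
        if mem >= v:
            mem -= v
            spikes += 1
    return spikes / T

a = 1.0  # a typical activation value
# With the outlier-driven threshold the neuron barely fires, losing
# most of the activation's dynamic range; the percentile threshold
# keeps the firing rate close to the ideal a / v.
rate_max = firing_rate(a, v_max)   # 0.01
rate_999 = firing_rate(a, v_999)   # 0.48, roughly 1.0 / v_999
```

This is presumably why the paper finds that naive conversion loses significant performance while careful threshold calibration recovers the CNN's accuracy.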
Excerpt:
Future research will aim to address the limitations identified in this study, including the need for evaluation on neuromorphic hardware and the exploration of more complex network architectures. Nevertheless, the present work contributes meaningfully to the expanding body of literature focused on bridging the gap between high-performance deep learning models and biologically inspired, energy-efficient architectures. The findings presented herein offer empirical support for the viability of properly configured SNNs as effective alternatives to CNNs in sensitive applications such as automated depression detection from speech data.
Data Availability Statement
Data available upon request at
https://dcapswoz.ict.usc.edu/ (accessed on 3 May 2024).
All codes involved in obtaining the results presented in this paper are available at https://github.com/gv1x2/ANN2SNN-SpikingJelly- (accessed on 20 July 2025).