Abstract
Edge detectors are widely used in computer vision applications to locate sharp intensity changes and find object boundaries in an image. The Canny edge detector is the most popular edge detector, and it uses a multi-step process, including the first step of noise reduction using a Gaussian kernel and a final step to remove the weak edges by the hysteresis threshold. In this work, a spike-based computing algorithm is presented as a neuromorphic analogue of the Canny edge detector, where the five steps of the conventional algorithm are processed using spikes. A spiking neural network layer consisting of a simplified version of a conductance-based Hodgkin–Huxley neuron as a building block is used to calculate the gradients. The effectiveness of the spiking neural-network-based algorithm is demonstrated on a variety of images, showing its successful adaptation of the principle of the Canny edge detector. These results demonstrate that the proposed algorithm performs as a complete spike domain implementation of the Canny edge detector.
Keywords:
edge detection;
segmentation;
spiking neural networks;
bio-inspired neurons
1. Introduction
Artificial neural networks (ANNs) have become an indispensable tool for implementing machine learning and computer vision algorithms in a variety of pattern recognition and knowledge discovery tasks for both commercial and defense interests. Recent progress in neural networks is driven by the increase in computing power in data centers, cloud computing platforms, and edge computing boards. In size, weight, and power (SWaP)-constrained applications, such as unmanned aerial vehicles (UAVs), augmented reality headsets, and smartphones, novel computing architectures are desirable. State-of-the-art deep learning hardware platforms are often based on graphics processing units (GPUs), tensor processing units (TPUs), and field-programmable gate arrays (FPGAs). The human brain is capable of performing more general and complex tasks at a minute fraction of the power required by deep learning hardware platforms. Spiking neurons are regarded as the building blocks of the neural networks in the brain. Moreover, research in neuroscience indicates that the spatiotemporal computing capabilities of spiking neurons play a role in the energy efficiency of the brain. In addition, spiking neurons leverage sparse time-based information encoding, event-triggered plasticity, and low-power inter-neuron signaling. In this context, neuromorphic computing hardware architectures and spike domain machine learning algorithms offer a low-power alternative to ANNs on von Neumann computing architectures. The availability of neuromorphic processors, such as IBM’s TrueNorth [1] and Intel’s Loihi [2], and of event-domain neural processors, for example, BrainChip’s Akida [3,4], which offers the flexibility to define both artificial neural network layers and spiking neuron layers, is motivating the research and development of new algorithms for edge computing. In the present work, we investigate how to program an algorithm for Canny-type edge detection using a spiking neural network and spike-based computing.
2. Background
Edge detection algorithms are widely used in computer vision to locate object boundaries in images. An edge in an image is a sharp change in image brightness, that is, a sharp change in pixel intensity. An edge detector identifies the pixels whose intensity changes sharply with respect to the intensity of neighboring pixels. Several edge detection algorithms exist.
The three stages in edge detection are image smoothing, detection, and edge localization. There are three main types of operators in edge detection: (i) gradient-based, (ii) Laplacian-based, and (iii) Gaussian-based. The gradient-based edge detection method detects the edges by finding the maximum and the minimum in the first derivative of the image using a threshold. The Roberts edge detector [5], Sobel edge detector [6], and Prewitt edge detector [7] are examples of gradient-based edge detectors. The Sobel and Prewitt detectors use a 3 × 3 kernel, whereas the Roberts detector uses a 2 × 2 kernel. A detailed discussion of these edge detectors and a comparison of their advantages and disadvantages can be found in [8]. The Roberts edge detection method is built on the idea that the difference along any pair of mutually perpendicular directions can be used to calculate the gradient. The Sobel operator convolves the image with a small, separable, integer-valued filter in the horizontal and vertical directions. The Prewitt edge detector uses two masks, which compute the derivative of the image in the x-direction and the y-direction, respectively. This detector is suitable for estimating the magnitude and orientation of an edge. Laplacian-based edge detectors find the edges by searching for zero crossings in the second derivative of the image. The Laplacian of Gaussian algorithm applies a pre-smoothing step with a Gaussian low-pass filter to the image, followed by a second-order differential operator, i.e., the Laplacian, which finds the image edges. Because an image consists of discrete pixels, this method needs a discrete convolution kernel that approximates the second derivative. The Marr–Hildreth edge detector is also based on the Laplacian of Gaussian operator [9]. The Gabor filter edge detector [10] and Canny edge detector [11] are Gaussian-based edge detectors. The Gabor filter is a linear filter whose impulse response is the product of a harmonic function and a Gaussian function, and its response is similar to that of the human visual system.
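As an illustration of the gradient-based approach described above, the following minimal sketch computes a Sobel gradient magnitude on a toy image using only the Python standard library. The function names (`correlate3x3`, `sobel_magnitude`) and the toy image are hypothetical and chosen for illustration. The kernels are applied by correlation rather than flipped convolution, which changes only the sign of the derivative and not its magnitude.

```python
import math

# Standard 3 x 3 Sobel kernels for the x (horizontal) and y (vertical) derivatives.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(image, kernel, r, c):
    """Weighted sum of the 3 x 3 neighborhood centered on pixel (r, c)."""
    return sum(kernel[i][j] * image[r + i - 1][c + j - 1]
               for i in range(3) for j in range(3))


def sobel_magnitude(image):
    """Gradient magnitude for interior pixels; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = correlate3x3(image, SOBEL_X, r, c)
            gy = correlate3x3(image, SOBEL_Y, r, c)
            out[r][c] = math.hypot(gx, gy)
    return out


# Toy image: dark left half, bright right half (a vertical step edge).
toy = [[0, 0, 1, 1] for _ in range(4)]
magnitude = sobel_magnitude(toy)
```

In this toy case, the interior pixels on either side of the step receive a nonzero magnitude, while the vertical derivative is zero because the rows are identical.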
The Canny edge detector provides excellent edge detection, as it meets the three criteria for edge detection [12]: (i) edges should be detected with a low error rate, (ii) each detected edge point should be localized at the center of the edge, and (iii) an edge should be marked only once, and image noise should not create false edges. Canny used the calculus of variations to find the function that optimizes a functional built from these criteria; the optimal function is a sum of four exponential terms and can be approximated by the first derivative of a Gaussian. The Canny edge detector is a multi-step algorithm. Its steps are: (1) removal of noise in the image using a Gaussian filter, (2) calculation of the gradient of the image pixels along the x- and y-directions, (3) non-maximum suppression to thin out edges, (4) double-threshold filtering to detect strong, weak, and non-relevant pixels, and (5) edge tracking by hysteresis, which promotes a weak pixel to a strong pixel if at least one of its neighbors is a strong pixel. The Canny edge detection algorithm is highly cited (∼36,000 citations) and is the most commonly used edge detection algorithm [11].
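Steps (4) and (5) of the conventional algorithm can be sketched as follows. This is a minimal, non-spiking illustration in standard-library Python, not the method developed in this work. The function name `hysteresis`, the threshold values, and the toy gradient magnitudes are assumptions made for the example. The sketch uses the common variant in which a promoted weak pixel can in turn promote its own weak neighbors.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Double threshold followed by edge tracking.

    Pixels with magnitude >= high are strong edges. Pixels with
    low <= magnitude < high are weak and are kept only if they are
    8-connected to a strong pixel, directly or through other weak pixels.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Step (4): seed the edge map with the strong pixels.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))

    # Step (5): grow from strong pixels into neighboring weak pixels.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc]
                        and magnitude[nr][nc] >= low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


# Toy gradient magnitudes (illustrative values only).
toy_magnitude = [
    [0.9, 0.5, 0.1, 0.5],
    [0.1, 0.1, 0.1, 0.1],
]
edge_map = hysteresis(toy_magnitude, low=0.4, high=0.8)
```

Here the weak pixel next to the strong pixel is kept, whereas the isolated weak pixel at the end of the first row is discarded.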
Edge detection is a primary step in identifying an object, and further research is strongly desirable to extend these methods to event-domain applications. The Canny edge detector performs better than the Roberts, Sobel, and Prewitt edge detectors, but at a higher computational cost [8]. An alternative implementation of the Canny edge detection algorithm is attractive for event-domain edge computing applications that require low-power, real-time solutions, in view of its performance, which can be tuned through the standard deviation of the Gaussian filter.
3. Related Work
In the human vision system, the photoreceptors in the retina convert the light intensity into nerve signals. These signals are further processed and converted into spike trains by the ganglion cells in the retina. The spike trains travel along the optic nerve for further processing in the visual cortex. Neural networks that are inspired by the human vision system have been introduced to improve image processing techniques, such as edge detection [13]. Spiking neural networks (SNNs), which are built on the concepts of spike encoding techniques [14], spiking neuron models [15], and spike-based learning rules [16], are biologically inspired in their mechanism of image processing. SNNs are gaining traction for biologically inspired computing and learning applications [17,18]. Wu et al. simulated a three-layer SNN for edge detection, consisting of a receptor layer, an intermediate layer with four filters for the up, down, left, and right directions, respectively, and an output layer, with Hodgkin–Huxley-type neurons as the building blocks [19].
Clogenson et al. demonstrated how an SNN with scalable, hexagonally shaped receptive fields performs edge detection with computational improvements over rectangular pixel-based SNN approaches [20]. The digital images are converted into a hexagonal pixel representation before being processed by the SNN. A spiking neuron integrates the spikes from a group of afferent neurons in a receptive field. The network model used by the authors consists of an intermediate layer with four types of neurons, each with a different receptive field corresponding to the up, down, right, and left orientations. Yedjour et al. [21] demonstrated the basic task of contour detection using a spiking neural network based on the Hodgkin–Huxley neuron model. In this approach, the synaptic weights are determined by the Gabor function to describe the behavior of the receptive fields of simple cells in the visual cortex. Vemuru [22] reported the design of an SNN edge detector with biologically inspired neurons and demonstrated that it detects edges in simulated low-contrast images. These studies focused on defining SNNs using an array of Gabor filter receptive fields in the edge detector. In view of the success of SNNs in edge detection, it is desirable to develop a spike domain implementation of the Canny edge detector algorithm because it has the potential to offer a high-performance alternative for edge detection.
4. Methods
Network models of the visual cortex are simulated with spiking neurons using the Hodgkin and Huxley equations [23]. Retinal ganglion cells convey the visual image from the eye to the brain [24,25]. Receptive fields exist in the visual cortex; however, an accurate representation of the neuron circuits of the visual cortex is still unclear. Neural network models have been proposed to explain how the visual system is able to process an image efficiently, and more research is needed to further our understanding of the visual cortex [26]. As ANNs grow in complexity, their associated energy consumption becomes a challenging problem. Such challenges also exist for computing edges in images on resource-constrained computing devices that operate on a limited energy budget. Therefore, specialized optimizations for deep learning have to be performed at both the software and hardware levels. Edge detection can be achieved using a spiking neuron model [19]. Spiking neural networks offer a low-energy computational alternative with only a few layers, while maintaining edge features. Our solution for spike-based edge detection uses only one layer of Hodgkin–Huxley-type neurons (1 neuron/pixel) with five spike processing layers, one conductance calculation layer, and a synaptic current update layer. The simplified form of the Hodgkin–Huxley neuron used in the network is similar to the conductance-based leaky integrate-and-fire neurons that are frequently used in neuromorphic hardware implementations [1,2].
To implement a neuromorphic analogue of the Canny detector, we developed a spike-based computation for each of the five key steps introduced earlier and implemented it in MATLAB.
Figure 1 illustrates the flowchart of the algorithmic steps in the spike domain computation of Canny edge detection. The image I(x,y), where (x, y) are the coordinates of the pixels, is first converted into grayscale, I(grayscale)(x,y), and scaled such that I(grayscale)max = 0.01 to match the units of the model parameters of the Hodgkin–Huxley neuron model. Then, it is assigned as the peak conductance for the excitatory synapse, qex, and the peak conductance for the inhibitory synapse, qin. The peak conductances are then converted into time-dependent conductances, gex and gin, for the excitatory and inhibitory synapses, respectively, using the equations
$$g_{ex} = q_{ex}\,\frac{\tau_{ex}\,dt}{\tau_{ex} + dt} \tag{1}$$

$$g_{in} = q_{in}\,\frac{\tau_{in}\,dt}{\tau_{in} + dt}, \tag{2}$$
where τex = 4 ms and τin = 7 ms are the time constants for the excitatory and inhibitory synapses, respectively [27]. Then, the conductances are processed using a 5 × 5 Gaussian kernel to calculate the synaptic current at each time step t [28]:
$$I_{z}(t) = g_{ex}\,(V - V_{ex}) + g_{in}\,(V - V_{in}), \tag{3}$$
where Vex and Vin are the reversal potentials for excitatory and inhibitory synapses, respectively. Note that the kernel size was smaller than 5 × 5 for edge pixels.
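The conductance and synaptic current calculation of Equations (1)–(3) can be sketched as follows. This is a minimal standard-library Python illustration and not the MATLAB implementation used in this work. The time constants are taken from the text. The time step, membrane potential, synaptic potentials Vex and Vin, and Gaussian standard deviation are assumed values chosen only for the example, and all function names are hypothetical. The smoothing function truncates and renormalizes the 5 × 5 window at the image border, which is one possible reading of the smaller kernel noted above.

```python
import math

TAU_EX_MS = 4.0   # excitatory synaptic time constant (from the text)
TAU_IN_MS = 7.0   # inhibitory synaptic time constant (from the text)

# Assumed values for illustration only; not specified in this section.
DT_MS = 0.1       # time step
V_MV = -70.0      # membrane potential
V_EX_MV = 0.0     # excitatory synaptic potential V_ex
V_IN_MV = -75.0   # inhibitory synaptic potential V_in
SIGMA = 1.0       # standard deviation of the Gaussian kernel


def scale_image(gray, peak=0.01):
    """Scale a grayscale image so that its maximum equals `peak` (assumes max > 0)."""
    max_val = max(max(row) for row in gray)
    return [[peak * (p / max_val) for p in row] for row in gray]


def conductance(q_peak, tau_ms, dt_ms=DT_MS):
    """Equations (1) and (2): g = q * (tau * dt) / (tau + dt)."""
    return q_peak * (tau_ms * dt_ms) / (tau_ms + dt_ms)


def gaussian_smooth(grid, sigma=SIGMA, radius=2):
    """Gaussian-weighted average over a (2*radius + 1)^2 window.

    Near the border, the window is truncated to the pixels that exist
    and the weights are renormalized (an assumption of this sketch).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = weight_sum = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols:
                        w = math.exp(-(dr * dr + dc * dc) / (2.0 * sigma * sigma))
                        total += w * grid[nr][nc]
                        weight_sum += w
            out[r][c] = total / weight_sum
    return out


def synaptic_current(g_ex, g_in, v_mv=V_MV, v_ex_mv=V_EX_MV, v_in_mv=V_IN_MV):
    """Equation (3): I_z = g_ex * (V - V_ex) + g_in * (V - V_in)."""
    return g_ex * (v_mv - v_ex_mv) + g_in * (v_mv - v_in_mv)


# Toy grayscale image (illustrative values only).
gray = [[0, 64, 128], [64, 128, 255], [128, 255, 255]]
q = scale_image(gray)  # the same scaled image serves as q_ex and q_in
g_ex = gaussian_smooth([[conductance(p, TAU_EX_MS) for p in row] for row in q])
g_in = gaussian_smooth([[conductance(p, TAU_IN_MS) for p in row] for row in q])
current = [[synaptic_current(g_ex[r][c], g_in[r][c]) for c in range(3)]
           for r in range(3)]
```

In a full simulation, the membrane potential V would be updated at every time step, so the current would be recomputed per step rather than once as in this sketch.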