In this era of data deluge, with real-time content continuously generated by distributed sensors, intelligent neuromorphic systems are needed to efficiently handle the massive volumes of data and computation in ubiquitous automobiles and portable edge devices. Spiking Neural Networks (SNNs), often regarded as the third generation of neural networks, can be highly power-efficient and offer competitive performance on numerous cognitive tasks. However, typical shallow spiking network architectures have limited capacity for expressing complex representations, while training very deep spiking networks has so far remained largely unsuccessful.
The first part of this thesis explores several pathways to effectively train deep SNNs using unsupervised, supervised, and semi-supervised schemes, together with neuron model analysis. First, we present a layer-wise unsupervised Spike-Timing-Dependent Plasticity (STDP) learning scheme for training deep convolutional SNNs. Second, we propose an approximate derivative method that overcomes the discontinuous and non-differentiable nature of the spike generation function, enabling deep convolutional SNNs to be trained on input spike events with a supervised spike-based backpropagation algorithm. Third, we develop a pre-training scheme based on biologically plausible unsupervised learning, namely STDP, to better initialize the network parameters prior to supervised spike-based backpropagation. In addition, we present a comprehensive comparative analysis of neuron models with and without leak, examining the impact of leak on noise robustness and spike sparsity in deep SNNs.
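To make the approximate-derivative idea concrete, the following is a minimal sketch of a surrogate-gradient spike function in PyTorch. It is illustrative only: the piecewise-linear surrogate shape and the threshold value of 1.0 are assumptions, not the exact formulation developed in the thesis.

```python
import torch


class SpikeFunction(torch.autograd.Function):
    """Heaviside spike generation with an approximate (surrogate) derivative.

    Forward: emits a spike (1) wherever the membrane potential crosses the
    firing threshold. Backward: replaces the true derivative, which is zero
    almost everywhere, with a piecewise-linear surrogate so that errors can
    be backpropagated through discrete spike events.
    """

    @staticmethod
    def forward(ctx, membrane_potential, threshold=1.0):
        ctx.save_for_backward(membrane_potential)
        ctx.threshold = threshold
        return (membrane_potential >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (membrane_potential,) = ctx.saved_tensors
        # Illustrative surrogate: gradient is nonzero only near the threshold.
        surrogate = torch.clamp(
            1.0 - torch.abs(membrane_potential - ctx.threshold), min=0.0
        )
        return grad_output * surrogate, None


# Usage: spikes = SpikeFunction.apply(membrane_potential)
```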
The latter part of this thesis explores the combination of SNNs and event cameras, which provide high-temporal-resolution information in the form of spike streams. Event-based cameras show great potential for tasks such as high-speed motion detection and navigation in low-light environments, where standard frame-based cameras perform poorly. However, conventional computer vision methods as well as deep Analog Neural Networks (ANNs) are not compatible in their native form with the asynchronous and discrete nature of event camera outputs. In this regard, SNNs serve as an ideal paradigm for directly handling event camera outputs. However, deep SNNs suffer performance degradation due to the spike-vanishing phenomenon. To overcome these issues, we present Spike-FlowNet, a deep hybrid neural network architecture integrating SNNs and ANNs for efficiently estimating optical flow from sparse event camera outputs without sacrificing performance. Furthermore, we propose Fusion-FlowNet, a sensor/architecture fusion framework for accurately estimating dense optical flow. In essence, we leverage the complementary characteristics of event- and frame-based sensors as well as of ANNs and SNNs.
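The hybrid SNN-ANN idea can be sketched as follows. This toy encoder-decoder is only in the spirit of Spike-FlowNet: the layer sizes, the integrate-and-fire dynamics without leak, and the threshold are illustrative assumptions, not the architecture used in the thesis.

```python
import torch
import torch.nn as nn


class HybridFlowNetSketch(nn.Module):
    """Toy hybrid encoder-decoder sketch.

    A spiking (integrate-and-fire) convolutional encoder consumes event
    frames over several time steps, while a conventional analog (ANN)
    decoder maps the accumulated spike features to a dense two-channel
    optical flow map.
    """

    def __init__(self, threshold=0.75):
        super().__init__()
        self.threshold = threshold
        # Two input channels for the two event polarities (ON/OFF).
        self.snn_encoder = nn.Conv2d(2, 16, kernel_size=3, padding=1)
        # Two output channels for the (u, v) flow components.
        self.ann_decoder = nn.Conv2d(16, 2, kernel_size=3, padding=1)

    def forward(self, event_frames):
        # event_frames: (time, batch, 2, H, W) event-count tensor.
        membrane = torch.zeros_like(self.snn_encoder(event_frames[0]))
        accumulated = torch.zeros_like(membrane)
        for frame in event_frames:
            membrane = membrane + self.snn_encoder(frame)  # integrate (no leak)
            spikes = (membrane >= self.threshold).float()  # fire
            membrane = membrane * (1.0 - spikes)           # reset where spiked
            accumulated = accumulated + spikes             # spike counts fed to the ANN
        return self.ann_decoder(accumulated)
```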