FROM SEEING BETTER TO UNDERSTANDING BETTER: DEEP LEARNING FOR MODERN COMPUTER VISION APPLICATIONS
In this dissertation, we document a few of our recent attempts to bridge the gap between fast-evolving deep learning research and the vast industry need for solutions to computer vision challenges. More specifically, we developed novel deep-learning-based techniques for the following application-driven computer vision challenges: image super-resolution with quality restoration, motion estimation by optical flow, object detection for shape reconstruction, and object segmentation for motion tracking. These four topics span the computer vision hierarchy: from the low level, where digital images are processed to restore missing information for better human perception, through the middle level, where objects of interest are recognized and their motions are analyzed, to the high level, where the scene captured in video footage is interpreted for further analysis. In building a complete package of ready-to-deploy solutions, we center our efforts on designing and training the most suitable convolutional neural networks for the particular computer vision problem at hand. Complementary procedures for data collection, data annotation, post-processing of network outputs tailored to specific application needs, and deployment details are also discussed where necessary. We hope our work demonstrates the applicability and versatility of convolutional neural networks across a broad spectrum of real-world computer vision tasks, from seeing better to understanding better.
Degree Type
- Doctor of Philosophy
Department
- Electrical and Computer Engineering
Campus location
- West Lafayette