EFFICIENT DEEP LEARNING: FROM CONVOLUTIONAL NEURAL NETWORKS TO TRANSFORMERS
Deep learning models have played a crucial role in advancing the capabilities of Artificial Intelligence (AI), powering a wide range of applications and systems. Over time, these models have evolved from handling basic classification tasks to driving complex generative AI applications. Among the most prominent architectures are Convolutional Neural Networks (CNNs) and transformers, each of which has contributed substantially to the field. However, the growing size of these models makes them increasingly difficult to train and to deploy on resource-constrained devices. This thesis addresses the challenges of deep learning model efficiency and scalability, particularly in the context of distributed systems, attention mechanisms, and diffusion models. By proposing approaches such as conditionally deep hybrid-precision networks, complexity-aware training, and the integration of attention with recurrence, we improve both energy efficiency and inference accuracy in resource-constrained environments. Additionally, we introduce the Panoptic Diffusion Model (PDM), which sets a new benchmark in concurrent image and segmentation-map generation and extends the creative and realistic capabilities of diffusion models. Together, these contributions pave the way for more efficient and versatile deep learning applications across a range of domains.
Degree Type
- Doctor of Philosophy
Department
- Electrical and Computer Engineering
Campus location
- West Lafayette