Purdue University Graduate School

Advancing Learned Lossy Image Compression through Knowledge Distillation and Contextual Clustering

thesis
posted on 2024-10-29, 13:14, authored by Yichi Zhang

In recent decades, the rapid growth of internet traffic, driven in particular by high-definition images and videos, has highlighted the critical need for effective image compression to reduce bit rates and enable efficient data transmission. Learned lossy image compression (LIC), which uses end-to-end deep neural networks, has emerged as a highly promising approach, even outperforming traditional methods such as the intra coding of the Versatile Video Coding (VVC) standard. This thesis contributes to the field of LIC in two ways. First, we present a theoretical-bound-guided knowledge distillation technique, which uses estimated information rate-distortion (R-D) function bounds to guide the training of LIC models. Implemented with a modified hierarchical variational autoencoder (VAE), this method achieves superior rate-distortion performance with reduced computational complexity. Second, we introduce a token-mixer neural architecture, referred to as contextual clustering, which serves as an alternative to conventional convolutional layers and to the self-attention mechanisms of transformers. Contextual clustering groups pixels by their cosine similarity and uses linear layers to aggregate features within each cluster. Integrated into current LIC methods, contextual clustering not only improves coding performance but also reduces computational load.
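To make the bound-guided idea concrete, the following is a minimal sketch of a training objective in which an estimated information R-D bound supplies a rate target for the model. The function name, the interface of rd_bound_fn, and the weighting terms are illustrative assumptions, not the thesis's exact formulation; the sketch assumes Python with PyTorch.

    import torch

    def bound_guided_loss(rate, distortion, rd_bound_fn, lmbda=0.01, beta=0.1):
        """Hypothetical bound-guided training objective for an LIC model.

        rate, distortion: scalar tensors produced by the model for a batch.
        rd_bound_fn: a callable mapping a distortion level to an estimate of
            the minimal achievable rate, i.e. the information R-D function
            bound (assumed to be estimated separately, as in the thesis).
        """
        # Standard learned-compression objective: rate + lambda * distortion.
        rd_loss = rate + lmbda * distortion
        # Rate the bound says is achievable at the current distortion level.
        target_rate = rd_bound_fn(distortion.detach())
        # Guidance term: penalize only the excess rate above the bound.
        distill = torch.relu(rate - target_rate)
        return rd_loss + beta * distill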
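The contextual clustering mixer also admits a compact sketch: pixel tokens are assigned to cluster centers by cosine similarity, and features are aggregated within each cluster by linear layers. The pooling-based center initialization, the cluster count, and the layer names below are illustrative choices under the same PyTorch assumption, not the thesis's exact design.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ContextualClustering(nn.Module):
        """Groups pixel tokens by cosine similarity to cluster centers and
        aggregates features within each cluster using linear layers."""

        def __init__(self, dim: int, num_clusters: int = 16):
            super().__init__()
            self.num_clusters = num_clusters
            self.to_value = nn.Linear(dim, dim)  # per-pixel value features
            self.proj = nn.Linear(dim, dim)      # projection after aggregation

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (B, N, C) -- a batch of N pixel tokens with C channels.
            B, N, C = x.shape

            # Derive K cluster centers by evenly pooling the tokens
            # (an illustrative initialization).
            centers = F.adaptive_avg_pool1d(
                x.transpose(1, 2), self.num_clusters
            ).transpose(1, 2)                                   # (B, K, C)

            # Cosine similarity between every pixel and every center.
            sim = F.normalize(x, dim=-1) @ \
                F.normalize(centers, dim=-1).transpose(1, 2)    # (B, N, K)

            # Hard assignment: each pixel joins its most similar cluster.
            assign = sim.argmax(dim=-1)                         # (B, N)
            mask = F.one_hot(assign, self.num_clusters).float() # (B, N, K)

            # Aggregate value features within each cluster, weighted by
            # similarity; non-members are zeroed out by the mask.
            w = sim * mask
            v = self.to_value(x)                                # (B, N, C)
            agg = w.transpose(1, 2) @ v                         # (B, K, C)
            agg = agg / (w.sum(dim=1, keepdim=True).transpose(1, 2) + 1e-6)

            # Dispatch each cluster's aggregated feature back to its members.
            out = mask @ agg                                    # (B, N, C)
            return self.proj(out)

    # Example usage: 256 pixel tokens with 64 channels.
    x = torch.randn(2, 256, 64)
    y = ContextualClustering(dim=64)(x)  # (2, 256, 64)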

History

Degree Type

  • Master of Science in Electrical and Computer Engineering

Department

  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Fengqing Maggie Zhu

Additional Committee Member 2

Edward J. Delp

Additional Committee Member 3

Mary L. Comer
