CONTENT UNDERSTANDING FOR IMAGING SYSTEMS: PAGE CLASSIFICATION, FADING DETECTION, EMOTION RECOGNITION, AND SALIENCY BASED IMAGE QUALITY ASSESSMENT AND CROPPING
thesis posted on 2021-10-12, 13:14 authored by Shaoyuan Xu
This thesis consists of four sections, each corresponding to one research project.
The first section is about Page Classification. In this section, we extend our previous approach, which classified pages into 3 classes (Text, Picture, and Mixed), to 5 classes: Text, Picture, Mixed, Receipt, and Highlight. We first design new features that characterize the two new classes and then use DAG-SVM to classify images into the 5 classes. Based on the results, our algorithm performs well and is able to classify all 5 types of pages.
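As an illustration of how a DAG-SVM reaches a decision over the 5 page classes, the sketch below walks the decision graph by repeatedly comparing the first and last remaining candidates and eliminating the loser, so only 4 of the 10 pairwise classifiers are evaluated per page. The function names and the toy deciders are hypothetical: they stand in for trained pairwise SVMs and simply compare per-class scores, which is not the feature set designed in the thesis.

```python
from itertools import combinations

CLASSES = ["Text", "Picture", "Mixed", "Receipt", "Highlight"]


def dag_classify(features, classes, deciders):
    """Walk the decision DAG: test first vs. last candidate, drop the loser."""
    candidates = list(classes)
    while len(candidates) > 1:
        first, last = candidates[0], candidates[-1]
        # Each decider returns the name of the class it prefers for this input.
        winner = deciders[(first, last)](features)
        if winner == first:
            candidates.pop()      # eliminate the last candidate
        else:
            candidates.pop(0)     # eliminate the first candidate
    return candidates[0]


def make_toy_deciders(classes):
    """Hypothetical stand-ins for trained pairwise SVMs (assumption, not the thesis model)."""
    def build(a, b):
        ia, ib = classes.index(a), classes.index(b)
        # Toy rule: prefer whichever class has the larger score in the input vector.
        return lambda scores: a if scores[ia] >= scores[ib] else b
    return {(a, b): build(a, b) for a, b in combinations(classes, 2)}


deciders = make_toy_deciders(CLASSES)
label = dag_classify([0.1, 0.2, 0.05, 0.9, 0.3], CLASSES, deciders)
```

In a real system each entry of `deciders` would wrap a binary SVM trained on one pair of classes; the traversal itself would stay the same.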
The second section is about Fading Detection. In this section, we develop an algorithm that automatically detects fading in both text and non-text regions. For text regions, we first perform global alignment and then local alignment. After that, we create a 3D color node system, assign each connected component to a color node, and compute the color difference between each raster page connected component and the corresponding scanned page connected component. For non-text regions, after global alignment, we divide the page into "super pixels" and compute the color difference between raster super pixels and test super pixels. Compared with the traditional method that uses a diagnostic page, our method is more efficient and effective.
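The non-text comparison can be sketched roughly as follows. This is a simplified illustration under several assumptions: the two pages are already aligned and have the same size, fixed square tiles stand in for super pixels, color difference is measured as Euclidean distance between mean RGB values, and the threshold is arbitrary. The thesis may use a different segmentation, color space, and decision rule.

```python
import math


def block_means(image, block):
    """Mean RGB per tile; image is a list of rows of (r, g, b) tuples."""
    height, width = len(image), len(image[0])
    means = {}
    for top in range(0, height, block):
        for left in range(0, width, block):
            pixels = [image[y][x]
                      for y in range(top, min(top + block, height))
                      for x in range(left, min(left + block, width))]
            count = len(pixels)
            means[(top // block, left // block)] = tuple(
                sum(p[c] for p in pixels) / count for c in range(3))
    return means


def faded_blocks(reference, scanned, block=2, threshold=30.0):
    """Return tile indices whose mean color differs by more than threshold."""
    ref = block_means(reference, block)
    scan = block_means(scanned, block)
    return sorted(key for key in ref
                  if math.dist(ref[key], scan[key]) > threshold)


# Toy 4x4 pages: the scanned copy is washed out in the bottom-right tile only.
reference = [[(200, 0, 0)] * 4 for _ in range(4)]
scanned = [row[:] for row in reference]
for y in (2, 3):
    for x in (2, 3):
        scanned[y][x] = (120, 60, 60)

flagged = faded_blocks(reference, scanned)
```

Each flagged index is a (tile row, tile column) pair, so the result points at where on the page the color has drifted from the raster reference.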
The third section is about CNN Based Emotion Recognition. In this section, we build our own emotion recognition classification and regression system from scratch. It includes data set collection, data preprocessing, model training, and testing. We extend the model to a real-time video application, where it performs accurately and smoothly. We also try another approach to the emotion recognition problem, based on Facial Action Unit detection. By extracting facial landmark features and adopting an SVM training framework, the Facial Action Unit approach achieves accuracy comparable to that of the CNN based approach.
The fourth section is about Saliency Based Image Quality Assessment and Cropping. In this section, we propose a method for image quality assessment and recomposition that uses image saliency information. Saliency refers to the region of an image that attracts people's attention easily and naturally. Using everyday examples as well as our experimental results, we demonstrate that utilizing saliency information benefits both tasks.
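To make the idea of saliency-guided cropping concrete, the sketch below takes a precomputed saliency map and returns the tightest box around pixels whose saliency reaches a chosen fraction of the peak value. This is only one simple heuristic, offered as an assumption for illustration: the function names, the 0.5 fraction, and the bounding-box rule are hypothetical and are not the recomposition method proposed in the thesis.

```python
def salient_crop_box(saliency, fraction=0.5):
    """Tightest box around pixels with saliency >= fraction * peak.

    saliency is a list of rows of non-negative numbers.
    Returns (top, left, bottom, right) with bottom and right exclusive.
    """
    peak = max(max(row) for row in saliency)
    cutoff = fraction * peak
    rows = [y for y, row in enumerate(saliency)
            if any(value >= cutoff for value in row)]
    cols = [x for x in range(len(saliency[0]))
            if any(row[x] >= cutoff for row in saliency)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(image, box):
    """Cut the box out of an image stored as a list of rows."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


# Toy saliency map with a salient 2x2 region in the middle.
saliency_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.2, 0.9, 0.0],
    [0.0, 0.8, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
box = salient_crop_box(saliency_map)
cropped = crop(saliency_map, box)
```

A practical system would also constrain aspect ratio and leave margin around the salient region; those refinements are omitted here.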