Novel methods for image processing, image analysis, machine learning and deep learning with applications to on-line fashion retailing, print quality assessment, and image enhancement
Thesis posted on 29.11.2021, 21:28, authored by Litao Hu
In the online fashion market, sellers often modify their product images by adding frames or other non-native content, or by compositing multiple images, to emphasize product features and attract more attention. From the buyer's point of view, however, this added content is often redundant and interferes with evaluating the main product in the image. It also complicates product archiving for the marketplace. In this thesis, we introduce several novel algorithms, based on both image processing techniques and deep learning, to analyze and detect such added synthetic content in product images from a fashion marketplace. Comprehensive evaluations on several test datasets, and comparisons with other deep learning models used for similar purposes, demonstrate promising results.
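To make the detection task concrete, here is a minimal heuristic sketch, not the thesis's deep learning model: an added synthetic frame tends to be near-uniform in intensity around the image border, unlike the boundary of a natural photo. The function name and the variance threshold are illustrative assumptions.

```python
def detect_uniform_frame(img, border=2, tol=5.0):
    """Heuristic check for an added synthetic frame (illustrative sketch only,
    not the thesis's method). `img` is a grayscale image given as a list of
    lists of intensities (0-255). Returns True when the outer `border` pixels
    are nearly constant, which suggests a pasted-on frame."""
    h, w = len(img), len(img[0])
    ring = []
    for y in range(h):
        for x in range(w):
            # Collect only pixels inside the outer border ring.
            if y < border or y >= h - border or x < border or x >= w - border:
                ring.append(img[y][x])
    mean = sum(ring) / len(ring)
    var = sum((p - mean) ** 2 for p in ring) / len(ring)
    # A near-zero variance means the border is a flat synthetic color.
    return var <= tol ** 2
```

A deep model can learn far subtler composites (gradients, textured frames, pasted banners), which is why the thesis moves beyond such hand-crafted rules.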
The development of image quality assessment algorithms has long been an active research area in image processing, and numerous methods have been proposed. However, most existing methods focus on digital images that only or mainly contain photos taken by digital cameras. Traditional approaches evaluate an input image as a whole and estimate a single quality score, indicating how good or satisfying the image looks to a viewer. In this thesis, we focus on the quality of elements such as text, barcodes, QR codes, lines, and handwriting in the target images, which is often neglected in related work. A quality score for this kind of content can be based on whether it is readable by a human or recognizable by a decoder. Moreover, we mainly study the viewing quality of a scanned document of a printed image. For this purpose, we propose a novel document image quality assessment (DIQA) algorithm that determines the scanning resolution for a document, or for regions within a document, optimized for cloud and host destinations. Building on this optimal-resolution algorithm, we also develop an automatic compression system for scanned document pages that achieves optimal file sizes without loss of key information. Experimental results on our test images demonstrate the effectiveness of our method.
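The resolution-selection idea can be sketched as follows, under simplifying assumptions: treat a scan line of text strokes as a 1-D signal, simulate lower scanning resolutions by block averaging, and pick the largest downsampling factor (smallest file) at which the stroke pattern keeps enough contrast to remain readable. The contrast criterion and function names here are hypothetical stand-ins for the thesis's DIQA-driven selection.

```python
def block_downsample(row, factor):
    """Average consecutive groups of `factor` samples, simulating a scan at
    a lower dpi. `row` is a list of intensities (0-255)."""
    return [sum(row[i:i + factor]) / factor
            for i in range(0, len(row) - factor + 1, factor)]

def min_adequate_factor(row, factors, min_contrast=128):
    """Return the largest downsampling factor at which dark/light strokes
    stay distinguishable (illustrative proxy for readability; the thesis
    uses a learned DIQA score rather than this raw contrast test)."""
    best = 1
    for f in sorted(factors):
        d = block_downsample(row, f)
        # If strokes still alternate clearly, this lower resolution suffices.
        if max(d) - min(d) >= min_contrast:
            best = f
    return best
```

For a stroke pattern with period 8, averaging over blocks of 8 merges ink and background into one gray level, so the selection stops at the coarser-but-still-readable factor of 4, which is exactly the trade-off the automatic compression system exploits.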
In the image signal processor (ISP) of a digital camera, denoising and tone-mapping are two important steps in high-dynamic-range (HDR) imaging. Tone-mapping, which adjusts the brightness and contrast of an image, can significantly amplify noise, especially in low-light areas, posing challenges for denoising. Denoising, in turn, can undo the contrast enhancement of tone-mapping if not tuned accordingly, and it often assumes linear inputs, an assumption that no longer holds after multi-exposure capture and non-linear tone-mapping. Despite this entanglement, existing image processing unit (IPU) and image signal processor pipelines usually instantiate the two steps as separate, isolated modules, making it difficult to balance their effects. Observing that both operations benefit from multi-scale processing (i.e., decomposing an image into high- and low-frequency components and performing denoising and tone-mapping accordingly), we propose a joint multi-scale denoising and tone-mapping framework for HDR imaging that is designed with both operations in mind. Our joint multi-scale DCT-based network is trained end-to-end, optimizing both operators together. On recent HDR benchmark datasets, we show both quantitatively and qualitatively that our framework outperforms state-of-the-art HDR tone-mapping methods that perform denoising and tone-mapping separately.
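The multi-scale intuition behind the framework can be illustrated on a 1-D signal, as a sketch under stated assumptions: split the signal into a low-frequency base that is tone-mapped with a gamma curve and a high-frequency detail band that is soft-thresholded to suppress noise, then recombine. The thesis replaces both hand-tuned steps with a jointly trained DCT-based network; the filter, gamma value, and threshold below are illustrative choices, not the thesis's parameters.

```python
def smooth(sig, radius=2):
    """Moving-average low-pass filter: extracts the 'low-frequency' base."""
    out = []
    for i in range(len(sig)):
        lo, hi = max(0, i - radius), min(len(sig), i + radius + 1)
        out.append(sum(sig[lo:hi]) / (hi - lo))
    return out

def joint_denoise_tonemap(sig, gamma=0.5, noise_floor=3.0):
    """Illustrative multi-scale decomposition (not the thesis's network):
    tone-map the base band, denoise the detail band, and recombine."""
    base = smooth(sig)
    detail = [s - b for s, b in zip(sig, base)]
    # Tone-map the base: gamma compression brightens dark regions.
    mapped = [255.0 * (b / 255.0) ** gamma for b in base]
    # Denoise the detail: soft-threshold small fluctuations toward zero.
    cleaned = [0.0 if abs(d) < noise_floor
               else d - noise_floor * (1 if d > 0 else -1)
               for d in detail]
    return [m + c for m, c in zip(mapped, cleaned)]
```

Applying the two operators on separate bands is what makes them compatible: the gamma curve never amplifies the detail band where noise lives, and the threshold never flattens the base band where tone-mapping acts. Training both jointly, as in the thesis, removes the need to hand-tune `gamma` and `noise_floor` against each other.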