Application of Deep Learning to Three Problems in Image Analysis and Image Processing: Automatic Image Cropping, Remote Heart Rate Estimation and Quadrilateral Detection
Image re-composition has long been regarded as one of the most important steps in the post-processing of a photo. The quality of a re-composition depends mainly on a person's aesthetic judgment, so the task is difficult for those without much experience in photography. Moreover, while re-composing a single image takes little time, re-composing hundreds of images can be quite time-consuming. To address these problems, we propose a method that automates the process of re-composing an image to a desired aspect ratio. Although many image re-composition methods already exist, they only assign a score to their predicted best crop and do not explain why the score is high or low. In contrast, we design an explainable method by introducing a novel 10-layer aesthetic score map, which represents how the position of the saliency in the original uncropped image, relative to that of the crop region, contributes to the overall score of the crop, so that the crop is no longer represented by a single score alone. Our experiments show that the proposed score map improves the performance of our algorithm, which achieves state-of-the-art performance on both a public dataset and our own dataset.
Heart rate, the speed of the heartbeat, is regarded as one of the most important measurements for evaluating a person's health. It can be used to assess anxiety, stress, and illness; an abnormal heart rate often indicates potential disease. Recent studies have shown that heart rate can be measured directly from a sequence of images containing a person's face. Requiring only a webcam, this approach is much simpler than traditional methods, which require a pulse oximeter attached to the fingertip to measure the photoplethysmography (PPG) signal, or electrodes placed on the skin to measure the electrocardiography (ECG) signal. However, although it has attracted a lot of interest, this approach still degrades when the head moves suddenly or turns away from the camera. In this work, we propose a novel, robust method that generates reliable PPG signals and measures the heart rate from face videos alone in real time, and that is invariant to head movement. We also study how different factors, namely lighting conditions, the angle of the head, and the distance of the head from the camera, affect the heart rate predictions. After a thorough analysis, we conclude that our method produces accurate and robust results.
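To make the final step of such a pipeline concrete, the following is a minimal sketch of how a heart rate can be read off a PPG-like signal once it has been extracted. It is not the method proposed in this work: it uses the common spectral-peak approach, and the function name `estimate_heart_rate_bpm`, the 0.7 to 4.0 Hz search band, the 30 frames-per-second rate, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np


def estimate_heart_rate_bpm(ppg, fps, low_hz=0.7, high_hz=4.0):
    """Return the dominant frequency of a PPG-like signal in beats per minute.

    Hypothetical helper for illustration: the band limits are an assumed
    plausible range for human heart rate (42 to 240 beats per minute).
    """
    signal = np.asarray(ppg, dtype=float)
    signal = signal - signal.mean()           # remove the DC offset
    window = np.hanning(signal.size)          # reduce spectral leakage
    spectrum = np.abs(np.fft.rfft(signal * window)) ** 2
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)

    # Search for the strongest peak only inside the assumed heart rate band.
    band = (freqs >= low_hz) & (freqs <= high_hz)
    if not band.any():
        raise ValueError("signal is too short for the requested band")
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz


# Synthetic 20-second signal: a 1.2 Hz pulse (72 beats per minute) plus noise.
fps = 30.0
t = np.arange(600) / fps
rng = np.random.default_rng(0)
synthetic_ppg = np.sin(2 * np.pi * 1.2 * t) + 0.3 * rng.standard_normal(t.size)

bpm = estimate_heart_rate_bpm(synthetic_ppg, fps)
print(f"estimated heart rate: {bpm:.1f} bpm")
```

With a 20-second window the frequency resolution is 0.05 Hz, or 3 beats per minute. A real-time system would therefore have to trade window length against latency, which this sketch does not address.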
Quadrilateral detection is the process of locating a quadrilateral-shaped object in an image, and it underlies many applications, such as scanning documents and digitizing printed photos with a smartphone. While existing methods detect quadrilateral objects fairly accurately, their performance drops significantly when occlusion is present or when the edges of the quadrilateral are not completely straight. In our work, we propose an end-to-end system that accurately predicts the four corners of a quadrilateral in an image, is robust to occlusion, and can detect the quadrilateral even when it is slightly distorted.