Forensic Analysis of Images and Documents
This thesis involves three topics related to forensic analysis of media data. The first topic is the analysis of images and documents that have been created with a scanner. The goal is to detect and identify scanner model from the scanned images/documents. We propose a deep learning system that can automatically learn the inherent features of the scanned images. This system will produce a scanner model identification and a reliability map for a scanned image. The proposed system has shown promising results in the forensic analysis of scanned images. The second topic is related to forensic integrity of scientific papers. The project is divided into multiple tasks, data collection, image extraction, and manipulation detection. We have constructed a dataset of retracted scientific papers that have been verified to have issues with integrity. We design and maintain a web-based Scientific Integrity System for forensic analysis of the images within scientific publications. The third topic is related to media document analysis. Our goal is to identify the publication style for media document, aiding in the potential document manipulation. We are mainly focusing on image-text consistency check, and synthetic tweets analysis. For image-text inconsistency check, we describe a system that can examine an image in document and the corresponding text caption (or other associated text with the image) to check the image/text consistency. For synthetic tweets analysis, we propose a system to detect and identify the text generation models and paraphrase attack models.
History
Degree Type
- Doctor of Philosophy
Department
- Electrical and Computer Engineering
Campus location
- West Lafayette