Purdue University Graduate School

Identity Preservation and Content Control in Generative Image Customization

thesis
posted on 2025-05-09, 15:16, authored by Yizhi Song

Generative image customization has emerged as a transformative paradigm for context-aware image editing, enabling the seamless integration of reference objects into novel scenes while preserving their identity. However, the task faces critical challenges in realism, identity preservation, and artifact-free synthesis. Traditional composition-based methods require labor-intensive manual annotation and multi-stage processing for harmonization, geometry correction, and shadow generation. While recent advances in diffusion models enable self-supervised frameworks to address these tasks holistically, limitations persist in preserving fine-grained object identity and mitigating generative artifacts: localized defects such as distorted logos or textures frequently remain in synthesized images, undermining fidelity. This thesis bridges these gaps through three interconnected contributions that advance the scalability, fidelity, and controllability of customized image editing.

First, we propose ObjectStitch, a novel unified framework that leverages conditional diffusion models to automate image compositing. We introduce a new training scheme and a data augmentation strategy that remove the need for manual labeling. Our approach holistically transforms viewpoint, geometry, color, and shadows while preserving the input object's characteristics via a novel content adapter. To further strengthen identity preservation across diverse contexts, we design IMPRINT, a two-stage learning framework that decouples identity preservation from compositing: the first stage performs dense-representation learning, extracting view-invariant object embeddings from the reference; the second, harmonization stage seamlessly integrates the object into the background. In addition, a shape-guidance mechanism enables user-directed layout control.
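
To make the decoupling concrete, the following is a minimal, self-contained PyTorch sketch of the idea only: a stage-one encoder produces an object embedding from a reference crop, and a stage-two module conditions generation on that embedding plus a shape mask. All module names, layer choices, and shapes here are illustrative assumptions, not the architecture used in ObjectStitch or IMPRINT.

import torch
import torch.nn as nn

class ObjectEncoder(nn.Module):
    """Stage 1 (sketch): map a reference object crop to a compact, view-invariant embedding."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.proj = nn.Linear(128, embed_dim)  # stand-in for a "content adapter"

    def forward(self, ref_crop):                   # ref_crop: (B, 3, H, W)
        feat = self.backbone(ref_crop).flatten(1)  # (B, 128)
        return self.proj(feat)                     # (B, embed_dim)

class Harmonizer(nn.Module):
    """Stage 2 (sketch): fuse the background, a user-supplied shape mask, and the object embedding."""
    def __init__(self, embed_dim=256):
        super().__init__()
        self.enc = nn.Conv2d(4, 64, 3, padding=1)   # RGB background + 1-channel shape mask
        self.film = nn.Linear(embed_dim, 2 * 64)    # conditioning via feature-wise scale/shift
        self.dec = nn.Conv2d(64, 3, 3, padding=1)

    def forward(self, background, shape_mask, obj_embed):
        x = torch.relu(self.enc(torch.cat([background, shape_mask], dim=1)))
        scale, shift = self.film(obj_embed).chunk(2, dim=1)
        x = x * (1 + scale[..., None, None]) + shift[..., None, None]
        return self.dec(x)                          # composited image

# Toy usage: a 128x128 background, a binary layout mask, and one reference crop.
encoder, harmonizer = ObjectEncoder(), Harmonizer()
ref = torch.randn(1, 3, 128, 128)
bg, mask = torch.randn(1, 3, 128, 128), torch.ones(1, 1, 128, 128)
out = harmonizer(bg, mask, encoder(ref))
print(out.shape)  # torch.Size([1, 3, 128, 128])

In the actual systems the generator is a conditional diffusion model rather than the toy feed-forward module above; the sketch only indicates where the identity embedding and the shape guidance enter the pipeline.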

To address the challenge of generative artifacts that widely exist in image synthesis, we present a reference-guided artifact refinement model (Refine-by-Align). Its two-stage framework, alignment followed by refinement, extracts regional features from reference images to repair artifacts in composited outputs. This model-agnostic solution enhances identity details and generalizes across customization, virtual try-on, and view synthesis tasks.
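
As a rough illustration of the align-then-refine pattern, the sketch below uses cross-attention between composite and reference features as the alignment step and a masked repaint as the refinement step. The class name, layers, and masking scheme are hypothetical stand-ins, not the Refine-by-Align model itself.

import torch
import torch.nn as nn

class AlignAndRefine(nn.Module):
    """Hypothetical align-then-refine module illustrating the pattern, not the thesis model."""
    def __init__(self, dim=64):
        super().__init__()
        self.feat = nn.Conv2d(3, dim, 3, padding=1)                    # shared feature extractor
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.refine = nn.Conv2d(dim + 3, 3, 3, padding=1)              # repaint head

    def forward(self, composite, reference, artifact_mask):
        # Alignment stage: every composite location attends to reference features,
        # retrieving the regional detail it should have preserved.
        q = self.feat(composite).flatten(2).transpose(1, 2)    # (B, HW, C)
        kv = self.feat(reference).flatten(2).transpose(1, 2)   # (B, HW, C)
        aligned, _ = self.attn(q, kv, kv)
        B, _, H, W = composite.shape
        aligned = aligned.transpose(1, 2).reshape(B, -1, H, W)
        # Refinement stage: repaint, then replace pixels only inside the artifact mask.
        repainted = self.refine(torch.cat([aligned, composite], dim=1))
        return artifact_mask * repainted + (1 - artifact_mask) * composite

# Toy usage: a 64x64 composite with a marked artifact region.
model = AlignAndRefine()
composite, reference = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
mask = torch.zeros(1, 1, 64, 64)
mask[..., 20:40, 20:40] = 1.0
print(model(composite, reference, mask).shape)  # torch.Size([1, 3, 64, 64])

Because the correction is driven only by reference features and confined to the artifact mask, a module of this shape can in principle be attached to the output of any generator, which is the sense in which such a refiner is model-agnostic.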

Together, the contributions form a cohesive pipeline: a self-supervised backbone for compositing, a decoupled framework for identity preservation, and a universal refiner for artifact correction. Extensive experiments and user studies validate our methods’ superiority in realism and faithfulness, establishing new benchmarks for personalized image editing.

Funding

III: Medium: Collaborative Research: Deep Generative Modeling for Urban and Archaeological Recovery

Directorate for Computer & Information Science & Engineering


Elements: Data: U-Cube: A Cyberinfrastructure for Unified and Ubiquitous Urban Canopy Parameterization

Directorate for Computer & Information Science & Engineering


History

Degree Type

  • Doctor of Philosophy

Department

  • Computer Science

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Daniel Aliaga

Additional Committee Member 2

Bedrich Benes

Additional Committee Member 3

Raymond Yeh

Additional Committee Member 4

Voicu Popescu
