TOWARDS MOTION-RELATED VISUAL LEARNING

Lu, Yawen

doi:10.25394/PGS.28853507.v1

thesis

posted on 2025-04-25, 12:29 authored by Yawen Lu

Computer systems face significant challenges in understanding and processing video content, despite recent advances in deep learning technologies. While specialized solutions exist for individual tasks like optical flow estimation and video depth perception, current approaches are limited by task-specific constraints, insufficient temporal modeling, and heavy reliance on densely annotated data. These limitations contrast sharply with human visual perception, which processes motion holistically and can function effectively with incomplete information.

In this dissertation, we address the research question: how can we develop more efficient and generalizable visual understanding systems that better align with human-like visual perception? Our investigation yields three major contributions. First, we present an optical flow estimation framework that models long-range dependencies in video sequences. It addresses occlusions and geometric variations by leveraging information from multiple frames through transformer-based spatial and temporal attention mechanisms. Second, we introduce a unified prototypical transformer framework that bridges the gap between task-specific solutions and holistic motion understanding. This approach achieves superior results in both optical flow and depth estimation tasks through feature denoising and prototypical learning mechanisms. Third, we explore a label-efficient learning framework that generates high-quality simulated masks from sparsely annotated frames using optical flow estimation and differentiable warping, significantly reducing the annotation burden for video object segmentation tasks.
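The third contribution relies on warping an annotated mask to neighboring frames with an optical flow field. The following is a minimal NumPy sketch of that general idea, not the dissertation's implementation: the function name `warp_mask`, the `(dx, dy)` flow layout, and the border clamping are all assumptions made for illustration. It uses bilinear sampling, which is the operation that makes such a warp differentiable when written in an autodiff framework; plain NumPy computes no gradients.

```python
import numpy as np

def warp_mask(mask, flow):
    """Backward-warp a mask with a dense flow field (illustrative sketch).

    mask: (H, W) float array, e.g. a segmentation mask from an annotated frame.
    flow: (H, W, 2) array of per-pixel (dx, dy) offsets (assumed layout).
    Returns out with out[y, x] = mask sampled bilinearly at (y + dy, x + dx).
    """
    h, w = mask.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)

    # Sampling locations, clamped to the image border (an assumed policy).
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear interpolation weights.
    wx = sx - x0
    wy = sy - y0

    top = (1 - wx) * mask[y0, x0] + wx * mask[y0, x1]
    bottom = (1 - wx) * mask[y1, x0] + wx * mask[y1, x1]
    return (1 - wy) * top + wy * bottom
```

With a zero flow field the function returns the mask unchanged, and fractional offsets produce soft mask values between 0 and 1, which is why a thresholding or refinement step would typically follow in practice.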

Degree Type

Doctor of Philosophy

Department

Computer Graphics Technology

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Yingjie Victor Chen

Additional Committee Member 2

Songlin Fei

Additional Committee Member 3

Baijian Yang

Additional Committee Member 4

Christos Mousas

Keywords

video analysis algorithms video understanding computer graphics applications forestry applications

Licence

CC BY 4.0
