Purdue University Graduate School

File(s) under embargo: 5 month(s) 15 day(s) until file(s) become available

TOWARDS IMPROVED REPRESENTATIONS ON HUMAN ACTIVITY UNDERSTANDING

thesis
posted on 2023-12-04, 19:15, authored by Hyung-gun Chi

Human action recognition is a cornerstone of computer vision, with applications spanning emergency response, sign language interpretation, and the growing fields of augmented and virtual reality. The transition from conventional video-based recognition to skeleton-based methods has been transformative, offering a robust alternative that is less susceptible to environmental noise and more focused on the dynamics of human movement.

This work traces the evolution of action recognition, emphasizing the pivotal role of Graph Convolutional Network (GCN)-based approaches, particularly the InfoGCN framework. InfoGCN introduces an information bottleneck-based learning objective, a self-attention graph convolution module, and a multi-modal representation of the human skeleton. Together, these advances improve the accuracy and efficiency of action recognition systems.

To address the prevalent challenge of occlusion, particularly in single-camera setups, this work introduces the Pose Relation Transformer (PORT) framework. Inspired by Masked Language Modeling in natural language processing, PORT refines the detection of occluded joints, improving the reliability of pose estimation when parts of the body are hidden from view.

Building upon the foundations laid by InfoGCN, the Skeleton ODE framework has been developed for online action recognition, enabling real-time inference without the need for complete action observation. By integrating Neural Ordinary Differential Equations, Skeleton ODE facilitates the prediction of future movements, thus reducing latency and paving the way for real-time applications.

The implications of this research are broad: real-time, efficient, and accurate human action recognition systems could significantly impact sectors including healthcare, autonomous vehicles, and interactive technologies. Future research directions include the integration of multi-modal data, the application of transfer learning for enhanced generalization, the optimization of models for edge computing, and the ethical deployment of action recognition technologies. The potential of these systems in healthcare, particularly in patient monitoring and disease detection, underscores the need for continued interdisciplinary collaboration and innovation.

Funding

FW-HTF 1839971

Degree Type

  • Doctor of Philosophy

Department

  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Karthik Ramani

Additional Committee Member 2

Stanley Chan

Additional Committee Member 3

David I. Inouye

Additional Committee Member 4

Ilias Bilionis
