Purdue University Graduate School

Optimizing Initialization, Feature Selection, and Tensor Dimension Reduction in Unsupervised Learning: Methods and Applications

thesis
posted on 2025-04-17, 15:52 authored by Huyunting Huang Sr.

Unsupervised machine learning (ML) is essential for analyzing complex data without labels. This dissertation addresses three key challenges in the field: clustering initialization, unsupervised feature selection, and dimension reduction for tensors. It also applies unsupervised ML to airborne LiDAR data.

Chapter 2 introduces an improved initialization strategy for K-Means clustering and Gaussian Mixture Models (GMM). The proposed method improves clustering stability and accuracy.

Chapter 3 develops a stepwise unsupervised feature selection framework, Forward Partial-Variable Clustering with Full-Variable Loss (FPCFL), to improve clustering performance on high-dimensional data.

Chapter 4 focuses on tensor dimension reduction and feature selection in multiway data. It introduces Low-Rank Sparse Tensor Approximation (LRSTA) for efficient data compression and High-Order Orthogonal Decomposition (HOOD) for improved sparsity and interpretability, particularly in large-scale applications such as image and video analysis.

Chapter 5 applies unsupervised ML to airborne LiDAR data, using clustering and dimensionality reduction to enhance ground filtering and object detection in 3D point clouds.

This dissertation advances unsupervised ML by improving clustering reliability, optimizing feature selection, and enhancing tensor decomposition, contributing to more effective and scalable data-driven analysis.

Degree Type

  • Doctor of Philosophy

Department

  • Statistics

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Tonglin Zhang

Additional Committee Member 2

Baijian Yang

Additional Committee Member 3

Chong Gu

Additional Committee Member 4

Qifan Song
