Purdue University Graduate School
Browse

VISUAL INTERPRETATION TO UNCERTAINTIES IN 2D EMBEDDING FROM PROBABILISTIC-BASED NON-LINEAR DIMENSIONALITY REDUCTION METHODS

Download (21.76 MB)
thesis
posted on 2022-06-07, 16:00 authored by Junhan ZhaoJunhan Zhao
Enabling human understanding of high-dimensional (HD) data is critical for scientific research but highly challenging. To deal with large datasets, probabilistic-based non-linear DR models, like UMAP and t-SNE, lead the performance on reducing the high dimensionality. However, considering the trade-off between global and local structure preservation and the randomness initialized for computation, applying non-linear models in different parameter settings to unknown high-dimensional structure data may return different 2D visual forms. Much critical neighborhood relationship may be falsely imposed, and uncertainty may be introduced into the low-dimensional embedding visualizations, so-called distortion. In this work, a survey has been conducted to illustrate the most state-of-the-art layout enrichment works for interpreting dimensionality reduction methods and results. Responding to the lack of visual interpretation techniques to probabilistic-based DR methods, we propose a visualization technique called ManiGraph, which facilitates users to explore multi-view 2D embeddings via mesoscopic structure graphs. A dynamic mesoscopic structure first subsets HD data by a hexagonal grid in visual space from non-linear embedding (e.g., UMAP). Then, it measures the regional adapted trustworthiness/continuity and visualizes the restored missing and highlighted false connections between subsets from high-dimensional space to the low-dimensional in a node-linkage manner. The visualization helps users understand and interpret the distortion from both visualization and model stages. We further demonstrate the user cases tested on intuitive 3D toy datasets, fashion-MNIST, and single-cell RNA sequencing with domain experts in unsupervised scenarios. This work will potentially benefit the data science community, from toolkit users to DR algorithm developers.<br>

Funding

Bilsland Fellowship

History

Degree Type

  • Doctor of Philosophy

Department

  • Computer Graphics Technology

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Yingjie Chen

Additional Committee Member 2

James Mohler

Additional Committee Member 3

David Ebert

Additional Committee Member 4

Baijian Yang

Usage metrics

    Licence

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC