The full paper

Reason: This dissertation includes the contents of two papers. One is about to be submitted for review and the other is still being edited. Therefore, we prefer not to publish the full text yet.

9 month(s), 2 day(s) until file(s) become available

VISUAL INTERPRETATION TO UNCERTAINTIES IN 2D EMBEDDING FROM PROBABILISTIC-BASED NON-LINEAR DIMENSIONALITY REDUCTION METHODS

thesis
posted on 25.06.2021, 13:56 by Junhan Zhao
Enabling human understanding of high-dimensional (HD) data is critical for scientific research but highly challenging. For large datasets, probabilistic-based non-linear dimensionality reduction (DR) models such as UMAP and t-SNE are the leading methods for reducing high dimensionality. However, because of the trade-off between global and local structure preservation and the random initialization used in computation, applying non-linear models with different parameter settings to HD data of unknown structure may return different 2D visual forms. Critical neighborhood relationships may be falsely imposed, and uncertainty, the so-called distortion, may be introduced into the low-dimensional embedding visualizations. In this work, we conduct a survey of state-of-the-art layout enrichment techniques for interpreting dimensionality reduction methods and results. In response to the lack of visual interpretation techniques for probabilistic-based DR methods, we propose a visualization technique called ManiGraph, which helps users explore multi-view 2D embeddings via mesoscopic structure graphs. A dynamic mesoscopic structure first partitions the HD data into subsets using a hexagonal grid in the visual space of a non-linear embedding (e.g., UMAP). It then measures regionally adapted trustworthiness and continuity and visualizes, in a node-link manner, the restored missing connections and the highlighted false connections between subsets when moving from the high-dimensional space to the low-dimensional one. The visualization helps users understand and interpret distortion arising at both the visualization and model stages. We further demonstrate use cases on intuitive 3D toy datasets, Fashion-MNIST, and single-cell RNA sequencing data, tested with domain experts in unsupervised scenarios. This work can potentially benefit the data science community, from toolkit users to DR algorithm developers.
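To make the trustworthiness and continuity measures mentioned in the abstract more concrete, the following is a minimal sketch of the standard, global rank-based definitions of these two scores. It is not the regionally adapted variant used by ManiGraph, whose details are not given here. The function names, the brute-force neighbor ranking, and the index-based tie-breaking are all assumptions made for illustration.

```python
import math


def _ranks(points):
    """ranks[i][j] = rank of point j by distance from point i (nearest = 1).

    Ties are broken by index; this is an illustrative choice only.
    """
    n = len(points)
    ranks = []
    for i in range(n):
        order = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.dist(points[i], points[j]), j),
        )
        row = [0] * n
        for rank, j in enumerate(order, start=1):
            row[j] = rank
        ranks.append(row)
    return ranks


def trustworthiness(high, low, k):
    """Global trustworthiness of the embedding `low` with respect to `high`.

    Penalizes "false" neighbors: points among the k nearest neighbors in
    the embedding that are not among the k nearest in the original space.
    """
    n = len(high)
    if len(low) != n:
        raise ValueError("high and low must contain the same number of points")
    if not 0 < k < n / 2:
        raise ValueError("k must satisfy 0 < k < n / 2")
    r_high = _ranks(high)
    r_low = _ranks(low)
    penalty = 0
    for i in range(n):
        for j in range(n):
            if j != i and r_low[i][j] <= k and r_high[i][j] > k:
                penalty += r_high[i][j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


def continuity(high, low, k):
    """Global continuity: penalizes "missing" neighbors, i.e. points that are
    close in the original space but pushed apart in the embedding.

    It is the trustworthiness computation with the two spaces swapped.
    """
    return trustworthiness(low, high, k)
```

Under these assumptions, an embedding that preserves every k-nearest-neighbor set scores 1.0 on both measures, and lower values indicate more false neighbors (trustworthiness) or more missing neighbors (continuity). A regional variant could, for example, restrict the sums to the points that fall in one hexagonal grid cell, but that is a guess at the general idea rather than a description of the thesis method.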

Funding

Bilsland Fellowship

Degree Type

Doctor of Philosophy

Department

Computer Graphics Technology

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Yingjie Chen

Additional Committee Member 2

James Mohler

Additional Committee Member 3

David Ebert

Additional Committee Member 4

Baijian Yang