Exploring Self-Organizing Maps: Applications in Omics Data Analysis and Integration with a Generative Model
This thesis revisits self-organizing maps (SOM) and explores their use as both a clustering method and a feature learning method for complex biological and temporal datasets. The first study integrates evolutionary information with co-fractionation mass spectrometry data to create enhanced benchmarks for protein complex prediction. The second study applies SOM to investigate the relationships between mRNA and protein expression levels during cotton fiber development. By clustering mRNA-protein pairs based on their time-course profiles, SOM captures distinct, non-linear patterns that traditional linear correlation methods miss, offering new insights into gene expression and protein production dynamics. In the third study, SOM is integrated with a generative model, the Variational Autoencoder (VAE), and a Long Short-Term Memory (LSTM) network to address challenges in time-series clustering. This framework combines the VAE for learning latent representations, the LSTM for capturing temporal dependencies, and SOM for clustering, achieving superior performance in modeling complex temporal patterns. Together, these studies present SOM as a versatile and effective tool for uncovering complex patterns in biological and temporal datasets, demonstrating its relevance to modern data analysis.
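As a rough illustration of the SOM clustering step described above, the following is a minimal sketch and not the implementation used in the thesis. It assumes each mRNA-protein pair is represented by its two time-course profiles concatenated into one vector. The function names (`train_som`, `assign_clusters`), the grid size, the decay schedules, and the synthetic data are all hypothetical choices made for the example.

```python
import numpy as np

def train_som(data, rows=2, cols=2, n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    dim = data.shape[1]
    weights = rng.normal(size=(rows, cols, dim))
    # Grid coordinates of every node, used by the neighborhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.3      # shrinking neighborhood radius
        x = data[rng.integers(len(data))]        # one randomly chosen sample
        # Best-matching unit: the node whose weight vector is closest to x.
        dists = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighborhood centered on the BMU, measured on the grid.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Pull each node toward x in proportion to its neighborhood weight.
        weights += lr * h[..., None] * (x - weights)
    return weights

def assign_clusters(data, weights):
    """Assign each sample the flat index of its best-matching node."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)

# Synthetic stand-in data (assumption): 5 time points per profile.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 5)
up, down = t, 1 - t
# Pairs where mRNA and protein both rise, and pairs where they diverge.
concordant = np.concatenate([up, up]) + rng.normal(scale=0.05, size=(20, 10))
discordant = np.concatenate([up, down]) + rng.normal(scale=0.05, size=(20, 10))
profiles = np.vstack([concordant, discordant])

weights = train_som(profiles, rows=2, cols=2)
labels = assign_clusters(profiles, weights)
```

Because the SOM compares whole profile shapes rather than a single correlation coefficient per pair, pairs whose mRNA and protein trajectories diverge can map to different nodes than pairs whose trajectories move together. In practice, profiles would typically be standardized before training.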
History
Degree Type
- Doctor of Philosophy
Department
- Statistics
Campus location
- West Lafayette