Purdue University Graduate School

File(s) under embargo

Reason: Preparing and submitting manuscript for publication.

A Machine Learning Model of Perturb-Seq Data for use in Space Flight Gene Expression Profile Analysis

posted on 2024-04-27, 18:39, authored by Liam Fitzpatric Johnson

The genetic perturbations caused by spaceflight on biological systems tend to have a system-wide effect that is often difficult to deconvolute into individual signals with specific points of origin. Single-cell multi-omic data can provide a profile of the perturbational effects but do not necessarily indicate the initial point of interference within a network. The objective of this project is to take advantage of large-scale and genome-wide perturbational, or Perturb-Seq, datasets by using them to pre-train a generalist machine learning model capable of predicting the effects of unseen perturbations in new data. Perturb-Seq datasets are large libraries of single-cell RNA sequencing data collected from CRISPR knockout screens in cell culture. The advent of generative machine learning algorithms, particularly transformers, makes this an ideal time to reassess large-scale data libraries in order to grasp cell-wide and even organism-wide genomic expression motifs. By tailoring an algorithm to learn the downstream effects of genetic perturbations, we present a pre-trained generalist model capable of predicting the effects of multiple perturbations in combination, locating points of origin for perturbations in new datasets, predicting the effects of known perturbations in new datasets, and annotating large-scale network motifs. We demonstrate the utility of this model by identifying key perturbational signatures in RNA sequencing data from spaceflown biological samples from the NASA Open Science Data Repository.


Degree Type

  • Master of Science

Department

  • Agricultural and Biological Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

Dr. Caitlin Proctor

Advisor/Supervisor/Committee co-chair

Dr. Marshall Porterfield

Additional Committee Member 2

Dr. Leopold Green

Additional Committee Member 3

Dr. Lauren Sanders