Computational methods for protein-protein interaction identification
Understanding protein-protein interactions (PPIs) in a cell is essential for learning protein functions, pathways, and mechanisms of disease. This dissertation introduces computational methods for predicting PPIs. The first chapter reviews the history of protein interaction identification and introduces several experimental methods. Because interacting proteins tend to share similar functions, functional similarity can be used as a feature to predict PPIs. The NaviGO server was developed for biologists and bioinformaticians to visualize relationships among Gene Ontology terms and to quantify their similarity scores. The computational features used to predict PPIs are then summarized. This summary will help researchers from computational fields understand the rationale for extracting biological features, and will help researchers with a biology background understand the computational work. Building on these features, a computational method for large-scale PPI prediction was developed and applied to Arabidopsis, maize, and soybean on a whole-genome scale. The novel predicted PPIs are grouped by prediction confidence level and can serve as testable hypotheses to guide biologists’ experiments. Because affinity chromatography combined with mass spectrometry yields many false-positive PPIs, the computational method was combined with mass spectrometry data to aid the large-scale identification of high-confidence PPIs. Lastly, remaining challenges for computational PPI prediction methods and directions for future work are discussed.