APPLICATION OF MANIFOLD EMBEDDING OF THE MOLECULAR SURFACE TO SOLID-STATE PROPERTY PREDICTION
The pharmaceutical industry depends on a deep understanding of pharmaceutical excipients and active ingredients. Their physicochemical properties must be sufficiently understood to create a safe and efficacious drug product. High-throughput methods have reduced the time and material required to measure many of these properties, but some remain difficult to evaluate. One such property is solubility, the amount of a material that can dissolve at equilibrium. Solubility is an essential factor in determining the bioavailability of an active ingredient and therefore directly impacts the effectiveness and marketability of the drug product.
Solubility can be a challenging, time-consuming, and material-intensive property to measure correctly. Because experimental values are so difficult to determine, researchers have devoted considerable effort to the accurate prediction of solubility for drug-like compounds. This remains a difficult task because there are two hurdles to overcome: data quality and the specificity of molecular descriptors. Large databases of reliable solubility values have become more readily available in recent years, lowering the first barrier to more accurate solubility predictions. The second hurdle has proven more challenging to overcome. Advances in artificial intelligence (AI) have provided opportunities for improved estimates. Specifically, machine learning and neural networks, both subfields of AI, make it possible to evaluate vast quantities of data with relative ease. The remaining barrier lies in pairing appropriate AI techniques with descriptors that accurately capture the relevant molecular features. Although many attempts have been made, no single set of descriptors, used with either data-driven approaches or ab initio methods, has accurately predicted solubility.
The research in this dissertation attempts to lower the second barrier to solubility prediction by starting with the molecular features that are most important to solubility. Deriving molecular descriptors from the electronic properties on the surface of molecules yields precise descriptions of the strength and locality of intermolecular interactions, which are critical factors in the extent of solubility. The novel molecular descriptors are readily integrated into a Deep-sets based Graph and Self-Attention Neural Network, which is used to evaluate their predictive performance. The findings of this research indicate significant improvement in predicting intrinsic solubility over other literature-reported methods.
Degree Type
- Doctor of Philosophy
Department
- Industrial and Physical Pharmacy
Campus location
- West Lafayette