The Design of an Oncology Knowledge Base from an Online Health Forum
Knowledge base completion is an important task that allows scientists to reason over knowledge bases and discover new facts. In this thesis, a patient-centric knowledge base
is designed and constructed using medical entities and relations extracted from the health forum r/cancer. The knowledge base stores information in binary relation triplets. It is enhanced with an is-a relation that is able to represent the hierarchical relationship between different medical entities. An enhanced Neural Tensor Network that utilizes the frequency of occurrence of relation triplets in the dataset is then developed to infer new facts from
the enhanced knowledge base. The results show that when the enhanced inference model uses the enhanced knowledge base, a higher accuracy (73.2 %) and recall@10 (35.4%) are obtained. In addition, this thesis describes a methodology for knowledge base and associated
inference model design that can be applied to other chronic diseases.
Funding
Merck Sharp & Dohme Corp., a subsidiary of Merck & Co., Inc., Kenilworth, NJ, USA
History
Degree Type
- Master of Science in Electrical and Computer Engineering
Department
- Electrical and Computer Engineering
Campus location
- Indianapolis