Multimodal Data Management in Open-world Environment
The availability of abundant multimodal data, including textual, visual, and sensor-based information, holds the potential to improve decision-making in diverse domains. Extracting information for data-driven decision-making from heterogeneous and changing datasets in real-world data-centric applications requires complementary functionalities: multimodal data integration, knowledge extraction and mining, situationally aware data recommendation to different users, and uncertainty management in the open-world setting. Building a system that encompasses all of these functionalities requires addressing several challenges: (1) How to represent and analyze heterogeneous source contents and application context for multimodal data recommendation? (2) How to predict and fulfill current and future needs, without user intervention, as new information streams in? (3) How to integrate disconnected data sources and learn information relevant to specific mission needs? (4) How to scale from processing petabytes of data to exabytes? (5) How to deal with uncertainties in the open world that stem from changes in data sources and user requirements?
This dissertation tackles these challenges by proposing novel frameworks, learning-based data integration and retrieval models, and algorithms that empower decision-makers to extract valuable insights from diverse multimodal data sources. The contributions of this dissertation can be summarized as follows: (1) We developed SKOD, a novel multimodal knowledge querying framework that overcomes data representation, scalability, and data completeness issues by combining streaming brokers and RDBMS capabilities with entity-centric semantic features as an effective representation of content and context. As part of the framework, we also developed HART, a novel text attribute recognition model that leverages language models and the syntactic properties of large unstructured texts. (2) Within the SKOD framework, we incrementally proposed three approaches for integrating the disconnected sources through their semantic features to build a common knowledge base with the user information need: (i) EARS: A mediator approach that uses schema mapping of the semantic features and SQL joins to address scalability challenges in data integration; (ii) FemmIR: A data integration approach for more susceptible and flexible applications that utilizes neural network-based graph matching techniques to learn coordinated graph representations of the data. It introduces a novel approach for creating graphs from the features and a novel similarity metric among data sources; (iii) WeSJem: An approach that allows zero-shot similarity matching and data discovery by using contrastive learning to embed data samples and query examples in a high-dimensional space, with features as a novel source of supervision instead of relevance labels. (3) Finally, to manage uncertainties in multimodal data management for open-world environments, we characterized novelties in multimodal information retrieval based on data drift. Moreover, we proposed a novelty detection and adaptation technique as an augmentation to WeSJem.
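To make the mediator idea behind EARS more concrete, the following is a minimal sketch, not the dissertation's implementation. It assumes two hypothetical per-modality feature tables (`text_features`, `video_features`), a hand-written schema mapping (`SCHEMA_MAP`) onto shared attribute names, and an equality join on those attributes; every table, column, and function name here is illustrative only.

```python
import sqlite3

# Hypothetical per-modality feature tables; all names and rows are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE text_features (doc_id TEXT, gender TEXT, clothing_color TEXT);
CREATE TABLE video_features (clip_id TEXT, person_gender TEXT, upper_color TEXT);
INSERT INTO text_features VALUES ('report-1', 'female', 'red'),
                                 ('report-2', 'male', 'blue');
INSERT INTO video_features VALUES ('clip-7', 'female', 'red'),
                                  ('clip-9', 'male', 'green');
""")

# Assumed schema mapping: mediated attribute name -> source column, per table.
SCHEMA_MAP = {
    "text_features": {"id": "doc_id", "gender": "gender", "color": "clothing_color"},
    "video_features": {"id": "clip_id", "gender": "person_gender", "color": "upper_color"},
}

def mediated_view(table):
    """Build a SELECT that renames a source table's columns to the mediated schema."""
    m = SCHEMA_MAP[table]
    return (f"SELECT {m['id']} AS id, {m['gender']} AS gender, "
            f"{m['color']} AS color FROM {table}")

def match_sources(conn, left, right):
    """Join two sources on the shared semantic features of the mediated schema."""
    query = f"""
        SELECT l.id, r.id
        FROM ({mediated_view(left)}) AS l
        JOIN ({mediated_view(right)}) AS r
          ON l.gender = r.gender AND l.color = r.color
        ORDER BY l.id, r.id
    """
    return conn.execute(query).fetchall()

matches = match_sources(conn, "text_features", "video_features")
print(matches)
```

In this toy setting only the text report and the video clip that agree on both mapped features are paired. A learned similarity, as described for FemmIR and WeSJem, would replace the exact-equality join condition.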
The effectiveness of the proposed frameworks, models, and algorithms was demonstrated through real-world system prototypes that solved open problems requiring large-scale human endeavors and computational resources. Specifically, these prototypes assisted law enforcement officers in automating investigations and finding missing persons.
Funding
NGC REALM
W911NF2020003
Degree Type
- Doctor of Philosophy
Department
- Computer Science
Campus location
- West Lafayette