Purdue University Graduate School
File(s) under embargo

Reason: Pending patent application.

3 month(s) 13 day(s) until file(s) become available

Learning From Data Across Domains: Enhancing Human and Machine Understanding of Data From the Wild

thesis
posted on 2023-12-13, 14:53 authored by Sean Michael Kulinski

Data is collected everywhere in our world; however, it is often noisy and incomplete. Different sources of data may differ in their characteristics and quality, or may come from dynamic and diverse environments. This poses challenges both for humans who want to gain insights from data and for machines that learn patterns from data. How can we leverage the diversity of data across domains to enhance our understanding and decision-making? In this thesis, we address this question by proposing novel methods and applications that use multiple domains as more holistic sources of information for both human and machine learning tasks. For example, to help human operators understand environmental dynamics, we show how to detect distribution shifts and localize them to problematic features, as well as how interpretable distributional mappings can be used to explain the differences between shifted distributions. For robustifying machine learning, we propose a causal-inspired method to find latent factors that are robust to environmental changes and can be used for counterfactual generation or domain-independent training; we propose a domain generalization framework that allows for fast and scalable models that are robust to distribution shift; and we introduce a new dataset based on human matches in StarCraft II that exhibits complex and shifting multi-agent behaviors. We showcase our methods across various domains, including healthcare, natural language processing (NLP), and computer vision (CV), to demonstrate that learning from data across domains can lead to more faithful representations of data and its generating environments for both humans and machines.

Funding

Northrop Grumman Corporation

Army Research Laboratory (W911NF-2020-221)

Office of Naval Research (N00014-23-C-1016)

National Science Foundation (IIS-2212097)

Microsoft Research

Degree Type

  • Doctor of Philosophy

Department

  • Electrical and Computer Engineering

Campus location

  • West Lafayette

Advisor/Supervisor/Committee Chair

David Inouye

Additional Committee Member 2

Murat Kocaoglu

Additional Committee Member 3

Stanley Chan

Additional Committee Member 4

Xiaoqian Wang