Asymmetry Learning for Out-of-distribution Tasks

Sekar, Chandra Mouli

doi:10.25394/PGS.25686429.v1

thesis

posted on 2024-05-02, 17:45 authored by Chandra Mouli Sekar

Despite their astonishing capacity to fit data, neural networks have difficulty extrapolating beyond the training data distribution. When the out-of-distribution prediction task is formalized as a counterfactual query on a causal model, the reason for their extrapolation failure is clear: neural networks learn spurious correlations in the training data rather than features that are causally related to the target label. This thesis proposes to perform a causal search over a known family of causal models to learn robust (maximally invariant) predictors for single- and multiple-environment extrapolation tasks.

First, I formalize the out-of-distribution task as a counterfactual query over a structural causal model. For single-environment extrapolation, I argue that symmetries of the input data are valuable for training neural networks that can extrapolate. I introduce Asymmetry learning, a new learning paradigm guided by the hypothesis that all (known) symmetries are mandatory even without evidence in training, unless the learner deems them inconsistent with the training data. Asymmetry learning performs a causal model search to find the simplest causal model defining a causal connection between the target labels and the symmetry transformations that affect the label. My experiments on a variety of out-of-distribution tasks on images and sequences show that the proposed methods extrapolate much better than standard neural networks.

Then, I consider multiple-environment out-of-distribution (OOD) tasks in dynamical system forecasting that arise from shifts in the initial conditions or parameters of the dynamical system. I identify key OOD challenges in existing deep learning and physics-informed machine learning (PIML) methods for these tasks. To mitigate these drawbacks, I combine meta-learning and causal structure discovery over a family of given structural causal models to learn the underlying dynamical system. In three simulated forecasting tasks, I show that the proposed approach is 2x to 28x more robust than the baselines.

Funding

CAREER IIS-1943364

CCF-1918483

CNS-2212160

Wabash Heartland Innovation Network (WHIN)

Amazon Research Award

Ford

Nvidia

CISCO

AnalytiXIN

Amazon

Degree Type

Doctor of Philosophy

Department

Computer Science

Campus location

West Lafayette

Advisor/Supervisor/Committee Chair

Bruno Ribeiro

Additional Committee Member 2

David Gleich

Additional Committee Member 3

Yexiang Xue

Additional Committee Member 4

Christopher Clifton

Keywords

Deep learning, Robustness, Out-of-distribution, Causality, physics-informed machine learning, invariance, symmetry

Licence

CC BY 4.0
