Langevinized Ensemble Kalman Filter for Large-Scale Dynamic Systems
The ensemble Kalman filter (EnKF) has achieved great success in data assimilation in the atmospheric and oceanic sciences, but its failure to converge to the correct filtering distribution precludes its use for uncertainty quantification. Other existing methods, such as the particle filter and the sequential importance sampler, do not scale well with the dimension of the system or the sample size of the datasets. In this dissertation, we address these difficulties in a coherent way.
In the first part of the dissertation, we reformulate the EnKF under the framework of Langevin dynamics, which leads to a new particle filtering algorithm, the so-called Langevinized EnKF (LEnKF). The LEnKF algorithm inherits the forecast-analysis procedure from the EnKF and the use of mini-batch data from stochastic gradient Langevin-type algorithms, which together make it scalable with respect to both the dimension and the sample size. We prove that the LEnKF converges to the correct filtering distribution in Wasserstein distance in the big-data scenario, where the dynamic system consists of a large number of stages and has a large number of samples observed at each stage, and thus it can be used for uncertainty quantification. We also reformulate the Bayesian inverse problem as a dynamic state estimation problem based on the techniques of subsampling and the Langevin diffusion process. We illustrate the performance of the LEnKF using a variety of examples, including the Lorenz-96 model, high-dimensional variable selection, Bayesian deep learning, and Long Short-Term Memory (LSTM) network learning with dynamic data.
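For readers unfamiliar with the forecast-analysis procedure mentioned above, the following is a minimal sketch of the analysis step of the standard perturbed-observation EnKF for a linear Gaussian observation model. It is not the LEnKF itself: the Langevin reformulation and the mini-batch subsampling are omitted, and the function name `enkf_analysis` and all of its arguments are assumptions made for illustration.

```python
import numpy as np


def enkf_analysis(ensemble, y, H, R, rng):
    """Analysis step of the standard perturbed-observation EnKF (illustrative).

    ensemble : (m, p) array, m forecast members of state dimension p
    y        : (d,) observation vector
    H        : (d, p) linear observation operator
    R        : (d, d) observation-noise covariance
    rng      : numpy.random.Generator used to perturb the observation
    """
    m = ensemble.shape[0]
    # Sample covariance of the forecast ensemble (rows are members).
    C = np.atleast_2d(np.cov(ensemble, rowvar=False))
    # Kalman gain K = C H^T (H C H^T + R)^{-1}, computed with a linear solve.
    S = H @ C @ H.T + R
    K = np.linalg.solve(S, H @ C).T  # valid because S and C are symmetric
    # Each member is updated against its own perturbed copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(y)), R, size=m)
    innovations = (y + noise) - ensemble @ H.T
    return ensemble + innovations @ K.T


# Toy usage: a 2-dimensional state observed directly with small noise.
rng = np.random.default_rng(0)
forecast = rng.normal(size=(200, 2))      # forecast ensemble drawn from N(0, I)
observation = np.array([2.0, -1.0])
analysis = enkf_analysis(forecast, observation, np.eye(2), 0.01 * np.eye(2), rng)
print(analysis.mean(axis=0))              # ensemble mean after the update
```

Because the observation noise in this toy setting is much smaller than the forecast spread, the updated ensemble mean moves close to the observation. The gain is built from the ensemble sample covariance, so no full covariance propagation is needed, which is the usual reason the EnKF is attractive in high dimensions.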
In the second part of the dissertation, we focus on two extensions of the LEnKF algorithm. Like the EnKF, the LEnKF algorithm was developed for Gaussian dynamic systems containing no unknown parameters. We propose the so-called stochastic approximation LEnKF (SA-LEnKF) for simultaneously estimating the states and parameters of dynamic systems, where the parameters are estimated on the fly, under the framework of stochastic approximation, from the state variables simulated by the LEnKF. Under mild conditions, we prove the consistency of the resulting parameter estimator and the ergodicity of the SA-LEnKF. For non-Gaussian dynamic systems, we extend the LEnKF algorithm (Extended LEnKF) by introducing a latent Gaussian measurement variable into the dynamic system. These two extensions inherit the scalability of the LEnKF algorithm with respect to the dimension and the sample size. The numerical results indicate that they outperform other existing methods in both state/parameter estimation and uncertainty quantification.
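To illustrate what estimating parameters "on the fly" under stochastic approximation can look like, the sketch below applies a generic Robbins-Monro recursion to a toy scalar AR(1) system x_t = a x_{t-1} + noise. It is not the SA-LEnKF: the states are generated directly from the true model instead of being simulated by a filter, and the function name, the step-size schedule 1/(t + 50), and the toy model are all assumptions chosen for illustration.

```python
import numpy as np


def estimate_ar1_coefficient(n_steps=5000, a_true=0.8, seed=0):
    """Robbins-Monro estimate of the AR(1) coefficient (illustrative toy).

    The parameter estimate is updated once per time step from the newest
    pair of states, so no past data needs to be stored.
    """
    rng = np.random.default_rng(seed)
    a_hat, x_prev = 0.0, 0.0
    for t in range(n_steps):
        # Hypothetical stand-in for a state drawn by a filter: here we
        # simply simulate the true system with unit-variance noise.
        x_new = a_true * x_prev + rng.normal()
        # Decreasing step size (sum diverges, sum of squares converges).
        gamma = 1.0 / (t + 50.0)
        # Stochastic gradient of the one-step Gaussian log-likelihood in a.
        a_hat += gamma * (x_new - a_hat * x_prev) * x_prev
        x_prev = x_new
    return a_hat


print(estimate_ar1_coefficient())  # estimate of the true coefficient 0.8
```

The decreasing step size is what lets the estimate settle: early updates move the parameter quickly, while later updates average out the noise in the individual state pairs.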