Statistica Sinica: Volume 34, Online Special Issue, April 2024
http://www3.stat.sinica.edu.tw/statistica/
Tue, 23 April 2024 00:01:00 +0000
/statistica/J34N21/J34N2100/J34N2100.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Peiyi Zhang, Qifan Song and Faming Liang 1789-1808
/statistica/J34N21/J34N2101/J34N2101.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Peiyi Zhang, Qifan Song and Faming Liang 1789-1808<span style='font-size=12pt;'><center>Abstract</center> The ensemble Kalman filter (EnKF) performs well in terms of data assimilation in atmospheric and oceanic sciences. However, it fails to converge to the correct filtering distribution, which precludes its use for uncertainty quantification in dynamic systems. Thus, we reformulate the EnKF under the framework of Langevin dynamics, yielding a new particle filtering algorithm, which we call the Langevinized EnKF (LEnKF). The LEnKF inherits the forecast-analysis procedure from the EnKF, and uses mini-batch data from stochastic gradient Langevin dynamics (SGLD). We prove that the LEnKF is a sequential preconditioned SGLD sampler, like the EnKF, but with its execution accelerated by the forecast-analysis procedure. Furthermore, the LEnKF converges to the correct filtering distribution in terms of the 2-Wasserstein distance as the number of iterations per stage increases. We demonstrate the performance of the LEnKF using a variety of examples. The LEnKF is not only scalable with respect to the state dimension and the sample size, but also tends to be immune to sample degeneracy for long-series dynamic data. <p>Key words and phrases: Data assimilation, inverse problem, state space model, stochastic gradient Markov chain Monte Carlo, uncertainty quantification.</span>
/statistica/J34N21/J34N2102/J34N2102.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> We propose a divide-and-conquer approach to filtering. The proposed approach decomposes the state variable into low-dimensional components, to which standard particle filtering tools can be successfully applied, and recursively merges them to recover the full filtering distribution. This approach is less dependent on factorizing transition densities and observation likelihoods than are competing approaches, and can be applied to a broader class of models. We compare the performance of the proposed approach with that of state-of-the-art methods on a benchmark problem, and show that the proposed method is broadly comparable in settings in which the other methods are applicable, and that it can be applied in settings in which they cannot. <p>Key words and phrases: Data assimilation, marginal particle filter, particle filtering, spatio-temporal models, state-space model.</span>
/statistica/J34N21/J34N2103/J34N2103.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> The particle-based rapid incremental smoother (PARIS) is a sequential Monte Carlo technique that allows for efficient online approximations of expectations of additive functionals under Feynman-Kac path distributions. Under weak assumptions, the algorithm has linear computational complexity and limited memory requirements. It also comes with a number of nonasymptotic bounds and convergence results. However, being based on self-normalized importance sampling, the PARIS estimator is biased. This bias is inversely proportional to the number of particles, but has been found to grow linearly with the time horizon, under appropriate mixing conditions. In this work, we propose the Parisian particle Gibbs (PPG) sampler, which has essentially the same complexity as that of the PARIS, but significantly reduces the bias for a given computational complexity at the cost of a modest increase in the variance. This method is a wrapper, in the sense that it uses the PARIS algorithm in the inner loop of the particle Gibbs algorithm to form a bias-reduced version of the targeted quantities. We substantiate the PPG algorithm with theoretical results, including new bounds on the bias and variance, as well as deviation inequalities. We illustrate our theoretical results using numerical experiments that support our claims. <p>Key words and phrases: Bias reduction, particle filters, particle Gibbs, sequential Monte Carlo, smoothing of additive functionals, state space smoothing.</span>
/statistica/J34N21/J34N2104/J34N2104.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> We consider inference for a collection of partially observed stochastic interacting nonlinear dynamic processes. Each process is identified with a label, called its unit. Here, our primary motivation arises in biological metapopulation systems, in which a unit corresponds to a spatially distinct sub-population. Metapopulation systems are characterized by strong dependence over time within a single unit, and relatively weak interactions between units. These properties make block particle filters effective for simulation-based likelihood evaluation. Iterated filtering algorithms can facilitate likelihood maximization for simulation-based filters. Here, we introduce an iterated block particle filter that can be applied when parameters are unit-specific or shared between units. We demonstrate the proposed algorithm by performing inference on a coupled epidemiological model describing spatiotemporal measles case report data for 20 towns. <p>Key words and phrases: Data assimilation, inverse problem, state space model, stochastic gradient Markov chain Monte Carlo, uncertainty quantification.</span>
/statistica/J34N21/J34N2105/J34N2105.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> Many epidemic models are naturally defined as individual-based models, in which we track the state of each individual within a susceptible population. However, inference for individual-based models is challenging because of the high-dimensional state space of such models, which grows exponentially with the population size. Here, we consider sequential Monte Carlo algorithms for inference for individual-based epidemic models, where we make direct observations of the state of a sample of individuals. Standard implementations, such as the bootstrap filter and auxiliary particle filter, are inefficient, owing to a mismatch between the proposal distribution of the state and future observations. We develop new efficient proposal distributions that consider future observations, leveraging the following properties: (i) we can analytically calculate the optimal proposal distribution for a single individual, given future observations and the future infection rate of that individual; and (ii) the dynamics of individuals are independent if we condition on their infection rates. Thus, we construct estimates of the future infection rate for each individual, and then use an independent proposal for the state of each individual, given this estimate. Empirical results show orders of magnitude improvement in efficiency of the sequential Monte Carlo sampler for both SIS and SEIR models. <p>Key words and phrases: Individual-based model, proposal distribution, sequential Monte Carlo.</span>
/statistica/J34N21/J34N2106/J34N2106.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> Monte Carlo sample paths of a dynamic system are useful for studying the underlying system and making statistical inferences related to the system. In many applications, the dynamic system being studied requires various types of constraints or observable features. In this study, we use a sequential Monte Carlo framework to investigate efficient methods for generating sample paths (with importance weights) from dynamic systems with rare and strong constraints. Specifically, we present a general formulation of the constrained sampling problem. Under such a formulation, we propose a flexible resampling strategy based on a potentially time-varying lookahead timescale, and identify the corresponding optimal resampling priority scores based on an ensemble of forward or backward pilots. Several examples illustrate the performance of the proposed methods. <p>Key words and phrases: Constrained sampling, pilot, priority score, resampling, sequential Monte Carlo.</span>
/statistica/J34N21/J34N2107/J34N2107.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> We develop a (nearly) unbiased particle filtering algorithm for a specific class of continuous-time state-space models in which (a) the latent process X<sub>t</sub> is a linear Gaussian diffusion, and (b) the observations arise from a Poisson process with intensity λ(X<sub>t</sub>). The likelihood and the posterior probability density function of the latent process include an intractable path integral. Our algorithm relies on using Poisson estimates to approximate this integral in an unbiased manner. We show how to tune these Poisson estimates to ensure that, with large probability, all but a few of the estimates generated by the algorithm are positive. Then setting the negative estimates to zero leads to a much smaller bias than that obtained using discretization. We quantify the probability of negative estimates for certain special cases, and show that our particle filter is effectively unbiased. We apply our method to a challenging 3D single molecule tracking example using a Born-Wolf observation model. <p>Key words and phrases: Continuous-time, Cox process, diffusions, hidden Markov model, particle filter, path integral, Poisson estimate, sequential Monte Carlo.</span>
/statistica/J34N21/J34N2108/J34N2108.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> Particle filters, also known as sequential Monte Carlo, are a powerful computational tool for making inference with dynamical systems. In particular, they are widely used in state space models to estimate the likelihood function. However, estimating the gradient of the likelihood function is hard with sequential Monte Carlo, partly because the commonly used reparametrization trick is not applicable due to the discrete nature of the resampling step. To address this problem, we propose utilizing the smoothly jittered particle filter, which smooths the discrete resampling by adding noise to the resampled particles. We show that when the noise level is chosen correctly, no additional asymptotic error is introduced to the resampling step. We support our method with simulations. <p>Key words and phrases: Reparametrization trick, resampling, sequential Monte Carlo, state space models.</span>
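The idea of smoothing a discrete resampling step by adding noise to the resampled particles can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes scalar particles, plain multinomial resampling, and Gaussian jitter with a fixed standard deviation. The function name `jittered_resample` and the parameter `noise_sd` are hypothetical and introduced here for illustration only; the abstract does not state how the noise level is chosen.

```python
import random


def jittered_resample(particles, weights, noise_sd, rng):
    """Multinomial resampling followed by Gaussian jitter (illustrative only)."""
    n = len(particles)
    # Discrete step: draw n ancestors with probability proportional to weight.
    ancestors = rng.choices(particles, weights=weights, k=n)
    # Smoothing step: perturb each resampled particle with N(0, noise_sd^2) noise.
    # noise_sd is an assumed tuning parameter, not a value from the paper.
    return [x + rng.gauss(0.0, noise_sd) for x in ancestors]


rng = random.Random(0)
particles = [-1.0, 0.0, 2.0]
weights = [0.2, 0.5, 0.3]
smoothed = jittered_resample(particles, weights, noise_sd=0.1, rng=rng)
print(smoothed)
```

With `noise_sd` set to zero the function reduces to ordinary multinomial resampling, which makes it easy to see that the jitter is the only ingredient that removes the exact duplicates produced by the discrete step.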
/statistica/J34N21/J34N2109/J34N2109.html
LARGE SAMPLE PROPERTIES OF MATCHING FOR BALANCE Yixin Wang and José R. Zubizarreta 1789-1808<span style='font-size=12pt;'><center>Abstract</center> Sequential Monte Carlo (SMC) methods are widely used to draw samples from intractable target distributions. Weight degeneracy can hinder the use of SMC when the target distribution is highly constrained. As a motivating application, we consider the problem of sampling protein structures from the Boltzmann distribution. This paper proposes a general SMC method that propagates multiple descendants for each particle, followed by resampling to maintain the desired number of particles. A simulation study demonstrates the efficacy of the method for tackling the protein sampling problem, compared to existing SMC methods. As a real data example, we estimate the number of atomic contacts for a key segment of the SARS-CoV-2 viral spike protein. <p>Key words and phrases: Monte Carlo methods, particle filter, protein structure analysis, SARS-CoV-2.</span>