Impact Statement
Stochastic models describing time-evolving deterioration processes are widespread in engineering. In the modern data-rich engineering landscape, Bayesian methods can exploit monitoring data to sequentially update knowledge on underlying model parameters. The precise probabilistic characterization of these parameters is indispensable for several real-world tasks, where decisions need to be taken in view of the evaluated margins of risk and uncertainty. This work investigates and compares on-line and off-line Bayesian filters and adapts the former for posterior uncertainty quantification of time-invariant parameters. We show that tailored on-line particle filters are competitive alternatives to off-line Bayesian filters, especially in certain high-dimensional settings.
1. Introduction
Structural deterioration of various forms is present in most mechanical and civil structures and infrastructure systems. Accurate tracking of structural deterioration processes can help to manage them effectively and to minimize the total life-cycle costs (Cadini et al., 2009a; Frangopol, 2011; Kim et al., 2017; Kamariotis et al., 2022). The deployment of sensors on structural components/systems can enable long-term monitoring of such processes. Monitoring data obtained sequentially at different points in time must be utilized efficiently within a Bayesian framework to enable data-informed estimation and prediction of the deterioration process evolution.
Monitored structural deterioration processes are commonly modeled using Markovian state-space representations (Myötyri et al., 2006; Cadini et al., 2009b; Baraldi et al., 2013), whereby the deterioration state evolution is represented by a recursive Markov process equation, and is subject to stochastic process noise (Corbetta et al., 2018). Monitoring information is incorporated by means of the measurement equation. The deterioration models further contain time-invariant uncertain parameters. The state-space can be augmented to include these parameters, if one wishes to obtain updated estimates thereof conditional on the monitoring information (Saha et al., 2009; Straub, 2009; Sun et al., 2014; Corbetta et al., 2018; Yi and Song, 2018; Cristiani et al., 2021; Kamariotis et al., 2023); this is referred to as joint state-parameter estimation (Särkkä, 2013; Kantas et al., 2015).
A different approach entails defining the structural deterioration state evolution solely as a function of uncertain time-invariant model parameters (Ditlevsen and Madsen, 1996; Vu and Stewart, 2000; Elingwood, 2005; Stewart and Mullard, 2007), which can be updated in view of the monitoring data. This updating, referred to herein as Bayesian parameter estimation, is often the primary task of interest. In this approach, the deterioration state variables are obtained as outputs of the calibrated deterioration model with posterior parameter estimates (Kennedy and O’Hagan, 2001; Ramancha et al., 2022). The parameter estimation problem can be cast into a Markovian state-space representation. Quantifying the full posterior uncertainty of the time-invariant model parameters is essential for performing monitoring-informed predictions on the deterioration process evolution, the subsequent monitoring-informed estimation of the time-variant structural reliability (Melchers and Beck, 2017; Straub et al., 2020) or the remaining useful life (Sun et al., 2014; Kim et al., 2017), and eventually for predictive maintenance planning.
Bayesian estimation of time-invariant deterioration model parameters is the main focus of this article. In long-term deterioration monitoring settings, where data is obtained sequentially at different points in time, Bayesian inference can be performed either in an on-line or an off-line framework (Storvik, 2002; Kantas et al., 2015; Azam et al., 2017). In the literature, these are also referred to as recursive (on-line) and batch (off-line) estimation (Särkkä, 2013). Parameter estimation is cast into a state-space setup to render it suitable for application with on-line Bayesian filtering algorithms (Kantas et al., 2015), such as the Kalman filter (Kalman, 1960) and its nonlinear variants (Jazwinski, 1970; Julier and Uhlmann, 1997; Daum, 2005; Song et al., 2020), the ensemble Kalman filter (Evensen, 2006), and particle filters (Doucet et al., 2001; Chopin, 2004; Doucet and Johansen, 2008; Hu et al., 2008; Särkkä, 2013; Tatsis et al., 2022). We employ on-line particle filter methods for pure recursive estimation of time-invariant deterioration model parameters. This is not the typical use case for such methods, which in this setting often yield degenerate and impoverished posterior estimates and hence fail to effectively characterize the posterior uncertainty (Del Moral et al., 2006; Särkkä, 2013). In this work, we tailor on-line particle filters for quantifying the full posterior uncertainty of time-invariant model parameters. Subsequently, we provide a formal investigation and discussion on their suitability with respect to this task.
In its most typical setting within engineering applications, Bayesian parameter estimation is performed with off-line Markov chain Monte Carlo (MCMC) methods, which have been used extensively in statistics and engineering to sample from complex posterior distributions of model parameters (Hastings, 1970; Gilks et al., 1995; Beck and Au, 2002; Haario et al., 2006; Ching and Chen, 2007; Papaioannou et al., 2015; Wu et al., 2017; Lye et al., 2021). However, use of off-line methods for on-line estimation tasks is computationally prohibitive (Del Moral et al., 2006; Kantas et al., 2015). Additionally, even for off-line inference in settings where measurements are obtained sequentially at different points in time, off-line MCMC methods tend to induce a larger computational cost than on-line particle filter methods, which can be important, for example, when optimizing inspection and monitoring (Papakonstantinou and Shinozuka, 2014; Luque and Straub, 2019; Kamariotis et al., 2023). Questions that we investigate in this context include: Can one precisely quantify the posterior uncertainty of time-invariant parameters when employing on-line particle filter methods? How does this quantification compare against the posterior estimates obtained with off-line MCMC methods? How does the estimation accuracy depend on the nature of the problem, that is, dimensionality, nonlinearity, or non-Gaussianity? What is the computational cost induced by the different methods? Ideally, one would opt for the method that provides sufficiently accurate posterior results at the least computational cost. To address these questions, this article selects and adapts algorithms in view of parameter estimation, and performs a comparative assessment of selected off-line and on-line Bayesian filters specifically tailored for posterior uncertainty quantification of time-invariant parameters. This comparative assessment results in a set of suggestions on the choice of the appropriate algorithm as a function of the problem characteristics.
The article is structured as follows. Section 2 provides a detailed description of on-line and off-line Bayesian inference in the context of parameter estimation. Three different selected and adapted Bayesian filters are presented in algorithmic detail, namely an on-line particle filter with Gaussian mixture-based resampling (PFGM) (van der Merwe and Wan, 2003; McLachlan and Krishnan, 2007), the on-line iterated batch importance sampling filter (IBIS) (Chopin, 2002), which performs off-line MCMC steps with a Gaussian mixture as a proposal distribution, and an off-line MCMC-based sequential Monte Carlo (SMC) filter (Del Moral et al., 2006), which enforces tempering of the likelihood function (known as simulated annealing) to sequentially arrive at the single final posterior density of interest (Neal, 2001; Jasra et al., 2011). The tPFGM and tIBIS variants, which adapt the PFGM and IBIS filters by employing tempering of the likelihood function of each new measurement, are further presented and proposed for problems in which individual measurements are highly informative. Section 3 describes the two case studies that serve as the basis for numerical investigations, one nonlinear, non-Gaussian, and low-dimensional, and one linear, Gaussian, and high-dimensional. MATLAB codes implementing the different algorithms and applying them on the two case studies introduced in this article are made publicly available via a GitHub repository. Section 4 summarizes the findings of this comparative assessment, provides suggestions on the choice of the appropriate method according to the nature of the problem, discusses cases which are not treated in our investigations, and concludes this work.
2. On-Line and Off-Line Bayesian Filtering for Time-Invariant Parameter Estimation
This work assumes the availability of a stochastic deterioration model $ D $ , parametrized by a vector $ \boldsymbol{\theta} \in {\mathbb{R}}^d $ that contains the $ d $ uncertain time-invariant model parameters influencing the deterioration process. In the Bayesian framework, $ \boldsymbol{\theta} $ is modeled as a vector of random variables with a prior distribution $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ . We assume that the deterioration process is monitored via a permanently installed monitoring system. Long-term monitoring of a deterioration process leads to sets of noisy measurements $ \left\{{y}_1,\dots, {y}_n\right\} $ obtained sequentially at different points in time $ \left\{{t}_1,\dots, {t}_n\right\} $ throughout the lifetime of a structural component/system. Such measurements can be used to update the distribution of $ \boldsymbol{\theta} $ ; this task is referred to as Bayesian parameter estimation. Within a deterioration monitoring setting, Bayesian parameter estimation can be performed either in an on-line or an off-line framework (Kantas et al., 2015), depending on the task of interest.
In an on-line framework, one is interested in updating the distribution of $ \boldsymbol{\theta} $ in a sequential manner, that is, at every time step $ {t}_n $ when a new measurement $ {y}_n $ becomes available, conditional on all measurements available up to $ {t}_n $ . Thus, in an on-line framework, inference of the sequence of posterior densities $ {\left\{{\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right)\right\}}_{n\ge 1} $ is the goal, where $ {\mathbf{y}}_{1:n} $ denotes the components $ \left\{{y}_1,\dots, {y}_n\right\} $ . We point out that in this article the term on-line does not relate to “real-time” estimation, although on-line algorithms are also used in real-time estimation (Chatzi and Smyth, 2009; Russell and Norvig, 2021).
In contrast, in an off-line framework, inference of $ \boldsymbol{\theta} $ is performed at a fixed time step $ {t}_N $ using a fixed set of measurements $ \left\{{y}_1,\dots, {y}_N\right\} $ , and the single posterior density $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ is sought, which can be estimated via Bayes’ rule as:

$ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right)=\frac{L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right)\;{\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right)}{\int L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right)\;{\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right)\;d\boldsymbol{\theta}},\hskip2em (1) $
where $ L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right) $ denotes the likelihood function of the whole measurement set $ {\mathbf{y}}_{1:N} $ given the parameters $ \boldsymbol{\theta} $ . With the assumption that measurements are independent given the parameter state, $ L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right) $ can be expressed as a product of the likelihoods $ L\left({y}_n|\boldsymbol{\theta} \right) $ as:

$ L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right)={\prod}_{n=1}^N L\left({y}_n|\boldsymbol{\theta} \right).\hskip2em (2) $
MCMC methods sample from $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ via simulation of a Markov chain with $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ as its stationary distribution, for example, by performing Metropolis–Hastings (MH) steps (Hastings, 1970). MCMC methods do not require estimation of the normalization constant in equation (1). However, in the on-line framework, MCMC methods are impractical, since they require simulating anew a different Markov chain for each new posterior $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ , and the previously generated Markov chain for the posterior estimation of $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ is not accounted for, except when choosing the seed for initializing the new Markov chain. This implies that MCMC methods quickly become computationally prohibitive in the on-line framework, already for a small $ n $ . An additional computational burden stems from the fact that each step within the MCMC sampling process requires evaluation of the full likelihood function $ L\left({\mathbf{y}}_{1:n}|\boldsymbol{\theta} \right) $ , that is, the whole set of measurements $ {\mathbf{y}}_{1:n} $ needs to be processed. This leads to increasing computational complexity for increasing $ n $ , and can render use of MCMC methods computationally inefficient even for off-line inference, especially when $ N $ is large.
On-line particle filters (Särkkä, 2013; Kantas et al., 2015) operate in a sequential fashion by making use of the Markovian property of the employed state-space representation, that is, they compute $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ solely based on $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ and the new measurement $ {y}_n $ . The typical use of particle filters targets the tracking of a system’s response (dynamic state) by means of a state-space representation (Gordon et al., 1993; Särkkä, 2013), while they are often also used for joint state-parameter estimation tasks, wherein the state-space is augmented to include the model parameters to be estimated (Särkkä, 2013; Kantas et al., 2015). In addition, particle filters can also be applied for pure recursive estimation of time-invariant parameters, for which the noise in the dynamic model is formally zero (Del Moral et al., 2006; Särkkä, 2013), although this is not the typical setting for application of particle filters. A model of the Markovian discrete-time state-space representation for the case of time-invariant parameter estimation is given in equations (3a) and (3b):

$ {\boldsymbol{\theta}}_n={\boldsymbol{\theta}}_{n-1},\hskip2em (3a) $

$ {y}_n=D\left({t}_n,{\boldsymbol{\theta}}_n\right)\cdot \exp \left({\varepsilon}_n\right),\hskip2em (3b) $
where $ {\varepsilon}_n $ models the error/noise of the measurement at time $ {t}_n $ , and $ {\boldsymbol{\theta}}_n $ denotes the time-invariant parameter vector at time step $ n $ . The dynamic equation for the time-invariant parameters, equation (3a), is introduced for the sole purpose of casting the problem into a state-space representation. Since the measurements are assumed independent given the parameter state, the errors $ {\varepsilon}_n $ in equation (3b) are independent. It should be noted that the measurement error, which is introduced in multiplicative form in equation (3b), is commonly expressed in an additive form (Corbetta et al., 2018). Indeed, equation (3b) can be reformulated in the logarithmic scale, whereby the measurement error is expressed in an additive form. The multiplicative form of the measurement error in equation (3b) is consistent with the fact that—in the context of structural deterioration—measurements $ {y}_n $ cannot be negative. All target distributions of interest in the sequence $ {\pi}_{\mathrm{pos}}\left({\boldsymbol{\theta}}_n|{\mathbf{y}}_{1:n}\right) $ are defined on the same space of $ \boldsymbol{\theta} \in {\mathbb{R}}^d $ . In the remainder of this article, the subscript $ n $ will therefore be dropped from $ {\boldsymbol{\theta}}_n $ . As previously discussed, particle filters are mainly used for on-line inference. However, these can also be used in exactly the same way for off-line inference, where only a single posterior density $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ is of interest. In this case, particle filters process the measurements one at a time and arrive at the final single posterior density of interest via estimating all the intermediate distributions.
2.1. On-line particle filter
Particle filter (PF) methods, also referred to as sequential Monte Carlo (SMC) methods, are importance sampling-based techniques that use a set of weighted samples $ \left\{\left({\boldsymbol{\theta}}_n^{(i)},{w}_n^{(i)}\right):i=1,\dots, {N}_{\mathrm{par}}\right\} $ , called particles, to represent the posterior distribution of interest at estimation time step $ n $ , $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ . PFs form the following approximation to the posterior distribution of interest:

$ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right)\approx {\sum}_{i=1}^{N_{\mathrm{par}}}{w}_n^{(i)}\;\delta \left(\boldsymbol{\theta} -{\boldsymbol{\theta}}_n^{(i)}\right),\hskip2em (4) $
where $ \delta $ denotes the Dirac delta function.
When a new measurement $ {y}_n $ becomes available, PFs shift from $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ to $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ by importance sampling using an appropriately chosen importance distribution, which results in a reweighting procedure (updating of the weights). An important issue that arises from this weight-updating procedure is the sample degeneracy problem (Särkkä, 2013). This relates to the fact that the importance weights $ {w}_n^{(i)} $ become more unevenly distributed with each updating step. In most cases, after a certain number of updating steps, the weights of almost all the particles assume values close to zero (see Figure 1). This problem is alleviated by the use of adaptive resampling procedures based on the effective sample size $ {N}_{\mathrm{eff}}=1/{\sum}_{i=1}^{N_{\mathrm{par}}}{\left({w}_n^{(i)}\right)}^2 $ (Liu and Chen, 1998). Most commonly, resampling is performed with replacement according to the particle weights whenever $ {N}_{\mathrm{eff}} $ drops below a user-defined threshold $ {N}_{\mathrm{T}}={cN}_{\mathrm{par}},c\in \left[0,1\right] $ . Resampling introduces additional variance to the parameter estimates (Särkkä, 2013). In the version of the PF algorithm presented in Algorithm 1, the dynamic model of equation (3) is used as the importance distribution, as originally proposed in the bootstrap filter by Gordon et al. (1993). Theoretical analysis of PF algorithms can be found in numerous seminal sources (e.g., Chopin, 2004; Del Moral et al., 2006; Hu et al., 2008).
Algorithm 1 Particle Filter (PF)
1: generate $ {N}_{\mathrm{par}} $ initial particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ i=1,\dots, {N}_{\mathrm{par}} $
2: assign initial weights $ {w}_0^{(i)}=1/{N}_{\mathrm{par}} $ , $ i=1,\dots, {N}_{\mathrm{par}} $
3: for $ n=1,\dots, N $ do
4: evaluate likelihood of the particles based on new measurement $ {y}_n $ , $ {L}_n^{(i)}=L\left({y}_n|{\boldsymbol{\theta}}^{(i)}\right) $
5: update particle weights $ {w}_n^{(i)}\propto {L}_n^{(i)}\cdot {w}_{n-1}^{(i)} $ and normalize $ \mathrm{s}.\mathrm{t}.\hskip1em {\sum}_{i=1}^{N_{\mathrm{par}}}{w}_n^{(i)}=1 $
6: evaluate $ {N}_{\mathrm{eff}}=\frac{1}{\sum_{i=1}^{N_{\mathrm{par}}}{\left({w}_n^{(i)}\right)}^2} $
7: if $ {N}_{\mathrm{eff}}<{N}_{\mathrm{T}} $ then
8: resample particles $ {\boldsymbol{\theta}}^{(i)} $ with replacement according to $ {w}_n^{(i)} $
9: reset particle weights to $ {w}_n^{(i)}=1/{N}_{\mathrm{par}} $
10: end if
11: end for
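For illustration, a minimal Python sketch of Algorithm 1 follows; it is not the MATLAB implementation released with this article, and the prior sampler `sample_prior`, the per-measurement log-likelihood `log_lik`, and the measurement sequence are placeholders to be supplied by the user.

```python
import numpy as np

def particle_filter(sample_prior, log_lik, measurements, n_par=5000, c=0.5, rng=None):
    """Particle filter for time-invariant parameters (Algorithm 1).

    sample_prior(n, rng) -> (n, d) array of prior samples
    log_lik(y, theta)    -> (n,) array of log L(y | theta_i) for theta of shape (n, d)
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = sample_prior(n_par, rng)                  # initial particles from the prior
    log_w = np.full(n_par, -np.log(n_par))            # equal initial weights (log scale)
    history = []
    for y in measurements:
        log_w += log_lik(y, theta)                    # reweight with the new measurement
        log_w -= np.logaddexp.reduce(log_w)           # normalize in the log scale
        w = np.exp(log_w)
        if 1.0 / np.sum(w ** 2) < c * n_par:          # N_eff below the threshold N_T
            idx = rng.choice(n_par, size=n_par, p=w)  # multinomial resampling with replacement
            theta = theta[idx]
            log_w = np.full(n_par, -np.log(n_par))
            w = np.full(n_par, 1.0 / n_par)
        history.append((theta.copy(), w.copy()))
    return history
```

Because the dynamic equation for time-invariant parameters is the identity, no prediction step appears in the loop; the particles are only reweighted and occasionally resampled, which is what gives rise to the impoverishment issue discussed next.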
When using PFs to estimate time-invariant parameters, for which the process noise in the dynamic equation is zero, one runs into the issue of sample impoverishment (Särkkä, 2013). The origin of this issue is the resampling process. More specifically, after a few resampling steps, most (or in extreme cases all) of the particles in the sample set end up assuming the exact same value, that is, the particle set consists of only a few (or even a single) distinct particles (see Figure 1). The sample impoverishment issue poses the greatest obstacle for time-invariant parameter estimation with PFs. A multitude of techniques has been suggested in the literature to alleviate the sample impoverishment issue in joint state-parameter estimation setups (see, e.g., Gilks and Berzuini, 2001; Liu and West, 2001; Musso et al., 2001; Storvik, 2002; Andrieu et al., 2004; Andrieu et al., 2010; Carvalho et al., 2010; Chatzi and Smyth, 2013; Chopin et al., 2013). Fewer works have proposed solutions for resolving this issue in parameter estimation setups (see, e.g., Chopin, 2002; Del Moral et al., 2006). One of the simplest and most commonly used approaches is to introduce artificial dynamics in the dynamic model of the parameter vector, that is, the dynamic model $ {\boldsymbol{\theta}}_n={\boldsymbol{\theta}}_{n-1}+{\boldsymbol{\varepsilon}}_{n-1} $ is employed, where $ {\boldsymbol{\varepsilon}}_{n-1} $ is a small artificial process noise (Kitagawa, 1998). In this way, the time-invariant parameter vector is transformed into a time-variant one; the parameter estimation problem therefore deviates from the original one (Särkkä, 2013; Kantas et al., 2015). This approach can introduce a bias and an artificial variance inflation in the estimates (Kantas et al., 2015). For these reasons, this approach is not considered in this article.
To resolve the sample impoverishment issue encountered when using the PF Algorithm 1 for parameter estimation, this work employs the particle filter with Gaussian mixture resampling (PFGM), described in Algorithm 2. The PFGM algorithm relates to preexisting concepts (van der Merwe and Wan, 2003; Veettil and Chakravorty, 2016), and is here specifically tailored for the parameter estimation task, with its main goal being, in contrast to previous works, the quantification of the full posterior parameter uncertainty. A comparison between Algorithms 1 and 2 shows that the only difference lies in the way that the resampling step is performed. PFGM replaces the standard resampling process of PF by first approximating the posterior distribution at estimation step $ n $ by a Gaussian mixture model (GMM), which is fitted via the Expectation–Maximization (EM) algorithm (McLachlan and Krishnan, 2007; Chen et al., 2010) on the weighted particle set. The degenerating particle set is then rejuvenated by sampling $ {N}_{\mathrm{par}} $ new particles from the GMM of equation (5),

$ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right)={\sum}_{i=1}^{N_{\mathrm{GM}}}{\phi}_i\;\mathcal{N}\left(\boldsymbol{\theta}; {\boldsymbol{\mu}}_i,{\boldsymbol{\Sigma}}_i\right),\hskip2em (5) $
where $ {\phi}_i $ represents the weight of the Gaussian component $ i $ , while $ {\boldsymbol{\mu}}_i $ and $ {\boldsymbol{\Sigma}}_i $ are the respective mean vector and covariance matrix. The number of Gaussians in the mixture, $ {N}_{\mathrm{GM}} $ , has to be chosen in advance, or can be estimated by use of appropriate algorithms (Schubert et al., 2017; Celeux et al., 2019; Geyer et al., 2019). In the numerical investigations of Section 3, we set $ {N}_{\mathrm{GM}} $ = 8. We point out that the efficacy of PFGM strongly depends on the quality of the GMM posterior approximation. The reason for applying a GMM (and not a single Gaussian) is that the posterior distribution can deviate from the normal distribution, and can even be multimodal or heavy-tailed.
Algorithm 2 Particle Filter with Gaussian mixture resampling (PFGM)
1: generate $ {N}_{\mathrm{par}} $ initial particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ i=1,\dots, {N}_{\mathrm{par}} $
2: assign initial weights $ {w}_0^{(i)}=1/{N}_{\mathrm{par}} $ , $ i=1,\dots, {N}_{\mathrm{par}} $
3: for $ n=1,\dots, N $ do
4: evaluate likelihood of the particles based on new measurement $ {y}_n $ , $ {L}_n^{(i)}=L\left({y}_n|{\boldsymbol{\theta}}^{(i)}\right) $
5: update particle weights $ {w}_n^{(i)}\propto {L}_n^{(i)}\cdot {w}_{n-1}^{(i)} $ and normalize $ \mathrm{s}.\mathrm{t}.{\sum}_{i=1}^{N_{\mathrm{par}}}{w}_n^{(i)}=1 $
6: evaluate $ {N}_{\mathrm{eff}}=\frac{1}{\sum_{i=1}^{N_{\mathrm{par}}}{\left({w}_n^{(i)}\right)}^2} $
7: if $ {N}_{\mathrm{eff}}<{N}_{\mathrm{T}} $ then
8: EM: fit a Gaussian mixture proposal distribution $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $ according to $ \left\{{\boldsymbol{\theta}}^{(i)},{w}_n^{(i)}\right\} $
9: sample $ {N}_{\mathrm{par}} $ new particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
10: reset particle weights to $ {w}_n^{(i)}=1/{N}_{\mathrm{par}} $
11: end if
12: end for
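The Gaussian mixture resampling step (lines 8–10 of Algorithm 2) can be sketched in Python as follows, using scikit-learn's `GaussianMixture` as a stand-in for the EM fit with $ {N}_{\mathrm{GM}} $ = 8 components. Since this EM routine does not accept sample weights, the weighted particle set is first converted into an equally weighted one by multinomial resampling before the fit; this is an assumption of the sketch, not part of Algorithm 2, which fits the GMM directly on the weighted particles.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gm_resample(theta, w, n_gm=8, rng=None):
    """Rejuvenate a degenerating particle set: fit a GMM and sample N_par new particles.

    theta : (n_par, d) particles;  w : (n_par,) normalized weights.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_par = theta.shape[0]
    # sketch workaround: equally weighted representation of the weighted particle set
    gmm = GaussianMixture(n_components=n_gm, covariance_type="full", reg_covar=1e-6)
    gmm.fit(theta[rng.choice(n_par, size=n_par, p=w)])
    theta_new, _ = gmm.sample(n_par)              # N_par new particles from g_GM
    return theta_new, np.full(n_par, 1.0 / n_par) # weights reset to 1/N_par
```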
The simple reweighting procedure used in the on-line PFs is based on the premise that $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ and $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ are likely to be similar, that is, that the new measurement $ {y}_n $ will not cause a very large change in the posterior. However, when that is not the case, this simple reweighting procedure is bound to perform poorly, leading to very fast degeneration of the particle set. In cases where already the first measurement set $ {y}_1 $ is strongly informative relative to the prior, the PF strongly degenerates already in the first weight-updating step (e.g., we observe this in the second case study of Section 3.2 in the case of 10 sensors). To counteract this issue, in this article, we incorporate the idea of simulated annealing (enforcing tempering of the likelihood function) (Neal, 2001) when needed within the on-line PFGM algorithm, which we term tPFGM (Algorithm 3). The tPFGM algorithm draws inspiration from previous works (Deutscher et al., 2000; Gall et al., 2007), but is here tailored for the parameter estimation task, opting for the quantification of the full posterior parameter uncertainty. The algorithm operates as follows: At estimation time step $ n $ , before performing the reweighting operation, the algorithm first checks the updated effective sample size for indication of sample degeneracy. If no degeneracy is detected, tPFGM operates exactly like PFGM. When sample degeneracy occurs, tPFGM employs adaptive tempering of the likelihood $ L\left({y}_n|\boldsymbol{\theta} \right) $ of the new measurement $ {y}_n $ in order to “sequentially” sample from $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ to $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ by visiting a sequence of artificial intermediate posteriors, as defined by the tempered likelihood function $ {L}^q\left({y}_n|\boldsymbol{\theta} \right) $ . The tempering factor $ q $ takes values between 0 and 1. When $ q=0 $ , the new measurement $ {y}_n $ is neglected, while $ q=1 $ entails considering the whole likelihood function of $ {y}_n $ , thus reaching $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ . The intermediate values of $ q $ are adaptively selected via solution of the optimization problem in line 11 of Algorithm 3, which ensures that the effective sample size does not drop below the threshold $ {N}_{\mathrm{T}} $ for the chosen $ q $ value. Naturally, use of tPFGM can trigger more resampling events than PFGM, as resampling can occur more than once within a time step $ n $ .
Algorithm 3 Particle Filter with Gaussian mixture resampling and likelihood tempering (tPFGM)
1: generate $ {N}_{\mathrm{par}} $ initial particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ i=1,\dots, {N}_{\mathrm{par}} $
2: assign initial weights $ {w}_0^{(i)}=1/{N}_{\mathrm{par}} $ , $ i=1,\dots, {N}_{\mathrm{par}} $
3: for $ n=1,\dots, N $ do
4: evaluate likelihood of the particles based on new measurement $ {y}_n $ , $ {L}_n^{(i)}=L\left({y}_n|{\boldsymbol{\theta}}^{(i)}\right) $
5: set $ q=0 $ and create auxiliary particle weights $ {w}_a^{(i)}={w}_{n-1}^{(i)} $
6: while $ q\ne 1 $ do
7: if $ {N}_{\mathrm{eff}}={\left({\sum}_{i=1}^{N_{\mathrm{par}}}{w}_a^{(i)}\cdot {L_n^{(i)}}^{1-q}\right)}^2/{\sum}_{i=1}^{N_{\mathrm{par}}}{\left({w}_a^{(i)}\cdot {L_n^{(i)}}^{1-q}\right)}^2>{N}_{\mathrm{T}} $ then
8: update auxiliary particle weights $ {w}_a^{(i)}\propto {w}_a^{(i)}\cdot {L_n^{(i)}}^{1-q} $ and normalize $ \mathrm{s}.\mathrm{t}.\hskip1em {\sum}_{i=1}^{N_{\mathrm{par}}}{w}_a^{(i)}=1 $
9: set $ q=1 $
10: else
11: solve $ {\left({\sum}_{i=1}^{N_{\mathrm{par}}}{w}_a^{(i)}\cdot {L_n^{(i)}}^{dq}\right)}^2/{\sum}_{i=1}^{N_{\mathrm{par}}}{\left({w}_a^{(i)}\cdot {L_n^{(i)}}^{dq}\right)}^2-{N}_{\mathrm{T}}=0 $ for $ dq $
12: set $ {q}_{\mathrm{new}}=\min \left[q+ dq,1\right] $
13: set $ dq={q}_{\mathrm{new}}-q $ and $ q={q}_{\mathrm{new}} $
14: update auxiliary particle weights $ {w}_a^{(i)}\propto {w}_a^{(i)}\cdot {L_n^{(i)}}^{dq} $ and normalize $ \mathrm{s}.\mathrm{t}.\hskip1em {\sum}_{i=1}^{N_{\mathrm{par}}}{w}_a^{(i)}=1 $
15: EM: fit a Gaussian mixture proposal distribution $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $ according to $ \left\{{\boldsymbol{\theta}}^{(i)},{w}_a^{(i)}\right\} $
16: sample $ {N}_{\mathrm{par}} $ new particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
17: reset auxiliary particle weights to $ {w}_a^{(i)}=1/{N}_{\mathrm{par}} $
18: end if
19: end while
20: set $ {w}_n^{(i)}={w}_a^{(i)} $
21: end for
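The adaptive choice of the tempering increment (lines 7 and 11 of Algorithm 3) reduces to a one-dimensional root-finding problem in $ dq $ . A Python sketch using `scipy.optimize.brentq` is shown below; it assumes that the log-likelihood values of the new measurement and the logarithm of the auxiliary weights are available, and that the effective sample size of the current auxiliary weights exceeds $ {N}_{\mathrm{T}} $ .

```python
import numpy as np
from scipy.optimize import brentq

def tempered_ess(dq, log_w_a, log_lik_new):
    """ESS of the auxiliary weights after raising the new likelihood to the power dq."""
    log_w = log_w_a + dq * log_lik_new
    log_w -= np.logaddexp.reduce(log_w)               # normalize in the log scale
    return np.exp(-np.logaddexp.reduce(2.0 * log_w))  # 1 / sum(w^2)

def next_temper_increment(q, log_w_a, log_lik_new, n_threshold):
    """Largest admissible increment dq in (0, 1-q] keeping the ESS above N_T (line 11)."""
    if tempered_ess(1.0 - q, log_w_a, log_lik_new) >= n_threshold:
        return 1.0 - q                                # the full remaining exponent is admissible
    # the ESS decreases from about N_par at dq -> 0 to below N_T at dq = 1-q, so a root exists
    return brentq(lambda dq: tempered_ess(dq, log_w_a, log_lik_new) - n_threshold,
                  1e-12, 1.0 - q)
```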
The PFGM and tPFGM filters rely entirely on the posterior approximation via a GMM for sampling $ {N}_{\mathrm{par}} $ new particles during the resampling process. However, there is no guarantee that these new particles follow the true posterior distribution of interest. The IBIS filter of the following Section 2.2 aims at addressing this issue.
2.2. Iterated batch importance sampling
Implementing MCMC steps within PF methods to move the particles after a resampling step was originally proposed by Gilks and Berzuini (2001), in the so-called resample-move algorithm. Chopin (2002) introduced a special case of the resample-move algorithm, specifically tailored for application to static parameter estimation purposes, namely the iterated batch importance sampling (IBIS) filter. IBIS was originally introduced as an iterative method for solving off-line estimation tasks by incorporating the sequence of measurements one at a time. In doing this, the algorithm visits the sequence of intermediate posteriors within its process, and can therefore also be used to perform on-line estimation tasks. An on-line version of the IBIS filter is presented in Algorithm 5, used in conjunction with the MCMC routine of Algorithm 4.
The core idea of the IBIS filter is the following: At estimation step $ n $ , if sample degeneracy is identified, first the particles are resampled with replacement, and subsequently the resampled particles are moved with a Markov chain transition kernel whose stationary distribution is $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ . More specifically, each of the $ {N}_{\mathrm{par}} $ resampled particles is used as the seed to perform a single MCMC step. This approach is inherently different from standard applications of MCMC, where a transition kernel is applied multiple times on one particle.
A question that arises is how to choose the Markov chain transition kernel. Chopin (2002) argues for choosing a transition kernel that ensures that the proposed particle only weakly depends on the seed particle value. It is therefore recommended to use an independent Metropolis-Hastings (IMH) kernel, wherein the proposed particle is sampled from a proposal distribution $ g $ , which has to be as close as possible to the target distribution $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ . To obtain such a proposal distribution, along the lines of Section 2.1, in this work we employ a GMM approximation (see equation (5)) of the target distribution as the proposal density $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $ within the IMH kernel (Papaioannou et al., 2016; South et al., 2019). The IMH kernel with a GMM proposal distribution is denoted IMH-GM herein. The acceptance probability (line 6 of Algorithm 4) of the IMH-GM kernel is a function of both the initial seed particle and the GMM proposed particle. The acceptance rate can indicate how efficient the IMH-GM kernel is in performing the MCMC move step within the IBIS algorithm. It is important to note that when computing the acceptance probability, a call of the full likelihood function is invoked, which requires the whole set of measurements $ {\mathbf{y}}_{1:n} $ to be processed; this leads to a significant additional computational demand, which pure on-line methods are not supposed to incur (Doucet et al., 2001).
The performance of the IBIS sampler depends highly on the mixing properties of the IMH-GM kernel. If the kernel leads to slowly decreasing chain auto-correlation, the moved particles are bound to remain in regions close to the particles obtained by the resampling step. This can lead to an underrepresentation of the parameter space of the intermediate posterior distribution. It might therefore be beneficial to add a burn-in period within the IMH-GM kernel (Del Moral et al., 2006). Implementing that is straightforward and is shown in Algorithm 4, where $ {n}_{\mathrm{B}} $ is the user-defined number of burn-in steps. Naturally, the computational cost of the IMH-GM routine increases linearly with the number of burn-in steps.
Algorithm 4 Independent Metropolis Hastings with GM proposal (IMH-GM)
1: IMH-GM Input: $ \left\{{\boldsymbol{\theta}}^{(i)},{L}^{(i)}\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $ , $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ L\left({\mathbf{y}}_{1:n}|\boldsymbol{\theta} \right) $ and $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
2: for $ i=1,\dots, {N}_{\mathrm{par}} $ do
3: for $ j=1,\dots, {n}_{\mathrm{B}}+1 $ do
4: sample candidate particle $ {\boldsymbol{\theta}}_{c,j}^{(i)} $ from $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
5: evaluate $ {L}_{c,j}^{(i)}=L\left({\mathbf{y}}_{1:n}|{\boldsymbol{\theta}}_{c,j}^{(i)}\right) $ for candidate particle
6: evaluate acceptance ratio $ \alpha =\min \left[1,\frac{L_{c,j}^{(i)}\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}_{c,j}^{(i)}\right)\cdot {g}_{\mathrm{GM}}\left({\boldsymbol{\theta}}^{(i)}\right)}{L^{(i)}\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\cdot {g}_{\mathrm{GM}}\left({\boldsymbol{\theta}}_{c,j}^{(i)}\right)}\right] $
7: generate uniform random number $ u\in \left[0,1\right] $
8: if $ u<\alpha $ then
9: replace $ \{{\boldsymbol{\theta}}^{(i)},{L}^{(i)}\cdot {\pi}_{\mathrm{pr}}({\boldsymbol{\theta}}^{(i)})\} $ with $ \left\{{\boldsymbol{\theta}}_{c,j}^{(i)},{L}_{c,j}^{(i)}\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}_{c,j}^{(i)}\right)\right\} $
10: end if
11: end for
12: end for
13: IMH-GM Output: $ \left\{{\boldsymbol{\theta}}^{(i)},{L}^{(i)}\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $
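A Python sketch of the IMH-GM move (Algorithm 4) is given below. Here, `log_post` denotes the unnormalized log-target $ \log L\left({\mathbf{y}}_{1:n}|\boldsymbol{\theta} \right)+\log {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ and `gmm` a fitted mixture exposing `sample` and `score_samples` (log-density), as in scikit-learn; these interface choices are assumptions of the sketch, not of the released implementation.

```python
import numpy as np

def imh_gm_move(theta, log_post_vals, log_post, gmm, n_burn=0, rng=None):
    """Independent MH step(s) with a Gaussian mixture proposal for each seed particle.

    theta         : (n_par, d) seed particles (after resampling)
    log_post_vals : (n_par,) unnormalized log-posterior values of the seeds
    log_post(x)   : unnormalized log-posterior of a single (d,) point; every call
                    processes the whole measurement set y_1:n (the full-likelihood cost)
    """
    rng = np.random.default_rng() if rng is None else rng
    log_g_seed = gmm.score_samples(theta)              # log g_GM at the seeds
    for i in range(theta.shape[0]):
        for _ in range(n_burn + 1):
            cand = gmm.sample(1)[0][0]                 # candidate from the GM proposal
            log_post_cand = log_post(cand)
            log_g_cand = gmm.score_samples(cand[None, :])[0]
            # log acceptance ratio of the independent MH kernel (line 6 of Algorithm 4)
            log_alpha = (log_post_cand + log_g_seed[i]) - (log_post_vals[i] + log_g_cand)
            if np.log(rng.uniform()) < min(0.0, log_alpha):
                theta[i], log_post_vals[i], log_g_seed[i] = cand, log_post_cand, log_g_cand
    return theta, log_post_vals
```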
Algorithm 5 details the workings of the IMH-GM-based IBIS filter used in this work. In line 11 of this algorithm, the IMH-GM routine of Algorithm 4 is called, which implements the IMH-GM kernel for the MCMC move step. Comparing Algorithms 2 and 5, it is clear that both filters can be used for on-line inference within a single run, but the IBIS filter has a significantly larger computational cost, as will also be demonstrated in the numerical investigations of Section 3. In the same spirit as the proposed tPFGM Algorithm 3, which enforces simulated annealing (tempering of the likelihood function) in cases when $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n-1}\right) $ and $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:n}\right) $ are likely to be quite different, the same idea can also be implemented within the IBIS algorithm. This leads to what we refer to as the tIBIS algorithm in this article.
Algorithm 5 IMH-GM-based Iterated Batch Importance Sampling (IBIS)
1: generate $ {N}_{\mathrm{par}} $ initial particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ i=1,\dots, {N}_{\mathrm{par}} $
2: assign initial weights $ {w}_0^{(i)}=1/{N}_{\mathrm{par}} $ , $ i=1,\dots, {N}_{\mathrm{par}} $
3: for $ n=1,\dots, N $ do
4: evaluate likelihood of the particles based on new measurement $ {y}_n $ , $ {L}_n^{(i)}=L\left({y}_n|{\boldsymbol{\theta}}^{(i)}\right) $
5: evaluate the new target distribution, $ L\left({\mathbf{y}}_{1:n}|{\boldsymbol{\theta}}^{(i)}\right)\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)={L}_n^{(i)}\cdot L\left({\mathbf{y}}_{1:n-1}|{\boldsymbol{\theta}}^{(i)}\right)\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right) $
6: update particle weights $ {w}_n^{(i)}\propto {L}_n^{(i)}\cdot {w}_{n-1}^{(i)} $ and normalize $ \mathrm{s}.\mathrm{t}.\hskip1em {\sum}_{i=1}^{N_{\mathrm{par}}}{w}_n^{(i)}=1 $
7: evaluate $ {N}_{\mathrm{eff}}=\frac{1}{\sum_{i=1}^{N_{\mathrm{par}}}{\left({w}_n^{(i)}\right)}^2} $
8: if $ {N}_{\mathrm{eff}}<{N}_{\mathrm{T}} $ then
9: EM: fit a Gaussian mixture proposal distribution $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $ according to $ \left\{{\boldsymbol{\theta}}^{(i)},{w}_n^{(i)}\right\} $
10: resample $ {N}_{\mathrm{par}} $ new particles $ \left\{{\boldsymbol{\theta}}^{(i)},L\left({\mathbf{y}}_{1:n}|{\boldsymbol{\theta}}^{(i)}\right)\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $ with replacement according to $ {w}_n^{(i)} $
11: IMH-GM step with inputs $ \left\{{\boldsymbol{\theta}}^{(i)},L\left({\mathbf{y}}_{1:n}|{\boldsymbol{\theta}}^{(i)}\right)\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $ , $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ L\left({\mathbf{y}}_{1:n}|\boldsymbol{\theta} \right) $ , and $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
12: reset particle weights to $ {w}_n^{(i)}=1/{N}_{\mathrm{par}} $
13: end if
14: end for
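Putting the pieces together, the reweight-resample-move cycle of Algorithm 5 can be sketched in Python as follows; the sketch assumes the `imh_gm_move` routine given above is in scope, and `sample_prior`, `log_prior`, and `log_lik` are user-supplied placeholders.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def ibis(sample_prior, log_prior, log_lik, measurements,
         n_par=5000, c=0.5, n_gm=8, rng=None):
    """IMH-GM-based IBIS (Algorithm 5) for a sequence of scalar measurements."""
    rng = np.random.default_rng() if rng is None else rng
    theta = sample_prior(n_par, rng)
    log_w = np.full(n_par, -np.log(n_par))
    log_target = log_prior(theta)                      # log[L(y_1:n | theta) * prior], built recursively
    history = []
    for n, y in enumerate(measurements, start=1):
        log_l = log_lik(y, theta)
        log_target += log_l                            # line 5: recursive target update
        log_w += log_l                                 # line 6: reweighting
        log_w -= np.logaddexp.reduce(log_w)
        w = np.exp(log_w)
        if 1.0 / np.sum(w ** 2) < c * n_par:           # lines 8-12: resample and move
            gmm = GaussianMixture(n_components=n_gm, reg_covar=1e-6)
            gmm.fit(theta[rng.choice(n_par, size=n_par, p=w)])   # EM fit (equally weighted refit)
            idx = rng.choice(n_par, size=n_par, p=w)             # resampling with replacement
            theta, log_target = theta[idx].copy(), log_target[idx].copy()
            # MCMC move: the whole set y_1:n is processed for every candidate particle
            full_log_post = lambda x: (log_prior(x[None, :])[0] +
                                       sum(log_lik(yk, x[None, :])[0] for yk in measurements[:n]))
            theta, log_target = imh_gm_move(theta, log_target, full_log_post, gmm, rng=rng)
            log_w = np.full(n_par, -np.log(n_par))
            w = np.full(n_par, 1.0 / n_par)
        history.append((theta.copy(), w.copy()))
    return history
```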
2.3. Off-line sequential Monte Carlo sampler
In Section 4 of Del Moral et al. (2006), the authors presented a generic approach to convert an off-line MCMC sampler into a sequential Monte Carlo (SMC) sampler tailored for performing off-line estimation tasks, that is, for estimating the single posterior density of interest $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ . The off-line SMC sampler used in this work is presented in Algorithm 6 based on Del Moral et al. (2006) and Jasra et al. (2011). The key idea of this sampler is to adaptively construct the following artificial sequence of densities,

$ {\pi}_j\left(\boldsymbol{\theta} \right)\propto {L}^{q_j}\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right)\;{\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right),\hskip2em (6) $
where $ {q}_j $ is a tempering parameter that takes values between 0 and 1, in order to “sequentially” sample in a smooth manner from the prior to the final single posterior density of interest. Once $ {q}_j=1 $ , $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:N}\right) $ is reached. Similar to tPFGM, the intermediate values of $ {q}_j $ are adaptively found via solution of the optimization problem in line 5 of Algorithm 6. The GMM approximation of the intermediate posteriors and the IMH-GM kernel of Algorithm 4, used to move the particles after resampling, are also key ingredients of this SMC sampler. Unlike PFGM and IBIS, this SMC algorithm cannot provide the on-line solution within a single run, and has to be rerun from scratch for every new target posterior of interest. In this regard, use of Algorithm 6 for on-line inference is impractical. We choose to include this algorithm for the purpose of the comparative assessment in this article, as off-line algorithms are generally considered to be better suited for the full posterior uncertainty quantification of time-invariant parameters.
Algorithm 6 IMH-GM-based Sequential Monte Carlo (SMC)
1: generate $ {N}_{\mathrm{par}} $ initial particles $ {\boldsymbol{\theta}}^{(i)} $ from $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ i=1,\dots, {N}_{\mathrm{par}} $
2: evaluate for every particle the full likelihood $ {L}^{(i)}=L\left({\mathbf{y}}_{1:N}|{\boldsymbol{\theta}}^{(i)}\right) $ and the prior $ {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right) $
3: set $ q=0 $
4: while $ q\ne 1 $ do
5: solve $ {\left({\sum}_{i=1}^{N_{\mathrm{par}}}{\left({L}^{(i)}\right)}^{dq}\right)}^2/{\sum}_{i=1}^{N_{\mathrm{par}}}{\left({L}^{(i)}\right)}^{2\cdot dq}-{N}_{\mathrm{T}}=0 $ for $ dq $
6: set $ {q}_{\mathrm{new}}=\min \left[q+ dq,1\right] $
7: set $ dq={q}_{\mathrm{new}}-q $ and $ q={q}_{\mathrm{new}} $
8: evaluate particle weights $ {w}^{(i)}\propto {\left({L}^{(i)}\right)}^{dq} $ and normalize $ \mathrm{s}.\mathrm{t}.\hskip1em {\sum}_{i=1}^{N_{\mathrm{par}}}{w}^{(i)}=1 $
9: EM: fit a Gaussian mixture proposal distribution $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $ according to $ \left\{{\boldsymbol{\theta}}^{(i)},{w}^{(i)}\right\} $
10: resample $ {N}_{\mathrm{par}} $ new particles $ \left\{{\boldsymbol{\theta}}^{(i)},{\left({L}^{(i)}\right)}^q\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $ with replacement according to $ {w}^{(i)} $
11: IMH-GM step with inputs $ \left\{{\boldsymbol{\theta}}^{(i)},{\left({L}^{(i)}\right)}^q\cdot {\pi}_{\mathrm{pr}}\left({\boldsymbol{\theta}}^{(i)}\right)\right\} $ , $ {\pi}_{\mathrm{pr}}\left(\boldsymbol{\theta} \right) $ , $ {L}^q\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right) $ , and $ {g}_{\mathrm{GM}}\left(\boldsymbol{\theta} \right) $
12: reset particle weights to $ {w}^{(i)}=1/{N}_{\mathrm{par}} $
13: end while
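The tempering loop of Algorithm 6 can be sketched in Python as follows; `full_log_lik` returns $ \log L\left({\mathbf{y}}_{1:N}|\boldsymbol{\theta} \right) $ for an array of particles, and the `imh_gm_move` routine from the sketch above is assumed to be in scope. As in the previous sketches, the GMM is fitted on an equally weighted resampled set, and the full likelihood is re-evaluated after the move for simplicity.

```python
import numpy as np
from scipy.optimize import brentq
from sklearn.mixture import GaussianMixture

def smc_sampler(sample_prior, log_prior, full_log_lik, n_par=5000, c=0.5, n_gm=8, rng=None):
    """IMH-GM-based SMC sampler with adaptive likelihood tempering (Algorithm 6)."""
    rng = np.random.default_rng() if rng is None else rng
    theta = sample_prior(n_par, rng)
    log_l = full_log_lik(theta)
    q = 0.0
    while q < 1.0:
        def ess(dq):                                   # ESS of weights proportional to L^dq
            lw = dq * log_l - np.logaddexp.reduce(dq * log_l)
            return np.exp(-np.logaddexp.reduce(2.0 * lw))
        dq = (1.0 - q) if ess(1.0 - q) >= c * n_par else \
             brentq(lambda s: ess(s) - c * n_par, 1e-12, 1.0 - q)   # line 5
        q_new = min(q + dq, 1.0)                                     # lines 6-7
        dq, q = q_new - q, q_new
        log_w = dq * log_l - np.logaddexp.reduce(dq * log_l)         # line 8
        w = np.exp(log_w)
        gmm = GaussianMixture(n_components=n_gm, reg_covar=1e-6)
        gmm.fit(theta[rng.choice(n_par, size=n_par, p=w)])            # line 9 (equally weighted refit)
        idx = rng.choice(n_par, size=n_par, p=w)                      # line 10: resampling
        theta, log_l = theta[idx].copy(), log_l[idx].copy()
        tempered_log_post = lambda x: log_prior(x[None, :])[0] + q * full_log_lik(x[None, :])[0]
        theta, _ = imh_gm_move(theta, log_prior(theta) + q * log_l,
                               tempered_log_post, gmm, rng=rng)       # line 11
        log_l = full_log_lik(theta)        # refresh after the move (redundant but simple)
    return theta
```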
2.4. Computational remarks
The algebraic operations in all presented algorithms are implemented in the logarithmic scale, that is, they employ evaluations of the logarithm of the likelihood function, which ensures numerical stability. Furthermore, the EM step for fitting the GMM is performed after initially transforming the prior joint probability density function of $ \boldsymbol{\theta} $ to an underlying vector $ \mathbf{u} $ of independent standard normal random variables (Kiureghian and Liu, 1986). In standard normal space, the parameters are decorrelated, which enhances the performance of the EM algorithm.
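For instance, normalizing particle weights in the logarithmic scale avoids numerical underflow when likelihood values span many orders of magnitude; a minimal log-sum-exp sketch is given below in Python for illustration.

```python
import numpy as np

def normalize_log_weights(log_w):
    """Return normalized weights and their logarithms from unnormalized log-weights,
    using the log-sum-exp trick for numerical stability."""
    log_w = np.asarray(log_w, dtype=float)
    log_norm = np.logaddexp.reduce(log_w)        # log of the sum of the unnormalized weights
    log_w_norm = log_w - log_norm
    return np.exp(log_w_norm), log_w_norm

# example: log-likelihood values around -1000, whose direct exponentiation underflows to zero
w, _ = normalize_log_weights([-1050.0, -1048.0, -1045.0])
print(w)   # approximately [0.006, 0.047, 0.946]
```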
3. Numerical Investigations
3.1. Low-dimensional case study: Paris–Erdogan fatigue crack growth model
A fracture mechanics-based model serves as the first case study. It describes the fatigue crack growth evolution under increasing stress cycles (Paris and Erdogan, 1963; Ditlevsen and Madsen, 1996). Crack growth is governed by the first-order differential equation (7), known as the Paris–Erdogan law,

$ \frac{da}{dn}=C{\left(\Delta S\sqrt{\pi a}\right)}^m,\hskip2em (7) $
where $ \boldsymbol{\theta} =\left[{a}_0,\Delta S,{C}_{\mathrm{ln}},m\right] $ is a vector containing the uncertain time-invariant model parameters. Specifically, $ a $ is the crack length, $ {a}_0 $ is the initial crack length, $ n $ is the number of stress cycles, $ \Delta S $ is the stress range per cycle when assuming constant stress amplitudes, and $ C $ and $ m $ represent empirically determined model parameters; $ {C}_{\mathrm{ln}} $ corresponds to the natural logarithm of $ C $ .
The solution to this differential equation, with boundary condition $ a\left(n=0\right)={a}_0 $ , can be written as a function of the number of stress cycles $ n $ and the vector $ \boldsymbol{\theta} =\left[{a}_0,\Delta S,{C}_{\mathrm{ln}},m\right] $ as (for the derivation see, e.g., Ditlevsen and Madsen (1996)):
We assume that noisy measurements $ {y}_n $ of the crack length are obtained sequentially at different values of $ n $ . The measurement equation (9) assumes a multiplicative lognormal measurement error, $ \exp \left({\varepsilon}_n\right) $ .
In the numerical investigations that follow, the measurement equation (9) is used for generating synthetic measurements of the deterioration state. In this context, the multiplicative lognormal measurement error ensures that the generated measurements of the deterioration state cannot take negative values.
Under this assumption in equation (9), the likelihood function for a measurement at a given $ n $ is shown in equation (10).
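For illustration, a Python sketch of the crack growth model and the per-measurement log-likelihood is given below. It assumes the standard closed-form solution of the Paris–Erdogan law for $ m\ne 2 $ and a zero-mean normal $ {\varepsilon}_n $ with standard deviation `sigma_eps`; the exact parametrization used in equations (8)–(10) and Table 1 may differ in detail.

```python
import numpy as np
from scipy.stats import norm

def crack_length(n_cycles, a0, dS, C_ln, m):
    """Standard closed-form solution of da/dn = C*(dS*sqrt(pi*a))^m with a(0) = a0, m != 2
    (cf. Ditlevsen and Madsen, 1996); C = exp(C_ln)."""
    C = np.exp(C_ln)
    return ((1.0 - m / 2.0) * C * dS ** m * np.pi ** (m / 2.0) * n_cycles
            + a0 ** (1.0 - m / 2.0)) ** (1.0 / (1.0 - m / 2.0))

def log_likelihood(y_n, n_cycles, theta, sigma_eps):
    """log L(y_n | theta) for the multiplicative lognormal error y_n = a(n, theta)*exp(eps_n):
    ln(y_n) is then normal with mean ln a(n, theta) and standard deviation sigma_eps."""
    a0, dS, C_ln, m = theta
    ln_a = np.log(crack_length(n_cycles, a0, dS, C_ln, m))
    return norm.logpdf(np.log(y_n), loc=ln_a, scale=sigma_eps) - np.log(y_n)
```

Within the filters, each call of `crack_length` corresponds to one of the model evaluations counted in the cost comparison of Section 3.1.3.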
Table 1 shows the prior probability distribution model for each random variable in the vector $ \boldsymbol{\theta} $ (Ditlevsen and Madsen, 1996; Straub, 2009), as well as the assumed probabilistic model of the measurement error. In this case study we are dealing with a nonlinear model and a parameter vector with non-Gaussian prior distribution.
3.1.1. Markovian state-space representation for application of on-line filters
A Markovian state-space representation of the deterioration process is required for application of on-line filters. The number of stress cycles is discretized as $ n=k\Delta n $ , with $ k=1,\dots, 100 $ denoting the estimation time step and $ \Delta n=1\times {10}^5 $ the number of stress cycles per time step. The dynamic and measurement equations of the discrete-time state-space representation of the fatigue crack growth model with unknown time-invariant parameters $ \boldsymbol{\theta} =\left[{a}_0,\Delta S,{C}_{\mathrm{ln}},m\right] $ are shown below:

$ {\boldsymbol{\theta}}_k={\boldsymbol{\theta}}_{k-1},\hskip2em {y}_k=a\left(k\Delta n,{\boldsymbol{\theta}}_k\right)\cdot \exp \left({\varepsilon}_k\right).\hskip2em (11) $
The state-space model of equation (11) is nonlinear and the prior is non-Gaussian. For reasons explained in Section 2, the subscript $ k $ in $ {\boldsymbol{\theta}}_k $ is dropped in the remainder of this section.
3.1.2. Reference posterior solution
For the purpose of performing a comparative assessment of the different filters, an underlying “true” realization of the fatigue crack growth process $ {a}^{\ast}\left(n,\boldsymbol{\theta} \right) $ is generated for $ n=k\Delta n $ , with $ k=1,\dots, 100 $ and $ \Delta n=1\times {10}^5 $ . This realization corresponds to the randomly generated “true” vector of time-invariant parameters $ {\boldsymbol{\theta}}^{\ast }=\left[{a}_0^{\ast }=2.0,\Delta {S}^{\ast }=50.0,{C}_{\mathrm{ln}}^{\ast }=-33.5,{m}^{\ast }=3.7\right] $ . Sequential synthetic crack monitoring measurements $ {y}_k $ are sampled from the measurement equation (9) for $ a\left(k\Delta n,{\boldsymbol{\theta}}^{\ast}\right) $ , and for randomly generated measurement noise samples $ \exp \left({\varepsilon}_k\right) $ . These measurements are scattered in green in Figure 3.
Based on the generated measurements, the sequence of reference posterior distributions $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:k}\right) $ is obtained using the prior distribution as an envelope distribution for rejection sampling (Smith and Gelfand, 1992; Rubinstein and Kroese, 2016). More specifically, for each of the 100 posterior distributions of interest $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:k}\right) $ , $ {10}^5 $ independent samples are generated. The results of this reference posterior estimation of the four time-invariant model parameters are plotted in Figure 2. With these posterior samples, the reference filtered estimate of the crack length $ a\left(k\Delta n,\boldsymbol{\theta} \right) $ at each estimation step is also obtained via the model of equation (8) and plotted in Figure 3. In the left panel of this figure, the filtered state is plotted in logarithmic scale. In an off-line estimation, a single posterior density is of interest. One such reference posterior estimation result for the last estimation step, $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:100}\right) $ , is plotted for illustration in Figure 4.
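A Python sketch of this reference solution is given below; `log_lik_full(theta)` is an assumed helper returning the joint log-likelihood of $ {\mathbf{y}}_{1:k} $ for an array of candidate parameter vectors, and the per-batch maximum of the likelihood is used as a proxy for the global supremum required by the exact envelope argument.

```python
import numpy as np

def rejection_sample_posterior(sample_prior, log_lik_full, n_target=10**5,
                               batch=10**6, rng=None):
    """Rejection sampling with the prior as the envelope distribution:
    a prior sample theta is accepted with probability L(y_1:k | theta) / L_max."""
    rng = np.random.default_rng() if rng is None else rng
    accepted = []
    n_accepted = 0
    while n_accepted < n_target:
        theta = sample_prior(batch, rng)                 # candidates drawn from the prior
        log_l = log_lik_full(theta)
        keep = np.log(rng.uniform(size=batch)) < (log_l - log_l.max())
        accepted.append(theta[keep])
        n_accepted += keep.sum()
    return np.concatenate(accepted)[:n_target]           # independent posterior samples
```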
3.1.3. Comparative assessment of the investigated on-line and off-line filters
We apply the PFGM filter with 5,000 and 50,000 particles, the IBIS filter with 5,000 particles, and the SMC filter with 5,000 particles for performing on-line and off-line time-invariant parameter estimation tasks. We evaluate the performance of each filter by taking the relative error of the estimated mean and standard deviation of each of the four parameters with respect to the reference posterior solution. For example, the relative error in the estimation of the mean of parameter $ {a}_0 $ at a certain estimation step $ k $ is computed as $ |\frac{\mu_{a_0,k}-{\hat{\mu}}_{a_0,k}}{\mu_{a_0,k}}| $ , where $ {\mu}_{a_0,k} $ is the reference posterior mean from rejection sampling (Section 3.1.2), and $ {\hat{\mu}}_{a_0,k} $ is the posterior mean estimated with each filter. Each filter is run 50 times, and the mean relative error of the mean and the standard deviation of each parameter, together with the 90% credible intervals (CI), are obtained. These are plotted in Figure 5.
Figure 6 plots the $ {L}^2 $ relative error norm of the mean and the standard deviation of all four parameters, that is, the quantity of equation (12) (here formulated for the mean at estimation step $ k $ ).
where $ d $ is the dimensionality of the time-invariant parameter vector $ \boldsymbol{\theta} $ (in this example $ d=4 $ ). More specifically, Figure 6 plots the mean and credible intervals of the $ {L}^2 $ relative error norm of the estimated mean and standard deviation, as obtained from 50 runs of each filter.
Figures 5 and 6 reveal that, when all three filters are run with the same number of particles, the IBIS and SMC filters yield superior performance over PFGM. When the number of particles in the PFGM filter is increased to 50,000, the PFGM filter performance is comparable to that of the IBIS and SMC filters. In estimating the mean, the mean $ {L}^2 $ relative error norm obtained from the PFGM filter with 50,000 particles is slightly larger than the corresponding error obtained from IBIS and SMC with 5,000 particles, while the 90% credible intervals of the PFGM filter estimation are still wider. In estimating the standard deviation, the PFGM filter with 50,000 particles proves competitive.
Figures 5 and 6 show the estimation accuracy of each filter when used for on-line inference, that is, for estimating the whole sequence of 100 posterior distributions $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:k}\right) $ , $ k=1,\dots, 100 $ . The PFGM and IBIS filters, being intrinsically on-line filters, provide the whole posterior sequence with one run. On the other hand, the off-line SMC filter is run anew for each of the 100 required posterior estimations. Hence, Figures 5 and 6 contain the results of both the on-line and the off-line inference. If one is interested in the off-line estimation accuracy at a specific stress cycle $ n $ , one can simply consider a vertical “cut” at $ n $ .
Table 2 documents the computational cost associated with each filter, expressed in the form of required model evaluations induced by calls of the likelihood function. When using the term model, we refer to the model defined in equation (8), which corresponds to an analytical expression with negligible associated runtime. However, unlike the simple measurement equation that we have assumed in this example, in many realistic deterioration monitoring settings, the deterioration state cannot be measured directly (e.g., in vibration-based structural health monitoring; Kamariotis et al., 2022). In such cases, each deterioration model evaluation often entails evaluation of a finite element (FE) model, which has substantial runtime. It therefore appears appropriate to evaluate the filters’ computational cost in terms of required model evaluations. The on-line PFGM filter with 5,000 particles requires $ 5\times {10}^5 $ model evaluations, and yields by far the smallest computational cost, while at the same time providing the solution to both on-line and off-line estimation tasks. However, it also yields the worst performance in terms of accuracy of the posterior estimates. Running the IBIS filter with 5,000 particles, which performs MCMC move steps, leads to $ 3.4\times {10}^6 $ model evaluations. Comparing this value against the $ 5\times {10}^5 $ model evaluations required by the PFGM filter with 5,000 particles for performing the same task distinctly shows the computational burden associated with the MCMC move steps, which require processing of the whole measurement data set when estimating the acceptance probability. However, the IBIS filter also leads to enhanced estimation accuracy, which might prove significant when the subsequent tasks entail prognosis of the deterioration evolution, the structural reliability or the remaining useful lifetime, and eventually predictive maintenance planning. Using 50,000 particles, the PFGM filter performance improves significantly, at a computational cost that is comparable to that of the IBIS filter with 5,000 particles. For the off-line SMC algorithm, $ 4.5\times {10}^6 $ model evaluations are required only for the task of estimating the final posterior density. The $ 1.9\times {10}^8 $ model evaluations required by the SMC for obtaining the whole sequence of posteriors $ {\pi}_{\mathrm{pos}}\left(\boldsymbol{\theta} |{\mathbf{y}}_{1:k}\right) $ , $ k=1,\dots, 100 $ , clearly demonstrate that off-line MCMC techniques are unsuited to on-line estimation tasks.
3.2. High-dimensional case study: Corrosion deterioration spatially distributed across beam
As a second case study, we employ the deterioration model of equation (13), which describes the spatially and temporally varying corrosion deterioration across the structural beam shown in Figure 7.
$ A(x),B(x),x\in \Omega $ are random fields defined on $ \Omega =\left[0,L\right] $ , with $ L $ denoting the length of the beam, taken as $ L=4 $ m. $ A(x) $ models the deterioration rate, while $ B(x) $ models the nonlinear, power-law-in-time behavior of the deterioration process. The corrosion deterioration $ D\left(t,x\right) $ is therefore also a spatial random field.
A random field, by definition, contains an infinite number of random variables and must therefore be discretized (Vanmarcke, Reference Vanmarcke2010). One of the most common methods for the discretization of random fields is the midpoint method (Der Kiureghian and Ke, Reference Der Kiureghian and Ke1988). Thereby, the spatial domain $ \Omega $ is discretized into $ m $ elements, and the random field is approximated within each element through the random variable that corresponds to the midpoint of the element. In that case, the uncertain time-invariant deterioration model parameter vector is $ \boldsymbol{\theta} =\left[{A}_1,\dots, {A}_m,{B}_1,\dots, {B}_m\right] $ , where $ {A}_i $ and $ {B}_i $ , $ i=1,\dots, m $ , are the random variables corresponding to the element midpoints. With the midpoint discretization, the spatial deterioration $ D\left(t,x\right) $ is parametrized by $ \boldsymbol{\theta} $ .
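The midpoint discretization itself is straightforward; the following minimal Python sketch illustrates it, together with the mapping from a spatial location to its element index (which plays the role of $ {i}_l $ in the measurement equation below). Variable names and values are illustrative and not taken from the accompanying code.

```python
import numpy as np

L_beam = 4.0   # beam length in meters
m = 25         # number of elements in the midpoint discretization

edges = np.linspace(0.0, L_beam, m + 1)
midpoints = 0.5 * (edges[:-1] + edges[1:])   # x-coordinate of each element midpoint

def element_of(x_loc):
    """Index of the midpoint element containing location x_loc (0-based)."""
    return min(int(x_loc * m / L_beam), m - 1)
```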
We assume that noisy measurements of the corrosion deterioration state $ {D}_{t,l}\left(\boldsymbol{\theta} \right):= D\left(t,{x}_l,\boldsymbol{\theta} \right) $ at time $ t $ and at certain locations $ l $ of the beam are obtained sequentially (summarized in one measurement per year) from $ {n}_l $ sensors deployed at these locations ( $ {n}_l=10 $ sensor locations are shown in Figure 7). The measurement equation (14), describing the corrosion measurement at time $ t $ and sensor location $ l $ , assumes a multiplicative measurement error, $ \exp \left({\varepsilon}_{t,l}\right) $ .
where $ {i}_l $ returns the discrete element number of the midpoint discretization within which the measurement location $ l $ lies. Table 3 shows the prior distribution model for the two random fields of the deterioration model of equation (13) and the assumed probabilistic model of the multiplicative measurement error. Since $ A(x) $ is a lognormal random field, $ \ln \left(A(x)\right) $ is a Gaussian random field. For both random fields $ \ln \left(A(x)\right) $ and $ B(x) $ , the exponential correlation model with a correlation length of 2 m is applied (Sudret and Der Kiureghian, Reference Sudret and Der Kiureghian2000).
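As an illustration, the prior covariance of the discretized fields under the exponential correlation model can be assembled as in the following Python sketch; the marginal standard deviations are placeholders standing in for the values of Table 3.

```python
import numpy as np

m, L_beam, l_c = 25, 4.0, 2.0                   # elements, beam length, correlation length
midpoints = (np.arange(m) + 0.5) * L_beam / m   # element midpoints, as in the sketch above

sigma_lnA, sigma_B = 0.3, 0.1                   # placeholder marginal standard deviations (see Table 3)

dist = np.abs(midpoints[:, None] - midpoints[None, :])
R = np.exp(-dist / l_c)                         # exponential correlation matrix at the midpoints

cov_lnA = sigma_lnA**2 * R                      # prior covariance of ln(A_1), ..., ln(A_m)
cov_B = sigma_B**2 * R                          # prior covariance of B_1, ..., B_m
```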
The goal is to update the time-invariant deterioration model parameters $ \boldsymbol{\theta} =\left[{A}_1,\dots, {A}_m,{B}_1,\dots, {B}_m\right] $ given sequential noisy corrosion measurements $ {y}_{t,l} $ from $ {n}_l $ deployed sensors. The dimensionality of the problem is $ d=2\times m $ . Hence, the more elements in the midpoint discretization, the higher the dimensionality of the parameter vector.
The main goal of this second case study is to investigate the effect of the problem dimensionality and the amount of sensor information on the posterior results obtained with each filter. We choose the following three midpoint discretization schemes:
1. $ m=25 $ elements: $ d=50 $ time-invariant parameters to estimate.
2. $ m=50 $ elements: $ d=100 $ time-invariant parameters to estimate.
3. $ m=100 $ elements: $ d=200 $ time-invariant parameters to estimate.
Furthermore, we choose the following three potential sensor arrangements:
1. $ {n}_l=2 $ sensors (the $ 4\mathrm{th} $ and $ 7\mathrm{th} $ sensors of Figure 7).
2. $ {n}_l=4 $ sensors (the $ 1\mathrm{st},4\mathrm{th},7\mathrm{th} $ , and $ 10\mathrm{th} $ sensors of Figure 7).
3. $ {n}_l=10 $ sensors of Figure 7.
We therefore study nine different cases of varying problem dimensionality and number of sensors.
3.2.1. Markovian state-space representation for application of on-line filters
A Markovian state-space representation of the deterioration process is required for application of on-line filters. The dynamic and measurement equations are shown in equation (15). The measurement equation is written in the logarithmic scale. Time $ t $ is discretized in yearly estimation time steps $ k $ , that is, $ k=1,\dots, 50 $ , and the subscript $ l=1,\dots, {n}_l $ corresponds to the sensor location.
where $ {\boldsymbol{\theta}}_k $ denotes the time-invariant parameter vector at time step $ k $ . In the logarithmic scale, both the dynamic and measurement equations are linear functions of Gaussian random variables. For reasons explained in Section 2, the subscript $ k $ in $ {\boldsymbol{\theta}}_k $ is dropped in the following.
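To illustrate this linearity, the following Python sketch assembles the log-scale measurement operator, assuming a power-law deterioration model of the form $ D\left(t,x\right)=A(x){t}^{B(x)} $ for equation (13), so that $ \ln {D}_{t,l}=\ln {A}_{i_l}+{B}_{i_l}\ln t $ is linear in the Gaussian vector $ \left[\ln {A}_1,\dots, \ln {A}_m,{B}_1,\dots, {B}_m\right] $ . The exact form of equations (13) and (15) should be read from the article itself; this sketch only illustrates the structure under the stated assumption.

```python
import numpy as np

def measurement_matrix(t, sensor_elements, m):
    """Rows map [ln A_1..m, B_1..m] to ln D at each monitored element at year t (assumed power law)."""
    H = np.zeros((len(sensor_elements), 2 * m))
    for row, i_l in enumerate(sensor_elements):
        H[row, i_l] = 1.0              # coefficient of ln A_{i_l}
        H[row, m + i_l] = np.log(t)    # coefficient of B_{i_l}
    return H
```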
3.2.2. Underlying “true” realization
To generate a high-resolution underlying “true” realization of the two random fields $ A(x) $ and $ B(x) $ , and the corresponding synthetic monitoring data set, we employ the Karhunen–Loève (KL) expansion using the first 400 KL modes. The KL expansion is an alternative random field discretization scheme to the midpoint method that represents the random field in terms of the eigenfunctions of its autocovariance function (Ghanem and Spanos, Reference Ghanem and Spanos1991; Sudret and Der Kiureghian, Reference Sudret and Der Kiureghian2000). The implementation of the KL expansion can be found in the Matlab codes accompanying this article. We remark that a fine-resolution KL expansion is chosen to represent the “true” realization in order to avoid the inverse crime (Wirgin, Reference Wirgin2004). These realizations are shown in the left panel of Figure 8. Given these $ A(x) $ and $ B(x) $ realizations, the underlying “true” realizations of the deterioration process are generated at 10 specific beam locations, which correspond to the 10 potential sensor placement locations shown in Figure 7. Subsequently, a synthetic corrosion sensor measurement data set (one measurement per year) at these 10 locations is generated from the measurement equation (14). These data are shown in the right panel of Figure 8.
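For readers without access to the accompanying Matlab codes, a numerical Python sketch of a truncated KL expansion on a fine grid is given below. It approximates the KL eigenpairs by an eigendecomposition of the correlation matrix evaluated on that grid; the grid size, marginal moments, and random seed are illustrative placeholders rather than the settings actually used.

```python
import numpy as np

rng = np.random.default_rng(0)                        # illustrative seed
n_grid, l_c, n_kl = 1000, 2.0, 400                    # fine grid, correlation length, KL modes
x = np.linspace(0.0, 4.0, n_grid)                     # beam domain [0, 4] m

C = np.exp(-np.abs(x[:, None] - x[None, :]) / l_c)    # exponential correlation matrix
eigval, eigvec = np.linalg.eigh(C)
idx = np.argsort(eigval)[::-1][:n_kl]                 # keep the 400 dominant modes
lam = np.clip(eigval[idx], 0.0, None)
phi = eigvec[:, idx]

xi = rng.standard_normal(n_kl)                        # standard normal KL coefficients
std_field = phi @ (np.sqrt(lam) * xi)                 # zero-mean, approximately unit-variance field

mu_lnA, sigma_lnA = -5.0, 0.3                         # placeholder moments (see Table 3)
ln_A_true = mu_lnA + sigma_lnA * std_field            # one "true" realization of ln A(x)
A_true = np.exp(ln_A_true)
```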
3.2.3. Reference posterior solution
For the investigated linear Gaussian state space representation of equation (15), we create reference on-line posterior solutions for each of the nine considered cases by applying the Kalman filter (KF) (Kalman, Reference Kalman1960), which is the closed-form solution to the Bayesian filtering equations. The process noise covariance matrix in the KF equations is set equal to zero. The linear Gaussian nature of the chosen problem ensures existence of an analytical reference posterior solution obtained with the KF. One such reference on-line posterior solution for the case described by $ m=25 $ elements ( $ d=50 $ ) and $ {n}_l=4 $ sensors is shown in Figure 9.
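A compact Python sketch of such a KF reference solution is given below. It treats the parameters as a static state (zero process noise) and uses the log-scale linear measurement operator of Section 3.2.1 under the power-law assumption stated there; prior moments and the measurement-noise standard deviation are placeholders for the entries of Table 3.

```python
import numpy as np

def kf_reference(mu0, P0, y_log, sensor_elements, m, sigma_eps):
    """Sequential KF update of [ln A_1..m, B_1..m] with zero process noise.

    y_log[k, :] holds the log-measurements of year t = k + 1 at the monitored elements.
    """
    mu, P = mu0.copy(), P0.copy()
    posteriors = []
    for k in range(y_log.shape[0]):
        t = k + 1
        # Log-scale linear measurement operator (assumed power-law model, see Section 3.2.1)
        H = np.zeros((len(sensor_elements), 2 * m))
        for row, i_l in enumerate(sensor_elements):
            H[row, i_l] = 1.0
            H[row, m + i_l] = np.log(t)
        R = sigma_eps**2 * np.eye(len(sensor_elements))
        S = H @ P @ H.T + R
        K = np.linalg.solve(S, H @ P).T          # Kalman gain P H^T S^{-1} (P, S symmetric)
        mu = mu + K @ (y_log[k] - H @ mu)
        P = (np.eye(2 * m) - K @ H) @ P
        posteriors.append((mu.copy(), P.copy()))
    return posteriors
```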
3.2.4. Comparative assessment of the investigated on-line and off-line filters
The goal of this section is to offer a comparative assessment of the three Bayesian filtering algorithms presented in this article when applied to a high-dimensional problem. To be able to derive a reference solution, as described above, a linear Gaussian state-space representation of a structural deterioration problem has been defined (equation (15)). We apply the tPFGM filter, the tIBIS filter, and the SMC filter, all with $ {N}_{\mathrm{par}} $ = 2,000 particles, for estimating the time-invariant parameter vector $ \boldsymbol{\theta} $ . For each of the nine cases of varying problem dimensionality and number of sensors described above, we compute the $ {L}^2 $ relative error norm of the estimated means, correlation coefficients, and standard deviations of the parameters with respect to the corresponding KF reference posterior solution, that is, we estimate a quantity as in equation (12) for all estimation steps $ k=1,\dots, 50 $ . In Figures 10–12, we plot the mean and credible intervals of these relative errors as obtained from 50 different runs. The off-line SMC filter, which does not provide the on-line solution within a single run, is run anew to estimate the single posterior density of interest at years 10, 20, 30, 40, and 50; in between, the relative error is linearly interpolated. Although each of the nine panels in the figures corresponds to a different case with a different underlying KF reference solution, their y-axes have the same scaling. Table 4 documents the computational cost of each filter in each considered case, measured by the average number of evaluations of the model of equation (13).
Note. For the SMC, the required model evaluations for obtaining the single final posterior density are reported.
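For clarity, the error measure used in this comparison can be sketched in Python as follows (in the spirit of equation (12)): the $ {L}^2 $ norm of the difference between a filter's posterior summary (mean, standard deviations, or correlation coefficients) and the corresponding KF reference, normalized by the $ {L}^2 $ norm of the reference.

```python
import numpy as np

def l2_relative_error(estimate, reference):
    """Relative L2 error of a posterior summary (e.g., mean vector) w.r.t. the KF reference."""
    estimate, reference = np.asarray(estimate), np.asarray(reference)
    return np.linalg.norm(estimate - reference) / np.linalg.norm(reference)
```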
Figures 10 and 11 show that the off-line IMH-GM-based SMC filter yields the best performance in estimating the KF reference posterior mean and correlation for all nine considered cases, while at the same time producing the narrowest credible intervals. Comparison of the relative errors obtained with the SMC and tIBIS filters reveals that, although both rely on the IMH-GM MCMC move step, the on-line tIBIS filter leads to larger estimation errors. The on-line tPFGM and tIBIS filters generate quite similar results in estimating the reference posterior mean and correlation, thus rendering the benefit of the MCMC move step in tIBIS unclear, except in cases with more sensors and lower parameter dimension. Figures 10 and 11 reveal a slight trend: for fixed dimensionality, the availability of more sensors (i.e., stronger information content in the likelihood function) leads to a slight decrease in the relative errors when using the SMC and tIBIS filters, whereas the opposite trend can be identified for the tPFGM filter. Increasing the problem dimensionality (for a fixed number of sensors) does not appear to have a strong influence on the posterior results in any of the columns of Figures 10–12, a result that initially appears puzzling.
Figure 12 conveys that the tPFGM filter, which entirely depends on the GMM posterior approximation, induces the smallest relative errors in the estimation of the standard deviation of the parameters in all considered cases. This result reveals a potential inadequacy of the single application of the IMH-GM kernel for the move step within the tIBIS and SMC filters in properly exploring the space of $ \boldsymbol{\theta} $ . In all 50 runs of the tIBIS and SMC filters, the standard deviation of the parameters is consistently underestimated compared to the reference, unlike when applying the tPFGM filter.
Based on the discussion of Section 2.2, we introduce a burn-in period of $ {n}_B $ = 5 in the IMH-GM kernel of Algorithm 4 and perform 50 new runs of the tIBIS and SMC filters. One can expect that the inclusion of a burn-in is more likely to ensure sufficient exploration of the intermediate posterior distributions. However, at the same time, the computational cost of tIBIS and SMC increases significantly, with a much larger number of required model evaluations than in Table 4. In Figures 13 and 14, we plot the mean and credible intervals for the relative errors in the estimation of the mean and standard deviation of the parameters. Comparing Figures 10 and 13, the inclusion of burn-in is shown to lead to an improved performance of tIBIS and SMC in estimating the mean of the parameters in all cases. This improvement is more evident in the lower-dimensional case with 25 elements, and lessens as the problem dimension increases. Hence, it is only after the inclusion of burn-in, which leads to an enhanced posterior solution, that one starts observing the anticipated deterioration of the tIBIS and SMC filters’ performance with increasing dimensionality. This effect was masked in the results of Figure 10 without burn-in. This point becomes more evident when looking at the relative errors of the estimated standard deviation in Figure 14. With burn-in, the tIBIS and SMC filters provide better results than the tPFGM filter in estimating the standard deviation in the case of 25 elements, but perform progressively worse as the dimensionality increases, where they underestimate the KF reference standard deviation. This underestimation is clearly illustrated in Figure 15. The reason for this behavior is the poor performance of the IMH-GM algorithm in high dimensions, which is numerically demonstrated in Papaioannou et al. (Reference Papaioannou, Papadimitriou and Straub2016). We suspect that this behavior is related to the degeneracy of the acceptance probability of MH samplers in high dimensions, which has been extensively discussed in the literature for random walk samplers, for example, in Gelman et al. (Reference Gelman, Gilks and Roberts1997), Au and Beck (Reference Au and Beck2001), Katafygiotis and Zuev (Reference Katafygiotis and Zuev2008), Beskos and Stuart (Reference Beskos, Stuart, L’ Ecuyer and Owen2009), Cotter et al. (Reference Cotter, Roberts, Stuart and White2013), and Papaioannou et al. (Reference Papaioannou, Betz, Zwirglmaier and Straub2015). Single application of the IMH-GM kernel without burn-in yielded acceptance rates of around 50% for all cases. With the inclusion of burn-in, in higher dimensions, the acceptance rate in IMH-GM drops significantly in the later burn-in steps, leading to the rejection of most proposed particles. To alleviate this issue, one could consider using the preconditioned Crank-Nicolson (pCN) sampler to perform the move step within the IBIS and SMC filters, whose performance is shown to be independent of the dimension of the parameter space when the prior is Gaussian (Cotter et al., Reference Cotter, Roberts, Stuart and White2013); a sketch of such a move step is given below.
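The following minimal Python sketch shows a pCN move step for a Gaussian prior $ \mathcal{N}\left({\mathbf{m}}_{\mathrm{pr}},{\mathbf{C}}_{\mathrm{pr}}\right) $ . The step size $ \beta $ and the form of the (possibly tempered) log-likelihood are illustrative; this is not part of the filters implemented in this article.

```python
import numpy as np

def pcn_move(theta, log_likelihood, m_pr, C_pr, beta=0.2, rng=np.random.default_rng()):
    """One pCN step: the acceptance ratio involves only the likelihood, not the Gaussian prior."""
    xi = rng.multivariate_normal(np.zeros(len(theta)), C_pr)
    proposal = m_pr + np.sqrt(1.0 - beta**2) * (theta - m_pr) + beta * xi
    log_alpha = log_likelihood(proposal) - log_likelihood(theta)
    if np.log(rng.uniform()) < log_alpha:
        return proposal, True    # accepted
    return theta, False          # rejected
```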
An increase in dimensionality does not seem to have any influence on the results obtained with the tPFGM filter. The illustrated efficacy of the tPFGM filter in estimating the time-invariant parameters in all considered cases of increasing dimensionality is related to the nature of the studied problem. The tPFGM filter relies entirely on the GMM approximation of the posterior distribution within its resampling process, in that it simply “accepts” all the $ {N}_{\mathrm{par}} $ GMM-proposed particles, unlike the tIBIS and SMC filters, which contain the degenerating acceptance-rejection step within the IMH-GM move step. Clearly, the worse the GMM fit, the worse the expected performance of the tPFGM filter. The particular case investigated here has a Gaussian reference posterior solution; hence, the GMM fitted by EM proves effective in approximating the posterior with a relatively small number of particles, even when going up to $ d $ = 200 dimensions, thus providing a good proposal distribution for sampling $ {N}_{\mathrm{par}} $ new particles in tPFGM. As reported in Table 4, the tPFGM filter is associated with a significantly lower computational cost than its MCMC-based counterparts.
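The GMM-based resampling idea underlying the PFGM/tPFGM filters can be sketched in Python as follows: fit a Gaussian mixture by EM to the current weighted particle set and draw a fresh, equally weighted set from it. The number of mixture components and the handling of the weights (multinomial resampling before the EM fit, since scikit-learn's GaussianMixture does not accept sample weights) are illustrative choices and not necessarily those of the actual implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def gmm_resample(particles, weights, n_components=5, rng=np.random.default_rng()):
    """particles: (N, d) array; weights: normalized importance weights of length N."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)   # weight-proportional resampling
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(particles[idx])                                            # EM fit of the mixture
    new_particles, _ = gmm.sample(len(particles))                      # fresh, equally weighted particles
    return new_particles, np.full(len(particles), 1.0 / len(particles))
```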
4. Concluding Remarks
In this article, we present in algorithmic detail three different on-line and off-line Bayesian filters. The on-line filters are specifically tailored for quantifying the full posterior uncertainty of time-invariant deterioration model parameters in long-term monitoring settings. Specifically, the three presented methods are an on-line particle filter with Gaussian mixture resampling (PFGM), an on-line iterated batch importance sampling (IBIS) filter, and an off-line sequential Monte Carlo (SMC) filter, which applies simulated annealing to sequentially arrive at a single posterior density of interest. The IBIS and SMC filters perform Markov Chain Monte Carlo (MCMC) move steps via application of an independent Metropolis-Hastings kernel with a Gaussian mixture proposal distribution (IMH-GM) whenever degeneracy is identified. A simulated annealing process (tempering of the likelihood function) is further incorporated within the update step of the on-line PFGM and IBIS filters for cases in which each new measurement is expected to have a strong information content; this leads to the presented tPFGM and tIBIS filters. The SMC filter can be employed only for off-line inference, while the PFGM, tPFGM, IBIS, and tIBIS filters can perform both on-line and off-line inference tasks.
With the aid of two numerical examples, a comparative assessment of these algorithms for off-line and on-line Bayesian filtering of time-invariant deterioration model parameters is performed. In contrast to other works, the main focus here lies on the efficacy of the investigated Bayesian filters in quantifying the full posterior uncertainty of deterioration model parameters, as well as on the induced computational cost.
For the first nonlinear, non-Gaussian, and low-dimensional case study, the IBIS and SMC filters, which both contain IMH-GM-based MCMC move steps, are shown to consistently outperform the purely on-line PFGM filter in estimating the parameters’ reference posterior distributions. However, they induce a computational cost at least an order of magnitude larger than that of the PFGM filter, when the same initial number of particles is used in all three filters. At a similar computational cost, that is, when the number of particles in the PFGM filter is increased, it achieves enhanced posterior accuracy, comparable to that of the IBIS and SMC filters.
The second case study involves a linear, Gaussian, and high-dimensional model. The focus here lies on evaluating the performance of the investigated filters in increasing dimensions. The linear Gaussian nature of the problem allows access to an exact reference posterior solution with the Kalman filter. For this case study, the results vary with increasing problem dimensionality and number of sensors. The on-line tPFGM filter achieves a consistently satisfactory quality with increasing dimensionality, a behavior explained by the linear Gaussian nature of the problem, while a slight drop in the posterior quality is observed for an increasing amount of sensor information. The tIBIS and SMC filters are shown to consistently outperform the tPFGM filter in lower dimensions; however, they perform progressively worse in higher dimensions, a behavior likely explained by the degeneracy of the acceptance probability of MH samplers in high dimensions. The computational cost of the tIBIS and SMC filters is an order of magnitude larger than that of the tPFGM filter.
Some general conclusions drawn from the delivered comparative assessment are listed below.
1. The purely on-line PFGM (and its tPFGM variant) filter is competitive with MCMC-based filters, especially for higher-dimensional well-behaved problems.
2. The IBIS (and its tIBIS variant) and SMC filters, which contain MCMC move steps, offer better approximations of the posterior mean of the model parameters than the purely on-line PFGM (and its tPFGM variant) filter with the same number of samples, as shown in both studied examples.
3. The independent Metropolis-Hastings (IMH)-based MCMC move step performed within the IBIS, tIBIS, and SMC filters proves inadequate in properly exploring the posterior parameter space in high-dimensional problems.
Finally, to support the reader in selecting the appropriate algorithm for a designated scenario, we provide Table 5, which contains an assessment of the methods presented in this article as a function of problem characteristics.
This article does not investigate the performance of these filters when applied to high-dimensional and highly non-Gaussian problems. Such problems are bottlenecks for most existing filters, and we expect the investigated filters to encounter difficulties in approximating the posterior distributions.
Author contribution
Conceptualization: A.K., L.S., I.P., E.C., D.S.; Formal analysis: A.K., L.S.; Funding acquisition: E.C., D.S.; Methodology: A.K., L.S., I.P., E.C., D.S.; Software: A.K., L.S.; Visualization: A.K., L.S.; Writing—original draft: A.K.; Writing—review and editing: A.K., L.S., I.P., E.C., D.S. All authors approved the final submitted draft.
Competing interest
The authors declare no competing interests exist.
Data availability statement
The data and code that support the findings of this study are openly available at https://github.com/antoniskam/Offline_online_Bayes.
Funding statement
The work of A.K. and E.C. has been carried out with the support of the Technical University of Munich—Institute for Advanced Study, Germany, funded by the German Excellence Initiative and the TÜV SÜD Foundation.