Impact Statement
This research paper discusses the extension of the 4DVarNet framework to ensemble-based approaches. It paves the way to uncertainty quantification and enables to initiate the discussion of how to retrieve posterior pdf from such deep learning framework. This paper also involves pieces of previous work initiated inside the 4DVarNet team of developers in the framework of AI Chair Oceanix. It is meant to help the readers in understanding this framework from a broader point of view while diving into some new and ongoing developments.
1. Introduction
In state-of-the-art data assimilation (DA) systems, quantifying the estimation uncertainty is a key aspect that must be addressed all together with the mean of the state space. One major issue is that uncertainty quantification (UQ) is computationally costly when it comes with variational or ensemble-based approaches and their combination (Evensen et al., Reference Evensen, Vossepoel and van Leeuwen2022). Recently, we proposed an end-to-end deep learning (DL) scheme called 4DVarNet drawing from variational DA to learn jointly priors and solvers. This approach was successfully applied on the reconstruction of sea surface state based on altimetric datasets (Beauchamp et al., Reference Beauchamp, Febvre, Georgenthum and Fablet2022a) in the context of both traditional nadir altimeters and the upcoming SWOT mission. Though, 4DVarNet currently only provides a reconstruction of the state space without quantifying the related uncertainties.
In this work, we draw from geostatistical tools to easily generate ensemble of 4DVarNet reconstructions. The main advantage of our solution is that the ensemble generation is provided as a post-processing of 4DVarNet, which enables to simulate the pdf of the state space based on a lot of members (typically a hundred). It does not require any additional training that might come with a high and potentially untractable computational cost if the optimization scheme involves a large number of ensemble members.
2. Materials
2.1. 4DVarNet: Basic concepts
Recently, new neural-network-based architectures have emerged to solve inverse problems based on variational formulation. More precisely, restarting from the variational DA scheme (Asch et al., Reference Asch, Bocquet and Nodet2016), the state analysis $ {\mathbf{x}}^{\star } $ is obtained by solving the minimization problem:
where the variational cost function $ \mathcal{J}\left(\mathbf{x}\right)={\mathcal{J}}_{\Phi}\left(\mathbf{x},\mathbf{y},\Omega \right) $ is the sum of an observation term and a regularization term involving a dynamical prior operator $ \Phi $ :
and $ \mathbf{y} $ denotes the observations available on subdomain $ \Omega \subset D $ .
Here, the parameterization of operator $ \Phi $ is based on convolutional neural networks (CNN) embedding Markovian priors, see, for example, Fablet et al. (Reference Fablet, Drumetz and Rousseau2020). DL automatic differentiation tools are involved for the computation of the gradient operator $ {\nabla}_{\mathbf{x}}{\mathcal{J}}_{\Phi} $ given the architecture of operator $ \Phi $ and is also a composition of operators such as tensors, convolutions, and activation functions. The proposed end-to-end architecture consists in embedding an iterative gradient-based solver $ \Gamma $ based on a specific variational representation. Following meta-learning schemes (Andrychowicz et al., Reference Andrychowicz, Denil, Gomez, Hoffman, Pfau, Schaul, Shillingford and De Freitas2016), a residual LSTM-based representation (Fablet et al., Reference Fablet, Drumetz and Rousseau2020) of $ \Gamma $ is considered.
Formally, we define $ \Theta =\left\{{\Theta}_{\Phi},{\Theta}_{\Gamma}\right\} $ the set of parameters to train and we state this training of the considered neural scheme as a bilevel optimization problem:
where $ \mathcal{L}\left(\mathbf{x},\mathbf{y},{\mathbf{x}}^{\star}\right) $ defines a training loss, typically the mean squared error (MSE) w.r.t the Ground Truth with additional regularization terms.
Overall, let denote by $ {\Psi}_{\Phi, \Gamma}\left({\mathbf{x}}^{(0)},\mathbf{y},\Omega \right) $ the output of the so-called 4DVarNet end-to-end learning scheme, see Figure 1 and Algorithm 1, given architectures for both NN-based operators $ \Phi $ and $ \Gamma $ , the initialization $ {\mathbf{x}}^{(0)} $ of state $ \mathbf{x} $ and the observations $ \mathbf{y} $ on domain $ \Omega $ , subdomain of $ \mathcal{D} $ where observations are available.
Algorithm 1. 4DVarNet algorithm Fablet et al. (Reference Fablet, Drumetz and Rousseau2020)
Data:
– $ \mathbf{x}\in {\mathrm{\mathbb{R}}}^{T\times m}=\left\{{\mathbf{x}}_k\right\},k=1,\cdots, T $
– $ {\mathbf{y}}_{\Omega}=\left\{{\mathbf{y}}_{k,{\Omega}_k}\right\},k=1,\cdots, T $ : observations on domains $ {\Omega}_k\subset \mathcal{D} $
– $ {N}_I $ : number of iterations
– $ \eta $ : gradient step
List of procedures:
– Train_ $ {\Psi}_{\Phi, \Gamma} $ : end-to-end learning procedure with:
* $ \Phi $ : NN-based representation of the dynamical system
* $ \mathrm{GradLSTM} $ : residual NN-based representation of $ {\nabla}_{\mathbf{x}}\mathcal{J}\left(\mathbf{x}\right) $
* $ \Gamma $ : iterative gradient-based update operator
$ \mathbf{x}=\Phi \left(\mathbf{y}\right) $
while $ i<{N}_I $ do
$ \left|\begin{array}{l}{\mathbf{x}}^{\left(i+1\right)}\leftarrow {\mathbf{x}}^{(i)}-\eta \times \mathrm{GradLSTM}\left({\mathbf{x}}^{(i)}\right)\\ {}{N}_I\nearrow; \eta \searrow; i\leftarrow i+1\end{array}\right. $
end
Train $ {\Psi}_{\Phi, \Gamma} $
Result: $ {\mathbf{x}}^{\star}\leftarrow {\Psi}_{\Phi, \Gamma}\left({\mathbf{x}}^{(0)},\mathbf{y},\Omega \right) $
Whereas classic gradient-based iterative methods may suffer from a relatively low convergence rate, which dramatically worsens when dealing with high-dimensional states, the proposed trainable solver $ \Gamma $ considerably speed up the convergence and reach good interpolation performance with only 10–100 gradient steps, depending on the complexity of underlying dynamical process. Regarding the convolutional architecture of both trainable prior and solver, it also guarantees their scalability and linear complexity w.r.t. the size of the spatial domain as well as the number of observations. Such a framework also allows to explore different configurations for the inner variational cost $ {\mathcal{J}}_{\Phi} $ and the outer training loss $ \mathcal{L} $ .
2.2. Toward a stochastic formulation
Because 4DVarNet draws from a DA state space formalism, it is easy to consider this DL model under a stochastic formulation. Indeed, most operational DA methods are based on Gaussian assumptions and simply estimate the two first moments of the posterior distribution, mean $ {\mathbf{x}}^{\star } $ and covariance $ {\mathbf{P}}^{\star } $ . We suppose that for a given time $ t $ and related DA window $ t-L:t+L $ of length $ 2L+1 $ , we know how to simulate $ N $ nonconditional simulations $ {\mathbf{x}}_{t-L\mapsto t+L}^{s,i} $ , $ i=1,\cdots, N $ whose compatibility of their statistical properties, typically the mean and covariance for a Gaussian process (GP), with the true state $ {\mathbf{x}}_{t-L\mapsto t+L} $ is verified. By denoting $ \mathbf{u}\in \mathcal{D} $ a spatiotemporal point in the state vector $ \mathbf{x}=\left[x\left(\mathbf{u}\right)\right]\in {\unicode{x211D}}^m $ , we have the following set of (nonstationary) point-wise properties:
In what follows, we simplify the notations and $ {\mathbf{x}}_{t-L\mapsto t+L} $ writes $ \mathbf{x} $ and $ C\left[\mathbf{x}\left(\mathbf{u}\right),\mathbf{x}\left({\mathbf{u}}^{\prime}\right)\right]=C\left(\mathbf{u},{\mathbf{u}}^{\prime}\right) $ . Then, the previous set of properties translates into matrix notation with state-space DA formalism as $ \mathbf{x}\sim \mathcal{N}\left(\mathbf{m},\mathbf{P}\right) $ with $ \mathbf{P}=\left[C\left(\mathbf{u},{\mathbf{u}}^{\prime}\right)\right] $ which is nothing else than the prior distribution involved in the Bayesian estimation making the link between all DA approaches.
By “nonconditional,” we mean that these simulations do not comply with observations $ {\mathbf{y}}_{\mathbf{t}-\mathbf{L}\mapsto \mathbf{t}+\mathbf{L}} $ . To make these simulations compliant with the observations, we draw from traditional geostatistics and kriging-based conditioning (Wackernagel, Reference Wackernagel2003) and replace the prior model and kriging solver, also known as Optimal Interpolation (OI) or BLUE (Best Linear Unbiased Estimator) in the DA community, see, for example, Asch et al. (Reference Asch, Bocquet and Nodet2016), by our 4DVarNet estimation. A conditional simulation $ {\mathbf{x}}^{\star, i} $ writes:
where $ {\mathbf{x}}^{\star } $ is the 4DVarNet reconstruction, $ {\mathbf{x}}^{s,i} $ is a nonconditional simulation of the underlying process with similar statistical properties than $ \mathbf{x} $ and $ {\mathbf{x}}^{\star, s,i} $ is the 4DVarNet reconstruction of $ {\mathbf{x}}^{s,i} $ using as sparse observations a subsampling of the simulation with same mask $ \Omega $ than true observations.
In absence of observations errors, and because it is backboned on variational DA, we proved, see Beauchamp et al. (Reference Beauchamp, Thompson, Georgenthum, Febvre and Fablet2022b), that 4DVarNet applied to the GP Optimal Interpolation with:
-
• inner variational cost being the OI variational cost, and
-
• outer 4DVarNet training loss function being the MSE loss w.r.t the true state when the latter is available in a supervised setting, or the same OI variational cost for unsupervised configurations,
complies in the asymptotic limit with optimal kriging state $ {\mathbf{x}}^{\star}\to {\mathbf{x}}^{\mathrm{OI}} $ and its related properties: nonbias and optimal variance.
In this GP formalism, it implies that conditioning a simulation by the observations based on equation (4) have the following set of point-wise properties, drawn from conditional Gaussian distributions theory, see, for example, Chilès and Delfiner (Reference Chilès and Delfiner2012):
where $ \alpha, \hskip0.2em \beta =1,\cdots, n $ denotes the observation index and $ {\lambda}_{\alpha } $ are the optimal kriging weights in the estimation $ {\mathbf{x}}^{\mathrm{OI}}\left(\mathbf{u}\right) $ . Making the link with state-space DA formalism, equation (5a) becomes:
where we recognize the posterior covariance formulation with $ \mathbf{K}={\left[{\lambda}_{\alpha}\left(\mathbf{u}\right)\right]}_{\begin{array}{c}\alpha =1,\cdots, n\\ {}\mathbf{u}\in \mathcal{D}\end{array}}\in {\mathbf{R}}^{n\times m} $ and $ \mathbf{H} $ the observation operator mapping the state to the observations.
Then, the covariance between two conditionally simulated locations $ \mathbf{u} $ and $ {\mathbf{u}}^{\prime } $ is less than the prior covariance which makes sense because conditioning a process with observations shall constrain the uncertainty.
The latter set of properties is verified when a GP approximation of state $ \mathbf{x} $ is valid. When the underlying covariance model is not known and/or a GP approximation with linear dynamics does not apply, it was shown that 4DVarNet estimation (Fablet et al., Reference Fablet, Beauchamp, Drumetz and Rousseau2021; Beauchamp et al., Reference Beauchamp, Febvre, Georgenthum and Fablet2022a) enables to improve the OI interpolation, that is, it is still unbiased but with lower MSE w.r.t the ground truth, that is, lower variance of its error $ \operatorname{var}\left[{\mathbf{x}}^{\star}\left(\mathbf{u}\right)\right]<\operatorname{var}\left[{\mathbf{x}}^{\mathrm{OI}}\left(\mathbf{u}\right)\right] $ . Consequently, running an ensemble of $ N $ simulations conditioned by 4DVarNet will provide an approximation of the probability distribution function $ {p}_{\mathbf{x}\mid \mathbf{y}} $ of state $ \mathbf{x}\mid \mathbf{y} $ with both improvements on the two first moments $ {\mathbf{x}}^{\star } $ and $ {\mathbf{P}}^{\star } $ compared to simulations conditioned by kriging. The ensemble mean $ \overline{{\mathbf{x}}^{\star, i}} $ will be the 4DVarNet reconstruction in the limits of $ N\to +\infty $ :
We stress that such a formulation is consistent with traditional DA: indeed when looking at the spatial sequential scheme of the Ensemble Kalman Filter (Evensen, Reference Evensen2009), the forecast step can be seen as nonconditional simulations at time $ t-1 $ generating $ N $ realizations for time $ t $ . The average forecast and the covariance matrix are then computed on this set of realizations. and the analysis step corresponds to the conditioning of these realizations. Here, our approach follows the same idea but we do not address the problem sequentially and produce spatio-temporal nonconditional simulations over the assimilation window, before conditioning them with the observations. Such a formulation is more suitable to produce ensemble of 4DVarNet realizations. In the following, we refer to our approach as En4DVarNet. We believe this formulation is the more appropriate amongst the set of hybrid methods which aims to incorporate the statistical flow-dependent strength of EnKF into 4DVar. Indeed, when comparing this approach with En4DVar (Zhang et al., Reference Zhang, Zhang and Hansen2009), our method share some ideas for both prior and solvers:
-
• the minimization procedure for both En4DVar and En4DVarNet are still backboned on their original 4DVar and 4DVarNet optimization scheme, respectively based on the adjoint model and the neural prior encoder $ \Phi $ . It is not always the case for other hybrid methods, see, for example, the class of 4DEnVar methods (Qiu et al., Reference Qiu, Shao, Xu and Wei2007) which avoid the adjoint model by approximating the linearized tangent operator with the simulated observation perturbations;
-
• En4DVar retrieves the statistical ensemble background covariance from a parallel EnKF implementation, which allows for flow-dependency of the latter. In En4DVarNet, because we sampled spatiotemporal simulation with similar properties than the true state along the DA window, their associated prior mean and covariance matrix may differ from time to time. We precise that our approach, though combining ensemble-based formulation with OI-based conditioning should not be confused with the so-called EnOI approach in DA, see, for example, Counillon and Bertino (Reference Counillon and Bertino2009) and Asch et al. (Reference Asch, Bocquet and Nodet2016), which is just a EnKF in which the flow-dependent error covariance matrix is replaced by a stationary matrix calculated from an historical ensemble.
Also, as mentioned above, the generic presentation of En4DVarNet supposes to be able to sample in the prior distribution. In practice, this is not an easy task when no model is available, see Section 2.3 for a practical solution. When not possible, future works may consider the DL problem of joint learning for both the reconstruction and prior emulation based on neural or differentiable formulations of the latter, see, for example, differentiable physical quasi-geostrophic model (Frezat et al., Reference Frezat, Le Sommer, Fablet, Balarac and Lguensat2022) or SPDE-based linear approximations (Beauchamp et al., Reference Beauchamp, Thompson, Georgenthum, Febvre and Fablet2022b). Such additional developments would enable to incorporate the ensemble statistics into the 4DVarNet outer training loss, in order to comply with maximum likelihood criteria w.r.t to the Ground Truth (supervised setting) or the observations (unsupervised setting).
In the end, such a name for the proposed approach leaves space for additional flavors of ensemble-based 4DVarNets in the future, as it is done with a combination of ensemble and variational approaches in DA.
2.3. Practical aspects
The key aspect of our approach lies in the capability of providing a prior model from which we are able to sample realizations of the underlying process. In the current 4DVarNet formulation, the operator $ \Phi $ plays the role of autoencoding the state and is learnt jointly with the solver to satisfy at most the loss function used in the 4DVarNet training process. Such an operator should not be seen as an emulator able to generate simulations of a geophysical variable, like sea surface height (SSH), as used in Section 3. For now, because the main 4DVarNet applications were developed in a supervised setting, we propose to sample the realizations in the catalogue of the model-based experiments used during training to estimate the pdf of the 4DVarNet reconstructions applied on real-world datasets. Such a sampling will require to retrieve the analogs (Lguensat et al., Reference Lguensat, Huynh Viet, Sun, Chen, Fenglin, Chapron and Fablet2017) of the state for a given DA window, which is done by comparing the low-resolution properties of the available observations to the model-based catalogue. This is why the En4DVarNet explored in this paper shall be seen as a post-processing as the traditional 4DVarNet method.
Algorithm 2. Ensemble-based 4DVarNet (En4DVarNet) algorithm
Data:
$ \mathbf{y}{\left(\Omega \right)}_{t-L\mapsto t+L} $
$ \Omega $ , subdomain of space $ \mathcal{D} $ with observations
$ {\mathbf{x}}_T $ , model-based 4DVarNet training dataset, also use as catalogue for post-processing
List of procedures:
$ {\Psi}_{\Phi, \Gamma}\left(\mathbf{y}\left(\Omega \right)\right) $ : 4DVarNet reconstruction
$ \Phi $ : CNN-based representation of the dynamical system
$ \Gamma $ : iterative gradient-based update operator with LSTM-based representation of $ {\nabla}_{\mathbf{x}}\mathcal{J}\left(\mathbf{x}\right) $
knn $ \left(\mathbf{y}\left(\Omega \right),{\mathbf{x}}_T,k\right) $ : Get the kth nearest neighbor (analog) of $ \mathbf{y} $ in the training dataset
Init:
$ {\mathbf{x}}^{\star, (0)}=\left\{\begin{array}{l}\mathbf{y}\hskip0.5em \mathrm{over}\hskip0.5em \Omega \\ {}0\hskip0.5em \mathrm{elsewhere}\hskip0.5em \end{array}\right. $
for $ i\in 1,\dots, N $ do
$ \left|\begin{array}{l}\hskip0.24em {\mathbf{x}}^{\star }={\Psi}_{\Phi, \Gamma}\left(\mathbf{y}\left(\Omega \right)\right)\\ {}\hskip0.24em {\mathbf{x}}^{s,i}=\mathbf{knn}\left(\mathbf{y}\left(\Omega \right),{\mathbf{x}}_T,i\right)\\ {}\hskip0.24em {\mathbf{x}}^{\star, s,i}={\Psi}_{\Phi, \Gamma}\left({\mathbf{x}}^{s,i}\left(\Omega \right)\right)\end{array}\right. $
end
Result: $ {\mathbf{x}}^{\star, i}={\mathbf{x}}^{\star }+\left({\mathbf{x}}^{s,i}-{\mathbf{x}}^{\star, s,i}\right) $
Algorithm 2 details the entire procedure to derive the ensemble of 4DVarNet reconstructions based on the analog sampling of independent realizations in the training dataset.
3. Real-World SSH Application
To evaluate our approach, we use a real-world nadir altimetric dataset in January 2017. The SSH observations include a constellation of seven nadir altimeters: SARAL/Altika, Jason 2, Jason 3, Sentinel 3A, Haiyang-2A, and Cryosat-2. Note that the Cryosat-2 altimeter dataset is not included in the mapping reconstruction, it is kept to perform an independent validation of both reconstruction and UQ.
In the previous 4DVarNet SSH-related publications, we use the NATL60 OSSE (Observation System Simulation experiment) experiment ran with the state-of-the-art NEMO model (Nucleus for European Modeling of the Ocean) at high resolution (1/60°) over the whole North Atlantic (NATL) basin to train 4DVarNet (Beauchamp et al., Reference Beauchamp, Febvre, Georgenthum and Fablet2022a). We used a supervised configuration where the target variable is defined as the anomaly between the Ground Truth and DUACS (Data Unification and Altimeter Combination System) OI (Taburet et al., Reference Taburet, Sanchez-Roman, Ballarotta, Pujol, Legeais, Fournier, Faugere and Dibarboure2019). Recent applications of 4DVarNet on the raw SSH, without providing the DUACS OI baseline as a coarse version of the reconstruction, showed that this flavor improves again the performance when using as length for the DA window $ \sim 15 $ days. For our experiment, we use this 4DVarNet pretrained model to reconstruct SSH over the same 10° × 10° GulfStream domain used during the OSSE-based training.
3.1. Results
In the following experiment, we apply our ensemble-based 4DVarNet approach with 60 members on the 6 nadir constellation on January 2017. Figure 2 displays an example of the 4DVarNet reconstruction on January 10, 2017 as a mean state and we extract the time series of three spatial positions P1, P2, and P3, marked by black circles. P1 is located in a region less energetic than the rest of the GulfStream domain, P2 lies along the main meander and P3 is away from the latter, but it stays close to energetic eddies that may appear from time to time. The corresponding time series of ensemble mean and ensemble spread are provided on the right-hand side of Figure 2: we can see how depending on the position, the variability among the members may differ. Interestingly, a periodic pattern seems to reproduce for all the positions selected which indicates the strong weight of the observation nadir sampling on the 4DVarNet uncertainty.
As an illustration of the En4DVarNet variability, Figure 3 displays the SSH gradient field of the 4DVarNet reconstruction (January 4, 2021): we focus on a small bottom-left 50 × 50 pixels subdomain to show how members may be able to reproduce different realistic patterns and small eddy structures while maintaining the members compliant with the observations along the DA window.
In the same vein, Figure 4a,b respectively displays the SSH gradient and corresponding spreads from December 31, 2016 to January 25, 2017 (every 5 days). We can see how the main GulfStream meander and smaller eddies evolve over time. It is clear in Figure 4b how the observations help in decaying the reconstruction uncertainty. Then, when moving away from the observations at the center of the assimilation window, the uncertainty grows quickly, not only based on the distance from the nadir altimeters but also influenced by the SSH spatiotemporal dynamics.
Figure 5 enables to understand the evolution of the 4DVarNet uncertainty over time. We provide the full standard deviation maps for $ t=0 $ (December 31, 2016) and only displays the highest uncertainties levels along z-axis for the next 30 dates over the test period. While the observation sampling evolves every day, it is clear that the ensemble approach identifies areas of the GulfStream domain with persistently high uncertainty levels. This is the case on some positions along the main meandrum where no observations are ever available. Let us note the specific case of the bottom-left region (see again Figure 3) where a small area exhibits a high and recurrent ensemble spread. Such behaviors are typical for smoothers like 4DVarNet which cannot retrieve smaller energy areas without any additional observations. On this point, the future SWOT mission will surely help to solve for such problems (Gaultier et al., Reference Gaultier, Ubelmann and Fu2015; Metref et al., Reference Metref, Cosme, Le Guillou, Le Sommer, Brankart and Verron2020; Beauchamp et al., Reference Beauchamp, Febvre, Georgenthum and Fablet2022a; Febvre et al., Reference Febvre, Fablet, Sommer and Ubelmann2022).
Because we are using an OSE experiment, we kept the Cryosat-2 alongtrack dataset for validation. Our computations take into account the observational noise of the Cryosat-2 nadir altimeter above the GulfStream domain (https://doi.org/10.48670/moi-00144) whose standard deviation varies between 0.01 and 0.02 m. Figure 6a,b displays the two alongtrack Cryosat-2 nadir crossings on January 10, 2021, with corresponding DUACS OI, 4DVarNet reconstruction and both En4DVarNet mean and interval of confidence. We also built a 1° × 1° map over the GulfStream domain to accumulate all the occurrences of Cryosat-2 appearing in a pixel. This allows for computing the empirical probability that observations fall within the estimated 4DVarNet confidence interval (between the 5 and 95 percentile of the ensemble members) during the test period. Figure 6c shows this probability map on January 2017, with the number of Cryosat-2 occurrences given on the z-axis. The UQ provided by En4DVarNet seems often consistent with Cryosat-2 observations and probability values are greater than 0.7. Still, smaller areas display lower values (about 0.5) which may indicate the NATL60 simulation does not fully retrieve the statistical properties of true SSH dynamics over the GulfStream domain and there is a need for a larger catalogue or additional tools to emulate a stochastic version of such a variable in our framework.
4. Conclusion
In this work, we propose an ensemble-based version of the end-to-end DL 4DVarNet framework to both reconstruct and quantify uncertainty of SSH based on sparse altimetric nadir observations. Our solution relies on the ability to select the analog situations (nearest neighbors) of the available observations from a model-based dataset. Doing so, the underlying assumption is that the samples drawn would share similar statistical properties than the true state. Then, a 4DVarNet realization compliant with the observations is obtained by adding to the classic 4DVarNet reconstruction a residual which is simply the difference between the analog and its 4DVarNet estimation using as observation a subsampling of the analog with similar sparsity mask than the actual observations. This solution and its application to real-world nadir SSH observations demonstrates that post-conditioning independent simulations is a promising way to create a stochastic formulation of 4DVarNet. Along this line, future works would consist in generalize the approach by getting free from the analog solution and being able to generate independent realizations. We also plan to explore other ways involving ensemble approaches. For instance, an alternate post-processing of 4DVarNet may be to add to the observations a random noise as it is done in stochastic EnKF formulations. An other solution would be to embed through a Gaussian or ensemble formulation a new uncertainty term to optimize during training based on likelihood loss functions.
Author contribution
Conceptualization: M.B.; Data curation: Q.F.; Methodology: M.B., R.F.; Resources: Q.F.; Supervision: R.F.; Visualization: M.B.; Writing—original draft: M.B. All authors approved the final submitted draft.
Competing interest
The authors declare no competing interests exist.
Data availability statement
The open-source 4DVarNet version of the code is available at https://doi.org/10.5281/zenodo.7186322. The datasets for training 4DVarNet are shared through the 2020 BOOST-SWOT ocean data challenge (ODC) available on GitHub (https://github.com/ocean-data-challenges/2020a_SSH_mapping_NATL60). The datasets for application on real-world nadir are also available on GitHub and shared by the 2021 BOOST-SWOT ODC (https://github.com/ocean-data-challenges/2021a_SSH_mapping_OSE).
Funding statement
This work was supported by LEFE program (LEFE MANU project IA-OAC), CNES (grant OSTST DUACS-HR), and ANR Projects Melody and OceaniX. It benefited from HPC and GPU resources from Azure (Microsoft Azure grant) and from GENCI-IDRIS (Grant 2020-101030).
Provenance statement
This article is part of the Climate Informatics 2023 proceedings and was accepted in Environmental Data Science on the basis of the Climate Informatics peer review process.