
Laser wakefield accelerator modelling with variational neural networks

Published online by Cambridge University Press:  06 January 2023

M. J. V. Streeter*
Affiliation:
School of Mathematics and Physics, Queen’s University Belfast, Belfast, UK
C. Colgan
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
C. C. Cobo
Affiliation:
York Plasma Institute, School of Physics, Engineering and Technology, University of York, York, UK
C. Arran
Affiliation:
York Plasma Institute, School of Physics, Engineering and Technology, University of York, York, UK
E. E. Los
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
R. Watt
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
N. Bourgeois
Affiliation:
Central Laser Facility, STFC Rutherford Appleton Laboratory, Didcot, UK
L. Calvin
Affiliation:
School of Mathematics and Physics, Queen’s University Belfast, Belfast, UK
J. Carderelli
Affiliation:
Gérard Mourou Center for Ultrafast Optical Science, University of Michigan, Ann Arbor, USA
N. Cavanagh
Affiliation:
School of Mathematics and Physics, Queen’s University Belfast, Belfast, UK
S. J. D. Dann
Affiliation:
Central Laser Facility, STFC Rutherford Appleton Laboratory, Didcot, UK
R. Fitzgarrald
Affiliation:
Gérard Mourou Center for Ultrafast Optical Science, University of Michigan, Ann Arbor, USA
E. Gerstmayr
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
A. S. Joglekar
Affiliation:
Gérard Mourou Center for Ultrafast Optical Science, University of Michigan, Ann Arbor, USA; Ergodic LLC, San Francisco, USA
B. Kettle
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
P. McKenna
Affiliation:
Department of Physics, SUPA, University of Strathclyde, Glasgow, UK
C. D. Murphy
Affiliation:
York Plasma Institute, School of Physics, Engineering and Technology, University of York, York, UK
Z. Najmudin
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
P. Parsons
Affiliation:
Central Laser Facility, STFC Rutherford Appleton Laboratory, Didcot, UK
Q. Qian
Affiliation:
Gérard Mourou Center for Ultrafast Optical Science, University of Michigan, Ann Arbor, USA
P. P. Rajeev
Affiliation:
Central Laser Facility, STFC Rutherford Appleton Laboratory, Didcot, UK
C. P. Ridgers
Affiliation:
York Plasma Institute, School of Physics, Engineering and Technology, University of York, York, UK
D. R. Symes
Affiliation:
Central Laser Facility, STFC Rutherford Appleton Laboratory, Didcot, UK
A. G. R. Thomas
Affiliation:
Gérard Mourou Center for Ultrafast Optical Science, University of Michigan, Ann Arbor, USA
G. Sarri
Affiliation:
School of Mathematics and Physics, Queen’s University Belfast, Belfast, UK
S. P. D. Mangles
Affiliation:
The John Adams Institute for Accelerator Science, Imperial College London, London, UK
*Correspondence to: M. J. V. Streeter, School of Mathematics and Physics, Queen’s University Belfast, BT7 1NN Belfast, UK. Email: [email protected]

Abstract

A machine learning model was created to predict the electron spectrum generated by a GeV-class laser wakefield accelerator. The model was constructed from variational convolutional neural networks, which mapped the results of secondary laser and plasma diagnostics to the generated electron spectrum. An ensemble of trained networks was used to predict the electron spectrum and to provide an estimation of the uncertainty of that prediction. It is anticipated that this approach will be useful for inferring the electron spectrum prior to undergoing any process that can alter or destroy the beam. In addition, the model provides insight into the scaling of electron beam properties due to stochastic fluctuations in the laser energy and plasma electron density.

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press in association with Chinese Laser Press

1 Introduction

Laser wakefield accelerators (LWFAs) generate multi-GeV electron beams from cm-scale plasma channels using approximately 100 TW laser pulses[1–6]. The extreme acceleration gradients of LWFAs, coupled with their relative accessibility, have led to widespread research and pursuit of several applications, such as compact light sources[7–10], generation of bright $\gamma$-rays[11] and ultra-relativistic positron beams[12], and for future particle colliders[13]. Also, the combination of GeV electron beams and high-intensity laser pulses allows for the study of fundamental physics such as strong-field quantum electrodynamics[14–17].

In LWFAs, the non-linear laser pulse evolution[18, 19] and its effect on the injection and acceleration processes[20–23] are highly sensitive to the initial conditions and can lead to significant shot-to-shot variation of the electron beam properties[24, 25]. Recent work on high-stability laser systems and plasma sources has demonstrated improved stability, with the observation of few-percent variation in the electron beam energy and charge over 24 hours of continuous operation[26]. Long-term high-repetition rate operation has opened up the possibility of using machine learning techniques to model the sources of electron beam variation and to use closed-loop algorithms to optimise performance[26–31].

For applications such as the study of the radiation reaction, knowledge of the pre-interaction electron beam properties is required to make precise measurements of any changes of these properties and thereby infer the validity of theoretical models[32–34]. The destructive nature of the measurements necessitates predictable LWFA performance through one of the following: improving the stability; preserving part of the spectrum as a reference[33]; or developing models capable of predicting the electron beam properties for a given shot. In general, the ability to make predictions of the outputs from plasma accelerators will be advantageous to many of their applications.

Previous work in developing machine learning models for LWFAs has demonstrated the prediction of scalar metrics of the electron beam, such as total charge or peak energy[29–31, 35]. However, many applications will require the prediction of vector properties, such as the spectrum or the longitudinal phase space, for which neural networks provide a convenient framework. A densely connected neural network (DNN) is made of densely connected layers, in which the input to every node is the weighted sum of all of the outputs of the previous layer, with the individual weights as free parameters of the model. A non-linear activation function (e.g., a sigmoid function) then takes the weighted sum plus a bias value (another free model parameter) as its argument and returns an output value. An alternative to densely connected layers is a convolutional layer, which performs convolutions between the input vector and a set of kernels. Networks using these layers, known as convolutional neural networks (CNNs), have been shown to be better suited to learning meaningful features from natural signals[36]. Further improvement to the predictive power of neural networks has been seen when including stochasticity in the outputs of individual nodes, in an architecture known as a variational neural network (VNN)[37].
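The two layer types described above can be illustrated with a minimal sketch in plain Python. This is not the network used in this work; the function names, weights and kernel values are hypothetical and chosen only to make the arithmetic easy to follow. As in most neural network libraries, the "convolution" is implemented as a sliding-window product without flipping the kernel.

```python
import math

def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases, activation=sigmoid):
    """Each node applies the activation to a weighted sum of ALL inputs plus a bias."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def conv1d_layer(signal, kernels):
    """Slide each kernel along the signal ('valid' mode, no padding, no kernel flip)."""
    outputs = []
    for kernel in kernels:
        k = len(kernel)
        outputs.append([sum(kernel[j] * signal[i + j] for j in range(k))
                        for i in range(len(signal) - k + 1)])
    return outputs

# Hypothetical 4-element input, two dense nodes and one difference kernel.
x = [0.0, 1.0, 2.0, 3.0]
dense_out = dense_layer(x,
                        weights=[[0.5, -0.5, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]],
                        biases=[0.0, 0.0])
conv_out = conv1d_layer(x, kernels=[[1.0, -1.0]])
```

The dense layer has one weight per input per node, whereas the convolutional layer reuses the same small kernel at every position, which is why it needs far fewer free parameters for long 1D signals.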

In conventional accelerators, Emma et al.[38] demonstrated training of a DNN to produce synthetic diagnostic outputs that matched the measured outputs for a new unseen dataset. CNNs have been used to predict X-ray properties from the post-undulator electron beam spectrum[39], while ensembles of DNNs have also been used to predict the electron beam longitudinal phase space and current profile from non-destructive bending radiation measurements[40]. In this work, we report on the training of an ensemble of VNNs to model the LWFA-generated electron spectrum using secondary diagnostics of the laser and plasma conditions. The LWFA ensemble was trained using a subset of experimental measurements of the electron spectrum with the remainder used for model validation. Each individual VNN in the ensemble was trained with a different subset of the training data, so that the ensemble provided both a mean prediction and an estimate of its uncertainty. The model also reveals the extent to which the measurements obtained from the available diagnostics are predictive of the accelerator performance, and which parameters have the strongest influence.

2 Experimental methods and results

The experiment was performed using the Gemini laser system at the Central Laser Facility in the UK (see Figure 1 for details). Laser pulses with an energy of ${E}_{\rm L} = \left(6.6\pm 0.5\right)$ J and a pulse duration of approximately 50 fs were used to drive a GeV-scale LWFA. The pulses were focused with an $f/40$ off-axis parabolic mirror to a spot size of $\left(50\pm 2\right)\times \left(45\pm 2\right)$ μm in the horizontal (polarisation) and vertical planes, respectively, giving a peak intensity of $\left(5.5\pm 0.5\right)\times {10}^{18}$ W cm$^{-2}$. The focus was aligned to a gas jet that was composed of a mixture of 2% nitrogen and 98% helium, enabling ionisation injection[41–44]. The gas jet had an average electron density of $\left(1.00\pm 0.07\right)\times {10}^{18}$ cm$^{-3}$ over a 17 mm length.

Figure 1 Illustration of the experimental setup (not to scale). The primary laser focus was aligned to the front edge of a supersonic gas jet emitted from a 15 mm diameter nozzle positioned 10 mm below the laser pulse propagation axis. The input laser energy was measured by integrating the signal on a near-field camera before the compressor, which was cross-calibrated with an energy meter and adjusted for the 60% compressor throughput. The scattered laser signal was observed from above by an optical camera, and the plasma channel electron density profile was measured using interferometry with a transverse short-pulse probe laser. The small ( $\lesssim 0.1\%$ ) transmission of the focusing laser pulse through a dielectric mirror was directed onto a CCD camera to obtain an on-shot far-field image. Electron beams from the LWFA were deflected by a magnetic dipole onto two Lanex screens (only the first is shown here), which were used to determine the electron spectrum in the range of $0.3<E<2.5$ GeV.

The LWFA-generated electron energy spectrum $\mathrm{d}W/\mathrm{d}E$ was measured using the spectrometer scintillator screen images, which were energy-calibrated by numerical tracking of electron trajectories in the magnetic field. The interferometry and top view cameras were used to extract the electron density profile, ${n}_{\rm e}(z)$, and the laser scattering profile, ${S}_{\rm L}(z)$, respectively, where $z$ is the laser propagation axis. A 2D Gaussian fit was performed on the far-field image to obtain six parameters: the peak fluence ${I}_0$; the centroids ${x}_0$ and ${y}_0$; the major and minor root-mean-square (RMS) spot widths ${\sigma}_a$ and ${\sigma}_b$; and the angle $\theta$ between the major axis of the ellipse and the $x$-axis. Due to the aberrations and clipping caused by this beam-line, this far-field image is not an exact replica of the main laser focus, but is representative of the shot-to-shot focal spot fluctuations.
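To make the six far-field parameters concrete, the sketch below estimates them from image moments. This is a simpler stand-in for the least-squares 2D Gaussian fit described above, not the fitting procedure used in this work, and it is only accurate for a clean, background-free spot. The function names and the synthetic test image are hypothetical.

```python
import math

def synthetic_spot(nx, ny, peak, x0, y0, sigma_a, sigma_b, theta):
    """Rotated 2D Gaussian sampled on an nx-by-ny pixel grid (test image only)."""
    c, s = math.cos(theta), math.sin(theta)
    image = []
    for y in range(ny):
        row = []
        for x in range(nx):
            u = (x - x0) * c + (y - y0) * s    # coordinate along the major axis
            v = -(x - x0) * s + (y - y0) * c   # coordinate along the minor axis
            row.append(peak * math.exp(-0.5 * ((u / sigma_a) ** 2 + (v / sigma_b) ** 2)))
        image.append(row)
    return image

def spot_parameters(image):
    """Moment-based estimate of I0, x0, y0, sigma_a, sigma_b and theta."""
    pixels = [(x, y, v) for y, row in enumerate(image) for x, v in enumerate(row)]
    total = sum(v for _, _, v in pixels)
    x0 = sum(v * x for x, _, v in pixels) / total
    y0 = sum(v * y for _, y, v in pixels) / total
    sxx = sum(v * (x - x0) ** 2 for x, _, v in pixels) / total
    syy = sum(v * (y - y0) ** 2 for _, y, v in pixels) / total
    sxy = sum(v * (x - x0) * (y - y0) for x, y, v in pixels) / total
    # Eigenvalues of the 2x2 covariance matrix give the squared principal widths.
    mean = 0.5 * (sxx + syy)
    half_gap = math.hypot(0.5 * (sxx - syy), sxy)
    return {
        "I0": max(v for _, _, v in pixels),
        "x0": x0,
        "y0": y0,
        "sigma_a": math.sqrt(mean + half_gap),
        "sigma_b": math.sqrt(max(mean - half_gap, 0.0)),
        "theta": 0.5 * math.atan2(2.0 * sxy, sxx - syy),  # major-axis angle to x
    }

spot = synthetic_spot(61, 61, 2.0, 30, 28, 6.0, 4.0, 0.4)
params = spot_parameters(spot)
```

A least-squares fit is generally preferable on real camera images, since moments are biased by background and by clipping at the image edge.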

The experimental results for this analysis were taken from an investigation of the radiation reaction, in which a second counter-propagating laser pulse was used to collide with the LWFA electron beam. For training and validating our predictive tool, we wish to use only shots where the laser pulse did not significantly overlap with the electron beam, so that the electron spectrum was not affected. For successful collisions, a gamma-beam was generated via the inverse Compton scattering interaction and was diagnosed spatially with a CsI scintillator array[16] imaged onto a $1024\times 1024$ pixel charge-coupled device (CCD).

Due to the shot-to-shot variation in the electron beam position, most shots did not result in a significant collision, providing a large number of null shots for model training and testing. The brightness of the signal on the gamma detector was used to provide an approximate metric of the collision intensity. The 99.99th percentile pixel value of the background-subtracted CCD image was taken as the peak of the gamma signal ${C}_{\gamma }$. The highest value of this metric was ${C}_{\gamma} = 4380$, whereas the median value was ${C}_{\gamma} = 12$. From analysis of the collision statistics, a value of ${C}_{\gamma}\le 100$ was estimated to result from collisions with a peak normalised vector potential of ${a}_0<1.4$. For 1 GeV electrons, this would result in a less than 1% energy loss[14], approximately equal to the resolution of the spectrometer. Therefore, this value was taken as a threshold for null shots, for which the electron beam is unaffected by the collision. The experimental data were taken during a 5-hour period with a total of 779 shots. Model training and validation datasets were taken from shots for which ${C}_{\gamma}\le 100$, with 90% (570 shots) used for training and 10% (75 shots) reserved for model validation.
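The null-shot selection described above can be sketched as follows. The nearest-rank percentile definition, the random shuffling before the split and all function names are assumptions made for illustration; the exact percentile convention and split procedure used in this work are not stated.

```python
import math
import random

def percentile(values, q):
    """Nearest-rank percentile (q between 0 and 100) of a flat list of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]

def gamma_peak(image, background):
    """C_gamma: 99.99th percentile pixel value of the background-subtracted image."""
    pixels = [p - b for row, brow in zip(image, background) for p, b in zip(row, brow)]
    return percentile(pixels, 99.99)

def split_null_shots(c_gamma_by_shot, threshold=100.0, train_fraction=0.9, seed=0):
    """Keep shots with C_gamma <= threshold and split them into training/validation."""
    null_shots = [shot for shot, c in c_gamma_by_shot.items() if c <= threshold]
    rng = random.Random(seed)
    rng.shuffle(null_shots)
    n_train = round(train_fraction * len(null_shots))
    return null_shots[:n_train], null_shots[n_train:]

# Hypothetical C_gamma values for 12 shots; shots 1 and 3 exceed the threshold.
c_gamma = {i: 12.0 for i in range(12)}
c_gamma[1] = 4380.0
c_gamma[3] = 101.0
c_gamma[2] = 100.0   # exactly at the threshold, so it is kept as a null shot
train_shots, validation_shots = split_null_shots(c_gamma)
```

Using a high percentile rather than the maximum pixel value makes the metric less sensitive to isolated hot pixels on the CCD.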

3 Neural network architecture and training

The measurements of ${n}_{\rm e}(z)$, ${S}_{\rm L}(z)$ and $\mathrm{d}W/\mathrm{d}E$ were stored as 1D vectors of lengths 310, 100 and 200, respectively. Although each of these signals is composed of at least 100 values, the variations over the full dataset are limited, and so in principle only a few parameters are required for each to encode these variations. An appropriate decoder would be able to generate a good approximation to the measured signals from this reduced set of parameters, which are called latent space variables. In this work, variational autoencoders (VAEs)[45, 46] incorporating convolutional and densely connected layers were trained, as illustrated in Figure 2. By using a bottleneck of only a few nodes, the VAEs were trained to find an optimal latent space representation of the data, which allowed the decoder to reconstruct the measured signals.

Figure 2 Variational autoencoder (VAE) architecture for determining the latent space representation of the diagnostics. The type and dimension of each layer are indicated in the labels. The inset plots show an example laser scattering signal ${S}_{\rm L}$ and the approximation returned by the VAE. The input (and output) size ${N}_{\rm i}$ is equal to the data binning of the results for each individual diagnostic. Max pooling was used at the output of each convolution layer, which combined neighbouring output pairs and returned only the maximum of each pair. The average signal, in this case $\left\langle {S}_{\rm L}\right\rangle$ , was passed as an additional latent space parameter for the encoder and was used to scale the output of the decoder. The autoencoder structure was the same for each diagnostic, except for the size of the latent space.

The trained encoders for ${n}_{\rm e}(z)$ and ${S}_{\rm L}(z)$ were used to encode their respective measurements to their latent space representations, which were then combined with measurements of the laser far-field and the laser energy to create the inputs for the predictive model. A VNN, which we call the translator network, takes those inputs and returns values that are passed to the trained electron spectra decoder to generate the predicted spectrum. The translator was trained to learn the correlation between the reduced input set and the latent variables of the electron spectra decoder, as illustrated in Figure 3.

Figure 3 Diagram of the translator network architecture. Shown in the inset is an example measurement from the experimental data (black), with the mean prediction of the LWFA model ensemble (red) and individual model predictions (pink).

For the variational layers, two parameters are calculated for each node that represent the expectation value ${\mu}_m$ and standard deviation ${\sigma}_m$ . During training, values were sampled from Gaussian distributions given by these parameters, $\mathcal{N}({\mu}_m,{\sigma}_m)$ , such that the latent values for a given input set, ${x}_m$ , would vary according to ${\sigma}_m$ .
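A minimal sketch of this sampling step is given below, assuming the commonly used form in which a standard normal sample is scaled by ${\sigma}_m$ and shifted by ${\mu}_m$. The function name and the example values are hypothetical.

```python
import random
import statistics

def sample_variational_layer(mu, sigma, rng, training=True):
    """Return one latent vector: stochastic during training, mean values otherwise."""
    if not training:
        return list(mu)
    # Draw each latent value from N(mu_m, sigma_m) via a standard normal sample.
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

rng = random.Random(0)
mu, sigma = [1.0, -2.0], [0.5, 0.1]
samples = [sample_variational_layer(mu, sigma, rng) for _ in range(20000)]
first_node = [s[0] for s in samples]
validation_values = sample_variational_layer(mu, sigma, rng, training=False)
```

With `training=False` the layer is deterministic, so the same inputs always give the same latent values.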

Table 1 Summary of autoencoder parameters used for each diagnostic and for the translator model.

a For the LWFA translator models, the value of $\beta$ varied from high to low during the training, with the final value given in the table. The training time for each autoencoder was 10 minutes and training of the 100 translator networks took a total of 3 hours, using an Intel Xeon Gold 6130 CPU at 2.1 GHz with 32 GB of RAM. The analysis and model training were performed on CLF Data Analysis as a Service (CDAS)[49]. The neural networks were built using the Keras API (https://keras.io).

The training loss function used was as follows[45]:

$$\begin{aligned}\mathcal{L}_{\rm T} &= \mathcal{L}_{\mathrm{MSE}}-\beta {D}_{\mathrm{KL}},\\ \mathcal{L}_{\rm T} &= \frac{1}{N}\sum_{n = 0}^{N}{\left[W\left({E}_n\right)-{W}_{\rm R}\left({E}_n\right)\right]}^2-\beta {D}_{\mathrm{KL}},\end{aligned}\tag{1}$$

where ${D}_{\mathrm{KL}} = {\sum}_{m = 0}^M\left[1+\log \left({\sigma}_m\right)-{\mu}_m^2-{\sigma}_m\right]/(2M)$ is the Kullback–Leibler (KL) divergence, $\mathcal{L}_{\mathrm{MSE}}$ is the mean squared error (MSE) and $M$ is the total number of input sets in a given training iteration. The same loss function was used to train each VAE and also the final translator VNN, with the MSE taken between the predicted and measured diagnostic output (${n}_{\rm e}(z)$, ${S}_{\rm L}(z)$ or $\mathrm{d}W/\mathrm{d}E$). The $\beta$ parameter was used to scale the relative importance of the regularisation, following the beta-VAE approach[45]. During model validation, only the mean weights for the variational layers were used and the ${D}_{\rm KL}$ term from Equation (1) was omitted. Every node of the neural networks used the leaky rectified linear unit (leaky-ReLU)[47] activation function with $\alpha = 0.3$, which exhibited superior learning performance in comparison to sigmoid and hyperbolic tangent functions, as well as leaky-ReLU with other values of $\alpha$.
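As a worked illustration, the sketch below evaluates the loss terms exactly as they are written in Equation (1) and in the definition of ${D}_{\mathrm{KL}}$ above, together with the leaky-ReLU activation. The function names are hypothetical, and the sums are taken over the supplied lists, which is an assumption about the index ranges.

```python
import math

def leaky_relu(x, alpha=0.3):
    """Leaky rectified linear unit with negative-side slope alpha."""
    return x if x > 0 else alpha * x

def d_kl(mu, sigma):
    """D_KL term as written in the text: sum[1 + log(sigma) - mu^2 - sigma] / (2M)."""
    M = len(mu)
    return sum(1.0 + math.log(s) - m ** 2 - s for m, s in zip(mu, sigma)) / (2.0 * M)

def training_loss(predicted, measured, mu, sigma, beta):
    """L_T = L_MSE - beta * D_KL, following Equation (1)."""
    mse = sum((p - r) ** 2 for p, r in zip(predicted, measured)) / len(predicted)
    return mse - beta * d_kl(mu, sigma)

# With mu = 0 and sigma = 1 the regularisation term vanishes and L_T = L_MSE.
loss_standard_normal = training_loss([1.0, 2.0], [1.0, 4.0], mu=[0.0], sigma=[1.0], beta=0.5)
# Moving mu away from zero makes D_KL negative, which increases the loss.
loss_shifted = training_loss([1.0, 2.0], [1.0, 4.0], mu=[1.0], sigma=[1.0], beta=0.5)
```

The ${D}_{\mathrm{KL}}$ expression as written is never positive and reaches zero only for ${\mu}_m = 0$ and ${\sigma}_m = 1$, so subtracting $\beta {D}_{\mathrm{KL}}$ penalises latent distributions that depart from the standard normal.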

For the diagnostic VAEs, the number of latent parameters was chosen to be the minimum that gave high-fidelity reconstructions, with the $\beta$ parameter manually tuned to ensure that the distribution of each latent parameter for the training datasets was close to a standard normal distribution ( $\mathcal{N}\left(0,1\right)$ ). One latent space parameter was directly set as the average of the input signal (normalised by the training dataset). This parameter was then used to scale the decoder output and ensured that one of the latent space variables represents the amplitude of the signal, aiding interpretation of the trained networks. Once the VAEs were trained, the weights were frozen during the translator training process.

The translator is a DNN with a variational last layer. The translator VNN architecture (number of nodes and number of layers) and the value of $\beta$ were optimised using a genetic algorithm. During this process the training data were divided in two parts, with 50% of the data used to train each network and the other 50% used to calculate the test loss. This ensured that the validation dataset was kept purely for validation of the final model performance and not used in any tuning of the predictive model. The optimal architecture for the translator network, shown in Figure 3, comprises three densely connected layers, with a final variational layer with five outputs.

In order to quantify the uncertainty in the model predictions, 100 translator VNNs were trained, each using a randomly selected 50% sample of the training dataset. The predictions of these models can then be averaged to obtain a mean prediction, while the variation between model predictions is indicative of the uncertainty arising from the stochastic training process and the finite size of the training data. In particular, the random sub-sampling affects the predictive quality in regions where the training data are sparse, typically at the extremes of the input parameters, resulting in a larger uncertainty in those regions.
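The sub-sampled ensemble idea can be demonstrated with a toy model. In the sketch below, a straight-line least-squares fit stands in for each translator VNN, and the synthetic data, function names and random seeds are all hypothetical; only the 100 models and the 50% sub-sampling follow the text.

```python
import random
import statistics

def fit_line(xs, ys):
    """Ordinary least-squares straight line; a stand-in for training one network."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def train_ensemble(xs, ys, n_models=100, fraction=0.5, seed=0):
    """Train each model on a different random sub-sample of the training data."""
    rng = random.Random(seed)
    n_sub = int(fraction * len(xs))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(len(xs)), n_sub)
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def ensemble_predict(models, x):
    """Mean prediction and spread (standard deviation) across the ensemble."""
    preds = [a * x + b for a, b in models]
    return statistics.fmean(preds), statistics.stdev(preds)

# Synthetic training data clustered around x = 0 with a known linear trend.
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.3) for x in xs]
models = train_ensemble(xs, ys)
mean_near, std_near = ensemble_predict(models, 0.0)   # well-sampled region
mean_far, std_far = ensemble_predict(models, 5.0)     # sparse region
```

In this toy case the ensemble spread is larger at $x = 5$, where there are almost no training points, than at $x = 0$, mirroring the behaviour described above for the extremes of the input parameters.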

The parameters for the trained VAEs and translator networks are summarised in Table 1. Each autoencoder was trained for 1000 iterations with a batch size of 64. The translator network was trained in three stages with 200, 400 and 300 iterations performed at 10, 4 and 1 times the final $\beta$ value to balance reconstruction fidelity with latent space smoothness[46]. The training processes were all performed using the Adam optimiser[48], with a learning rate of ${10}^{-3}$, which was found to converge well.
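The three-stage $\beta$ schedule for the translator network can be written down explicitly. In this sketch the stage lengths and multipliers are those quoted above, while the function name and the value of `beta_final` are hypothetical placeholders (the actual final value is given in Table 1).

```python
def beta_schedule(beta_final, stages=((200, 10.0), (400, 4.0), (300, 1.0))):
    """Return the beta value used at each training iteration, stage by stage."""
    schedule = []
    for n_iterations, multiplier in stages:
        schedule.extend([multiplier * beta_final] * n_iterations)
    return schedule

beta_final = 1e-3   # placeholder value for illustration only
betas = beta_schedule(beta_final)
```

Starting with a large $\beta$ emphasises the regularisation early in training, and the later stages progressively shift the weight towards the reconstruction error.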

4 LWFA prediction results

The measured electron spectra from the validation dataset are shown in Figure 4(a), along with the reconstructions by the electron spectra VAE (Figure 4(b)) and the average of the LWFA model ensemble predictions (Figure 4(c)). The electron spectra VAE had an MSE of 0.011, and shows good qualitative and quantitative reproduction of the measured electron spectra. This indicates that the five parameters of the latent space, in combination with the structures learnt by the decoder, are sufficient to accurately generate the set of observations from the validation dataset. In other words, the five latent parameters are sufficient to generate the full variability of electron beams for this experimental setup. The question is then whether the secondary diagnostics are sufficient to determine the correct latent variables for each shot and thereby give an accurate prediction of the electron spectrum. The mean prediction of the LWFA model ensemble had an MSE of 0.057 and shows a similar trend in cut-off energy to the data, except for the few high- and low-energy outliers. By comparison, a naive prediction that all measured spectra are equal to the average spectrum from the training dataset gives an MSE value of 0.11, indicating that the LWFA model has a significant predictive capability.
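The comparison with the naive mean-spectrum baseline can be expressed as a fractional reduction in MSE. The sketch below shows how such a baseline could be computed and applies the MSE values quoted above; the function names and the two toy spectra are hypothetical.

```python
def mse(predicted, measured):
    """Mean squared error between two spectra sampled on the same energy bins."""
    return sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(measured)

def mean_spectrum(spectra):
    """Bin-by-bin average over a set of spectra (the naive baseline prediction)."""
    return [sum(values) / len(spectra) for values in zip(*spectra)]

def mse_reduction(model_mse, baseline_mse):
    """Fraction by which the model lowers the MSE relative to the baseline."""
    return 1.0 - model_mse / baseline_mse

baseline = mean_spectrum([[0.0, 2.0, 4.0], [2.0, 2.0, 0.0]])   # toy training spectra
baseline_error = mse(baseline, [1.0, 2.0, 5.0])                # toy validation shot

# Values quoted in the text: ensemble MSE 0.057, naive baseline MSE 0.11.
reduction = mse_reduction(0.057, 0.11)
```

With the quoted values, the ensemble prediction lowers the MSE by roughly 48% relative to the naive baseline, while the VAE reconstruction (MSE of 0.011) indicates how much further the error could fall if the latent variables were determined perfectly.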

Figure 4 (a) Measured electron spectra and reproduced electron spectra using (b) the trained variational autoencoder and (c) the mean prediction of the ensemble of the LWFA models. The individual shots are sorted by cut-off energy, determined as the highest energy for which the spectra exceed a threshold value.

Individual predictions of each model of the LWFA ensemble, along with the corresponding measured electron spectra, are shown in Figure 5. The variation in model predictions for a given shot is indicative of the uncertainty, due to the random sub-sampling of the training data and the stochastic training process. For a large region of the parameter space, the LWFA model predictions show a good agreement with the measurements, with large discrepancies occurring for the outliers in terms of cut-off energy. These shots also exhibit the largest variation in predictions between individual models within the ensemble. The total electron beam energy is reasonably accurately predicted, with relative RMS error of 12% for the entire validation dataset, compared to the relative beam energy RMS variation of 30%.

Figure 5 Individual shots selected at equally spaced intervals of the sorted shot index from Figure 4. The measured spectra (black) are shown alongside the predictions of each LWFA model from the trained ensemble (red) and an individual spectrum measurement closest to the median of the training data (blue). The sorted shot index is shown in the top right of each panel.

The relative influence of each input parameter on the LWFA model can be seen by varying each one in turn and measuring the effect on the resultant spectrum, as shown in Figure 6. The plasma density parameters have a relatively modest effect on the electron spectrum, indicating that the shot-to-shot variation of the plasma density profile is not the dominant contributor to the electron spectrum variation. Variations of the laser energy and the scattering profile are more significant, having the greatest effect on the generated electron spectra. The spatio-temporal distribution of the laser pulse is only indirectly diagnosed from the far-field diagnostic and the effect on the scattering profile, and is known to have a large influence on the accelerated electrons[26, 28, 29]. Including additional laser diagnostics, such as measurement of the spatial phase profile[26, 30], should enable higher fidelity predictions.

Figure 6 Relative influence of the translator VNN input parameters on the predicted electron spectra. Each parameter is set to the mean value of the training dataset and then varied over $\pm 3$ standard deviations in 11 steps, with the variation in the spectrum quantified by the average RMS change to the spectrum. The nth latent space parameters for the scattering and density profile encoders are labelled ${S}_{\rm L}(n)$ and ${n}_{\rm e}(n)$ , respectively. Here, ${S}_{\rm L}(6)$ and ${n}_{\rm e}(5)$ are proportional to the average laser scattering signal and plasma electron density, respectively.
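The one-at-a-time scan described in the caption of Figure 6 can be sketched as follows. The interpretation of "average RMS change" as the RMS over energy bins of the difference from the mean-input spectrum, averaged over the 11 steps, is an assumption, and the toy model and function names are hypothetical.

```python
import math

def sensitivity_scan(model, means, stds, n_steps=11, n_sigma=3.0):
    """Vary each input over +/- n_sigma standard deviations about its mean, holding
    the others at their means, and return the average RMS change of the output."""
    baseline = model(means)
    influence = []
    for i, (mu, sd) in enumerate(zip(means, stds)):
        changes = []
        for k in range(n_steps):
            offset = n_sigma * sd * (2.0 * k / (n_steps - 1) - 1.0)
            inputs = list(means)
            inputs[i] = mu + offset
            spectrum = model(inputs)
            changes.append(math.sqrt(
                sum((s - b) ** 2 for s, b in zip(spectrum, baseline)) / len(baseline)))
        influence.append(sum(changes) / n_steps)
    return influence

def toy_model(inputs):
    """Hypothetical 3-bin 'spectrum' that depends strongly on a, weakly on b, not on c."""
    a, b, c = inputs
    return [2.0 * a, a + 0.5 * b, 0.5 * b]

influence = sensitivity_scan(toy_model, means=[0.0, 0.0, 0.0], stds=[1.0, 1.0, 1.0])
```

Because each input is varied independently, a scan of this kind ignores correlations between inputs, which is why the correlated variation with laser energy is treated separately.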

Although many of the input parameters are not straightforward to interpret physically, that is, those that are the latent space of the autoencoders, the laser energy is a physically important parameter in LWFAs. In practice, the inputs for the LWFA models are not independent of one another, as characterised by calculating the Pearson correlation coefficients for the training dataset. This reveals relatively strong correlations between the laser energy and several other parameters, especially ${S}_{\rm L}(3)$, ${S}_{\rm L}(4)$, ${S}_{\rm L}(6)$, ${n}_{\rm e}(4)$, ${n}_{\rm e}(5)$ and ${I}_0$, which had correlation coefficients ranging from $r = 0.31$ to $r = 0.55$. The trained LWFA model is then able to show what effect laser energy fluctuations have on the electron spectrum by varying each parameter in proportion to its correlation coefficient with the laser energy ${E}_{\rm L}$, as shown in Figure 7(a). As the laser energy increases, the peak electron energy is relatively constant, while the overall charge increases. The total electron beam charge ${Q}_{\rm B}$ is plotted as a function of laser energy in Figure 7(b), for both the raw data and the LWFA model predictions. The model prediction shows an approximately linear increase with laser energy with the equation ${Q}_{\rm B}\left[\mathrm{nC}\right] = 0.48{E}_{\rm L}\left[\mathrm{J}\right]-2.1$.
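A minimal sketch of the correlated variation is given below, assuming that all inputs are standardised (zero mean, unit variance) so that shifting the laser energy by $\delta$ standard deviations shifts input $i$ by ${r}_i\delta$. That assumption, the function names and the example correlation values are illustrative; only the linear charge relation is taken from the text.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def correlated_inputs(delta_sigma, correlations):
    """Shift each standardised input by r_i times the laser-energy shift."""
    return [r * delta_sigma for r in correlations]

def beam_charge_nC(laser_energy_J):
    """Linear fit to the model prediction quoted in the text."""
    return 0.48 * laser_energy_J - 2.1

# Laser energy itself has r = 1; the other two coefficients are illustrative.
shifted = correlated_inputs(2.0, [1.0, 0.5, 0.0])
charge_at_mean_energy = beam_charge_nC(6.6)
```

At the mean laser energy of 6.6 J, the quoted linear relation gives a beam charge of about 1.07 nC.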

Figure 7 The model-predicted effect of varying the laser energy on (a) the predicted electron spectra and (b) the total electron beam charge. The data for each shot in the training data (red) are shown in (b), overlaid with the values calculated from the predicted spectra of the LWFA model (black points) and a linear fit to them (black dashed line).
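
A linear fit of this kind can be reproduced in a few lines. The sketch below uses synthetic points scattered around the quoted line; the energy range, scatter and reference energy are assumptions chosen only for illustration and are not taken from the experiment.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic shots drawn around the fitted line quoted in the text,
# Q_B [nC] = 0.48 E_L [J] - 2.1. The energy range and scatter are assumed.
e_l = rng.uniform(5.0, 7.0, size=200)  # laser energy in J (assumed range)
q_b = 0.48 * e_l - 2.1 + rng.normal(scale=0.05, size=200)

# First-order polynomial fit returns (slope, intercept).
slope, intercept = np.polyfit(e_l, q_b, deg=1)

# Fractional charge change per fractional energy change at a chosen energy.
e_ref = 6.0  # assumed reference energy in J
q_ref = slope * e_ref + intercept
log_sensitivity = slope * e_ref / q_ref
```

Because the quoted intercept is negative, the fitted line reaches zero charge at $E_{\rm L} = 2.1/0.48 \approx 4.4$ J. Above that energy the fractional change in charge exceeds the fractional change in laser energy, which is what `log_sensitivity` evaluates.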

The scaling parameters ${S}_{\rm L}(6)$ and ${n}_{\rm e}(5)$ are also easy to interpret, as they are the average scattering signal and electron density, respectively (normalised to the mean and variance over the training dataset). The effect of ${n}_{\rm e}(5)$ on the electron density profile and the predicted electron spectrum is shown in Figure 8. The average plasma electron density varied by 4% over the training dataset, as illustrated by the small perturbations to the density profile observed in Figure 8(a). A more significant effect is seen on the electron spectra in Figure 8(b), with the peak energy shifting higher as the average density drops, as expected for a dephasing-limited LWFA[Reference Lu, Tzoufras, Joshi, Tsung, Mori, Vieira, Fonseca and Silva50, Reference Bloom, Streeter, Kneip, Bendoyro, Cheklov, Cole, Döpp, Hooker, Holloway, Jiang, Lopes, Nakamura, Norreys, Rajeev, Symes, Schreiber, Wood, Wing, Najmudin and Mangles51]. The effect on the spectrum is much smaller than that seen to be caused by the laser energy variation in Figure 7. This indicates that the level of natural variations of the plasma electron density in this dataset was sufficiently low that it was not a dominant contributor to the shot-to-shot variations in the electron spectra.

Figure 8 The effect of changing ${n}_{\rm e}(5)$ on (a) the electron density profile and (b) the predicted electron spectrum. All other latent space parameters are kept fixed at zero (i.e., their average values from the training dataset), while ${n}_{\rm e}(5)$ is varied over the range of $\pm 3$ standard deviations in the training dataset.
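
The direction of the energy shift can be checked against the commonly quoted dephasing-limited estimate for the nonlinear regime, as given by Lu et al. (cited above). The symbols $a_0$ (normalised laser vector potential), $n_{\rm c}$ (critical density), $\omega_0$ and $\omega_{\rm p}$ (laser and plasma frequencies) are not defined in this section and are introduced here as assumptions:

```latex
\Delta W \simeq \frac{2}{3}\, m_{\rm e} c^2 \left(\frac{\omega_0}{\omega_{\rm p}}\right)^2 a_0
         = \frac{2}{3}\, m_{\rm e} c^2\, \frac{n_{\rm c}}{n_{\rm e}}\, a_0
\quad\Rightarrow\quad
\frac{\delta(\Delta W)}{\Delta W} \simeq -\frac{\delta n_{\rm e}}{n_{\rm e}}
\quad (\text{fixed } a_0).
```

Under this scaling, and for small changes at fixed $a_0$, a 4% change in average density would shift the dephasing-limited energy by roughly 4% in the opposite direction. This matches the sign of the trend in Figure 8(b), although the estimate ignores any accompanying change in laser evolution or injection.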

The other latent parameters generated by the VAEs do not have straightforward physical interpretations and only have meaning in combination with the trained encoders. To gain some insight into their physical meaning, the effect of changing each parameter can be observed on the corresponding diagnostic output, as well as on the predicted electron spectrum. An example is shown in Figure 9 for ${S}_{\rm L}(3)$, the most influential input parameter to the translator VNN.

Figure 9 The effect of changing ${S}_{\rm L}(3)$ on (a) the laser scattering profile and (b) the predicted electron spectrum. All other latent space parameters are kept fixed at zero (i.e., their average values from the training dataset), while ${S}_{\rm L}(3)$ is varied over the range of $\pm 3$ standard deviations in the training dataset.
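
A latent traversal of this type can be sketched with a toy decoder. Everything below is hypothetical: the linear `decode` function, its basis profiles and the spatial grid stand in for the trained VAE decoder and the real scattering diagnostic.

```python
import numpy as np

z_axis = np.linspace(0.0, 10.0, 101)  # position along the plasma in mm (assumed grid)

# Hypothetical linear decoder: profile = mean_profile + sum_n latent[n] * basis[n].
mean_profile = np.exp(-((z_axis - 5.0) ** 2) / 8.0)
basis = np.stack([
    0.1 * np.sin(np.pi * z_axis / 10.0),
    0.1 * np.cos(np.pi * z_axis / 10.0),
])

def decode(latent):
    """Toy stand-in for a trained VAE decoder."""
    return mean_profile + latent @ basis

def traverse(decoder, n_latent, index, n_sigma=3.0, n_steps=7):
    """Decode profiles with one latent varied and all others held at zero."""
    values = np.linspace(-n_sigma, n_sigma, n_steps)
    latents = np.zeros((n_steps, n_latent))
    latents[:, index] = values
    return values, np.array([decoder(row) for row in latents])

values, profiles = traverse(decode, n_latent=2, index=1)
difference = profiles[-1] - profiles[0]  # where +3 sigma raises or lowers the signal
```

Plotting `difference` against `z_axis` shows where along the plasma the chosen latent parameter raises or lowers the decoded signal, which is the kind of reading applied to Figure 9(a).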

Figure 9(a) shows that positive ${S}_{\rm L}(3)$ correlates with an increased laser scattering peak at the entrance to the gas jet ($z = 0$) and over the last half of the plasma, while suppressing the signal for $1<z<7$ mm. This also results in an increased predicted total charge as well as an increased predicted maximum electron energy (see Figure 9(b)), a clearly beneficial effect for many applications. The scattered laser intensity is associated with Raman side-scattering and wavebreaking radiation, generated as the laser self-guides and self-compresses to a high peak intensity in the plasma channel[Reference Thomas, Mangles, Najmudin, Kaluza, Murphy and Krushelnick 52, Reference Matsuoka, McGuffey, Cummings, Horovitz, Dollar, Chvykov, Kalintchenko, Rousseau, Yanovsky, Bulanov, Thomas, Maksimchuk and Krushelnick 53]. Therefore, the increase of this scattering signal seen in Figure 9(a) indicates an increased possibility for the injection of electrons into the plasma wakefield at $z = 0$ mm, while maintaining a high-amplitude plasma wave for $z\,{>}\,7$ mm, resulting in the enhanced electron spectrum predicted in Figure 9(b).

5 Conclusion

In conclusion, we have constructed and trained a predictive model for an LWFA that is capable of predicting the electron spectrum for a given shot, based on secondary diagnostics of the laser and plasma conditions. The model is constructed from separately trained variational convolutional autoencoders, with a VNN used to map a reduced parameter set to the latent space of an electron spectra decoder. An ensemble of models was trained on subsets of the training data, with the range of model predictions providing an estimate of the uncertainty. The predictive model ensemble performs better than the naive assumption that the electron spectrum is constant, and so has utility in estimating the electron spectrum in the case of destructive processes, such as radiation reaction. The model fidelity is most likely limited by the lack of on-shot spatio-temporal information about the laser pulse, which is known to have a strong influence on the accelerated electron beam[Reference Maier, Delbos, Eichner, Hübner, Jalas, Jeppe, Jolly, Kirchen, Leroux, Messner, Schnepp, Trunk, Walker, Werle and Winkler 26]. It is expected that this technique can be improved by including additional diagnostics of the laser spatial and spectral phase, and by increasing the size of the training dataset, especially to reduce the prediction error for outliers. Further diagnostics of the laser–plasma interaction, such as spectrally resolving the scattering signal, may also provide additional information to improve the prediction accuracy. Neural networks of this kind could be an important tool for understanding the performance sensitivities of plasma accelerators, and also in providing synthetic diagnostics for applications of their electron beams and secondary sources.
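
The ensemble idea and the comparison with a constant-output baseline can be illustrated on a toy regression problem. The sketch below is not the published model: the data are synthetic, each ensemble member is a simple linear fit, and the subset size and ensemble size are assumed values.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: one scalar input x and one scalar output y per shot (toy problem).
n_shots = 300
x = rng.normal(size=n_shots)
y = 2.0 * x + rng.normal(scale=0.5, size=n_shots)
x_train, y_train = x[:200], y[:200]
x_test, y_test = x[200:], y[200:]

# Train each ensemble member on a random subset of the training shots.
n_models, subset_size = 10, 100
coeffs = []
for _ in range(n_models):
    idx = rng.choice(len(x_train), size=subset_size, replace=False)
    coeffs.append(np.polyfit(x_train[idx], y_train[idx], deg=1))

# The ensemble mean is the prediction; the range across members estimates uncertainty.
predictions = np.array([np.polyval(c, x_test) for c in coeffs])
mean_prediction = predictions.mean(axis=0)
spread = predictions.max(axis=0) - predictions.min(axis=0)

# Baseline: assume the output is the same on every shot (training-set mean).
rmse_model = np.sqrt(np.mean((mean_prediction - y_test) ** 2))
rmse_constant = np.sqrt(np.mean((y_train.mean() - y_test) ** 2))
```

Here `rmse_model` falling below `rmse_constant` is the toy analogue of the ensemble outperforming the constant-spectrum assumption. The range across members reflects only disagreement between the fitted models, not the measurement noise itself.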

Data availability

The data and code for this publication are available from the Zenodo online repository at https://zenodo.org/record/7510352#.Y9K2XezP30o.

Acknowledgements

This work was supported by UK STFC ST/V001639/1, UK EPSRC EP/V049577/1 and EP/V044397/1 and Horizon 2020 funding under European Research Council (ERC) Grant Agreement No. 682399. M.J.V.S. acknowledges support from the Royal Society URF-R1221874. A.G.R.T. and A.S.J. acknowledge support from US DOE grant DE-SC0016804.

References

Leemans, W. P., Nagler, B., Gonsalves, A. J., Tóth, C., Nakamura, K., Geddes, C. G. R., Esarey, E., Schroeder, C. B., and Hooker, S. M., Nat. Phys. 2, 696 (2006).
Kneip, S., Nagel, S. R., Martins, S. F., Mangles, S. P. D., Bellei, C., Chekhlov, O., Clarke, R. J., Delerue, N., Divall, E. J., Doucas, G., Ertel, K., Fiuza, F., Fonseca, R., Foster, P., Hawkes, S. J., Hooker, C. J., Krushelnick, K., Mori, W. B., Palmer, C. A. J., Phuoc, K. T., Rajeev, P. P., Schreiber, J., Streeter, M. J. V., Urner, D., Vieira, J., Silva, L. O., and Najmudin, Z., Phys. Rev. Lett. 103, 035002 (2009).
Clayton, C. E., Ralph, J. E., Albert, F., Fonseca, R. A., Glenzer, S. H., Joshi, C., Lu, W., Marsh, K. A., Martins, S. F., Mori, W. B., Pak, A., Tsung, F. S., Pollock, B. B., Ross, J. S., Silva, L. O., and Froula, D. H., Phys. Rev. Lett. 105, 105003 (2010).
Wang, X., Zgadzaj, R., Fazel, N., Li, Z., Yi, S. A., Zhang, X., Henderson, W., Chang, Y.-Y., Korzekwa, R., Tsai, H.-E., Pai, C.-H., Quevedo, H., Dyer, G., Gaul, E., Martinez, M., Bernstein, A. C., Borger, T., Spinks, M., Donovan, M., Khudik, V., Shvets, G., Ditmire, T., and Downer, M. C., Nat. Commun. 4, 1988 (2013).
Leemans, W. P., Gonsalves, A. J., Mao, H.-S., Nakamura, K., Benedetti, C., Schroeder, C. B., Tóth, C., Daniels, J., Mittelberger, D. E., Bulanov, S. S., Vay, J.-L., Geddes, C. G. R., and Esarey, E., Phys. Rev. Lett. 113, 245002 (2014).
Gonsalves, A. J., Nakamura, K., Daniels, J., Benedetti, C., Pieronek, C., de Raadt, T. C. H., Steinke, S., Bin, J. H., Bulanov, S. S., van Tilborg, J., Geddes, C. G. R., Schroeder, C. B., Tóth, C., Esarey, E., Swanson, K., Fan-Chiang, L., Bagdasarov, G., Bobrova, N., Gasilov, V., Korn, G., Sasorov, P., and Leemans, W. P., Phys. Rev. Lett. 122, 084801 (2019).
Schlenvoigt, H. P., Haupt, K., Debus, A., Budde, F., Jäckel, O., Pfotenhauer, S., Schwoerer, H., Rohwer, E., Gallacher, J. G., Brunetti, E., Shanks, R. P., Wiggins, S. M., and Jaroszynski, D. A., Nat. Phys. 4, 130 (2008).
Maier, A. R., Meseck, A., Reiche, S., Schroeder, C. B., Seggebrock, T., and Grüner, F., Phys. Rev. X 2, 031019 (2012).
Albert, F. and Thomas, A. G., Plasma Phys. Control. Fusion 58, 103001 (2016).
Wang, W., Feng, K., Ke, L., Yu, C., Xu, Y., Qi, R., Chen, Y., Qin, Z., Zhang, Z., Fang, M., Liu, J., Jiang, K., Wang, H., Wang, C., Yang, X., Wu, F., Leng, Y., Liu, J., Li, R., and Xu, Z., Nature 595, 516 (2021).
Sarri, G., Schumaker, W., Di Piazza, A., Vargas, M., Dromey, B., Dieckmann, M. E., Chvykov, V., Maksimchuk, A., Yanovsky, V., He, Z. H., Hou, B. X., Nees, J. A., Thomas, A. G., Keitel, C. H., Zepf, M., and Krushelnick, K., Phys. Rev. Lett. 110, 255002 (2013).
Sarri, G., Corvan, D. J., Schumaker, W., Cole, J. M., Di Piazza, A., Ahmed, H., Harvey, C., Keitel, C. H., Krushelnick, K., Mangles, S. P. D., Najmudin, Z., Symes, D., Thomas, A. G. R., Yeung, M., Zhao, Z., and Zepf, M., Phys. Rev. Lett. 113, 224801 (2014).
Cros, B. and Muggli, P., arXiv:1901.08436 (2019).
Thomas, A. G. R., Ridgers, C. P., Bulanov, S. S., Griffin, B. J., and Mangles, S. P. D., Phys. Rev. X 2, 041004 (2012).
Blackburn, T. G., Ridgers, C. P., Kirk, J. G., and Bell, A. R., Phys. Rev. Lett. 112, 015001 (2014).
Cole, J. M., Behm, K. T., Gerstmayr, E., Blackburn, T. G., Wood, J. C., Baird, C. D., Duff, M. J., Harvey, C., Ilderton, A., Joglekar, A. S., Krushelnick, K., Kuschel, S., Marklund, M., McKenna, P., Murphy, C. D., Poder, K., Ridgers, C. P., Samarin, G. M., Sarri, G., Symes, D. R., Thomas, A. G. R., Warwick, J., Zepf, M., Najmudin, Z., and Mangles, S. P. D., Phys. Rev. X 8, 011020 (2018).
Poder, K., Tamburini, M., Sarri, G., Di Piazza, A., Kuschel, S., Baird, C. D., Behm, K., Bohlen, S., Cole, J. M., Corvan, D. J., Duff, M., Gerstmayr, E., Keitel, C. H., Krushelnick, K., Mangles, S. P. D., McKenna, P., Murphy, C. D., Najmudin, Z., Ridgers, C. P., Samarin, G. M., Symes, D. R., Thomas, A. G. R., Warwick, J., and Zepf, M., Phys. Rev. X 8, 031004 (2018).
Thomas, A. G. R., Najmudin, Z., Mangles, S. P. D., Murphy, C. D., Dangor, A. E., Kamperidis, C., Lancaster, K. L., Mori, W. B., Norreys, P. A., Rozmus, W., and Krushelnick, K., Phys. Rev. Lett. 98, 095004 (2007).
Streeter, M. J. V., Kneip, S., Bloom, M. S., Bendoyro, R. A., Chekhlov, O., Dangor, A. E., Döpp, A., Hooker, C. J., Holloway, J., Jiang, J., Lopes, N. C., Nakamura, H., Norreys, P. A., Palmer, C. A. J., Rajeev, P. P., Schreiber, J., Symes, D. R., Wing, M., Mangles, S. P. D., and Najmudin, Z., Phys. Rev. Lett. 120, 254801 (2018).
Xia, C., Liu, J., Wang, W., Lu, H., Cheng, W., Deng, A., Li, W., Zhang, H., Liang, X., Leng, Y., Lu, X., Wang, C., Wang, J., Nakajima, K., Li, R., and Xu, Z., Phys. Plasmas 18, 113101 (2011).
Sävert, A., Mangles, S. P. D., Schnell, M., Siminos, E., Cole, J. M., Leier, M., Reuter, M., Schwab, M. B., Möller, M., Poder, K., Jäckel, O., Paulus, G. G., Spielmann, C., Skupin, S., Najmudin, Z., and Kaluza, M. C., Phys. Rev. Lett. 115, 055002 (2015).
Wang, J., Feng, J., Zhu, C., Li, Y., He, Y., Li, D., Tan, J., Ma, J., and Chen, L., Plasma Phys. Control. Fusion 60, 034004 (2018).
Zhang, C. J., Wan, Y., Guo, B., Hua, J. F., Pai, C.-H., Li, F., Zhang, J., Ma, Y., Wu, Y. P., Xu, X. L., Mori, W. B., Chu, H.-H., Wang, J., Lu, W., and Joshi, C., Plasma Phys. Control. Fusion 60, 044013 (2018).
Osterhoff, J., Popp, A., Major, Z., Marx, B., Rowlands-Rees, T., Fuchs, M., Geissler, M., Hörlein, R., Hidding, B., Becker, S., Peralta, E., Schramm, U., Grüner, F., Habs, D., Krausz, F., Hooker, S., and Karsch, S., Phys. Rev. Lett. 101, 085002 (2008).
Hafz, N. A. M., Jeong, T. M., Choi, I. W., Lee, S. K., Pae, K. H., Kulagin, V. V., Sung, J. H., Yu, T. J., Hong, K.-H., Hosokai, T., Cary, J. R., Ko, D.-K., and Lee, J., Nat. Photonics 2, 571 (2008).
Maier, A. R., Delbos, N. M., Eichner, T., Hübner, L., Jalas, S., Jeppe, L., Jolly, S. W., Kirchen, M., Leroux, V., Messner, P., Schnepp, M., Trunk, M., Walker, P. A., Werle, C., and Winkler, P., Phys. Rev. X 10, 031039 (2020).
He, Z.-H., Hou, B., Lebailly, V., Nees, J. A., Krushelnick, K., and Thomas, A. G. R., Nat. Commun. 6, 7156 (2015).
Dann, S. J. D., Baird, C. D., Bourgeois, N., Chekhlov, O., Eardley, S., Gregory, C. D., Gruse, J.-N., Hah, J., Hazra, D., Hawkes, S. J., Hooker, C. J., Krushelnick, K., Mangles, S. P. D., Marshall, V. A., Murphy, C. D., Najmudin, Z., Nees, J. A., Osterhoff, J., Parry, B., Pourmoussavi, P., Rahul, S. V., Rajeev, P. P., Rozario, S., Scott, J. D. E., Smith, R. A., Springate, E., Tang, Y., Tata, S., Thomas, A. G. R., Thornton, C., Symes, D. R., and Streeter, M. J. V., Phys. Rev. Accel. Beams 22, 041303 (2019).
Shalloo, R. J., Dann, S. J., Gruse, J. N., Underwood, C. I., Antoine, A. F., Arran, C., Backhouse, M., Baird, C. D., Balcazar, M. D., Bourgeois, N., Cardarelli, J. A., Hatfield, P., Kang, J., Krushelnick, K., Mangles, S. P., Murphy, C. D., Lu, N., Osterhoff, J., Põder, K., Rajeev, P. P., Ridgers, C. P., Rozario, S., Selwood, M. P., Shahani, A. J., Symes, D. R., Thomas, A. G., Thornton, C., Najmudin, Z., and Streeter, M. J., Nat. Commun. 11, 6355 (2020).
Jalas, S., Kirchen, M., Messner, P., Winkler, P., Hübner, L., Dirkwinkel, J., Schnepp, M., Lehe, R., and Maier, A. R., Phys. Rev. Lett. 126, 104801 (2021).
Kirchen, M., Jalas, S., Messner, P., Winkler, P., Eichner, T., Hübner, L., Hülsenbusch, T., Jeppe, L., Parikh, T., Schnepp, M., and Maier, A. R., Phys. Rev. Lett. 126, 174801 (2021).
Samarin, G. M., Zepf, M., and Sarri, G., J. Mod. Opt. 65, 1362 (2018).
Baird, C. D., Murphy, C. D., Blackburn, T. G., Ilderton, A., Mangles, S. P., Marklund, M., and Ridgers, C. P., New J. Phys. 21, 053030 (2019).
Arran, C., Cole, J. M., Gerstmayr, E., Blackburn, T. G., Mangles, S. P., and Ridgers, C. P., Plasma Phys. Control. Fusion 61, 074009 (2019).
Lin, J., Qian, Q., Murphy, J., Hsu, A., Hero, A., Ma, Y., Thomas, A. G. R., and Krushelnick, K., Phys. Plasmas 28, 083102 (2021).
Rawat, W. and Wang, Z., Neural Comput. 29, 2352 (2017).
Kristiadi, A., Hein, M., and Hennig, P., in Proceedings of the 37th International Conference on Machine Learning (2020), p. 5436.
Emma, C., Edelen, A., Hogan, M. J., O’Shea, B., White, G., and Yakimenko, V., Phys. Rev. Accel. Beams 21, 112802 (2018).
Ren, X., Edelen, A., Lutman, A., Marcus, G., Maxwell, T., and Ratner, D., Phys. Rev. Accel. Beams 23, 040701 (2020).
Hanuka, A., Emma, C., Maxwell, T., Fisher, A. S., Jacobson, B., Hogan, M. J., and Huang, Z., Sci. Rep. 11, 2945 (2021).
Rowlands-Rees, T. P., Kamperidis, C., Kneip, S., Gonsalves, A. J., Mangles, S. P. D., Gallacher, J. G., Brunetti, E., Ibbotson, T., Murphy, C. D., Foster, P. S., Streeter, M. J. V., Budde, F., Norreys, P. A., Jaroszynski, D. A., Krushelnick, K., Najmudin, Z., and Hooker, S. M., Phys. Rev. Lett. 100, 105005 (2008).
Pak, A., Marsh, K. A., Martins, S. F., Lu, W., Mori, W. B., and Joshi, C., Phys. Rev. Lett. 104, 025003 (2010).
McGuffey, C., Thomas, A. G. R., Schumaker, W., Matsuoka, T., Chvykov, V., Dollar, F. J., Kalintchenko, G., Yanovsky, V., Maksimchuk, A., Krushelnick, K., Bychenkov, V. Y., Glazyrin, I. V., and Karpeev, A. V., Phys. Rev. Lett. 104, 025004 (2010).
Chen, M., Esarey, E., Schroeder, C. B., Geddes, C. G. R., and Leemans, W. P., Phys. Plasmas 19, 033101 (2012).
Kingma, D. P. and Welling, M., arXiv:1312.6114 (2013).
Burgess, C. P., Higgins, I., Pal, A., Matthey, L., Watters, N., Desjardins, G., and Lerchner, A., arXiv:1804.03599 (2018).
Xu, B., Wang, N., Chen, T., and Li, M., arXiv:1505.00853 (2015).
Kingma, D. P. and Ba, J., arXiv:1412.6980 (2014).
Barnsley, F., Matthews, B., Griffin, T., Salt, J. W. A., Ross, D., Cătălin, C., Dibbo, A., and Crompton, S. Y., in 11th New Opportunities for Better User Group Software (NOBUGS) (2016), p. 23.
Lu, W., Tzoufras, M., Joshi, C., Tsung, F., Mori, W., Vieira, J., Fonseca, R., and Silva, L., Phys. Rev. Spec. Top. Accel. Beams 10, 061301 (2007).
Bloom, M. S., Streeter, M. J. V., Kneip, S., Bendoyro, R. A., Cheklov, O., Cole, J. M., Döpp, A., Hooker, C. J., Holloway, J., Jiang, J., Lopes, N. C., Nakamura, H., Norreys, P. A., Rajeev, P. P., Symes, D. R., Schreiber, J., Wood, J. C., Wing, M., Najmudin, Z., and Mangles, S. P. D., Phys. Rev. Accel. Beams 23, 061301 (2020).
Thomas, A. G., Mangles, S. P. D., Najmudin, Z., Kaluza, M. C., Murphy, C. D., and Krushelnick, K., Phys. Rev. Lett. 98, 054802 (2007).
Matsuoka, T., McGuffey, C., Cummings, P. G., Horovitz, Y., Dollar, F., Chvykov, V., Kalintchenko, G., Rousseau, P., Yanovsky, V., Bulanov, S. S., Thomas, A. G. R., Maksimchuk, A., and Krushelnick, K., Phys. Rev. Lett. 105, 034801 (2010).

Figure 1 Illustration of the experimental setup (not to scale). The primary laser focus was aligned to the front edge of a supersonic gas jet emitted from a 15 mm diameter nozzle positioned 10 mm below the laser pulse propagation axis. The input laser energy was measured by integrating the signal on a near-field camera before the compressor, which was cross-calibrated with an energy meter and adjusted for the 60% compressor throughput. The scattered laser signal was observed from above by an optical camera, and the plasma channel electron density profile was measured using interferometry with a transverse short-pulse probe laser. The small ($\lesssim 0.1\%$) transmission of the focusing laser pulse through a dielectric mirror was directed onto a CCD camera to obtain an on-shot far-field image. Electron beams from the LWFA were deflected by a magnetic dipole onto two Lanex screens (only the first is shown here), which were used to determine the electron spectrum in the range of $0.3 GeV.

Figure 2 Variational autoencoder (VAE) architecture for determining the latent space representation of the diagnostics. The type and dimension of each layer are indicated in the labels. The inset plots show an example laser scattering signal ${S}_{\rm L}$ and the approximation returned by the VAE. The input (and output) size ${N}_{\rm i}$ is equal to the data binning of the results for each individual diagnostic. Max pooling was used at the output of each convolution layer, which combined neighbouring output pairs and returned only the maximum of each pair. The average signal, in this case $\left\langle {S}_{\rm L}\right\rangle$, was passed as an additional latent space parameter for the encoder and was used to scale the output of the decoder. The autoencoder structure was the same for each diagnostic, except for the size of the latent space.

Figure 3 Diagram of the translator network architecture. Shown in the inset is an example measurement from the experimental data (black), with the mean prediction of the LWFA model ensemble (red) and individual model predictions (pink).

Table 1 Summary of autoencoder parameters used for each diagnostic and for the translator model.

Figure 4 (a) Measured electron spectra and reproduced electron spectra using (b) the trained variational autoencoder and (c) the mean prediction of the ensemble of the LWFA models. The individual shots are sorted by cut-off energy, determined as the highest energy for which the spectra exceed a threshold value.

Figure 5 Individual shots selected at equally spaced intervals of the sorted shot index from Figure 4. The measured spectra (black) are shown alongside the predictions of each LWFA model from the trained ensemble (red) and an individual spectrum measurement closest to the median of the training data (blue). The sorted shot index is shown in the top right of each panel.
