Neural network modeling and prediction of HfO2 thin film properties tuned by thermal annealing

Min Gao; Chaoyi Yin; Jianda Shao; Meiping Zhu

doi:10.1017/hpl.2024.6


Published online by Cambridge University Press: 22 February 2024


Min Gao: Affiliation:
School of Microelectronics, Shanghai University, Shanghai, China; Laboratory of Thin Film Optics, Key Laboratory of Materials for High Power Laser, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai, China
Chaoyi Yin: Affiliation:
Laboratory of Thin Film Optics, Key Laboratory of Materials for High Power Laser, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai, China
Jianda Shao: Affiliation:
Laboratory of Thin Film Optics, Key Laboratory of Materials for High Power Laser, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing, China; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
Meiping Zhu*: Affiliation:
School of Microelectronics, Shanghai University, Shanghai, China; Laboratory of Thin Film Optics, Key Laboratory of Materials for High Power Laser, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai, China; Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing, China; Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
*: Correspondence to: Meiping Zhu, Shanghai Institute of Optics and Fine Mechanics, Chinese Academy of Sciences, Shanghai 201800, China. Email: [email protected]


Abstract

Plasma-enhanced atomic layer deposition (PEALD) is gaining interest in thin films for laser applications, and post-annealing treatments are often used to improve thin film properties. However, research to improve thin film properties is often based on an expensive and time-consuming trial-and-error process. In this study, PEALD-HfO2 thin film samples were deposited and treated under different annealing atmospheres and temperatures. The samples were characterized in terms of their refractive indices, layer thicknesses and O/Hf ratios. The collected data were split into training and validation sets and fed to multiple back-propagation neural networks with different hidden layers to determine the best way to construct the process–performance relationship. The results showed that the three-hidden-layer back-propagation neural network (THL-BPNN) achieved stable and accurate fitting. For the refractive index, layer thickness and O/Hf ratio, the THL-BPNN model achieved accuracy values of 0.99, 0.94 and 0.94, respectively, on the training set and 0.99, 0.91 and 0.90, respectively, on the validation set. The THL-BPNN model was further used to predict the laser-induced damage threshold of PEALD-HfO2 thin films and the properties of the PEALD-SiO2 thin films, both showing high accuracy. This study not only provides quantitative guidance for the improvement of thin film properties but also proposes a general model that can be applied to predict the properties of different types of laser thin films, saving experimental costs for process optimization.

Keywords

laser-induced damage threshold; laser thin film; neural network; plasma-enhanced atomic layer deposition

Type: Research Article
Information: High Power Laser Science and Engineering, Volume 12, 2024, e21

DOI: https://doi.org/10.1017/hpl.2024.6
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial licence (https://creativecommons.org/licenses/by-nc/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use.
Copyright: © The Author(s), 2024. Published by Cambridge University Press in association with Chinese Laser Press

1 Introduction

Optical thin films are key components of laser systems, and their optical properties and laser-induced damage threshold (LIDT) directly affect the output energy of these systems [1–3]. Traditional preparation methods for laser thin films include electron-beam evaporation [4–6] and ion-beam sputtering [7]. Recently, plasma-enhanced atomic layer deposition (PEALD) has attracted attention because of its precise thickness controllability [8], excellent conformality [9], low-temperature growth properties [10] and high LIDT [11]. Furthermore, post-treatment annealing improves the properties of thin films grown via PEALD [12]. However, owing to the diversity and wide range of process parameters, process optimization and thin film performance improvement often require extensive, expensive and time-consuming experiments.

Back-propagation neural networks (BPNNs), a class of machine learning models, have shown potential for mapping the relationship between experimental parameters and material properties [13,14]. This approach can identify underlying regularities in the training data by updating the internal weight parameters [15,16]. In recent years, researchers have begun to apply neural networks to thin films to predict the growth rate [17–20], hydrophobicity [21], and permeate flux and foulant rejection [22]. Although these reports demonstrate the application of BPNNs to various thin films, studies on the properties of laser thin films are lacking. Furthermore, the adopted models were mainly shallow structures with one or two hidden layers. Shallow neural networks can meet most modeling and prediction needs but may require a large number of neurons to accurately represent the relationship between the input and output [23], which increases the likelihood of errors in the models [22]. In 2022, Mengu et al. [24], while studying the emerging symbiotic relationship between deep learning and optics, reported the advantages of deep neural networks with three or more hidden layers in terms of approximation and generalization capability. However, as the number of hidden layers increases, deep neural networks may suffer from poor performance or training failure owing to issues such as vanishing/exploding gradients [25]. Therefore, it is necessary to determine the optimal number of hidden layers for a specific task.

In this study, we employ several BPNN models to establish the relationship between the annealing process and the properties of PEALD-HfO₂ thin films for laser applications. Firstly, we compare the performance of BPNN models with different numbers of hidden layers and find that the three-hidden-layer back-propagation neural network (THL-BPNN) performs best. The THL-BPNN model is then used to model and predict the relationship between the annealing process and the PEALD-HfO₂ thin film properties, and is compared with a linear regression model and a support vector machine regression model. Finally, the LIDT of the PEALD-grown thin films and the properties of the PEALD-SiO₂ thin films are predicted using the THL-BPNN model to verify its applicability. We believe that the THL-BPNN model can help predict the properties of other laser thin films.

2 Materials and methods

2.1 Data preparation

The HfO₂ thin films used to construct the annealing process–thin film property relationship were grown on Si substrates using a commercial PEALD device (Picosun Advanced R200, Finland) with an integrated remote plasma source. HfO₂ thin films were grown by alternating exposure to the precursor tetrakis-ethylmethylamino hafnium (Hf(N(CH₃)(CH₂CH₃))₄, TEMAH) and O₂/Ar gas mixture plasma reactant at a deposition temperature of 150°C. The number of deposition cycles was 500, and the pulse sequence for each HfO₂ growth cycle was as follows: TEMAH feeding (1.6 s), N₂ purging (19 s), Ar/O₂ mixture feeding (11 s) and Ar purging (10 s). The samples were then annealed in quartz tube annealing equipment (RS 80/300/11, Nabertherm) for 3 h. The annealing process included a combination of three atmospheres (vacuum, O₂ and N₂) and six annealing temperatures (300°C to 800°C in 100°C increments). For vacuum annealing, the pressure in the tubular annealing chamber was approximately 1 × 10^–4 Pa. For O₂ and N₂ atmosphere annealing, the gas flow rate was 150 SCCM for both O₂ and N₂. The HfO₂ thin films were measured using an ellipsometer (Horiba Uvisel 2), and the thicknesses and refractive indices were extracted using the Tauc-Lorentz model in DeltaPsi2 software, neglecting the extinction coefficient (k). The O/Hf ratio of the HfO₂ thin films was analyzed using X-ray photoelectron spectroscopy (XPS) (Thermo Scientific) with a monochromatic Al Kα (1486.6 eV) X-ray source. The data used to construct the annealing process–thin film property relationship consisted of 19 samples, including 1 as-deposited sample and 18 annealed samples.

The HfO₂ thin film data used for LIDT modeling and prediction come from Ref. [26] and comprise 12 samples treated with different annealing process parameters. Six samples were annealed in an O₂ atmosphere, and the other six were annealed in a N₂ atmosphere. The annealing temperature ranged from 300°C to 800°C.

The SiO₂ thin film data used for property modeling and prediction come from Ref. [27] and comprise 10 samples grown with different deposition process parameters. Four samples were grown at different temperatures ranging from 50°C to 200°C, and six samples were grown with different precursor source exposure times ranging from 0.2 to 0.7 s.

Table 1 lists the detailed parameters of the datasets used to model and predict the properties of HfO₂ and SiO₂ thin films, including the refractive index, thickness and stoichiometric ratio. As the annealing temperature increases, the thickness of the HfO₂ thin film decreases and the refractive index increases. For annealing in vacuum, O₂ and N₂ environments, the thickness of the HfO₂ thin films varies over the ranges 34.7–42.7, 38.5–49.1 and 36.3–46.7 nm, respectively, while the refractive index (at 355 nm) varies over the ranges 1.99–2.24, 1.83–1.97 and 1.88–2.00, respectively. This means that the packing density of the HfO₂ thin film increases with increasing annealing temperature [26]. In addition, the O/Hf ratio of HfO₂ thin films annealed in an O₂ environment fluctuates slightly around the ideal value of 2.0, whereas the O/Hf ratio of HfO₂ thin films annealed in vacuum and N₂ environments decreases with increasing annealing temperature.

Table 1 Datasets for property prediction of HfO₂ and SiO₂ thin films.

*Note: 0, 1, 2 and 3 represent the as-deposited sample, O₂, N₂ and vacuum, respectively.

Table 2 lists the detailed parameters of the datasets used for LIDT modeling and prediction. Compared with PEALD-HfO₂ thin films, PEALD-SiO₂ thin films have lower absorption and impurity content. Furthermore, properties such as absorption, impurity content and stoichiometric ratio influence each other. Detailed relationships are described in Refs. [26,27]. The LIDT was tested in one-on-one mode according to ISO 21254 using a Gaussian-shaped 3ω neodymium-doped yttrium aluminum garnet (Nd:YAG) laser (355 nm, 7.8 ns). The LIDT test was performed under normal incidence, and the maximum laser fluence with zero damage probability was taken as the LIDT. It is worth mentioning that the LIDT of HfO₂ thin films is lower than that of SiO₂ thin films, which is attributed to the lower bandgap of HfO₂ compared with SiO₂.

Table 2 Datasets for LIDT prediction of HfO₂ and SiO₂ thin films.

*Note: 1 and 2 represent HfO₂ samples and SiO₂ samples, respectively.

2.2 Models

Six models were used to establish the correlation between the annealing process and the refractive index, layer thickness and O/Hf ratio of PEALD-HfO₂ thin films: four BPNN models with different numbers of hidden layers (single-, double-, three- and four-hidden-layer BPNNs), a support vector machine regression (SVR) model [28] using a Gaussian kernel function and a linear regression (LR) model [29]. The LR model performs linear regression fitting, whereas the other models perform nonlinear regression fitting. All models performed regression fitting by training on a training set, tuning the modeling parameters to achieve the highest accuracy (i.e., lowest error) and then validating on a validation set. When constructing the relationship between the annealing process and the properties of the PEALD-HfO₂ thin films, 6 samples were randomly selected as the validation set, and the remaining 13 samples (12 annealed samples and 1 as-deposited sample) were used as the training set. When predicting the LIDT of PEALD-grown thin films, 6 samples (3 HfO₂ samples and 3 SiO₂ samples) were randomly selected as the validation set, and the remaining 16 samples (9 HfO₂ samples and 7 SiO₂ samples) were used as the training set. When predicting the properties of PEALD-SiO₂ thin films, the leave-one-out cross-validation method was adopted owing to limited data. For each test, one sample was used as the validation set and the remaining samples were used as the training set, until every sample had been used as the validation set. Subsequently, the average performance deviation was calculated for each model.
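The two validation schemes described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names `random_split` and `leave_one_out` and the fixed seed are assumptions, and only sample indices are handled, not the measured data.

```python
import random

def random_split(n_samples, n_validation, seed=0):
    """Randomly hold out n_validation sample indices; the rest form the training set."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed (assumption) for repeatability
    validation = sorted(indices[:n_validation])
    training = sorted(indices[n_validation:])
    return training, validation

def leave_one_out(n_samples):
    """Yield (training, validation) index lists, holding out each sample exactly once."""
    for held_out in range(n_samples):
        training = [i for i in range(n_samples) if i != held_out]
        yield training, [held_out]

# 19 HfO2 samples: 13 for training and 6 for validation, as in the text.
hfo2_train, hfo2_val = random_split(19, 6)

# 10 SiO2 samples: ten folds, each validating on a single sample.
sio2_folds = list(leave_one_out(10))
```

Note that a purely random split does not guarantee that the as-deposited sample lands in the training set; how that was ensured in the study is not stated in the text.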

Figure 1 shows a schematic of the THL-BPNN model, including an input layer (layer 0), three hidden layers (layers 1–3) and an output layer (layer 4), with each layer containing one or more neurons. The number of neurons in the input and output layers was determined by the number of input and output variables in the dataset, whereas the number of neurons in the hidden layers was initially determined using Equation (1) (an empirical formula) and finally determined by a global traversal search:

(1)

$$\begin{align}l=\sqrt{u+v}+a,\end{align}$$

where u, v and l are the numbers of neurons in the input, output and hidden layers, respectively, and a is a random number between 1 and 10.
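As a worked example of Equation (1), the sketch below computes the range of hidden-layer sizes that the empirical formula suggests as a starting point. Rounding the bounds to whole neurons is an assumption made here, and `hidden_neuron_range` is a hypothetical helper name.

```python
import math

def hidden_neuron_range(u, v, a_min=1, a_max=10):
    """Integer bounds on l = sqrt(u + v) + a for a between a_min and a_max."""
    base = math.sqrt(u + v)
    lowest = math.ceil(base + a_min)    # smallest whole neuron count in range
    highest = math.floor(base + a_max)  # largest whole neuron count in range
    return lowest, highest

# Two inputs (x1, x2) and one output (y1), as in Figure 1.
low, high = hidden_neuron_range(u=2, v=1)
print(low, high)  # 3 11
```

With two inputs and one output, the formula therefore brackets the initial search at roughly 3 to 11 neurons per hidden layer before the global traversal search refines the choice.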

Figure 1 THL-BPNN model with all neurons in adjacent layers connected, where x = [x₁; x₂], y₁ and h_ij represent the input, output and intermediate processing signals, respectively.

The neurons receive input signals from the previous layer and generate output signals for the next layer [30,31]. For example, the first neuron in layer 1 (from top to bottom), the circle where h₁₁ is located, receives the input signals x = [x₁; x₂] from layer 0. Then x undergoes a linear transformation to give the weighted sum, z, which is expressed as follows:

(2)

$$\begin{align}z={\boldsymbol{w}}^{\mathrm{T}}\boldsymbol{x}+b,\end{align}$$

where w = [w₁; w₂] $\in \mathbb{R}^2$ is the weight vector between the neurons, and b $\in \mathbb{R}$ is a bias.

Subsequently, z passes through a nonlinear activation function $f(\cdot)$ [32], and the output signal h₁₁ is generated as follows:

(3)

$$\begin{align}{h}_{11}=f(z).\end{align}$$

These processes were performed for each neuron in each layer to form the final output signal, y₁ [33]. In this way, an initial mapping from the input space to the output space is established through layer-by-layer information transfer.
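The layer-by-layer transfer of Equations (2) and (3) can be sketched in NumPy as below. The layer widths, the random weights and the input values are placeholders chosen for illustration, and the linear output layer is an assumption, since the text specifies the activation only for the hidden layers.

```python
import numpy as np

def forward(x, weights, biases):
    """Propagate x through tanh hidden layers and a linear output layer."""
    h = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(W @ h + b)            # Eq. (2) then Eq. (3), one row per neuron
    return weights[-1] @ h + biases[-1]   # output layer kept linear (assumption)

rng = np.random.default_rng(0)
sizes = [2, 4, 4, 4, 1]  # input, three hidden layers, output (hypothetical widths)
weights = [rng.normal(size=(n_out, n_in)) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=n_out) for n_out in sizes[1:]]

x = np.array([0.5, -0.2])                 # placeholder scaled inputs
z = weights[0][0] @ x + biases[0][0]      # weighted sum for the first neuron of layer 1
h11 = np.tanh(z)                          # its output signal
y = forward(x, weights, biases)           # final output, shape (1,)
```

Each row of a weight matrix plays the role of the vector w in Equation (2), so a matrix-vector product evaluates every neuron in a layer at once.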

To further improve the mapping accuracy, a training loss was constructed at the output layer, and a training algorithm was selected to update the relevant parameters (weights w and bias b) using the chain rule [34] until the loss or the number of iterations reached the preset threshold [35]. The Levenberg–Marquardt algorithm [36] was used to solve the resulting nonlinear least squares problem. The hyperbolic tangent function was selected as the activation function for all hidden layers. The initialization state of each run was fixed to avoid interference from other factors.
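A minimal sketch of a Levenberg–Marquardt update loop for a small tanh network is given below, under several stated assumptions: the data are synthetic (not the measured samples), the layer widths in `SIZES` are hypothetical, the damping schedule is a common textbook choice, and the Jacobian is approximated by finite differences for brevity, whereas a back-propagation implementation would obtain it through the chain rule.

```python
import numpy as np

SIZES = [2, 3, 3, 3, 1]  # hypothetical widths: input, three hidden layers, output

def unpack(p):
    """Split the flat parameter vector into per-layer (W, b) pairs."""
    layers, i = [], 0
    for n_in, n_out in zip(SIZES[:-1], SIZES[1:]):
        W = p[i:i + n_in * n_out].reshape(n_out, n_in)
        i += n_in * n_out
        b = p[i:i + n_out]
        i += n_out
        layers.append((W, b))
    return layers

def predict(p, X):
    """Network output for each row of X (tanh hidden layers, linear output)."""
    H = X.T
    layers = unpack(p)
    for W, b in layers[:-1]:
        H = np.tanh(W @ H + b[:, None])
    W, b = layers[-1]
    return (W @ H + b[:, None]).ravel()

def residuals(p, X, t):
    return predict(p, X) - t

def numerical_jacobian(p, X, t, eps=1e-6):
    """Forward-difference Jacobian of the residuals with respect to p."""
    r0 = residuals(p, X, t)
    J = np.empty((r0.size, p.size))
    for k in range(p.size):
        shifted = p.copy()
        shifted[k] += eps
        J[:, k] = (residuals(shifted, X, t) - r0) / eps
    return J

def levenberg_marquardt(p, X, t, n_iter=100, lam=1e-2):
    """Damped Gauss-Newton steps; a step is kept only if it lowers the squared error."""
    r = residuals(p, X, t)
    cost = r @ r
    for _ in range(n_iter):
        J = numerical_jacobian(p, X, t)
        step = np.linalg.solve(J.T @ J + lam * np.eye(p.size), -J.T @ r)
        r_new = residuals(p + step, X, t)
        cost_new = r_new @ r_new
        if cost_new < cost:
            p, r, cost = p + step, r_new, cost_new
            lam = max(lam * 0.1, 1e-8)   # trust the quadratic model more
        else:
            lam *= 10.0                  # fall back towards gradient descent
    return p, cost

rng = np.random.default_rng(0)           # fixed seed, mirroring a fixed initialization
X = rng.uniform(-1, 1, size=(13, 2))     # synthetic inputs, NOT the measured data
t = np.sin(2 * X[:, 0]) * X[:, 1]        # synthetic target
n_params = sum(n_in * n_out + n_out for n_in, n_out in zip(SIZES[:-1], SIZES[1:]))
p0 = 0.5 * rng.normal(size=n_params)
r_init = residuals(p0, X, t)
initial_cost = float(r_init @ r_init)
p_fit, final_cost = levenberg_marquardt(p0, X, t)
```

Because a step is accepted only when it reduces the sum of squared residuals, the training error never increases from one iteration to the next; validation data are still needed to detect overfitting.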

2.3 Model specification and evaluation

2.3.1 Variable scaling

Considering that different distribution ranges of the input and output values may lead to biased assessments, Equation (4) is used to scale the input and output data to [-1, 1]:

(4)

$$\begin{align}{X}_\mathrm{norm}=\frac{\left({Y}_\mathrm{max}-{Y}_\mathrm{min}\right)\left(X-{X}_\mathrm{min}\right)}{X_\mathrm{max}-{X}_\mathrm{min}}+{Y}_\mathrm{min},\end{align}$$

where X is the input or output vector; X_max and X_min are the maximum and minimum values of the input or output vector, respectively; and Y_max and Y_min are the maximum and minimum values after normalization, respectively.
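Equation (4) can be sketched as follows, using the six annealing temperatures from Section 2.1 as the example input. The function names `scale` and `unscale` are illustrative; the inverse mapping is included because predictions made in the scaled space must be converted back to physical units before comparison with measurements.

```python
import numpy as np

def scale(x, y_min=-1.0, y_max=1.0):
    """Map x linearly onto [y_min, y_max] following Equation (4)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return (y_max - y_min) * (x - x_min) / (x_max - x_min) + y_min

def unscale(x_norm, x_min, x_max, y_min=-1.0, y_max=1.0):
    """Invert Equation (4) to recover values in the original units."""
    x_norm = np.asarray(x_norm, dtype=float)
    return (x_norm - y_min) * (x_max - x_min) / (y_max - y_min) + x_min

temperatures = [300, 400, 500, 600, 700, 800]  # annealing temperatures in deg C
scaled = scale(temperatures)
print(scaled)  # [-1.  -0.6 -0.2  0.2  0.6  1. ]
restored = unscale(scaled, 300, 800)
```

The formula is undefined when X_max equals X_min, so a constant column would need separate handling.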

2.3.2 Model evaluation metrics

The coefficient of determination (R²) [37] was used to evaluate the overall performance of each model. The average accuracy (AA) was used to evaluate the performance of each model on a validation set with only a single sample. The root mean square error (RMSE) [38] was used to measure the deviation between the predicted and measured values:

(5)

$$\begin{align}{R}^2&=1-\frac{\sum \limits_{i=1}^n{\left({Y}_i-{T}_i\right)}^2}{\sum \limits_{i=1}^n{\left({Y}_i-\overline{Y}\right)}^2},\end{align}$$

(6)

$$\begin{align}\mathrm{AA}&=\frac{1}{n}\sum \limits_{i=1}^n\left(1-\frac{\left|{Y}_i-{T}_i\right|}{\left|{Y}_i\right|}\right),\end{align}$$

(7)

$$\begin{align}\mathrm{RMSE}&={\left[\frac{1}{n}\sum \limits_{i=1}^n{\left({Y}_i-{T}_i\right)}^2\right]}^{1/2},\end{align}$$

where n is the size of the dataset; Y_i and T_i are the measured and predicted values of the ith sample in the dataset, respectively; and $\overline{Y}$ is the average of the measured values. A lower RMSE (close to 0) and higher R ² and AA (close to 1) indicate smaller differences between the measured and predicted values.
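The three metrics can be written directly from their definitions, with Y the measured values and T the predicted values. The numbers in the example are made up purely to exercise the functions and are not data from this study.

```python
import numpy as np

def r_squared(y, t):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y, t = np.asarray(y, dtype=float), np.asarray(t, dtype=float)
    return 1.0 - np.sum((y - t) ** 2) / np.sum((y - y.mean()) ** 2)

def average_accuracy(y, t):
    """Mean of 1 - |Y_i - T_i| / |Y_i| over all samples."""
    y, t = np.asarray(y, dtype=float), np.asarray(t, dtype=float)
    return np.mean(1.0 - np.abs(y - t) / np.abs(y))

def rmse(y, t):
    """Root mean square error between measured and predicted values."""
    y, t = np.asarray(y, dtype=float), np.asarray(t, dtype=float)
    return np.sqrt(np.mean((y - t) ** 2))

measured = [2.0, 2.1, 2.2]    # illustrative values only
predicted = [2.0, 2.0, 2.2]
print(r_squared(measured, predicted))         # approximately 0.5
print(average_accuracy(measured, predicted))  # approximately 0.984
print(rmse(measured, predicted))              # approximately 0.058
```

R² is undefined for a single sample because the denominator vanishes, which is why AA is the appropriate metric for the one-sample validation sets used in leave-one-out cross-validation.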

3 Results and discussion

3.1 Analysis of the number of hidden layers of the BPNN model

The influence of the number of hidden layers in the BPNN model on the modeling accuracy was studied using the measured refractive index, layer thickness and O/Hf ratio of the PEALD-HfO₂ thin films treated with different annealing process parameters. The optimal number of neurons in each hidden layer was determined by a global traversal search for the lowest mean absolute error on the training set, and the optimal model was then applied to the validation set. For the refractive index and layer thickness datasets, the total number of neurons in the BPNN models with multiple hidden layers was kept equal to that of the single-hidden-layer BPNN model. For the O/Hf ratio dataset, because the optimal number of neurons in the single-hidden-layer BPNN model is only five, this value was set as the maximum number of neurons in each hidden layer of the BPNN models with multiple hidden layers. The modeling and prediction accuracies are shown in Figure 2. Overall, as the number of hidden layers increased from one to three, the gaps in R² and RMSE between the training and validation sets decreased, indicating that the model moved from inexact to exact fitting. However, when the number of hidden layers was further increased to four, these gaps increased. This may be because the number of possible combinations of neurons across the layers grows exponentially with the number of hidden layers, which introduces the risk of overfitting while potentially yielding better solutions. The only exception is the modeling of the refractive index, where a single-hidden-layer BPNN also performs well, which could be attributed to the small variation in this property and the uncomplicated relationship between the input and output. With the three-hidden-layer BPNN model, the R² values of the refractive index, layer thickness and O/Hf ratio were not lower than 0.90 in both the training and validation sets. The THL-BPNN model was therefore selected for the follow-up study.
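The size of the search space behind the global traversal search can be illustrated by enumerating candidate hidden-layer configurations. This sketch only counts configurations and does not train any model; both helper names are hypothetical, and the total of six neurons in the second example is chosen arbitrarily.

```python
from itertools import product

def all_configurations(n_layers, max_neurons):
    """Every tuple of per-layer neuron counts from 1 to max_neurons."""
    return list(product(range(1, max_neurons + 1), repeat=n_layers))

def configurations_with_total(n_layers, total):
    """Per-layer neuron counts (at least 1 each) that sum to a fixed total."""
    return [c for c in product(range(1, total + 1), repeat=n_layers)
            if sum(c) == total]

# At most five neurons per hidden layer, as for the O/Hf ratio dataset.
counts = {k: len(all_configurations(k, 5)) for k in range(1, 5)}
print(counts)  # {1: 5, 2: 25, 3: 125, 4: 625}

# Fixed total number of neurons spread over three hidden layers.
print(len(configurations_with_total(3, 6)))  # 10
```

With a cap of five neurons per layer, each added hidden layer multiplies the number of candidate architectures by five, which is the exponential growth referred to above.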

Figure 2 Accuracy of BPNNs with one to four hidden layers based on (a) the refractive index (at 355 nm), (b) layer thickness and (c) O/Hf ratio of PEALD-HfO₂. The four columns in each subgraph represent the R ² values of the model in the training and validation sets and the RMSE values in the training and validation sets, respectively. The table indicates the number of neurons in each hidden layer of each model.

3.2 Comparison of the THL-BPNN model with other models

The performance of the THL-BPNN model was further evaluated and compared with the LR and SVR models. The refractive index, layer thickness and O/Hf ratio of the HfO₂ thin films predicted by the three models were compared with the measured values, as shown in Figure 3 and Table 3. As shown in Figures 3(a), 3(d) and 3(g), the poor performance of the LR model on all three datasets indicates a nonlinear relationship between the annealing process and the thin film properties. As shown in Figures 3(b), 3(e) and 3(h), the SVR model obtains a better fit than the LR model on the layer thickness and O/Hf ratio datasets, but it still does not perform well enough on the refractive index dataset. As shown in Figures 3(c), 3(f) and 3(i), the values predicted by the THL-BPNN model agree well with the measured values for most samples, particularly for the refractive index dataset, indicating that the THL-BPNN model has a high accuracy in modeling and predicting the relationship between the annealing process parameters and HfO₂ thin film properties.

Figure 3 Measured and predicted (a)–(c) refractive index, (d)–(f) layer thickness and (g)–(i) O/Hf ratio of HfO₂ thin films. The data in the left-hand, middle and right-hand columns are predicted by the LR model, SVR model and THL-BPNN model, respectively. The blue line (with a slope of 1) serves as a guideline for perfect prediction.

Table 3 Evaluation of the LR, SVR and THL-BPNN models.

Table 3 lists the specific performance of all models on the training and validation sets. The THL-BPNN model performs best among the three regression models, with R ² values not lower than 0.90 for the refractive index, layer thickness and O/Hf ratio datasets. High R ² values and low RMSE values indicate that the THL-BPNN model can capture the patterns and extend them to unknown data. In short, the THL-BPNN model shows good stability in constructing the relationship between the annealing process and HfO₂ thin film properties under several conditions.

3.3 Evaluation of the THL-BPNN model for other thin film applications

3.3.1 Prediction of the LIDT of PEALD-HfO₂ and PEALD-SiO₂ thin films

The LIDT value is a key specification for thin films used in laser systems [39,40]. Firstly, we analyzed the main factors affecting the LIDT. According to Ref. [26], the main factors affecting the LIDT of HfO₂ thin films are the C impurity content, N impurity content, absorption and O/Hf ratio. Pearson's correlation coefficient was used to further analyze the correlation between the main influencing factors and the LIDT. The results shown in Figure 4 indicate that the O/Hf ratio is positively correlated with the LIDT, whereas all other parameters are negatively correlated with it. The change in the C and N impurity contents can be represented by the total impurity content. Likewise, for SiO₂ thin films, the factors affecting the LIDT include the total impurity content, absorption and O/Si ratio. We then applied the THL-BPNN to the quantitative prediction of the LIDT based on these factors. The total impurity content, absorption, stoichiometric ratio and type of thin film were fed into the THL-BPNN as input variables, and the LIDT was derived as the output variable.
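For reference, Pearson's correlation coefficient between two measured quantities can be computed as in the sketch below. The numerical values are invented solely to show a positive and a negative correlation and are not the data behind Figure 4.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    spread_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    spread_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return covariance / (spread_a * spread_b)

# Hypothetical values for illustration only.
lidt = [10.0, 12.5, 13.0, 16.0]
stoichiometric_ratio = [1.90, 1.96, 1.98, 2.02]
absorption = [8.0, 6.5, 5.0, 2.0]

r_ratio = pearson(stoichiometric_ratio, lidt)   # positive for these values
r_absorption = pearson(absorption, lidt)        # negative for these values
```

The coefficient ranges from -1 to 1 and captures only linear association, so it serves here as a screening step for input selection rather than as a model of the LIDT.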

Figure 4 Correlations between properties of HfO₂ thin films used in this section. Blue indicates a negative correlation, whereas red indicates a positive correlation. Darker colors and larger circles indicate higher correlations. The numbers inside the circles indicate the corresponding correlation coefficients of the two features.

The predicted and measured LIDT values of each sample are shown in Figure 5. The THL-BPNN model performs well on both the training and validation sets, with high accuracy and a prediction error smaller than the relative error of the LIDT measurement. The relative error of the damage probability is about ±15%, mainly due to the nonuniformity among the samples (3%), the measurement of the laser spot area (5%) and the fluctuation of laser energy (5%) [41]. For the training set and validation set, the R² values are 1.00 and 0.97, respectively, and the RMSE values are 0.48 and 2.32, respectively. The results show that the THL-BPNN model is effective for predicting LIDT values of HfO₂ and SiO₂ thin films.

Figure 5 Comparison of measured and predicted LIDT values on the (a) training set and (b) validation set.

3.3.2 Prediction of other properties of PEALD-SiO₂ thin films

SiO₂ is the most common low-refractive-index material used for laser thin films in the ultraviolet to near-infrared wavelength region. It is of great significance to study the correlation between the properties of SiO₂ thin films and the deposition parameters. Therefore, we applied the THL-BPNN model to evaluate the relationship between the deposition parameters and the properties of PEALD-SiO₂ thin films. Figure 6 shows the excellent performance of the THL-BPNN model in predicting the properties of PEALD-SiO₂ thin films on the validation set, including the refractive index, layer thickness and O/Si ratio. For most samples, the prediction deviation was smaller than the measurement error.

Figure 6 Comparison of measured and predicted values of (a) the refractive index (at 355 nm), (b) the layer thickness and (c) the O/Si ratio for SiO₂ thin films in the validation set.

Table 4 lists the R², AA and RMSE values of the THL-BPNN model for SiO₂ thin film properties. The average R² value of the O/Si ratio on the training set is 0.81; all other values, namely the average R² values of the refractive index and layer thickness on the training set and the AA values of the three properties on the validation set, are higher than 0.98. Although the THL-BPNN model did not perform sufficiently well on the O/Si ratio training set, it still provided accurate predictions on the corresponding validation set. This could be attributed to the successful learning of correlations by the THL-BPNN model through training. Therefore, the THL-BPNN model can be used to construct the relationship between the deposition parameters and PEALD-SiO₂ thin film properties, demonstrating the universality of the THL-BPNN model in studying the nonlinear relationship between the process parameters and thin film properties.

Table 4 Evaluation of the THL-BPNN model for SiO₂ thin film properties.

4 Conclusions

In this study, BPNN models with different numbers of hidden layers were used to establish the correlation between the properties of PEALD-HfO₂ thin films and annealing parameters. For modeling, the annealing parameters, including the annealing atmosphere and temperature, were used as inputs, and measured thin film properties, including the refractive index, layer thickness and O/Hf ratio, were used as outputs. The data were split into two categories: a training set and a validation set. Firstly, BPNN models with different numbers of hidden layers were compared. The results demonstrated that as the number of hidden layers was increased to achieve higher accuracy on the training sets, the risk of overfitting also increased. Considering the fitting accuracy and model stability, the THL-BPNN model was adopted in a follow-up study. The performance of the THL-BPNN model was then compared with that of the LR and SVR models. The poor performance of the LR model on most datasets indicated that the effect of the two input features on the dependent output variable was nonlinear. The THL-BPNN model achieved a high accuracy of not less than 0.90 on all training and validation datasets, confirming that the THL-BPNN model outperforms the SVR model, which also belongs to the category of nonlinear regression fitting. Finally, the THL-BPNN model was used to predict the LIDT of PEALD-HfO₂ and PEALD-SiO₂ thin films, and the mapping relationship between deposition parameters and PEALD-SiO₂ thin film properties was constructed. The modeling results showed that the predicted values are consistent with the measured values, proving that the THL-BPNN model is a reliable predictive learning-based model. We believe that the THL-BPNN model can be used to predict the properties of different types of thin films, thereby reducing the experimental cost of process optimization.

Acknowledgements

The authors express their appreciation to Wenyun Du and Zesheng Lin for their fruitful discussions. This work was supported by the Program of Shanghai Academic Research Leader (No. 23XD1424100), the CAS Project for Young Scientists in Basic Research (No. YSBR-081), the National Natural Science Foundation of China (No. 61975215) and the Science and Technology Planning Project of the Shanghai Municipal Science & Technology Commission (No. 21DZ1100400).


Table 1 Datasets for property prediction of HfO₂ and SiO₂ thin films.

Table 2 Datasets for LIDT prediction of HfO₂ and SiO₂ thin films.

Figure 1 THL-BPNN model with all neurons in adjacent layers connected, where x = [x₁; x₂], y₁ and hᵢⱼ represent the input, output and intermediate processing signals, respectively.

Figure 2 Accuracy of BPNNs with one to four hidden layers based on (a) the refractive index (at 355 nm), (b) layer thickness and (c) O/Hf ratio of PEALD-HfO₂. The four columns in each subgraph represent the R² values of the model in the training and validation sets and the RMSE values in the training and validation sets, respectively. The table indicates the number of neurons in each hidden layer of each model.

Figure 3 Measured and predicted (a)–(c) refractive index, (d)–(f) layer thickness and (g)–(i) O/Hf ratio of HfO₂ thin films. The data in the left-hand, middle and right-hand columns are predicted by the LR model, SVR model and THL-BPNN model, respectively. The blue line (with a slope of 1) serves as a guideline for perfect prediction.

Table 3 Evaluation of the LR, SVR and THL-BPNN models.

Figure 4 Correlations between properties of HfO₂ thin films used in this section. Blue indicates a negative correlation, whereas red indicates a positive correlation. Darker colors and larger circles indicate higher correlations. The numbers inside the circles indicate the corresponding correlation coefficients of the two features.

Figure 5 Comparison of measured and predicted LIDT values on the (a) training set and (b) validation set.

Figure 6 Comparison of measured and predicted values of (a) the refractive index (at 355 nm), (b) the layer thickness and (c) the O/Si ratio for SiO₂ thin films in the validation set.

Table 4 Evaluation of the THL-BPNN model for SiO₂ thin film properties.
