Nomenclature
- AI: artificial intelligence
- ARIMA: auto-regressive integrated moving average
- CMAPSS: commercial modular aero-propulsion system simulation
- CNN: convolutional neural network
- $d_k$: gradient scaling factor
- GRN: gated residual network
- HPC: high-pressure compressor
- HPT: high-pressure turbine
- K: key matrix
- LPC: low-pressure compressor
- LPT: low-pressure turbine
- LSTM: long short-term memory
- $m_n$: slope of a normal degradation
- NLP: natural language processing
- PSO: particle swarm optimisation
- Q: query matrix
- OEM: original equipment manufacturer
- RMSE: root mean squared error
- RNN: recurrent neural network
- RUL: remaining useful life
- t: time
- $t_E$: flight cycle at which the maximum excitation energy is exceeded
- T: transpose
- TFT: Temporal Fusion Transformer
- V: value matrix
- $\delta_a$: abnormal degradation
- $\delta_n$: normal degradation
- $\delta_{iw}$: initial wear
- $\varepsilon$: process noise
1.0 Introduction
The modern aviation and power generation industries heavily rely on the efficiency, reliability and safety of gas turbine engines. These complex mechanical systems play a pivotal role in a wide range of applications, from propelling aircraft to generating electricity. Ensuring their optimal performance, minimising downtime and preventing catastrophic failures are critical objectives to achieve operational excellence and cost-effectiveness. In this pursuit, the field of gas turbine diagnostics and prognostics has emerged as a promising avenue for enhancing maintenance practices and extending the operational lifespan of these vital machines.
Unlike gas turbine diagnostics, the scope of gas turbine prognostics literature is comparatively more limited. The available prognostic methods can be categorised into physics-based [Reference Li and Nilkitsaranont1, Reference Alozie, Li, Wu, Shong and Ren2], data-driven [Reference Losi, Venturini and Manservigi3, Reference Zaccaria, Fentaye and Kyprianidis4] and hybrid physics-data-driven [Reference Kordestani, Mousavi, Chaibakhsh, Orchard, Khorasani and Saif5, Reference Arias Chao, Kulkarni, Goebel and Fink6]. A review of model-based and data-driven gas turbine prognostic methods was provided by Tahan et al. [Reference Tahan, Tsoutsanis, Muhammad and Abdul Karim7]. According to Byington et al. [Reference Byington, Roemer and Galie8], early methods often relied on expert knowledge, first principles, empirical relationships and AI methods. These methods were mainly used to estimate the remaining useful life (RUL) based on degradation signatures and observed trends in engine sensor data. DePold et al. [Reference DePold and Gass9] used knowledge-based expert systems and neural networks for a combined diagnostic-prognostic system. Roemer and Kacprzynski [Reference Roemer and Kacprzynski10] evaluated the value of prognostics in risk assessment for decision-making and discussed the significance of data fusion in gas turbine prognostics. Marinai et al. [Reference Marinai, Singh, Curnock and Probert11] applied ARIMA (auto-regressive integrated moving average) to gas turbine degradation prediction, with the help of physics-based mathematical degradation models.
Following the publication of NASA’s Commercial Modular Aero-Propulsion System Simulation (CMAPSS) dataset [Reference Saxena, Goebel, Simon and Eklund12] and its enhanced version, the N-CMAPSS dataset [Reference Arias Chao, Kulkarni, Goebel and Fink13], recent advancements in gas turbine prognostics have primarily focussed on deep learning methods [Reference Ramasso and Saxena14]. Notably, techniques such as convolutional neural networks (CNNs) [Reference Li, Ding and Sun15, Reference Muneer, Taib, Fati and Alhussian16], long short-term memory (LSTM) [Reference Deng17, Reference Lin, Liu, Guo, Lv and Tong18], and combined CNN-LSTM [Reference Al-Dulaimi, Zabihi, Asif and Mohammadi19, Reference Kong, Cui, Xia and Lv20] were employed. Chao et al. [Reference Arias Chao, Kulkarni, Goebel and Fink6] proposed a physics-informed deep LSTM approach for RUL prediction in fleets of turbofan engines. The hybrid framework uses physics-based models to estimate hidden parameters related to system health, which are then fused with sensor data to create an LSTM-based prognostics model. LSTM and other recurrent neural networks (RNNs) have gained prominence in sequential data processing tasks due to their capability to model temporal dependencies and sequential patterns in sensor data. These methods are powerful at capturing short-term dynamics within sequences but might encounter difficulties when dealing with long-term dependencies and complex interactions between different variables.
While deep learning methods have shown promising performance in predicting the RUL of aircraft engines, applications using Transformer models have been limited. More recently, Xia et al. [Reference Xia, Feng, Teng, Chen and Song21], Hu et al. [Reference Hu, Zhao and Ren22], Fan et al. [Reference Fan, Li and Chang23] and Zhang et al. [Reference Zhang, Song and Li24] applied Transformer networks to predict the RUL of turbofan engines through supervised regression learning. They utilised the CMAPSS dataset to train and validate their proposed methods. The problem was treated as a time-series regression task, using sets of gas path measurements as input to predict the RUL of different engine components as the target variable. These studies reported that Transformer models outperform CNN and LSTM models.
There are two general categories of prognostics in machine health management. The first, and more common, type is RUL prediction based on the current and past degradation profiles. RUL refers to the time left before the machine or its components reach the predefined acceptable operating margins. For gas turbines, indicators such as exhaust gas temperature (EGT) margins and gas path components’ stall margins are commonly used to define RUL [Reference Saxena, Goebel, Simon and Eklund12]. These margins can be defined by users or original equipment manufacturers (OEMs). As highlighted above, previous studies on gas turbine prognostics focussed solely on RUL estimation. However, in harsh operating conditions, it is crucial to predict the likelihood that the machine/engine will keep working without significant performance loss or potential damage until some future scheduled inspection [Reference Sikorska, Hodkiewicz and Ma25]. This requires an effective degradation tracking and forecasting system.
Fouling contributes to 70% to 85% of performance deterioration in gas turbines by impacting the efficiency of components and reducing the airflow across the gas path [Reference Diakunchak26]. Fouling-induced performance degradation is temporary and can be restored through effective cleaning/washing. There are two types of washing for gas turbine compressors: online/in-service washing and offline/out-of-service washing. Online washing involves injecting water or a water-based detergent solution into the compressor(s) during operation. It is relatively gentle and helps prevent the buildup of foulant deposits. Offline washing (also known as crank washing) is conducted after the gas turbine is turned off and cooled. It involves a more thorough cleaning process, often utilising strong detergents and hand cleaning, to effectively remove accumulated deposits. The downtime associated with more frequent offline washing can result in decreased overall availability and productivity of the gas turbine system, potentially leading to financial losses or disruptions in service delivery.
There are various recommendations in the literature regarding the frequency of the two washing mechanisms, with a focus on cost-benefit analysis [Reference Aretakis, Roumeliotis, Doumouras and Mathioudakis27–Reference Hanachi, Liu, Ding, Kim and Mechefske29]. Stalder [Reference Stalder30] suggested online washing with water every day to every three days, followed by detergent cleaning at least once a week, depending on the type of foulant. Through a comprehensive analysis on data from six gas turbines, daily compressor online washing coupled with regular interval offline washing was suggested for gas turbines operating at high fouling rate [Reference Schneider, Demircioglu Bussjaeger, Franco and Therkorn31]. It was recommended to carry out at least four crank washings per year to remove accumulated deposits downstream [Reference Stalder30, Reference Kolkman32].
However, strict adherence to offline washing recommendations optimised solely through cost-benefit analysis is challenging. Firstly, there is no consensus in the literature on washing frequency due to differences in operating environments between land-based and aero-engines. Even among land-based engines or aero-engines, exposure to fouling agents can vary based on engine configuration, specific site location, and surrounding environment, impacting the optimal frequency and method of compressor washing for each engine type. Secondly, predefined schedules may not account for sudden environmental changes, leading to unforeseen performance declines and downtimes. Therefore, offline washing schedules should be based on actual degradation rates, determined through diagnostic and prognostic methods. This could be the most effective approach for all gas turbine applications and conditions [Reference Rao and Naikan33].
Few recent studies have focused on tracking fouling trends using deep learning for compressor offline washing optimisation. Chen et al. [Reference Chen, Tang, Lu and Zhang34] employed LSTM for fouling trend tracking in a single-shaft industrial gas turbine compressor. They integrated the fouling prediction model, the gas turbine performance model and an economic model of compressor washing for the optimisation. A particle swarm optimisation (PSO) algorithm was employed to determine the optimal offline washing schedule with variable intervals. This integration led to better financial outcomes compared to other cost/benefit-based optimisation techniques for washing scheduling. A similar approach has also been presented in Ref. [Reference Jin35].
The primary aim of this paper is to develop a method for forecasting compressor fouling trends in gas turbines. It proposes a novel Temporal Fusion Transformer (TFT) algorithm capable of tracking and forecasting flow capacity trends due to fouling in aircraft engines. It then illustrates how predictions of flow capacity degradation trends can inform the scheduling of offline compressor washing in advance. Performance data from multiple high-bypass ratio-geared turbofan engine configurations for short-range applications are considered to demonstrate and validate the effectiveness of the proposed method. The main contributions are:
-
1. Use of transformer models for fouling trend prediction and forecasting: The proposed TFT-based compressor fouling trend tracking and forecasting method uses historical time-series data to estimate the current fouling level in terms of flow capacity loss and project the trend into the future. The method can also be applied to predict component efficiency trajectories, providing a comprehensive approach to prognose other failure causes. This approach has not yet been implemented in the field of gas turbine engine health management systems in this way.
-
2. Compressor offline washing scheduling based on flow capacity predictions: instead of strictly following OEM-recommended compressor offline washing intervals, using fouling trend predictions to suggest upcoming crank washing events in advance could be more reliable and economical. Few studies have demonstrated fouling trend predictions for compressor washing optimisation.
-
3. Robustness to handle engine-to-engine performance variations: it was also verified that the method proposed is advantageous in handling a considerable disparity between the training and test datasets, which is difficult for most of the traditional data-driven methods. This robustness is important to accommodate engine-to-engine degradation profile differences.
2.0 Method
2.1 Temporal fusion transformer
The Transformer model is a special kind of neural network structure introduced by Vaswani et al. in 2017 [Reference Vaswani36]. It was a breakthrough in natural language processing (NLP) and quickly evolved into one of the most powerful methods for sequence-to-sequence problems. The structure is composed of sets of encoder and decoder blocks, each with multi-head self-attention and feed-forward layers. The self-attention scheme involved in the structure assigns weights to different parts of the input data based on their significance to the output. The feed-forward layers process the outputs of the self-attention layers and perform the final prediction. Various advancements and modifications have been made to the original Transformer architecture over time, including Transformer-XL, Bidirectional Encoder Representations from Transformers (BERT), Performer BERT, and Generative Pre-trained Transformer (GPT) [Reference Galanis, Vafiadis, Mirzaev and Papakostas37].
Temporal Fusion Transformers, on the other hand, are specifically designed for time-series prediction and forecasting [Reference Lim, Arık, Loeff and Pfister38]. They extend the Transformer architecture by incorporating autoregressive modelling, exogenous inputs and a gating mechanism to control the flow of information between the different components. In addition, TFTs introduce a new type of attention mechanism called Temporal Attention, which enables the model to attend to both past and future inputs when making predictions. The core idea behind TFTs is to model the temporal dependencies in time-series data by using a self-attention mechanism. The self-attention mechanism allows the model to attend to different parts of the input sequence and weight them differently depending on their relevance to the output prediction. TFTs have shown promising results on a variety of time-series forecasting tasks, including predicting energy consumption, traffic volume and weather prediction [Reference Lim, Arık, Loeff and Pfister38, Reference Wu, Wang and Zeng39].
2.2 Gas turbine prognostics using TFT
The TFT model aims to predict the current performance of a gas turbine and forecast the degradation progress in the future time steps, given historical measurements up to the current time. The methodology involved a series of steps to prepare the data, design the model architecture and train it for accurate predictions.
First, the time-series performance data needs to be structured in a format that can be fed into the TFT model. This typically involves converting each observation into a fixed-size vector representation that the TFT model can process. This is done by combining temporal positional embeddings and feature embeddings. Different ranges of vector representation specify the dimensionality of the input features (i.e. the number of selected gas path measurements) and the length of the input sequence or window size. The window size indicates how many previous observations in the time-series are used to predict flow capacity and efficiency deviation in the next timestep. Input-output pairs are then formed for the training. The inputs are the embedded segments of measured values, while the outputs are degradation magnitudes to be predicted. Figure 1 illustrates the structure of input-output performance data to the TFT model before the embedding. Note that the current state can be estimated by a diagnostic system. When the degradation estimated by the diagnostic system reaches a predefined threshold to start the forecast, the prognostic model is activated to forecast the progress.
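The windowing step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in the study; the function name `make_windows`, the two toy sensor channels and the window size of 3 are assumptions made for the example, and the embedding step is omitted.

```python
from typing import List, Sequence, Tuple


def make_windows(
    measurements: Sequence[Sequence[float]],
    degradation: Sequence[float],
    window_size: int,
) -> Tuple[List[List[List[float]]], List[float]]:
    """Pair each window of past measurements with the next-step degradation."""
    if len(measurements) != len(degradation):
        raise ValueError("measurements and degradation must have equal length")
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    inputs: List[List[List[float]]] = []
    targets: List[float] = []
    for end in range(window_size, len(measurements)):
        # Rows end - window_size .. end - 1 form one input window.
        inputs.append([list(row) for row in measurements[end - window_size:end]])
        # The target is the degradation at the timestep right after the window.
        targets.append(degradation[end])
    return inputs, targets


# Toy data (hypothetical): 6 flight cycles, 2 sensor channels.
X = [[float(t), 10.0 * t] for t in range(6)]
y = [-0.1 * t for t in range(6)]
windows, targets = make_windows(X, y, window_size=3)
```

Each element of `windows` has shape (window size, number of measurements), which corresponds to the two dimensions of the input representation mentioned in the text.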
The structure of the TFT model for degradation tracking and forecasting was determined through experimentation. Various numbers of layers for both the encoder and decoder were examined. Each encoder block primarily comprises input processing layers, static covariate and LSTM-encoder layers, and an Add & Normalisation layer. Each decoder block consists of multi-headed feed-forward layers, Add & Normalisation layers, a dropout layer and Dense layers. The encoders process the historical gas path measurements, capturing temporal dependencies using LSTM networks and relevant patterns using multi-headed self-attention layers. The decoder generates degradation forecasts based on the encoded input representations. It uses the attention mechanism to focus on relevant parts of the encoded information and combines it with its own processing to make forecasts. For the underlying mathematics and detailed descriptions of the encoder and decoder layers, interested readers are encouraged to refer to Ref. [Reference Vaswani36]. Figure 2 shows a simplified layout of the TFT architecture applied.
2.2.1 Description of the major layers of the TFT model
-
LSTM-encoder/decoder layers: The TFT structure involves sets of LSTM-based encoder and decoder layers to capture temporal relationship between past and future degradation trends in the time-series and forecast future values.
-
Static covariate encoders: These layers enable incorporating static features into the encoder and decoder layers to give some context to the temporal dynamics.
-
Multi-headed self-attention layers: This module incorporates several attention layers computing in parallel to capture long-term dependencies between different parts of the input sequence. Mathematically:
\begin{align}
\mathrm{Attention}\left(\mathrm{Q},\mathrm{K},\mathrm{V}\right) = \mathrm{softmax}\left(\frac{\mathrm{Q}\mathrm{K}^{\mathrm{T}}}{\sqrt{\mathrm{d}_{\mathrm{k}}}}\right)\mathrm{V} \tag{1}
\end{align}
where Q, K and V are the query matrix, key matrix and value matrix, respectively, softmax computes the attention scores for each query-key pair, and $\sqrt{\mathrm{d}_{\mathrm{k}}}$ is a scaling factor to prevent extremely small gradients during training.
-
Add & norm layers: These layers add the residual connection from the previous layer and apply normalisation. Residual connections are used to avoid the vanishing gradient problem, and normalisation stabilises the training process.
-
Feed-forward layers: This part introduces non-linearity to the data and further learns useful features from it. The default feed-forward option is ‘Gated Residual Network (GRN)’. There are also other alternatives including ‘Gated Linear Unit (GLU)’, ‘Bilinear’, ‘ReGLU’, ‘Gated Enhanced Gated Linear Unit (GEGLU)’, ‘SwiGLU’.
-
Dense nets: These are fully connected regression layers used to estimate the degradation value for each timestep. They also provide probabilistic forecasts by estimating quantiles of the predictive distribution. This allows the model to capture prediction uncertainty and provide interval forecasts.
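The scaled dot-product attention of Equation (1) can be illustrated for a single head with a short standard-library sketch. The helper names (`softmax`, `matmul`, `attention`) and the toy matrices are assumptions for illustration; a multi-headed layer would run several such heads in parallel on learned projections of the input, which is not shown here.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Tuple[Matrix, Matrix]:
    """Return softmax(Q K^T / sqrt(d_k)) V and the attention weights."""
    d_k = len(k[0])
    k_transposed = [list(col) for col in zip(*k)]
    scores = matmul(q, k_transposed)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy example (hypothetical values): 2 timesteps, d_k = 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
output, weights = attention(Q, K, V)
```

Each row of `weights` sums to one, so every output row is a weighted average of the rows of V, with larger weights on the timesteps whose keys best match the query.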
2.3 Using degradation forecasts to schedule maintenance events
The frequency of offline compressor washing depends on several factors such as engine type, weather conditions and fouling rate. The best option to handle these factors could be to trace fouling effects using diagnostic and prognostic tools and link them to the washing schedule. Some authors [Reference Scott40, Reference Diakunchak41] suggested conducting offline compressor washing when the flow capacity loss due to fouling reaches 2%–3%. As discussed in Section 2.2, the proposed TFT method can track and forecast flow capacity degradation trajectories. Hence, instead of following fixed offline washing intervals regardless of the gas turbine type and the actual fouling rate, it would be more beneficial to use the TFT model to predict when the flow capacity loss due to fouling reaches 2%–3%.
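As a simple illustration of this idea, the sketch below finds the first forecast step at which the predicted flow capacity deviation reaches a washing threshold. The function name `first_crossing`, the forecast values and the −2.5% threshold are hypothetical and only serve the example; deviations are expressed in percent and are negative for a loss.

```python
from typing import Optional, Sequence


def first_crossing(forecast: Sequence[float], threshold: float) -> Optional[int]:
    """Index of the first forecast step at or below the (negative) threshold."""
    for step, value in enumerate(forecast):
        if value <= threshold:
            return step
    # The forecast horizon ends before the threshold is reached.
    return None


# Hypothetical forecast of flow capacity deviation (%), one value per flight cycle.
forecast = [-1.6, -1.8, -2.1, -2.4, -2.6, -2.9, -3.1]
wash_step = first_crossing(forecast, threshold=-2.5)
```

Here `wash_step` is the number of flight cycles after the forecast start at which offline washing would be suggested; a return value of `None` indicates that no washing is needed within the forecast horizon.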
Although this paper primarily focuses on compressor washing, the TFT algorithm can also predict hot section degradation profiles and utilise this information to schedule overhaul events. This task requires integrating the prognostic system with an effective diagnostic system, to identify affected components and root causes first. In a previous study, the authors of this paper proposed a hybrid physics-based and data-driven diagnostic system that can effectively detect and isolate gas turbine faults to the component level [Reference Fentaye, Zaccaria and Kyprianidis42, Reference Fentaye, Zaccaria and Kyprianidis43]. The current method can then be coupled with the diagnostic system to form an integrated diagnostic-prognostic framework. Once the root cause is identified and the current severity is estimated by the diagnostic system, the prognostic model associated with the affected component can forecast the future progress of the degradation and estimate the overhaul time based on economic considerations.
2.4 Case study on turbofan engines
To demonstrate the performance of the proposed method, a synthetic dataset was generated from a two-spool turbofan engine through an in-house dynamic model, EVA. EVA is a physics-based performance code employing Gibbs free energy minimisation for combustion modelling and comprehensive component modelling. It integrates first and second-order physical effects, including non-linear off-design behaviour through component maps. Its solver, a gradient-based Newton-Raphson approach, ensures robustness. EVA’s object-oriented design facilitates flexibility and customisation for various gas turbine configurations, adhering to international standards. Detailed information about the software can be found in Refs [Reference Kyprianidis, Colmenares Quintero, Pascovici, Ogaji, Pilidis and Kalfas44, Reference Kyprianidis45]. Figure 3 shows a schematic representation of the case study engine along with the corresponding station numbers. The method used to generate degradation trajectories followed the well-established N-CMAPSS dataset generation procedure [Reference Arias Chao, Kulkarni, Goebel and Fink13], and the main steps are highlighted as follows:
-
1. Define input conditions: The simulation was performed at a cruise flight segment, and a snapshot was considered for each flight cycle simulation. For every engine considered in the simulation, a similar Mach number (M = 0.7), altitude (Alt = 20 kft) and flight length were considered.
-
2. Define degradation trajectories: The health condition of the major rotary components was represented by efficiency and flow capacity parameters. The evolution of the degradation trajectories begins with some level of initial wear, to account for the production and assembly tolerances among fleets of engines. It was modelled by adding normally distributed random values ranging from 0 to 1% [Reference Saxena, Goebel, Simon and Eklund12]. Following the establishment of initial wear, the normal degradation effect is added to the trajectory. This phase simulates the gradual decline in engine performance over time due to usage, which is often considered normal and is modelled as a linear decreasing trend. Each sub-component is assumed to have a maximum excitation energy threshold. This threshold represents the limit beyond which the sub-component transitions to an abnormal state. The distribution of maximum excitation energy threshold values is modelled as Gaussian to account for material property variability in each unit. The abnormal degradation onset time due to a fault is defined as the point in time when the total excitation energy that a component has experienced from the initial time (t = 0) to a specific time ($t_E$) exceeds the maximum excitation energy threshold. In the current work, short flight lengths (up to three hours) with low and medium operation intensity were considered for the case study engines. Abnormal degradation patterns are then introduced to the trajectory, based on the exponential decay model provided by Saxena and Goebel [Reference Saxena, Goebel, Simon and Eklund12]. These patterns simulate typical degradation events induced by rapid or abrupt failure conditions. These anomalies can accelerate degradation, leading to rapid declines in engine health. The degradation model used to compute these trajectories is given in Equations (2) and (3). The simulation continues up to a point that signifies a critical threshold beyond which the engine can no longer operate effectively.
\begin{align}
\delta_n\left(t\right) = m_n\left(t\right) + \delta_{iw} \quad \forall t \lt t_E \tag{2}
\end{align}
\begin{align}
\delta_a\left(t\right) = 1 - \exp\left[a\left(t - t_E\right)^b\right] + \delta_n\left(t_E\right) + \varepsilon \tag{3}
\end{align}
where $\delta_{iw}$ is the initial wear ($\delta_{iw} \in \left[0\%, 1\%\right]$), $m_n = -0.001$ is the slope of the normal degradation $\delta_n\left(t\right)$, $t_E$ is the flight cycle at which the excitation energy that the particular component has experienced exceeds the maximum, $\varepsilon$ is process noise, $a \sim U(0.001, 0.003)$, $b \sim U(1.4, 1.6)$, and $\varepsilon \sim N(0, 0.001)$ for efficiency and $\varepsilon \sim N(0, 0.003)$ for flow capacity. For a given engine, a and b are constant. The maximum tolerated degradation considered for all flow capacities was ±7.5%, while for all efficiencies it was −4%.
-
3. Generate sensor data: The degradation trajectories obtained from step 2 were imposed on the engine performance model, and measurements were simulated for a fleet of 30 engines, each with a different degradation mode. Intermediate pressure compressor (IPC) exit total temperature (T25), IPC exit total pressure (P25), high pressure compressor (HPC) exit total temperature (T3), HPC exit total pressure (P3), high pressure turbine (HPT) exit total temperature (T43), HPT exit total pressure (P43), low pressure turbine (LPT) exit total temperature (T5), low pressure shaft speed and high pressure shaft speed sensors were included in the model.
-
4. Add sensor noise: After generating the required data, white Gaussian noise was added to account for sensor non-repeatability. A signal-to-noise ratio of 60 dB was considered as the maximum noise level for each sensor involved.
-
5. After forming the input-output data pairs, the dataset was divided into training and test sets. Each dataset was normalised to ensure better model performance and convergence. Data from 25 engines were utilised for training and data from 5 engines for testing. The trained TFT model was then used to forecast degradation trajectories into future timesteps.
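Steps 2 and 4 above can be sketched with the Python standard library as follows. This is an illustrative sketch under stated assumptions, not the EVA-based procedure used in the study: the term $m_n(t)$ in Equation (2) is read as the product $m_n \cdot t$, Equation (3) is applied for $t \ge t_E$, all deviations are treated as percentages, the initial wear is drawn uniformly from [0, 1] for simplicity, the second parameter of N(0, ·) is used as a standard deviation, and the noise power is derived from the mean signal power. The function names and the constant 100-unit signal are hypothetical.

```python
import math
import random
from typing import List, Sequence


def degradation_trajectory(
    n_cycles: int,
    t_e: int,
    rng: random.Random,
    noise_std: float = 0.003,  # flow capacity value from the text; 0.001 for efficiency
    m_n: float = -0.001,
) -> List[float]:
    """Health-parameter deviation (%) per flight cycle following Eqs (2) and (3)."""
    delta_iw = rng.uniform(0.0, 1.0)  # initial wear in [0%, 1%] (assumed uniform)
    a = rng.uniform(0.001, 0.003)     # constant for a given engine
    b = rng.uniform(1.4, 1.6)         # constant for a given engine
    delta_n_at_te = m_n * t_e + delta_iw
    trajectory = []
    for t in range(n_cycles):
        if t < t_e:
            # Eq. (2): linear normal degradation (m_n(t) read as m_n * t).
            trajectory.append(m_n * t + delta_iw)
        else:
            # Eq. (3): exponential abnormal degradation plus process noise.
            eps = rng.gauss(0.0, noise_std)
            trajectory.append(1.0 - math.exp(a * (t - t_e) ** b) + delta_n_at_te + eps)
    return trajectory


def add_sensor_noise(signal: Sequence[float], snr_db: float, rng: random.Random) -> List[float]:
    """Add white Gaussian noise so that signal power / noise power matches snr_db."""
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_std = math.sqrt(signal_power / 10.0 ** (snr_db / 10.0))
    return [s + rng.gauss(0.0, noise_std) for s in signal]


rng = random.Random(42)
trajectory = degradation_trajectory(n_cycles=150, t_e=100, rng=rng)
noisy = add_sensor_noise([100.0] * 1000, snr_db=60.0, rng=rng)
```

With these settings the trajectory decreases slowly and linearly up to cycle 100 and then falls away exponentially, which matches the qualitative shape described in step 2.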
3.0 Results and discussion
To generate a single time sequence, all the degradation trajectories considered for training and testing were stacked one after the other, and the associated input measurements were treated similarly. After this, the data was prepared (embedded and encoded) according to the specific input format requirements of the TFT model. Subsequently, the data was divided into training, testing and forecast subsets. The model was trained using the training set to determine its parameters. The test dataset, which chronologically follows the training data, was then used to assess the generalisation performance of the method and fine-tune the hyperparameters through an iterative process. After training and testing the model, it was used to generate forecasts for the future. In all three cases, the predicted values were compared to the actual values to calculate the RMSE of the predictions.
During the model’s hyperparameter tuning process, the training data was partitioned into varying batch sizes, ranging from 32 to 160. Several iterations were conducted to determine the optimal hyperparameters. This involved evaluating different feedforward dimensions (ranging from 8 to 256), incrementally adjusting dropout probabilities from 0.1 to 0.5, varying batch sizes from 32 to 256, evaluating learning rates of 0.1, 0.01, 0.001 and 0.0001, and considering the number of attention, encoder and decoder heads, ranging from 1 to 8 for each component. Ultimately, the model with the best performance was selected; a summary of its hyperparameters is shown in Table 1.
In this paper, only results for the IPC flow capacity prognostics are presented, aiming to demonstrate how the forecast could be incorporated into compressor washing scheduling. However, the method can also be used to predict the other health parameters in the gas path in a similar way. Figure 4 displays the test and forecast results obtained. Flow capacity patterns from 25 engines and the associated simulated sensor data were used for training. Degradation data from engines 26 to 29 (up to end-of-life), and the degradation data from engine 30 up to a random degradation level were used as a test dataset. The forecasting task was initiated randomly when the flow capacity deviation in the degradation trajectory of engine 30 reached −1.64%. RMSE values of 0.1128, 0.1121, 0.0978, 0.1345, 0.1739 and 0.273 were obtained for 10, 20, 30, 40, 50 and 60 flight cycles into the future, respectively.
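The RMSE per forecast horizon can be computed as in the sketch below. It assumes that each reported value covers all flight cycles from the forecast start up to the given horizon, which the text does not state explicitly; the function names and the toy trajectories (a constant 0.1 percentage-point offset) are hypothetical.

```python
import math
from typing import Dict, Iterable, Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_by_horizon(
    actual: Sequence[float],
    predicted: Sequence[float],
    horizons: Iterable[int],
) -> Dict[int, float]:
    """RMSE over the first h forecast steps for each horizon h (assumed cumulative)."""
    return {h: rmse(actual[:h], predicted[:h]) for h in horizons}


# Toy forecast (hypothetical): prediction offset by 0.1 percentage points.
actual = [-1.64 - 0.02 * i for i in range(60)]
predicted = [value + 0.1 for value in actual]
scores = rmse_by_horizon(actual, predicted, horizons=(10, 20, 30, 40, 50, 60))
```

Because the deviations are expressed in percent, the RMSE is in percentage points of flow capacity, which allows a direct comparison with the 2%–3% washing range.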
3.1 Effect of the forecast starting time
To assess the method’s sensitivity to the initial degradation level for trajectory forecasting, three different scenarios were examined: Scenario 1 (starting at −1%), Scenario 2 (starting at −1.5%) and Scenario 3 (starting at −2%). Forecasting continued until the degradation level reached −3% from each reference point. The resulting RMSE values were 0.0963, 0.0973 and 0.0904, respectively. As shown in Fig. 5, in all three scenarios, the target −3% flow capacity was predicted two flight cycles earlier than expected. However, since offline washing is typically recommended within the range of 2%–3% flow capacity loss, the impact of these observed errors on the offline washing scheduling is negligible. Early prediction is preferable, as late prediction could result in an in-service failure, which could have safety, economic and maintenance efficiency implications. Setting the washing time when the estimated flow capacity loss reaches −2.5% would provide a high-confidence margin in case of any unexpected prediction outliers.
The absolute error for engine 30’s trajectory, including the test and forecasted horizons, is presented in Fig. 6. In the initial and linear degradation phases of the IPC, the prediction errors were large. This was mainly caused by measurement noise during normal operation. As the degradation gradually developed, the TFT model improved its ability to track the target degradation profile. Similarly, as the length of the forecasting time step into the future increased, deviations between the forecasted and actual degradation increased, as expected, but they remained within an acceptable range.
3.2 Effect of the training and test dataset ratio
In machine learning, it is common practice to split the dataset into training and test sets, with around 70%–80% of the data allocated for training and the remaining 20%–30% for testing. The model learns from the training dataset to make predictions and is evaluated on the unseen test dataset to assess its performance. Additionally, during the training phase, it is typical to further split the training dataset into training and validation sets. The validation set is used for model tuning and hyperparameter optimisation. In this study, we investigated the impact of different training-to-test data ratios, including 60/40, 70/30 and 90/10. In all three cases, forecasting commenced when the flow capacity deviation of the last trajectory reached −1% and continued until it reached −3% flow capacity deviation.
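A chronological split at the ratios considered here can be sketched as follows. The function name `chronological_split` and the 100-sample toy series are assumptions for illustration; the split keeps the time order so that the test portion always follows the training portion, in line with the stacked time sequence described in Section 3.0.

```python
from typing import Dict, List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], train_fraction: float
) -> Tuple[List[float], List[float]]:
    """Split a time-ordered series into leading training and trailing test parts."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = round(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


# Toy series (hypothetical): 100 stacked timesteps.
series = [float(i) for i in range(100)]
splits: Dict[float, Tuple[List[float], List[float]]] = {
    ratio: chronological_split(series, ratio) for ratio in (0.6, 0.7, 0.8, 0.9)
}
```

No shuffling is applied, since shuffling a time series before splitting would let the model see samples that lie in the future of the test portion.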
The prediction accuracy of both the test and forecast data was evaluated in terms of root mean square error (RMSE) and summarised in Table 2. For visualisation purposes, the comparison between the target degradation trajectories and the model-predicted values for these three cases is also displayed in Fig. 7. The test results showed that for the 60/40 ratio, an RMSE of 0.4833 was achieved, while the 70/30 ratio resulted in an RMSE of 0.4749, and the 90/10 ratio resulted in an RMSE of 0.4622. When compared to the test RMSE of 0.4396 obtained for the 80/20 ratio (as described in Section 3.1), the RMSE values for these three scenarios were slightly higher. For the 60/40 ratio, the RMSE value of the forecast up to −3% flow capacity loss is 0.4263. This implies that the TFT model prediction suggests washing when the actual degradation falls between −2.5% and −3%, in agreement with the recommended range of 2%–3% flow capacity deviation. The forecasting error up to −2.5% flow loss is an RMSE of 0.4167. This means that if washing is planned based on a predicted value of −2.5% flow capacity deviation, the washing will be performed prior to the actual flow capacity loss reaching −2.5% due to fouling, but still within the recommended range of 2%–3% flow capacity loss. Similarly, for the second case (the 70/30 ratio), the RMSE of the forecast up to the −3% flow capacity loss is 0.3369, and the forecasting error up to −2.5% flow loss is an RMSE of 0.3289. For the last case, the RMSE of the forecast up to the −3% flow capacity loss is 0.2406, and the forecasting error up to −2.5% flow loss is an RMSE of 0.2329.
In general, scheduling offline washing when the TFT model prediction lies between −2.5% and −3% flow capacity deviation ensures that compressor washing is performed within the recommended 2%–3% flow capacity loss. Scheduling earlier is safer than allowing washing events to occur after the −3% flow capacity change has been exceeded.
3.3 Effect of the abnormal degradation starting time
The time to exceed the maximum excitation energy is strongly correlated with the inlet temperature [Reference Kamtsiuris, Raddatz and Wende46]. Harsh flight conditions and long flights often result in a relatively rapid shift from a linear to an abnormal degradation trend. To assess the performance of the method on degradation profiles that differ from those used during training and testing because of variations in the timing of maximum excitation energy, four cases were considered: trajectories with a time to maximum excitation energy (${t_E}$) of 20, 40, 60 and 80 flight cycles. The degradation trajectory constants ‘a’ and ‘b’, as well as the process noise ‘ε’, were randomly chosen as discussed in Section 2.4. In both the training and test trajectories, a constant ${t_E}$ of 100 flight cycles was used. The results of the test are presented in Fig. 8, where trajectories B, D, F and H represent the actual degradation trajectories for the four case studies, and trajectories A, C, E and G represent the corresponding predicted trajectories.
The RMSE of the forecast was approximately 0.26. If a −2% flow capacity deviation is taken as the lower threshold for triggering offline washing, the model may predict the need for washing 10 to 15 flights earlier than the expected time. Based on the RMSE value and as shown in the zoomed-in view (Fig. 8(b)), the flow capacity change over 10 to 15 flight cycles is generally smaller than 0.26% in magnitude within the target −2% to −3% region. Therefore, scheduling the offline washing event when the model prediction lies between −2.3% and −3% would be safe.
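The margin argument above can be sketched in a few lines: the lower washing limit is shifted by the forecast RMSE, and the first flight cycle at which the prediction reaches the shifted limit is taken as the earliest planning point. The function names `planning_threshold` and `first_crossing` are hypothetical, and the predicted values in the usage example are made up for illustration; they are not results from this study.

```python
def planning_threshold(lower_limit, forecast_rmse):
    """Shift the lower washing limit (percent, negative) by the forecast
    RMSE so that a prediction that runs early still falls inside the
    recommended band. For example, -2.0 with an RMSE of 0.26 gives about
    -2.26, which the text rounds to -2.3."""
    if forecast_rmse < 0:
        raise ValueError("forecast_rmse must be non-negative")
    return lower_limit - forecast_rmse


def first_crossing(predicted, threshold):
    """Return the index of the first flight cycle whose predicted flow
    capacity deviation is at or below `threshold`, or None if the
    threshold is never reached."""
    for cycle, value in enumerate(predicted):
        if value <= threshold:
            return cycle
    return None


if __name__ == "__main__":
    # Illustrative predicted deviations in percent, one per flight cycle.
    predicted = [-1.0, -1.5, -2.1, -2.4, -2.9, -3.1]
    limit = planning_threshold(-2.0, 0.26)
    print(limit, first_crossing(predicted, limit))
```

Using the unshifted −2% limit in this example would trigger planning one cycle sooner, which is the early-prediction case that the margin is meant to absorb.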
4.0 Conclusions
This paper has introduced a novel prognostic method for forecasting gas turbine degradation based on time-series measurements. The Temporal Fusion Transformer model, which combines encoder and decoder layers with temporal-attention mechanisms, has demonstrated its ability to accurately predict degradation trajectories several flight cycles into the future. This allows gas turbine operators to proactively schedule maintenance events, optimise compressor washing frequency and minimise unexpected downtime.
The study demonstrated the effectiveness of the proposed method by tracking compressor degradation using flow capacity as the primary parameter. Flow capacity changes are indicative of compressor fouling deposit progression, and these changes inform washing interval scheduling. The method can effectively forecast offline washing events in advance when fouling deposits reach detectable levels. It is essential to note that this method can also be applied to predict component efficiency trajectories, providing a comprehensive approach to prognose all types of gas path faults. Integrating the proposed prognostic method with a diagnostic method enables a holistic approach to assess engine component(s) health and predict maintenance intervals.
Future work in this area may include extending the method to estimate the remaining useful life of gas turbine components, conducting further analysis of forecasting accuracy while increasing the number of degradation trajectories considered for training, exploring the integration of economic analysis into the washing optimisation process and applying this methodology to a real engine to assess its performance in real-world scenarios.
Acknowledgements
This research was funded by the Swedish Knowledge Foundation (KKS) under the project PROGNOSIS, Grant Number 20190994. ChatGPT was used to refine the text in the manuscript, particularly in relation to Section 2.1.
Competing interests
The authors declare that they have no competing interests.