
Transformer-based in-context policy learning for efficient active flow control across various airfoils

Published online by Cambridge University Press:  19 December 2024

Changdong Zheng
Affiliation:
Center for Engineering and Scientific Computation, Zhejiang University, Zhejiang 310027, PR China
Fangfang Xie*
Affiliation:
Center for Engineering and Scientific Computation, Zhejiang University, Zhejiang 310027, PR China
Tingwei Ji
Affiliation:
Center for Engineering and Scientific Computation, Zhejiang University, Zhejiang 310027, PR China
Hongjie Zhou
Affiliation:
Center for Engineering and Scientific Computation, Zhejiang University, Zhejiang 310027, PR China
Yao Zheng
Affiliation:
Center for Engineering and Scientific Computation, Zhejiang University, Zhejiang 310027, PR China
*Email address for correspondence: [email protected]

Abstract

Active flow control based on reinforcement learning has received much attention in recent years. However, the substantial amount of trial-and-error data required to train reinforcement learning policies has been a significant impediment to their practical application, and it also limits the training of cross-case agents. This study proposes an in-context active flow control policy learning framework grounded in reinforcement learning data. A transformer-based policy improvement operator models the reinforcement learning process as a causal sequence and, given a sufficiently long context, autoregressively produces actions for new, unseen cases. On flow separation problems, the framework successfully learns and applies efficient flow control strategies across various airfoil configurations. Because this learning mode requires no updates to the network parameters, it is even more efficient than conventional reinforcement learning. The study thus presents an effective novel technique that uses a single transformer model to address active flow control of flow separation on different airfoils. It also provides an innovative demonstration of coupling reinforcement-learning-based flow control with aerodynamic shape optimization, yielding a combined performance gain. The method substantially lessens the burden of training a new flow control policy during shape optimization and opens a promising avenue for the interdisciplinary, intelligent co-design of future vehicles.
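To make the causal-sequence idea in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of a decision-transformer-style in-context policy: a causal transformer reads an interleaved history of (state, action, reward) tokens and autoregressively predicts the next control action, with no gradient updates at deployment. All class names, dimensions and the token layout here are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn


class InContextPolicy(nn.Module):
    """Causal transformer over interleaved (state, action, reward) tokens.

    Hypothetical sketch: sizes, names and token ordering are illustrative,
    not the architecture used in the paper.
    """

    def __init__(self, state_dim, action_dim, d_model=128,
                 n_head=4, n_layer=4, max_steps=512):
        super().__init__()
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(action_dim, d_model)
        self.embed_reward = nn.Linear(1, d_model)
        self.pos = nn.Embedding(3 * max_steps, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_head, dim_feedforward=4 * d_model, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layer)
        self.head = nn.Linear(d_model, action_dim)

    def forward(self, states, actions, rewards):
        # states: (B, T, state_dim), actions: (B, T, action_dim),
        # rewards: (B, T, 1) -- a context of past control experience.
        B, T, _ = states.shape
        tokens = torch.stack(
            [self.embed_state(states),
             self.embed_action(actions),
             self.embed_reward(rewards)], dim=2).reshape(B, 3 * T, -1)
        tokens = tokens + self.pos(torch.arange(3 * T, device=tokens.device))
        # Causal mask: each token attends only to the past, so the whole
        # learning history is treated as one autoregressive sequence.
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.backbone(tokens, mask=mask.to(tokens.device))
        # Predict each action from the hidden state of its state token.
        return self.head(h[:, 0::3, :])
```

At deployment on an unseen case, one would append each new observation to the context and take the latest predicted action; adaptation then happens purely through the growing context, which is what removes the need for parameter updates.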

Type
JFM Papers
Copyright
© The Author(s), 2024. Published by Cambridge University Press

