This paper studies deep learning approaches for finding optimal reinsurance and dividend strategies for insurance companies. Because the control process terminates at the random time of financial ruin, we develop a Markov chain approximation-based iterative deep learning algorithm for this type of infinite-horizon optimal control problem. The optimal controls are approximated by deep neural networks for both regular and singular dividend strategies. The Markov chain approximation framework plays a key role in constructing the iterative equations and in initializing the algorithm. We apply our method to classical dividend and reinsurance problems and compare the learned results with existing analytical solutions. The feasibility of the method for more complicated problems is demonstrated by applying it to an optimal dividend, reinsurance, and investment problem under a high-dimensional diffusion model with jumps and regime switching.