Safety is an essential requirement, as well as a major bottleneck, for deploying legged robots in the real world. For learning-based methods in particular, their trial-and-error nature and unexplainable policies have raised widespread concerns. Existing methods usually treat this challenge as a trade-off between safety assurance and task performance. One cause of this trade-off is the inaccurate inference of the robot's safety. In this paper, we re-examine how the robot's state space is segmented in terms of safety. Based on the current state and the predicted state-transition trajectory, the states of a legged robot are classified as safe, recoverable, unsafe, or failure, and a safety verification method is introduced to infer the robot's safety online. Task, recovery, and fall-protection policies are then trained to ensure the robot's safety in the different states, forming a safety supervision framework that is independent of the learning algorithm. To validate the proposed method and framework, experiments are conducted both in simulation and on a real-world robot, and the results indicate improvements in both safety and efficiency.
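The supervision idea summarized above can be illustrated with a minimal Python sketch. It is not the method of this paper: the classification rule, the mapping from safety states to policies, and the names `SafetyState`, `classify_state`, and `select_policy` are all hypothetical and chosen only for illustration. The sketch assumes that some external predictor has already labeled the current state and each predicted future state as inside or outside a safe region.

```python
from enum import Enum
from typing import Optional, Sequence


class SafetyState(Enum):
    SAFE = "safe"
    RECOVERABLE = "recoverable"
    UNSAFE = "unsafe"
    FAILURE = "failure"


def classify_state(failed: bool, predicted_safe: Sequence[bool]) -> SafetyState:
    """Toy classification rule (an assumption, not the paper's verification method).

    failed:         True if the robot has already failed (e.g. fallen).
    predicted_safe: flags for the current state (index 0) and each predicted
                    future state; True means "inside the safe region".
    """
    if failed:
        return SafetyState.FAILURE
    if all(predicted_safe):
        # The whole predicted trajectory stays in the safe region.
        return SafetyState.SAFE
    if predicted_safe[-1]:
        # The trajectory leaves the safe region but ends back inside it.
        return SafetyState.RECOVERABLE
    # The trajectory does not return to the safe region within the horizon.
    return SafetyState.UNSAFE


# Hypothetical mapping from safety state to the policy that takes control.
_POLICY_FOR_STATE = {
    SafetyState.SAFE: "task",
    SafetyState.RECOVERABLE: "recovery",
    SafetyState.UNSAFE: "fall_protection",
}


def select_policy(state: SafetyState) -> Optional[str]:
    """Return the name of the policy to run, or None once failure has occurred."""
    return _POLICY_FOR_STATE.get(state)


if __name__ == "__main__":
    state = classify_state(failed=False, predicted_safe=[True, False, True])
    print(state.value, select_policy(state))
```

Because the supervisor in this sketch only inspects safety labels and returns a policy name, it does not depend on how any of the policies were trained, which mirrors the independence from the learning algorithm described above.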