Cognitive Diagnosis Models (CDMs) are a family of discrete latent variable models widely used in educational and psychological measurement. A key component of a CDM is the Q-matrix, which characterizes the dependence structure between the items and the latent attributes. In many applications, researchers also assume a hierarchical structure among the latent attributes to characterize their dependence. In most CDM applications, the attribute-attribute hierarchical structure, the item-attribute Q-matrix, the item-level diagnostic models, and the number of latent attributes must be fully or partially pre-specified; as many recent studies have noted, such specifications can be subjective and are prone to misspecification. This paper considers the problem of jointly learning these latent and hierarchical structures in CDMs from observed data under minimal model assumptions. Specifically, a penalized likelihood approach is proposed that selects the number of attributes and estimates the latent and hierarchical structures simultaneously. An expectation-maximization (EM) algorithm is developed for efficient computation, and statistical consistency is established under mild conditions. The good performance of the proposed method is demonstrated through simulation studies and real data applications in educational assessment.
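
To make the two structures named above concrete, the following is a minimal illustrative sketch, not the proposed estimation method. It assumes a hypothetical Q-matrix with three items and three attributes, a hypothetical linear attribute hierarchy, and a DINA-type conjunctive ideal response (one common item-level diagnostic model, chosen here only for illustration). The names `Q`, `HIERARCHY`, `permissible_patterns`, and `ideal_response` are assumptions introduced for this example.

```python
from itertools import product

# Hypothetical Q-matrix: rows are items, columns are latent attributes.
# Q[j][k] = 1 means item j requires attribute k.
Q = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]

# Hypothetical linear hierarchy, given as (prerequisite, dependent) pairs:
# attribute 0 is a prerequisite of attribute 1, and 1 of attribute 2.
HIERARCHY = [(0, 1), (1, 2)]


def permissible_patterns(num_attributes, hierarchy):
    """Return all binary attribute patterns that respect the hierarchy."""
    patterns = []
    for alpha in product((0, 1), repeat=num_attributes):
        # A dependent attribute can be mastered only if its prerequisite is.
        if all(alpha[child] <= alpha[parent] for parent, child in hierarchy):
            patterns.append(alpha)
    return patterns


def ideal_response(alpha, q_row):
    """DINA-type ideal response: 1 iff every required attribute is mastered."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


patterns = permissible_patterns(3, HIERARCHY)
ideal = {alpha: [ideal_response(alpha, row) for row in Q] for alpha in patterns}

for alpha, responses in ideal.items():
    print(alpha, responses)
```

Under these assumptions, the hierarchy reduces the 2^3 = 8 possible attribute patterns to 4 permissible ones, and the Q-matrix determines which items each permissible pattern can answer correctly in the noise-free case. Pre-specifying either structure incorrectly changes this set of patterns and responses, which is the source of the misspecification concern raised above.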