Despite the recognized importance of datasets in data-driven design approaches, the datasets themselves have not been studied extensively. We review the current landscape of design datasets and highlight the ongoing need for larger and more comprehensive datasets. We identify three categories of challenges in dataset development. Our analyses reveal critical dataset gaps in the design process toward which future studies can be directed. We suggest synthetic and end-to-end datasets as two less-explored avenues. Recent applications of Generative Pretrained Transformers (GPTs) show their potential in addressing these needs.