This paper evaluates three diagnostic wind models by direct comparison with measured wind field data. The models are the California Meteorological Model (CALMET), the Mass Consistent model (MCSCIPUF) associated with the Second Order Closure Integrated Puff (SCIPUFF) transport and dispersion model, and the Stationary Wind Field and Turbulence (SWIFT) model. The evaluation follows previous work by Chang, Franzese & Hanna, who compared the same three models, and by Bradley & Mazzola, who evaluated SWIFT coupled with SCIPUFF. Like SWIFT, MCSCIPUF is incorporated in the Hazard Prediction and Assessment Capability (HPAC), while CALMET is linked with the California Puff model (CALPUFF), another transport and dispersion model. The wind data come from the Dipole Pride 26 (DP26) experiments, performed at the US Department of Energy (DOE) Nevada Test Site, which provide a comprehensive set of meteorological data covering a wide range of atmospheric stability conditions over complex terrain. Model calculations were compared with measured data in two phases. In the first phase (the 8M phase), complete data sets from eight locations were used as model inputs, which tests the ability of the models to reproduce their input conditions. In the second phase (the 3M phase), five of the measured wind sites were withheld from the input and used instead to validate the model calculations. In the first phase, the errors were, with some exceptions, quite small. In the second phase, mean absolute errors were of the order of 1 m s−1 in wind speed and 30° in wind direction, with only small differences in performance among the models.
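As an illustration of the error measure quoted above, the following is a minimal sketch of how mean absolute errors in wind speed and wind direction might be computed at withheld sites. It is not taken from the paper: the function name `mean_absolute_errors`, the (speed, direction) pair layout, and the choice to take the direction difference as the smaller angle around the compass are all assumptions made for this example.

```python
def mean_absolute_errors(predicted, observed):
    """Return (speed MAE in m/s, direction MAE in degrees).

    predicted, observed: equal-length sequences of
    (speed_m_per_s, direction_deg) pairs. This layout is an
    assumption for illustration, not the paper's data format.
    """
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    speed_total = 0.0
    direction_total = 0.0
    for (p_speed, p_dir), (o_speed, o_dir) in zip(predicted, observed):
        speed_total += abs(p_speed - o_speed)
        # Assumed convention: use the smaller angle between the two
        # directions, so that 350 deg vs 10 deg counts as 20 deg, not 340.
        diff = abs(p_dir - o_dir) % 360.0
        direction_total += min(diff, 360.0 - diff)

    n = len(observed)
    return speed_total / n, direction_total / n


# Hypothetical values for two withheld sites (not DP26 data).
predicted = [(3.0, 350.0), (5.0, 10.0)]
observed = [(4.0, 10.0), (5.0, 350.0)]
speed_mae, direction_mae = mean_absolute_errors(predicted, observed)
```

The wrap-around handling matters for direction: a plain absolute difference would overstate the error whenever predicted and observed directions fall on opposite sides of north.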