Published online by Cambridge University Press: 10 June 2020
To improve speech emotion recognition, a U-AWED feature model is proposed, built on an acoustic words emotion dictionary (AWED). The method models emotional information at the acoustic-word level within each emotion class. The top-ranked words for each emotion are selected to generate the AWED vector. The U-AWED model is then constructed by combining utterance-level acoustic features with the AWED features. A support vector machine and a convolutional neural network are employed as classifiers in our experiments. The results show that the proposed method provides significant improvement in unweighted average recall on all four emotion classification tasks.
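The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: acoustic words are represented as integer codebook indices, words are ranked by raw frequency within each emotion (the paper may use a different ranking criterion), and the AWED vector simply counts dictionary words in an utterance. All function names (`build_dictionary`, `awed_vector`, `u_awed_features`, `unweighted_average_recall`) are hypothetical. Unweighted average recall is computed in its standard sense, as the mean of the per-class recalls.

```python
from collections import Counter


def build_dictionary(utterances, labels, top_k):
    """Assumed selection rule: keep the top_k most frequent acoustic words per emotion."""
    counts = {}
    for words, label in zip(utterances, labels):
        counts.setdefault(label, Counter()).update(words)
    return {
        emotion: [word for word, _ in counter.most_common(top_k)]
        for emotion, counter in sorted(counts.items())
    }


def awed_vector(words, dictionary):
    """Count how often each dictionary word occurs in one utterance (assumed encoding)."""
    bag = Counter(words)
    return [bag[word] for emotion in sorted(dictionary) for word in dictionary[emotion]]


def u_awed_features(utterance_features, words, dictionary):
    """Concatenate utterance-level acoustic features with the AWED vector."""
    return list(utterance_features) + awed_vector(words, dictionary)


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class weighs equally regardless of size."""
    recalls = []
    for cls in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == cls)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        recalls.append(hits / total)
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy data: each utterance is a sequence of acoustic-word indices (illustrative only).
    train_words = [[1, 1, 2], [1, 3], [4, 4, 4, 5], [4, 6]]
    train_labels = ["angry", "angry", "sad", "sad"]
    dictionary = build_dictionary(train_words, train_labels, top_k=1)
    print(dictionary)
    print(u_awed_features([0.5, 0.1], [1, 4, 4, 7], dictionary))
    print(unweighted_average_recall(["a", "a", "a", "b"], ["a", "a", "b", "b"]))
```

The combined feature vectors would then be passed to a classifier such as a support vector machine. Because unweighted average recall averages over classes rather than over samples, it is not inflated by good performance on a majority emotion class.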