Music videos are a popular form of entertainment. Indexing and retrieval based on the affective cues contained in music videos is becoming increasingly attractive to users, and music video affective analysis and understanding is one of the most active topics in the multimedia community. In this paper, we propose a novel feature importance analysis approach to select the most representative features for arousal and valence modeling. Compared with the state-of-the-art work by Zhang on music video affective analysis, our main contributions are the following: (1) Three additional affect-related features are extracted to enrich the feature set, and their correlation with arousal and valence is exploited. (2) All extracted features are ranked by feature importance analysis, and an optimal feature subset is then selected from the ranking. (3) Different regression methods are compared for arousal and valence modeling in order to find the best-fitting estimation function. Our method achieves 33.39% and 42.17% reductions in mean absolute error compared with Zhang's method. Experimental results demonstrate that the proposed method considerably improves music video affective understanding.
2010 ACM International Conference on Image and Video Retrieval (CIVR'10). Proceedings of the 2010 ACM International Conference on Image and Video Retrieval (Xi'an, China, 5-7 July 2010), pp. 213-219
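The ranking, subset selection, and error comparison steps named in the abstract can be illustrated with a minimal sketch. The abstract does not state which importance measure is used, so the absolute Pearson correlation below is only an assumed stand-in, and every function name and the `evaluate` callback are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_features(features, target):
    """Order feature names by an importance score, highest first.

    ASSUMPTION: importance is |Pearson r| with the target (arousal or valence).
    The paper's actual importance measure is not given in the abstract.
    `features` maps a feature name to its list of per-video values.
    """
    scores = {name: abs(pearson(vals, target)) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


def select_best_prefix(ranked, evaluate):
    """Pick the top-k prefix of the ranking with the lowest error.

    `evaluate` is a hypothetical callback that trains some regressor on the
    given feature subset and returns its validation mean absolute error.
    """
    best_k = min(range(1, len(ranked) + 1), key=lambda k: evaluate(ranked[:k]))
    return ranked[:best_k]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between ground truth and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mae_reduction_percent(baseline_mae, new_mae):
    """Relative reduction in MAE with respect to a baseline, in percent."""
    return 100.0 * (baseline_mae - new_mae) / baseline_mae
```

Under this reading, comparing regression methods amounts to calling `select_best_prefix` with a different `evaluate` callback per regressor and keeping the one with the lowest error, and a figure such as the reported reductions would correspond to `mae_reduction_percent` applied to the baseline and proposed errors.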