Movie affective content detection attracts ever-increasing research effort. However, affective content analysis remains a challenging task due to the gap between low-level perceptual features and high-level human perception of the media. Moreover, clues from multiple modalities should be considered in affective analysis, since movies use them jointly to convey emotion and render an emotional atmosphere. In this paper, mid-level representations are generated from low-level features; these mid-level representations span multiple modalities and are used for affective content inference. Besides video shots, which are commonly used for video content analysis, audio sounds, dialogues, and subtitles are explored for their contribution to affective content detection. Since affective analysis depends on movie genre, experiments are conducted within respective genres. The results show that audio sounds, dialogues, and subtitles are effective and efficient for affective content detection.
Proceedings of the 1st International Conference on Internet Multimedia Computing and Service (ICIMCS 2009), Kunming, China, 23-25 November 2009, pp. 215-221.