This repository collects information about different data sets for Music Emotion Recognition.
Complementary website for the article in IEEE Signal Processing Magazine, 38(6), 2021.
-- Juan Sebastián Gómez-Cañón, Estefanía Cano, Tuomas Eerola, Perfecto Herrera, Xiao Hu, Yi-Hsuan Yang, and Emilia Gómez
In this paper, we review the challenges and limitations of Music Emotion Recognition (MER), an interdisciplinary research area that characterizes music in terms of emotion. MER analyzes music in order to computationally predict the emotions perceived by, or induced in, a listener. Our aims are (1) to provide insights into the typical approaches currently used in the MER workflow, and (2) to indicate how previous research points to specific future directions.
We propose a new user-centric conceptualization framework for MER, highlighting where future researchers should focus efforts: (1) open data and experimental reproducibility, (2) inherent subjectivity of concepts and annotations, (3) model explainability and interpretability, (4) cultural and contextual relevance, and (5) ethical implications for MER applications.
This website offers a detailed overview of music and emotion datasets and an extended bibliography.
Dataset | Year | Content | Format | Size | Type | Perceived/Induced |
---|---|---|---|---|---|---|
MoodsMIREX | 2007 | 269 excerpts (30s long) | MP3 | 736MB | Categorical (5 mood clusters) | Perceived |
CAL500 | 2007 | 500 full songs | MP3 | 366MB | Categorical (174 labels) | Perceived |
Yang-Dim | 2008 | 195 excerpts (25s long) | - | - | Dimensional | Perceived |
MoodSwings | 2008 | 240 excerpts (15s long) | - | - | Dimensional (Time-continuous A-V) | Perceived |
NTWICM | 2010 | 2648 full songs | MP3 | 11.7GB | "Discrete" Dimensional | Perceived |
Soundtracks | 2011 | 360+110 excerpts (15s-1m long) | MP3 | 216MB | Categorical (tension, anger, fear, happy, sad, tender) and Dimensional (valence, energy, tension) | Perceived |
DEAP | 2012 | 120 excerpts (60s long) | Links | - | Dimensional | Induced |
AMG1608 | 2015 | 1608 excerpts (30s long) | WAV | 4.3GB | "Discrete" Dimensional | Perceived |
Emotify | 2016 | 400 excerpts (60s long) | MP3 | 363MB | Categorical (GEMS) | Induced |
Moodo | 2016 | 200 tracks (15 seconds) | WAV | Perceived Color | "Discrete" Dimensional | Perceived |
CH818 | 2017 | 818 excerpts (30s long) | MP3 | 393MB | Dimensional | Perceived |
4Q-emotion | 2018 | 900 excerpts (30s long) | MP3 | 291MB | Categorical (Quadrants) | Perceived |
DEAM/Mediaeval | 2018 | 2058 excerpts (45s long) | MP3 | 1.4GB | Dimensional (Time-continuous A-V) | Perceived |
PMEmo | 2018 | 794 full songs | MP3 | 1.3GB | Dimensional (Time-continuous A-V) | Induced |
Jamendo Moods and Themes | 2019 | 18486 full songs | MP3 | 152GB | Categorical | Perceived |
VGMIDI | 2019 | 200 MIDI files | MIDI | 1.37GB | Dimensional | Perceived |
CCMED-WCMED | 2020 | 800 excerpts (8-20s long) | WAV | - | "Discrete" Dimensional | Perceived |
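
For readers who want to filter the overview programmatically, the following is a minimal sketch of how rows of the table above could be parsed into records and selected by the Perceived/Induced column. The `DatasetEntry` type, its field names, and the `parse_rows` helper are hypothetical names introduced only for this illustration; they are not part of this repository. Only four rows are copied in to keep the example short.

```python
from dataclasses import dataclass

# A few rows copied from the table above (pipe-separated, as in this README).
TABLE_ROWS = """\
DEAP | 2012 | 120 excerpts (60s long) | Links | - | Dimensional | Induced |
Emotify | 2016 | 400 excerpts (60s long) | MP3 | 363MB | Categorical (GEMS) | Induced |
4Q-emotion | 2018 | 900 excerpts (30s long) | MP3 | 291MB | Categorical (Quadrants) | Perceived |
PMEmo | 2018 | 794 full songs | MP3 | 1.3GB | Dimensional (Time-continuous A-V) | Induced |
"""


@dataclass
class DatasetEntry:
    """Hypothetical record type mirroring the seven table columns."""
    name: str
    year: int
    content: str
    audio_format: str
    size: str
    annotation_type: str
    emotion: str  # "Perceived" or "Induced"


def parse_rows(text):
    """Turn pipe-separated table rows into DatasetEntry records."""
    entries = []
    for line in text.strip().splitlines():
        # Drop the trailing pipe, then split into cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip header, separator, or malformed rows.
        if len(cells) != 7 or not cells[1].isdigit():
            continue
        name, year, content, fmt, size, ann, emo = cells
        entries.append(DatasetEntry(name, int(year), content, fmt, size, ann, emo))
    return entries


entries = parse_rows(TABLE_ROWS)
induced = [e.name for e in entries if e.emotion == "Induced"]
print(induced)  # ['DEAP', 'Emotify', 'PMEmo']
```

The same pattern could be applied to the Type column, for example to separate categorical from dimensional annotation schemes.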
[1] Laurier, C., & Herrera, P. (2009). Automatic detection of emotion in music: Interaction with emotionally sensitive machines. Handbook of Research on Synthetic Emotions and Sociable Robotics: New Applications in Affective Computing and Artificial Intelligence, pp. 9–33.
[2] Kim, Y. E., et al. (2010). Music Emotion Recognition: A state of the art review. Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), pp. 255–266.
[3] Yang, Y.-H., & Chen, H. H. (2012). Machine Recognition of Music Emotion: A Review. ACM Transactions on Intelligent Systems and Technology, 3.
[4] Yang, X., Dong, Y., & Li, J. (2018). Review of data features-based music emotion recognition methods. Multimedia Systems, 24(4), pp. 365–389.
[1] Balkwill, L. & Thompson, W. F. (1999). A Cross-Cultural Investigation of the Perception of Emotion in Music: Psychophysical and Cultural Cues. Music Perception, 17(1), pp. 43-64.
[2] Krumhansl, C., et al. (2000). Cross-cultural music cognition: cognitive methodology applied to North Sami yoiks. Cognition, 76(1), pp. 13-58.
[3] Juslin, P. & Laukka, P. (2004). Expression, Perception, and Induction of Musical Emotions: A Review and a Questionnaire Study of Everyday Listening. Journal of New Music Research, 33(3), pp. 217-238.
[4] Laukka, P. (2005). Categorical perception of vocal emotion expressions. Emotion, 5, pp. 277–295.
[5] Gabrielsson, A. (2006). Emotion perceived and emotion felt: Same and different. Musicae Scientiae, 10(2), pp. 191–213.
[6] Henrich, J., Heine, S.J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33, pp. 61–135.
[7] Vuoskoski, J. & Eerola, T. (2011). Measuring music-induced emotion: A comparison of emotion models, personality biases, and intensity of experiences. Musicae Scientiae, 15(2), pp. 159-173.
[8] Eerola, T. (2011). Are the emotions expressed in music genre-specific? An audio-based evaluation of datasets spanning classical, film, pop and mixed genres. Journal of New Music Research, 40, pp. 349–366.
[9] Coutinho, E., & Dibben, N. (2013). Psychoacoustic cues to emotion in speech prosody and music. Cognition and Emotion, 27(4), pp. 658–684.
[10] Argstatter, H. (2015). Perception of basic emotions in music: Culture-specific or multicultural? Psychology of Music, 44(4), pp. 674-690.
[11] Song, Y., et al. (2016). Perceived and Induced Emotion Responses to Popular Music: Categorical and Dimensional Models. Music Perception, 33(4), pp. 472-492.
[12] Cespedes-Guevara, J., & Eerola, T. (2018). Music communicates affects, not basic emotions - A constructionist account of attribution of emotional meanings to music. Frontiers in Psychology, 9, pp. 1–19.
[13] Keltner, D., et al. (2019). What Basic Emotion Theory Really Says for the Twenty-First Century Study of Emotion. Journal of Nonverbal Behavior, 43(2), pp. 195–201.
[14] Warrenburg, L. A. (2020). Choosing the right tune: A review of music stimuli used in emotion research. Music Perception, 37(3), pp. 240–258.
[15] Micallef Grimaud, A. & Eerola, T. (2021). EmoteControl: an interactive system for real-time control of emotional expression in music. Personal and Ubiquitous Computing, 25, pp. 677–689.
[1] Schuller, B., et al. (2010). Mister D.J. Cheer Me Up!: Musical and textual features for automatic mood classification. Journal of New Music Research, 39(1).
[2] Barthet, M., Fazekas, G., & Sandler, M. (2012). Music Emotion Recognition: From content- to context-based models. In: Aramaki M., Barthet M., Kronland-Martinet R., Ystad S. (eds) From Sounds to Music and Emotions. CMMR 2012. Lecture Notes in Computer Science, vol 7900. Springer, Berlin, Heidelberg.
[3] Soleymani, M., et al. (2012). A multimodal database for affect recognition and implicit tagging. IEEE Transactions on Affective Computing, 3(1), pp. 42-55.
[4] Soleymani, M., Aljanaki, A., & Yang, Y.-H. (2014). Emotional Analysis of Music: A Comparison of Methods. Proceedings of the 22nd ACM International Conference on Multimedia, pp. 1161–1164.
[5] Wang, J.C., et al. (2015). Modeling the Affective Content of Music with a Gaussian Mixture Model. IEEE Transactions on Affective Computing, 6(1).
[6] Saari, P., et al. (2016). Genre-Adaptive Semantic Computing and Audio-Based Modelling for Music Mood Annotation. IEEE Transactions on Affective Computing, 7(2).
[7] Coutinho, E., & Schuller, B. (2017). Shared acoustic codes underlie emotional communication in music and speech: Evidence from deep transfer learning. PLoS ONE, 12(6).
[8] Yang, S., et al. (2017). Multi-scale Analysis of Agreement Levels in Perceived Emotion Ratings During Live Performance. Late-Breaking/Demo of the 18th International Society for Music Information Retrieval Conference (ISMIR), Suzhou, China.
[9] Cancino-Chacón, C.E., et al. (2018). Computational Models of Expressive Music Performance: A Comprehensive and Critical Review. Frontiers in Digital Humanities, 5.
[10] Korzeniowski, F., et al. (2020). Mood Classification using Listening Data. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR). Virtual.
[11] Chaki, S., et al. (2020). Explaining perceived emotion predictions in music: an attentive approach. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR). Virtual.
[12] Hung, H.T., et al. (2021). EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR). Virtual.
[13] Yang, S., et al. (2021). Examining emotion perception agreement in live music performance. IEEE Transactions on Affective Computing.
[14] Dufour, I. & Tzanetakis, G. (2021). Using circular models to improve music emotion recognition. IEEE Transactions on Affective Computing.
[1] Laurier, C., et al. (2008). Multimodal music mood classification using audio and lyrics. In Proceedings of the 7th International Conference on Machine Learning and Applications, San Diego, USA.
[2] Hu, X. & Downie, J.S. (2010). When lyrics outperform audio for music mood classification: a feature analysis. In Proceedings of the 11th International Society for Music Information Retrieval Conference (ISMIR), Utrecht, The Netherlands, pp. 619-624.
[3] Kermanidis, K., et al. (2014). Combining Language Modeling and LSA on Greek Song "Words" for Mood Classification. International Journal on Artificial Intelligence Tools, 23(2).
[4] Çano, E. & Morisio, M. (2017). MoodyLyrics: A Sentiment Annotated Lyrics Dataset. In Proceedings of the International Conference on Intelligent Systems, Metaheuristics and Swarm Intelligence (ISMI), pp. 118-124.
[5] Delbouys, R., et al. (2018). Music Mood Detection Based on Audio and Lyrics with Deep Neural Net. In Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR). Paris, France.
[6] Malheiro, R., et al. (2018). Emotionally-Relevant Features for Classification and Regression of Music Lyrics. IEEE Transactions on Affective Computing, 9(2).
[1] Aljanaki, A., Wiering, F., & Veltkamp, R. C. (2015). Studying emotion induced by music through a crowdsourcing game. Information Processing and Management, 52, pp. 115–128.
[2] Çano, E. (2017). Crowdsourcing emotions in music domain. International Journal of Artificial Intelligence and Applications, 8(4), pp. 24-40.
[3] Gómez-Cañón, J.S., et al. (2020). Joyful for you and tender for us: the influence of individual characteristics on emotion labeling and classification. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR). Montréal, Canada (virtual).
[4] Gómez-Cañón, J.S., et al. (2020). Improving emotion annotation of music using citizen science. In Proceedings of the 16th International Conference on Music Perception and Cognition (ICMPC/ESCOM). Sheffield, United Kingdom (virtual).
[5] Gutiérrez-Páez, N., et al. (2021). Emotion annotation of music: a citizen science approach. In Proceedings of the 27th International Conference on Collaboration Technologies and Social Computing (CollabTech). Trier, Germany (virtual).
[1] Jackson, J. C., et al. (2019). Emotion semantics show both cultural variation and universal structure. Science, 366(6472), pp. 1517–1522.
[2] Hu, X., & Yang, Y.-H. (2017). The Mood of Chinese Pop Music: Representation and Recognition. Journal of the Association for Information Science and Technology, 68(8), pp. 1899–1910.
[3] Patra, B.G., et al. (2018). Multimodal mood classification of Hindi and Western songs. Journal of Intelligent Information Systems, 51, pp. 579-596.
[4] Sangnark, S., et al. (2019). Thai music emotion recognition based on western music. Proceedings of the 11th International Conference on Computer and Electrical Engineering. Tokyo, Japan.
[5] Fan, J., et al. (2020). A Comparative Study of Western and Chinese Classical Music based on Soundscape Models. In Proceedings of the 45th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Barcelona, Spain.
[6] Gómez-Cañón, J.S., et al. (2020). Transfer learning from speech to music: towards language-sensitive emotion recognition models. In Proceedings of the 28th European Signal Processing Conference (EUSIPCO). Amsterdam, The Netherlands (virtual).
[7] Pandrea, A.G., et al. (2020). Cross-Dataset Music Emotion Recognition: an End-to-End Approach. Late-Breaking/Demo of the 21st International Society for Music Information Retrieval Conference (ISMIR). Montréal, Canada (virtual).
[8] Gómez-Cañón, J.S., et al. (2021). Language-sensitive Music Emotion Recognition models: are we really there yet? In Proceedings of the 46th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, Canada (virtual).
[1] Yang, Y.-H., et al. (2007). Music Emotion Recognition: The Role of Individuality. Proceedings of the International Workshop on Human-Centered Multimedia, pp. 13–22.
[2] Yang, Y.-H. & Liu, J.-Y. (2013). Quantitative Study of Music Listening Behavior in a Social and Affective Context. IEEE Transactions on Multimedia, 15(6), pp. 1304-1315.
[3] Yang, Y.-H. & Liu, J.-Y. (2015). Quantitative Study of Music Listening Behavior in a Smartphone Context. ACM Transactions on Interactive Intelligent Systems, 5(3).
[4] Wang, J.C., et al. (2017). Affective Music Information Retrieval. In: Tkalčič M., et al. (eds) Emotions and Personality in Personalized Services. Human–Computer Interaction Series. Springer.
[5] Gómez-Cañón, J.S., et al. (2021). Let’s agree to disagree: consensus entropy active learning for personalized music emotion recognition. In Proceedings of the 22nd International Society for Music Information Retrieval Conference (ISMIR). Virtual.
[1] Meyer, L. B. (1961). Emotion and Meaning in Music. University of Chicago Press.
[2] Budd, M. (1992). Music and the Emotions. Routledge.
[3] Picard, R.W. (1997). Affective Computing. MIT Press.
[4] Huron, D. (2006). Sweet Anticipation. MIT Press.
[5] Patel, A. D. (2008). Music, Language and the Brain. Oxford University Press.
[6] Juslin, P. & Sloboda, J. (2010). Handbook of Music and Emotion: Theory, Research, Applications. Oxford University Press.
[7] Krippendorff, K. H. (2004). Content Analysis: An Introduction to Its Methodology. SAGE Publications.
[8] Hallam, S., Cross, I., & Thaut, M. (2016). The Oxford Handbook of Music Psychology. Oxford University Press.
[9] Grekow, J. (2018). From Content-based Music Emotion Recognition to Emotion Maps of Musical Pieces. Springer Nature.
[10] Honing, H. (2018). The origins of musicality. MIT Press.
[1] Coutinho, E. (2008). Computational and Psycho-Physiological Investigations of Musical Emotions. University of Plymouth.
[2] Hu, X. (2010). Improving music mood classification using lyrics, audio and social tags. University of Illinois.
[3] Yang, Y.-H. (2010). Dimensional Music Emotion Recognition for Content Retrieval. National Tsing Hua University.
[4] Laurier, C. (2011). Automatic Classification of Musical Mood by Content-Based Analysis. Universitat Pompeu Fabra.
[5] Vuoskoski, J.K. (2012). Emotions represented and induced by music: the role of individual differences. University of Jyväskylä.
[6] Schmidt, E. (2012). Modeling and Predicting Emotion in Music. Drexel University.
[7] Aljanaki, A. (2016). Emotion in Music: representation and computational modeling. Universiteit Utrecht.
[8] Malheiro, R. (2016). Emotion-based Analysis and Classification of Music Lyrics. Universidade de Coimbra.
[9] Song, Y. (2016). The Role of Emotion and Context in Musical Preference. Queen Mary University of London.
[10] Barradas, G. T. (2017). A Cross-Cultural Approach to Psychological Mechanisms Underlying Emotional Reactions to Music. Uppsala Universitet.
[11] Çano, E. (2018). Text-based Sentiment Analysis and Music Emotion Recognition. Politecnico di Torino.
[12] Panda, R. (2019). Emotion-based Analysis and Classification of Audio Music Emotion. Universidade de Coimbra.
[13] Fan, J. (2020). Advances in Soundscape and Music Emotion Recognition. Simon Fraser University.
[14] Yang, S. (2020). Understanding Agreement and Disagreement in Listeners’ Perceived Emotion in Live Music Performance. Queen Mary University of London.
[15] Dufour, I. (). .
@article{GomezCanon2021SPM,
author = {Gómez-Cañón, Juan Sebastián and
Cano, Estefanía and
Eerola, Tuomas and
Herrera, Perfecto and
Hu, Xiao and
Yang, Yi-Hsuan and
Gómez, Emilia},
title = {{Music Emotion Recognition: Toward new, robust standards in personalized and context-sensitive applications}},
journal = {IEEE Signal Processing Magazine},
volume = {38},
number = {6},
year = {2021},
pages = {106--114},
doi = {10.1109/MSP.2021.3106232}
}