This repo contains implementations of different architectures for emotion recognition in conversations.
For those enquiring about how to extract visual and audio features, please check this out: https://github.com/soujanyaporia/MUStARD
Date | Announcements |
---|---|
10/03/2024 | If you are interested in IQ testing LLMs, check out our new work: AlgoPuzzleVQA |
03/08/2021 | 🎆 🎆 We have released a new dataset M2H2: A Multimodal Multiparty Hindi Dataset For Humor Recognition in Conversations. Check it out: M2H2. The baselines for the M2H2 dataset are created based on DialogueRNN and bcLSTM. |
18/05/2021 | 🎆 🎆 We have released a new repo containing models to solve the problem of emotion cause recognition in conversations. Check it out: emotion-cause-extraction. Thanks to Pengfei Hong for compiling this. |
24/12/2020 | 🎆 🎆 Interested in the topic of recognizing emotion causes in conversations? We have just released a dataset for this. Head over to https://github.com/declare-lab/RECCON. |
06/10/2020 | 🎆 🎆 New paper and SOTA in Emotion Recognition in Conversations. Refer to the directory COSMIC for the code. Read the paper -- COSMIC: COmmonSense knowledge for eMotion Identification in Conversations. |
30/09/2020 | New paper and baselines in utterance-level dialogue understanding have been released. Read our paper Utterance-level Dialogue Understanding: An Empirical Study. Fork the codes. |
26/07/2020 | New DialogueGCN code has been released. Please visit https://github.com/declare-lab/conv-emotion/tree/master/DialogueGCN-mianzhang. All credit goes to Mian Zhang (https://github.com/mianzhang/). |
11/07/2020 | Interested in reading the papers on ERC or related tasks such as sarcasm detection in conversations? We have compiled a comprehensive reading list for papers. Please visit https://github.com/declare-lab/awesome-emotion-recognition-in-conversations |
07/06/2020 | New state-of-the-art results for the ERC task will be released soon. |
07/06/2020 | The conv-emotion repo will be maintained on https://github.com/declare-lab/ |
22/12/2019 | Code for DialogueGCN has been released. |
11/10/2019 | New Paper: Conversational Transfer Learning for Emotion Recognition. |
09/08/2019 | New paper on Emotion Recognition in Conversation (ERC). |
06/03/2019 | Features and code to train DialogueRNN on the MELD dataset have been released. |
20/11/2018 | End-to-end versions of ICON and DialogueRNN have been released. |
COSMIC is the best-performing model in this repo. Please visit the links below to compare the models on different ERC datasets.
This repository contains implementations of several methods for emotion recognition in conversations, as well as algorithms for recognizing emotion causes in conversations.
Unlike other emotion detection models, these techniques consider the party states and inter-party dependencies when modeling the conversational context relevant to emotion recognition. The primary purpose of all these techniques is to pretrain an emotion detection model for empathetic dialogue generation.
Emotion recognition can be very useful for empathetic and affective dialogue generation.
These networks expect an emotion/sentiment label and speaker information for each utterance in a dialogue, for example:
Party 1: I hate my girlfriend (angry)
Party 2: you got a girlfriend?! (surprise)
Party 1: yes (angry)
However, the code can be adapted to tasks where only the preceding utterances are available as context, without their corresponding labels, and the goal is to label only the present/target utterance. For example, the context is
Party 1: I hate my girlfriend
Party 2: you got a girlfriend?!
the target is
Party 1: yes (angry)
where the target emotion is angry. Moreover, this code can also be molded to train the network in an end-to-end manner. We will soon push these useful changes.
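The two input settings described above can be sketched in Python as follows. This is only an illustration of the data layout, not code from this repository: the `Utterance` class and the `context_target_pairs` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Utterance:
    speaker: str
    text: str
    emotion: Optional[str] = None  # None when the label is not available


def context_target_pairs(
    dialogue: List[Utterance],
) -> List[Tuple[List[Utterance], Utterance]]:
    """Pair every utterance (the target) with its unlabeled preceding utterances."""
    pairs = []
    for i, target in enumerate(dialogue):
        # Copy the preceding utterances without their emotion labels.
        context = [Utterance(u.speaker, u.text) for u in dialogue[:i]]
        pairs.append((context, target))
    return pairs


# Fully labeled dialogue: one label and one speaker per utterance.
dialogue = [
    Utterance("Party 1", "I hate my girlfriend", "angry"),
    Utterance("Party 2", "you got a girlfriend?!", "surprise"),
    Utterance("Party 1", "yes", "angry"),
]

# Context-only setting: label just the last utterance.
context, target = context_target_pairs(dialogue)[-1]
print([u.text for u in context], "->", target.text, target.emotion)
```

In the first setting the whole `dialogue` list is the training example; in the second, each `(context, target)` pair is one example and only `target.emotion` is used as supervision.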
Methods | IEMOCAP W-Avg F1 | DailyDialog Macro F1 | DailyDialog Micro F1 | MELD W-Avg F1 (3-cls) | MELD W-Avg F1 (7-cls) | EmoryNLP W-Avg F1 (3-cls) | EmoryNLP W-Avg F1 (7-cls) |
---|---|---|---|---|---|---|---|
RoBERTa | 54.55 | 48.20 | 55.16 | 72.12 | 62.02 | 55.28 | 37.29 |
RoBERTa DialogueRNN | 64.76 | 49.65 | 57.32 | 72.14 | 63.61 | 55.36 | 37.44 |
RoBERTa COSMIC | 65.28 | 51.05 | 58.48 | 73.20 | 65.21 | 56.51 | 38.11 |
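For readers unfamiliar with the three averaging schemes named in the table above, here is a minimal pure-Python sketch of weighted-average, macro, and micro F1. It is not the evaluation code used in this repository; `f1_scores` is a hypothetical helper, and dataset-specific conventions (such as which classes are included in the average) are not modeled.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return weighted-average, macro, and micro F1 over all observed labels."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true instances per label
    per_class = {}
    tp_all = fp_all = fn_all = 0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
        tp_all, fp_all, fn_all = tp_all + tp, fp_all + fp, fn_all + fn
    # Macro: unweighted mean of per-class F1.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: per-class F1 weighted by the class support.
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    # Micro: F1 computed from the pooled counts.
    micro_denom = 2 * tp_all + fp_all + fn_all
    micro = 2 * tp_all / micro_denom if micro_denom else 0.0
    return {"weighted": weighted, "macro": macro, "micro": micro}


y_true = ["angry", "angry", "happy", "sad"]
y_pred = ["angry", "happy", "happy", "sad"]
scores = f1_scores(y_true, y_pred)
print(scores)
```

Weighted-average F1 favors frequent classes, while macro F1 treats every class equally, which is why the two can differ noticeably on imbalanced emotion datasets.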
COSMIC addresses the task of utterance-level emotion recognition in conversations using commonsense knowledge. It is a new framework that incorporates different elements of commonsense such as mental states, events, and causal relations, and builds upon them to learn interactions between interlocutors participating in a conversation. Current state-of-the-art methods often encounter difficulties in context propagation, emotion shift detection, and differentiating between related emotion classes. By learning distinct commonsense representations, COSMIC addresses these challenges and achieves new state-of-the-art results for emotion recognition on four different benchmark conversational datasets.
First download the RoBERTa and COMET features and keep them in the appropriate directories in COSMIC/erc-training. Then train and evaluate on the four datasets as follows:
python train_iemocap.py --active-listener
python train_dailydialog.py --active-listener --class-weight --residual
python train_meld.py --active-listener --attention simple --dropout 0.5 --rec_dropout 0.3 --lr 0.0001 --mode1 2 --classify emotion --mu 0 --l2 0.00003 --epochs 60
python train_meld.py --active-listener --class-weight --residual --classify sentiment
python train_emorynlp.py --active-listener --class-weight --residual
python train_emorynlp.py --active-listener --class-weight --residual --classify sentiment
Please cite the following paper if you find this code useful in your work.
COSMIC: COmmonSense knowledge for eMotion Identification in Conversations. D. Ghosal, N. Majumder, A. Gelbukh, R. Mihalcea, & S. Poria. Findings of EMNLP 2020.
TL-ERC is a transfer learning-based framework for ERC. It pre-trains a generative dialogue model and transfers context-level weights that include affective knowledge into the target discriminative model for ERC.
Set up an environment with Conda:
conda env create -f environment.yml
conda activate TL_ERC
cd TL_ERC
python setup.py
Download the dataset files for IEMOCAP and DailyDialog and store them in ./datasets/.
Download the pre-trained weights of HRED on Cornell and Ubuntu datasets and store them in ./generative_weights/.
[Optional]: To train new generative weights from dialogue models, refer to https://github.com/ctr4si/A-Hierarchical-Latent-Structure-for-Variational-Conversation-Modeling .
cd bert_model
python train.py --load_checkpoint=../generative_weights/cornell_weights.pkl --data=iemocap
Change cornell to ubuntu and iemocap to dailydialog for other dataset combinations.
Drop load_checkpoint to avoid initializing contextual weights.
See configs.py for the configuration options.
python iemocap_preprocess.py
Similarly for dailydialog.
Please cite the following paper if you find this code useful in your work.
Conversational transfer learning for emotion recognition. Hazarika, D., Poria, S., Zimmermann, R., & Mihalcea, R. (2020). Information Fusion.
DialogueGCN (Dialogue Graph Convolutional Network) is a graph neural network-based approach to ERC. We leverage self and inter-speaker dependency of the interlocutors to model conversational context for emotion recognition. Through the graph network, DialogueGCN addresses context propagation issues present in the current RNN-based methods. DialogueGCN is naturally suited for multi-party dialogues.
Note: PyTorch Geometric makes heavy use of CUDA atomic operations, which are a source of non-determinism. To reproduce the results reported in the paper, we recommend using the following command. Note that this script will run on the CPU. With this command, we obtained weighted average F1 scores of 64.67 on our machine and 64.44 in Google Colaboratory for the IEMOCAP dataset.
python train_IEMOCAP.py --base-model 'LSTM' --graph-model --nodal-attention --dropout 0.4 --lr 0.0003 --batch-size 32 --class-weight --l2 0.0 --no-cuda
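Running on the CPU avoids the non-determinism from CUDA atomic operations mentioned above; fixing the random seeds is a separate, complementary step. The sketch below is not taken from this repository's training script: `seed_everything` is a hypothetical helper, and it seeds NumPy and PyTorch only when those packages can be imported.

```python
import random


def seed_everything(seed: int = 0) -> None:
    """Seed the random number generators available in this environment."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import torch

        torch.manual_seed(seed)
    except ImportError:
        pass  # PyTorch is optional for this sketch


seed_everything(1234)
first = random.random()
seed_everything(1234)
second = random.random()
print(first == second)
```

Seeding alone does not guarantee identical results across machines or library versions, which is consistent with the slightly different scores reported above for the two environments.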
Please cite the following paper if you find this code useful in your work.
DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation. D. Ghosal, N. Majumder, S. Poria, N. Chhaya, & A. Gelbukh. EMNLP-IJCNLP (2019), Hong Kong, China.
PyTorch implementation of the paper "DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation".
You can run the whole process very easily. Take the IEMOCAP corpus for example:
./scripts/iemocap.sh preprocess
./scripts/iemocap.sh train
Implementation | Dataset | Weighted F1 |
---|---|---|
Original | IEMOCAP | 64.18% |
This Implementation | IEMOCAP | 64.10% |
Mian Zhang (Github: mianzhang)
Please cite the following paper if you find this code useful in your work.
DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation. D. Ghosal, N. Majumder, S. Poria, N. Chhaya, & A. Gelbukh. EMNLP-IJCNLP (2019), Hong Kong, China.
DialogueRNN is a customized recurrent neural network (RNN) that profiles each speaker in a conversation/dialogue on the fly while modeling the context of the conversation at the same time. This model can easily be extended to multi-party scenarios. It can also be used as a pretraining model for empathetic dialogue generation.
Note: the default settings (hyperparameters and command-line arguments) in the code are meant for BiDialogueRNN+Att. The user needs to optimize the settings for the other variants and changes.
Please extract the contents of DialogueRNN_features.zip.
python train_IEMOCAP.py <command-line arguments>
python train_AVEC.py <command-line arguments>
--no-cuda: Does not use GPU
--lr: Learning rate
--l2: L2 regularization weight
--rec-dropout: Recurrent dropout
--dropout: Dropout
--batch-size: Batch size
--epochs: Number of epochs
--class-weight: Class weight (not applicable for AVEC)
--active-listener: Explicit listener mode
--attention: Attention type
--tensorboard: Enables tensorboard log
--attribute: Attribute 1 to 4 (only for AVEC; 1 = valence, 2 = activation/arousal, 3 = anticipation/expectation, 4 = power)
Please cite the following paper if you find this code useful in your work.
DialogueRNN: An Attentive RNN for Emotion Detection in Conversations. N. Majumder, S. Poria, D. Hazarika, R. Mihalcea, A. Gelbukh, & E. Cambria. AAAI (2019), Honolulu, Hawaii, USA.
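As a rough illustration of how the DialogueRNN command-line arguments listed above fit together, the following argparse sketch declares the same flags. It is not the parser from train_IEMOCAP.py or train_AVEC.py: `build_parser` is a hypothetical name, and every type and default value here is a placeholder assumption rather than the script's actual setting.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented flags; all defaults below are placeholders."""
    parser = argparse.ArgumentParser(description="Sketch of DialogueRNN-style options")
    parser.add_argument("--no-cuda", action="store_true", help="Does not use GPU")
    parser.add_argument("--lr", type=float, default=1e-4, help="Learning rate")
    parser.add_argument("--l2", type=float, default=1e-5, help="L2 regularization weight")
    parser.add_argument("--rec-dropout", type=float, default=0.1, help="Recurrent dropout")
    parser.add_argument("--dropout", type=float, default=0.1, help="Dropout")
    parser.add_argument("--batch-size", type=int, default=32, help="Batch size")
    parser.add_argument("--epochs", type=int, default=60, help="Number of epochs")
    parser.add_argument("--class-weight", action="store_true", help="Class weight")
    parser.add_argument("--active-listener", action="store_true", help="Explicit listener mode")
    parser.add_argument("--attention", default="general", help="Attention type")
    parser.add_argument("--tensorboard", action="store_true", help="Enables tensorboard log")
    parser.add_argument("--attribute", type=int, choices=[1, 2, 3, 4], default=1,
                        help="AVEC attribute: 1 valence, 2 arousal, 3 expectation, 4 power")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--no-cuda", "--lr", "0.001", "--attribute", "2"])
print(args.no_cuda, args.lr, args.attribute, args.rec_dropout)
```

argparse converts dashes in flag names to underscores, so `--rec-dropout` is read back as `args.rec_dropout` and `--no-cuda` as `args.no_cuda`.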
Interactive COnversational memory Network (ICON) is a multimodal emotion detection framework that extracts multimodal features from conversational videos and hierarchically models the self- and inter-speaker emotional influences into global memories. Such memories generate contextual summaries which aid in predicting the emotional orientation of utterance-videos.
cd ICON
Unzip the data into /ICON/IEMOCAP/data/. Sample command to achieve this: unzip {path_to_zip_file} -d ./IEMOCAP/
Train the ICON model on IEMOCAP:
python train_iemocap.py
ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection. D. Hazarika, S. Poria, R. Mihalcea, E. Cambria, and R. Zimmermann. EMNLP (2018), Brussels, Belgium
CMN is a neural framework for emotion detection in dyadic conversations. It leverages multimodal signals from text, audio, and visual modalities. It specifically incorporates speaker-specific dependencies into its architecture for context modeling. Summaries are then generated from this context using multi-hop memory networks.
cd CMN
Unzip the data into /CMN/IEMOCAP/data/. Sample command to achieve this: unzip {path_to_zip_file} -d ./IEMOCAP/
Train the CMN model on IEMOCAP:
python train_iemocap.py
Please cite the following paper if you find this code useful in your work.
Hazarika, D., Poria, S., Zadeh, A., Cambria, E., Morency, L.P. and Zimmermann, R., 2018. Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (Vol. 1, pp. 2122-2132).
bc-LSTM-pytorch is a network that uses context to detect the emotion of an utterance in a dialogue. The model is simple but efficient: it uses only an LSTM to model the temporal relations among the utterances. This repo includes the data released by the organizers of SemEval 2019 Task 3, "Emotion Recognition in Context". In this task, only three consecutive utterances are provided: utterance1 (user1), utterance2 (user2), and utterance3 (user1). The task is to predict the emotion label of utterance3. Per-utterance emotion labels are not provided. However, if your data contains an emotion label for each utterance, you can still use this code and adapt it accordingly. Hence, this code is also applicable to datasets such as MOSI, MOSEI, IEMOCAP, AVEC, and DailyDialog. Unlike CMN, ICON, and DialogueRNN, bc-LSTM does not make use of speaker information.
cd bc-LSTM-pytorch
Train the bc-LSTM model on IEMOCAP:
python train_IEMOCAP.py
Please cite the following paper if you find this code useful in your work.
Poria, S., Cambria, E., Hazarika, D., Majumder, N., Zadeh, A. and Morency, L.P., 2017. Context-dependent sentiment analysis in user-generated videos. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 873-883).
Keras implementation of bc-LSTM.
cd bc-LSTM
Train the bc-LSTM model on IEMOCAP:
python baseline.py -config testBaseline.config
Please cite the following paper if you find this code useful in your work.
Poria, S., Cambria, E., Hazarika, D., Majumder, N., Zadeh, A. and Morency, L.P., 2017. Context-dependent sentiment analysis in user-generated videos. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (Vol. 1, pp. 873-883).
This repository also contains implementations of different architectures to detect emotion cause in conversations.
Model | emo_f1 | pos_f1 | neg_f1 | macro_avg |
---|---|---|---|---|
ECPE-2d cross_road (0 transform layer) | 52.76 | 52.39 | 95.86 | 73.62 |
ECPE-2d window_constrained (1 transform layer) | 70.48 | 48.80 | 93.85 | 71.32 |
ECPE-2d cross_road (2 transform layer) | 52.76 | 55.50 | 94.96 | 75.23 |
ECPE-MLL | - | 48.48 | 94.68 | 71.58 |
Rank Emotion Cause | - | 33.00 | 97.30 | 65.15 |
RoBERTa-base | - | 64.28 | 88.74 | 76.51 |
RoBERTa-large | - | 66.23 | 87.89 | 77.06 |
Citation: Please cite the following papers if you use this code.
The RoBERTa and SpanBERT baselines follow the description in the original RECCON paper.