Rcnn Text Classification

TensorFlow implementation of "Recurrent Convolutional Neural Network for Text Classification" (AAAI 2015)

Recurrent Convolutional Neural Network for Text Classification

[Figure: RCNN model]

Data: Movie Review

  • Movie reviews with one sentence per review. Classification involves detecting positive/negative reviews (Pang and Lee, 2005).
  • Download "sentence polarity dataset v1.0" from the Official Download Page.
  • The data is located in "data/rt-polaritydata/" in this repository.
  • rt-polarity.pos contains 5331 positive snippets.
  • rt-polarity.neg contains 5331 negative snippets.
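
The two files above can be read into a single labelled list. The following is a minimal sketch, not the code in this repository: the function name load_polarity, the label convention (1 = positive, 0 = negative), and the encoding handling are assumptions made for illustration.

```python
def load_polarity(pos_path, neg_path, encoding="utf-8"):
    """Return (texts, labels) with label 1 for positive and 0 for negative.

    Hypothetical helper: one snippet per line, blank lines skipped.
    """
    def read_lines(path):
        # errors="replace" is an assumption: it avoids a crash if the
        # files are not clean UTF-8.
        with open(path, encoding=encoding, errors="replace") as f:
            return [line.strip() for line in f if line.strip()]

    pos = read_lines(pos_path)
    neg = read_lines(neg_path)
    texts = pos + neg
    labels = [1] * len(pos) + [0] * len(neg)
    return texts, labels
```

With the full dataset, both classes should contribute 5331 entries each, which makes for a quick sanity check after loading.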

Implementation of Recurrent Structure

[Figure: recurrent structure]

  • A bidirectional RNN (Bi-RNN) is used to implement the left and right context vectors.
  • Each context vector is created by shifting the output of Bi-RNN and concatenating a zero state indicating the start of the context.
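
The shifting step can be illustrated with plain NumPy arrays standing in for the Bi-RNN outputs. This is a sketch under stated assumptions, not the TensorFlow code from train.py: the names build_contexts, output_fw and output_bw are hypothetical, both directions are assumed to share the same hidden size, and the arrays are assumed to have shape [batch, time, hidden]. The final concatenation of left context, word embedding and right context follows the word representation defined in the paper.

```python
import numpy as np

def build_contexts(output_fw, output_bw):
    """Shift Bi-RNN outputs so position i only sees its neighbours' context.

    output_fw, output_bw: arrays of shape [batch, time, hidden]
    (hypothetical stand-ins for the forward and backward RNN outputs).
    """
    batch, _, hidden = output_fw.shape
    zero = np.zeros((batch, 1, hidden), dtype=output_fw.dtype)
    # Left context of word i is the forward output at i-1;
    # the first word gets the zero state.
    c_left = np.concatenate([zero, output_fw[:, :-1]], axis=1)
    # Right context of word i is the backward output at i+1;
    # the last word gets the zero state.
    c_right = np.concatenate([output_bw[:, 1:], zero], axis=1)
    return c_left, c_right

# Toy example: 1 sentence, 3 words, hidden size 2, embedding size 4.
fw = np.arange(1, 7, dtype=np.float32).reshape(1, 3, 2)
bw = -fw
embeddings = np.ones((1, 3, 4), dtype=np.float32)

c_left, c_right = build_contexts(fw, bw)
# Word representation: [left context; embedding; right context].
x = np.concatenate([c_left, embeddings, c_right], axis=2)
print(x.shape)  # (1, 3, 8)
```

The zero rows are what keep a word's own hidden state out of its context: without the shift, the forward output at position i would already include word i itself.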

Usage

Train

  • Positive data is located in "data/rt-polaritydata/rt-polarity.pos".

  • Negative data is located in "data/rt-polaritydata/rt-polarity.neg".

  • "GoogleNews-vectors-negative300" is used as the pre-trained word2vec model.

  • Display help message:

     python train.py --help
    
  • Train Example:

     python train.py --cell_type "lstm" \
     -pos_dir "data/rt-polaritydata/rt-polarity.pos" \
     -neg_dir "data/rt-polaritydata/rt-polarity.neg" \
     -word2vec "GoogleNews-vectors-negative300.bin"
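
For readers unfamiliar with how pre-trained vectors are typically used, the sketch below shows one common way to initialise an embedding matrix from them. It is an illustration only and may differ from what train.py does: the function init_embeddings, the plain dict of vectors, and the uniform range of ±0.25 for words missing from the pre-trained model are all assumptions.

```python
import numpy as np

def init_embeddings(vocab, pretrained, dim, seed=0):
    """Build a [len(vocab), dim] matrix, copying pre-trained rows where available.

    vocab: list of words, index = row in the matrix (hypothetical layout).
    pretrained: dict mapping word -> vector of length dim (hypothetical input).
    """
    rng = np.random.default_rng(seed)
    # Assumption: words without a pre-trained vector start from small random values.
    matrix = rng.uniform(-0.25, 0.25, size=(len(vocab), dim)).astype(np.float32)
    hits = 0
    for idx, word in enumerate(vocab):
        vec = pretrained.get(word)
        if vec is not None:
            matrix[idx] = vec
            hits += 1
    return matrix, hits
```

The returned hit count is useful for checking how much of the vocabulary the pre-trained model actually covers.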
    

Evaluation

  • The Movie Review dataset has no separate test data.

  • To evaluate properly, hold out a test set from the training data or use cross-validation. Cross-validation is not implemented in this project.

  • The example below simply uses the full rt-polarity dataset, i.e. the same data used for training.

  • Evaluation Example:

     python eval.py \
    -pos_dir "data/rt-polaritydata/rt-polarity.pos" \
    -neg_dir "data/rt-polaritydata/rt-polarity.neg" \
    -checkpoint_dir "runs/1523902663/checkpoints"
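
Since the project does not ship a held-out split or cross-validation, the two options mentioned above can be sketched in plain Python. The helper names train_test_split and k_fold_indices, the 10% test ratio, the fold count and the fixed seed are assumptions chosen for illustration, not part of this repository.

```python
import random

def train_test_split(texts, labels, test_ratio=0.1, seed=42):
    """Shuffle once and hold out a fraction of the examples for testing."""
    idx = list(range(len(texts)))
    random.Random(seed).shuffle(idx)
    n_test = int(len(idx) * test_ratio)
    test_idx, train_idx = idx[:n_test], idx[n_test:]

    def pick(seq, ids):
        return [seq[i] for i in ids]

    return (pick(texts, train_idx), pick(labels, train_idx),
            pick(texts, test_idx), pick(labels, test_idx))

def k_fold_indices(n, k=10, seed=42):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    # Every k-th shuffled index goes into the same fold, so fold sizes
    # differ by at most one.
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, folds[i]
```

Because the commands shown above take file paths, a held-out split would presumably need to be written to separate positive and negative files before being passed to train.py and eval.py.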
    

Result

  • Comparison between the Recurrent Convolutional Neural Network and a Convolutional Neural Network.
  • dennybritz's cnn-text-classification-tf is used as the CNN model for comparison.
  • The same pre-trained word2vec is used for both models.

Accuracy for validation set

[Figure: validation accuracy]

Loss for validation set

[Figure: validation loss]

Reference

  • S. Lai et al., "Recurrent Convolutional Neural Network for Text Classification", AAAI 2015.
README Source: roomylee/rcnn-text-classification