
A Structured Self-attentive Sentence Embedding

TensorFlow implementation of "A Structured Self-attentive Sentence Embedding" (Lin et al., ICLR 2017).


Usage

Data

  • AG's news topic classification dataset.
  • The csv files (in my data directory) are available from here; a loading sketch is given below.
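  • For reference, here is a minimal loading sketch, assuming the standard AG News csv layout of (class index 1-4, title, description) per row; load_ag_news is a hypothetical helper and not part of this repository:

     import csv

     def load_ag_news(path="data/train.csv"):
         """Hypothetical loader; each AG News row is (class index 1-4, title, description)."""
         texts, labels = [], []
         with open(path, encoding="utf-8") as f:
             for class_idx, title, description in csv.reader(f):
                 texts.append(title + " " + description)  # concatenate title and body
                 labels.append(int(class_idx) - 1)         # zero-based class label
         return texts, labels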

Train

  • "GoogleNews-vectors-negative300" is used as the pre-trained word2vec model.

  • Display help message:

     python train.py --help
    
    train.py:
    --[no]allow_soft_placement: Allow soft device placement
    	(default: 'true')
    --batch_size: Batch Size
    	(default: '64')
    	(an integer)
    --checkpoint_every: Save model after this many steps
    	(default: '100')
    	(an integer)
    --d_a_size: Size of W_s1 embedding
    	(default: '350')
    	(an integer)
    --dev_sample_percentage: Percentage of the training data to use for validation
    	(default: '0.1')
    	(a number)
    --display_every: Number of iterations to display training info.
    	(default: '10')
    	(an integer)
    --embedding_dim: Dimensionality of word embedding
    	(default: '300')
    	(an integer)
    --evaluate_every: Evaluate model on dev set after this many steps
    	(default: '100')
    	(an integer)
    --fc_size: Size of fully connected layer
    	(default: '2000')
    	(an integer)
    --hidden_size: Size of LSTM hidden layer
    	(default: '256')
    	(an integer)
    --learning_rate: Which learning rate to start with.
    	(default: '0.001')
    	(a number)
    --[no]log_device_placement: Log placement of ops on devices
    	(default: 'false')
    --max_sentence_length: Max sentence length in train/test data
    	(default: '50')
    	(an integer)
    --num_checkpoints: Number of checkpoints to store
    	(default: '5')
    	(an integer)
    --num_epochs: Number of training epochs
    	(default: '10')
    	(an integer)
    --p_coef: Coefficient for penalty
    	(default: '1.0')
    	(a number)
    --r_size: Size of W_s2 embedding
    	(default: '30')
    	(an integer)
    --train_dir: Path of train data
    	(default: 'data/train.csv')
    --word2vec: Word2vec file with pre-trained embeddings
    
  • Train Example (with word2vec):

     python train.py --word2vec "GoogleNews-vectors-negative300.bin"
    
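  • The d_a_size, r_size, and p_coef flags correspond to d_a, r, and the penalization coefficient in the paper. As a rough orientation only (not this repository's actual code), below is a minimal TensorFlow sketch of the structured self-attention A = softmax(W_s2 tanh(W_s1 H^T)) and the penalty ||A A^T - I||_F^2, assuming H holds the BiLSTM outputs with shape [batch, n, 2u]:

     import tensorflow as tf

     def structured_self_attention(H, d_a=350, r=30):
         """Sketch of the paper's self-attention; H: BiLSTM outputs of shape [batch, n, 2u]."""
         two_u = int(H.shape[-1])
         W_s1 = tf.Variable(tf.random.normal([d_a, two_u], stddev=0.1))
         W_s2 = tf.Variable(tf.random.normal([r, d_a], stddev=0.1))
         # A = softmax(W_s2 tanh(W_s1 H^T)): annotation matrix of shape [batch, r, n]
         A = tf.nn.softmax(
             tf.einsum('rd,bdn->brn', W_s2,
                       tf.tanh(tf.einsum('du,bnu->bdn', W_s1, H))), axis=-1)
         M = tf.einsum('brn,bnu->bru', A, H)   # sentence embedding matrix M = A H
         # Penalization ||A A^T - I||_F^2 encourages the r attention hops to differ
         AAT = tf.einsum('brn,bsn->brs', A, A)
         P = tf.reduce_sum(tf.square(AAT - tf.eye(r)), axis=[1, 2])
         return M, tf.reduce_mean(P)

    During training, the penalty term is added to the classification loss scaled by --p_coef.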

Evaluation

  • You must pass the "checkpoint_dir" argument, the path to the checkpoint directory of the trained model, as in the example below.

  • If you don't want to visualize the attention, pass --visualize False (see the second example below).

  • Evaluation Example:

     python eval.py --checkpoint_dir "runs/1523902663/checkpoints/"
    
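  • For example, to evaluate the same checkpoint without producing the attention visualization:

     python eval.py --checkpoint_dir "runs/1523902663/checkpoints/" --visualize False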

Results

1) Test accuracy = 0.920789

2) Visualization of Self Attention


Reference
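  • Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio, "A Structured Self-attentive Sentence Embedding", ICLR 2017. https://arxiv.org/abs/1703.03130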
