A module for E-mail Summarization which uses clustering of skip-thought sentence embeddings.
The code in this repository complements this Medium article.
git clone https://github.com/ryankiros/skip-thoughts
Copy email_summarization.py to the root of the cloned skip-thoughts repository. Do:
git clone https://github.com/jatana-research/email-summarization
cp email-summarization/email_summarization.py skip-thoughts/
pip install -r email-summarization/requirements.txt
python -c 'import nltk; nltk.download("punkt")'
mkdir skip-thoughts/models
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/dictionary.txt
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/utable.npy
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/btable.npy
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/uni_skip.npz
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/uni_skip.npz.pkl
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/bi_skip.npz
wget -P ./skip-thoughts/models http://www.cs.toronto.edu/~rkiros/models/bi_skip.npz.pkl
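If wget is not available, the same downloads can be scripted. The following is a minimal sketch, assuming Python 3 and only the standard library; the helper names `model_urls` and `download_models` are hypothetical and not part of this repository. The URLs and file names are the ones from the wget commands above.

```python
import os
import urllib.request

# Base URL and file names taken from the wget commands above.
BASE_URL = "http://www.cs.toronto.edu/~rkiros/models/"
MODEL_FILES = [
    "dictionary.txt",
    "utable.npy",
    "btable.npy",
    "uni_skip.npz",
    "uni_skip.npz.pkl",
    "bi_skip.npz",
    "bi_skip.npz.pkl",
]


def model_urls(base_url=BASE_URL, names=MODEL_FILES):
    """Build the full download URL for every model file (hypothetical helper)."""
    return [base_url + name for name in names]


def download_models(target_dir="skip-thoughts/models"):
    """Download each model file into target_dir, skipping files already present."""
    os.makedirs(target_dir, exist_ok=True)
    for url in model_urls():
        destination = os.path.join(target_dir, url.rsplit("/", 1)[-1])
        if not os.path.exists(destination):
            urllib.request.urlretrieve(url, destination)


if __name__ == "__main__":
    download_models()
```

Skipping existing files only avoids repeating a finished download; it does not detect a partial file, so the checksum step is still needed.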
md5sum skip-thoughts/models/*
The output should be:
9a15429d694a0e035f9ee1efcb1406f3 bi_skip.npz
c9b86840e1dedb05837735d8bf94cee2 bi_skip.npz.pkl
022b5b15f53a84c785e3153a2c383df6 btable.npy
26d8a3e6458500013723b380a4b4b55e dictionary.txt
8eb7c6948001740c3111d71a2fa446c1 uni_skip.npz
e1a0ead377877ff3ea5388bb11cfe8d7 uni_skip.npz.pkl
5871cc62fc01b79788c79c219b175617 utable.npy
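On systems without md5sum, the comparison can be done in Python. This is a minimal sketch, assuming Python 3; `md5_of_file` and `find_mismatches` are hypothetical helper names, and the expected checksums are copied from the list above.

```python
import hashlib
import os

# Expected checksums, copied from the md5sum output listed above.
EXPECTED_MD5 = {
    "bi_skip.npz": "9a15429d694a0e035f9ee1efcb1406f3",
    "bi_skip.npz.pkl": "c9b86840e1dedb05837735d8bf94cee2",
    "btable.npy": "022b5b15f53a84c785e3153a2c383df6",
    "dictionary.txt": "26d8a3e6458500013723b380a4b4b55e",
    "uni_skip.npz": "8eb7c6948001740c3111d71a2fa446c1",
    "uni_skip.npz.pkl": "e1a0ead377877ff3ea5388bb11cfe8d7",
    "utable.npy": "5871cc62fc01b79788c79c219b175617",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(models_dir, expected=EXPECTED_MD5):
    """Return the names of files that are missing or whose checksum differs."""
    bad = []
    for name, checksum in sorted(expected.items()):
        path = os.path.join(models_dir, name)
        if not os.path.isfile(path) or md5_of_file(path) != checksum:
            bad.append(name)
    return bad


if __name__ == "__main__":
    problems = find_mismatches("skip-thoughts/models")
    print("All files OK" if not problems else "Re-download: %s" % ", ".join(problems))
```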
Edit lines 23-24 of the file skip-thoughts/skipthoughts.py to provide the correct paths to the downloaded models:
path_to_models = 'models/'
path_to_tables = 'models/'
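Because 'models/' is a relative path, Python resolves it against the current working directory of the process, not against the location of skipthoughts.py. The sketch below shows which directory a given setting ends up pointing to; `resolve_models_dir` is a hypothetical helper for illustration and is not part of either repository.

```python
import os


def resolve_models_dir(path_to_models="models/", working_dir=None):
    """Return the absolute directory that path_to_models refers to.

    A relative path is joined onto working_dir (default: the current
    working directory); an absolute path is returned unchanged apart
    from normalization.
    """
    base = working_dir if working_dir is not None else os.getcwd()
    return os.path.normpath(os.path.join(base, path_to_models))


if __name__ == "__main__":
    target = resolve_models_dir()
    print("Models will be looked up in:", target)
    print("Directory exists:", os.path.isdir(target))
```

If the reported directory is not the one holding the downloaded files, either start Python from a different folder or set both variables to absolute paths.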
Start a Python interpreter in the skip-thoughts/ folder and do:
>>> from email_summarization import summarize
>>> summaries = summarize(emails) # emails is a Python list containing English emails.
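This README does not describe what summarize does internally beyond "clustering of skip-thought sentence embeddings". As a rough illustration of that general idea, and not of this module's actual code, here is a toy sketch in pure Python 3: hand-made 2-D vectors stand in for real sentence embeddings, a small k-means groups them, and the sentence nearest each cluster centre is kept. The names `kmeans` and `pick_representatives`, the deterministic initialization, and the choice of k are all assumptions made for the example.

```python
import math


def _dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def _mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(vectors, k, iterations=10):
    """Tiny k-means; the first k vectors seed the centroids (toy choice)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: _dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = _mean(members)
    return centroids, labels


def pick_representatives(sentences, vectors, k):
    """Keep the sentence closest to each cluster centre, in original order."""
    centroids, labels = kmeans(vectors, k)
    chosen = set()
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.add(min(members, key=lambda i: _dist(vectors[i], centroids[c])))
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    # Made-up sentences and vectors; real embeddings have thousands of dimensions.
    sentences = ["Hi team.", "Hope you are well.", "Thanks for your time.",
                 "The release is delayed.", "We now ship on Friday.",
                 "QA needs two more days."]
    vectors = [[0, 0], [0, 1], [0, 2], [10, 10], [10, 11], [10, 12]]
    print(pick_representatives(sentences, vectors, k=2))
```

Keeping the selected sentences in their original order is a design choice of this sketch so that the output reads like a shortened version of the input.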