Document embedding and machine learning scripts for beginners

This repository contains a Korean corpus and Python scripts for training doc2vec models and inferring vectors for test documents.

Korean word2vec-api / doc2vec-api

A simple web service providing a word embedding API. The methods are based on the Gensim Word2Vec / Doc2Vec implementations. Models are passed as parameters and must be in the Word2Vec / Doc2Vec text or binary format. This word2vec-api script is forked from the word2vec-api GitHub repository, with minor updates to support Korean word2vec models.
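
For readers unfamiliar with the text format mentioned above, the sketch below parses it using only the standard library. It assumes the usual word2vec text layout: a header line with the vocabulary size and vector dimension, then one line per word holding the word and its values. The function name `read_word2vec_text` and the sample words and numbers are hypothetical, for illustration only. The service itself is based on Gensim rather than a hand-written parser like this one.

```python
import io


def read_word2vec_text(stream):
    """Parse word2vec text format from a text stream into {word: [floats]}."""
    header = stream.readline().split()
    vocab_size, dim = int(header[0]), int(header[1])
    vectors = {}
    for line in stream:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if len(parts) != dim + 1:
            raise ValueError("expected a word plus %d values: %r" % (dim, line))
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != vocab_size:
        raise ValueError("header declares %d words, found %d" % (vocab_size, len(vectors)))
    return vectors


# Hypothetical two-word, three-dimensional model held in memory.
sample = io.StringIO("2 3\n서울 0.1 0.2 0.3\n부산 0.4 0.5 0.6\n")
vectors = read_word2vec_text(sample)
```

In practice, Gensim's `KeyedVectors.load_word2vec_format(path, binary=False)` reads this format, and `binary=True` selects the binary variant.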

  • Install Dependencies
pip2 install -r requirements.txt
  • Launching the service
python word2vec-api --model path/to/the/model [--host host --port 1234]
ex) python /home/ --model /home/model/all_terms_50vectors --path /word2vec --host --port 4000

python doc2vec-api --model path/to/the/model [--host host --port 1234]
ex) python /home/ --model /home/model/all_terms_50vectors --path /doc2vec --host --port 4000

  • Example calls
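
As a rough illustration of calling the running service, here is a minimal client sketch using only the Python 3 standard library. It assumes the fork keeps a `similarity` endpoint that takes `w1` and `w2` query parameters under the `--path` prefix and returns JSON, as the upstream word2vec-api does. This has not been confirmed for this repository. The host, port, and words are hypothetical, and `build_url` and `call_api` are helper names introduced here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(host, port, base_path, endpoint, params):
    """Build a request URL of the form http://host:port/base_path/endpoint?query."""
    query = urlencode(params)  # percent-encodes Korean text as UTF-8
    return "http://{}:{}{}/{}?{}".format(host, port, base_path, endpoint, query)


def call_api(url, timeout=5.0):
    """Send a GET request and decode the response body as JSON (assumed format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical values: adjust host, port, path, and endpoint to your deployment.
url = build_url("127.0.0.1", 4000, "/word2vec", "similarity",
                {"w1": "서울", "w2": "부산"})
print(url)
```

With the service running, `call_api(url)` would send the request. Building the URL through `urlencode` keeps Korean query terms correctly percent-encoded.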
