JiaYuan

User profiles of jiayuan.com


JiaYuan Spider and Data Analysis

Introduction

  • scrape data from shijijiayuan with BeautifulSoup and requests in Python 3.5
  • run machine learning algorithms in R
  • visualize data and generate the report with MS PowerPoint 2016, R ggplot2, and TAGUL
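As a rough illustration of the scraping step, here is a minimal sketch using requests and BeautifulSoup. The CSS selectors (`ul.member_info_list li`, `.label`, `.value`) and the function names are hypothetical: this README does not describe the real page markup, so they are placeholders that would need adjusting to the actual site. This is not the project's own spider code.

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical selector: the real jiayuan.com markup is not given in this README.
FIELD_SELECTOR = "ul.member_info_list li"


def fetch_html(url, timeout=10):
    """Download one page and return its HTML text."""
    # Some sites reject the default requests User-Agent, so send a browser-like one.
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(url, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def parse_profile(html):
    """Extract label/value pairs from a profile page into a dict."""
    soup = BeautifulSoup(html, "html.parser")
    profile = {}
    for item in soup.select(FIELD_SELECTOR):
        label = item.select_one(".label")  # hypothetical class name
        value = item.select_one(".value")  # hypothetical class name
        if label is not None and value is not None:
            profile[label.get_text(strip=True)] = value.get_text(strip=True)
    return profile
```

Keeping the download and the parsing in separate functions means the parser can be tried on a saved HTML file before any request is sent to the site.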

Prerequisites

  • Python 3.x (Python 3.5 is recommended)
  • third-party libraries (requests, BeautifulSoup)
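One way to check these prerequisites before running anything is a short script like the one below. It is only a sketch, and the helper names are made up for illustration. It assumes BeautifulSoup is installed from the `beautifulsoup4` package, which is imported as `bs4`.

```python
import importlib.util
import sys


def python_ok(minimum=(3, 5)):
    """Return True if the running interpreter is at least the given version."""
    return sys.version_info >= minimum


def missing_prerequisites(modules=("requests", "bs4")):
    """Return the names of modules that cannot be found on this system."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if not python_ok():
        print("Python 3.5 or newer is needed")
    for name in missing_prerequisites():
        print("missing module:", name)
```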

Note

  • For later research, a Linux OS is required (Ubuntu 16.04 or CentOS 7 will be fine). Using Windows may cause some trouble.

Results

  • Basic statistics info

    cover img1 img2

  • With NLP

    img5 img6 img7 img8

The Next

Next, I want to train this spider on the avatar image set using computer vision, so that it can rank your face. Anyone interested in computer vision or deep learning is welcome to open an issue.

For more details, please visit my article at Zhihu.

With pleasure!

Open Source Agenda is not affiliated with "JiaYuan" Project. README Source: lucasxlu/JiaYuan