This project walks through how you can create recommendations using Apache Spark machine learning. There are a number of Jupyter notebooks that you can run on IBM Data Science Experience, and there is a live demo of a movie recommendation web application you can interact with. The demo also uses IBM Message Hub (Kafka) to push application events to a topic, where they are consumed by a Spark Streaming job running on IBM BigInsights (Hadoop).
There is an overview video on YouTube.
This project is a demo movie recommender application. The demo has been installed with approximately 4,000 movies and 500,000 ratings. The ratings were generated randomly. The purpose of the web application is to let users search for movies, rate movies, and receive movie recommendations based on their ratings.
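To make "generated randomly" concrete, here is a minimal sketch of how such a ratings set could be produced. It is not the script used for this demo: the function name, the number of users, the 1 to 5 rating scale, and the rule of one rating per user and movie pair are all assumptions made for illustration.

```python
import random


def generate_random_ratings(num_users, num_movies, num_ratings, seed=0):
    """Return a list of (user_id, movie_id, rating) tuples.

    Hypothetical helper: the 1-5 integer scale and the uniqueness of each
    (user_id, movie_id) pair are assumptions, not taken from the demo.
    """
    if num_ratings > num_users * num_movies:
        raise ValueError("more ratings requested than user/movie pairs exist")

    rng = random.Random(seed)  # seeded so the output is reproducible
    ratings = {}
    # Keep drawing until enough distinct (user, movie) pairs exist;
    # drawing the same pair again overwrites its rating instead of adding one.
    while len(ratings) < num_ratings:
        user_id = rng.randrange(num_users)
        movie_id = rng.randrange(num_movies)
        ratings[(user_id, movie_id)] = rng.randint(1, 5)

    return [(user, movie, rating) for (user, movie), rating in ratings.items()]


if __name__ == "__main__":
    # Small sample; the demo's scale would be about 4,000 movies and 500,000 ratings.
    sample = generate_random_ratings(num_users=100, num_movies=4000, num_ratings=10)
    for row in sample:
        print(row)
```

Because the ratings are random, recommendations trained on them carry no real signal about taste. They only exercise the pipeline end to end.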
Start with Introduction to read more about this project.
You can import these notebooks into IBM Data Science Experience. I have occasionally experienced issues when trying to load from a URL. If that happens to you, try cloning or downloading this repo and importing the notebooks as files.
The overall architecture looks like this:
The technologies used in this demo are:
Core components (Web Application)
Optional components (Hadoop Warehouse)
The core demo can run without these components.
Click on the link below, then follow the instructions. Note that this step can take a long time (around 30 minutes).
After deploying to Bluemix, you will need to create a new Data Science Experience (DSX) project and import the notebooks. The Step 07 notebook is responsible for creating recommendations and saving them to Cloudant. You will not get recommendations until you have set up this notebook with your Cloudant credentials and run it from DSX.
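As a rough illustration of what "saving recommendations to Cloudant" can involve, the sketch below turns one user's predicted ratings into a JSON document. Cloudant stores JSON documents keyed by `_id`, but everything else here is an assumption: the function name, the field names, and the top-N cutoff are hypothetical and are not the schema used by the Step 07 notebook. The upload itself, which needs your Cloudant credentials, is left out.

```python
import json


def build_recommendation_doc(user_id, predicted_ratings, top_n=10):
    """Build a JSON-serializable document holding a user's top-N movies.

    Hypothetical layout: `predicted_ratings` maps movie_id -> predicted score,
    for example scores produced by a Spark recommendation model.
    """
    # Rank movies from highest to lowest predicted rating.
    ranked = sorted(predicted_ratings.items(), key=lambda item: item[1], reverse=True)
    return {
        "_id": str(user_id),  # Cloudant document id; using the user id is an assumption
        "recommendations": [
            {"movie_id": movie_id, "predicted_rating": round(score, 2)}
            for movie_id, score in ranked[:top_n]
        ],
    }


if __name__ == "__main__":
    # Made-up predictions for illustration only.
    predictions = {"movie_1": 3.52, "movie_2": 4.81, "movie_3": 4.10}
    doc = build_recommendation_doc("user_42", predictions, top_n=2)
    print(json.dumps(doc, indent=2))
```

Keeping one document per user means the web application can fetch a user's recommendations with a single lookup by id. Whether the demo stores them this way is not confirmed here.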
The screenshot below shows some movies being rated by a user.
The screenshot below shows movie recommendations provided by Spark machine learning.