Improved the single-node SAR implementation for top-k recommendations. Users can choose whether the recommended top-k items are sorted.
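To illustrate the sorted versus unsorted top-k option, here is a minimal NumPy sketch. It is not the SAR implementation itself: the function name `top_k_items` and the `sort_top_k` flag are hypothetical names chosen for this example, and the input is assumed to be a dense user-by-item score matrix.

```python
import numpy as np


def top_k_items(scores, k, sort_top_k=True):
    """Return (indices, values) of the k highest scores in each row.

    Hypothetical helper for illustration; assumes a dense 2-D score matrix.
    """
    scores = np.asarray(scores, dtype=float)
    if k < 1:
        raise ValueError("k must be at least 1")
    k = min(k, scores.shape[1])

    # argpartition moves the k largest entries of each row into the
    # last k columns, in arbitrary order, without a full sort.
    idx = np.argpartition(scores, -k, axis=1)[:, -k:]
    vals = np.take_along_axis(scores, idx, axis=1)

    if sort_top_k:
        # Optionally order the k selected items by descending score.
        order = np.argsort(-vals, axis=1)
        idx = np.take_along_axis(idx, order, axis=1)
        vals = np.take_along_axis(vals, order, axis=1)
    return idx, vals
```

Skipping the final sort avoids ordering work when the caller only needs the set of recommended items, for example when computing precision or recall at k.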
New utilities or improvements
Added data-related utility functions, such as MovieLens data download, in Python and PySpark.
Added a new data split method (timestamp-based split).
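As a rough illustration of a timestamp-based split, the pandas sketch below keeps each user's earliest interactions for training and the latest for testing. This is an assumption about how such a split might work, not the utility shipped in this release: the function name `chrono_split` and the column names `userID` and `timestamp` are hypothetical.

```python
import pandas as pd


def chrono_split(df, ratio=0.75, col_user="userID", col_timestamp="timestamp"):
    """Split interactions per user by time: earliest rows go to train.

    Hypothetical helper for illustration; column names are assumptions.
    """
    # Stable sort so ties on timestamp keep their original relative order.
    df = df.sort_values([col_user, col_timestamp], kind="mergesort")

    # Position of each row within its user's chronological history.
    rank = df.groupby(col_user).cumcount()
    # Number of interactions each user has, broadcast to every row.
    size = df.groupby(col_user)[col_user].transform("size")

    # The first round(size * ratio) rows of each user form the train set.
    n_train = (size * ratio).round().astype(int)
    is_train = rank < n_train
    return df[is_train], df[~is_train]
```

Compared with a random split, this avoids training on interactions that happened after the ones being predicted.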
New Notebooks or improvements
Added an operationalization (O16N) notebook for a Spark ALS movie recommender on Azure production services such as Databricks, Cosmos DB, and Kubernetes Services.
Added a SAR deep dive notebook demonstrating the single-node implementation.
Added Surprise SVD deep dive notebook.
Added Surprise SVD integration test.
Added Surprise SVD ranking metrics evaluation.
Made quick-start notebooks consistent in their run settings, i.e., experiment protocols (e.g., data split, evaluation metrics) and algorithm parameters (e.g., hyperparameters, removal of seen items).
Added a comparison notebook for easily benchmarking different algorithms.
Other features
Updated SETUP with Azure Databricks.
Added SETUP troubleshooting for Azure DSVM and Databricks.
Updated READMEs under each notebook directory to provide comprehensive guidelines.
Added smoke/integration tests on the large MovieLens datasets (10M and 20M).
Updated the Spark settings of the CI/CD machine to eliminate unexpected build failures such as the "no space left" issue.
0.1.0
5 years ago
New Algorithms or improvements
Developed the SAR algorithm in three implementations: