Recommenders Versions

Best Practices on Recommendation Systems

0.1.1

5 years ago

New Algorithms or improvements

  • Improved single-node SAR for top-k recommendations. Users can now choose whether the recommended top-k items are sorted.
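
The sorted-or-unsorted choice for top-k items can be illustrated with a minimal sketch. This is not the library's SAR code; the function name `top_k_items`, the `sort_top_k` flag, and the dict-of-scores input are assumptions made for illustration only. The idea is that selecting the k best items is cheaper than fully ordering them, so the final sort can be skipped when order does not matter.

```python
import heapq


def top_k_items(scores, k, sort_top_k=True):
    """Return the k highest-scoring (item, score) pairs for one user.

    scores: dict mapping item id -> predicted score (hypothetical input shape).
    sort_top_k: when False, skip the final sort and return the pairs
        in arbitrary (heap) order.
    """
    if k <= 0:
        return []
    heap = []  # min-heap of (score, item); holds at most k entries
    for item, score in scores.items():
        if len(heap) < k:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            # Current item beats the weakest kept item, so swap it in.
            heapq.heapreplace(heap, (score, item))
    pairs = [(item, score) for score, item in heap]
    if sort_top_k:
        pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs


scores = {"a": 0.1, "b": 0.9, "c": 0.5, "d": 0.7}
print(top_k_items(scores, 2))                    # best first
print(top_k_items(scores, 2, sort_top_k=False))  # same items, order not guaranteed
```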

New utilities or improvements

  • Added data-related utility functions, such as MovieLens data download, in Python and PySpark.
  • Added a new data split method (timestamp-based split).
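
As a rough illustration of what a timestamp-based split does, the sketch below puts each user's earliest interactions in the training set and the latest ones in the test set. It is an assumption-laden example, not the library's utility: the function name `chrono_split`, the tuple layout `(user, item, rating, timestamp)`, and the per-user `ratio` behavior are all hypothetical.

```python
from collections import defaultdict


def chrono_split(interactions, ratio=0.75):
    """Split interactions per user by time.

    interactions: list of (user, item, rating, timestamp) tuples
        (hypothetical layout).
    ratio: fraction of each user's earliest interactions kept for training.
    """
    by_user = defaultdict(list)
    for row in interactions:
        by_user[row[0]].append(row)

    train, test = [], []
    for rows in by_user.values():
        rows.sort(key=lambda r: r[3])   # oldest interaction first
        cut = int(len(rows) * ratio)    # number of rows that go to train
        train.extend(rows[:cut])
        test.extend(rows[cut:])
    return train, test


data = [
    (1, "m1", 4.0, 40), (1, "m2", 3.0, 10), (1, "m3", 5.0, 30), (1, "m4", 2.0, 20),
    (2, "m1", 5.0, 5), (2, "m5", 1.0, 1),
]
train, test = chrono_split(data)
print(len(train), len(test))
```

Splitting by time avoids training on interactions that happened after the ones being predicted, which a random split cannot guarantee.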

New Notebooks or improvements

  • Added an operationalization (O16N) notebook for a Spark ALS movie recommender on Azure production services such as Databricks, Cosmos DB, and Kubernetes Services.
  • Added a SAR deep dive notebook that demonstrates the single-node implementation.
  • Added Surprise SVD deep dive notebook.
  • Added Surprise SVD integration test.
  • Added Surprise SVD ranking metrics evaluation.
  • Made quick-start notebooks consistent in their run settings, i.e., experiment protocols (e.g., data split, evaluation metrics) and algorithm parameters (e.g., hyperparameters, removal of seen items).
  • Added a comparison notebook for easily benchmarking different algorithms.
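
For readers unfamiliar with the ranking metrics mentioned above, here is a minimal sketch of precision@k and recall@k for a single user. It is not the evaluation code used in the notebooks; the function name and the list/set inputs are assumptions for illustration.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Compute precision@k and recall@k for one user.

    recommended: list of item ids, best first (hypothetical input).
    relevant: set of item ids the user actually interacted with in the test set.
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


precision, recall = precision_recall_at_k(["a", "b", "c", "d"], {"b", "d", "e"}, k=3)
print(precision, recall)
```

In a benchmark, these per-user values would typically be averaged over all users in the test set.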

Other features

  • Updated SETUP with Azure Databricks.
  • Added SETUP troubleshooting for Azure DSVM and Databricks.
  • Updated READMEs under each notebook directory to provide comprehensive guidelines.
  • Added smoke/integration tests on the large MovieLens datasets (10M and 20M).
  • Updated the Spark settings of the CI/CD machine to eliminate unexpected build failures such as "no space left" errors.

0.1.0

5 years ago

New Algorithms or improvements

Development of the SAR algorithm in three implementations:

New utilities or improvements

New Notebooks or improvements

Other features

  • Benchmark of the current algorithms.
  • Unit, smoke and integration tests for Python and PySpark environments.